Integration with Amazon CloudWatch logs and metrics #110

Merged
merged 12 commits into from
Nov 20, 2019

@willarmiros willarmiros commented Nov 14, 2019

Overview

The AWS X-Ray team has received several customer requests for deeper insight into X-Ray traces and for a way to link those traces to relevant logs for quicker root cause analysis. We are announcing a suite of features that let users of this SDK parse logging and trace data more efficiently and view metrics aggregated across segments. These features are designed to improve customers’ experience with X-Ray and drive down their mean time to resolution.

The features below will be released in version 2.4.0 of the Java SDK. Supporting documentation and onboarding instructions will be released at the same time. As always, we welcome any feedback on these new features!

Features

Segment-Level Metrics

This feature automatically generates metrics about X-Ray segments that you can view on your CloudWatch dashboard like any other metric. The metrics recorded for each segment are latency in milliseconds and the throttle, fault, error, and OK rates. This feature requires the CloudWatch Agent to be enabled in your environment.
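The commits in this PR mention a metrics formatter that emits CloudWatch Embedded Metric Format (EMF) log messages. As a rough illustration of what one such message could look like for a single segment, here is a minimal, dependency-free sketch. It is not the SDK's formatter: the class name, the namespace `ExampleXRayMetrics`, the `ServiceName` dimension, and the metric names are all assumptions made for this example. Only the `_aws` / `CloudWatchMetrics` envelope follows the published EMF layout.

```java
import java.util.Locale;

// Illustrative sketch only; this is NOT the SDK's metrics formatter.
final class SegmentMetricsSketch {
    // Assumed namespace, chosen for this example.
    static final String NAMESPACE = "ExampleXRayMetrics";

    /** Builds one EMF log line describing a single finished segment. */
    static String toEmf(String serviceName, double latencyMillis,
                        boolean fault, boolean error, boolean throttle,
                        long timestampMillis) {
        // A segment counts as OK when none of the failure flags is set.
        boolean ok = !fault && !error && !throttle;
        return "{\"_aws\":{\"Timestamp\":" + timestampMillis
            + ",\"CloudWatchMetrics\":[{\"Namespace\":\"" + NAMESPACE + "\""
            + ",\"Dimensions\":[[\"ServiceName\"]]" // assumed dimension
            + ",\"Metrics\":["
            + metric("Latency", "Milliseconds") + ","
            + metric("Fault", "Count") + ","
            + metric("Error", "Count") + ","
            + metric("Throttle", "Count") + ","
            + metric("Ok", "Count")
            + "]}]}"
            // Metric and dimension values live at the root of the document.
            + ",\"ServiceName\":\"" + escape(serviceName) + "\""
            + ",\"Latency\":" + String.format(Locale.ROOT, "%.3f", latencyMillis)
            + ",\"Fault\":" + flag(fault)
            + ",\"Error\":" + flag(error)
            + ",\"Throttle\":" + flag(throttle)
            + ",\"Ok\":" + flag(ok)
            + "}";
    }

    private static String metric(String name, String unit) {
        return "{\"Name\":\"" + name + "\",\"Unit\":\"" + unit + "\"}";
    }

    private static int flag(boolean value) {
        return value ? 1 : 0;
    }

    // Minimal JSON string escaping (backslash and double quote only).
    private static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    public static void main(String[] args) {
        // Writes one EMF line to stdout for a hypothetical "checkout" service.
        System.out.println(toEmf("checkout", 12.5, false, false, false,
                System.currentTimeMillis()));
    }
}
```

Emitting 0 or 1 per segment for each flag is a choice of this sketch: averaging such a metric over a period in CloudWatch can then be read as a rate.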

Trace ID Injection into Logs

The X-Ray recorder injects the current trace ID into each log statement emitted while that trace is open. This makes logs easy to query once you know which trace represents a problem with your application: search for the problematic trace’s ID in a CloudWatch log group and all logging events relevant to that trace will be returned. This feature will be compatible with the SLF4J and Log4J 2 logging frontends.
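To make the mechanism concrete, here is a small stand-alone sketch of the idea. It does not use the SDK or SLF4J: a `ThreadLocal` map stands in for the logging framework's diagnostic context (such as SLF4J's MDC), and the hook names `onBeginSegment` / `onEndSegment` are hypothetical. The key name `AWS-XRAY-TRACE-ID` is an assumption, and the stored value format (`key + ": " + traceId`) mirrors the `MDC.put` line quoted later in this conversation.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch only; the real feature relies on the logging framework's MDC.
final class TraceIdInjectionSketch {
    // Assumed key name for this example.
    static final String TRACE_ID_KEY = "AWS-XRAY-TRACE-ID";

    // Stand-in for a per-thread diagnostic context such as SLF4J's MDC.
    private static final ThreadLocal<Map<String, String>> CONTEXT =
            ThreadLocal.withInitial(HashMap::new);

    /** Hypothetical hook: called when a segment starts on this thread. */
    static void onBeginSegment(String traceId) {
        CONTEXT.get().put(TRACE_ID_KEY, TRACE_ID_KEY + ": " + traceId);
    }

    /** Hypothetical hook: called when the segment ends, so later logs carry no ID. */
    static void onEndSegment() {
        CONTEXT.get().remove(TRACE_ID_KEY);
    }

    /** Mimics a log layout that prepends the context value when one is present. */
    static String format(String message) {
        String trace = CONTEXT.get().getOrDefault(TRACE_ID_KEY, "");
        return trace.isEmpty() ? message : trace + " " + message;
    }

    public static void main(String[] args) {
        System.out.println(format("starting up"));       // no trace is open yet
        onBeginSegment("1-5dd2a6c4-0123456789abcdef01234567");
        System.out.println(format("handling request"));  // carries the trace ID
        onEndSegment();
        System.out.println(format("idle"));               // trace is closed again
    }
}
```

With a real logging framework the context value would typically be pulled into each line by the layout pattern rather than by a `format` helper like the one above.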

Log Group Correlation in Traces

AWS X-Ray uses the CloudWatch Agent to automatically include the log group(s) that your application writes to in traces. The log groups will be specific to the environment they originated from and viewable in the “Raw Data” tab of trace view on the AWS X-Ray console. This feature requires the correct service plugin to be attached to your X-Ray recorder. Currently, only native EC2 instances with the CloudWatch agent enabled and EKS clusters using Container Insights will automatically be able to reflect their log groups in your traces.
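The commit messages below describe plugins that report log references, an `isEnabled()` check, and exception handling around log group discovery. The following sketch shows how those pieces could fit together. The `Plugin` interface here is a simplified, hypothetical stand-in and not the SDK's actual interface; the method name `logGroups()`, the helper `fixedPlugin`, and the log group names are invented for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Illustrative sketch only; the interface below is NOT the SDK's Plugin interface.
final class LogReferenceSketch {
    /** Hypothetical, simplified plugin contract. */
    interface Plugin {
        boolean isEnabled();     // cheap check: does this plugin apply to the current environment?
        Set<String> logGroups(); // log groups discovered for this environment
    }

    /** Collects log groups from enabled plugins, ignoring discovery failures. */
    static List<String> collectLogGroups(List<Plugin> plugins) {
        Set<String> groups = new LinkedHashSet<>(); // de-duplicates, keeps insertion order
        for (Plugin plugin : plugins) {
            if (!plugin.isEnabled()) {
                continue; // skip plugins that do not match the environment
            }
            try {
                groups.addAll(plugin.logGroups());
            } catch (RuntimeException e) {
                // In this sketch a failed discovery is skipped so tracing can continue.
            }
        }
        return new ArrayList<>(groups);
    }

    /** Test helper: a plugin with a fixed answer (hypothetical). */
    static Plugin fixedPlugin(boolean enabled, String... groups) {
        Set<String> fixed = new LinkedHashSet<>(Arrays.asList(groups));
        return new Plugin() {
            @Override public boolean isEnabled() { return enabled; }
            @Override public Set<String> logGroups() { return fixed; }
        };
    }

    public static void main(String[] args) {
        List<String> groups = collectLogGroups(Arrays.asList(
                fixedPlugin(true, "/example/app"),
                fixedPlugin(false, "/example/ignored")));
        System.out.println(groups);
    }
}
```

The collected log groups would then be attached to each trace's data, which is what makes them visible alongside the trace in the console.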


By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

c1tadel and others added 10 commits November 14, 2019 14:22
* Create AWSLogReference type
* Changed Plugin interface to include log references
* Update RecorderBuilder to use these plugins
* Create EKS plugin w/ support for Container Insights logging
* Add list of log references to Recorder
* Publish log references to metadata
* Entity and Plugin testing

Signed-off-by: William Armiros <armiros@amazon.com>
* Added SegmentListener interface
* Updated recorderBuilder to include listeners
* Inserted calls to listener during segment lifecycle events
* Created SLF4J logging module
* Implemented SegmentListener for SLF4J
* Added unit test and javadocs for new entities

Signed-off-by: William Armiros <armiros@amazon.com>
Signed-off-by: William Armiros <armiros@amazon.com>
* Create a Metrics Formatter which generates structured log messages based on Segments
* Implement the EMF format for CloudWatch logs
* Emit EMF over stdout for use with Lambda
* Update listener plugin to capture segments after they are finalized
* Update existing listeners to use new method names

Signed-off-by: William Armiros <armiros@amazon.com>
* Implemented populateLogReferences in EC2
* added logic to detect CW agent on EC2 in Windows or Linux environments
* Documented changes

Signed-off-by: William Armiros <armiros@amazon.com>
* Add a UDPMetricEmitter
* Select stdout emitter vs UDP based on AWS_EXECUTION_ENV
* Allow configuration of UDP emitter via system properties and environment variables
* Modify dummy segment to emit metrics even when segments aren’t sampled
* Update Recorder to set DummySegment name
* Update servlet filter to set DummySegment name
* Add debug logging around Dummy Segment
* Add debug logging around metrics formatting

Signed-off-by: William Armiros <armiros@amazon.com>
* Added isEnabled() to plugin interface
* Gave each plugin a cheap, reliable implementation of isEnabled()
* Changed recorder builder to use enabled logic
* Added origin resolution order and supporting documentation
* Added withDefaultPlugins method

Signed-off-by: William Armiros <armiros@amazon.com>
* Added full container ID for containerized plugins

Signed-off-by: William Armiros <armiros@amazon.com>
* Update namespace for metrics
* Move metrics to separate Maven component
* Add 5s timeout to ContainerInsights HTTP client
* Wrap log group and context discovery in exception handling
* Fix Javadoc formatting

Signed-off-by: William Armiros <armiros@amazon.com>
* Fix resource leak in ContainerInsightsUtil
* Update version in POMs
* Javadoc editing

Signed-off-by: William Armiros <armiros@amazon.com>

c1tadel commented Nov 14, 2019

@willarmiros I've added @chanchiem for review since we've been working together to author this.

@willarmiros willarmiros merged commit c183a3a into aws:master Nov 20, 2019
@mouse256

Wondering, why is the MDC key also prepended to the value in the logger? Seems strange to me:
MDC.put(TRACE_ID_KEY, TRACE_ID_KEY + ": " + segment.getTraceId().toString());

@willarmiros

Hi @mouse256,
This decision was made to easily query all logging messages that contain a trace ID. Since it is possible that logging events occurred while no segment was active, we wanted to provide a way to easily separate those log events from the ones that do include a trace ID.
