Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make sure AWS SDK spans suppress client spans all the time. #1837

Merged
merged 6 commits into from Dec 7, 2020

Conversation

anuraaga
Copy link
Contributor

@anuraaga anuraaga commented Dec 6, 2020

Also explicitly populate aws header instead of global propagators.

A lot to process here.

Currently, AWS SDK instrumentation always breaks the signature because transport spans aren't suppressed.

When separating out library / javaagent instrumentation ages ago, I was still very green in this repo and don't think I understood the point of the testImplementation dependencies on downstream instrumentation and didn't port them over. So for a while we haven't verified suppression of transport spans.

Sync: This was working in 0.10.0 and stopped working in 0.11.0. I haven't dug into why but presume some refactoring around client tracers is related. We would have caught the issue if I hadn't broken the tests in the first place.

Async: I suspect this has been broken for a long time, but none of our (e.g., aws-otel-java) integration tests use async clients yet. The docs from age-old code in the instrumentation seemed to indicate context is propagated to Netty properly, though extremely hackily, so I removed the even more hacky amz-sdk-invocation-id header check that we used to have to prevent signature breakage. But after adding the downstream instrumentation to the tests, I found that this wasn't the case. Using an IntelliJ debugger, I could confirm beforeTransmission is run on the application thread, not Netty thread, so the complicated instrumentation hack was having no effect (our executor instrumentation does not apply to netty event loops so I wouldn't expect those to ever have been helping here. Anyways, it's too deep to rely on IMO.). I went ahead and removed the hacky instrumentation and moved the scope management to the interceptor for use in synchronous clients only. For the netty spans, I restored the fun amz-sdk-invocation-id as no alternative seems clearly better.

I also went ahead and had the SDK explicitly use aws propagation instead of the global ones. Unlike lambda, there is no use case for any other header, probably not for the next several years really, so this will allow a user's app to be more efficient, they don't have to enable aws globally (meaning on their own ingestion and extraction), and still get the benefit it provides on AWS SDK calls. This makes the integration test confirming that headers haven't been added (breaking signature, the worst thing our instrumentation could ever do) more robust in the process.

public static class ScopeHolder {
public static final ThreadLocal<Scope> CURRENT = new ThreadLocal<>();
}

private final ExecutionInterceptor delegate;

public TracingExecutionInterceptor() {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think after #1643 I may try to remove this class, instead just service loading the shaded library instrumentation.

anuraaga and others added 2 commits December 7, 2020 15:38
…ntelemetry/javaagent/instrumentation/netty/v4_1/client/NettyHttpClientTracer.java

Co-authored-by: Trask Stalnaker <trask.stalnaker@gmail.com>
Comment on lines 73 to 80
// The AWS SDK uses Netty for asynchronous clients but constructs a request signature before
// beginning transport. This means we MUST suppress Netty spans we would normally create or
// they will inject their own trace header, which does not match what was present when the
// signature was computed, breaking the SDK request completely We have not found how to
// cleanly propagate context from the SDK instrumentation, which executes on an application
// thread, to Netty instrumentation, which executes on event loops. If it's possible, it may
// require instrumenting internal classes. Using a header which is more or less guaranteed to
// always exist is arguably more stable.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

OpenTelemetry.getGlobalPropagators()
.getTextMapPropagator()
.inject(otelContext, builder, AwsSdkInjectAdapter.INSTANCE);
AwsXRayPropagator.getInstance().inject(otelContext, builder, AwsSdkInjectAdapter.INSTANCE);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even if you are sure that AWS SDK calls are always terminal and that nobody uses some alternative implementations of AWS services (like local S3 server?), we at least have to document this behaviour very clearly.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got a bit of extra README out of it too :)

@anuraaga anuraaga merged commit c5a2576 into open-telemetry:master Dec 7, 2020
bhautikpip pushed a commit to bhautikpip/opentelemetry-java-instrumentation that referenced this pull request Dec 18, 2020
…emetry#1837)

* Make sure AWS SDK spans suppress client spans all the time.

* Make more consistent with other instrumentation

* Update instrumentation/netty/netty-4.1/javaagent/src/main/java/io/opentelemetry/javaagent/instrumentation/netty/v4_1/client/NettyHttpClientTracer.java

Co-authored-by: Trask Stalnaker <trask.stalnaker@gmail.com>

* more dragons

* Grammar

* README

Co-authored-by: Trask Stalnaker <trask.stalnaker@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants