Enable strict context check and fix some context issues. #2637

anuraaga · 2021-03-26T05:18:25Z

No description provided.

anuraaga · 2021-03-26T05:18:36Z

.../src/main/java/io/opentelemetry/instrumentation/awssdk/v2_2/TracingExecutionInterceptor.java

@@ -168,6 +164,10 @@ public void onExecutionFailure(
  }

  private void clearAttributes(ExecutionAttributes executionAttributes) {
+    Scope scope = executionAttributes.getAttribute(SCOPE_ATTRIBUTE);


I don't get it. Both why this change is significant and your reaction to it :D

It's a real context leak with big damage: if a sync AWS SDK request failed and a subsequent client request of some form was made, the new request would be parented to the AWS SDK request (well, it would usually be suppressed as a result)! Hooray for strict context checking.
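
To illustrate the leak described in this thread, here is a minimal sketch. It uses hypothetical stand-in types (`ScopeLeakSketch`, `MiniScope`, a `ThreadLocal<String>` as the "context") rather than the real OpenTelemetry `Context` and `Scope` classes, and the request failure is simulated. It only shows the general shape of the problem: a scope opened before a request and never closed on the failure path leaves the request's context current on the thread.

```java
// Hypothetical stand-ins for a thread-local context and its scope.
// This is not the OpenTelemetry API, only a model of the leak.
final class ScopeLeakSketch {
  private static final ThreadLocal<String> CURRENT = ThreadLocal.withInitial(() -> "root");

  /** Restores the previously current context when closed. */
  static final class MiniScope implements AutoCloseable {
    private final String previous;

    MiniScope(String previous) {
      this.previous = previous;
    }

    @Override
    public void close() {
      CURRENT.set(previous);
    }
  }

  static MiniScope makeCurrent(String context) {
    String previous = CURRENT.get();
    CURRENT.set(context);
    return new MiniScope(previous);
  }

  static String current() {
    return CURRENT.get();
  }

  /**
   * Simulates a failing sync request. Returns the context that is current on the
   * thread afterwards, i.e. the parent a subsequent request would pick up.
   */
  static String runFailingRequest(boolean closeScopeOnFailure) {
    CURRENT.set("root");
    MiniScope scope = makeCurrent("aws-sdk-span");
    try {
      throw new IllegalStateException("simulated request failure");
    } catch (IllegalStateException e) {
      if (closeScopeOnFailure) {
        scope.close(); // the fix: also close the scope on the failure path
      }
    }
    return current();
  }

  public static void main(String[] args) {
    System.out.println(runFailingRequest(false)); // prints "aws-sdk-span" (leaked)
    System.out.println(runFailingRequest(true)); // prints "root"
  }
}
```

With the scope left open, the next client call on the same thread sees `aws-sdk-span` as its parent, which matches the symptom described above.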

anuraaga · 2021-03-26T05:19:13Z

...y/src/main/java/io/opentelemetry/instrumentation/rocketmq/TracingConsumeMessageHookImpl.java

-    Context traceContext = tracer.startSpan(Context.current(), context.getMsgList());
-    ContextAndScope contextAndScope = new ContextAndScope(traceContext, traceContext.makeCurrent());
-    context.setMqTraceContext(contextAndScope);
+    Context otelContext = tracer.startSpan(Context.current(), context.getMsgList());


As with similar netty-based libraries, there's no guarantee that the before and after hooks run on the same thread. Since there's no downstream instrumentation, it's not a big deal anyway.

What about e.g. user-provided message hooks that want to augment existing span?

We will need to provide something like RocketMqTracing.getOpenTelemetryContext(ConsumeMessageContext) to let them extract it, I guess. In the meantime, any code using Context.current() in consumeMessageAfter would generally be buggy anyway: because of context leakage, Context.current() may not actually correspond to the hook's message. So this fix is important first.

I hope such hooks will be able to use our instrumentation-api instead of that though.
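
A minimal sketch of the idea in this thread, under stated assumptions: the tracing state is stored on the per-message hook object instead of being made current on the thread, so a hook that runs on a different thread can still find it. `MessageContextSketch` and its accessors are hypothetical stand-ins, not the RocketMQ `ConsumeMessageContext` API, and a plain `String` stands in for the OpenTelemetry `Context`.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

final class HookContextSketch {
  /** Hypothetical stand-in for a per-message hook context with a slot for tracing state. */
  static final class MessageContextSketch {
    private volatile String traceContext;

    void setTraceContext(String traceContext) {
      this.traceContext = traceContext;
    }

    String getTraceContext() {
      return traceContext;
    }
  }

  /** Before hook: start the "span" and attach it to the message, not to the thread. */
  static void consumeMessageBefore(MessageContextSketch message, String spanName) {
    message.setTraceContext("span:" + spanName);
  }

  /** After hook: read the state back from the message, whatever thread this runs on. */
  static String consumeMessageAfter(MessageContextSketch message) {
    return message.getTraceContext();
  }

  /** Runs the two hooks on two different threads and returns what the after hook saw. */
  static String runAcrossThreads() {
    MessageContextSketch message = new MessageContextSketch();
    ExecutorService beforeThread = Executors.newSingleThreadExecutor();
    ExecutorService afterThread = Executors.newSingleThreadExecutor();
    try {
      Runnable before = () -> consumeMessageBefore(message, "process");
      beforeThread.submit(before).get();
      Callable<String> after = () -> consumeMessageAfter(message);
      return afterThread.submit(after).get();
    } catch (InterruptedException | ExecutionException e) {
      throw new IllegalStateException(e);
    } finally {
      beforeThread.shutdown();
      afterThread.shutdown();
    }
  }

  public static void main(String[] args) {
    System.out.println(runAcrossThreads()); // prints "span:process"
  }
}
```

A user-provided hook that wants to augment the span would need an accessor of this kind, since nothing is made current on the thread.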

anuraaga · 2021-03-26T05:49:52Z

.../io/opentelemetry/javaagent/instrumentation/api/concurrent/ExecutorInstrumentationUtils.java

+      return false;
+    }
+
+    // Wrapper of tasks for dispatch - the wrapped task should have context already and this doesn't


No idea what I'm talking about! But seems like a reasonable hypothesis for now, how akka flakes.

trask · 2021-03-26T05:41:16Z

instrumentation/akka-http-10.0/javaagent/akka-http-10.0-javaagent.gradle

@@ -62,3 +62,7 @@ compileVersion101TestGroovy {
  classpath = classpath.plus(files(compileVersion101TestScala.destinationDir))
  dependsOn compileVersion101TestScala
 }
+
+tasks.withType(Test) {
+  jvmArgs "-Dio.opentelemetry.javaagent.shaded.io.opentelemetry.context.enableStrictContext=false"


nice to see where this fails

trask · 2021-03-26T05:43:31Z

.../src/main/java/io/opentelemetry/instrumentation/awssdk/v2_2/TracingExecutionInterceptor.java

@@ -168,6 +164,10 @@ public void onExecutionFailure(
  }

  private void clearAttributes(ExecutionAttributes executionAttributes) {
+    Scope scope = executionAttributes.getAttribute(SCOPE_ATTRIBUTE);


trask · 2021-03-26T05:52:25Z

.../io/opentelemetry/javaagent/instrumentation/api/concurrent/ExecutorInstrumentationUtils.java

@@ -14,6 +14,43 @@
 /** Utils for concurrent instrumentations. */
 public class ExecutorInstrumentationUtils {

+  private static final ClassValue<Boolean> NOT_INSTRUMENTED_RUNNABLE_ENCLOSING_CLASS =


nice, I'm trying to remember to look out for opportunities to use ClassValue too 😄
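
For readers unfamiliar with `java.lang.ClassValue`, a minimal sketch of the caching pattern follows. The class names (`ClassValueSketch`, `JettyLikeEndpoint`, `OrdinaryService`) and the exclusion rule are illustrative assumptions, not the code in this PR. The sketch only shows how a per-class decision is computed once and then looked up cheaply.

```java
import java.util.concurrent.atomic.AtomicInteger;

final class ClassValueSketch {
  /** Counts how often the per-class decision is actually computed. */
  static final AtomicInteger COMPUTATIONS = new AtomicInteger();

  // Caches, per enclosing class, whether tasks created inside it should be skipped.
  private static final ClassValue<Boolean> EXCLUDED_ENCLOSING_CLASS =
      new ClassValue<Boolean>() {
        @Override
        protected Boolean computeValue(Class<?> enclosingClass) {
          COMPUTATIONS.incrementAndGet();
          // Illustrative rule only: match on the class name, as one would for a
          // library class that is not available at compile time.
          return enclosingClass.getName().endsWith("$JettyLikeEndpoint");
        }
      };

  /** Hypothetical class whose tasks should not carry the caller's context. */
  static final class JettyLikeEndpoint {
    Runnable task() {
      return new Runnable() {
        @Override
        public void run() {}
      };
    }
  }

  /** Hypothetical class whose tasks should carry the caller's context. */
  static final class OrdinaryService {
    Runnable task() {
      return new Runnable() {
        @Override
        public void run() {}
      };
    }
  }

  static boolean shouldPropagate(Runnable task) {
    Class<?> enclosing = task.getClass().getEnclosingClass();
    return enclosing == null || !EXCLUDED_ENCLOSING_CLASS.get(enclosing);
  }

  public static void main(String[] args) {
    System.out.println(shouldPropagate(new JettyLikeEndpoint().task()));
    System.out.println(shouldPropagate(new OrdinaryService().task()));
    System.out.println(COMPUTATIONS.get());
  }
}
```

Note that lambdas report no enclosing class, so this sketch uses anonymous classes to make the lookup meaningful.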

anuraaga · 2021-03-26T06:42:51Z

Sorry for the churn, couldn't help but fix some of the leaks. The remaining will take some deeper dives so I think this is it for now

iNikem · 2021-03-26T06:43:50Z

.../src/main/java/io/opentelemetry/instrumentation/awssdk/v2_2/TracingExecutionInterceptor.java

@@ -168,6 +164,10 @@ public void onExecutionFailure(
  }

  private void clearAttributes(ExecutionAttributes executionAttributes) {
+    Scope scope = executionAttributes.getAttribute(SCOPE_ATTRIBUTE);


I don't get it. Both why this change is significant and your reaction to it :D

iNikem · 2021-03-26T06:45:36Z

...y/src/main/java/io/opentelemetry/instrumentation/rocketmq/TracingConsumeMessageHookImpl.java

-    Context traceContext = tracer.startSpan(Context.current(), context.getMsgList());
-    ContextAndScope contextAndScope = new ContextAndScope(traceContext, traceContext.makeCurrent());
-    context.setMqTraceContext(contextAndScope);
+    Context otelContext = tracer.startSpan(Context.current(), context.getMsgList());


What about e.g. user-provided message hooks that want to augment existing span?

anuraaga · 2021-03-26T08:30:50Z

Unfortunately not able to repro CI failures well but it's nice that it finds them. As far as I can tell there might be something fundamentally wrong with our netty instrumentation, will examine separately first.

mateuszrzeszutek · 2021-03-26T09:12:55Z

...on/src/main/groovy/io/opentelemetry/instrumentation/test/InstrumentationSpecification.groovy

@@ -36,6 +37,13 @@ abstract class InstrumentationSpecification extends Specification {
    testRunner().clearAllExportedData()
  }

+  def cleanup() {


Hmm, perhaps we should consider adding an afterTest() method to InstrumentationTestRunner and calling it here and in InstrumentationExtension?

Thanks for pointing out InstrumentationExtension - adding afterTest seemed a bit weird since it's the same code for agent and library. For now I just copied the code since it isn't so much and I think we can revisit later.

mateuszrzeszutek · 2021-03-26T09:15:57Z

...y/src/main/java/io/opentelemetry/instrumentation/rocketmq/TracingConsumeMessageHookImpl.java

@@ -27,20 +27,18 @@ public void consumeMessageBefore(ConsumeMessageContext context) {
    if (context == null || context.getMsgList() == null || context.getMsgList().isEmpty()) {
      return;
    }
-    Context traceContext = tracer.startSpan(Context.current(), context.getMsgList());
-    ContextAndScope contextAndScope = new ContextAndScope(traceContext, traceContext.makeCurrent());


Can ContextAndScope be removed now?

Yup, thanks

mateuszrzeszutek · 2021-03-26T09:26:14Z

.../io/opentelemetry/javaagent/instrumentation/api/concurrent/ExecutorInstrumentationUtils.java

+        protected Boolean computeValue(Class<?> enclosingClass) {
+          // Avoid context leak on jetty. Runnable submitted from SelectChannelEndPoint is used to
+          // process a new request which should not have context from them current request.
+          if (enclosingClass.getName().equals("org.eclipse.jetty.io.nio.SelectChannelEndPoint")) {


Just an idea, but maybe those class names should be specified by instrumentation modules? Each instrumented library may bring its own set of excluded classes (also see AdditionalLibraryIgnoresMatcher) and maybe it'd be better to define them in the lib instrumentation.

Yeah! That's the approach Datadog has implemented, it would be nice for us to follow it at some point.

https://github.com/DataDog/dd-trace-java/blob/master/dd-java-agent/agent-tooling/src/main/java/datadog/trace/agent/tooling/ExcludeFilterProvider.java

Nice - I originally thought about adding some void excludeClasses(ExcludeBuilder) method to InstrumentationModule
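
A rough sketch of what such a module-level hook might look like. Everything here is hypothetical: `ExcludeBuilder`, `ModuleSketch`, and `excludeClasses` are names invented for illustration following the suggestion above, and they are not modeled on Datadog's `ExcludeFilterProvider` or on any existing OpenTelemetry API.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class ExcludeSketch {
  /** Hypothetical builder that collects class names to skip for context propagation. */
  static final class ExcludeBuilder {
    private final Set<String> classNames = new HashSet<>();

    ExcludeBuilder exclude(String className) {
      classNames.add(className);
      return this;
    }

    Set<String> build() {
      return Collections.unmodifiableSet(new HashSet<>(classNames));
    }
  }

  /** Hypothetical module interface; by default a module excludes nothing. */
  interface ModuleSketch {
    default void excludeClasses(ExcludeBuilder builder) {}
  }

  /** A module that knows about its own library's problematic task classes. */
  static final class JettyModuleSketch implements ModuleSketch {
    @Override
    public void excludeClasses(ExcludeBuilder builder) {
      builder.exclude("org.eclipse.jetty.io.nio.SelectChannelEndPoint");
    }
  }

  /** A module with nothing to exclude. */
  static final class OtherModuleSketch implements ModuleSketch {}

  /** Central code asks every module for its exclusions instead of hard-coding them. */
  static Set<String> collect(List<ModuleSketch> modules) {
    ExcludeBuilder builder = new ExcludeBuilder();
    for (ModuleSketch module : modules) {
      module.excludeClasses(builder);
    }
    return builder.build();
  }

  static Set<String> collectFromExampleModules() {
    List<ModuleSketch> modules = Arrays.asList(new JettyModuleSketch(), new OtherModuleSketch());
    return collect(modules);
  }

  public static void main(String[] args) {
    System.out.println(collectFromExampleModules());
  }
}
```

The design point is only that the class name lives next to the library instrumentation that understands it, while the shared executor utility consumes the merged set.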


jkwatson · 2021-03-28T00:50:34Z

instrumentation/ratpack-1.4/javaagent/ratpack-1.4-javaagent.gradle

@@ -20,3 +20,7 @@ dependencies {
    testImplementation group: 'com.sun.activation', name: 'jakarta.activation', version: '1.2.2'
  }
 }
+
+tasks.withType(Test) {
+  jvmArgs "-Dio.opentelemetry.javaagent.shaded.io.opentelemetry.context.enableStrictContext=false"


does this mean that our ratpack instrumentation actually leaks context, or is it an artifact of something else? If we are really leaking contexts, should we create issues to fix all of these, so we make sure to circle back and fix them?

Good point, filed an issue

Enable strict context check and fix some context issues.

813fd73

anuraaga requested review from iNikem, jkwatson, mateuszrzeszutek, pavolloffay, trask and tylerbenson as code owners March 26, 2021 05:18

Drift

8a6fc39

anuraaga commented Mar 26, 2021


Anuraag Agrawal added 2 commits March 26, 2021 14:23

Drift and cache

55aa488

Exclude grizzly include akka

b97c671

anuraaga commented Mar 26, 2021


trask approved these changes Mar 26, 2021


Anuraag Agrawal added 4 commits March 26, 2021 15:08

Grizzly, scala

be61373

ForkJoin worker

b2232ce

webflux comment, grizzly typo

ecaa951

Give up on akka for now

e49eef7

iNikem approved these changes Mar 26, 2021


Anuraag Agrawal added 2 commits March 26, 2021 15:59

threadpool

9d5adcd

Fallback on grizzly, wait for completion in executor cancellation tests

be8bfe4

mateuszrzeszutek approved these changes Mar 26, 2021


Anuraag Agrawal added 3 commits March 27, 2021 17:00

Merge branch 'main' of github.com:open-telemetry/opentelemetry-java-i…

33a9fe6

…nstrumentation into strict-context

Hystrix

2d32b4e

ratpack

dd99146

jkwatson reviewed Mar 28, 2021


Cleanups

b4b89ef

iNikem merged commit dcd316d into open-telemetry:main Mar 29, 2021

trask mentioned this pull request Apr 3, 2021

Enable strict context check in tests #2709

Closed
