Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Handle lazily/eagerly started coroutines differently #3

Conversation

monosoul
Copy link
Owner

@monosoul monosoul commented Jan 8, 2023

This changes the way we capture active scope.

  • For eagerly started coroutines the scope will be captured on instantiation
  • For lazily started coroutines the scope will be captured on start

This doesn't yet include changes done to ScopeState here: 542e6a8

But I'll merge that if you think this approach is fine.

@bantonsson

@monosoul monosoul mentioned this pull request Jan 8, 2023
@bantonsson
Copy link

I'm not completely sure about if the difference in behavior will be confusing or if this is more correct.

@monosoul
Copy link
Owner Author

monosoul commented Jan 16, 2023

@bantonsson Tbh, I think with the fact that there is no guarantee a lazily started coroutine will ever be started/cancelled there isn't much we can do about it. Or rather, there might be solutions, but they are gonna be complicated. I personally would rather have a predictable behavior that I might not be very happy about than a behavior that works in some cases and doesn't work in others. And that could be delivered sooner 🙂
In my 4 years of writing backend apps in Kotlin I have never used lazily started coroutines tbh, maybe this use case is more relevant for Android apps, where they won't gonna use a java agent anyway.
So I suggest to go with this approach as an MVP and see if the users are happy with it. If they don't then we might think of a way to try to make the behavior consistent across different start types.

Upd: just read your message that it's okay to capture a span and never start the coroutine. So I guess this can be disregarded. 🙂

Copy link

@bantonsson bantonsson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the great work. Only some minor comments.

implements Instrumenter.ForTypeHierarchy {

public AbstractCoroutineInstrumentation() {
super("kotlin-abstract-coroutine");

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having completely different names for all the different parts of the kotlin_coroutine instrumentation is only needed if you want to enable/disable the individual parts by themselves, which I don't think we want here. Since we're not completely sure about how this works in real applications yet, we should probably name it kotlin_coroutine.experimental, and also override the defaultEnabled() method in the instrumentation classes and return false, so it is an opt-in for now.

Copy link
Owner Author

@monosoul monosoul Jan 26, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done: d3962bc , d3c9c57

@Nullable private ContinuationHandler continuationHandler;
@Nullable private AgentScope.Continuation continuation;
@Nullable private AgentScope continuationScope;
private Boolean isInitialized = false;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nitpick, this will be a Boolean object instead of a boolean.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for noticing, done: a0fb62f

private final ScopeState coroutineScopeState;
@Nullable private ContinuationHandler continuationHandler;
@Nullable private AgentScope.Continuation continuation;
@Nullable private AgentScope continuationScope;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that these variables can still be accessed concurrently by multiple threads and should be using AtomicReference

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, you're right, I'll change that

Copy link
Owner Author

@monosoul monosoul Jan 26, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bantonsson
Altho, now that I went through the code again and through thread context element docs (mostly the comments here), I really don't see how there could be any concurrent writes there.

First call to updateThreadContext will always be "synchronous" it can't happen in parallel with restoreThreadContext. We initialize those fields on the first call only, later on there are only reads.

Then there's also coroutineScopeState, but we don't access it at all on call to restoreThreadContext, instead we write the ScopeStack to the thread local variable using ScopeState.

The only possible reason why we can have concurrent calls to updateThreadContext/restoreThreadContext is if there's a bug somewhere else. And there was one! 🙂

When you call withTimeout internally it creates a new coroutine of type TimoutCoroutine and it turns out it's handled a bit differently than others. First of all when it is created, there's no invocation of CoroutineContextKt.newCoroutineContext (so we weren't creating an instance of ScopeStateCoroutineContext for it and it was inheriting this item from the parent context, probably this is what was causing concurrent access to the context element). Second problem was that it is handled a bit differently from others, in a way that it's not guaranteed to start, in some cases it just runs the code block without dispatch. In such cases the on completion callback won't run. Luckily, there's another method available in JobSupport class that we can instrument to guarantee on completion callback execution even when TimeoutCoroutine hasn't been dispatched - onCompletionInternal.

So here 27e144e I did a few changes:

  • ScopeStateCoroutineContext is now created on invocation of AbstractCoroutine constructor instead of CoroutineContextKt.newCoroutineContext, this way we guarantee every coroutine started will have a new instance of ScopeStateCoroutineContext and will not inherit it from the parent coroutine, so we shouldn't have any concurrent access to ScopeStateCoroutineContext instance.
  • maybeCloseScopeAndCancelContinuation is now guaranteed to be invoked when coroutine transitions to a terminal state (Cancelled or Completed)
  • also removed the optional scope propagation I added here (48922a7) before I saw your comment 🙂

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice catch with the TimeoutCoroutine and newCoroutineContext. That explains why multiple threads were accessing the same instance.

if (!isInitialized) {
final AgentScope activeScope = AgentTracer.get().activeScope();
if (activeScope != null) {
activeScope.setAsyncPropagation(true);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, we shouldn't force this on the active scope, but only capture it iff it isAsyncPropagating().

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we should make it an option? I.e. those who'd like to propagate scopes to coroutines by default will just enable this option, while others might have a more granular control over it?

Copy link

@bantonsson bantonsson Jan 26, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@monosoul Automatic async propagation is already the default. If the active scope has its async propagation set to false it is (hopefully) for a good reason. The other big switch will be the kotlin_coroutine.experimental name that enables/disables the integration completely, and it becomes available as dd.integration.kotlin_coroutine.experimental.enabled and DD_INTEGRATION_KOTLIN_COROUTINE_EXPERIMENTAL_ENABLED.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ahh, I see. Okay, cool, thanks for the explanation! 👍

Signed-off-by: monosoul <Kloz.Klaud@gmail.com>
Signed-off-by: monosoul <Kloz.Klaud@gmail.com>
Signed-off-by: monosoul <Kloz.Klaud@gmail.com>
Signed-off-by: monosoul <Kloz.Klaud@gmail.com>
…en coroutine completes

Signed-off-by: monosoul <Kloz.Klaud@gmail.com>
Signed-off-by: monosoul <Kloz.Klaud@gmail.com>
@monosoul monosoul merged commit 2f062e6 into feature/kotlin-coroutines-instrumentation Jan 26, 2023
@monosoul monosoul deleted the feature/kotlin-coroutines-instrumentation-2 branch December 18, 2023 12:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants