Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Coroutines will not nest properly past one level when using the global hub #1822

Closed
7 tasks done
acdcjunior opened this issue Nov 30, 2021 · 9 comments
Closed
7 tasks done
Labels
enhancement New feature or request performance Performance API issues Platform: Java ThreadLocalHubProblem Issues caused by Hub being ThreadLocal and no or no easy way to clone the hub

Comments

@acdcjunior
Copy link

Platform:

  • Java -> 11/11
  • Kotlin -> 1.5.21 (target jdk 11)

IDE:

  • IntelliJ -> 2021.2.2

Build system:

  • Gradle -> 6.8.3

Android Gradle Plugin:

  • No

Sentry Android Gradle Plugin:

  • No

Platform installed with:

  • Maven Central

The version of the SDK:
5.4.2


I have the following issue:

Coroutines don't nest properly.

From what I could see in the debugger, any coroutine (even when using SentryContext(), which means it has its own Hub) when creating a Span creates in the main transaction, therefore the nesting makes no sense.

It seems this issue happens because structure used to track spans is flat and not a tree.

Steps to reproduce:

import io.sentry.ITransaction
import io.sentry.Sentry
import io.sentry.kotlin.SentryContext
import kotlinx.coroutines.delay
import kotlinx.coroutines.launch
import kotlinx.coroutines.runBlocking


fun main() {
    Sentry.init { options ->
        options.dsn = "https://5f54809ec38146f19f0eb3e49d1016d4@o533347.ingest.sentry.io/6088339"
        options.tracesSampleRate = 1.0
        options.setDebug(true)
    }

    val transaction: ITransaction = Sentry.getCurrentHub().startTransaction("my-coroutines-transaction5", "outer op")
    Sentry.configureScope { scope ->
        scope.transaction = transaction
    }

    runBlocking(SentryContext()) {
        launch(SentryContext()) {
            val level1a = Sentry.getCurrentHub().span!!.startChild("level-1a")
            launch(SentryContext()) {
                val level1a1 = Sentry.getCurrentHub().span!!.startChild("level-1a-1")
                delay(250)
                level1a1.finish()
            }
            launch(SentryContext()) {
                val level1a2 = Sentry.getCurrentHub().span!!.startChild("level-1a-2")
                delay(200)
                level1a2.finish()
            }

            level1a.finish()
        }
        launch(SentryContext()) {
            val level1b = Sentry.getCurrentHub().span!!.startChild("level-1b")

            launch(SentryContext()) {
                val level1b1 = Sentry.getCurrentHub().span!!.startChild("level-1b-1")
                delay(100)
                level1b1.finish()
            }
            launch(SentryContext()) {
                val level1b2 = Sentry.getCurrentHub().span!!.startChild("level-1b-2")
                delay(50)
                level1b2.finish()
            }

            level1b.finish()
        }
    }
    transaction.finish()

}

Actual result:
image

Expected result:

  • I would expect a better nesting in the left-side tree.

I faced this issue when configuring sentry-openfeign. In my code, there are some coroutines that call feign clients (which are run wrapped in launch), this all happens in parallel. This makes the nesting get all mixed up. It's a spring-boot app.

Perhaps you could argue I shouldn't be getting the span from the "global hub" (Sentry.getCurrentHub().span), but notice the snippet above is just a minimal reproduction of the actual case. In the actual case, the issue arises when SentryFeignClient creates mixed nested entries. Therefore the snippet aboves use the same method to obtain the current span as SentryFeignClient.

The problem seems that SentryContext() clones the Hub, but all cloned hubs have the same transaction, thus when one coroutine adds a span, if other coroutine (which is running in parallel) attempts to add a new span, it is added to the last active span, and not the span of when the hub was cloned (or other level nested to such span).

@marandaneto
Copy link
Contributor

cc @maciejwalkowiak

@maciejwalkowiak
Copy link
Contributor

This is a limitation coming from the design that there can be only single active span bound to the scope. In your example, you would need to be explicit on what is the parent span for new child spans:

fun main() {
    Sentry.init { options ->
        options.dsn = "https://5f54809ec38146f19f0eb3e49d1016d4@o533347.ingest.sentry.io/6088339"
        options.tracesSampleRate = 1.0
        options.isDebug = true
    }

    val transaction: ITransaction = Sentry.getCurrentHub().startTransaction("my-coroutines-transaction5", "outer op")

    runBlocking(SentryContext()) {
        launch(SentryContext()) {
            val level1a = transaction.startChild("level-1a")
            launch(SentryContext()) {
                val level1a1 = level1a.startChild("level-1a-1")
                delay(250)
                level1a1.finish()
            }
            launch(SentryContext()) {
                val level1a2 = level1a.startChild("level-1a-2")
                delay(200)
                level1a2.finish()
            }

            level1a.finish()
        }
        launch(SentryContext()) {
            val level1b = transaction.startChild("level-1b")

            launch(SentryContext()) {
                val level1b1 = level1b.startChild("level-1b-1")
                delay(100)
                level1b1.finish()
            }
            launch(SentryContext()) {
                val level1b2 = level1b.startChild("level-1b-2")
                delay(50)
                level1b2.finish()
            }

            level1b.finish()
        }
    }
    transaction.finish()

}

@acdcjunior
Copy link
Author

acdcjunior commented Dec 1, 2021

@maciejwalkowiak

Hey, Maciej, thanks for the feedback.

Yes, if I did it manually, that's what I would have done. But this scenario I'm facing is actually taking place within sentry's Feign client implementation, see my comment:

Perhaps you could argue I shouldn't be getting the span from the "global hub" (Sentry.getCurrentHub().span), but notice the snippet above is just a minimal reproduction of the actual case. In the actual case, the issue arises when SentryFeignClient creates mixed nested entries. Therefore the snippet aboves use the same method to obtain the current span as SentryFeignClient.

I'm invoking the feign client from a coroutine. SentryContext() clones the Hub, but it drags the same transaction which is used in all coroutines, meaning when one coroutine creates a span (SentryFeignClient is the one creating the spans) it gets mixed with all other coroutines.

@acdcjunior
Copy link
Author

acdcjunior commented Dec 2, 2021

In other words, I can't do the suggested change because the startChild is actually invoked by SentryFeignClient, not me, so it is as if my code was:

fun main() {
    Sentry.init { options ->
        //...
    }

    val transaction: ITransaction = Sentry.getCurrentHub().startTransaction("my-coroutines-transaction5", "outer op")

    runBlocking(SentryContext()) {
        launch(SentryContext()) {
            launch(SentryContext()) {
                feignClientA.someMethod() // starts child span internally
            }
            launch(SentryContext()) {
                feignClientB.someMethod() // starts child span internally
            }
        }
        launch(SentryContext()) {
            launch(SentryContext()) {
                feignClientC.someMethod() // starts child span internally
            }
            launch(SentryContext()) {
                feignClientD.someMethod() // starts child span internally
            }
        }
    }
    transaction.finish()
}

All of this in a spring-boot app.

@maciejwalkowiak
Copy link
Contributor

@acdcjunior got it. You're right, thats the limitation coming from having only single active span bound to the scope at the same time. To get it working, Sentry.getSpan would need to be contextual and be able to return different span depending on the thread/coroutine - there is some design work that would need to be done.

I think there's currently no way to workaround it, perhaps @marandaneto has some ideas.

@maciejwalkowiak maciejwalkowiak added enhancement New feature or request and removed waiting for feedback labels Dec 2, 2021
@marandaneto
Copy link
Contributor

I'll discuss internally but I second @maciejwalkowiak , we're limited by the API design.
the hierarchy might be wrong but the timeline on the UI dashboard is correct since it's based on the timestamp.

@acdcjunior
Copy link
Author

Yeah, I actually got a problematic timeline as well (some calls being registered as taking 6ms when they actually took 100+ms), but the server I noticed that was running 21.2.0. Just changed to sentry.io, hopefully that gets fixed.

@adinauer
Copy link
Member

Similar to #2340

@adinauer adinauer added the ThreadLocalHubProblem Issues caused by Hub being ThreadLocal and no or no easy way to clone the hub label Feb 7, 2023
@romtsn
Copy link
Member

romtsn commented Oct 18, 2023

I believe this is resolved by #2838, please try it out and comment here if it's still doesn't work. Thank you!

@romtsn romtsn closed this as completed Oct 18, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request performance Performance API issues Platform: Java ThreadLocalHubProblem Issues caused by Hub being ThreadLocal and no or no easy way to clone the hub
Projects
Archived in project
Development

No branches or pull requests

7 participants