-
-
Notifications
You must be signed in to change notification settings - Fork 2.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Enrich the structure of traces emitted by Tokio #5792
Comments
This sounds reasonable to me. You should probably coordinate with @carllerche's scheduler refactor, which may change the same parts of the codebase. |
There are a number of cases in which being able to identify a runtime is useful. When instrumenting an application, this is particularly true. For example, we would like to be able to add traces for runtimes so that tasks can be differentiated (#5792). It would also allow a way to differentiate runtimes which are have their tasks dumped. Outside of instrumentation, it may be useful to check whether 2 runtime handles are pointing to the same runtime. This change adds an opaque `runtime::Id` struct which serves this purpose, initially behind the `tokio_unstable` cfg flag. It follows the same pattern as the `task::Id` struct. The Id can be compared for equality with another `runtime::Id` and implements `Debug` and `Display` so that it can be output as well. Internally the Id is a `u64`, but that is an implementation detail. There is a degree of code duplication, but that is necessary to ensure that the Ids are not used to compare against one another. The Id is added within the scope of working towards closing #5545.
There are a number of cases in which being able to identify a runtime is useful. When instrumenting an application, this is particularly true. For example, we would like to be able to add traces for runtimes so that tasks can be differentiated (#5792). It would also allow a way to differentiate runtimes which are have their tasks dumped. Outside of instrumentation, it may be useful to check whether 2 runtime handles are pointing to the same runtime. This change adds an opaque `runtime::Id` struct which serves this purpose, initially behind the `tokio_unstable` cfg flag. It follows the same pattern as the `task::Id` struct. The Id can be compared for equality with another `runtime::Id` and implements `Debug` and `Display` so that it can be output as well. Internally the Id is a `u64`, but that is an implementation detail. There is a degree of code duplication, but that is necessary to ensure that the Ids are not used to compare against one another. The Id is added within the scope of working towards closing #5545.
There are a number of cases in which being able to identify a runtime is useful. When instrumenting an application, this is particularly true. For example, we would like to be able to add traces for runtimes so that tasks can be differentiated (#5792). It would also allow a way to differentiate runtimes which are have their tasks dumped. Outside of instrumentation, it may be useful to check whether 2 runtime handles are pointing to the same runtime. This change adds an opaque `runtime::Id` struct which serves this purpose, initially behind the `tokio_unstable` cfg flag. It follows the same pattern as the `task::Id` struct. The Id can be compared for equality with another `runtime::Id` and implements `Debug` and `Display` so that it can be output as well. Internally the Id is a `u64`, but that is an implementation detail. There is a degree of code duplication, but that is necessary to ensure that the Ids are not used to compare against one another. The Id is added within the scope of working towards closing #5545.
There are a number of cases in which being able to identify a runtime is useful. When instrumenting an application, this is particularly true. For example, we would like to be able to add traces for runtimes so that tasks can be differentiated (#5792). It would also allow a way to differentiate runtimes which are have their tasks dumped. Outside of instrumentation, it may be useful to check whether 2 runtime handles are pointing to the same runtime. This change adds an opaque `runtime::Id` struct which serves this purpose, initially behind the `tokio_unstable` cfg flag. The inner value of the ID is taken from the `OwnedTasks` or `LocalOwnedTasks` struct which every runtime and local set already has. This will mean that any use of the ID will align with the task dump feature. The ID is added within the scope of working towards closing #5545.
In the Tokio instrumentation, a tracing span is created for each task which is spawned. Since the new span is created within the context of where `tokio::spawn()` (or similar) is called from, it gets a contextual parent attached. In tracing, when a span has a child span (either because the child was created in the context of the parent, or because the parent was set explicitly) then that span will not be closed until the child has closed. The result in the console subscriber is that a task which spawns another task won't have a `dropped_at` time set until the spawned task exits, even if the parent task exits much earlier. This causes Tokio Console to show an incorrect lost waker warning (#345). It also affects other spans that are entered when a task is spawned (#412). The solution is to modify the instrumentation in Tokio so that task spans are explicit roots (`parent: None`). This will be done as part of enriching the Tokio instrumentation (tokio-rs/tokio#5792). This change adds functionality to the test framework within `console-subscriber` so that the state of a task can be set as an expectation. The state is calculated based on 4 values: * `console_api::tasks::Stats::dropped_at` * `console_api::tasks::Stats::last_wake` * `console_api::PollStats::last_poll_started` * `console_api::PollStats::last_poll_ended` It can then be tested that a task that spawns another task and then ends actually goes to the `Completed` state, even if the spawned task is still running. As of Tokio 1.33.0, this test fails, but the PR FIXME:TBD fixes this and the test should pass from Tokio 1.34 onwards.
In Tokio, tasks are optionally instrumented with tracing spans to allow analysis of the runtime behavior to be performed with tools like tokio-console. The span that is created for each task gets currently follows the default tracing behavior and has a contextual parent attached to it based on the span that is actual when `tokio::spawn` or similar is called. However, in tracing, a span will remain "alive" until all its children spans are closed. This doesn't match how spawned tasks work. A task may outlive the context in which is was spawned (and frequently does). This causes tasks which spawn other - longer living - tasks to appear in `tokio-console` as having lost their waker when instead they should be shown as completed (tokio-rs/console#345). It can also cause undesired behavior for unrelated tracing spans if a subscriber is receiving both the other spans as well as Tokio's instrumentation. To fix this mismatch in behavior, the task span has `parent: None` set on it, making it an explicit root - it has no parent. The same was already done for all spans representing resources in #6107. This change is made within the scope of #5792. Due to a defect in the currently available `tracing-mock` crate, it is not possible to test this change at a tracing level (tokio-rs/tracing#2440). Instead, a test for the `console-subscriber` has been written which shows that this change fixes the defect as observed in `tokio-console` (tokio-rs/console#490).
In the Tokio instrumentation, a tracing span is created for each task which is spawned. Since the new span is created within the context of where `tokio::spawn()` (or similar) is called from, it gets a contextual parent attached. In tracing, when a span has a child span (either because the child was created in the context of the parent, or because the parent was set explicitly) then that span will not be closed until the child has closed. The result in the console subscriber is that a task which spawns another task won't have a `dropped_at` time set until the spawned task exits, even if the parent task exits much earlier. This causes Tokio Console to show an incorrect lost waker warning (#345). It also affects other spans that are entered when a task is spawned (#412). The solution is to modify the instrumentation in Tokio so that task spans are explicit roots (`parent: None`). This will be done as part of enriching the Tokio instrumentation (tokio-rs/tokio#5792). This change adds functionality to the test framework within `console-subscriber` so that the state of a task can be set as an expectation. The state is calculated based on 4 values: * `console_api::tasks::Stats::dropped_at` * `console_api::tasks::Stats::last_wake` * `console_api::PollStats::last_poll_started` * `console_api::PollStats::last_poll_ended` It can then be tested that a task that spawns another task and then ends actually goes to the `Completed` state, even if the spawned task is still running. As of Tokio 1.34.0, this test fails, but the PR tokio-rs/tokio#XXXX fixes this and the test should pass from Tokio 1.35 onwards.
In the Tokio instrumentation, a tracing span is created for each task which is spawned. Since the new span is created within the context of where `tokio::spawn()` (or similar) is called from, it gets a contextual parent attached. In tracing, when a span has a child span (either because the child was created in the context of the parent, or because the parent was set explicitly) then that span will not be closed until the child has closed. The result in the console subscriber is that a task which spawns another task won't have a `dropped_at` time set until the spawned task exits, even if the parent task exits much earlier. This causes Tokio Console to show an incorrect lost waker warning (#345). It also affects other spans that are entered when a task is spawned (#412). The solution is to modify the instrumentation in Tokio so that task spans are explicit roots (`parent: None`). This will be done as part of enriching the Tokio instrumentation (tokio-rs/tokio#5792). This change adds functionality to the test framework within `console-subscriber` so that the state of a task can be set as an expectation. The state is calculated based on 4 values: * `console_api::tasks::Stats::dropped_at` * `console_api::tasks::Stats::last_wake` * `console_api::PollStats::last_poll_started` * `console_api::PollStats::last_poll_ended` It can then be tested that a task that spawns another task and then ends actually goes to the `Completed` state, even if the spawned task is still running. As of Tokio 1.34.0, this test fails, but the PR tokio-rs/tokio#6158 fixes this and the test should pass from Tokio 1.35 onwards.
In Tokio, tasks are optionally instrumented with tracing spans to allow analysis of the runtime behavior to be performed with tools like tokio-console. The span that is created for each task gets currently follows the default tracing behavior and has a contextual parent attached to it based on the span that is actual when `tokio::spawn` or similar is called. However, in tracing, a span will remain "alive" until all its children spans are closed. This doesn't match how spawned tasks work. A task may outlive the context in which is was spawned (and frequently does). This causes tasks which spawn other - longer living - tasks to appear in `tokio-console` as having lost their waker when instead they should be shown as completed (tokio-rs/console#345). It can also cause undesired behavior for unrelated tracing spans if a subscriber is receiving both the other spans as well as Tokio's instrumentation. To fix this mismatch in behavior, the task span has `parent: None` set on it, making it an explicit root - it has no parent. The same was already done for all spans representing resources in #6107. This change is made within the scope of #5792. Due to a defect in the currently available `tracing-mock` crate, it is not possible to test this change at a tracing level (tokio-rs/tracing#2440). Instead, a test for the `console-subscriber` has been written which shows that this change fixes the defect as observed in `tokio-console` (tokio-rs/console#490).
@hds What about
Looks like him span should have |
This is something we should consider. We've taken it out for now, but a subscriber could still get the context for that span and store it if it needed to. |
Is your feature request related to a problem? Please describe.
The traces produced by Tokio are great, however there are some areas in which the hierarchy in which the traces are emitted could be improved or doesn't make sense.
Specific issues are:
Describe the solution you'd like
I would like to add some additional traces to Tokio and modify others.
The proposed hierarchy is the following. Sub-items have their span's
parent
set. Values in parentheses are related fields which would be set. Note that all spans have a span ID implicitly, these are used for internal referencing.Further explanation of the structure:
Ideally all these changes, or at least all the breaking ones would make it into the same Tokio release. Preferably with the addition of spans for Runtimes. This would allow Tokio Console to determine the version tokio tracing variant being used and read the traces accordingly.
Describe alternatives you've considered
Parts of this hierarchy can be implemented without the rest. But it makes sense to discuss the entirety.
Additional context
This is mostly considering being consumed by Tokio Console. However it would be interesting to hear from anyone who is using tokio traces in other ways.
The text was updated successfully, but these errors were encountered: