Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RFC: Resource Instrumentation #3954

Open
zaharidichev opened this issue Jul 11, 2021 · 2 comments
Open

RFC: Resource Instrumentation #3954

zaharidichev opened this issue Jul 11, 2021 · 2 comments
Labels
A-tokio Area: The main tokio crate M-tracing Tracing support in Tokio

Comments

@zaharidichev
Copy link
Contributor

Instrumenting resources

Background

Tokio Console is an async monitoring tool for the tokio runtime. One of the near future goals for the project is to be able to provide a way for users to see information about resources that their async code interacts with. More precisely, there is the need to associate tasks with resources and be able to visualize what resources are used by which tasks. The way to provide this data is through emitting tracing information in the form of spans and events.

Points of instrumentation

For the purpose of a concrete example, we can use the TcpStream and Sleep types.

Instrumenting the resource

In order to give it an identifier and track its lifecycle, we can have a span that is attached to the resource and lives as long as the resource. We can enter this span anytime a resource op is called. Such a span should contain data about the resource concrete type as well as data about the kind of resource that is instrumented. In the particular case that will be TcpStream for the type and net for the kind.

Instrumenting resource operations

A poll operation should in theory be an instant operation. That being said instrumenting such operations can be done via an event that is emitted at the end of the operation. That event can contain metadata with information about the type of operation as well as the result. Each of these events needs to be emitted in the context of the resource span. A few examples of such events:

  • poll operation - that would be for example calling poll_read_priv. The event can carry the result of the operation, namely whether it returned Pending or Ready and how many bytes have been read.

  • config operation - such an operation, albeit not being an async one can be captured by also emitting an event that in its metadata reflects the config change that has been made. Examples of that are Sleep::reset and TcpStream::set_ttl

  • async operation - instrumenting individual op calls gives us quite a lot of data especially around changing some state on the resource (e.g. Sleep::reset). When it comes to instrumenting operation such as poll_read_priv that are being called in the context of an async operation, it cannot tell us much apart from the fact that a certain task is using this resource at some point in time. We cannot really say that this is happening in the context of a particular async operation. Ideally we would want to be able to infer that a task is awaiting on a resource op to finish. This can be done by instrumenting the actual future that represents the async operation. This will create a span per future that will be entered and exited on poll invocation and will be dropped when the future is dropped. This will allow us to capture all individual poll events in the context of the lifecycle of a particular async operation that has been performed by a task.

  • cancellation - canceling an async operation will be represented by dropping the future that this operation is associated with.

To put all of that into the concrete TcpStream example, we will have some of the following instrumentation points:

Main resource span:

    pub struct TcpStream {
        io: PollEvented<mio::net::TcpStream>,
        resource_span: tracing::Span, // name = "resource", kind = net, fields.concrete_type = TcpStream
    }

Poll method event

pub(crate) fn poll_read_priv(
    &self,
    cx: &mut Context<'_>,
    buf: &mut ReadBuf<'_>,
) -> Poll<io::Result<()>> {
    let op_name = "poll_read_priv";
    let _span_guard = self.resource_span.enter();
    // Safety: `TcpStream::read` correctly handles reads into uninitialized memory
    let remaining_before = buf.remaining();
    let result = unsafe { self.io.poll_read(cx, buf) };
    match result {
        Poll::Ready(Ok(())) => {
            let num_read = remaining_before - buf.remaining();
            tracing::trace!(op_name, result = "Ready", bytes_read = num_read);
        },
        Poll::Ready(Err(error)) => {
            tracing::error!(op_name, %error);
        },
        Poll::Pending => tracing::trace!(op_name, result = "Pending"),
    }

    return result;
}

Config method event

pub fn set_ttl(&self, ttl: u32) -> io::Result<()> {
    let _span_guard = self.resource_span.enter();
    tracing::trace!(op_name = "set_ttl", ttl);
    self.io.set_ttl(ttl)
}

Ingesting and visualizing the data

Given we ingest these instrumentation events we can visualize the relationships between tasks and resources in the console.

Views

We can add or alter some views to accommodate these relationships

Main resource view

This view can display all the resources from their initialization to their destruction. Much like the tasks view, this can be a table that displays rows for each resource and some common information shared across resources such as resource kind, concrete type, number of total operations called, number of distinct operations called, number of tasks that are currently awaiting on the resource as we as the total number of tasks distinct that have awaited on it.

Resource details view

When a particular resource is selected, we can show a details view that contains more specific information about the resource. For example if this is a Sleep future, we can show the time it will expire. If it is a TcpStream we can show total bytes read,written or any other resource specific information. In addition to that we can show a list of operations that have been called on this resource - a trace of all poll_*or config methods methods and their results.

Task details view

There is already a task details view that shows some useful stats about the task. We can augment that view with a list of resources that this task is currently awaiting on. This list can be selectable, so selecting a resource from the list should take us to the Resource details view.

Tasks

In this already existing views, we can add a column per task that shows how many resources the task is awaiting on.

Constructing the relationships from tracing data

Resource lifecycle

Whenever we see resource span created, we know that a resource has been initialized. In a similar fashion when this span is dropped, we know that this resource is not around anymore.

Resource specific events

All the events that are emitted from poll_* methods and config methods or any other methods called on a resource, should be done so in the context of the main resource span. That being said, we should be able to figure out which resource a particular event is coming from by looking at the stack of currently entered spans when receiving the event.

Tasks awaiting on a resource

As we will be instrumenting the futures representing the async operations with their own spans, we should be able to tell that a task is awaiting on a resource by looking at the entered spans when a future's poll method returns Pending. Essentially this way, we can tie the lifecycle of the async operation to a particular resource and a particular task by inspecting the stack of currently entered spans.

Blockers

Currently tracing does not provide a way to get the stack of all entered spans from the registry. This is actually what makes this design impossible to implement at the moment. There is an issue for that in the tracing repo.

Open questions

  • Does emiting events from poll_* method actually make sense if we want to instrument the async operation. As pointed out in this comment , this instrumentation does not give us clear start/end points for the async operation for AsyncRead/AsyncWrite operations. Do we still want these events emitted?
  • How do we deal with async resources that are compose of other async resources. One such example would be a Mutex that is under the hood utilizing a Semaphore?

Implementation steps

  • Provide instrumentation for the main resource span for a selected group of resources (Sleep and TcpStream)
  • Resolve the blocker preventing us to take the stack of all entered spans
  • Ingest resource data in the console and show a basic resource view with some stats that can be obtained by the instrumentation already present (total number of method invocations per resource, etc)
  • Provide instrumentation of async operations for resources. For Sleep that would be instrumenting the Sleep future while for TcpStream it would be instrumenting the futures returned by AsyncReadExt/AsyncWriteExt
  • Modify the task detail view to show a list of resources that a task is awaiting on.
  • Add instrumentation to individual poll and config methods.
  • Introduce a resource details view that shows information specific to the resource.

Related discussions

Issue for resource instrumentation in the console repo
Tokio PR providing per resource span
Issue in the tracing repo regarding the full stack of entered spans

@zaharidichev zaharidichev added A-tokio Area: The main tokio crate M-tracing Tracing support in Tokio labels Jul 11, 2021
@carllerche
Copy link
Member

This is great and makes a lot of sense. I didn't initially consider the resource span as the mechanism to track the resource's usage across multiple tasks, but I think that is good.

I wonder if "entering" the resource span when performing an operation makes sense vs. using "follows_from" and creating a causal dependency between the resource span and the operation span. However, starting w/ "entering" the resource span when performing an operation is probably a good place to start.

Currently tracing does not provide a way to get the stack of all entered spans from the registry. This is actually what makes this design impossible to implement at the moment.

Could the console subscriber maintain a stack itself? Thoughts @hawkw

Does emiting events from poll_* method actually make sense if we want to instrument the async operation.

Yes, I think it does. We would still instrument async AsyncReadExt::read (and others) w/ a span. The key will be to annotate the async fn read function and the poll_* methods appropriately so the console can display a helpful view.

How do we deal with async resources that are compose of other async resources. One such example would be a Mutex that is under the hood utilizing a Semaphore?

That is also fine I think. The console will have to filter out internal resources from the view by default to avoid clutter.

@zaharidichev
Copy link
Contributor Author

@carllerche Thanks for the comments. Wrt to the follows_from comment. I am not entirely sure which way is better. One thing that stands out from the docs for follows_from:

/// This relationship differs somewhat from the parent-child relationship: a
/// span may have any number of prior spans, rather than a single one; and
/// spans are not considered to be executing _inside_ of the spans they
/// follow from. This means that a span may close even if subsequent spans
/// that follow from it are still open...

I assume that if at some point we rely on the fact that all ops spans are closed the moment the resource span is closed, then follows_from does not encode this assumption correctly. It seems to me that it modals the causality correctly but does not capture the lifecycle dependency (e.g: an async op call on a resource cannot live longer than the resource itsefl). Not sure this is all correct. Maybe @hawkw can comment.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-tokio Area: The main tokio crate M-tracing Tracing support in Tokio
Projects
None yet
Development

No branches or pull requests

2 participants