Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

runtime/trace: implement user events #16619

Open
FiloSottile opened this issue Aug 5, 2016 · 15 comments

Comments

@FiloSottile
Copy link
Member

commented Aug 5, 2016

https://golang.org/s/go15trace specifies user events:

Runtime/pprof package also exposes a set of functions for emission of user events:

package pprof
func TraceEvent(id string)
func TraceScopeStart(id string)
func TraceScopeEnd(id string)

But there is no implementation at this time.

Those would be invaluable to properly debug latency problems.

(According to @dvyukov there is no one actively working on it, so I might give it a try.)

@dvyukov

This comment has been minimized.

Copy link
Member

commented Aug 5, 2016

@quentinmit quentinmit added this to the Go1.8 milestone Aug 8, 2016

@aclements

This comment has been minimized.

Copy link
Member

commented Aug 9, 2016

For reference, there's been a lot of discussion about the user events API since the original proposal here: https://go-review.googlesource.com/c/8552/3/src/runtime/trace/trace.go

Also, user events came up in today's meeting with @hyangah.

@hyangah

This comment has been minimized.

Copy link
Contributor

commented Aug 10, 2016

Yes, I plan to work on it for go1.8 but it's possible we end up with slightly different API.
Will update this issue soon with the plan we are thinking about. @matloob

@quentinmit quentinmit added the NeedsFix label Oct 10, 2016

@quentinmit quentinmit modified the milestones: Go1.9Early, Go1.8 Oct 13, 2016

@quentinmit

This comment has been minimized.

Copy link
Contributor

commented Oct 13, 2016

This is too late for 1.8.

@hyangah hyangah self-assigned this Feb 8, 2017

@bradfitz bradfitz modified the milestones: Go1.10Early, Go1.9Early May 3, 2017

@bradfitz bradfitz modified the milestones: Go1.10Early, Go1.10 Jun 14, 2017

@rakyll

This comment has been minimized.

Copy link
Member

commented Jul 15, 2017

@hyangah, what's the status of this? This probably needs a proposal first, I already have background and opinions of this one and can create a proposal if no one is actively working on it.

@aclements

This comment has been minimized.

Copy link
Member

commented Jul 15, 2017

I have an old prototype of user events at https://go-review.googlesource.com/c/34031/. No effort was made to define a good API, but that could be a useful starting point for an implementation if we decide on an API.

Earlier today @heschik, @rhysh, a few others who's usernames I don't know, and I were talking about the possibility of simply using the existing profiler labels as the user spans, since they often contain exactly the request information you might want to include in a user span. (I'm not personally convinced this is a good idea, but writing it down for the record.)

@rakyll

This comment has been minimized.

Copy link
Member

commented Jul 15, 2017

I were talking about the possibility of simply using the existing profiler labels as the user spans

I don't think this is a good idea. There is an obvious possibility to annotate spans with profiler labels but we shouldn't assume they are representing the spans' themselves. In tracing, one of the fundamental features is to give to user the flexibility to be able to mark the sections they are interested in and find statistically meaningful.

@dvyukov

This comment has been minimized.

Copy link
Member

commented Jul 15, 2017

FWIW is would be useful if we could understand span type, i.e. these spans are of type A (say, request processing), while these are of type B (say, database accesses). Then we could automatically infer cases of tail latency and give explanation as to why a particular span takes so long. For example, this span is the longest of its type, it has comparable with other spans of the same type cpu time, scheduler latency and network wait time, but was blocked on mutexes 10x longer.
That's what we do with goroutines today, using goroutine start function as type.

@heschik

This comment has been minimized.

Copy link
Contributor

commented Jul 17, 2017

I think I get why having profiler labels create spans is undesirable, but I don't see any reason not to propagate the profiler labels to the execution trace. Concretely, just emit a trace event whenever a goroutine's labels change, and the trace viewer can attach them on all of the slices (rectangles in the UI) it emits to the JSON.

@hyangah

This comment has been minimized.

Copy link
Contributor

commented Aug 24, 2017

I compiled my thought in https://goo.gl/iqJfJ3
Comments are welcome.

@gopherbot

This comment has been minimized.

Copy link

commented Aug 28, 2017

Change https://golang.org/cl/59572 mentions this issue: runtime/trace/trace: user annotation API skeleton

@gopherbot

This comment has been minimized.

Copy link

commented Feb 14, 2018

Change https://golang.org/cl/63274 mentions this issue: runtime/trace: user annotation API

gopherbot pushed a commit that referenced this issue Feb 15, 2018
runtime/trace: user annotation API
This CL presents the proposed user annotation API skeleton.
This CL bumps up the trace version to 1.11.

Design doc https://goo.gl/iqJfJ3

Implementation CLs are followed.

The API introduces three basic building blocks. Log, Span, and Task.

Log is for basic logging. When called, the message will be recorded
to the trace along with timestamp, goroutine id, and stack info.

   trace.Log(ctx, messageType message)

Span can be thought as an extension of log to record interesting
time interval during a goroutine's execution. A span is local to a
goroutine by definition.

   trace.WithSpan(ctx, "doVeryExpensiveOp", func(ctx context) {
      /* do something very expensive */
   })

Task is higher-level concept that aids tracing of complex operations
that encompass multiple goroutines or are asynchronous.
For example, an RPC request, a HTTP request, a file write, or a
batch job can be traced with a Task.

Note we chose to design the API around context.Context so it allows
easier integration with other tracing tools, often designed around
context.Context as well. Log and WithSpan APIs recognize the task
information embedded in the context and record it in the trace as
well. That allows the Go execution tracer to associate and group
the spans and log messages based on the task information.

In order to create a Task,

   ctx, end := trace.NewContext(ctx, "myTask")
   defer end()

The Go execution tracer measures the time between the task created
and the task ended for the task latency.

More discussion history in golang.org/cl/59572.

Update #16619

R=go1.11

Change-Id: I59a937048294dafd23a75cf1723c6db461b193cd
Reviewed-on: https://go-review.googlesource.com/63274
Reviewed-by: Austin Clements <austin@google.com>

@ianlancetaylor ianlancetaylor modified the milestones: Go1.11, Unplanned Jun 28, 2018

@nhooyr

This comment has been minimized.

Copy link
Contributor

commented Sep 11, 2018

Was this done in Go 1.11?

I see https://godoc.org/runtime/trace#hdr-User_annotation

@ammario

This comment has been minimized.

Copy link

commented Sep 23, 2018

Why isn't there an average for each task?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
You can’t perform that action at this time.