runtime/trace: execution tracer overhaul #60773

Open
mknyszek opened this issue Jun 13, 2023 · 3 comments
Labels
compiler/runtime Issues related to the Go compiler and/or runtime. NeedsFix The path to resolution is known, but the work has not been done.

mknyszek commented Jun 13, 2023

Execution tracer overhaul

Authored by mknyszek@google.com with a mountain of input from others.

In no particular order, thank you to Felix Geisendorfer, Nick Ripley, Michael Pratt, Austin Clements, Rhys Hiltner, thepudds, Dominik Honnef, and Bryan Boreham for your invaluable feedback.

Background

Original design document.

Go execution traces provide a moment-to-moment view of what happens in a Go program over some duration. This information is invaluable in understanding program behavior over time and can be leveraged to achieve significant performance improvements. Because Go has a runtime, it can provide deep information about program execution without any external dependencies, making traces particularly attractive for large deployments.
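As a minimal sketch of how such a trace is collected today, the standard `runtime/trace` package exposes `trace.Start` and `trace.Stop`. The helper name `collectTrace` and the in-memory buffer are illustrative choices, not part of the proposal; real programs usually write to a file or an HTTP response.

```go
package main

import (
	"bytes"
	"fmt"
	"log"
	"runtime/trace"
)

// collectTrace is a hypothetical helper: it runs work with the execution
// tracer enabled and returns the raw trace bytes.
func collectTrace(work func()) ([]byte, error) {
	var buf bytes.Buffer
	// trace.Start fails if tracing is already enabled.
	if err := trace.Start(&buf); err != nil {
		return nil, err
	}
	work()
	// trace.Stop returns only after all trace data has been written.
	trace.Stop()
	return buf.Bytes(), nil
}

func main() {
	data, err := collectTrace(func() {
		// Some goroutine activity for the tracer to record.
		done := make(chan struct{})
		go func() { close(done) }()
		<-done
	})
	if err != nil {
		log.Fatal(err)
	}
	fmt.Printf("collected %d bytes of trace data\n", len(data))
}
```

The resulting bytes can be saved to a file and inspected with `go tool trace`.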

Unfortunately, limitations in the trace implementation prevent widespread use.

For example, the process of analyzing execution traces scales poorly with the size of the trace. Traces need to be parsed in their entirety to do anything useful with them, making them impossible to stream. As a result, trace parsing and validation have very high memory requirements for large traces.

Also, Go execution traces are designed to be internally consistent, but don't provide any way to align with other kinds of traces, such as OpenTelemetry traces and Linux sched traces. Alignment with higher-level tracing mechanisms is critical to connecting business-level tasks with resource costs. Meanwhile, alignment with lower-level traces enables a fully vertical view of application performance to root out the most difficult and subtle issues.

Lastly, the implementation of the execution tracer has evolved organically over time, and it shows. The codebase also has many old warts and some age-old bugs that make collecting traces difficult and make the tracer seem broken. Furthermore, many significant decisions were made over the years but weren't thoroughly documented; those decisions largely exist solely in old commit messages and breadcrumbs left in comments within the codebase itself.

Thanks to work in the Go 1.21 cycle, the execution tracer's run-time overhead was reduced from about -10% throughput and +10% request latency in web services to about 1% in both for most applications. This reduced overhead, in conjunction with making traces more scalable, enables some exciting and powerful new opportunities for traces.

Goals

The goal of this document is to define an alternative implementation for Go execution traces that scales up to large Go deployments.

Specifically, the design presented aims to:

  • Make trace parsing require a small fraction of the memory it requires today.
  • Make traces streamable, to enable analysis without storage.
  • Make traces partially self-describing, to reduce the upgrade burden on trace consumers.
  • Fix age-old bugs and present a path to clean up the implementation.

This document also describes the existing state of the tracer in detail and explains how we got there.

Design

Link to design document.

CC @felixge @nsrip-dd @prattmic @aclements @rhysh @dominikh @bboreham @thepudds

mknyszek added the compiler/runtime label Jun 13, 2023
mknyszek added this to the Go1.22 milestone Jun 13, 2023
mknyszek self-assigned this Jun 13, 2023
@gopherbot

Change https://go.dev/cl/503038 mentions this issue: design/60773-execution-tracer-overhaul.md: add design

mknyszek changed the title from "runtime/trace: revamp execution traces for scalability" to "runtime/trace: execution tracer overhaul" Jun 13, 2023
mknyszek added the NeedsFix label Jun 13, 2023
@gopherbot

Change https://go.dev/cl/494187 mentions this issue: runtime: add execution tracer v2 experiment

@gopherbot

Change https://go.dev/cl/515635 mentions this issue: runtime: refactor runtime->tracer API to appear more like a lock

Projects
Status: In Progress