Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

runtime: trace2: GoStatus at end of generation uses m.id while other events use m.procid #65196

Closed
dominikh opened this issue Jan 22, 2024 · 3 comments
Assignees
Labels
compiler/runtime Issues related to the Go compiler and/or runtime. NeedsFix The path to resolution is known, but the work has not been done.
Milestone

Comments

@dominikh
Copy link
Member

When emitting GoStatus events for goroutines that didn't have any activity in a generation, we use m.id to set the event's M

ug.mid = s.g.m.id

but for all other events with an M we use m.procid instead, for example

mID = uint64(w.mp.procid)

w = w.writeGoStatus(uint64(gp.goid), int64(tl.mp.procid), goStatus, gp.inMarkAssist)

This affects the M reported for goroutines that are already in syscalls when tracing begins.

/cc @mknyszek

@gopherbot gopherbot added the compiler/runtime Issues related to the Go compiler and/or runtime. label Jan 22, 2024
@mknyszek mknyszek self-assigned this Jan 22, 2024
@mknyszek mknyszek added this to the Go1.22 milestone Jan 22, 2024
@mknyszek mknyszek added the NeedsFix The path to resolution is known, but the work has not been done. label Jan 22, 2024
@mknyszek
Copy link
Contributor

Thanks for reporting this! Yeah, that's definitely wrong. I'll send a CL and get it in for the 1.22 release. I think this is backport-worthy anyway (small safe fix for an issue that can create broken traces).

@gopherbot
Copy link
Contributor

Change https://go.dev/cl/557456 mentions this issue: runtime: use the correct M ID for syscalling goroutines in traces

@gopherbot
Copy link
Contributor

Change https://go.dev/cl/557436 mentions this issue: [release-branch.go1.22] runtime: use the correct M ID for syscalling goroutines in traces

gopherbot pushed a commit that referenced this issue Jan 22, 2024
…goroutines in traces

Earlier in the development of the new tracer, m.id was used as a the
canonical ID for threads. Later, we switched to m.procid because it
matches the underlying OS resource. However, in that switch, we missed a
spot.

The tracer catches and emits statuses for goroutines that have remained
in either waiting or syscall across a whole generation, and emits a
thread ID for the latter set. The ID being used here, however, was m.id
instead of m.procid, like the rest of the tracer.

This CL also adds a regression test. In order to make the regression
test actually catch the failure, we also have to make the parser a
little less lenient about GoStatus events with GoSyscall: if this isn't
the first generation, then we should've seen the goroutine bound to an
M already when its status is getting emitted for its context. If we emit
the wrong ID, then we'll catch the issue when we emit the right ID when
the goroutine exits the syscall.

Fixes #65196.

Change-Id: I78b64fbea65308de5e1291c478a082a732a8bf9f
Reviewed-on: https://go-review.googlesource.com/c/go/+/557456
Reviewed-by: Michael Pratt <mpratt@google.com>
Auto-Submit: Michael Knyszek <mknyszek@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
(cherry picked from commit c469666)
Reviewed-on: https://go-review.googlesource.com/c/go/+/557436
ezz-no pushed a commit to ezz-no/go-ezzno that referenced this issue Feb 18, 2024
Earlier in the development of the new tracer, m.id was used as a the
canonical ID for threads. Later, we switched to m.procid because it
matches the underlying OS resource. However, in that switch, we missed a
spot.

The tracer catches and emits statuses for goroutines that have remained
in either waiting or syscall across a whole generation, and emits a
thread ID for the latter set. The ID being used here, however, was m.id
instead of m.procid, like the rest of the tracer.

This CL also adds a regression test. In order to make the regression
test actually catch the failure, we also have to make the parser a
little less lenient about GoStatus events with GoSyscall: if this isn't
the first generation, then we should've seen the goroutine bound to an
M already when its status is getting emitted for its context. If we emit
the wrong ID, then we'll catch the issue when we emit the right ID when
the goroutine exits the syscall.

Fixes golang#65196.

Change-Id: I78b64fbea65308de5e1291c478a082a732a8bf9f
Reviewed-on: https://go-review.googlesource.com/c/go/+/557456
Reviewed-by: Michael Pratt <mpratt@google.com>
Auto-Submit: Michael Knyszek <mknyszek@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
compiler/runtime Issues related to the Go compiler and/or runtime. NeedsFix The path to resolution is known, but the work has not been done.
Projects
None yet
Development

No branches or pull requests

3 participants