runtime: SIGSEGV in runtime.deltimer on linux-mips-rtrk during ReadMemStats [1.15 backport] #43833

Closed
gopherbot opened this issue Jan 21, 2021 · 4 comments
Labels
CherryPickApproved FrozenDueToAge


@gopherbot gopherbot commented Jan 21, 2021

@mknyszek requested issue #43712 to be considered for backport to the next 1.15 minor release.

It turns out that this is also a problem in Go 1.15.

@gopherbot Please open a backport issue for Go 1.15.

@gopherbot gopherbot added the CherryPickCandidate label Jan 21, 2021
@gopherbot gopherbot added this to the Go1.15.8 milestone Jan 21, 2021
Contributor

@mknyszek mknyszek commented Jan 21, 2021

It's a rare crash in the runtime with no workaround, hence I think it satisfies the backport criteria. The fix is also fairly safe.

Contributor

@toothrot toothrot commented Jan 26, 2021

This is a serious issue with no workaround. Approved.

@toothrot toothrot added the CherryPickApproved label Jan 26, 2021
@gopherbot gopherbot removed the CherryPickCandidate label Jan 26, 2021
Author

@gopherbot gopherbot commented Jan 27, 2021

Change https://golang.org/cl/287092 mentions this issue: [release-branch.go1.15] runtime: don't adjust timer pp field in timerWaiting status

Author

@gopherbot gopherbot commented Feb 3, 2021

Closed by merging 3171f48 to release-branch.go1.15.

gopherbot pushed a commit that referenced this issue Feb 3, 2021
…Waiting status

Before this CL, the following sequence was possible:

* GC scavenger starts and sets up scavenge.timer
* GC calls readyForScavenger, but sysmon is sleeping
* program calls runtime.GOMAXPROCS to shrink number of processors
* procresize destroys a P, the one that scavenge.timer is on
* (*pp).destroy calls moveTimers, which gets to the scavenger timer
* scavenger timer is timerWaiting, and moveTimers clears t.pp
* sysmon wakes up and calls wakeScavenger
* wakeScavenger calls stopTimer on scavenge.timer, still timerWaiting
* stopTimer calls deltimer which loads t.pp, which is still nil
* deltimer tries to increment deletedTimers on nil t.pp, and crashes
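
The sequence above can be replayed in a single-threaded toy model. This is only a sketch: the `proc` and `timer` types, the `replay` helper, and the `errNilP` error are hypothetical stand-ins, not the runtime's real definitions, and the real failure is a nil dereference rather than a returned error.

```go
package main

import (
	"errors"
	"fmt"
)

// Toy status values; the real runtime has many more states.
const (
	statusWaiting = "timerWaiting"
	statusDeleted = "timerDeleted"
)

// proc is a hypothetical stand-in for the runtime's P.
type proc struct{ deletedTimers int }

// timer is a hypothetical stand-in for runtime.timer.
type timer struct {
	status string
	pp     *proc
}

// errNilP marks the spot where the real runtime would dereference nil.
var errNilP = errors.New("timer is timerWaiting but t.pp is nil")

// deltimer models only the timerWaiting path: it trusts t.pp to be set.
func deltimer(t *timer) error {
	if t.status != statusWaiting {
		return nil
	}
	if t.pp == nil {
		return errNilP // the runtime would crash here instead
	}
	t.pp.deletedTimers++
	t.status = statusDeleted
	return nil
}

// replay interleaves the two halves of the pre-fix move with a deletion.
func replay() error {
	oldP, newP := &proc{}, &proc{}
	t := &timer{status: statusWaiting, pp: oldP}

	t.pp = nil         // moveTimers clears t.pp, status stays timerWaiting
	err := deltimer(t) // sysmon's wakeScavenger -> stopTimer -> deltimer
	t.pp = newP        // moveTimers would have set t.pp again, too late

	return err
}

func main() {
	fmt.Println("pre-fix interleaving:", replay())
}
```

The model forces the unlucky interleaving deterministically; in the runtime it depends on timing between the goroutine running procresize and sysmon.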

The window of vulnerability is between the time that t.pp is set to
nil by moveTimers and the time that t.pp is set to non-nil by
moveTimers, which is a few instructions at most. So it's not likely
and in particular is quite unlikely on x86. But with a more relaxed
memory model the area of vulnerability can be somewhat larger. This
appears to be the cause of two builder failures in a few months on
linux-mips.

This CL fixes the problem by making moveTimers change the status from
timerWaiting to timerMoving while t.pp is clear. That will cause
deltimer to wait until the status is back to timerWaiting, at which
point t.pp has been set again.
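
The status guard described above can be sketched with the public sync/atomic API (Go 1.19 or later for the typed atomics). This is an illustration under assumptions, not the runtime's code: `proc`, `timer`, `moveTimer`, `delTimer`, and `run` are hypothetical names, and the state machine is reduced to four states.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
	"sync/atomic"
)

// Reduced set of states; names echo the runtime's but this is a toy model.
const (
	timerWaiting uint32 = iota
	timerMoving
	timerModifying
	timerDeleted
)

// proc is a hypothetical stand-in for the runtime's P.
type proc struct{ deletedTimers atomic.Uint32 }

// timer is a hypothetical stand-in for runtime.timer.
type timer struct {
	status atomic.Uint32
	pp     atomic.Pointer[proc]
}

// moveTimer re-homes t onto dst. It takes the timer out of timerWaiting
// before clearing pp, so no deleter can see nil pp under timerWaiting.
func moveTimer(t *timer, dst *proc) bool {
	if !t.status.CompareAndSwap(timerWaiting, timerMoving) {
		return false // another goroutine owns the timer right now
	}
	t.pp.Store(nil)
	runtime.Gosched() // deliberately widen the window
	t.pp.Store(dst)
	t.status.Store(timerWaiting)
	return true
}

// delTimer marks t deleted and charges the deletion to its current proc.
func delTimer(t *timer) bool {
	for {
		switch t.status.Load() {
		case timerWaiting:
			if t.status.CompareAndSwap(timerWaiting, timerModifying) {
				// No move can be in flight here, so pp is non-nil.
				t.pp.Load().deletedTimers.Add(1)
				t.status.Store(timerDeleted)
				return true
			}
		case timerMoving, timerModifying:
			runtime.Gosched() // wait for the other goroutine to finish
		case timerDeleted:
			return false
		}
	}
}

// run races one move against one delete and reports where the deletion
// was counted: on the old proc or on the new one.
func run() (onOld, onNew uint32) {
	oldP, newP := &proc{}, &proc{}
	t := &timer{}
	t.pp.Store(oldP)

	var wg sync.WaitGroup
	wg.Add(2)
	go func() { defer wg.Done(); moveTimer(t, newP) }()
	go func() { defer wg.Done(); delTimer(t) }()
	wg.Wait()
	return oldP.deletedTimers.Load(), newP.deletedTimers.Load()
}

func main() {
	onOld, onNew := run()
	fmt.Println("deletions counted on old P:", onOld, "new P:", onNew)
}
```

Whichever goroutine wins, the deletion is counted exactly once and on a non-nil proc, because the deleter only reads pp after it has moved the status out of timerWaiting itself.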

For #43712
Fixes #43833

Change-Id: I66838319ecfbf15be66c1fac88d9bd40e2295852
Reviewed-on: https://go-review.googlesource.com/c/go/+/284775
Trust: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
(cherry picked from commit d2d155d)
Reviewed-on: https://go-review.googlesource.com/c/go/+/287092
Run-TryBot: Carlos Amedee <carlos@golang.org>
@golang golang locked and limited conversation to collaborators Feb 3, 2022