runtime: SIGSEGV in runtime.deltimer on linux-mips-rtrk during ReadMemStats [1.15 backport] #43833

Closed
gopherbot opened this issue Jan 21, 2021 · 4 comments

@gopherbot gopherbot commented Jan 21, 2021

@mknyszek requested issue #43712 to be considered for backport to the next 1.15 minor release.

It turns out that this is also a problem in Go 1.15.

@gopherbot Please open a backport issue for Go 1.15.

@mknyszek mknyszek commented Jan 21, 2021

It's a rare crash in the runtime with no workaround, hence I think it satisfies the backport criteria. The fix is also fairly safe.

@toothrot toothrot commented Jan 26, 2021

This is a serious issue with no workaround. Approved.

@gopherbot gopherbot commented Jan 27, 2021

Change https://golang.org/cl/287092 mentions this issue: [release-branch.go1.15] runtime: don't adjust timer pp field in timerWaiting status

@gopherbot gopherbot commented Feb 3, 2021

Closed by merging 3171f48 to release-branch.go1.15.

@gopherbot gopherbot closed this Feb 3, 2021
gopherbot pushed a commit that referenced this issue Feb 3, 2021
…Waiting status

Before this CL, the following sequence was possible:

* GC scavenger starts and sets up scavenge.timer
* GC calls readyForScavenger, but sysmon is sleeping
* program calls runtime.GOMAXPROCS to shrink number of processors
* procresize destroys a P, the one that scavenge.timer is on
* (*pp).destroy calls moveTimers, which gets to the scavenger timer
* scavenger timer is timerWaiting, and moveTimers clears t.pp
* sysmon wakes up and calls wakeScavenger
* wakeScavenger calls stopTimer on scavenge.timer, still timerWaiting
* stopTimer calls deltimer, which loads t.pp, which is still nil
* deltimer tries to increment deletedTimers on the nil t.pp, and crashes

The window of vulnerability is between the time that t.pp is set to
nil by moveTimers and the time that t.pp is set to non-nil by
moveTimers, which is a few instructions at most. So the crash is not
likely, and in particular is quite unlikely on x86. But with a more
relaxed memory model the window of vulnerability can be somewhat
larger. This appears to be the cause of two builder failures in a few
months on linux-mips.

This CL fixes the problem by making moveTimers change the status from
timerWaiting to timerMoving while t.pp is clear. That will cause
deltimer to wait until the status is back to timerWaiting, at which
point t.pp has been set again.

For #43712
Fixes #43833

Change-Id: I66838319ecfbf15be66c1fac88d9bd40e2295852
Reviewed-on: https://go-review.googlesource.com/c/go/+/284775
Trust: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
(cherry picked from commit d2d155d)
Reviewed-on: https://go-review.googlesource.com/c/go/+/287092
Run-TryBot: Carlos Amedee <carlos@golang.org>