
runtime: memory corruption on Linux 5.2+ #35777

Closed
aclements opened this issue Nov 22, 2019 · 74 comments

@aclements (Member) commented Nov 22, 2019

We've had several reports of memory corruption on Linux 5.3.x (or later) kernels from people running tip since asynchronous preemption was committed. This is a super-bug to track these issues. I suspect they all have one root cause.

Typically these are "runtime error: invalid memory address or nil pointer dereference" or "runtime: unexpected return pc" or "segmentation violation" panics. They can also appear as self-detected data corruption.

If you encounter a crash that could be random memory corruption, are running Linux 5.3.x or later, and are running a recent tip Go (after commit 62e53b7), please file a new issue and add a comment here. If you can reproduce it, please try setting "GODEBUG=asyncpreemptoff=1" in your environment (e.g., run GODEBUG=asyncpreemptoff=1 go vet ./...) and see whether you can still reproduce it.

Duplicate issues (I'll edit this comment to keep this up-to-date):

runtime: corrupt binary export data seen after signal preemption CL (#35326): Corruption in file version header observed by vet. Medium reproducible. Strong leads.

cmd/compile: panic during early copyelim crash (#35658): Invalid memory address in cmd/compile/internal/ssa.copyelim. Not reproducible. Nothing obvious in stack trace. Haven't dug into assembly.

runtime: SIGSEGV in mapassign_fast64 during cmd/vet (#35689): Invalid memory address in runtime.mapassign_fast64 in vet. Stack trace includes random pointers. Some assembly decoding work.

runtime: unexpected return pc for runtime.(*mheap).alloc (#35328): Unexpected return pc. Stack trace includes random pointers. Not reproducible.

cmd/dist: I/O error: read src/xxx.go: is a directory (#35776): Random misbehavior. Not reproducible.

runtime: "fatal error: mSpanList.insertBack" in mallocgc (#35771): Bad mspan next pointer (random and unaligned). Not reproducible.

cmd/compile: invalid memory address or nil pointer dereference in gc.convlit1 (#35621): Invalid memory address in cmd/compile/internal/gc.convlit1. Evidence of memory corruption, though no obvious random pointers. Not reproducible.

cmd/go: unexpected signal during runtime execution (#35783): Corruption in file version header observed by vet. Not reproducible.

runtime: unexpected return pc for runtime.systemstack_switch (#35592): Unexpected return pc. Stack trace includes random pointers. Not reproducible.

cmd/compile: random compile error running tests (#35760): Compiler data corruption. Not reproducible.

@aclements added this to the Go1.14 milestone Nov 22, 2019
@aclements self-assigned this Nov 22, 2019
@aclements pinned this issue Nov 22, 2019

@mvdan (Member) commented Nov 22, 2019

@aclements for your records, #35328 and #35776 might be related as well. Those two were on the same Linux 5.3.x machine of mine.

@aclements (Member, Author) commented Nov 22, 2019

Thanks @mvdan. I've folded those into the list above.

@zikaeroh commented Nov 22, 2019

#35621 from me. One time, no repro.

@myitcv (Member) commented Nov 22, 2019

@aclements just saw #35783 for the record.

If you think we have enough "evidence", please say so and I'll stop creating issues for now 😄

@bradfitz (Member) commented Nov 22, 2019

Have we roughly bisected which Linux versions are affected? Looking at the kernel changes in that range might yield a clue about where the bug is and whose it is.

5.3 = bad.
5.2 = ?

@zikaeroh commented Nov 22, 2019

In #35326 (comment), I used Arch's 4.19 LTS and could not reproduce the bexport corruption. However, the kernel configuration differs between 4.19 and 5.3, so that may be unscientific. (I'm letting my machine rebuild 5.3 without PREEMPT set to see if that's the problem, but I have doubts. EDIT: It was not PREEMPT, so setting up a builder with a newer kernel would likely be good regardless.)

What set of kernels do the current Linux builders use? That might provide a lower bound, as I've never seen the issue there.

(I'd bring up #9505 to advocate for an Arch builder, but that issue is more about everything but the kernel version. I feel like there should be some builder which is at the latest Linux kernel, whatever that may be.)

@bradfitz (Member) commented Nov 22, 2019

The existing Go Linux builders use Container-Optimized OS with Linux kernel 4.19.72+.

@aclements (Member, Author) commented Nov 22, 2019

Thanks @myitcv, I think we have enough reports. If you do happen to find another one that's reproducible, that would be very helpful, though.

@dr2chase (Contributor) commented Nov 25, 2019

To recap last Friday's experiments (and I rechecked the test for the most mystifying of these Sunday afternoon), Cherry and I tried the following:

  • Doubled the size of the sigaltstack, just in case. Also sanity-checked the stack bounds within gdb; they were okay.

  • Modified the definition of fpstate to conform to what is defined in the Linux header files, and modified sigcontext to use the new Xstate (type definitions below):

    fpstate *Xstate // *fpstate1

  • Wrote a method (runtime·getymm, below) to let us store the YMM registers as they were supplied (as registers) to the signal handler. With that in place, we:

  1. Tried an experiment in the assembly-language handler to trash the YMM registers (not the data structures) before return. We never saw any sign of the trash, but this seemed to raise the failure rate (running "go vet all"). The trashing string stored was "This_is_a_test. "

  2. Tried printing the saved and current YMM registers in sigtrampgo. The saved ones looked like memmove artifacts (source code, while running vet all), and the current ones were always zero. The memmove artifacts frequently stayed unchanged between signals. I rechecked the code that did this earlier today, just in case we got it wrong.

  3. Made a copy of the saved XMM and YMM registers on sigtrampgo entry, then checked the copy against the saved registers, to see if our code ever somehow modified them. That check never fired.

I spent some time Saturday looking for "interesting" commits in the Linux git log; I have some to review. What I am wondering is whether there was some attempt to optimize saving of the YMM registers that got fouled up. One thing I wonder a little about is what they are doing for power management around AVX use; I saw some mention of that. (I.e., what triggers AVX use? Can they "save power" by not touching the registers if they believe AVX is not in use? Suppose they rely on some hardware bit that isn't set under exactly the expected conditions?)

type Xstate struct {
	Fpstate Fpstate
	Hdr     Header
	Ymmh    Ymmh_state
}

type Fpstate struct {
	Cwd        uint16
	Swd        uint16
	Twd        uint16
	Fop        uint16
	Rip        uint64
	Rdp        uint64
	Mxcsr      uint32
	Mxcsr_mask uint32
	St_space   [32]uint32
	Xmm_space  [64]uint32
	Reserved2  [12]uint32
	Reserved3  [12]uint32
}

type Header struct {
	Xfeatures uint64
	Reserved1 [2]uint64
	Reserved2 [5]uint64
}

type Ymmh_state struct {
	Space [64]uint32
}

// getymm stores the 16 YMM registers into the 512-byte buffer at addr.
TEXT runtime·getymm(SB),NOSPLIT,$0-8
	MOVQ	addr+0(FP), AX
	VMOVDQU	Y0,(0*32)(AX)
	VMOVDQU	Y1,(1*32)(AX)
	VMOVDQU	Y2,(2*32)(AX)
	VMOVDQU	Y3,(3*32)(AX)
	VMOVDQU	Y4,(4*32)(AX)
	VMOVDQU	Y5,(5*32)(AX)
	VMOVDQU	Y6,(6*32)(AX)
	VMOVDQU	Y7,(7*32)(AX)
	VMOVDQU	Y8,(8*32)(AX)
	VMOVDQU	Y9,(9*32)(AX)
	VMOVDQU	Y10,(10*32)(AX)
	VMOVDQU	Y11,(11*32)(AX)
	VMOVDQU	Y12,(12*32)(AX)
	VMOVDQU	Y13,(13*32)(AX)
	VMOVDQU	Y14,(14*32)(AX)
	VMOVDQU	Y15,(15*32)(AX)
	RET

@aclements (Member, Author) commented Nov 25, 2019

An update from over in #35326: I've bisected the issue to kernel commit torvalds/linux@d9c9ce3, which happened between v5.1 and v5.2. It also requires the kernel to be built with GCC 9 (GCC 8 does not reproduce the issue).

@dr2chase (Contributor) commented Nov 26, 2019

Not sure where Austin is reporting this or whether he had time today, but:

  • he has a C program demonstrating the bug on Linux 5.3 (built with gcc 9), for the purpose of filing a kernel bug soonish;
  • there is a workaround on the Go implementation side (make sure the signal stack is mapped; a sketch of the idea follows below);
  • I managed to build a failing Linux 5.3 in which the entire kernel is compiled with gcc 8, except for arch/x86/kernel/fpu/signal.c.
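
For context, "make sure the signal stack is mapped" amounts to touching the top page of the stack so the kernel faults it in before any signal lands there. A minimal user-level sketch of the idea, with a hypothetical helper name (not the actual runtime change):

package stackfault

// faultInTopPage touches the highest byte of a stack segment so the
// kernel maps the top page in before a signal can be delivered onto
// it (the signal frame is written starting at the high end).
func faultInTopPage(stack []byte) {
	if len(stack) == 0 {
		return
	}
	// Writing one byte forces the page fault now, rather than during
	// signal delivery, which is where the 5.2+ kernel bug bites.
	stack[len(stack)-1] = 0
}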

@zikaeroh commented Nov 26, 2019

All of the progress updates have been going on #35326. (Most recently, #35326 (comment).)

@dvyukov (Member) commented Nov 26, 2019

There is a commit that claims to fix something in the culprit commit:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=b81ff1013eb8eef2934ca7e8cf53d553c1029e84
I don't know whether it will help, but @aclements, if you have a test setup ready, it may be worth cherry-picking and trying.

@cherrymui (Contributor) commented Nov 26, 2019

I think that commit is already included in the 5.2 and 5.3 kernels, which still have the problem.

@aclements (Member, Author) commented Nov 26, 2019

Thanks @dvyukov. I just re-confirmed that I can still reproduce it in the same way on 5.3, which includes that commit. I'll double check that I can still reproduce right at that commit, just in case it was somehow re-introduced later.

@aclements (Member, Author) commented Nov 26, 2019

Reproduced at torvalds/linux@b81ff10, as well as v5.4, which was just released.

I've filed the upstream kernel bug here: https://bugzilla.kernel.org/show_bug.cgi?id=205663

@dr2chase (Contributor) commented Dec 3, 2019

I prefer touching the signal stack because it reduces the need to explain things to users, though I hope we're talking about a small number of people anyway (latency-sensitive Go users on the latest-and-greatest Linux for the next few months). Also, it will be trivial to remove if/when we get around to doing that.

@bradfitz (Member) commented Dec 3, 2019

Why not both? 🤷‍♂️

@mdempsky (Member) commented Dec 3, 2019

Since we've isolated the issue to 5.2–5.4, and 5.2 is EOL while 5.3 and 5.4 should have fixes soon... Will there still be a need to work around this issue in 2 months once Go 1.14 is released?

@aclements (Member, Author) commented Dec 3, 2019

I think we need a workaround for the beta, which is imminent. Beyond that, my inclination is to keep a workaround in place for 1.14 because the workaround is low cost and the impact of this bug is so subtle. But I'm okay with removing the workaround in 1.15, given the kernel release cycles.

@bradfitz (Member) commented Dec 3, 2019

@mdempsky, if we were confident that the kernels won't be in use then, maybe. But if there's a chance people will be running Go on them, we really don't want more of these bug reports.

@mdempsky (Member) commented Dec 3, 2019

@bradfitz Can we change the GitHub issue template to ask for the Linux kernel version? That would at least make it easy to recognize that someone is using an affected kernel and to suggest updating.

I think both of Austin's CLs are fine. I just think that if it's impossible to completely work around the issue from userspace, then we should really push on getting it fixed in the kernel and on getting systems to pick up those fixes.

@bradfitz (Member) commented Dec 3, 2019

@mdempsky, the kernel version is already in the output of go bug, but people don't use that often, and the environment where the Go binary is deployed and hitting problems likely (a) doesn't have the go command available, and (b) has a different kernel version than their dev machine.

I suppose we could update https://github.com/golang/go/blob/master/.github/ISSUE_TEMPLATE, but the longer we make that, the more often people get overwhelmed and just delete it all.

@networkimprov commented Dec 3, 2019

Runtime (+ stdlib?) panics could print the OS version?

@myitcv (Member) commented Dec 3, 2019

For anyone running Ubuntu, the mainline daily kernel build now contains the relevant fix:

https://kernel.ubuntu.com/~kernel-ppa/mainline/daily/2019-12-03/

As expected, a variant of @mvdan's reproducer fails to trigger the issue reported in #35326:

while true; do
	td=$(mktemp -d)
	export GOCACHE=$td
	go clean -testcache -cache
	go vet encoding/... |& tee out
	grep bexport out && break
	rm -rf $td
done

I'll now run with this kernel and tip, upgrading to 5.4.2 when it's released on Ubuntu.

@mpx (Contributor) commented Dec 4, 2019

The patch is now in 5.3.15-rc1, so 5.3.15 will have the fix.

@gopherbot commented Dec 4, 2019

Change https://golang.org/cl/209899 mentions this issue: runtime: mlock top of signal stack on Linux 5.2–5.4.1

@aclements (Member, Author) commented Dec 4, 2019

Okay, one more shot at the workaround. @cherrymui pointed out that the issue almost certainly affects profiling signals as well, and neither of the workarounds I posted can be applied to those. So I went ahead and wrote the mlock solution: CL 209899. It's only slightly more complex than the workaround that disables asynchronous preemption, since the complexity of both is dominated by getting and parsing the kernel version.
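
For illustration, here is a rough sketch of the version check both workarounds need, reading the release string from uname(2) (a sketch only; the parser that actually landed in the runtime differs in detail):

package main

import (
	"fmt"
	"syscall"
)

// kernelVersion extracts major.minor.patch from the running kernel's
// release string, e.g. "5.3.15-arch1" -> 5, 3, 15. Parsing stops at
// the first byte that is neither a digit nor one of the first two
// dots ("-arch1", "+", NUL padding, and so on).
func kernelVersion() (major, minor, patch int) {
	var uts syscall.Utsname
	if err := syscall.Uname(&uts); err != nil {
		return
	}
	out := []*int{&major, &minor, &patch}
	i := 0
	for _, c := range uts.Release {
		b := byte(c)
		switch {
		case b >= '0' && b <= '9':
			*out[i] = *out[i]*10 + int(b-'0')
		case b == '.' && i < len(out)-1:
			i++
		default:
			return
		}
	}
	return
}

func main() {
	major, minor, patch := kernelVersion()
	// Affected range per this thread: 5.2.x (all), 5.3.x before
	// 5.3.15, and 5.4.x before 5.4.2.
	affected := major == 5 && (minor == 2 ||
		(minor == 3 && patch < 15) ||
		(minor == 4 && patch < 2))
	fmt.Printf("kernel %d.%d.%d affected=%v\n", major, minor, patch, affected)
}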

@aclements (Member, Author) commented Dec 4, 2019

And the patch was just merged into Linux 5.4.2.

gopherbot pushed a commit that referenced this issue Dec 5, 2019
This will be used to parse the Linux kernel versions, but this code is
generic and can be tested on its own.

For #35777.

Change-Id: If1df48d07250e5855dde45bc9d57c66f777b9fb4
Reviewed-on: https://go-review.googlesource.com/c/go/+/209597
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
gopherbot pushed a commit that referenced this issue Dec 5, 2019
Linux 5.2 introduced a bug that can corrupt vector registers on return
from a signal if the signal stack isn't faulted in:
https://bugzilla.kernel.org/show_bug.cgi?id=205663

This CL works around this by mlocking the top page of all Go signal
stacks on the affected kernels.

Fixes #35326, #35777

Change-Id: I77c80a2baa4780827633f92f464486caa222295d
Reviewed-on: https://go-review.googlesource.com/c/go/+/209899
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Reviewed-by: David Chase <drchase@google.com>

@aclements (Member, Author) commented Dec 5, 2019

Fixed by commit 8174f7f (I messed up the magic commit syntax)

@aclements (Member, Author) commented Dec 5, 2019

I've filed a tracking bug to remove the workaround for Go 1.15: #35979.

@aclements (Member, Author) commented Dec 5, 2019

To close with a summary:

Linux 5.2 introduced a bug when compiled with GCC 9 that could cause vector register corruption on amd64 on return from a signal handler where the top page of the signal stack had not yet been paged in. This can affect any program in any language (assuming it uses at least SSE registers), including versions of Go before 1.14, and generally results in arbitrary memory corruption. It became significantly more likely in Go 1.14 because the addition of signal-based non-cooperative preemption significantly increased the number of asynchronous signals received by Go processes. It's also somewhat more likely in Go than other languages because Go regularly creates new OS threads with alternate signal stacks that are likely not to be paged in.

The kernel bug was fixed in Linux 5.3.15 and 5.4.2, and the fix will appear in 5.5 and all future releases. 5.4 is a long-term support release, and 5.4.2 was released with the fix just 10 days after 5.4.

For Go 1.14, we introduced a workaround that mlocks the top page of each signal stack on the affected kernel versions to ensure it is paged in and remains paged in.
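
As a user-level illustration of that workaround (the runtime itself issues the equivalent raw mlock syscall when it sets up each thread's signal stack; lockTopPage here is a hypothetical helper, not the runtime's code):

package signalstack

import (
	"os"
	"syscall"
)

// lockTopPage mlocks the highest page of a signal stack so that it is
// paged in and stays paged in. The signal frame is written at the
// high end of the stack, which is where the kernel bug strikes.
func lockTopPage(stack []byte) error {
	page := os.Getpagesize()
	if len(stack) < page {
		return syscall.EINVAL
	}
	return syscall.Mlock(stack[len(stack)-page:])
}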

Thanks everyone for helping track this down!

I'll keep this issue pinned until next week for anyone running a tip from before the workaround.

@lmb (Contributor) commented Dec 5, 2019

Do you have an idea how much memory is going to be mlocked? Distros use different values for RLIMIT_MEMLOCK, and some of them are pretty low.

@mdempsky (Member) commented Dec 5, 2019

Looks like the workaround CL only applies to linux/amd64. Shouldn't it apply to linux/386 too?

@bradfitz (Member) commented Dec 5, 2019

> Looks like the workaround CL only applies to linux/amd64. Shouldn't it apply to linux/386 too?

Elsewhere @aclements said he's been unable to reproduce the problem on 386.

@dr2chase (Contributor) commented Dec 5, 2019

@lmb How low is "pretty low"? The expected number of locked pages is O(threads) (not goroutines), since we lock one page per thread's signal stack. Unless you have a lot of goroutines locked to threads, that ought to be about GOMAXPROCS pages, plus a few for bad luck.

And this is also tied to just a few versions of Linux that we hope nobody will be using a year from now. (A quick way to check your limit is sketched below.)
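
A small probe of that limit from Go, using golang.org/x/sys/unix (an assumed dependency; the raw getrlimit(2) syscall works just as well):

package main

import (
	"fmt"

	"golang.org/x/sys/unix"
)

func main() {
	var rl unix.Rlimit
	if err := unix.Getrlimit(unix.RLIMIT_MEMLOCK, &rl); err != nil {
		panic(err)
	}
	// With 4 KiB pages, a 64 KiB soft limit (a common default) covers
	// 16 locked signal-stack pages, i.e. roughly 16 threads.
	fmt.Printf("RLIMIT_MEMLOCK soft=%d hard=%d\n", rl.Cur, rl.Max)
}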

@gopherbot commented Dec 5, 2019

Change https://golang.org/cl/210098 mentions this issue: runtime: give useful failure message on mlock failure

@mdempsky (Member) commented Dec 5, 2019

> Elsewhere @aclements said he's been unable to reproduce the problem on 386.

Hm, where was that? I looked through this issue and #35326, and didn't notice any comments to that effect.

@aclements did mention that it also affects the XMM registers, which are available on 386. The Linux kernel fix looks generic to all of x86, not amd64-specific.

I'm willing to believe it doesn't affect 386 executables, but then I'm curious why not.

@zikaeroh commented Dec 5, 2019

@mdempsky In the comments on CL 209899.

@mdempsky (Member) commented Dec 5, 2019

@zikaeroh Hiding in plain sight. Thanks.

@aclements (Member, Author) commented Dec 5, 2019

@mdempsky: https://go-review.googlesource.com/c/go/+/209899/3/src/runtime/os_linux_386.go#7 (it's a little buried)

It may just be harder to reproduce. But we do use the X registers in memmove on 386, so I would still have expected to see it.
