runtime: mlock of signal stack failed: 12 #37436

Closed
karalabe opened this issue Feb 25, 2020 · 128 comments
@karalabe
Contributor

@karalabe karalabe commented Feb 25, 2020

What version of Go are you using (go version)?

$ go version
go version go1.14rc1 linux/amd64

Does this issue reproduce with the latest release?

I hit this with golang:1.14-rc-alpine docker image, the error does not happen in 1.13.

What operating system and processor architecture are you using (go env)?

go env Output
$ go env
GO111MODULE=""
GOARCH="amd64"
GOBIN=""
GOCACHE="/root/.cache/go-build"
GOENV="/root/.config/go/env"
GOEXE=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOINSECURE=""
GONOPROXY=""
GONOSUMDB=""
GOOS="linux"
GOPATH="/go"
GOPRIVATE=""
GOPROXY="https://proxy.golang.org,direct"
GOROOT="/usr/local/go"
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GOTOOLDIR="/usr/local/go/pkg/tool/linux_amd64"
GCCGO="gccgo"
AR="ar"
CC="gcc"
CXX="g++"
CGO_ENABLED="1"
GOMOD=""
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build968395959=/tmp/go-build -gno-record-gcc-switches"

What did you do?

Clone https://github.com/ethereum/go-ethereum, replace the builder version in Dockerfile to golang:1.14-rc-alpine (or use the Dockerfile from below), then from the root build the docker image:

$ docker build .

FROM golang:1.14-rc-alpine

RUN apk add --no-cache make gcc musl-dev linux-headers git

ADD . /go-ethereum
RUN cd /go-ethereum && make geth

What did you expect to see?

Go should run our build scripts successfully.

What did you see instead?

Step 4/9 : RUN cd /go-ethereum && make geth
 ---> Running in 67781151653c
env GO111MODULE=on go run build/ci.go install ./cmd/geth
runtime: mlock of signal stack failed: 12
runtime: increase the mlock limit (ulimit -l) or
runtime: update your kernel to 5.3.15+, 5.4.2+, or 5.5+
fatal error: mlock failed

runtime stack:
runtime.throw(0xa3b461, 0xc)
	/usr/local/go/src/runtime/panic.go:1112 +0x72
runtime.mlockGsignal(0xc0004a8a80)
	/usr/local/go/src/runtime/os_linux_x86.go:72 +0x107
runtime.mpreinit(0xc000401880)
	/usr/local/go/src/runtime/os_linux.go:341 +0x78
runtime.mcommoninit(0xc000401880)
	/usr/local/go/src/runtime/proc.go:630 +0x108
runtime.allocm(0xc000033800, 0xa82400, 0x0)
	/usr/local/go/src/runtime/proc.go:1390 +0x14e
runtime.newm(0xa82400, 0xc000033800)
	/usr/local/go/src/runtime/proc.go:1704 +0x39
runtime.startm(0x0, 0xc000402901)
	/usr/local/go/src/runtime/proc.go:1869 +0x12a
runtime.wakep(...)
	/usr/local/go/src/runtime/proc.go:1953
runtime.resetspinning()
	/usr/local/go/src/runtime/proc.go:2415 +0x93
runtime.schedule()
	/usr/local/go/src/runtime/proc.go:2527 +0x2de
runtime.mstart1()
	/usr/local/go/src/runtime/proc.go:1104 +0x8e
runtime.mstart()
	/usr/local/go/src/runtime/proc.go:1062 +0x6e

...
make: *** [Makefile:16: geth] Error 2
@josharian
Contributor

@josharian josharian commented Feb 25, 2020

That is the consequence of trying to work around a kernel bug that significantly impacts Go programs. See #35777. The error message suggests the only two known available fixes: increase the ulimit or upgrade to a newer kernel.

@karalabe
Contributor Author

@karalabe karalabe commented Feb 25, 2020

The error message suggests the only two known available fixes: increase the ulimit or upgrade to a newer kernel.

Well, I'm running the official alpine docker image, the purpose of which is to be able to build a Go program. Apparently it cannot. IMHO the upstream image should be the one fixed to fulfill its purpose, not our build infra to hack around a bug in the upstream image.

@josharian
Contributor

@josharian josharian commented Feb 25, 2020

Is the Alpine image maintained by the Go team? (Genuine question. I don’t know about it.) Either way, yes, the image should be fixed, ideally with a kernel upgrade.

@karalabe
Contributor Author

@karalabe karalabe commented Feb 25, 2020

I'm not sure who maintains the docker images (https://hub.docker.com/_/golang) or how, but the docker hub repo is an "Official Image", which is a very hard status to obtain, so I assume someone high enough up the food chain is responsible.

@networkimprov

@networkimprov networkimprov commented Feb 25, 2020

It's "maintained by the Docker Community". Issues should be filed at

https://github.com/docker-library/golang/issues

EDIT: the problem is the host kernel, not the Docker library image, so they can't fix it.

@karalabe
Contributor Author

@karalabe karalabe commented Feb 25, 2020

So, the official solution to Go crashing is to point fingers to everyone else to hack around your code? Makes sense.

@josharian
Contributor

@josharian josharian commented Feb 25, 2020

@karalabe I would like to remind you of https://golang.org/conduct. In particular, please be respectful and be charitable.

@karalabe
Contributor Author

@karalabe karalabe commented Feb 25, 2020

Please answer the question

@josharian
Contributor

@josharian josharian commented Feb 25, 2020

It is standard practice to redirect issues to the correct issue tracking system.

There is an extensive discussion of possible workarounds and fixes in the issue I linked to earlier, if you would like to see what options were considered on the Go side.

@karalabe
Contributor Author

@karalabe karalabe commented Feb 25, 2020

This issue does not happen with Go 1.13. Ergo, it is a bug introduced in Go 1.14.

Saying you can't fix it and telling people to use workarounds is dishonest, because reverting a piece of code would actually fix it. An alternative solution would be to detect the problematic platforms / kernels and provide a fallback mechanism baked into Go.

Telling people to use a different kernel is especially nasty, because it's not as if most people can go around and build themselves a new kernel. If alpine doesn't release a new kernel, there's not much most devs can do. And lastly if your project relies on a stable infrastructure where you can't just swap out kernels, you're again in a pickle.

It is standard practice to redirect issues to the correct issue tracking system.

The fact that Go crashes is not the fault of docker. Redirecting a Go crash to a docker repo is deflection.

@networkimprov

@networkimprov networkimprov commented Feb 25, 2020

You could also disable preemptive scheduling at runtime

$ GODEBUG=asyncpreemptoff=1 ./your_app

@ianlancetaylor we have a suggestion to do this when running on an affected kernel; is that viable?

BTW, It's a known problem that Docker library modules don't get timely updates, which is a security liability. Caveat emptor.
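As a rough illustration of the GODEBUG suggestion above, the following sketch launches another program with asyncpreemptoff=1 added to its environment while keeping any GODEBUG settings that are already present. The helper name withGodebug and the wrapper program itself are hypothetical; the only facts relied on are that GODEBUG holds comma-separated name=value pairs and that asyncpreemptoff=1 is the setting named in this comment.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
	"strings"
)

// withGodebug returns a copy of env in which GODEBUG includes setting,
// preserving any GODEBUG settings that were already present.
// (Hypothetical helper, written for this example only.)
func withGodebug(env []string, setting string) []string {
	out := make([]string, 0, len(env)+1)
	found := false
	for _, kv := range env {
		if strings.HasPrefix(kv, "GODEBUG=") {
			found = true
			if cur := strings.TrimPrefix(kv, "GODEBUG="); cur != "" {
				kv = "GODEBUG=" + cur + "," + setting
			} else {
				kv = "GODEBUG=" + setting
			}
		}
		out = append(out, kv)
	}
	if !found {
		out = append(out, "GODEBUG="+setting)
	}
	return out
}

func main() {
	if len(os.Args) < 2 {
		fmt.Fprintln(os.Stderr, "usage: wrapper <program> [args...]")
		os.Exit(2)
	}
	// Run the target program with async preemption disabled.
	cmd := exec.Command(os.Args[1], os.Args[2:]...)
	cmd.Env = withGodebug(os.Environ(), "asyncpreemptoff=1")
	cmd.Stdin, cmd.Stdout, cmd.Stderr = os.Stdin, os.Stdout, os.Stderr
	if err := cmd.Run(); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
}
```

A wrapper like this only changes the environment of the child process; it does nothing about the kernel itself.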

@josharian
Contributor

@josharian josharian commented Feb 25, 2020

The kernel bug manifested as random memory corruption in Go 1.13 (both with and without preemptive scheduling). What is new in Go 1.14 is that we detect the presence of the bug, attempt to work around it, and prefer to crash early and loudly if that is not possible. You can see the details in the issue I referred you to.

Since you have called me dishonest and nasty, I will remind you again about the code of conduct: https://golang.org/conduct. I am also done participating in this conversation.

@networkimprov

@networkimprov networkimprov commented Feb 25, 2020

@karalabe, I misspoke, the issue is your host kernel, not the Docker image. Are you unable to update it?

@karalabe
Contributor Author

@karalabe karalabe commented Feb 25, 2020

I'm on latest Ubuntu and latest available kernel. Apparently all available Ubuntu kernels are unsuitable for Go 1.14 https://packages.ubuntu.com/search?keywords=linux-image-generic based on the error message.

@networkimprov

@networkimprov networkimprov commented Feb 25, 2020

Can you add the output of $ uname -a to the main issue text? And maybe remove the goroutine stack traces?

I've posted a note to golang-dev.

cc @aclements

@mwhudson
Contributor

@mwhudson mwhudson commented Feb 25, 2020

When you say you are on the latest ubuntu and kernel what exactly do you mean (i.e. output of dpkg -l linux-image-*, lsb_release -a, uname -a, that sort of thing) because as far as I can see the fix is in the kernel in the updates pocket for both 19.10 (current stable release) and 20.04 (devel release). It's not in the GA kernel for 18.04 but is in the HWE kernel, but otoh those aren't built with gcc 9 and so shouldn't be affected anyway.

@ianlancetaylor
Contributor

@ianlancetaylor ianlancetaylor commented Feb 26, 2020

@networkimprov Disabling signal preemption makes the bug less likely to occur but it is still present. It's a bug in certain Linux kernel versions. The bug affects all programs in all languages. It's particularly likely to be observable with Go programs that use signal preemption, but it's present for all other programs as well.

Go tries to work around the bug by mlocking the signal stack. That works fine unless you run into the mlock limit. I suppose that one downside of this workaround is that we make the problem very visible, rather than occasionally failing due to random memory corruption as would happen if we didn't do the mlock.

At some point there is no way to work around a kernel bug.

@myitcv
Member

@myitcv myitcv commented Feb 26, 2020

@karalabe

I'm on latest Ubuntu and latest available kernel

$ docker pull -q ubuntu:latest
docker.io/library/ubuntu:latest
$ docker run --rm -i -t ubuntu
root@e2689d364a25:/# uname -a
Linux e2689d364a25 5.4.8-050408-generic #202001041436 SMP Sat Jan 4 19:40:55 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux

which does satisfy the minimum version requirements.

Similarly:

$ docker pull -q golang:1.14-alpine
docker.io/library/golang:1.14-alpine
$ docker run --rm -i -t golang:1.14-alpine
/go # uname -a
Linux d4a35392c5b8 5.4.8-050408-generic #202001041436 SMP Sat Jan 4 19:40:55 UTC 2020 x86_64 Linux

Can you clarify what you're seeing?

@karalabe
Contributor Author

@karalabe karalabe commented Feb 26, 2020

@mwhudson

$ dpkg -l linux-image-*
Desired=Unknown/Install/Remove/Purge/Hold
| Status=Not/Inst/Conf-files/Unpacked/halF-conf/Half-inst/trig-aWait/Trig-pend
|/ Err?=(none)/Reinst-required (Status,Err: uppercase=bad)
||/ Name                                   Version      Architecture Description
+++-======================================-============-============-===============================================================
rc  linux-image-4.13.0-16-generic          4.13.0-16.19 amd64        Linux kernel image for version 4.13.0 on 64 bit x86 SMP
rc  linux-image-4.13.0-19-generic          4.13.0-19.22 amd64        Linux kernel image for version 4.13.0 on 64 bit x86 SMP
rc  linux-image-4.13.0-21-generic          4.13.0-21.24 amd64        Linux kernel image for version 4.13.0 on 64 bit x86 SMP
rc  linux-image-4.13.0-25-generic          4.13.0-25.29 amd64        Linux kernel image for version 4.13.0 on 64 bit x86 SMP
rc  linux-image-4.13.0-36-generic          4.13.0-36.40 amd64        Linux kernel image for version 4.13.0 on 64 bit x86 SMP
rc  linux-image-4.13.0-37-generic          4.13.0-37.42 amd64        Linux kernel image for version 4.13.0 on 64 bit x86 SMP
rc  linux-image-4.13.0-38-generic          4.13.0-38.43 amd64        Linux kernel image for version 4.13.0 on 64 bit x86 SMP
rc  linux-image-4.13.0-41-generic          4.13.0-41.46 amd64        Linux kernel image for version 4.13.0 on 64 bit x86 SMP
rc  linux-image-4.13.0-45-generic          4.13.0-45.50 amd64        Linux kernel image for version 4.13.0 on 64 bit x86 SMP
rc  linux-image-4.15.0-23-generic          4.15.0-23.25 amd64        Signed kernel image generic
rc  linux-image-4.15.0-30-generic          4.15.0-30.32 amd64        Signed kernel image generic
rc  linux-image-4.15.0-32-generic          4.15.0-32.35 amd64        Signed kernel image generic
rc  linux-image-4.15.0-34-generic          4.15.0-34.37 amd64        Signed kernel image generic
rc  linux-image-4.15.0-36-generic          4.15.0-36.39 amd64        Signed kernel image generic
rc  linux-image-4.15.0-39-generic          4.15.0-39.42 amd64        Signed kernel image generic
rc  linux-image-4.15.0-42-generic          4.15.0-42.45 amd64        Signed kernel image generic
rc  linux-image-4.15.0-43-generic          4.15.0-43.46 amd64        Signed kernel image generic
rc  linux-image-4.15.0-45-generic          4.15.0-45.48 amd64        Signed kernel image generic
rc  linux-image-4.15.0-47-generic          4.15.0-47.50 amd64        Signed kernel image generic
rc  linux-image-4.18.0-17-generic          4.18.0-17.18 amd64        Signed kernel image generic
rc  linux-image-5.0.0-13-generic           5.0.0-13.14  amd64        Signed kernel image generic
rc  linux-image-5.0.0-15-generic           5.0.0-15.16  amd64        Signed kernel image generic
rc  linux-image-5.0.0-16-generic           5.0.0-16.17  amd64        Signed kernel image generic
rc  linux-image-5.0.0-17-generic           5.0.0-17.18  amd64        Signed kernel image generic
rc  linux-image-5.0.0-19-generic           5.0.0-19.20  amd64        Signed kernel image generic
rc  linux-image-5.0.0-20-generic           5.0.0-20.21  amd64        Signed kernel image generic
rc  linux-image-5.0.0-21-generic           5.0.0-21.22  amd64        Signed kernel image generic
rc  linux-image-5.0.0-25-generic           5.0.0-25.26  amd64        Signed kernel image generic
rc  linux-image-5.0.0-27-generic           5.0.0-27.28  amd64        Signed kernel image generic
rc  linux-image-5.0.0-29-generic           5.0.0-29.31  amd64        Signed kernel image generic
rc  linux-image-5.0.0-32-generic           5.0.0-32.34  amd64        Signed kernel image generic
rc  linux-image-5.3.0-19-generic           5.3.0-19.20  amd64        Signed kernel image generic
rc  linux-image-5.3.0-22-generic           5.3.0-22.24  amd64        Signed kernel image generic
rc  linux-image-5.3.0-23-generic           5.3.0-23.25  amd64        Signed kernel image generic
rc  linux-image-5.3.0-24-generic           5.3.0-24.26  amd64        Signed kernel image generic
rc  linux-image-5.3.0-26-generic           5.3.0-26.28  amd64        Signed kernel image generic
ii  linux-image-5.3.0-29-generic           5.3.0-29.31  amd64        Signed kernel image generic
ii  linux-image-5.3.0-40-generic           5.3.0-40.32  amd64        Signed kernel image generic
rc  linux-image-extra-4.13.0-16-generic    4.13.0-16.19 amd64        Linux kernel extra modules for version 4.13.0 on 64 bit x86 SMP
rc  linux-image-extra-4.13.0-19-generic    4.13.0-19.22 amd64        Linux kernel extra modules for version 4.13.0 on 64 bit x86 SMP
rc  linux-image-extra-4.13.0-21-generic    4.13.0-21.24 amd64        Linux kernel extra modules for version 4.13.0 on 64 bit x86 SMP
rc  linux-image-extra-4.13.0-25-generic    4.13.0-25.29 amd64        Linux kernel extra modules for version 4.13.0 on 64 bit x86 SMP
rc  linux-image-extra-4.13.0-36-generic    4.13.0-36.40 amd64        Linux kernel extra modules for version 4.13.0 on 64 bit x86 SMP
rc  linux-image-extra-4.13.0-37-generic    4.13.0-37.42 amd64        Linux kernel extra modules for version 4.13.0 on 64 bit x86 SMP
rc  linux-image-extra-4.13.0-38-generic    4.13.0-38.43 amd64        Linux kernel extra modules for version 4.13.0 on 64 bit x86 SMP
rc  linux-image-extra-4.13.0-41-generic    4.13.0-41.46 amd64        Linux kernel extra modules for version 4.13.0 on 64 bit x86 SMP
rc  linux-image-extra-4.13.0-45-generic    4.13.0-45.50 amd64        Linux kernel extra modules for version 4.13.0 on 64 bit x86 SMP
ii  linux-image-generic                    5.3.0.40.34  amd64        Generic Linux kernel image

$ lsb_release -a

No LSB modules are available.
Distributor ID:	Ubuntu
Description:	Ubuntu 19.10
Release:	19.10
Codename:	eoan
$ uname -a

Linux roaming-parsley 5.3.0-40-generic #32-Ubuntu SMP Fri Jan 31 20:24:34 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
$ sudo apt-get dist-upgrade 

Reading package lists... Done
Building dependency tree       
Reading state information... Done
Calculating upgrade... Done
0 upgraded, 0 newly installed, 0 to remove and 0 not upgraded.

@myitcv

FROM golang:1.14-alpine
RUN  apk add --no-cache make gcc musl-dev linux-headers git wget

RUN \
  wget -O geth.tgz "https://github.com/ethereum/go-ethereum/archive/v1.9.11.tar.gz" && \
  mkdir /go-ethereum && tar -C /go-ethereum -xzf geth.tgz --strip-components=1 && \
  cd /go-ethereum && make geth
$ docker build .

Sending build context to Docker daemon  2.048kB
Step 1/3 : FROM golang:1.14-alpine
1.14-alpine: Pulling from library/golang
c9b1b535fdd9: Already exists 
cbb0d8da1b30: Already exists 
d909eff28200: Already exists 
8b9d9d6824f5: Pull complete 
a50ef8b76e53: Pull complete 
Digest: sha256:544b5e7984e7b2e7a2a9b967bbab6264cf91a3b3816600379f5dc6fbc09466cc
Status: Downloaded newer image for golang:1.14-alpine
 ---> 51e47ee4db58

Step 2/3 : RUN  apk add --no-cache make gcc musl-dev linux-headers git wget
 ---> Running in 879f98ddb4ff
[...]
OK: 135 MiB in 34 packages
Removing intermediate container 879f98ddb4ff
 ---> 9132e4dae4c3

Step 3/3 : RUN   wget -O geth.tgz "https://github.com/ethereum/go-ethereum/archive/v1.9.11.tar.gz" &&   mkdir /go-ethereum && tar -C /go-ethereum -xzf geth.tgz --strip-components=1 &&   cd /go-ethereum && make geth
 ---> Running in a24c806c60d3
2020-02-26 07:18:54--  https://github.com/ethereum/go-ethereum/archive/v1.9.11.tar.gz
[...]
2020-02-26 07:18:58 (2.48 MB/s) - 'geth.tgz' saved [8698235]

env GO111MODULE=on go run build/ci.go install ./cmd/geth
runtime: mlock of signal stack failed: 12
runtime: increase the mlock limit (ulimit -l) or
runtime: update your kernel to 5.3.15+, 5.4.2+, or 5.5+
fatal error: mlock failed
@myitcv
Member

@myitcv myitcv commented Feb 26, 2020

Sorry, my previous comment was misleading. Because of course the kernel version returned by uname -a within the Docker container will be that of the host.

Hence per:

$ uname -a

Linux roaming-parsley 5.3.0-40-generic #32-Ubuntu SMP Fri Jan 31 20:24:34 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux

you need to upgrade the host OS kernel.

FWIW, the steps you lay out above using Alpine to make geth work for me:

...
Done building.
Run "./build/bin/geth" to launch geth.
@karalabe
Contributor Author

@karalabe karalabe commented Feb 26, 2020

Yes, but in my previous posts I highlighted that I'm already on the latest Ubuntu and have installed the latest available kernel from the package repository. I don't see how I could update my kernel to work with Go 1.14 apart from rebuilding the entire kernel from source. Maybe I'm missing something?

@karalabe
Contributor Author

@karalabe karalabe commented Feb 26, 2020

Just to emphasize, I do understand what the workaround is and if I want to make it work, I can. I opened this issue report because I'd expect other people to hit the same problem eventually. If just updating my system would fix the issue I'd gladly accept that as a solution, but unless I'm missing something, the fixed kernel is not available for (recent) Ubuntu users, so quite a large userbase might be affected.

@mwhudson
Contributor

@mwhudson mwhudson commented Feb 26, 2020

Yes, but in my previous posts I highlighted that I'm already on the latest Ubuntu and have installed the latest available kernel from the package repository. I don't see how I could update my kernel to work with Go 1.14 apart from rebuilding the entire kernel from source. Maybe I'm missing something?

Hm yes, I have just reproduced on focal too. The fix is present in the git for the Ubuntu eoan kernel: https://kernel.ubuntu.com/git/ubuntu/ubuntu-eoan.git/commit/?id=59e7e6398a9d6d91cd01bc364f9491dc1bf2a426 and that commit is in the ancestry for the 5.3.0-40.32 so the fix should be in the kernel you are using. In other words, I think we need to get the kernel team involved -- I'll try to do that.

@myitcv
Member

@myitcv myitcv commented Feb 26, 2020

@karalabe - I've just realised my mistake: I thought I was using the latest Ubuntu, I am in fact using eoan.

@mwhudson - just one thing to note (although you're probably already aware of this), a superficial glance at the code responsible for this switch:

if major == 5 && (minor == 2 || minor == 3 && patch < 15 || minor == 4 && patch < 2) {
	gsignalInitQuirk = mlockGsignal
	if m0.gsignal != nil {
		throw("gsignal quirk too late")
	}
}

seems to suggest that the Go side is checking for patch release 15 or greater. What does 5.3.0-40.32 report as a patch version? I'm guessing 0?
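The question can be explored with a small stand-alone sketch that applies the same predicate to release strings seen in this thread. The parser below is a simplified stand-in, not the runtime's own version parser; parseRelease and needsQuirk are names invented for this example, and it assumes the numeric prefix of the uname release string is what gets compared.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseRelease extracts major.minor.patch from a uname-style release string
// such as "5.3.0-40-generic". Anything after the numeric prefix is ignored,
// and a missing patch component is treated as 0.
func parseRelease(release string) (major, minor, patch int, ok bool) {
	// Cut at the first character that is neither a digit nor a dot.
	end := strings.IndexFunc(release, func(r rune) bool {
		return r != '.' && (r < '0' || r > '9')
	})
	if end >= 0 {
		release = release[:end]
	}
	parts := strings.Split(release, ".")
	if len(parts) < 2 {
		return 0, 0, 0, false
	}
	nums := make([]int, 3)
	for i := 0; i < len(parts) && i < 3; i++ {
		n, err := strconv.Atoi(parts[i])
		if err != nil {
			return 0, 0, 0, false
		}
		nums[i] = n
	}
	return nums[0], nums[1], nums[2], true
}

// needsQuirk is the version predicate quoted above, copied as-is.
func needsQuirk(major, minor, patch int) bool {
	return major == 5 && (minor == 2 || minor == 3 && patch < 15 || minor == 4 && patch < 2)
}

func main() {
	for _, rel := range []string{
		"5.3.0-40-generic",     // host kernel reported by karalabe
		"5.4.8-050408-generic", // host kernel reported by myitcv
		"4.15.0-47-generic",    // older kernel, outside the affected range
	} {
		major, minor, patch, ok := parseRelease(rel)
		if !ok {
			fmt.Printf("%s: could not parse\n", rel)
			continue
		}
		fmt.Printf("%s -> %d.%d.%d quirk=%v\n", rel, major, minor, patch, needsQuirk(major, minor, patch))
	}
}
```

Under these assumptions, 5.3.0-40-generic parses as 5.3.0, so the predicate is true even if the distribution has backported the fix: a version number alone cannot reveal a backport.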

Re-opening this discussion until we round out the issue here.

@myitcv myitcv reopened this Feb 26, 2020
@neelance
Member

@neelance neelance commented Feb 26, 2020

A little summary because I had to piece it together myself:

So it seems like Ubuntu's kernel is patched, but the workaround gets enabled anyways.

@mwhudson
Contributor

@mwhudson mwhudson commented Feb 26, 2020

So it seems like Ubuntu's kernel is patched, but the workaround gets enabled anyways.

Oh right, yes I should actually read the failure shouldn't I? This is the workaround failing rather than the original bug, in a case where the workaround isn't actually needed but there's no good way for Go to know this. I can patch the check out of the Go 1.14 package in Ubuntu but that doesn't help users running e.g. the docker golang:1.14-alpine image. Hrm.

@mwhudson
Contributor

@mwhudson mwhudson commented Feb 26, 2020

I guess the question is, how many users are using "vulnerable" kernels at this point. There can't be all that many distributions that are compiling an unpatched kernel with gcc 9 by now.

arush-sal added a commit to arush-sal/provider-aws that referenced this issue Jul 7, 2020
Bumps submodule to use go 1.14.4 to pick up a
workaround for linux kernel bug described in
this issue golang/go#37436

Signed-off-by: hasheddan <georgedanielmangum@gmail.com>
@DanielShaulov

@DanielShaulov DanielShaulov commented Jul 13, 2020

Just a heads up - Go1.15 is about to be released, and a beta was already released, but the temporary patch was not yet removed (There are todo comments to remove at Go1.15).

I think it is important to remove the workaround since Ubuntu 20.04 LTS uses a patched 5.4.0 kernel. This means that any user on Ubuntu 20.04 will still unnecessarily mlock pages, and if they run in a docker container, that warning will be displayed for every crash, regardless of the fact that their kernel is not really buggy. So those users might be sent on a wild goose chase trying to understand and read all this info, and it will have nothing to do with their bug, probably for the entirety of the Ubuntu 20.04 life cycle.

@networkimprov

@networkimprov networkimprov commented Jul 13, 2020

@DanielShaulov thanks. Could you open a new issue for that? This one pertains to the problem in 1.14.

@DanielShaulov

@DanielShaulov DanielShaulov commented Jul 13, 2020

@gopherbot

@gopherbot gopherbot commented Jul 20, 2020

Change https://golang.org/cl/243658 mentions this issue: runtime: let GODEBUG=mlock=0 disable mlock calls

aarongable added a commit to letsencrypt/boulder that referenced this issue Jul 20, 2020
This was necessary to work around a poor interaction between
Go 1.14.x and unpatched linux kernels. Although we are still using
the same version of Go, and the Linux project only released the
fix in kernel 5.4.2 and later, Ubuntu has backported the fix into
Focal Fossa 20.04's 5.4.0 kernel. Therefore this workaround is
no longer needed.
golang/go#37436 (comment)

This also removes one need for elevated permissions, making it
easier to use docker rootless for development.
@gopherbot

@gopherbot gopherbot commented Jul 22, 2020

Change https://golang.org/cl/244059 mentions this issue: runtime: don't mlock on Ubuntu 5.4 systems

gopherbot pushed a commit that referenced this issue Jul 22, 2020
For #35777
For #37436
Fixes #40184

Change-Id: I68561497d9258e994d1c6c48d4fb41ac6130ee3a
Reviewed-on: https://go-review.googlesource.com/c/go/+/244059
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
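The thread gives only the title of this change ("don't mlock on Ubuntu 5.4 systems"), not how it identifies such systems. The sketch below shows one way a vendor exception could be layered on the version predicate quoted earlier in the thread. It assumes, purely for illustration, that the distribution name appears in the kernel's version banner (such as the contents of /proc/version); needsQuirkWithVendor and the sample banner strings are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// needsQuirk mirrors the version predicate quoted earlier in the thread.
func needsQuirk(major, minor, patch int) bool {
	return major == 5 && (minor == 2 || minor == 3 && patch < 15 || minor == 4 && patch < 2)
}

// needsQuirkWithVendor adds a hypothetical vendor exception: a 5.4.x kernel
// whose version banner mentions "Ubuntu" is treated as already patched.
// The banner argument stands in for the contents of /proc/version.
func needsQuirkWithVendor(major, minor, patch int, banner string) bool {
	if !needsQuirk(major, minor, patch) {
		return false
	}
	if major == 5 && minor == 4 && strings.Contains(banner, "Ubuntu") {
		return false
	}
	return true
}

func main() {
	// Illustrative banner text, not copied from a real system.
	ubuntu := "Linux version 5.4.0-example (Ubuntu build)"
	other := "Linux version 5.4.0-example (some other distro)"

	fmt.Println(needsQuirkWithVendor(5, 4, 0, ubuntu)) // vendor exception applies
	fmt.Println(needsQuirkWithVendor(5, 4, 0, other))  // plain version check applies
	fmt.Println(needsQuirkWithVendor(5, 3, 0, ubuntu)) // exception limited to 5.4
}
```

The exception here is deliberately limited to 5.4 kernels, matching the change title; whether and how other versions or vendors were handled is not stated in this thread.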
@ciceroverneck

@ciceroverneck ciceroverneck commented Jul 31, 2020

When does go1.14.7 release with this modification?

@networkimprov

@networkimprov networkimprov commented Jul 31, 2020

The fix has been in every release since 1.14.1, which shipped months ago.

@gopherbot

@gopherbot gopherbot commented Jul 31, 2020

Change https://golang.org/cl/246200 mentions this issue: runtime: revert signal stack mlocking

gopherbot pushed a commit that referenced this issue Aug 13, 2020
Go 1.14 included a (rather awful) workaround for a Linux kernel bug
that corrupted vector registers on x86 CPUs during signal delivery
(https://bugzilla.kernel.org/show_bug.cgi?id=205663). This bug was
introduced in Linux 5.2 and fixed in 5.3.15, 5.4.2 and all 5.5 and
later kernels. The fix was also back-ported by major distros. This
workaround was necessary, but had unfortunate downsides, including
causing Go programs to exceed the mlock ulimit in many configurations
(#37436).

We're reasonably confident that by the Go 1.16 release, the number of
systems running affected kernels will be vanishingly small. Hence,
this CL removes this workaround.

This effectively reverts CLs 209597 (version parser), 209899 (mlock
top of signal stack), 210299 (better failure message), 223121 (soft
mlock failure handling), and 244059 (special-case patched Ubuntu
kernels). The one thing we keep is the osArchInit function. It's empty
everywhere now, but is a reasonable hook to have.

Updates #35326, #35777 (the original register corruption bugs).
Updates #40184 (request to revert in 1.15).
Fixes #35979.

Change-Id: Ie213270837095576f1f3ef46bf3de187dc486c50
Reviewed-on: https://go-review.googlesource.com/c/go/+/246200
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
wadells added a commit to gravitational/gravity that referenced this issue Aug 13, 2020
Without this, `go get` will fail on Linux 5.4.0 (Ubuntu 20.04) when it
hits golang/go#37436, with the following
signature:

  walt@work:~/git/gravity/docs$ make docs
  mkdir -p ../build/docs
  docker build \
  --build-arg UID=$(id -u) \
          --build-arg GID=$(id -g) \
          --build-arg WORKDIR=/home \
          --build-arg PORT=6601 \
          --no-cache \
          --tag docs-buildbox:latest .
  Sending build context to Docker daemon  11.16MB
  Step 1/11 : FROM quay.io/gravitational/debian-venti:go1.14-stretch AS
  milv-builder
   ---> cd741211fe17
  Step 2/11 : RUN GO111MODULE=on go get -u -v
  github.com/magicmatatjahu/milv@v0.0.6
   ---> Running in aa3ca3e24c02
  go: downloading github.com/magicmatatjahu/milv v0.0.6
  go: finding module for package github.com/pkg/errors
  go: finding module for package github.com/olekukonko/tablewriter
  go: finding module for package gopkg.in/yaml.v2
  go: finding module for package github.com/schollz/closestmatch
  go: finding module for package golang.org/x/net/html
  go: downloading gopkg.in/yaml.v2 v2.3.0
  go: downloading github.com/pkg/errors v0.9.1
  go: downloading github.com/schollz/closestmatch v1.0.0
  go: downloading github.com/olekukonko/tablewriter v0.0.4
  go: downloading golang.org/x/net v0.0.0-20200707034311-ab3426394381
  runtime: mlock of signal stack failed: 12
  runtime: increase the mlock limit (ulimit -l) or
  runtime: update your kernel to 5.3.15+, 5.4.2+, or 5.5+
  fatal error: mlock failed

While this won't affect our current release infra (kernel 3.10.0), it is
an important fix for developers running affected kernel versions.
wadells added a commit to gravitational/gravity that referenced this issue Aug 13, 2020
Without this, `go get` will fail on Linux 5.4.0 (Ubuntu 20.04) when it
hits golang/go#37436, with the following
signature:

  walt@work:~/git/gravity/docs$ make docs
  mkdir -p ../build/docs
  // snip ...
  go: downloading golang.org/x/net v0.0.0-20200707034311-ab3426394381
  runtime: mlock of signal stack failed: 12
  runtime: increase the mlock limit (ulimit -l) or
  runtime: update your kernel to 5.3.15+, 5.4.2+, or 5.5+
  fatal error: mlock failed

While this won't affect our current release infra (kernel 3.10.0), it is
an important fix for developers running affected kernel versions.