Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

syscall: Exec of a setuid root binary sometimes doesn't end up as root on Linux #19546

Closed
chipaca opened this issue Mar 14, 2017 · 13 comments
Closed

Comments

@chipaca
Copy link
Contributor

@chipaca chipaca commented Mar 14, 2017

What version of Go are you using (go version)?

go version go1.8 linux/amd64

What operating system and processor architecture are you using (go env)?

GOARCH="amd64"
GOBIN=""
GOEXE=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOOS="linux"
GOPATH=""
GORACE=""
GOROOT="/home/john/go"
GOTOOLDIR="/home/john/go/pkg/tool/linux_amd64"
GCCGO="gccgo"
CC="gcc"
GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build571643010=/tmp/go-build -gno-record-gcc-switches"
CXX="g++"
CGO_ENABLED="1"
PKG_CONFIG="pkg-config"
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"

What did you do?

https://gist.github.com/chipaca/806c90d96c437444f27f45a83d00a813
and running

for i in `seq -w 99`; do ./a_go; done

(it's a race, so add more 9s if your machine is zippy)

What did you expect to see?

nothing, meaning the new process run as root.

What did you see instead?

a variable number of GOT 1000 lines, indicating that the new process sometimes was not running as root.

Notes

  • This is probably not a Go bug; see the a_p reproducer that only uses pthreads, no go involved.
  • This is probably not a pthreads bug: commenting out the import "C" line and building a “pure” Go binary reproduces the same issue (plus a lot of spam because of an unhandled EAGAIN from clone(2)).
  • Apparently you can work around (or mitigate) the issue by pinning your process to a single cpu. This works for a_go but not a_p above. Also running with a single core makes a_go behave, but not a_p.
  • This is probably a bug in the kernel itself. Our kernel team are looking into it. I'm filing this so that if other people come across this they don't waste too much time puzzling over it.
@bradfitz bradfitz changed the title syscall.Exec of a setuid root binary sometimes doesn't end up as root syscall: Exec of a setuid root binary sometimes doesn't end up as root on Linux Mar 14, 2017
@bradfitz bradfitz added this to the Unplanned milestone Mar 14, 2017
@chipaca
Copy link
Contributor Author

@chipaca chipaca commented May 17, 2017

So, we tracked this down to a couple of kernel races which have existed in the kernel since ~2009.
Colin King worked on it (see lp:1687079 and korg:195453). There's a fix in there that'll be coming to Ubuntu soon, and will be upstreamed hopefully (but note we haven't had much response from Al Viro et al on this, at least so far).
Not sure if this means this bug should be closed now or not :-) the fix is going to take a long time to get places; if this bug bites people running on strange machines (usually tied to specific, old, kernels), it might be best to add a mitigation in go itself (basically have a lock you grab for clone and for exec).
The latter is relatively easy to do, and I can propose a patch if wanted.
Over to you.

@ianlancetaylor
Copy link
Contributor

@ianlancetaylor ianlancetaylor commented May 17, 2017

I'm glad you sorted this out. Can you explain what needs to be locked to avoid the problem? The basic sequence for executing another process in Go is, of course,

clone
if (child) execve

Go programs also call clone to create new threads. Do we need to ensure that only one thread in the process calls clone in parallel, or do we need to ensure that only one thread in the process calls execve in parallel, or do we need to ensure that only one thread in the process does the sequence of clone; execve in parallel?

@chipaca
Copy link
Contributor Author

@chipaca chipaca commented May 17, 2017

There needs to be no execve going on around a clone for this not to be triggered (the problem arises because execve counts threads, and then acts based on that count).

The problem doesn't arise for the fork-then-exec, or at least I wasn't able to reproduce it via forking, possibly because there is in go a lock for forking already (and possibly the data structures that end up with the wrong data in the race get reset by the fork -- or something else entirely! I don't know).

@ianlancetaylor
Copy link
Contributor

@ianlancetaylor ianlancetaylor commented May 17, 2017

Thanks. It's not entirely obvious to me how to fix this. It's easy to acquire a lock before clone and release it when clone is complete, and it's easy to acquire a lock before execve, but it's not so easy to release the lock after the execve is complete. Did you have an approach in mind?

@chipaca
Copy link
Contributor Author

@chipaca chipaca commented May 17, 2017

When one thread successfully execves all other threads are dead, and futexes are cleaned up on thread exit (this includes execve). Go's hand-rolled locks use futexes afaik, and on the cgo side pthread_mutex also is a futex.

@chipaca
Copy link
Contributor Author

@chipaca chipaca commented May 17, 2017

(further: pthread mutexes and condition variables are documented as not preserved on execve)

@ianlancetaylor
Copy link
Contributor

@ianlancetaylor ianlancetaylor commented May 17, 2017

I don't understand how a futex could be cleaned up on thread exit, given that Go is using the system call directly, and is not calling the glibc functions. But there is a lot I don't know.

You offered to write a patch earlier--that is probably the best way to go if you have the time. Thanks.

@chipaca
Copy link
Contributor Author

@chipaca chipaca commented May 18, 2017

Yup! I've got a few things in my queue but should be able to get to it before EOW.

@chipaca
Copy link
Contributor Author

@chipaca chipaca commented May 20, 2017

OK, I've got a patch. Still needs some work (especially around splitting some bits so only Linux does this dance). Should I push it at gerrit for discussion? (I'm pretty certain it'll fail review as it stands)

@ianlancetaylor
Copy link
Contributor

@ianlancetaylor ianlancetaylor commented May 20, 2017

Sending it to Gerrit is fine. Thanks.

@chipaca
Copy link
Contributor Author

@chipaca chipaca commented May 20, 2017

https://go-review.googlesource.com/43713 runtime, syscall: Workaround for bug in linux's execve
(first time using this; let me know if/how/when i miss up)

@gopherbot
Copy link

@gopherbot gopherbot commented Jun 15, 2017

CL https://golang.org/cl/43713 mentions this issue.

@gopherbot
Copy link

@gopherbot gopherbot commented Jun 15, 2017

CL https://golang.org/cl/45991 mentions this issue.

gopherbot pushed a commit that referenced this issue Jun 19, 2017
This is a runtime version of sync.RWMutex that can be used by code in
the runtime package. The type is not quite the same, in that the zero
value is not valid.

For future use by CL 43713.

Updates #19546

Change-Id: I431eb3688add16ce1274dab97285f555b72735bf
Reviewed-on: https://go-review.googlesource.com/45991
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
@gopherbot gopherbot closed this in 91139b8 Jun 20, 2017
@golang golang locked and limited conversation to collaborators Jun 20, 2018
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Linked pull requests

Successfully merging a pull request may close this issue.

None yet
4 participants
You can’t perform that action at this time.