New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

runtime: remove assumptions on Android Bionic's TLS layout #29674

Open
rprichard opened this Issue Jan 11, 2019 · 7 comments

Comments

Projects
None yet
5 participants
@rprichard
Copy link

rprichard commented Jan 11, 2019

What version of Go are you using (go version)?

$ go version
go version devel +67164c3014 Fri Jan 4 20:25:45 2019 -0800 linux/amd64

Does this issue reproduce with the latest release?

yes

What operating system and processor architecture are you using (go env)?

Android

go env Output
$ go env
GOARCH="amd64"
GOBIN=""
GOCACHE="/usr/local/google/home/rprichard/.cache/go-build"
GOEXE=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOOS="linux"
GOPATH="/usr/local/google/home/rprichard/go"
GOPROXY=""
GORACE=""
GOROOT="/x/golang/go"
GOTMPDIR=""
GOTOOLDIR="/x/golang/go/pkg/tool/linux_amd64"
GCCGO="gccgo"
CC="gcc"
CXX="g++"
CGO_ENABLED="1"
GOMOD=""
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build009028863=/tmp/go-build -gno-record-gcc-switches"

Details

I posted about this issue earlier on golang-dev:
https://groups.google.com/forum/#!msg/golang-dev/yVrkFnYrYPE/2G3aFzYqBgAJ

I'm working on adding ELF TLS support to Android's libc (Bionic), and Go is making assumptions about Android's TLS memory layout that a future version of Android might break. Specifically:

  • On Android/{arm,arm64}, Go saves and restores its g register to a pthread key, which it accesses directly using the thread pointer. Go allocates a pthread key and assumes that it can find it at a positive offset, within 384 words after the thread pointer. ELF TLS on arm/arm64 allocates the space after the thread pointer to PT_TLS segments instead, which can be arbitrarily large. For maximum efficiency, Bionic might prefer to allocate the pthread keys at a negative offset from the TP, known at compile-time, but then Go can't find them.

  • On Android/{386,amd64}, Go uses %gs:0xf8/%fs:1d0 for its g register and assumes it can allocate the location by repeatedly calling pthread_key_create until it finds a key such that pthread_setspecific modifies the desired location. This assumption breaks if Bionic has an even number of TLS slots, because then Go's fixed location is allocated to a pthread_key_data_t::seq field rather than a pthread_key_data_t::data field.

I suspect Go is assuming that its pthread-allocated word is initialized to zero, but that's not generally the case when pthread keys are recycled. Bionic's pthread_getspecific lazily initializes a key to zero. (It'd be great if I could get confirmation on this point.)

Android may be able to keep Go working, but it would be better if Go had a reliable interface for accessing its TLS memory.

There's a simple way to fix arm/arm64 by adding an API to Bionic that allocates a single word of static TLS memory at a fixed offset. For example, I could add something like this to Bionic:

int pthread_alloc_static_tls_word_np(intptr_t* tpoff, void** tlsbase)
(https://android-review.googlesource.com/c/platform/bionic/+/862083)

gcc_android_{arm,arm64}.c could then look for the new API with dlopen and fall back to the current allocation behavior if the API is missing.

386/amd64 is harder to fix. Two possibilities:

  • It could use the API above, then use something resembling a TLS IE access in each function prologue. (Something like this: https://godbolt.org/z/5Cfw7S.)
  • Switch amd64 to an ARM-like design: pick a GPR to hold g, then save/restore the g register to memory allocated with the above API. Drop support for Android/386.

For the ARM design, I'm wondering to what extent Go could use pthread_getspecific or dynamic TLS (whether __tls_get_addr or TLSDESC) to load the g register. Possible downsides:

  • Might be noticeably slower.
  • Neither system is guaranteed to be async-signal safe. Perhaps Bionic could guarantee AS-safety.
  • Dynamic TLS could use a lot of stack if it needs to allocate heap memory. Perhaps Go can guarantee a large stack in situations where the thread's TLS var isn't allocated yet.

I could upload a Go patch demonstrating arm/arm64 use of the API I proposed.

@rprichard

This comment has been minimized.

Copy link

rprichard commented Jan 11, 2019

@eliasnaur

This comment has been minimized.

Copy link
Contributor

eliasnaur commented Jan 11, 2019

Thank you for working on this. I'll page someone more knowledgable on the runtime than myself: @randall77 @ianlancetaylor.

@ianlancetaylor

This comment has been minimized.

Copy link
Contributor

ianlancetaylor commented Jan 11, 2019

What are the chances that x86 Bionic could implement TLS relocations as they are implemented on x86 GNU/Linux systems? Because then I think everything would just work.

Reserving a register on amd64 would break all existing assembly code, which means breaking a lot of crypto packages. I think that would be difficult. Though the ABI support in 1.12 might provide a stepping stone toward supporting that.

I don't fully understand your TLS IE example for amd64. It seems to assume that the offset from %fs can be determined at run time, but as far as I know it must be known at compile time. I think that handling a variable offset from %fs would require a longer instruction sequence.

In glibc, certain offsets from %fs are reserved for certain uses. See tcbhead_t in https://sourceware.org/git/?p=glibc.git;a=blob;f=sysdeps/x86_64/nptl/tls.h;h=e25430a92826855f62617c5b51c65cb8d194f898;hb=HEAD . For example, that is where gccgo stores its split-stack information. Is there any chance that Bionic could give Go a similar pointer-sized word?

@eliasnaur eliasnaur changed the title Android Go's g TLS allocation makes assumptions about Bionic's TLS layout runtime: remove assumptions on Android Bionic's TLS layout Jan 11, 2019

@rprichard

This comment has been minimized.

Copy link

rprichard commented Jan 12, 2019

What are the chances that x86 Bionic could implement TLS relocations as they are implemented on x86 GNU/Linux systems? Because then I think everything would just work.

Implementing the relocations is straightforward. The difficulty is that Go always uses static TLS ("IE" accesses), even though an Android app's solib is loaded with dlopen. Using IE in an solib doesn't generally work, because libc needs to allocate a thread's static TLS memory before it starts running, and it can't move/extend that memory when dlopen is called later. It can work if the libc reserves enough surplus static TLS memory. glibc reserves space (TLS_STATIC_SURPLUS and DTV_SURPLUS), and at least some BSDs also do (RTLD_STATIC_TLS_EXTRA). The reserved amounts I've seen vary between about 64 and ~1600 bytes.

The surplus amount needs to be enough for all dlopen'ed solibs. Once it's exhausted, dlopen fails.

There are other problems with surplus static TLS involving (a) initialization and (b) the DF_STATIC_TLS flag. Maybe we'd limit support to zero-initialized TLS segments and solibs marked with DF_1_NODELETE.

I don't fully understand your TLS IE example for amd64. It seems to assume that the offset from %fs can be determined at run time, but as far as I know it must be known at compile time. I think that handling a variable offset from %fs would require a longer instruction sequence.

On GNU/Linux, Go uses a TLS LE access in an executable, and each access is a single instruction, e.g.:

  402e10:       64 48 8b 0c 25 f8 ff    mov    %fs:0xfffffffffffffff8,%rcx

The offset is known at build-time and encoded into the instruction. Typically the static linker would encode the offset, but I think Go's compiler knows that the runtime.tlsg symbol is at offset -8 and can encode it earlier.

In a GNU/Linux shared object (-buildmode=c-shared), Go instead uses TLS IE, where the offset is known at load-time, e.g.:

   80250:       48 8b 0d 79 9d 39 00    mov    0x399d79(%rip),%rcx        # 419fd0 <.got>
   80257:       64 48 8b 09             mov    %fs:(%rcx),%rcx
   ...
   0000000000419fd0  0000000000000012 R_X86_64_TPOFF64                          0

The dynamic loader computes the TPOFF value for the solib's TLS segment and writes it into the GOT.

My proposal is to use something resembling TLS IE, but instead of storing the TPOFF in the GOT at load-time, it's computed at run-time and stored in an ordinary STB_LOCAL symbol.

In glibc, certain offsets from %fs are reserved for certain uses. ... Is there any chance that Bionic could give Go a similar pointer-sized word?

We could allocate a TLS slot that Go could use, but I think we're hesitant to give one to a particular language runtime if we can avoid it.

If we did allocate a slot today, Go would only be able to use it on new versions of Android. That's problematic for Android/x86 because it uses TLS LE in the app solib, e.g.:

   9ca20:       64 48 8b 0c 25 d0 01    mov    %fs:0x1d0,%rcx

The situation for Android/ARM is better. For ARM, Go is already computing a TP-relative offset at run-time.

@ianlancetaylor

This comment has been minimized.

Copy link
Contributor

ianlancetaylor commented Jan 16, 2019

My proposal is to use something resembling TLS IE, but instead of storing the TPOFF in the GOT at load-time, it's computed at run-time and stored in an ordinary STB_LOCAL symbol.

I think that I have now paged enough memory to say that this seems reasonable to me. Thanks.

@rprichard

This comment has been minimized.

Copy link

rprichard commented Jan 18, 2019

Ok, I think Bionic can add an API like the one I've proposed[1], and then I can upload a patch for Go to use the API on arm/arm64. Fixing x86 is more involved, though -- maybe someone on the Go team could look at that?

I'll work on adding the Bionic API.

[1] https://android-review.googlesource.com/c/platform/bionic/+/862083

@ianlancetaylor

This comment has been minimized.

Copy link
Contributor

ianlancetaylor commented Jan 19, 2019

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment