Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cmd/internal/obj/ppc64: assumes addresses of static data fit in 32 bits #22459

Closed
mwhudson opened this issue Oct 26, 2017 · 10 comments
Closed

cmd/internal/obj/ppc64: assumes addresses of static data fit in 32 bits #22459

mwhudson opened this issue Oct 26, 2017 · 10 comments
Labels
FrozenDueToAge NeedsFix The path to resolution is known, but the work has not been done.
Milestone

Comments

@mwhudson
Copy link
Contributor

./all.bash has been failing for a while on ubuntu ppc64el systems, with failures like this:

ok  	_/<<PKGBUILDDIR>>/misc/cgo/nocgo	0.002s
/tmp/go-build207717480/_/<<PKGBUILDDIR>>/misc/cgo/test/_test/test.test: error while loading shared libraries: R_PPC64_ADDR16_HA re10da2639c for symbol `' out of range
exit status 127
FAIL	_/<<PKGBUILDDIR>>/misc/cgo/test	0.002s

The test that is failing builds without any buildmode (so no -shared flags) but links with -extldflags=-pie. This is a recipe for text relocations of course which are generally undesirable but shouldn't break during program load. Also these tests used to work, so what's going on?

It turns out that recent kernels change the default base of dynamic executables on ppc64le from 0x20000000 to 0x100000000UL, i.e. above 32-bits. Because the compiler on ppc64 only uses two instructions to access static data, this causes the complaint from the dynamic linker we see above. This kernel change has been backported to various stable kernel releases so these executables are now broken in e.g. Ubuntu 16.04, not just the latest devel release.

The binary doesn't have to be PIE to run into this of course, but I guess even k8s isn't generating a 4 gig binary yet .

I'm not sure what the best fix would be. We could change the compiler and linker to use 64-bit capable instruction patterns to access static data, we could change the compiler to always generate PIC, we could just declare this to be a broken situation and skip the tests that do this on linux/ppc64.

/cc @laboger @ianlancetaylor

@ianlancetaylor
Copy link
Contributor

Any sense of how expensive it would be to always generate PIC?

@mwhudson
Copy link
Contributor Author

I don't think terribly expensive. It seems to be what gcc does (at least, I tried and failed to get it to generate non-r2-relative loads while poking at this).

@mwhudson
Copy link
Contributor Author

Here's what I think has to be done to always generate PIC:

  1. find most checks of Flag_shared in cmd/internal/obj/ppc64 and change them to assume Flag_shared is always true (the ones for TLS should stay)
  2. in the linker in the internal linking case of archreloc, account for the prologue in the implementation of R_CALLPOWER and implement R_ADDRPOWER_TOCREL and R_ADDRPOWER_PCREL
  3. actually create a .TOC. symbol in the internally linked case
  4. in the internally linked static case, the prologue of the entry point _rt0_ppc64le_linux needs to be mangled so it loads the absolute address of .TOC. into R2.

There is some very bogus stuff in the way TOCs are treated in the internally-linked-but-using-cgo case, which I already mentioned as blocking always-PIC a year or so ago #15409 (comment) but as that case is currently disabled because of #21961 maybe we can ignore that.

@laboger
Copy link
Contributor

laboger commented Oct 27, 2017

This looks like the same problem identified in #22126 and #21954 and I thought those were all fixed by CL 66870 to use -buildmode=pie instead of passing pie directly to the external linker. What distro are you using? Is this with latest from upstream?

IMO passing -pie to the external linker with non-PIC code is incorrect because the man pages for ld state that -pie must be used with code that has been compiled PIC. So the tests should be built with correct options (and I thought they all were now).

However, I'm not necessarily opposed to the idea of generating PIC by default for ppc64le, although I wouldn't want to enable that in 1.10. We would need some time to test that out. I think it could impact performance because a lot more calls would go through the plt sequence than before. Addressing for static and constants would include another instruction, but that needs improvement anyway as described in #17110. That might be easier to fix if we always had a real TOC. We still have #16662 where x/sys/unix doesn't even build with the -shared option and that makes me wonder if there is any other asm code out there that would have the same issue.

@laboger
Copy link
Contributor

laboger commented Oct 27, 2017

Ugh I just pulled latest and built it on a Debian 9 machine and hit this error:
--- FAIL: TestCgoPprofPIE (4.48s)
crash_cgo_test.go:292: exit status 127
FAIL
FAIL runtime 34.225s
Which is the same problem, after adding more detailed output I see this:
out: /tmp/go-build151734429/testprogcgo_-ldflags=-extldflags=-pie.exe: error while loading shared libraries: R_PPC64_ADDR16_HA re10ddb9b1c for symbol `' out of range

So this test should also use -buildmode=pie instead of -ldflags=-extdlfags=-pie.

@gopherbot
Copy link

Change https://golang.org/cl/73970 mentions this issue: runtime: use -buildmode=pie in testCgoPprofPIE instead of -extldflags=-pie

gopherbot pushed a commit that referenced this issue Oct 30, 2017
…=-pie

Errors occur in runtime test testCgoPprofPIE when the test
is built by passing -pie to the external linker with code
that was not built as PIC. This occurs on ppc64le because
non-PIC is the default, and fails only on newer distros
where the address range used for programs is high enough
to cause relocation overflow. This test should be built
with -buildmode=pie since that correctly generates PIC
with -pie.

Related issues are #21954 and #22126.

Updates #22459

Change-Id: Ib641440bc9f94ad2b97efcda14a4b482647be8f7
Reviewed-on: https://go-review.googlesource.com/73970
Run-TryBot: Lynn Boger <laboger@linux.vnet.ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
@mwhudson
Copy link
Contributor Author

mwhudson commented Oct 31, 2017

I've pushed a version of go 1.9 to Debian and Ubuntu that has these tests fixed. Does it make sense to fix them in the 1.9 branch as well?

@laboger
Copy link
Contributor

laboger commented Nov 1, 2017

I think it makes sense to fix them in the go 1.9 branch since the error can happen whenever it is built on any newer distro.

@ianlancetaylor
Copy link
Contributor

Is this fixed on tip?

@ianlancetaylor ianlancetaylor added the NeedsFix The path to resolution is known, but the work has not been done. label Mar 29, 2018
@ianlancetaylor ianlancetaylor added this to the Unplanned milestone Mar 29, 2018
@mwhudson
Copy link
Contributor Author

mwhudson commented Apr 3, 2018

Is this fixed on tip?

Depends exactly what you mean by fixed. If you pass -linkmode=external -extldflags=-pie I assume you still get a broken binary but I'd be comfortable blaming the user for that (and the test suite no longer does it).

I'll close the bug, feel free to reopen if you disagree...

@mwhudson mwhudson closed this as completed Apr 3, 2018
@golang golang locked and limited conversation to collaborators Apr 3, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
FrozenDueToAge NeedsFix The path to resolution is known, but the work has not been done.
Projects
None yet
Development

No branches or pull requests

4 participants