Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

misc/cgo/testsanitizers: enabling misc/cgo/testsanitizers testcases for ppc64le result in TLS relocation link errors and signal handling errors for cgo #45040

Open
laboger opened this issue Mar 15, 2021 · 11 comments
Labels
Milestone

Comments

@laboger
Copy link
Contributor

@laboger laboger commented Mar 15, 2021

What version of Go are you using (go version)?

$ go version
Fails with latest tip.

Does this issue reproduce with the latest release?

Yes, but the errors are different depending on the system. On a system where the verification of tsan fails in a certain way the test is skipped so the error doesn't occur and the test seems to pass.

This started with CL 297774 since these tests were not run on ppc64le before that.

What operating system and processor architecture are you using (go env)?

ppc64le
Fails consistently on Linux SMP Debian 4.19.160-2

go env Output
$ go env

What did you do?

This first appeared on the build dashboard: https://build.golang.org/log/a07cf12a0667dc38c330f06dc5c457841154ffc4 and then also appeared on a slowbot run: https://storage.googleapis.com/go-build-log/9fc3f56e/linux-ppc64le-power9osu_74ad5e7f.log.
I don't know why it would work sometimes but not others like has happened on the build dashboard.

What did you expect to see?

PASS

What did you see instead?

Various errors, including those shown in the logs above. It does pass sometimes as mentioned above, if the test decides that it shouldn't test tsan then the test seems to pass.

@ianlancetaylor
Copy link
Contributor

@ianlancetaylor ianlancetaylor commented Mar 15, 2021

Sounds like the thread sanitizer doesn't really work on ppc64le. Let's either fix upcheckCSanitizer to detect this case or just skip the tests on that platform, by adding a t.Skip in misc/cgo/testsanitizers/tsan_test.go.

@laboger
Copy link
Contributor Author

@laboger laboger commented Mar 15, 2021

@pmur thinks he has a linker fix for the relocation error. I also got all but one tsan tests it to work by building them with a shared std and -linkshared.

With the linker error fixed, tsan8 still fails because it doesn't forward the Go signal.

But I agree that checkCSanitizer should be fixed.

@laboger
Copy link
Contributor Author

@laboger laboger commented Mar 16, 2021

How does this compare to using the -race option other than one is using LLVM and the other gcc?

@cherrymui cherrymui added the NeedsFix label Mar 16, 2021
@cherrymui cherrymui added this to the Go1.17 milestone Mar 16, 2021
@laboger
Copy link
Contributor Author

@laboger laboger commented Mar 16, 2021

Sounds like the thread sanitizer doesn't really work on ppc64le. Let's either fix upcheckCSanitizer to detect this case or just skip the tests on that platform, by adding a t.Skip in misc/cgo/testsanitizers/tsan_test.go.

Even if we just skip this test, I think this is showing a problem that could happen for other cases where Go and C code is linked together in this way.

@gopherbot
Copy link

@gopherbot gopherbot commented Mar 16, 2021

Change https://golang.org/cl/302209 mentions this issue: cmd/link: support 32b TLS_LE offsets on PPC64

@ianlancetaylor
Copy link
Contributor

@ianlancetaylor ianlancetaylor commented Mar 16, 2021

How does this compare to using the -race option other than one is using LLVM and the other gcc?

It's not related to LLVM vs. GCC; both LLVM and GCC support -fsanitize=thread. The tests in misc/cgo/testsanitizers/test_tsan.go are testing that Go code works when the C code is compiled with -fsanitize=thread. That is, it is testing the C thread sanitizer rather than the Go race detector. Now, it happens that the C thread sanitizer and the Go race detector were written by the same person and use the same runtime code. But they are different in that in one case the C compiler is using that runtime code and in the other case the Go compiler is using (a different copy) of that runtime code.

These tests are relevant because people can compile their C code with -fsanitize=thread, and it should be possible for them to call between that C code and Go code and have things more or less work. The support for that on the Go side is in cmd/cgo (search for "tsan" in cmd/cgo/out.go) and runtime/cgo (search for "TSAN" in runtime/cgo/libcgo.h). The tests in misc/cgo/testsanitizer are just there to make sure that this code continues to work.

@laboger
Copy link
Contributor Author

@laboger laboger commented Mar 18, 2021

We were looking at those files yesterday and see that some files are only built for amd64 and arm64, so this has never worked on ppc64le. Looks like we need a callCgoSigaction which on arm64 that calls _cgo_sigaction. I tried to add that but then it seemed like there were calls back and forth between the Go code and the tsan functions that need to handle the different register conventions. That's as far as I got in my debugging.

@laboger
Copy link
Contributor Author

@laboger laboger commented Mar 22, 2021

We have the issue now that is mentioned in issue #31827 for arm64 and possibly others. The nonvolatile registers are currently not being saved for Power in sigtramp. I tried to add that code it didn't fix the problem.

The failure is a SEGV in the tsan library because it tries to store at an address off R31 but R31 points to code. The address in R31 has the same value as LR, which is interesting since Go code does a mflr to R31. If I run with cpu=1 it works.

@laboger laboger changed the title misc/cgo/testsanitizers: link errors due to recent enablement of tsan testcases on ppc64le misc/cgo/testsanitizers: link errors relocation truncated to fit: R_PPC64_TPREL16 against `runtime.tls_g' when linking against C shared libraries Mar 23, 2021
@laboger
Copy link
Contributor Author

@laboger laboger commented Mar 23, 2021

This failure started happening when the testsanitizer tests in misc/cgo were enabled for any system where gcc allows -fsanitize=thread. On some ppc64le systems, the initial test to determine if -fsanitize=thread is allowed on the system might pass, but when building the test the link step fails. The link failure is not limited to tsan libraries -- this TLS error message could occur at link time when linking against a C shared library with large enough amount of TLS.

@laboger laboger changed the title misc/cgo/testsanitizers: link errors relocation truncated to fit: R_PPC64_TPREL16 against `runtime.tls_g' when linking against C shared libraries misc/cgo/testsanitizers: enabling misc/cgo/testsanitizers testcases for ppc64le result in TLS relocation link errors and signal handling errors for cgo Mar 23, 2021
@gopherbot
Copy link

@gopherbot gopherbot commented Mar 31, 2021

Change https://golang.org/cl/306369 mentions this issue: runtime/cgo,cmd/internal/obj/ppc64: fix signal forwarding with cgo

@laboger
Copy link
Contributor Author

@laboger laboger commented Mar 31, 2021

@Helflym My change does not change the way signals are done on AIX. I will leave that for you to change later if you feel it is needed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked pull requests

Successfully merging a pull request may close this issue.

None yet
4 participants