New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
runtime/race: update race windows syso to new LLVM version #53539
Comments
Change https://go.dev/cl/413817 mentions this issue: |
OK, looks like CL 413817 helps with that, but we're on to a new problem:
|
As of LLVM rev 41cb504b7c4b18ac15830107431a0c1eec73a6b2, the race detector runtime now refers to things in the windows synchronization library, hence when doing windows internal linking, at that library to the list of host archives that we visit. The tsan code that makes the reference is here: https://github.com/llvm/llvm-project/blob/41cb504b7c4b18ac15830107431a0c1eec73a6b2/compiler-rt/lib/sanitizer_common/sanitizer_win.cpp#L48 https://github.com/llvm/llvm-project/blob/41cb504b7c4b18ac15830107431a0c1eec73a6b2/compiler-rt/lib/sanitizer_common/sanitizer_win.cpp#L834 Note that libsynchronization.a is not guaranteed to be available on all windows systems, so in the external linking case, check for its existence before adding "-lsynchronization" to the external linker args. Updates #53539. Change-Id: I433c95c869915693d59e9c1082d5b8a11da1fc8c Reviewed-on: https://go-review.googlesource.com/c/go/+/413817 Reviewed-by: Cherry Mui <cherryyz@google.com> Run-TryBot: Than McIntosh <thanm@google.com>
Change https://go.dev/cl/414514 mentions this issue: |
Update: I sent D128641 to LLVM upstream with a fix, now submitted as 13fb97d68821. Still having trouble getting a clean race.bat run after running racebuild. My latest run is:
in combination with CL 414475, which picks GCC 11.2 instead of 5.3. The resulting race.bat run is still not passing however. Representative error:
I get this same symptom on both |
This error:
comes from here: Previously we did not do it on Windows since it was optional, but now it's required to zero the whole range. But since we map shadow on Windows lazily (otherwise we would get the same error earlier): We don't need to remap the whole range, we only need to remap what was mapped by MapShadow function. We probably need to add mapped_shadow_begin/end for Go, update them in MapShadow and re-map on that part in DoReset. |
Thanks, I tried coding up this suggestion. Tentative/experimental CL at https://reviews.llvm.org/D128909. This does seem to eliminate the "failed to allocate 0x020000000000" problem, but now I seem to be on to a different failure mode, e.g.
This actually looks pretty similar to some previous issues, e.g. 48231 and 46099. I see from this comment that the problem may be due to a new compiler that defaults to a PIE binary. I can pursue this angle (maybe with Go linker CL). @dvyukov , can you confirm that we need to avoid PIE in order for the race detector to function properly on windows/amd64? Thanks. |
87 is ERROR_INVALID_PARAMETER 87 (0x57) The parameter is incorrect. I see the addresses are not 64K aligned, that's probably what causes the error. |
Ah, ok, I guess this would make sense. I'll see if I can collect a stack trace. |
Looks like the offending call is from MapShadow itself (e.g. https://github.com/llvm/llvm-project/blob/13fb97d68821e1948d176057ebee94f50cb05b62/compiler-rt/lib/tsan/rtl/tsan_rtl.cpp#L568). 64K: I tried upping the alignment to 64k, but that doesn't seem to help (I still get the same error). |
OK, looks like the most recent batch of problems all go away (at least for the hand tests I tried) when PIE is turned off. I will work on putting together a linker CL that disables PIE when -race is in effect. Stay tuned. |
Right, we only align it to 4K (page size), but not to 64K.
Humm... this is strange. You aligned both start address and size, right?
The only way I see PIE can affect things is that some ranges become only 4K aligned, whereas w/o PIE they are 64K aligned. |
Change https://go.dev/cl/415674 mentions this issue: |
Change https://go.dev/cl/415675 mentions this issue: |
Change https://go.dev/cl/414475 mentions this issue: |
Change https://go.dev/cl/415676 mentions this issue: |
Change https://go.dev/cl/416174 mentions this issue: |
This patch changes the default build mode from "pie" to "exe" when building programs on windows with "-race" in effect. The Go command already issues an error if users explicitly ask for -buildmode=pie in combination with -race on windows, but wasn't revising the default "pie" build mode if a specific buildmode was not requested. Updates #53539. Updates #35006. Change-Id: I2f81a41a1d15a0b4f5ae943146175c5a1202cbe0 Reviewed-on: https://go-review.googlesource.com/c/go/+/416174 Reviewed-by: Alex Brainman <alex.brainman@gmail.com> Reviewed-by: Cherry Mui <cherryyz@google.com> TryBot-Result: Gopher Robot <gobot@golang.org> Run-TryBot: Than McIntosh <thanm@google.com>
Turn off PIE explicitly for windows/amd64 when -race is in effect, since at the moment the race detector runtime doesn't seem to handle PIE binaries correctly. Note that newer C compilers on windows produce PIE binaries by default, so the Go linker needs to explicitly turn off PIE when invoking the external linker in this case. Updates #53539. Change-Id: Ib990621f22cf61a5fa383584bab81d3dfd7552e8 Reviewed-on: https://go-review.googlesource.com/c/go/+/415676 Reviewed-by: Cherry Mui <cherryyz@google.com> TryBot-Result: Gopher Robot <gobot@golang.org> Run-TryBot: Than McIntosh <thanm@google.com> Reviewed-by: Alex Brainman <alex.brainman@gmail.com>
For tracking purposes (this is more of a mental note / reminder), the updated windows race syso is dependent on functions from libsynchronization.a (which are not available on our existing Go windows builders), meaning that we can't check that in until we've transitioned over to the new ones. Otherwise we'll get unsats on WakeByAddressSingle/WakeByAddressAll/WaitOnAddress. |
Is this dashboard failure related? 2022-07-06T18:36:43-d69bac6-77cc1c0/windows-amd64-race:
|
Interesting, I haven't seem one of those before. Error code 1455 is 'The paging file is too small for this operation to complete'. I would guess that it's unrelated, the errors I am working on at the moment for V3 are 487 ("Attempt to access invalid address"). |
Capture the computed shadow begin/end values at the point where the shadow is first created and reuse those values on reset. Introduce new windows-specific function "ZeroMmapFixedRegion" for zeroing out an address space region previously returned by one of the MmapFixed* routines; call this function (on windows) from DoResetImpl tsan_rtl.cpp instead of MmapFixedSuperNoReserve. See golang/go#53539 (comment) for context; intended to help with updating the syso for Go's windows/amd64 race detector. Differential Revision: https://reviews.llvm.org/D128909
As of LLVM rev 41cb504b7c4b18ac15830107431a0c1eec73a6b2, the race detector runtime now refers to things in the windows synchronization library, hence when doing windows internal linking, at that library to the list of host archives that we visit. The tsan code that makes the reference is here: https://github.com/llvm/llvm-project/blob/41cb504b7c4b18ac15830107431a0c1eec73a6b2/compiler-rt/lib/sanitizer_common/sanitizer_win.cpp#L48 https://github.com/llvm/llvm-project/blob/41cb504b7c4b18ac15830107431a0c1eec73a6b2/compiler-rt/lib/sanitizer_common/sanitizer_win.cpp#L834 Note that libsynchronization.a is not guaranteed to be available on all windows systems, so in the external linking case, check for its existence before adding "-lsynchronization" to the external linker args. Updates golang#53539. Change-Id: I433c95c869915693d59e9c1082d5b8a11da1fc8c Reviewed-on: https://go-review.googlesource.com/c/go/+/413817 Reviewed-by: Cherry Mui <cherryyz@google.com> Run-TryBot: Than McIntosh <thanm@google.com>
This patch changes the default build mode from "pie" to "exe" when building programs on windows with "-race" in effect. The Go command already issues an error if users explicitly ask for -buildmode=pie in combination with -race on windows, but wasn't revising the default "pie" build mode if a specific buildmode was not requested. Updates golang#53539. Updates golang#35006. Change-Id: I2f81a41a1d15a0b4f5ae943146175c5a1202cbe0 Reviewed-on: https://go-review.googlesource.com/c/go/+/416174 Reviewed-by: Alex Brainman <alex.brainman@gmail.com> Reviewed-by: Cherry Mui <cherryyz@google.com> TryBot-Result: Gopher Robot <gobot@golang.org> Run-TryBot: Than McIntosh <thanm@google.com>
Turn off PIE explicitly for windows/amd64 when -race is in effect, since at the moment the race detector runtime doesn't seem to handle PIE binaries correctly. Note that newer C compilers on windows produce PIE binaries by default, so the Go linker needs to explicitly turn off PIE when invoking the external linker in this case. Updates golang#53539. Change-Id: Ib990621f22cf61a5fa383584bab81d3dfd7552e8 Reviewed-on: https://go-review.googlesource.com/c/go/+/415676 Reviewed-by: Cherry Mui <cherryyz@google.com> TryBot-Result: Gopher Robot <gobot@golang.org> Run-TryBot: Than McIntosh <thanm@google.com> Reviewed-by: Alex Brainman <alex.brainman@gmail.com>
Capture the computed shadow begin/end values at the point where the shadow is first created and reuse those values on reset. Introduce new windows-specific function "ZeroMmapFixedRegion" for zeroing out an address space region previously returned by one of the MmapFixed* routines; call this function (on windows) from DoResetImpl tsan_rtl.cpp instead of MmapFixedSuperNoReserve. See golang/go#53539 (comment) for context; intended to help with updating the syso for Go's windows/amd64 race detector. Differential Revision: https://reviews.llvm.org/D128909
The work to finish out this task is now basically complete (see http://go.dev/cl/c/go/+/420197), but the CL didn't quite make it into the Go 1.20 release (it was blocked by the work to update the windows builders, which ran very late). I'm going to move this bug from 1.20 to 1.21 as a result; we should be able to submit the CL early in 1.21. |
Revise the recipe for building a windows race syso slightly to use the existing compilers on the windows machine (in C:\godep\gcc64\bin) as opposed to downloading GCC 5.X via "choco install". This requires updating PATH following the refresh env. Updates golang/go#35006. Updates golang/go#53539. Change-Id: I14c8491159f421f688f8d4b7c84250768d69ea42 Reviewed-on: https://go-review.googlesource.com/c/build/+/414475 Run-TryBot: Than McIntosh <thanm@google.com> TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Cherry Mui <cherryyz@google.com>
When testing a change to the Go repo that is needed to help build/test a new version of the race detector, it helps to be able to do a racebuild run from a specific Go revision plus a cherry-picked CL on top of that revision. This patch adds a new flag ("-cherrypick") to support that, also a "-checkout" that can be used to check out a stack of pending changes on Gerrit. Updates golang/go#35006. Updates golang/go#53539. Change-Id: Id6c508f21e11a445c89df8457dc6a65020eee0fb Reviewed-on: https://go-review.googlesource.com/c/build/+/415674 Reviewed-by: Cherry Mui <cherryyz@google.com> Run-TryBot: Than McIntosh <thanm@google.com> TryBot-Result: Gopher Robot <gobot@golang.org>
When debugging new versions of the race detector runtime, it can be useful to copy the newly built syso back to the local machine's Go repo (from the gomote) even if race.bat/race.bash fails, so as to analyze the syso or run other tests with it. Add a command line option "-copyonfail" that attempts to perform the copy even if the script run fails. Updates golang/go#35006. Updates golang/go#53539. Change-Id: I688b8673b444d1b6d948f10ca2fa4ab109eade44 Reviewed-on: https://go-review.googlesource.com/c/build/+/415675 Reviewed-by: Cherry Mui <cherryyz@google.com> Run-TryBot: Than McIntosh <thanm@google.com> TryBot-Result: Gopher Robot <gobot@golang.org>
Closing out this issue; we've now upgraded all of the *.syso's (including windows) |
Capture the computed shadow begin/end values at the point where the shadow is first created and reuse those values on reset. Introduce new windows-specific function "ZeroMmapFixedRegion" for zeroing out an address space region previously returned by one of the MmapFixed* routines; call this function (on windows) from DoResetImpl tsan_rtl.cpp instead of MmapFixedSuperNoReserve. See golang/go#53539 (comment) for context; intended to help with updating the syso for Go's windows/amd64 race detector. Differential Revision: https://reviews.llvm.org/D128909
The syso that supports the race detector for windows is currently still at an old version, since we had been waiting for new windows builders with updated compilers:
https://go.googlesource.com/go/+/6bad7e82430bb1eb927a2901f44f9664637db27d/src/runtime/race/README#12
I took a stab at doing the update today but ran into some additional problems; filing this issue to note the problems and track the work needed to do the update.
The LLVM support library (llvm-project/compiler-rt/lib/tsan and related) has undergone a fair number of changes since we last did an update, and it looks as though there are now some calls in the windows version to new synchronization routines. Specifically if you run racebuild with the new builders and then race.bat, you get unsatisfied symbols.
With a slightly modified version of racebuild.go that doesn't use GCC 5.3 for the build.
Looks like this is coming from this code:
https://github.com/llvm/llvm-project/blob/41cb504b7c4b18ac15830107431a0c1eec73a6b2/compiler-rt/lib/sanitizer_common/sanitizer_win.cpp#L48
This implies that if we want to support -race + internal linking on windows we need to teach the Go linker to import this lib.
The text was updated successfully, but these errors were encountered: