Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

runtime: debug call injection tests contain many unsafe write barriers #49169

Open
mknyszek opened this issue Oct 26, 2021 · 1 comment
Open

runtime: debug call injection tests contain many unsafe write barriers #49169

mknyszek opened this issue Oct 26, 2021 · 1 comment

Comments

@mknyszek
Copy link
Contributor

@mknyszek mknyszek commented Oct 26, 2021

It's possible for the debug call injection tests to segfault, because (*debugCallHandler).inject and (*debugCallHandler).handle have write barriers but they're called in a signal handler. Normally write barriers would be prevented here via their caller, sighandler, which has go:nowritebarrierrec but unfortunately because they're called indirectly via the testSigtrap global variable, the analysis doesn't plumb through.

The failure is very rare; a GC has to be actively running during the injection. But, it is possible, and has been reproduced locally by myself and is observable on trybots, if you run the runtime tests enough times.

This is only a test problem, also. Real debug call injection implementations will usually be coordinated by a separate process; the tricky bit here is we're doing all the machinery in the signal handler of the same process.

I tried to do the simple thing and mark the two methods above as go:nowritebarrierrec, but there are enough write barriers in each function that it's tough to capture them all without the code being mostly fragile comments about why it's safe to elide all these write barriers. I think this test code just needs a rethink, so that the primary driver is not on the thread getting all the signals.

The trouble with that is dealing with the result of a panicking injected function. The panic value it returns may be the only reference to that value, so the signal handler needs to put it somewhere the GC can see. That's at odds with the fact that any such store would require a write barrier. There may be something clever here, but I haven't thought of it yet.

CC @prattmic @aclements

@mknyszek
Copy link
Contributor Author

@mknyszek mknyszek commented Nov 8, 2021

#49370 works around this for now, as it got more likely.

We should reconsider how the test implementation works.

Loading

@mknyszek mknyszek changed the title runtime: debug call injection tests sometimes segfault runtime: debug call injection tests contain many unsafe write barriers Nov 8, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked pull requests

Successfully merging a pull request may close this issue.

None yet
1 participant