New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

runtime: gcWriteBarrier requires g.m.p != nil even if g.m.dying > 0 #26575

Open
ianlancetaylor opened this Issue Jul 24, 2018 · 1 comment

Comments

Projects
None yet
2 participants
@ianlancetaylor
Contributor

ianlancetaylor commented Jul 24, 2018

A signal can occur on a thread with no p, that is, where g != nil && g.m != nil && g.m.p == nil. Thus, the signal hander must not have any write barriers. A signal can cause a panic, which the signal handler implements by calling startpanic_m. The code in startpanic_m is permitted to have write barriers, because write barriers are permitted, even if g.m.p == nil, if g.m.dying > 0. This check is made in wbBufFlush.

However, write barriers are currently implemented by calling gcWriteBarrier. And that function, written in assembler, assumes that g.m.p != nil. So when startpanic_m has a write barrier, which it does when setting _g_.writebuf = nil, we can get a segmentation violation while handling a signal.

I believe this can happen starting in the 1.10 release.

cc @aclements

@aclements

This comment has been minimized.

Member

aclements commented Jul 24, 2018

Yuck. I can think of a few solutions:

  1. Check if g.m.dying > 0 in gcWriteBarrier. This is the obvious one, but adds more instructions and another memory load to the write barrier hot path.

  2. Check if g.m.p is nil in gcWriteBarrier and jump to a slow path that checks g.m.dying. This adds two instructions to the hot path, but no more loads. The slow path could just be the flush path and we could do the detailed check in the Go code to keep the assembly simple.

  3. The _g_.writebuf = nil write is the only write barrier in startpanic_m. We could change how that works and then disallow write barriers in startpanic_m.

    1. We could just do that write without a write barrier, since the GC can't possibly transition any more once we've reached this point.
    2. We could ignore g.writebuf if g.m.dying > 0.
    3. If we cleaned up tracebacking (which I started to do but got stuck on write barriers, coincidentally), we might be able to get rid of g.writebuf. The sole user is runtime.Stack. That would also shrink the g.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment