Skip to content

runtime: notetsleep and all called by it should be marked as NOSPLIT #6747

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
alexbrainman opened this issue Nov 10, 2013 · 5 comments
Closed
Milestone

Comments

@alexbrainman
Copy link
Member

From an email:

...
runtime·notetsleepg in lock_sema.c calls
runtime·semacreate in syscall context, so runtime·semacreate must
not split the stack. On Windows, runtime·semacreate is not marked as
NOSPLIT. Perhaps it didn't have an issue because it was a trivial function
and it got inlined? On Solaris, the function was also not inlined, but
it was not trivial, it called another function that wasn't marked NOSPLIT.

Making runtime·semacreate and beneath NOSPLIT seems to have solved the
issue on Solaris.
...

Alex
@rsc
Copy link
Contributor

rsc commented Nov 11, 2013

Comment 1:

I would like to understand how severe this is before tagging the next Go 1.2 release
candidate.

Labels changed: added priority-later, go1.2, removed priority-triage.

@alexbrainman
Copy link
Member Author

Comment 2:

I haven't seen this error myself. I haven't heard anyone complain about it on windows. I
am not convinced this needs to be changed - I don't know runtime enough - I posted it
here for you to decide.
I think the change will be large and might introduce new bugs and instabilities. I think
we should wait until after Go 1.2.
Alex

@rsc
Copy link
Contributor

rsc commented Nov 13, 2013

Comment 3:

I don't believe this report is valid, but I cannot find the broader context. Was it a
private email, Alex? I'd like to find out more about the bug that was observed on
Solaris. 
At tip, lock_sema.c says:
// same as runtime·notetsleep, but called on user g (not g0)
// calls only nosplit functions between entersyscallblock/exitsyscall
bool
runtime·notetsleepg(Note *n, int64 ns)
{
    bool res;
    if(g == m->g0)
        runtime·throw("notetsleepg on g0");
    if(m->waitsema == 0)
        m->waitsema = runtime·semacreate();
    runtime·entersyscallblock();
    res = notetsleep(n, ns, 0, nil);
    runtime·exitsyscall();
    return res;
}
The comment explains the restrictions: between entersyscallblock and exitsyscall, we
must not do anything that might cause allocation or really perturb memory too badly,
because the code might be running concurrently with the garbage collector. (The
collector assumes "in syscall" counts as stopped.) The most important restriction is
that the calls must not visit any non-NOSPLIT function, because it might split the stack.
To ensure this constraint about only using NOSPLIT functions, entersyscallblock actually
sets the stack guard value to StackPreempt, which means that any split check will fail
and enter the splitting code, which will see what is going on and throw the error
"runtime: stack split during syscall". The only way a bug might be lurking is if there
is a call to a splitting function in some little-used code path.
notetsleep is itself marked NOSPLIT, and it calls runtime.casp, runtime.semasleep,
runtime.nanotime, and runtime.atomicloadp. All of these are themselves marked NOSPLIT.
Specifically, os_windows.c says:
#pragma textflag NOSPLIT
int32
runtime·semasleep(int64 ns)
#pragma textflag NOSPLIT
int64
runtime·nanotime(void)
Those functions call runtime.stdcall and runtime.timediv, both of which are NOSPLIT.
Stdcall calls asmcgocall, which moves over to the g0 stack and doesn't need paranoia
anymore. Timediv is a leaf.
So I think everything is marked NOSPLIT appropriately. The original mail claimed that
semasleep is not marked NOSPLIT, but it is.
Perhaps the author of the original mail was looking at the code on golang.org, which is
showing the Go 1.1 code, not the current code at tip. The code I am looking at looks
fine.
Marking "Invalid" because as far as I can tell. notetsleep and all called by it _are_
marked NOSPLIT.

Status changed to Invalid.

@alexbrainman
Copy link
Member Author

Comment 4:

> ... Was it a private email, Alex? I'd like to find out more about the bug that was
observed on Solaris. ...
Yes, I go this email from Aram (I hope he doesn't mind me publishing it here):
>>>
Hello,
I am porting go on Solaris[1]. The port is very similar to the Windows
port; Solaris, like Windows, does everything through library calls. The
mechanism is the same as for Windows. We #pragma dynimport functions
from libc.so and we have a asmsysvicall function, analogous to asmstdcall
that is called by asmcgocall.
G0 stack is big, like on Windows, and the stack guard is set
correspondingly.
The problem is that the preemption mechanism resets the stackguard to
do its thing, and then the runtime eventually panics with "runtime:
stack split during syscall".
I am trying to understand why Windows doesn't have this issue, but I
haven't found the code that disable preemption, or whatever mechanism
permits this not to happen.
For now I just hack it with
- if(oldstatus == Gsyscall && m->locks == 0)
+ // BUG(aram): at least check if it's on g0.
+ if(oldstatus == Gsyscall && m->locks == 0 && !Sunos)
      runtime·throw("runtime: stack split during syscall");
in stack.c, but that's terrible. It works great through, the stack is 2MB.
Perhaps you have some idea of what might be wrong or perhaps you can
explain me what I don't understand about the Windows runtime
Thanks,
[1] https://bitbucket.org/4ad/go-sunos
<<<
and also
>>>
Thank you for the reply, I think I solved the issue. I believe Windows
is susceptible too, but it got lucky not hitting it.
The problem is that runtime·notetsleepg in lock_sema.c calls
runtime·semacreate in syscall context, so runtime·semacreate must
not split the stack. On Windows, runtime·semacreate is not marked as
NOSPLIT. Perhaps it didn't have an issue because it was a trivial function
and it got inlined? On Solaris, the function was also not inlined, but
it was not trivial, it called another function that wasn't marked NOSPLIT.
Making runtime·semacreate and beneath NOSPLIT seems to have solved the
issue on Solaris.
Thanks,
<<<
> ... So I think everything is marked NOSPLIT appropriately.
Thank you for detailed explanation.
> ... The original mail claimed that semasleep is not marked NOSPLIT, ...
He was talking about "runtime·semacreate", not "semasleep".
> ... Marking "Invalid" because as far as I can tell. notetsleep and all called by it
_are_ marked NOSPLIT.
I will trust you. :-)
Alex

@4ad
Copy link
Member

4ad commented Nov 13, 2013

Comment 5:

I am the author of the email.
The problem on Solaris with runtime·semacreate, not runtime·semasleep;
runtime·semacreate indeed seems not  to be called by runtime·notetsleepg
in syscall context. This is interesting, somebody else called it in
syscall context, have to dig deeper how that happened.

@rsc rsc added this to the Go1.2 milestone Apr 14, 2015
@rsc rsc removed the go1.2 label Apr 14, 2015
@golang golang locked and limited conversation to collaborators Jun 25, 2016
This issue was closed.
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

4 participants