Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

runtime: support resuming a single goroutine under debuggers #25578

Open
aclements opened this Issue May 25, 2018 · 6 comments

Comments

Projects
None yet
6 participants
@aclements
Copy link
Member

aclements commented May 25, 2018

CL 109699 added support for debugger function call injection, but has an annoying limitation: it requires that the debugger resume the entire Go process after injecting the function call (and, to inject into a runnable but not running goroutine, it requires resuming the entire process even before injecting the call).

@heschik argued that this is a pretty bad experience. E.g., all the user wants to do is call String() to format something, and the entire process moves under them in the meantime. It's also different from what other debuggers do, which could surprise users.

This is tricky to solve. Simply resuming only the thread where the call was injected doesn't work because 1) it could easily lead to runtime-level deadlocks if any other thread is in the runtime, 2) the runtime could just switch to a different goroutine on that thread, and 3) if the GC kicks in it will try to cooperatively stop the other threads and deadlock.

I think solving this requires at least a little help from the runtime to pause all other user goroutines during the injected call. I'm not sure what exact form this should take, but I'm imagining the debugger could use call injection to first inject a runtime call to stop user goroutines, and then inject the real call.

However, even this gets tricky with non-cooperative safe points (e.g., the runtime would need to use the register maps to immediately preempt the other goroutines rather than waiting for them to reach a cooperative safe point) and with goroutines that are at unsafe points (particularly goroutines that are in the runtime). One possibility would be to have the debugger inject this "stop" call on every running thread. Using the call injection mechanism takes care of stopping at non-cooperative points, and would give the debugger the opportunity to step other goroutines past unsafe points and out of the runtime before injecting the stop. This puts some complexity into the debugger, but it should already have most of the core mechanisms necessary to do this (such as single stepping ability). Specifically, I'm picturing a protocol like:

  1. For each thread, attempt to inject a runtime.debugStop call. Let all threads resume.
  2. These calls will notify the debugger when the goroutine is successfully stopped, or the debug call injection will fail.
  3. For injection that failed because the thread is in the runtime, unwind the stack and insert a breakpoint at the first return to user code. At that breakpoint attempt another debugStop. For injection that failed because the thread is at an unsafe point, single step the thread, attempting to inject debugStop at each step.
  4. Let the remaining threads continue running. Repeat steps 2 through 4 until all threads are stopped.

This is basically a debugger-assisted non-cooperative stop-the-world. For Go 1.12, I plan to implement non-cooperative preemption directly in the runtime, which may move much of this logic into the runtime itself.

/cc @aarzilli

@aclements aclements self-assigned this May 25, 2018

@aarzilli

This comment has been minimized.

Copy link
Contributor

aarzilli commented May 28, 2018

For Go 1.12, I plan to implement non-cooperative preemption directly in the runtime, which may move much of this logic into the runtime itself.

That being the case, personally, I'd rather wait for 1.12's cycle for this. I think it's unlikely that we are going to finish implementing the debugger side of call injection before we go is deep in 1.12's cycle, even as it is now.

@aarzilli

This comment has been minimized.

Copy link
Contributor

aarzilli commented May 28, 2018

@derekparker

This comment has been minimized.

Copy link
Contributor

derekparker commented May 29, 2018

That being the case, personally, I'd rather wait for 1.12's cycle for this.

Fair enough, but if we do end up making more movement on the implementation of call injection on our end reliably, I'd rather get this feature into users hands sooner rather than later, this is a particularly exciting one.

@aclements

This comment has been minimized.

Copy link
Member Author

aclements commented May 29, 2018

I'd rather get this feature into users hands sooner rather than later

Note that debugger function call injection is supported by the runtime now. It's just that the debugger has to let the whole process resume execution during the injected call. You probably have a better sense of the impact that has on the user experience than I do.

As a data point, GDB by default continues the entire process during things like call injection and stepping (though set scheduler-locking can turn that off on Linux at least).

@gopherbot

This comment has been minimized.

Copy link

gopherbot commented Sep 11, 2018

Change https://golang.org/cl/134779 mentions this issue: runtime: support disabling goroutine scheduling by class

gopherbot pushed a commit that referenced this issue Oct 2, 2018

runtime: support disabling goroutine scheduling by class
This adds support for disabling the scheduling of user goroutines
while allowing system goroutines like the garbage collector to
continue running. User goroutines pass through the usual state
transitions, but if we attempt to actually schedule one, it will get
put on a deferred scheduling list.

Updates #26903. This is preparation for unifying STW GC and concurrent
GC.

Updates #25578. This same mechanism can form the basis for disabling
all but a single user goroutine for the purposes of debugger function
call injection.

Change-Id: Ib72a808e00c25613fe6982f5528160d3de3dbbc6
Reviewed-on: https://go-review.googlesource.com/c/134779
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
@derekparker

This comment has been minimized.

Copy link
Contributor

derekparker commented Feb 27, 2019

I wanted to ping for an update on this issue as @aarzilli @heschik @hyangah and I spoke about it today during our meeting.

It seems like some of the foundation is there to prevent the runtime from scheduling goroutines that aren't already in the running state. However if I understand correctly were still a bit short on this feature. We would still need cooperation from the runtime to "unschedule" any running goroutine aside from the one Delve wants to be allowed to execute correct? As I understand it, even if we inject schedEnableUser(false) into every thread, that thread will still be _Grunning meaning it will still execute, correct? I assume the above CL prevents any non-running goroutine from being scheduled, but does not deschedule any running goroutine.

If that is the case, then what is the proposed path forward in order to finish this feature? From the original description @aclements suggested injecting a call on each thread which would stop the goroutine. From the standpoint of Delve that makes sense as it would allow us to selectively inject this call onto any thread running a goroutine besides the one we wish to continue execution.

We would really like to get this feature completed as I think that having other side effects during a user-injected function call is less than ideal. Is there any bandwidth to get this done during the 1.13 cycle?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.