Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

proc: changed windows backend to deal with simultaneous breakpoints #598

Merged
merged 5 commits into from
Oct 22, 2016

Conversation

aarzilli
Copy link
Member

@aarzilli aarzilli commented Jul 20, 2016

This solves most of the test flakiness that the windows back end has on multi core.

Before this patch when multiple breakpoints were hit simultaneously only one was reported, the other one was processed by (_Thread).singleStep while stepping over the breakpoint which caused the second breakpoint to be missed and the first breakpoint to be processed twice, because (_Thread).singleStep was unable to complete.


This change is Reviewable

@@ -127,6 +127,20 @@ func Launch(cmd []string) (*Process, error) {
return nil, ProcessExitedError{Pid: dbp.Pid, Status: exitCode}
}

for _, thread := range dbp.Threads {
_, err := _SuspendThread(thread.os.hThread)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to wrap this in execPtraceFunc as well, correct? I assume it has the same origin thread restrictions.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lukehoban code didn't, their documentation doesn't mention it, and it doesn't seem to cause problems.

@aarzilli
Copy link
Member Author

I've found out that some of the errors that still happen are because somehow we get new threads starting while even after we suspended all existing threads in setCurrentBreakpoints, I changed waitForDebugEvent to suspend new threads in that section of the code and the only errors I still see are in TestIssue414, which isn't a very important test.

I've put those changes in a separate commit (addenda) for easier reviewing, I'll squash before merge.

for {
var err error
var tid int
dbp.execPtraceFunc(func() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't like the idea of this function having the side effect of potentially causing the inferior to execute instructions. We continue and then immediately execute a non-blocking wait, which will return an error when we do not have any pending events. That means we never stop this thread, correct?

Is is not possible to just loop through WaitForDebugEvent in non-blocking mode until we have no events? I assume all events would be queued and we shouldn't have to continue the inferior.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The loop on dbp.Threads calls _SuspendTherad on every thread so the inferior shouldn't be able to execute any instructions.

WaitForDebugEvent doesn't seem to return anything unless we call ContinueDebugEvent in between.

@aarzilli
Copy link
Member Author

rebased on master, TestIssue414 is flaky but that shouldn't be a problem after #603.

var err error
var tid int
dbp.execPtraceFunc(func() {
err = _ContinueDebugEvent(uint32(dbp.Pid), uint32(dbp.os.breakThread), _DBG_CONTINUE)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the only thing that sticks out to me in this patch set. Here, we're continuing a thread and waiting for a debug event, however in the meantime it's possible that the thread will be executing instructions. This function should not have that side effect.

A non-blocking wait for debug event should suffice, yeah? If there's no events we simply move on. I don't understand the need to continue a thread in a function that is setting state on our thread objects.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we suspend all threads (which we did above this line) ContinueDebugEvent doesn't resume execution, it just acknowledges that we have processed the event we received. I couldn't figure out any other way of doing this that would work with the windows API. If we don't do this later on on the Continue procedure while we are singlestepping CurrentThread over its breakpoint instruction we receive events for other threads and we are not equipped to dealing with them there.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, so if I understand correctly, it behaves similar to the Darwin API, where you can suspend a thread multiple times, and have to resume an equal number of times for it to actually execute instructions. Is that correct, from your understanding? If so, could you add a comment / link to docs to make it clear we're not actually continuing execution.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it's like the darwin API, however I don't actually ever suspend the thread more than once, when one of the threads generates a debug event windows will halt the execution of all threads but this state doesn't count as suspended, it's weird.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a comment describing what we are doing on that function.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. I think maybe we should alias _DBG_CONTINUE to _DBG_EXCEPTION_HANDLED. When reading up on the Windows debug API the other day, IIRC, that existed for 32 bit, but not 64 bit. However, semantically it's more accurate to what we're trying to accomplish, and makes it more explicit from the callers point of view.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Googling DBG_EXCEPTION_HANDLED leads me to this blog post which suggests that at some point DBG_EXCEPTION_HANDLED and DBG_CONTINUE did different things and had different values, I think appropriating the name as a synonym muddles the water but if you insist I'll switch.

@derekparker
Copy link
Member

Overall looks good. Can you rebase this on master so we can pull in the patch fixing prologue detection and get the AppVeyor CI to pass?

@aarzilli aarzilli force-pushed the solution3 branch 3 times, most recently from a22e5a5 to 35b9c9f Compare September 8, 2016 17:03
@aarzilli
Copy link
Member Author

aarzilli commented Sep 8, 2016

Rebased, the only failures I see (and the one in Appveyor) is connected to the Step code, I will debug it after #603, assuming it persists.

@aarzilli
Copy link
Member Author

I added a commit that fixes #594, I can split it out if you want to review it separately, it's pretty simple.

@@ -778,7 +778,9 @@ func TestStacktraceGoroutine(t *testing.T) {

for i, g := range gs {
locations, err := g.Stacktrace(40)
assertNoError(err, t, "GoroutineStacktrace()")
if err != nil {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This needs a comment / explanation / TODO about why we're ignoring errors.

@derekparker
Copy link
Member

Overall LGTM, just a few comments. After those are addressed I will merge this.

@aarzilli
Copy link
Member Author

Done.

@aarzilli aarzilli force-pushed the solution3 branch 3 times, most recently from 2d29759 to ff41c6c Compare October 3, 2016 16:59
@aarzilli
Copy link
Member Author

aarzilli commented Oct 3, 2016

Rebased on master, I can't replicate the failure I see in appveyor, someone that can will have to fix that. I still think this is an improvement and should be merged.

Copy link
Member

@derekparker derekparker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@derekparker derekparker merged commit f6e8fb3 into go-delve:master Oct 22, 2016
abner-chenc pushed a commit to loongson/delve that referenced this pull request Mar 1, 2024
…o-delve#598)

* proc: changed windows backend to deal with simultaneous breakpoints

* bugfix: forgot to add windowsPrologue3 to the prologues list in e4c7df1

* Tolerate errors returned by Stacktrace in TestStacktraceGoroutine.

* bugfix: proc: propagate debug events we don't cause back to the target process

Fixes: go-delve#594

* proc: fixed TestStepConcurrentPtr

Implementation of nextInProgress was wrong.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants