-
-
Notifications
You must be signed in to change notification settings - Fork 43
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
PacketUnexpected with ThreadStopReason::GdbCtrlCInterrupt #89
Comments
Yeah... that's not good. Could you post some more context on how you arrived at this situation? i.e: the debug logs from |
gdbstub trace log
gdb log
|
Ahh, sorry - could you re-upload the gdbstub trace log with the |
Also, could you link to the specific revision of your |
I'm on the gdbstub with trace-pkt
|
Same issue with |
Before I look into it this more (likely sometime this weekend), do you know if this was something that started happening after I merged #88, or was this happening before as well? Theoretically, this assert is there because the GDB client should never send an interrupt packet while the GDB stub is still in the "paused" state (i.e: the |
Here is the start of my state machine: pretty much all the gdb logic stuff is in that file |
I didn't have |
Alright, I've skimmed through your implementation, and I think I understand the issue in your implementation. Before going into the specific issue at hand, lets back up for a second and discuss how the overall state machine flow goes:
The Once the In this state, there are two events that need to be handled (hence the two methods on The first case is easy to understand: call The second case seems a bit weird... why would the GDB client send data while the target is running? Well... as it turns out, Ctrl-C interrupts are signaled by sending special "interrupt" packets down the wire! Specifically, an interrupt can be signaled either by sending a single See https://sourceware.org/gdb/onlinedocs/gdb/Interrupts.html#Interrupts for details. In your implementation, you are not handling the "data is sent down the wire while the target is running" case at all. Instead, your current implementation seems to do the following:
i.e: you should only call A good starting point for how to structure your event loop would be to look at the guts of the new https://github.com/daniel5151/gdbstub/blob/4e46b72/src/gdbstub_impl/mod.rs#L199 After writing this explanation, I realize that the fact that you ran into this issue is indicative that this new API still needs some work + polish. I suspect that the API itself could be improved (possibly by adding some more fine-grained states), and it goes without saying that the biggest missing piece right now is good documentation/examples... Ideally, I want to craft an API that's impossible to misuse - leveraging as many Rust type-system shenanigans to have the compiler enforce all branches are being accounted for. I'll have to think about how I can tweak the current iteration of the API to achieve that... |
Ah, in addition, I really should remove the I'll go ahead and do immediately. |
See #89 for why this stop reason seems to do more harm than good.
Ah I see, yes I figured the chances are high it's just me doing something wrong here. This detailed explanation helps a lot. If you'd like me to I could try to put this into the docs (only if you don't want to do it yourself of course). |
I appreciate the offer, but I think this is something I should tackle myself - especially since there's a non-zero chance I rework the API a bit, which would invalidate any docs. And of course, when you get the chance, please let me know if my explanation + suggested fix was correct! |
PacketUnexpected
on Ctrl-C Interrupt
PacketUnexpected
on Ctrl-C Interrupt
Great, so thanks a lot for your help, I rewrote my state-machine a little to reflect the fact that it should pump() in DeferredStopReason when I have data available. I hope this accurately captures what you wanted me to do. With this change,
So while I think I understand what's going on: when I do e.g. the way the log looks it seems for one state-machine execution after taking a step it ends up many times in Pump state without me ever having a GDB prompt, so unsurprisingly it gets an unexpected interrupt packet when I hit Ctrl+c during this time:
|
Hmm, I'm not entirely sure what's going on here just yet, but here are a few things you can try:
|
Did some tests: Re 1: Changed it but didn't make a difference (still seems like it gets a ctrl+c packet when not expecting it) Re 2: Using the new feature branch ( With stepping is enabled, the
If I disable stepping (
|
FWIW it seems to me that the root of the issue is that the state machine unconditionally goes from deferred_stop_reason into pump state. However, I think in my case, gdb seems to think it's best to just execute the I'm maybe understanding this wrong but so far I assumed the STM is only supposed to be in the pump state when I have a gdb prompt on the client side? |
That could be indicative of an issue in how you're handling your breakpoints / other stop reasons... GDB will do some very weird stuff if breakpoints aren't implemented correctly, and that isn't something If it's not that... I can try to take a closer look sometime this weekend. In the meantime, you should consider doing some hands-on hacking with the current As mentioned earlier, this is relatively untested code! |
So I do/did worry about this, because I saw this piece in the docs and it was not something I'm doing.
In both examples if they use RFlags to single-step I couldn't find any adjustment they do to the RIP when single-stepping. I did add some asserts to verify when I get a debug interrupt due to hitting a breakpoint (with the |
While I can't give you a definitive yes/no answer for that, based on the docs you've linked, that seems correct? That said, the fact that you code completely breaks when single-stepping support isn't available is not a good sign (i.e: regardless if ctrl-c interrupts are being handled or not). Hardware-assisted single stepping is an optional feature, and if it isn't available, GDB should be using continue + temporary breakpoints to get things working. What if you roll-back the code to a point before you started trying to support ctrl-c interrupts, and make super sure that the simple case of disabling single stepping + resuming via continue & temporary breakpoints works as expected. |
And for having single-stepping disabled. I haven't shared a complete log for that failure yet so here is one: gdbstub trace log
gdb log
|
Wait a sec, that's very weird... why is your GDB client sending Assuming those logs were from the feature/optional-single-step branch, this particular PacketUnexpected should be happening in a totally different part of the codebase, namely: gdbstub/src/gdbstub_impl/ext/base.rs Line 510 in ba4b971
Could you add a log statement to |
Yes this is using the
|
Trace for that interaction: gdbstub trace log
|
Well, aside from the possibly weird interaction of returning a I quickly checked to see what happens if you disable single-step support (and just in case, range-step support) in the I'll be honest, I am now very confused. Your GDB client clearly acknowledged the Off the top of my head, a few more things to try:
|
Right, I noticed that too that I should probably respond with SwBreak. This is a problem with me forwarding this just to the HwBreak logic, will fix it. In theory it shouldn't matter for the vCont issue as this won't even set a BP. But I'll fix it now and report back... |
Ok I fixed reporting a HwBreak instead of SwBreak when appropriate.
I did some tests, it doesn't seem to matter what command I use:
|
Tried with latest gdb version 11.1 compiled from sources, same result:
|
Fascinating... Okay, another thought:
|
It's just
good question, I tried without passing a binary and/or gdbinit using |
Right. Well shoot, I'm stumped. Unless there's something incredibly obvious I'm missing, it seems that GDB is ignoring the fact that the stub doesn't support single-stepping, and is sending it single-step commands regardless. I'll try to find some time this weekend to whip up some kind of "dummy" In the meantime, I'll suggest that we shelve this digression into why optional single-stepping isn't working, and re-focus on the actual issue at hand: getting Ctrl-C interrupts working. Go ahead and re-enable support for single stepping support, and then see if you can get your state machine implementation working correctly without handling Ctrl-C interrupts. That should give us a baseline level of "working code" to then re-introduce Ctrl-C functionality into. |
Well I'll be damned. I hacked together a dummy I'm now digging through the GDB client code to figure out what the heck this is happening... |
Indeed, looks more and more like a gdb bug :) |
Right so coming back on this: I still believe there might be an issue with the assumptions in the state-machine; basically it seems what happens is that when I use |
Indeed, it seems you're right! I'm not sure why I didn't try this out sooner, but if I spin up the in-repo Now that I have an easy local repro, I can really dig into what's going on, though unfortunately, this week is a bit hectic for me, so I'm not sure when exactly I'll get the chance to dig into this issue... In any case, I can confirm this is definitely an issue on If you'd like to take a crack at fixing this issue, I'd be happy to review any code sent my way. Otherwise, give me some time, and I'll definitely fix this issue 😅 |
Sure sounds good, I can take a look this week and see if I can fix it before you |
Just a heads up - if you do end up trying to fix it, could you please investigate your fix in the context of the in-tree Assuming this isn't another funky architecture-specific issue, fixing it for the in-tree example should also fix things in your implementation, while also making it a lot easier for me to validate the solution on my end. |
Welp, this bug has nerd sniped me, so I spent my (late) lunch looking into this... I went ahead and tested out what Running What does "interrupt the target" entail? The gdbserver triggers a SIGINT (signal number 02) within the running process, which is then returned in the next stop-reply packet! i.e: instead of returning a typical SIGTRAP ( In fact, if you run the GDB command It seems a possible solution here will be to make the EDIT: I have a (brittle) local fix that seems to prove that returning |
Nice! |
this lifts the "disconnected" and "interrupted" events into the state machine itself, thereby making it impossible for implementation to accidentally not handle them. this change also fixes the PacketUnexpected error when a target gets interrupted while it is not running, closing #89 that said, this commit _has_ bumped the binary size a bit, so I might take a crack at reworking some of the internals to crunch things down a bit...
Sorry for the delay, but the fix is now in I've gone ahead and substantially reworked the state machine API to be a bit more streamlined and consistent. Notably, handling interrupts and disconnect conditions is now done via dedicated state machine states, rather than While I don't have great documentation just yet, I would once again recommend that you take a look at the implementation of I will close this issue after you get a chance to try out these changes + confirm that they are working as expected. Cheers! |
sounds good I will give it a try this weekend! |
Just a heads up - I spent a bunch of time working on Apologies for the API churn! |
Ok --I have a test now that does Ctrl+C & continue and happy to report that it's working as expected on dev/0.6 👍 |
Got excited too early. I still have some issue with next (aka doing single-stepping) and sending Ctrl+C during that time. Will need to investigate further. |
Ok nvm, I figured that was my bug -- when single-stepping I accidentially reported StepDone instead of SIGINT to |
Fantastic! It sure was a long journey to get this fixed, but I'm glad we got it working in the end! All that's left is the arduous task of documenting all of these APIs, updating the changelog, writing a 0.5 migration guide, and publishing to crates.io... At this rate, it's looking more and more likely that all that is going to have to happen sometime in December over the holiday season 😅 Cheers! |
I tried to support Ctrl+C (in GDB) by waiting for the serial line interrupt, and if it arrives providing a
ThreadStopReason::GdbCtrlCInterrupt
to the state machine. However, this gives me:Which seems to be going here. The comment is strange because I'm doing a Ctrl+C while my code is spinning in an infinite loop. Anyways I figured I'd ask also because PacketUnexpected is treated as "always a bug"...
The text was updated successfully, but these errors were encountered: