Make single-stepping optional (removing ResumeAction) #92

Merged: daniel5151 merged 3 commits into dev/0.6 from feature/optional-single-step on Oct 22, 2021


daniel5151
Owner

Closes #59

See the associated issue for context and rationale.

This is definitely a big API change, but one that is important to make. Single stepping should not be required by the base protocol: while it's trivial to implement on some targets (e.g: emulators), it is not trivial to implement on others.

The GDB client is able to spoof single-stepping by setting temporary breakpoint instructions in the guest. While this is undoubtedly slower than an optimized single-stepping implementation, it has the key benefit of working "out-of-the-box", without any explicit effort required by the gdbstub user to get it working.
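
To illustrate what "optional" could mean at the API level, here is a minimal sketch using simplified stand-in traits. The trait shapes and the `NoStepTarget` / `EmuTarget` types are assumptions made for illustration; they are not gdbstub's real definitions.

```rust
/// Simplified stand-in for a target's base resume API (not gdbstub's real trait).
trait Resume {
    /// Resume execution until the next stop event.
    fn resume(&mut self) -> Result<(), &'static str>;

    /// Optional extension: targets that can single-step override this to
    /// return `Some(self)`. The default advertises no support.
    fn support_single_step(&mut self) -> Option<&mut dyn SingleStep> {
        None
    }
}

/// Optional single-step extension (hypothetical shape).
trait SingleStep {
    fn step(&mut self) -> Result<(), &'static str>;
}

/// A target that only implements the base API.
struct NoStepTarget;

impl Resume for NoStepTarget {
    fn resume(&mut self) -> Result<(), &'static str> {
        Ok(())
    }
}

/// An emulator-like target where stepping is trivial.
struct EmuTarget {
    pc: u32,
}

impl Resume for EmuTarget {
    fn resume(&mut self) -> Result<(), &'static str> {
        Ok(())
    }

    fn support_single_step(&mut self) -> Option<&mut dyn SingleStep> {
        Some(self)
    }
}

impl SingleStep for EmuTarget {
    fn step(&mut self) -> Result<(), &'static str> {
        self.pc += 4; // pretend every instruction is 4 bytes wide
        Ok(())
    }
}
```

With this shape, a target like `NoStepTarget` compiles without writing any stepping code, and the stub can discover at runtime whether stepping is available by checking for `Some`.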

@daniel5151 daniel5151 added this to the 0.6 milestone Oct 4, 2021
@daniel5151
Owner Author

@gz and @bet4it, could I ask you two to do me a favor and make sure these changes work as expected?

I've confirmed that things are working in the armv4t and armv4t_multicore examples, but it never hurts to get an extra set of eyes on these kinds of large changes!

@bet4it
Contributor

bet4it commented Oct 8, 2021

I just noticed that you use support_{reverse_cont, reverse_step, single_step, range_step} here.
Do I need to rename enable_{close, pread, pwrite, ...} to support_{close, pread, pwrite, ...}?

@daniel5151
Owner Author

As I've started doing more hands-on work in gdbstub recently, I've also come to notice the inconsistent naming...
I never settled on a consistent set of naming conventions, and over time, it seems that's resulted in more and more inconsistencies slipping in. Oops.

While I appreciate your offer, I'm probably going to do my own naming-consistency pass over the codebase at some point in the future, whereby I settle on a clear naming convention for IDETs.

@daniel5151
Owner Author

Update: While working on #89, we discovered that the mainline GDB client doesn't seem to respect the optional nature of single-stepping on certain architectures!

Some context: #89 (comment)

I spent some time digging into GDB's source code to see if I could figure out why this was happening (including running GDB under GDB while the inner GDB was connected to gdbstub, just so I could see what GDB was doing), but so far, I haven't been able to figure out the root cause.

All I know is that on the armv4t arch, things seem to work as expected (i.e: if the stub doesn't report support for single-stepping, the GDB client won't send single-step commands), whereas on x86, the GDB client will entirely ignore the lack of reported support, and send a single-step command regardless.

I've created a basic repro of this phenomenon over at https://github.com/daniel5151/gdb-optional-step-bug, and will be filing a bug upstream shortly.


In the meantime, I'm trying to think how I want to tackle this bug in gdbstub...

I suspect I'll have to introduce something similar to the implicit_sw_breakpoint guard rail introduced in 1d55539, except this time, it would be part of the Arch trait itself, and would be something like fn optional_single_step_support() -> bool.

i.e: armv4t works as expected, so it would report true from optional_single_step_support() at the Arch level. x86 does not work, so it would report false from optional_single_step_support().

Similarly, there would be a new top-level Target method called override_arch_optional_single_step_support() -> bool, which gives target implementations a chance to explicitly opt in to supporting optional single stepping, e.g: if/when GDB upstream fixes this bug.

@daniel5151
Owner Author

The bug is now filed at https://sourceware.org/bugzilla/show_bug.cgi?id=28440

Let's hope it hasn't actually been sent to /dev/null like some of the other bugs/questions I've reported...

@daniel5151
Owner Author

Just pushed up some changes that add a guard rail to work around this bug. The Arch trait now includes a new supports_optional_single_step method, which is an "opt-in" way for architectures to signal support for optional single stepping.

At the moment, only armv4t is confirmed working with optional single stepping, but hopefully, more architectures will get tested over time...
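
A rough sketch of how such an opt-in guard rail might behave, assuming a simplified `Arch` trait and a hypothetical `check_single_step` helper and `GuardRailError` type (none of these are gdbstub's actual definitions; only the method name `supports_optional_single_step` comes from the comment above):

```rust
/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq)]
enum GuardRailError {
    /// The arch is not known to work without single-step, and the target
    /// did not implement it.
    SingleStepRequired,
}

/// Simplified stand-in for an architecture description.
trait Arch {
    /// Opt-in: return `true` only if the GDB client has been confirmed to
    /// cope with a stub that does not report single-step support on this
    /// architecture. Defaults to the conservative answer.
    fn supports_optional_single_step() -> bool {
        false
    }
}

/// Confirmed working in this thread, so it opts in.
struct Armv4t;
impl Arch for Armv4t {
    fn supports_optional_single_step() -> bool {
        true
    }
}

/// Not confirmed (the client sends step packets regardless), so it keeps
/// the default.
struct X86;
impl Arch for X86 {}

/// Check performed once, before the debugging session starts.
fn check_single_step<A: Arch>(target_implements_step: bool) -> Result<(), GuardRailError> {
    if !target_implements_step && !A::supports_optional_single_step() {
        return Err(GuardRailError::SingleStepRequired);
    }
    Ok(())
}
```

Under these assumptions, `check_single_step::<X86>(false)` fails early with a descriptive error, while `check_single_step::<Armv4t>(false)` is allowed through.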


With this new guard rail, I will probably merge these changes into dev/0.6 shortly.

@daniel5151 daniel5151 merged commit 644e138 into dev/0.6 Oct 22, 2021
@daniel5151 daniel5151 deleted the feature/optional-single-step branch October 22, 2021 19:37
@bet4it
Contributor

bet4it commented Nov 17, 2021

I finally had time to figure out why the GDB client doesn't respect the optional nature of single-stepping on certain architectures:

GDB parses the result of vCont? in remote_target::remote_vcont_probe:
https://github.com/bminor/binutils-gdb/blob/c599303f92b1edd6ead3947737a8ae9e1c85e08c/gdb/remote.c#L6117-L6167
The state of s and S is stored in rs->supports_vCont.s and rs->supports_vCont.S and is only used in remote_target::can_do_single_step:
https://github.com/bminor/binutils-gdb/blob/c599303f92b1edd6ead3947737a8ae9e1c85e08c/gdb/remote.c#L14316-L14335
And if you search for can_do_single_step in the GDB source code, you will find that it is only used for ARM:
https://github.com/bminor/binutils-gdb/blob/c599303f92b1edd6ead3947737a8ae9e1c85e08c/gdb/arm-linux-tdep.c#L927

That means that, although we guess from the spec ( I can't find a clear explanation of vCont? ) that the GDB client should respect the result of vCont?, the implementation of GDB itself doesn't really care about it ( except on ARM ).
If an architecture wants to support software single step, it needs to register the function with set_gdbarch_software_single_step. You may find that some architectures such as X86 don't set it, so it's impossible to use software single step on such architectures.

The upshot: only the behavior on ARM will be affected by vCont?. Architectures such as X86 and AArch64 will always use hardware single step, while architectures such as MIPS will always use software single step, regardless of the result of vCont?. I have tested and confirmed it.
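
For readers unfamiliar with the probe being discussed: the stub answers `vCont?` with a reply of the form `vCont[;action...]`. The following is a minimal sketch of parsing such a reply and gating step requests on it. It is not GDB's code; `VContSupport`, `parse_vcont_reply`, and `may_send_step` are hypothetical names introduced for illustration.

```rust
/// Resume actions a stub advertised in its reply to `vCont?`.
#[derive(Debug, Default, PartialEq)]
struct VContSupport {
    cont: bool,        // `c`
    cont_signal: bool, // `C`
    step: bool,        // `s`
    step_signal: bool, // `S`
}

/// Parse a reply such as `vCont;c;C;s;S`.
/// Returns `None` when the reply does not start with `vCont`
/// (e.g. an empty reply, meaning the packet is unsupported).
fn parse_vcont_reply(reply: &str) -> Option<VContSupport> {
    let actions = reply.strip_prefix("vCont")?;
    let mut support = VContSupport::default();
    for action in actions.split(';') {
        match action {
            "c" => support.cont = true,
            "C" => support.cont_signal = true,
            "s" => support.step = true,
            "S" => support.step_signal = true,
            _ => {} // empty segment, or an action this sketch does not track
        }
    }
    Some(support)
}

/// Only send a step request if the stub advertised `s`.
fn may_send_step(reply: &str) -> bool {
    parse_vcont_reply(reply).map_or(false, |s| s.step)
}
```

The finding above is that the parsed `s`/`S` flags are only consulted on ARM; on other architectures the equivalent of `may_send_step` is effectively never asked.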

Then the code currently should be changed:

  1. You use supports_optional_single_step to make sure single step is implemented on X86 so the step command can work correctly ( although step might not be used at all during a debug session: Make single-stepping optional #59 (comment) ). If you think it's necessary, then we may also need to make sure the user implements software breakpoints on architectures such as MIPS, otherwise we can't do step.
  2. Or I think we could just remove the guard rail of optional single step and let the user choose which is necessary to implement.

@daniel5151
Owner Author

Thanks for the deep-dive @bet4it. This is super valuable stuff.

Indeed, the idea behind the supports_optional_single_step architecture is to be able to encode whether the GDB client exhibits this issue on a per-arch level. At the time, we'd only confirmed that x86 requires hardware single stepping, and that ARM does not, falling back to the known-safe baseline behavior of "you must support single-step" on all other architectures (that we did not hands-on audit).

That means that, although we guess from the spec ( I can't find a clear explanation of vCont? ) that the GDB client should respect the result of vCont?, the implementation of GDB itself doesn't really care about it ( except on ARM ).

Yep, this behavior is definitely not "in the spec", and is strictly an issue with the GDB client implementation. Unfortunately, it's enough of a usability concern that it makes sense for gdbstub to include a guard rail to make sure end users aren't bitten by any surprising errors (as gz was when he was working on his implementation).

If you think it's necessary, then we may also need to make sure the user implements software breakpoints on architectures such as MIPS, otherwise we can't do step.

As is precedent in the project, I would not be comfortable encoding these sorts of invariants without getting sign-off from someone with a working target implementation for a particular architecture. i.e: I'll only merge an arch-level supports_optional_single_step update if the author has validated the behavior on a "real" target implementation.

To reiterate though: the baseline behavior of the supports_optional_single_step infrastructure is to require that targets implement single-step support, unless the architecture has been explicitly tested to work without it (i.e: ARM), so your example with MIPS is currently doing the right thing.


What I should do is spin up a tracking issue that tracks which current Arch impls in gdbstub_arch still need to be audited. Maybe I'll do that sometime today...

@daniel5151
Owner Author

Also, if you don't mind, could you post that exploration over on the upstream bug report at https://sourceware.org/bugzilla/show_bug.cgi?id=28440?

@bet4it
Contributor

bet4it commented Nov 19, 2021

I think it's not a bug of GDB.
Firstly, the spec never guarantees that when the vCont? reply lacks ;s;S, GDB will use breakpoints to emulate single stepping. You got this impression from the behaviour on ARM and from guessing what vCont? should do.
Then it's no problem that the implementation only covers part of the spec, right? 99.99% of GDB client users connect to a gdbserver running on real hardware, where whether the CPU architecture supports hardware single step is deterministic. If the architecture doesn't support hardware single step, does GDB need to add code for hardware single step? ( although it's still possible when we emulate the architecture in software ) And if the architecture supports hardware single step, it's not necessary for GDB to emulate single stepping with breakpoints ( although GDB does this on ARM ). If GDB fully honored the result of ;s;S, there might not even be a place to test the code ( where can you find a MIPS CPU that supports hardware single step? ), and almost nobody would use this functionality.

And you didn't understand what I meant before. You think step can be optional because if it's not implemented, GDB should emulate it with temporary breakpoints. But I think step can be optional because if the user doesn't use commands like stepi, step won't be used, so we can just leave it to the user, and let the user choose whether they need to implement it.
You use supports_optional_single_step to make sure step can still work on some architectures even if the user doesn't implement it. But why is step so important that it must work? What if breakpoints are not implemented either? And as I said before, only ARM will use breakpoints as a fallback for step.

to make sure end users aren't bitten by any surprising errors

According to common sense, the user shouldn't expect step to work if they don't implement it ( Yes, ARM can give us a surprise, but it's not the common behaviour ). And if a user uses gdbstub on MIPS, he will find that step still can't work even if he implements the step function, because GDB will only use breakpoints to implement single step on such architectures, and doesn't consider if you support step by yourself. You use supports_optional_single_step to make sure step can work, then do we need another guard rail on MIPS to force the user to implement breakpoints so step can work? I think it's better to just remove all the guard rails.

I would not be comfortable encoding these sorts of invariants without getting sign-off from someone with a working target implementation for a particular architecture.

I'm working on a project that uses gdbstub and can work on all major architectures ( X86, X86_64, ARM, AArch64, MIPS currently and can support PowerPC and RISCV with small changes later ). It will be public later ( I really hope gdbstub can release 0.6 before I release my project ), but if you want to test different architectures by yourself, I can give you access to my project.

@daniel5151
Owner Author

I think it's not a bug of GDB.

This is definitely a bug in GDB. If you boil it down to the bare minimum, the bug is very simple: if a target doesn't respond with support for s;S, the GDB client should not send s or S or vCont;s;S packets!

The issue at hand is that when the target doesn't report support for single stepping, and the user tries to use stepi, the client doesn't report "command not supported by remote target" (as it would for any other packet that the target hasn't specified support for), and on some architectures, it simply sends the packet anyways.

In other words, the GDB client is not respecting the optional nature of single stepping, as specified by the GDB spec, and it needs to be updated to either a) force remote targets on certain arch's (e.g: x86) to support single-stepping, or b) properly report "stepi is not supported" when the target doesn't respond with support for single-stepping.

You use supports_optional_single_step to make sure step can work

That's not true at all. The purpose of this guard rail is to work around the aforementioned GDB bug, and avoid the gdbstub unexpectedly dying due to "packet unsupported" errors when the user attempts to stepi on a target that hasn't implemented the single-stepping extension.

The reason I've made this guard rail configurable on a per-arch level is that the GDB client clearly has the ability to work around the lack of single-stepping support on certain platforms (e.g: ARM), and it would be a shame to avoid exposing that functionality.

I'm working on a project that uses gdbstub and can work on all major architectures ( X86, X86_64, ARM, AArch64, MIPS currently and can support PowerPC and RISCV with small changes later ).

Ah, nice! Based on our previous interactions, I had assumed you were working with primarily x86.

Sure, if you've got an implementation that works across multiple architectures, I'd be more than happy to take your word on how various features work on a certain arch.

e.g: if you have the time, it would be great if you could check whether other platforms exhibit the aforementioned bug, so that we can update their archs' supports_optional_single_step accordingly :)


And if a user uses gdbstub on MIPS, he will find that step still can't work even if he implements the step function, because GDB will only use breakpoints to implement single step on such architectures, and doesn't consider if you support step by yourself.

I don't have access to a MIPS target, so I can't double check that behavior, nor do I fully understand what behavior you're describing here...

I interpret what you wrote as "the GDB client doesn't use the single-step handler on MIPS targets, even if it's present".

If that's the right interpretation, then that's not great, but it's also not something gdbstub really needs to care about. i.e: the bug would manifest as "single stepping isn't as fast as it could be", rather than "gdbstub crashes due to the client sending an unexpected packet".

Though it might be worthwhile to document this behavior somewhere anyways...


According to common sense, the user shouldn't expect step to work if they don't implement it

I think it's better to just remove all the guard rails.

You really need to get out of the C/C++ programmer mindset! 😄

Rust the language, and gdbstub the project, are all about adding guard rails!
e.g: array indexes in Rust are bounds checked by default because time has shown very clearly that programmers are human, and don't always write perfect code.
e.g: gdbstub's entire API is structured around IDETs to enforce that certain "groups" of packets are implemented together (consider z and Z).
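
As a rough illustration of the "implemented together" point, here is a sketch of a breakpoint extension trait where both operations are required methods. The trait shape, method names, and the `Emu` type are assumptions for illustration, not gdbstub's real definitions.

```rust
use std::collections::BTreeSet;

/// Simplified stand-in for a breakpoint extension trait. Both methods are
/// required, so a target cannot handle `Z0` (insert) without also handling
/// `z0` (remove): omitting either one is a compile error.
trait SwBreakpoint {
    /// Handles `Z0`. Returns `true` if the breakpoint was newly added.
    fn add_sw_breakpoint(&mut self, addr: u32) -> bool;
    /// Handles `z0`. Returns `true` if a breakpoint was actually removed.
    fn remove_sw_breakpoint(&mut self, addr: u32) -> bool;
}

/// Toy emulator that tracks breakpoint addresses in a set.
#[derive(Default)]
struct Emu {
    breakpoints: BTreeSet<u32>,
}

impl SwBreakpoint for Emu {
    fn add_sw_breakpoint(&mut self, addr: u32) -> bool {
        self.breakpoints.insert(addr)
    }

    fn remove_sw_breakpoint(&mut self, addr: u32) -> bool {
        self.breakpoints.remove(&addr)
    }
}
```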

The ethos of gdbstub, as well as any idiomatic Rust project, is to make it as hard as possible to misuse the APIs being exposed, while also providing an "escape hatch" for users that know what they're doing.

e.g: with checked array indexing, Rust has a guard rail for most casual users, ensuring they don't accidentally corrupt their own memory, while also providing unsafe methods for "power users" that give them the ability to override these guard rails when required.
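
A small standalone example of that guard rail and its escape hatch, using only the standard library (the helper names `checked` and `unchecked` are made up for this example):

```rust
/// Checked access: an out-of-range index yields `None` instead of reading
/// arbitrary memory. (Plain `v[i]` would panic rather than return `None`.)
fn checked(v: &[u8], i: usize) -> Option<u8> {
    v.get(i).copied()
}

/// Unchecked access: the escape hatch for callers who know what they're doing.
///
/// # Safety
/// `i` must be less than `v.len()`; otherwise this is undefined behaviour.
unsafe fn unchecked(v: &[u8], i: usize) -> u8 {
    unsafe { *v.get_unchecked(i) }
}
```

Calling `unchecked` requires the caller to write `unsafe { ... }`, which makes the opt-out of the guard rail visible at the call site.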

e.g: with gdbstub, the various guard rails are in place to ensure users don't run into any unexpected footguns out of the library's control (such as this optional single step bug, or the implicit software breakpoints behavior). Sure, you could mention these behaviors purely in the library's docs, but as time has shown many times, invariants encoded in doc comments are often skimmed over and forgotten!

The crux of all these guard rails, API contortions, and extra engineering effort boils down to two very simple words: user empathy.

I hate debugging API misuse issues in C/C++ projects, and I appreciate that Rust's language and community is of the opinion that APIs should be as difficult to misuse as possible. That makes my life a lot easier as a library consumer, and as such, I try and pass on these gains to folks that end up consuming my library.

@daniel5151
Owner Author

Oh, and as for releasing gdbstub 0.6: I don't foresee any more breaking functionality being added prior to release. The only blocker now is documentation, which I should have time to do sometime over the next month or two (with the holidays coming up, I should have more time to sit down and hammer stuff out).

That said, if you're squeamish about pulling deps directly from git + tying to a particular commit hash, I don't mind pushing out a 0.6.0-beta-1 release on crates.io in the interim...

@bet4it
Contributor

bet4it commented Nov 22, 2021

Oh no, you totally don't understand me...

if a target doesn't respond with support for s;S, the GDB client should not send s or S or vCont;s;S packets!

But the spec doesn't point it out. Don't you think we should always follow the spec and if the spec doesn't say something we should assume that it doesn't exist?

According to the code of GDB, this behaviour does exist, but only on ARM.

And even if the target responds with support for s;S, the GDB client may not send s or S or vCont;s;S. The GDB client will use breakpoints on MIPS.

force remote targets on certain arch's (e.g: x86) to support single-stepping

How can GDB do this? For example, how can GDB force remote targets to support vMustReplyEmpty?

avoid the gdbstub unexpectedly dying due to "packet unsupported" errors when the user attempts to stepi on a target that hasn't implemented the single-stepping extension.

If the GDB server doesn't respond with support for s;S on ARM, and breakpoint is not implemented, gdbstub will also die unexpectedly when the user attempts to stepi. How can you solve it?

the GDB client clearly has the ability to work around the lack of single-stepping support on certain platforms (e.g: ARM)

if you have the time, it would be great if you could check whether other platforms exhibit the aforementioned bug

Didn't I tell you that according to the code and the behaviour, only GDB on ARM has the ability to work around the lack of single-stepping support? This work around is a special case!

I interpret what you wrote as "the GDB client doesn't use the single-step handler on MIPS targets, even if it's present".

Yes, that's what I want to say.

but it's also not something gdbstub really needs to care about

But the gdbstub will die unexpectedly due to "packet unsupported" errors when the user attempts to stepi on a target that hasn't implemented the breakpoints extension...

@bet4it
Contributor

bet4it commented Nov 22, 2021

The only blocker now is documentation, which I should have time to do sometime over the next month or two

Oh no, there should be some small changes we discussed before. For example, make resume optional.

That said, if you're squeamish about pulling deps directly from git + tying to a particular commit hash, I don't mind pushing out a 0.6.0-beta-1 release on crates.io in the interim..

Oh, it's not necessary. I can just target the dev/0.6 branch.

@daniel5151
Owner Author

But the spec doesn't point it out.

That's the entire point of the vCont? packet. It gives the GDB client a chance to perform feature negotiation with the remote target. If the remote target doesn't support a particular resume packet, the GDB client should not send it. Plain and simple.

force remote targets on certain arch's (e.g: x86) to support single-stepping

How can GDB do this?

Apologies, allow me to clarify: when I say "force" I mean "report an error". i.e: if you're trying to resume an x86 target that doesn't support single step, the GDB client should simply report an error when attempting to stepi, thereby "forcing" the target implementation to go back and implement support for single-stepping (if that's a feature they want to use).

Again, if you take a step back for a moment, and look at the bigger picture (ignoring the specific cases of x86, ARM, MIPS, etc...), the core bug is simple and undeniable: the GDB client must respect the feature negotiation that happens as part of vCont?, either by working around the lack of stepi support by falling back to a breakpoint based approach, or simply erroring when the user tries to run stepi.

It should not send explicitly unsupported vCont packets to the remote target after vCont? feature negotiation.

Didn't I tell you that according to the code and the behaviour, only GDB on ARM has the ability to work around the lack of single-stepping support? This work around is a special case!

Careful now, you're doing the thing again! You know, the thing where you make some short-term design decisions based on the concrete implementation details of a particular GDB client 👀

Based on my reading of the spec (and in this case, I'm almost certain that I'm reading it right), all targets should be allowed to omit support for single-stepping (as part of vCont feature negotiation), and in those cases, the GDB client should either a) disallow using the stepi GDB command, or b) work around the lack of support using breakpoints (if available).

The fact that the current GDB client has a bug, and doesn't follow the spec is not gdbstub's problem. We follow the spec, and anticipate that at some point in the future, GDB clients fix this bug. At that point, gdbstub users will have the option to bypass the current guard rail, and have single-step be an optional feature across all architectures.


But the gdbstub will die unexpectedly due to "packet unsupported" errors when the user attempts to stepi on a target that hasn't implemented the breakpoints extension...

Oops, yep, you're totally right, and that's something I'd intended to handle in the guard rail, and simply forgot to do. It's very easy to check for - simply augment the guard rail to check if the target has implemented breakpoint support :)

Thank you for pointing this out - I'll push out a fix for this shortly...


Oh no, there should be some small changes we discussed before. For example, make resume optional.

Shoot, I did say that I'd look into that, didn't I...
Okay, fair enough 😅

I'll try and land that change before pushing out 0.6. As I mentioned in the comment, it should be a mostly mechanical change to split out this functionality from the base method set...

@daniel5151
Owner Author

But the gdbstub will die unexpectedly due to "packet unsupported" errors when the user attempts to stepi on a target that hasn't implemented the breakpoints extension...

Oops, yep, you're totally right, and that's something I'd intended to handle in the guard rail, and simply forgot to do. It's very easy to check for - simply augment the guard rail to check if the target has implemented breakpoint support :)

After digging into this a bit, I realized that this isn't actually the case. If the target doesn't implement breakpoint packets, the GDB client will not send breakpoint packets, and will instead attempt to overwrite target memory with breakpoint instructions (which would only happen if the user enabled the use_implicit_sw_breakpoints guard rail)

@daniel5151
Owner Author

5bc5831 + 1c6560e have made all forms of resume optional 🎉

@bet4it
Contributor

bet4it commented Nov 30, 2021

I've been a bit busy recently. Why don't you just support more architectures (such as MIPS) in gdb-optional-step-bug so you can try it yourself? I think it's quite simple.

@daniel5151
Owner Author

Why don't you just support more architectures (such as MIPS) in gdb-optional-step-bug so you can try it yourself?

This is indeed a reasonable approach, though I suspect it's one that will sit on the backburner for a while as I work on finishing other things up. I'm also quite busy myself (with the recent Thanksgiving long weekend offering a brief window where I could really sit down and work on gdbstub finally), so I'm not sure when I'd do it.

The key thing to remember is that testing various architectures is something that can be done asynchronously from mainline gdbstub development. This falls into the realm of the gdbstub_arch crate, which is decoupled from gdbstub precisely to enable these sorts of decoupled development patterns.

@bet4it
Contributor

bet4it commented Jan 10, 2022

Hi Daniel, I notice you are about to release gdbstub 0.6 version.
Hope you can try the behavior of single step on different architectures before that, or just reconsider the code I pointed out in #92 (comment).
I'm sure the current supports_optional_single_step way is not appropriate🤔

@daniel5151
Owner Author

daniel5151 commented Jan 10, 2022

Hey @bet4it,

To refresh your memory on how the current implementation works:

  • supports_optional_single_step defaults to false on all platforms that don't explicitly support it. ARM is the only platform that currently defaults to true, since that's the one I've tested myself.
    • This is a reasonable baseline behavior, as the alternative of having it true by default would lead to fatal PacketUnexpected errors on platforms where upstream GDB considers single-stepping to be required (i.e: ignoring the target's vCont response entirely), e.g: x86
    • Updating these defaults can be done asynchronously with a minor version bump to gdbstub_arch, so testing every Arch isn't a blocker to 0.6
    • If the two options here are "crash gdbstub with an opaque error + trash the user's GDB session" vs. "force the user to implement single-step / manually investigate if their platform requires single-step", I know which one I'm picking
  • If the target truly supports optional single step, and doesn't implement SW breakpoints, the use_implicit_sw_breakpoints guard rail will remind the user that attempting to use single step / continue in GDB without implementing explicit support for sw breakpoints will result in memory being overwritten, which should be enough of a wakeup call to have the user implement sw breakpoints.

AFAICT, these two guard rails should work in tandem to cover the cases you care about (notably, your quibbles about MIPS).


Please remember that the whole point of these various guard rails is to work around the fact that the mainline GDB client has some edge cases in how it handles targets with "non-standard subsets" of supported features (i.e: spec-compliant, but rarely seen "in the wild"). I'd much rather document + work around these upstream bugs in the context of gdbstub, as I don't want folks to run into these issues when working on their gdbstub implementation + open bug reports on my project that require upstream patches.

And of course, if you don't like these guard rails - you can just turn them off! Overriding supports_optional_single_step to always return true lands you right back at option 2 in #92 (comment), whereby it'd be up to the GDB client user to make sure they aren't invoking the wrong command.

@bet4it
Contributor

bet4it commented Jan 11, 2022

I want to ask you, what's the meaning of supports_optional_single_step?

You use it to represent whether or not the target supports single-stepping as an optional feature.
I guess that in your opinion, if the target supports it (such as arm), that means:

  1. You can choose if you want to implement single step manually. The behavior of GDB client will change based on it.
  2. If you implement single step, then it's obvious that single step can be used. And even if you don't implement single step manually, GDB will emulate it by breakpoint, so it still can be used.

And if the target doesn't support it (such as x86), that means:

  1. You must implement single step manually to support single step, otherwise gdbstub will die unexpectedly due to "packet unsupported" errors.

Is my understanding correct?

Firstly, as I said before, the spec never guarantees that the behavior of the GDB client needs to change based on the result of vCont?, and we can see from the code that only arm has special treatment for this, so this is only a special case, or you can think of it as a feature of arm. You think it should be very common just because most of your work is done on arm. I don't think it's appropriate to add a global option for a feature of just one specific architecture.

As for 2 and 3, I told you that there is another situation: You must implement breakpoint manually to support single step, and if you implement single step but no breakpoint the single step feature still can't be used on MIPS. Then what value supports_optional_single_step should be set on such architectures?

@daniel5151
Owner Author

So, it does indeed seem that you're misunderstanding the purpose of this guard rail infrastructure. Maybe this is indicative of poor naming or incomplete docs, in which case I'll have to think about what a better name might be...

I think the docs make it clear enough, but maybe you can offer some clarifications / suggestions if the current docs aren't sufficient: https://github.com/daniel5151/gdbstub/blob/3e5d2bc/src/arch.rs#L161-L196

In any case, let me go through your explanation and correct some things:

You use it to represent whether or not the target supports single-stepping as an optional feature.

So, I think this might be a root of confusion. The intent here is that the value this method returns is dependent on whether or not the upstream GDB client supports optional single step for the architecture, which in turn will guide whether or not the target is allowed to treat single step as optional.

With that in mind...

  1. You can choose if you want to implement single step manually. The behavior of GDB client will change based on it.

If you implement single step, gdbstub will send s;S as part of the response to vCont?, notifying the GDB client that the target supports single step. If you don't, it will not.

This behaviour has nothing to do with the value of this guard rail.
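
A minimal sketch of that reply-building step, assuming a hypothetical helper `vcont_reply` and assuming `c;C` are always advertised (the exact set of actions gdbstub reports is not stated in this thread):

```rust
/// Build the stub's reply to `vCont?`.
/// `implements_single_step` reflects whether the target provided the
/// optional single-step extension.
fn vcont_reply(implements_single_step: bool) -> String {
    // Assumption for this sketch: continue actions are always supported.
    let mut reply = String::from("vCont;c;C");
    if implements_single_step {
        // Only advertise stepping when the target can actually do it.
        reply.push_str(";s;S");
    }
    reply
}
```

Note that the function takes no guard rail input at all: what gets advertised depends only on what the target implements.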

2. If you implement single step, then it's obvious that single step can be used. And even if you don't implement single step manually, GDB will emulate it by breakpoint, so it still can be used.

So, clarification: GDB will emulate it by breakpoint, but what kind of breakpoint will depend on whether or not sw breakpoints are supported. If a target doesn't implement single step, AND doesn't implement sw breakpoints, then GDB will literally overwrite instruction memory with undefined instructions to insert a "breakpoint". This is very weird behaviour in some contexts (notably, emulators/hypervisors), which is why the use_implicit_sw_breakpoints guard rail exists. It will loudly complain in this case, and remind the user that they must implement sw breakpoints OR accept the fact that GDB will overwrite memory.
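
The decision described here can be sketched as a small function. The `BreakpointPlan` and `ImplicitSwBreakpointsError` types and the `plan_breakpoints` helper are hypothetical, and the exact semantics of the real use_implicit_sw_breakpoints guard rail are assumed from the description above:

```rust
/// How breakpoints will end up being realised during the session.
#[derive(Debug, PartialEq)]
enum BreakpointPlan {
    /// The target handles breakpoint packets itself.
    ExplicitSwBreakpoints,
    /// The user accepted that GDB will patch breakpoint instructions
    /// directly into target memory.
    ImplicitSwBreakpoints,
}

/// Raised when the target has no sw breakpoint support and the user has
/// not acknowledged the memory-overwriting fallback.
#[derive(Debug, PartialEq)]
struct ImplicitSwBreakpointsError;

fn plan_breakpoints(
    implements_sw_breakpoints: bool,
    use_implicit_sw_breakpoints: bool,
) -> Result<BreakpointPlan, ImplicitSwBreakpointsError> {
    match (implements_sw_breakpoints, use_implicit_sw_breakpoints) {
        (true, _) => Ok(BreakpointPlan::ExplicitSwBreakpoints),
        (false, true) => Ok(BreakpointPlan::ImplicitSwBreakpoints),
        // Complain loudly instead of letting GDB silently overwrite memory.
        (false, false) => Err(ImplicitSwBreakpointsError),
    }
}
```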

And if the target doesn't support it (such as x86), that means:

  1. You must implement single step manually to support single step, otherwise gdbstub will die unexpectedly due to "packet unsupported" errors.

This is exactly correct. On x86, GDB will NOT use breakpoints (software, or implicit sw) to implement single step, and ALSO ignores the response from vCont?, assuming unconditional support for single step. Therefore, the single-step IDET MUST be implemented, or else when the GDB client sends s to gdbstub, that would be an unexpected packet, and cause a runtime error.

Firstly, as I said before, the spec never guarantees the behavior of GDB client need to change base on the result of vCont?,

Right, well, that's where I fundamentally have to disagree. vCont? is clearly a handshake feature negotiation packet, so the correct behaviour of the upstream GDB client is to respect the response from the target. The fact that it doesn't do this for single step on certain (possibly many!) architectures is a bug, and should be fixed upstream.

I suspect the reason no one has reported this in the past is because no one has ever written a truly generic gdbstub yet, with each existing implementation either being tied to a particular execution framework (e.g: QEMU), where single stepping is always supported regardless of platform, or the implementation is tied to a single platform/architecture, so supporting single-step is either in or not, without any configurability.

You must implement breakpoint manually to support single step, and if you implement single step but no breakpoint the single step feature still can't be used on MIPS. Then what value supports_optional_single_step should be set on such architectures?

So, this is exactly the use case for the other guard rail: use_implicit_sw_breakpoints. In this scenario, that other guard rail would fire, and remind you that you need to support some kind of breakpoints, as if you don't, GDB will attempt to emulate them via overwriting memory.

In other words, this scenario you're explaining has nothing to do with optionally supporting single step!


Please let me know if that makes sense, and if there's anything you still don't understand. I'm very open to improving the docs to avoid this sort of confusion in the future.

@bet4it
Contributor

bet4it commented Jan 11, 2022

So the key contradiction between us is still that you focus on spec but I focus on implementation.

According to the spec, GDB should treat single stepping as an optional feature for all architectures, as single stepping can be emulated using temporary breakpoints + regular "continue" resumption.

You think that an ideal GDB client should always respect the result of vCont? on all architectures. If the result includes s;S, the GDB client should use s to step, and if the result doesn't include it, the GDB client should know to use breakpoints to emulate single step.
Currently, the GDB client doesn't work like this on some architectures. You think it's a bug in the GDB client, and you believe the bug will be fixed one day. So you set supports_optional_single_step to false on all the architectures where the client still sends s even if the server reports that it doesn't support it, and in the future, if the GDB client learns to use breakpoints to emulate single step on such architectures, you will set supports_optional_single_step back to true for them. Right?

But it seems you haven't considered the case where your ideal GDB client always implements single step via breakpoints? You didn't tell me what value supports_optional_single_step should have in that case, or do you think the option is meaningless there, so we don't need to set it?


I don't want to debate whether it's a bug. I just want to say that the situation will most likely not change during the lifetime of gdbstub.

I suspect the reason no one has reported this in the past is because no one has ever written a truly generic gdbstub yet, with each existing implementation either being tied to a particular execution framework (e.g: QEMU), where single stepping is always supported regardless of platform, or the implementation is tied to a single platform/architecture, so supporting single-step is either in or not, without any configurability.

I agree with you on this. In fact, I expressed the same meaning before:

Then it's no problem that the implementation only covers part of the spec, right? 99.99% users of GDB client connect to gdbserver that running on a real hardware, and whether the CPU architecture supports hardware single step is deterministic. If the architecture don't support hardware single step, does GDB need to add code about hardware single step? ( although it's still possible when we emulate the architecture with a software way ) And if the architecture support hardware single step, it's not necessary for GDB to emulate single stepping by breakpoints ( although GDB does this on ARM ). If GDB fully considers the result of ;s;S, it could even not find a place to test the code ( where can you find a MIPS CPU that supports hardware single step? ), and nearly no people will use this function.

That's why I think the current situation will most likely not change. But you insist on adding code to deal with it.


I just want to ask you: if I'm a normal user of gdbstub, and I want to implement gdbstub on a MIPS system, how would I know how to implement it? I want to support single step, but after I implement single step, it still can't be used. I don't know what happened, and I find supports_optional_single_step; should I change it? And gdbstub complains about use_implicit_sw_breakpoints, but I don't understand why I need to think about breakpoints when I'm implementing single step.

If you think guard rails are so important, could we add one that forces the user to implement breakpoints if they want to support single step on MIPS?


Another question: I noticed the check of use_implicit_sw_breakpoints is placed in run_state_machine, so gdbstub only aborts after a client has connected. Could we check it beforehand?

@daniel5151
Owner Author

Alright, it seems you're on the same page as me wrt how I'm thinking about the situation. That's a start.

But it seems you haven't considered the case where your ideal GDB client always implements single step via breakpoints? You didn't tell me what value supports_optional_single_step should have in that case, or do you think the option is meaningless there, so we don't need to set it?

Once again, the value of supports_optional_single_step is independent of the particular mechanism by which single stepping is implemented - be it native (e.g: the trap flag on x86), hw/sw breakpoints (via the GDB RSP), or implicit (by overwriting memory). The purpose of this guard rail is to work around the fact that the current mainline upstream GDB client doesn't respect the s;S (or lack thereof) in the vCont? response uniformly across all platforms.

That's it. No more, no less.
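As a rough sketch of what that check boils down to (the function name, parameter names, and error type below are hypothetical, made up for illustration rather than copied from gdbstub):

```rust
/// Hypothetical error returned when the guard rail fires.
#[derive(Debug, PartialEq)]
pub enum GuardRailError {
    /// The GDB client is known to send `s` packets for this arch no matter
    /// what the stub advertised, so single step must be implemented.
    SingleStepRequired,
}

/// `arch_supports_optional_single_step` describes the GDB *client's*
/// behaviour for the architecture; `target_implements_single_step`
/// describes what the stub user actually implemented.
pub fn check_single_step_guard_rail(
    arch_supports_optional_single_step: bool,
    target_implements_single_step: bool,
) -> Result<(), GuardRailError> {
    if !arch_supports_optional_single_step && !target_implements_single_step {
        return Err(GuardRailError::SingleStepRequired);
    }
    Ok(())
}
```

The check never asks *how* single step is implemented, only whether the handler exists on an arch where the client is known to ignore the handshake.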

I will try to make the docs clearer in this respect, and make it more "front and center" that this guard rail is about working around a mainline GDB client bug.


If GDB fully considers the result of ;s;S, it could even not find a place to test the code ( where can you find a MIPS CPU that supports hardware single step? ), and nearly no people will use this function.

A few very pertinent points in regards to this one statement:

  1. The fact that optional single stepping was only introduced 2 years after gdbstub's first release is precisely because it's an incredibly niche optional feature, and almost all implementations do end up implementing support for single step! I'd really like to end this back and forth, because we're splitting hairs on a feature that most folks will never care about in the first place.

  2. In my opinion (and feel free to vehemently disagree), if the GDB client expects a target to support single-step, it must error out if the target doesn't respond with single-step capabilities in response to vCont?, as opposed to ignoring this part of the handshake and blindly sending unsupported packets to the target. That's just sloppy software engineering, and is absolutely a bug in my book - hence why I've filed a bug upstream regarding this issue.

  3. You're once again conflating single step implementation with the purpose of this guard rail. It doesn't matter if MIPS doesn't support architecture-level hardware single step - the point of this guard rail is to protect against the GDB client unconditionally sending s packets on targets that have not implemented support for them. Whether or not it does that on MIPS... I have no idea, I haven't tested it. Someone should check, and if it doesn't then we can loosen the guard rail for the MIPS implementation in gdbstub_arch.

    • And to reiterate: this is a guard rail that only really matters when using off-the-shelf Arch implementations from gdbstub_arch. If you're already implementing your own Arch, then you have free rein over whether this guard rail should be enabled or not.

If you think guard rails are so important, could we add one that forces the user to implement breakpoints if they want to support single step on MIPS?

Dude, that is literally the current behavior of the use_implicit_sw_breakpoints guard rail!! This is what I've been trying to tell you across countless replies?!

If the user implements support for single-step on MIPS (via the set_resume_action_step IDET) without implementing any kind of breakpoint support, the use_implicit_sw_breakpoints guard rail will fire by default to warn them that GDB will attempt to emulate breakpoints by overwriting memory. At this point, they can either opt in to the functionality, or go ahead and implement breakpoints.
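Sketched out, the logic is roughly the following. The struct, enum, and function names are hypothetical, and for simplicity the sketch only models sw breakpoints:

```rust
/// Hypothetical summary of the breakpoint-related choices a target has made.
pub struct BreakpointConfig {
    /// The target implemented explicit sw breakpoint support.
    pub implements_sw_breakpoints: bool,
    /// The user explicitly accepted that GDB may overwrite target memory.
    pub use_implicit_sw_breakpoints: bool,
}

#[derive(Debug, PartialEq)]
pub enum BreakpointGuardRail {
    Pass,
    /// No explicit breakpoints, and no opt-in to implicit ones.
    ImplicitSwBreakpointsNotAcknowledged,
}

pub fn check_breakpoint_guard_rail(cfg: &BreakpointConfig) -> BreakpointGuardRail {
    if cfg.implements_sw_breakpoints || cfg.use_implicit_sw_breakpoints {
        BreakpointGuardRail::Pass
    } else {
        BreakpointGuardRail::ImplicitSwBreakpointsNotAcknowledged
    }
}
```

Whether or not the target implements single step never enters the picture, which is why this guard rail covers the MIPS scenario without any step-specific logic.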

This is an explanation I've included many times, and it seems you're just not listening to me:

#92 (comment)

You must implement breakpoint manually to support single step, and if you implement single step but no breakpoint the single step feature still can't be used on MIPS. Then what value supports_optional_single_step should be set on such architectures?

So, this is exactly the use case for the other guard rail: use_implicit_sw_breakpoints. In this scenario, that other guard rail would fire, and remind you that you need to support some kind of breakpoints, as if you don't, GDB will attempt to emulate them via overwriting memory.

#92 (comment)

  • If the target truly supports optional single step, and doesn't implement SW breakpoint, the use_implicit_sw_breakpoints will remind the user that attempting to use single step / continue in GDB without implementing explicit support for sw breakpoints will result in memory being overwritten, which should be enough of a wakeup call to have the user implement sw breakpoints.

#92 (comment)

After digging into this a bit, I realized that this isn't actually the case. If the target doesn't implement breakpoint packets, the GDB client will not send breakpoint packets, and will instead attempt to overwrite target memory with breakpoint instructions (which would only happen if the user enabled the use_implicit_sw_breakpoints guard rail)


So the key contradiction between us is still that you focus on spec but I focus on implementation.

I dunno what to tell you man.

That's the ethos of my open source project.

If you don't like it, feel free to fork gdbstub. That's what open source is all about.

And for what it's worth, I've recently had the pleasure (displeasure?) of using gdbstub alongside some non-standard GDB clients, at which point the spec became extremely important, since those alternative clients built their implementation based off the spec doc.


Another question: I noticed the check of use_implicit_sw_breakpoints is placed in run_state_machine, so gdbstub only aborts after a client has connected. Could we check it beforehand?

The earliest point where I can insert these checks is after the GdbStub object has been constructed, which currently also requires an active Connection object to be passed to it during init. I'll mull over if there's some way to lift these checks further up the chain, but I suspect that it might not be possible without some kind of "staggered" GdbStubBuilder initialization (something which I probably won't get around to in time for 0.6).

If this is something you'd really like, I can probably expose some kind of "validate_target" method (somewhere) that can be manually invoked to run these checks, though idk if it's really worth it, given that the checks are moreso a development aid, that shouldn't fire during "steady state" operation of the library.

@bet4it
Contributor

bet4it commented Jan 12, 2022

the value of supports_optional_single_step is independent of the particular mechanism by which single stepping is implemented

I'm very clear on that, but I also know that supports_optional_single_step is tied to a specific architecture, so what I'm asking is what supports_optional_single_step should be set to on MIPS if MIPS works as I described. Or you can tell me we don't need to set it on MIPS (so we will use the default false value).


The fact that optional single stepping was only introduced 2 years after gdbstub's first release is precisely because it's an incredibly niche optional feature, and almost all implementations do end up implementing support for single step! I'd really like to end this back and forth, because we're splitting hairs on a feature that most folks will never care about in the first place.

There are two reasons:

  1. If I don't need to implement it, I can just leave it as an empty function if it can't be opted out. It happened when I want to use gdbstub to create a minimal dummy example that can be connected by a normal GDB client previously: I only need to implement read_registers and read_addrs, and leave all other functions empty.
  2. You chose an ARM program as the example and mainly test on it, and nobody else uses gdbstub on MIPS. If you had chosen MIPS, you would find it very necessary: step is totally meaningless on MIPS.

Whether or not it does that on MIPS... I have no idea, I haven't tested it.

I have confirmed it, and I strongly suggest you test it before 0.6.

And to reiterate: this is a guard rail that only really matters when using off-the-shelf Arch implementations from gdbstub_arch. If you're already implementing your own Arch, then you have free rein over whether this guard rail should be enabled or not.

I just don't want to see the situation that you find the name of supports_optional_single_step is improper and need to change it after you release 0.6. Oh, that could be nothing for you🙃 Then you can just do it and ignore me.


Dude, that is literally the current behavior of the use_implicit_sw_breakpoints guard rail!! This is what I've been trying to tell you across countless replies?!

I know that, but how would a new user of gdbstub ever guess that use_implicit_sw_breakpoints is related to single step?

That's what may happen when a new user tries to use gdbstub (#92 (comment)):

I just want to ask you: if I'm a normal user of gdbstub, and I want to implement gdbstub on a MIPS system, how would I know how to implement it? I want to support single step, but after I implement single step, it still can't be used. I don't know what happened, and I find supports_optional_single_step; should I change it? And gdbstub complains about use_implicit_sw_breakpoints; why do I need to think about breakpoints when I'm implementing single step?


And for what it's worth, I've recently had the pleasure (displeasure?) of using gdbstub alongside some non-standard GDB clients, at which point the spec became extremely important, since those alternative clients built their implementation based off the spec doc.

Then it's even more improper to add supports_optional_single_step. You write gdbstub against the spec, but if there is a non-standard GDB client that fully follows the spec here, meaning it respects the response to vCont?, the user of that client will still be faced with the annoying supports_optional_single_step guard rail even though it's meaningless for that client.

And if I write my own GDB client that doesn't fully follow the spec and panics at some point, will you add guard rails for me? Certainly you won't. But why do you add a guard rail for GNU GDB, which panics because it doesn't follow the spec?


If this is something you'd really like, I can probably expose some kind of "validate_target" method (somewhere) that can be manually invoked to run these checks, though idk if it's really worth it, given that the checks are moreso a development aid, that shouldn't fire during "steady state" operation of the library.

Oh, if it's not easy to change we can just leave it as this.

@daniel5151
Owner Author

what I'm asking is what supports_optional_single_step should be set to on MIPS if MIPS works as I described

Based on what you've told me, it should be set to false. If the GDB client has been confirmed to send s packets to MIPS targets even when the target itself has not advertised s support, then this must be set to false, as that means this arch does not support optional single step.

It just so happens that the default value is also false (i.e: "assume single step is required, until proven that it's possible to leave it optional"), which means you could just leave things as is, but it'd be nice to explicitly document this behavior by opening a PR against the corresponding gdbstub_arch implementation.

And like I said, this is not a 0.6 blocker, so if you'd like to open that PR, be my guest.

As part of that PR, please include the protocol logs that show gdbstub sending a response to vCont? that omits s support, and then, later on, the GDB client still sending an s packet (and gdbstub proceeding to fail with an unexpected packet).


  1. If I don't need to implement it, I can just leave it as an empty function if it can't be opted out.

You're conflating single step functionality and single step at the protocol level. If the GDB client requests a single step via the GDB RSP, it's my duty as the library author to notify the user of this request - regardless of how they want to handle it.

...unfortunately, the only way to route this message to the user is via the single step IDET + its associated handler, which if left unimplemented, entirely stubs out the packet-parsing codepath related to the s packet. This is the intended behavior, and the whole point of using IDETs - you don't pay for what you don't use.

Unfortunately, that means that the only notification the user gets in these situations would be an opaque PacketUnexpected error, at which point they proceed to open an issue on GitHub that I have to respond to.

With this guard rail in place, this error is avoided entirely, since the user is notified that the GDB client will not respect the spec-compliant optional nature of single step for their chosen arch, and it's up to them to implement the single-step IDET (even if it is just a stub implementation).
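For anyone unfamiliar with the IDET pattern, here's a stripped-down sketch of why an unimplemented handler turns into an unexpected-packet error. All names below are simplified stand-ins invented for this example, not gdbstub's real traits or signatures:

```rust
/// Optional extension: only targets that opt in implement this.
pub trait SingleStep {
    fn step(&mut self);
}

/// Base target trait. The IDET-style hook returns `None` unless overridden,
/// which is what lets the `s` packet codepath be treated as absent.
pub trait Target {
    fn support_single_step(&mut self) -> Option<&mut dyn SingleStep> {
        None
    }
}

#[derive(Debug, PartialEq)]
pub enum StubError {
    PacketUnexpected,
}

/// Route an incoming `s` packet to the target, if it opted in.
pub fn handle_s_packet<T: Target>(target: &mut T) -> Result<(), StubError> {
    match target.support_single_step() {
        Some(ops) => {
            ops.step();
            Ok(())
        }
        // No handler: the stub has nowhere to route the request.
        None => Err(StubError::PacketUnexpected),
    }
}

/// A target that leaves single step unimplemented.
pub struct NoStep;
impl Target for NoStep {}

/// A target that opts in, even if the handler is little more than a stub.
pub struct WithStep {
    pub steps: u32,
}
impl SingleStep for WithStep {
    fn step(&mut self) {
        self.steps += 1;
    }
}
impl Target for WithStep {
    fn support_single_step(&mut self) -> Option<&mut dyn SingleStep> {
        Some(self)
    }
}
```

With `NoStep`, an `s` packet from a client that ignored the handshake ends in `PacketUnexpected` at runtime; the guard rail exists to surface that problem up front instead.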


I just don't want to see the situation that you find the name of supports_optional_single_step is improper and need to change it after you release 0.6. Oh, that could be nothing for you🙃 Then you can just do it and ignore me.

I don't entirely understand this statement, but ooookay.


I know that, but how would a new user of gdbstub ever guess that use_implicit_sw_breakpoints is related to single step?

I can't keep responding to this question with the same answer. Please see my previous responses and really think about the implications here.

I'm really patient, but responding for a 5th time with the same answer is something I simply cannot do.


Then it's even more improper to add supports_optional_single_step. You write gdbstub against the spec, but if there is a non-standard GDB client that fully follows the spec here, meaning it respects the response to vCont?, the user of that client will still be faced with the annoying supports_optional_single_step guard rail even though it's meaningless for that client.

But why do you add a guard rail for GNU GDB, which panics because it doesn't follow the spec?

This is the reality of software engineering in the real world my friend ¯\_(ツ)_/¯

Specifically, I know for a fact that 99% of users will be using upstream GDB, which means that any upstream GDB bugs that affect the gdbstub experience merit on-by-default work arounds.

In an ideal world, the mainline GDB client would've followed its own spec to a tee, the same way it would expect all other implementations to conform to the spec. Alas, whether by historical accident or human oversight, it hasn't conformed to this edge case of the spec in a looooong time (and still doesn't!), which means that as the author of gdbstub, I'm choosing to work around the current issues present in the GDB client in my library. Not doing so would result in bug reports filed against my project, and I'd like to nip those bug reports in the bud pre-emptively.

That said, if you are one of the rare folks using gdbstub with a non-standard GDB client, I'm offering the flexibility to disable these guard rails yourself, with the promise that You Know What You're Doing ™️

Again, to be perfectly crystal clear: these are guard rails, not maximum security holding cells!
If you don't care for this guard rail (or any other one for that matter), just turn it off!

@daniel5151
Owner Author

Anyways, I'm not sure if I have the time or energy to keep going on in this back and forth.

I'm happy to read + ack any response you might have, but I'm pretty convinced that this functionality should remain in gdbstub and ship as part of 0.6.

If you don't like it, you can just turn it off (or in your case, for MIPS, keep the default behavior, because it sounds like you'll need to support single step packets anyways).

And to be clear: these guard rails are a new idea I'm testing out as part of the 0.6 release. If it turns out folks categorically don't like them, then I might remove them at some point in the future.

I'm not claiming to know what's perfect - I'm only human.

Cheers

@bet4it
Contributor

bet4it commented Jan 14, 2022

5bc5831 + 1c6560e has made all forms of resume optional 🎉

Just found that support_single_step is a part of SingleThreadResume? Can't I only support single step but not continue?
And as I said in #59, we need to rename SingleThreadResume to SingleThreadContinue?

@daniel5151
Owner Author

Skimming through #59, I remember why I went with this approach. It's a two-parter:

Pragmatically:

In part due to the fact that gdbstub uses multi-process extensions under the hood, vCont packets will pretty much always include a -1;c component (i.e: resume all other threads), even in the case of attempting to single-step the target. In order to parse + handle this aspect of the vCont packet, the target needs to implement the continue IDET so I can properly report this condition to the target.
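To illustrate the pragmatic point, here's a simplified sketch of parsing a vCont packet into per-thread actions. It's a toy parser written for this comment, not gdbstub's real one: it only understands the c and s actions (no signals, no range stepping) and treats thread-ids as opaque strings.

```rust
#[derive(Debug, PartialEq)]
pub enum Action {
    Continue,
    Step,
}

#[derive(Debug, PartialEq)]
pub struct VContAction<'a> {
    pub action: Action,
    /// `None` means "default action for all threads not otherwise listed".
    pub thread: Option<&'a str>,
}

/// Parse `vCont;action[:thread-id];...` into a list of actions.
/// Returns `None` on anything this toy parser doesn't understand.
pub fn parse_vcont(packet: &str) -> Option<Vec<VContAction<'_>>> {
    let body = packet.strip_prefix("vCont;")?;
    body.split(';')
        .map(|part| {
            let (kind, thread) = match part.split_once(':') {
                Some((k, t)) => (k, Some(t)),
                None => (part, None),
            };
            let action = match kind {
                "c" => Action::Continue,
                "s" => Action::Step,
                _ => return None,
            };
            Some(VContAction { action, thread })
        })
        .collect()
}

/// A "step one thread" request still needs the continue handler
/// whenever a continue component rides along with it.
pub fn needs_continue_handler(actions: &[VContAction<'_>]) -> bool {
    actions.iter().any(|a| a.action == Action::Continue)
}
```

For a packet such as `vCont;s:p1.1;c`, the parsed list contains both a step and a continue action, so a target that only implemented single step would have no way to be told about the second half.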

Ideally:

The spec states the following:

Stubs that only control single-threaded targets can implement run control with the ‘c’ (continue) command, and if the target architecture supports hardware-assisted single-stepping, the ‘s’ (step) command. Stubs that support multi-threading targets should support the ‘vCont’ command.

Admittedly, this is a bit ambiguous, but I read this statement as: IF you support resuming the target, you MUST support continue, and OPTIONALLY support single-stepping (IF the target supports hardware-assisted single-stepping).


FWIW, I can understand why single step and continue should be orthogonal features, but the reality is that they seem to be inextricably intertwined with one another. I can see a future where I end up circling back to this issue and making the two orthogonal within gdbstub, but doing so would require some refactoring + adding a new guard rail to work around the aforementioned pragmatic points.

This is a niche enough edge case that I'm not going to be blocking 0.6 to change this, but if you feel strongly about this, consider opening a dedicated tracking issue for this.


Oh, and lastly - if you really only want to support single step without continue, you can just ignore continue requests (or treat them as single-step requests). Not ideal, but I think the GDB client should be resilient enough to handle that.

daniel5151 added a commit that referenced this pull request Jan 17, 2022
These changes are guided by discussion in:
- daniel5151/gdb-optional-step-bug#1
- #92

And in addition, address this minor nit:
- 5bc5831#commitcomment-63859961
@daniel5151
Owner Author

Quite late to acknowledge this, but it seems there's actually been some movement on https://sourceware.org/bugzilla/show_bug.cgi?id=28440 (re: how certain platforms don't respect optional single stepping).

I haven't tested it, but things might finally work properly on GDB main? Would be good to check and see if we can update certain docs.

Development

Successfully merging this pull request may close these issues.

Make single-stepping optional