Prevent runToEntryPoint from hanging indefinitely #489

trond-snekvik · 2021-09-23T16:16:23Z

If the user passes a function name to runToEntry for a function that
doesn't get called on startup, the debugger will hang indefinitely,
waiting for the breakpoint to be hit, and the UI won't respond to any
input until it does.

This has also been observed with valid breakpoints in the Zephyr RTOS:
platformio/platformio-vscode-ide#2582
This appears to be a bug in the JLink server, which occasionally fails
to set breakpoints that have multiple locations. Zephyr RTOS
applications contain a weak main function that the kernel will fall back
to if the user doesn't declare a main function themselves. If -Og is
enabled in the build, this weak function's debug symbol will still hang
around, causing -break-insert -t --function main to insert a breakpoint
at two addresses - one for the actual main(), and one at 0x0 for the
discarded symbol.
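
The diff itself is not reproduced in this thread, so the following is only a minimal sketch of one way to bound such a wait, not the code that was merged. The helper name `withTimeout`, its parameters, and the idea of wrapping the "breakpoint hit" wait in a promise are assumptions made for illustration.

```typescript
// Hypothetical helper (not from the PR): rejects if `work` has not settled
// within `ms` milliseconds, so a caller never waits forever.
function withTimeout<T>(work: Promise<T>, ms: number, what: string): Promise<T> {
    let timer: ReturnType<typeof setTimeout> | undefined;
    const timeout = new Promise<never>((_resolve, reject) => {
        timer = setTimeout(
            () => reject(new Error(`${what} timed out after ${ms} ms`)),
            ms
        );
    });
    // Whichever promise settles first wins; the timer is always cleaned up.
    return Promise.race([work, timeout]).finally(() => {
        if (timer !== undefined) {
            clearTimeout(timer);
        }
    });
}
```

A caller could wrap whatever promise represents "the entry breakpoint was hit" and, on rejection, report the failure to the UI instead of leaving it unresponsive.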

haneefdm · 2021-09-23T16:24:56Z

Btw, we have a list of function names available to us. C++/Rust mangling may get in the way but we have the info. Default is to de-mangle.

We could check upfront to see if runToEntryPoint is invalid. But we can't be 1000% sure that we have all the information. Close to 99% sure we have it.

But, what you did is also needed just in case.
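
A rough sketch of such an upfront check, assuming the symbol list is available as an array of demangled entries. The `SymbolEntry` shape and the `findEntryCandidates` name are hypothetical, not the extension's real types. Because the list may be incomplete, an empty result is better treated as a warning than as a hard error.

```typescript
// Hypothetical shape of one entry in the symbol list (illustration only).
interface SymbolEntry {
    name: string;        // demangled name
    address: number;
    isFunction: boolean;
}

// Returns every function symbol whose demangled name matches `entry`.
// 0 matches: the name is probably wrong; >1 matches: multiple locations.
function findEntryCandidates(
    symbols: readonly SymbolEntry[],
    entry: string
): SymbolEntry[] {
    return symbols.filter((s) => s.isFunction && s.name === entry);
}

// Example data mirroring the Zephyr case described above: a real main()
// plus a leftover symbol at 0x0. The 0x8000 address is made up.
const exampleSymbols: SymbolEntry[] = [
    { name: "main", address: 0x8000, isFunction: true },
    { name: "main", address: 0x0, isFunction: true },
    { name: "counter", address: 0x20000000, isFunction: false },
];
const candidates = findEntryCandidates(exampleSymbols, "main");
```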

trond-snekvik · 2021-09-24T14:11:28Z

> We could check upfront to see if runToEntryPoint is invalid.

That's a good idea! I wasn't aware of the existing symbol lookup mechanism; it looks promising. The issue that triggered this PR is still unresolved and probably needs a fix on Segger's side. The main problem for me is that the breakpoint is valid, but for some reason my JLink doesn't detect it. This workaround kicks the VS Code debugger UI out of its non-responsive state when this happens, but I still don't get the expected behavior.

haneefdm · 2021-09-24T19:25:59Z

Question: Doesn't GDB ask for breakpoints in terms of addresses? It does not use names. One problem that gdb-servers have is that they don't have access to the symbol table or even the executable being used.

From what I remember, gdb is the one who determines the address and asks the gdb-server to set the breakpoints. I am sure you enabled logs for the JLink server to monitor the traffic between gdb and the server.

trond-snekvik · 2021-09-29T10:33:19Z

> Question: Doesn't GDB ask for breakpoints in terms of addresses?

Not necessarily. For the run-to-main breakpoint, GDB just requests -break-insert -t --function main, and it's up to the server to determine what this means for the debug chip. In my case, the GDB server reports two breakpoints coming back.

I have gathered some more information, and the issue actually appears to be a CPU exception occurring when the CPU is halted at the breakpoint. I thought this was just the CPU ignoring the breakpoint, and moving into main, but it's actually faulting the CPU somehow. I have reported this to Segger. Hopefully, it's something that can be fixed on their side.

haneefdm · 2021-09-29T12:57:57Z

Now, I am pretty sure gdb is the one that converts the function names to addresses. Just as in C++, a single name can resolve to multiple addresses.

I have edited the OpenOCD source pretty heavily and I am certain it does not even know the path to the executable. I had to add code to query gdb for symbol addresses in support of an RTOS and fix bugs there. Expecting servers to know the super complicated elf/dwarf/other formats would be a bit of an ask. Anyway, it doesn't matter.

trond-snekvik · 2021-09-29T13:09:20Z

Sorry, you're right! The communication I was referencing is happening between the DAP process and GDB, not GDB and the GDB server. There are too many processes in this chain 😄

The communication between GDB and the GDB server is in this binary format that I don't have a dissector for, so I haven't been able to confirm exactly what the server gets told, but you're right, it makes sense that the address is resolved by GDB itself.

haneefdm · 2021-09-29T16:24:33Z

https://sourceware.org/gdb/current/onlinedocs/gdb/Server.html

It is painful to look at, but see section 20.3.3, monitor set debug and monitor set remote-debug. I put these in my .gdbinit or run them from the Debug Console. The transactions between gdb and the server are not binary, btw. They are just very terse text.

Also, JLink and OpenOCD (-v3/debug_level or something like that) have their own flags to be verbose. You may get more than you like to see :-)
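
To illustrate how terse that text is: a remote protocol packet is framed as `$`, the payload, `#`, and a two-digit hex checksum (the sum of the payload bytes modulo 256). The sketch below frames a breakpoint request by address. The function names are made up for illustration, and the choice of `Z1` (hardware breakpoint) with kind `2` (16-bit Thumb on ARM) is an assumption about what a given session would send, not something captured from the logs discussed here.

```typescript
// Frames an ASCII payload as a GDB remote protocol packet: $<payload>#<checksum>.
function frameRspPacket(payload: string): string {
    let sum = 0;
    for (let i = 0; i < payload.length; i++) {
        // Checksum is the modulo-256 sum of the payload characters.
        sum = (sum + payload.charCodeAt(i)) & 0xff;
    }
    return "$" + payload + "#" + sum.toString(16).padStart(2, "0");
}

// Hypothetical helper: builds an "insert hardware breakpoint" request.
// Note that it carries an address, not a function name.
function hardwareBreakpointPacket(address: number): string {
    return frameRspPacket("Z1," + address.toString(16) + ",2");
}

// frameRspPacket("g") → "$g#67" (the "read registers" request)
```

The point of the example is that the server only ever sees an address in the `Z` packet; resolving `main` to that address happens inside gdb.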

haneefdm merged commit 00124a7 into Marus:master Sep 23, 2021

trond-snekvik deleted the run_to_entry_timeout branch September 24, 2021 14:11
