@trond-snekvik
Contributor

If the user passes a function name to runToEntry for a function that
doesn't get called on startup, the debugger will hang indefinitely,
waiting for the breakpoint to be hit, and the UI won't respond to any
input until it does.

This has also been observed with valid breakpoints in the Zephyr RTOS:
platformio/platformio-vscode-ide#2582
This appears to be a bug in the JLink server, which occasionally fails
to set breakpoints that have multiple locations. Zephyr RTOS
applications contain a weak main function that the kernel will fall back
to if the user doesn't declare a main function themselves. If -Og is
enabled in the build, this weak function's debug symbol will still hang
around, causing -break-insert -t --function main to insert a breakpoint
at two addresses - one for the actual main(), and one at 0x0 for the
discarded symbol.
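
As an illustration of the workaround described above, here is a minimal sketch of bounding the wait for the entry breakpoint with a timeout. It is not taken from the merged commit: `withTimeout`, `neverHit`, and `demo` are hypothetical names, and the 50 ms value is chosen only to keep the example short.

```typescript
// Hypothetical helper: reject if `work` has not settled within `ms` milliseconds.
function withTimeout<T>(work: Promise<T>, ms: number, what: string): Promise<T> {
    return new Promise<T>((resolve, reject) => {
        const timer = setTimeout(() => {
            reject(new Error(`Timed out after ${ms} ms waiting for ${what}`));
        }, ms);
        work.then(
            (value) => { clearTimeout(timer); resolve(value); },
            (err) => { clearTimeout(timer); reject(err); }
        );
    });
}

// Stand-in for "continue and wait for the temporary breakpoint to be hit".
// It never settles, which models a function that is not called on startup.
const neverHit = new Promise<void>(() => { /* intentionally never settles */ });

async function demo(): Promise<string> {
    try {
        await withTimeout(neverHit, 50, "breakpoint at main");
        return "stopped at entry";
    } catch (e) {
        // On timeout the caller can hand control back to the UI instead of hanging.
        return (e as Error).message;
    }
}
```

With a bound like this, a breakpoint that is never reached becomes an error the UI can report, rather than a wait that blocks all input.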

@haneefdm
Collaborator

Btw, we have a list of function names available to us. C++/Rust mangling may get in the way but we have the info. Default is to de-mangle.

We could check upfront to see if runToEntryPoint is invalid. But we can't be 1000% sure that we have all the information. Close to 99% sure we have it.

But, what you did is also needed just in case.
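
A rough sketch of the upfront check suggested here, assuming the symbol list is available as an array of demangled entries. The `SymbolEntry` shape and `checkRunToEntryPoint` are hypothetical and do not reflect the extension's actual symbol table API.

```typescript
// Hypothetical shape of one symbol; the real symbol table may look different.
interface SymbolEntry {
    name: string;        // demangled name
    address: number;
    isFunction: boolean;
}

type EntryCheck = "found" | "not-found";

// Report whether `entry` names a known function.
// "not-found" is best treated as a warning rather than a hard error,
// because the symbol information is not guaranteed to be complete.
function checkRunToEntryPoint(symbols: SymbolEntry[], entry: string): EntryCheck {
    const found = symbols.some((s) => s.isFunction && s.name === entry);
    return found ? "found" : "not-found";
}

const exampleSymbols: SymbolEntry[] = [
    { name: "main", address: 0x08000400, isFunction: true },
    { name: "counter", address: 0x20000010, isFunction: false },
];
```

Because the lookup cannot be fully trusted, a check like this complements the timeout rather than replacing it.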

@haneefdm haneefdm merged commit 00124a7 into Marus:master Sep 23, 2021
@trond-snekvik
Contributor Author

> We could check upfront to see if runToEntryPoint is invalid.

That's a good idea! I wasn't aware of the existing symbol lookup mechanism; it looks promising. The issue that triggered this PR is still unresolved and probably needs a fix on Segger's side. The main problem for me is that the breakpoint is valid, but my JLink doesn't detect it for some reason. This workaround kicks the VS Code debugger UI out of its non-responsive state when that happens, but I still don't get the expected behavior.

@trond-snekvik trond-snekvik deleted the run_to_entry_timeout branch September 24, 2021 14:11
@haneefdm
Collaborator

Question: Doesn't GDB ask for breakpoints in terms of addresses? It does not use names. One problem that gdb-servers have is that they don't have access to the symbol table or even the executable being used.

From what I remember, gdb is the one who determines the address and asks the gdb-server to set the breakpoints. I am sure you enabled logs for the JLink server to monitor the traffic between gdb and the server.

@trond-snekvik
Contributor Author

> Question: Doesn't GDB ask for breakpoints in terms of addresses?

Not necessarily. For the run-to-main breakpoint, GDB just requests -break-insert -t --function main, and it's up to the server to determine what this means for the debug chip. In my case, the GDB server reports two breakpoints coming back.

I have gathered some more information, and the issue actually appears to be a CPU exception occurring when the CPU is halted at the breakpoint. I thought this was just the CPU ignoring the breakpoint, and moving into main, but it's actually faulting the CPU somehow. I have reported this to Segger. Hopefully, it's something that can be fixed on their side.

@haneefdm
Collaborator

Now, I am pretty sure gdb is the one that converts the function names to addresses. Just like in C++, a single name can resolve to multiple addresses.

I have edited OpenOCD source pretty heavily and I am certain it does not even know the path to the executable. I had to add code to query gdb for symbol addresses in support of an RTOS and fix bugs there. Expecting servers to know the super complicated elf/dwarf/other formats would be a bit of an ask. Anyways, it doesn't matter
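
To make the "one name, several addresses" point concrete, here is a small sketch that resolves a name against a symbol list and sets aside locations at address 0, as in the Zephyr weak `main` case from the PR description. The types and function names are hypothetical, and treating address 0 as suspicious is an assumption for this example, not a general rule.

```typescript
// Hypothetical symbol record used only for this illustration.
interface FunctionSymbol {
    name: string;
    address: number;
}

// Collect every address that a name resolves to.
function resolveLocations(symbols: FunctionSymbol[], name: string): number[] {
    return symbols.filter((s) => s.name === name).map((s) => s.address);
}

// Assumption: a function located at address 0 is a leftover from a discarded symbol.
function splitSuspicious(addresses: number[]): { plausible: number[]; suspicious: number[] } {
    return {
        plausible: addresses.filter((a) => a !== 0),
        suspicious: addresses.filter((a) => a === 0),
    };
}

const zephyrLike: FunctionSymbol[] = [
    { name: "main", address: 0x0 },        // discarded weak main, debug info left behind
    { name: "main", address: 0x08000400 }, // the application's main
    { name: "z_cstart", address: 0x08000200 },
];
```

Here `resolveLocations(zephyrLike, "main")` yields two addresses, and `splitSuspicious` separates the one at 0 from the real one.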

@trond-snekvik
Contributor Author

Sorry, you're right! The communication I was referencing is happening between the DAP process and GDB, not GDB and the GDB server. There are too many processes in this chain 😄

The communication between GDB and the GDB server is in this binary format that I don't have a dissector for, so I haven't been able to confirm exactly what the server gets told, but you're right, it makes sense that the address is resolved by GDB itself.

@haneefdm
Collaborator

https://sourceware.org/gdb/current/onlinedocs/gdb/Server.html

It is painful to look at, but see section 20.3.3, monitor set debug and monitor set remote-debug. I put these in my .gdbinit or run them from the Debug Console. The transactions between gdb and the server are not binary, btw. They are just very terse text.

Also, JLink and OpenOCD (-v3/debug_level or something like that) have their own flags to be verbose. You may get more than you like to see :-)
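
To show how terse that text is, here is a small sketch of how a GDB remote serial protocol packet is framed: `$`, the payload, `#`, then a two-digit hex checksum (the sum of the payload bytes modulo 256). The helper names are made up for this example, and the address and kind in the usage note are illustrative values, not taken from a real trace.

```typescript
// Checksum: sum of the payload's character codes modulo 256, as two lowercase hex digits.
function rspChecksum(payload: string): string {
    let sum = 0;
    for (let i = 0; i < payload.length; i++) {
        sum = (sum + payload.charCodeAt(i)) & 0xff;
    }
    return (sum < 16 ? "0" : "") + sum.toString(16);
}

// Frame a payload as '$' + payload + '#' + checksum.
function framePacket(payload: string): string {
    return "$" + payload + "#" + rspChecksum(payload);
}

// Build a "Z0,addr,kind" packet, the request to insert a software breakpoint.
// Note that it carries an address, not a function name.
function insertBreakpointPacket(address: number, kind: number): string {
    return framePacket("Z0," + address.toString(16) + "," + kind.toString(16));
}
```

For example, `framePacket("OK")` gives `$OK#9a`, and `insertBreakpointPacket(0x08000400, 2)` starts with `$Z0,8000400,2#`. This matches the point made earlier in the thread that the server is asked for addresses while gdb does the symbol resolution.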

mayjs pushed a commit to mayjs/cortex-debug that referenced this pull request Apr 6, 2022