Skip to content

Conversation

@trond-snekvik
Copy link
Contributor

Although the extension is looking for the JLink GDB server to emit a log
line saying that it's waiting for the GDB Client to connect, it's
actually not ready until about 400 milliseconds later.

Occasionally, the JLink Server to loses track of the debug chip's state
right at the beginning of the session, putting the entire server into an
invalid state until the debugger chip is reset with "monitor reset".
This behavior only occurs if the monitor reset command is issued before
400 milliseconds have passed since the "Waiting for GDB connection..."
line is emitted by the JLink Server.

The issue is easiest to recognize through a series of errors in the
Adapter output, where the JLink server reads 0xdeadbeef from all
registers, which is propagated to the UI, making it look like the
program counter is stuck at address 0xdeadbeef or 0xdeadbeee.

To prevent the JLink server from ever ending up in this state, delay the
GDB Client launch with 500ms in the serverLaunchCompleted call. To avoid
breaking the git blame for the entire GDBDebugSession's attachOrLaunch
function, make the server init handler asynchronous, and add an await on
the call to serverLaunchCompleted.

Although the extension is looking for the JLink GDB server to emit a log
line saying that it's waiting for the GDB Client to connect, it's
actually not ready until about 400 milliseconds later.

Occasionally, the JLink Server to loses track of the debug chip's state
right at the beginning of the session, putting the entire server into an
invalid state until the debugger chip is reset with "monitor reset".
This behavior only occurs if the monitor reset command is issued before
400 milliseconds have passed since the "Waiting for GDB connection..."
line is emitted by the JLink Server.

The issue is easiest to recognize through a series of errors in the
Adapter output, where the JLink server reads 0xdeadbeef from all
registers, which is propagated to the UI, making it look like the
program counter is stuck at address 0xdeadbeef or 0xdeadbeee.

To prevent the JLink server from ever ending up in this state, delay the
GDB Client launch with 500ms in the serverLaunchCompleted call. To avoid
breaking the git blame for the entire GDBDebugSession's attachOrLaunch
function, make the server init handler asynchronous, and add an await on
the call to serverLaunchCompleted.
@trond-snekvik
Copy link
Contributor Author

I logged the MI command communication with the GDB Client for 150 runs. A failure looks like something like this:

Typical failing run ```[103] -> 1-gdb-set target-async on [104] -> 2-interpreter-exec console "source /home/trond/vscode/cortex-debug/support/gdbsupport.init" [104] -> 3-interpreter-exec console "set print demangle on" [104] -> 4-interpreter-exec console "set print asm-demangle on" [104] -> 5-target-select extended-remote localhost:50003 [104] -> 6-interpreter-exec console "add-symbol-file \"/home/trond/ncs/zephyr/samples/basic/blinky/build_1/zephyr/zephyr.elf\"" [104] -> 7-interpreter-exec console "monitor halt" [104] -> 8-interpreter-exec console "monitor reset" [104] -> 9-target-download [104] -> 10-interpreter-exec console "monitor reset" [104] -> 11-enable-pretty-printing [108] <- undefined=thread-group-added,id="i1" [108] <- =cmd-param-changed,param="print pretty",value="on" [109] <- =cmd-param-changed,param="history save",value="on" [109] <- =cmd-param-changed,param="history size",value="10000" [109] <- =cmd-param-changed,param="history filename",value="/home/trond/.gdbhistory" [109] <- ~"Reading symbols from /home/trond/ncs/zephyr/samples/basic/blinky/build_1/zephyr/zephyr.hex...\n" [109] <- ~"(No debugging symbols found in /home/trond/ncs/zephyr/samples/basic/blinky/build_1/zephyr/zephyr.hex)\n" [109] <- (gdb) [109] <- 1^done [109] <- (gdb) [112] <- 2^done [112] <- (gdb) [112] <- 3^done [112] <- (gdb) [112] <- =cmd-param-changed,param="print asm-demangle",value="on" [112] <- 4^done [112] <- (gdb) [113] <- =thread-group-started,id="i1",pid="42000" [113] <- =thread-created,id="1",group-id="i1" [125] <- ~"0x00001a14 in ?? ()\n" [125] <- *stopped,frame={addr="0x00001a14",func="??",args=[],arch="arm"},thread-id="1",stopped-threads="all" [125] <- 5^connected [125] <- (gdb) [125] <- ~"add symbol table from file \"/home/trond/ncs/zephyr/samples/basic/blinky/build_1/zephyr/zephyr.elf\"\n" [125] <- ~"(y or n) [answered Y; input not from terminal]\n" [126] <- ~"Reading symbols from /home/trond/ncs/zephyr/samples/basic/blinky/build_1/zephyr/zephyr.elf...\n" [132] <- 6^done [132] <- (gdb) [133] <- 7^done [133] <- (gdb) [202] <- @"Resetting target\r\n" [202] <- 8^done [202] <- (gdb) [203] <- 9+download,{section=".sec1",section-size="24216",total-size="24216"} [203] <- 9+download,{section=".sec1",section-sent="16048",section-size="24216",total-sent="16048",total-size="24216"} [203] <- 9^done,address="0x1a14",load-size="24216",transferred-bits="193728",write-rate="12108" [203] <- (gdb) [449] <- @"Resetting target\r\n" [449] <- 10^done [449] <- (gdb) [449] <- 11^done [449] <- (gdb) [450] -> 12-break-insert "/home/trond/ncs/zephyr/samples/basic/blinky/src/main.c:41" [450] -> 13-break-insert -t --function main [451] <- 12^done,bkpt={number="1",type="breakpoint",disp="keep",enabled="y",addr="0x00000426",func="main",file="/home/trond/ncs/zephyr/samples/basic/blinky/src/main.c",fullname="/home/trond/ncs/zephyr/samples/basic/blinky/src/main.c",line="42",thread-groups=["i1"],times="0",original-location="/home/trond/ncs/zephyr/samples/basic/blinky/src/main.c:41"} [451] <- (gdb) [453] <- 13^done,bkpt={number="2",type="breakpoint",disp="del",enabled="y",addr="",times="0",original-location="-function main"},{number="2.1",enabled="y",addr="0x00000038",thread-groups=["i1"]},{number="2.2",enabled="y",addr="0x00000404",func="main",file="/home/trond/ncs/zephyr/samples/basic/blinky/src/main.c",fullname="/home/trond/ncs/zephyr/samples/basic/blinky/src/main.c",line="31",thread-groups=["i1"]} [453] <- (gdb) [478] -> 14-exec-continue [478] <- 14^running [478] <- *running,thread-id="all" [478] <- (gdb) [486] <- =thread-created,id="2",group-id="i1" [486] <- ~"[New Remote target]\n" [486] <- =thread-exited,id="1",group-id="i1" [486] <- ~"\nThread " [486] <- ~"2 received signal SIGTRAP, Trace/breakpoint trap.\n" [486] <- ~"[Switching to Remote target]\n" [499] <- ~"0xdeadbeee in ?? ()\n" [499] <- *stopped,reason="signal-received",signal-name="SIGTRAP",signal-meaning="Trace/breakpoint trap",frame={addr="0xdeadbeee",func="??",args=[],arch="arm"},thread-id="2",stopped-threads="all" [504] -> 15-thread-list-ids [505] <- 15^done,thread-ids={thread-id="2"},current-thread-id="2",number-of-threads="1" [505] <- (gdb) [505] -> 16-thread-info 2 [505] <- 16^done,threads=[{id="2",target-id="Remote target",frame={level="0",addr="0xdeadbeee",func="??",args=[],arch="arm"},state="stopped"}] [505] <- (gdb) [506] -> 17-data-list-register-names [507] <- 17^done,register-names=["r0","r1","r2","r3","r4","r5","r6","r7","r8","r9","r10","r11","r12","sp","lr","pc","","","","","","","","","","xpsr","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","msp","psp","primask","basepri","faultmask","control","fpscr","s0","s1","s2","s3","s4","s5","s6","s7","s8","s9","s10","s11","s12","s13","s14","s15","s16","s17","s18","s19","s20","s21","s22","s23","s24","s25","s26","s27","s28","s29","s30","s31","d0","d1","d2","d3","d4","d5","d6","d7","d8","d9","d10","d11","d12","d13","d14","d15"] [507] <- (gdb) [537] -> 18-stack-info-depth --thread 2 10000 [538] -> 19-stack-info-depth --thread 2 10000 [538] <- 18^done,depth="1" [538] <- (gdb) [538] <- 19^done,depth="1" [538] <- (gdb) [538] -> 20-stack-list-frames --thread 2 0 0 [539] -> 21-stack-list-frames --thread 2 0 0 [539] <- 20^done,stack=[frame={level="0",addr="0xdeadbeee",func="??",arch="arm"}] [539] <- (gdb) [539] <- 21^done,stack=[frame={level="0",addr="0xdeadbeee",func="??",arch="arm"}] [539] <- (gdb) ```
The numbers at the start of each line is milliseconds since the Jlink server emits its line. Note how the first `monitor reset` command (command number 8) finishes at 202ms.

A successful launch looks like this:

Typical successful run ``` [204] -> 1-gdb-set target-async on [204] -> 2-interpreter-exec console "source /home/trond/vscode/cortex-debug/support/gdbsupport.init" [204] -> 3-interpreter-exec console "set print demangle on" [205] -> 4-interpreter-exec console "set print asm-demangle on" [205] -> 5-target-select extended-remote localhost:50003 [205] -> 6-interpreter-exec console "add-symbol-file \"/home/trond/ncs/zephyr/samples/basic/blinky/build_1/zephyr/zephyr.elf\"" [205] -> 7-interpreter-exec console "monitor halt" [205] -> 8-interpreter-exec console "monitor reset" [205] -> 9-target-download [205] -> 10-interpreter-exec console "monitor reset" [205] -> 11-enable-pretty-printing [208] <- undefined=thread-group-added,id="i1" [210] <- =cmd-param-changed,param="print pretty",value="on" [210] <- =cmd-param-changed,param="history save",value="on" [210] <- =cmd-param-changed,param="history size",value="10000" [210] <- =cmd-param-changed,param="history filename",value="/home/trond/.gdbhistory" [210] <- ~"Reading symbols from /home/trond/ncs/zephyr/samples/basic/blinky/build_1/zephyr/zephyr.hex...\n" [210] <- ~"(No debugging symbols found in /home/trond/ncs/zephyr/samples/basic/blinky/build_1/zephyr/zephyr.hex)\n" [210] <- (gdb) [210] <- 1^done [210] <- (gdb) [212] <- 2^done [212] <- (gdb) [212] <- 3^done [212] <- (gdb) [212] <- =cmd-param-changed,param="print asm-demangle",value="on" [212] <- 4^done [212] <- (gdb) [214] <- =thread-group-started,id="i1",pid="42000" [214] <- =thread-created,id="1",group-id="i1" [349] <- ~"0x00001976 in ?? ()\n" [349] <- *stopped,frame={addr="0x00001976",func="??",args=[],arch="arm"},thread-id="1",stopped-threads="all" [349] <- 5^connected [349] <- (gdb) [349] <- ~"add symbol table from file \"/home/trond/ncs/zephyr/samples/basic/blinky/build_1/zephyr/zephyr.elf\"\n" [349] <- ~"(y or n) [answered Y; input not from terminal]\n" [350] <- ~"Reading symbols from /home/trond/ncs/zephyr/samples/basic/blinky/build_1/zephyr/zephyr.elf...\n" [354] <- 6^done [354] <- (gdb) [354] <- 7^done [354] <- (gdb) [424] <- @"Resetting target\r\n" [424] <- 8^done [424] <- (gdb) [425] <- 9+download,{section=".sec1",section-size="24216",total-size="24216"} [425] <- 9+download,{section=".sec1",section-sent="16048",section-size="24216",total-sent="16048",total-size="24216"} [602] <- 9^done,address="0x1a14",load-size="24216",transferred-bits="193728",write-rate="12108" [602] <- (gdb) [671] <- @"Resetting target\r\n" [671] <- 10^done [671] <- (gdb) [671] <- 11^done [671] <- (gdb) [672] -> 12-break-insert "/home/trond/ncs/zephyr/samples/basic/blinky/src/main.c:41" [672] -> 13-break-insert -t --function main [673] <- 12^done,bkpt={number="1",type="breakpoint",disp="keep",enabled="y",addr="0x00000426",func="main",file="/home/trond/ncs/zephyr/samples/basic/blinky/src/main.c",fullname="/home/trond/ncs/zephyr/samples/basic/blinky/src/main.c",line="42",thread-groups=["i1"],times="0",original-location="/home/trond/ncs/zephyr/samples/basic/blinky/src/main.c:41"} [673] <- (gdb) [675] <- 13^done,bkpt={number="2",type="breakpoint",disp="del",enabled="y",addr="",times="0",original-location="-function main"},{number="2.1",enabled="y",addr="0x00000038",thread-groups=["i1"]},{number="2.2",enabled="y",addr="0x00000404",func="main",file="/home/trond/ncs/zephyr/samples/basic/blinky/src/main.c",fullname="/home/trond/ncs/zephyr/samples/basic/blinky/src/main.c",line="31",thread-groups=["i1"]} [675] <- (gdb) [789] -> 14-exec-continue [789] <- 14^running [789] <- *running,thread-id="all" [790] <- (gdb) [3676] -> 15-exec-interrupt [3677] <- 15^done [3677] <- (gdb) [3684] <- =thread-created,id="2",group-id="i1" [3684] <- ~"[New Remote target]\n" [3684] <- =thread-exited,id="1",group-id="i1" [3684] <- ~"\nThread " [3684] <- ~"2 received signal SIGTRAP, Trace/breakpoint trap.\n" [3684] <- ~"[Switching to Remote target]\n" [3686] <- ~"?? () at /home/trond/ncs/zephyr/arch/arm/core/aarch32/cpu_idle.S:158\n" [3686] <- ~"158\t\tmsr\tBASEPRI, r0\n" [3686] <- *stopped,reason="signal-received",signal-name="SIGTRAP",signal-meaning="Trace/breakpoint trap",frame={addr="0x0000198a",func="??",args=[],file="/home/trond/ncs/zephyr/arch/arm/core/aarch32/cpu_idle.S",fullname="/home/trond/ncs/zephyr/arch/arm/core/aarch32/cpu_idle.S",line="158",arch="arm"},thread-id="2",stopped-threads="all" [3690] -> 16-thread-list-ids [3690] <- 16^done,thread-ids={thread-id="2"},current-thread-id="2",number-of-threads="1" [3690] <- (gdb) [3691] -> 17-thread-info 2 [3691] <- 17^done,threads=[{id="2",target-id="Remote target",frame={level="0",addr="0x0000198a",func="??",args=[],file="/home/trond/ncs/zephyr/arch/arm/core/aarch32/cpu_idle.S",fullname="/home/trond/ncs/zephyr/arch/arm/core/aarch32/cpu_idle.S",line="158",arch="arm"},state="stopped"}] [3691] <- (gdb) [3700] -> 18-data-list-register-names [3701] <- 18^done,register-names=["r0","r1","r2","r3","r4","r5","r6","r7","r8","r9","r10","r11","r12","sp","lr","pc","","","","","","","","","","xpsr","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","msp","psp","primask","basepri","faultmask","control","fpscr","s0","s1","s2","s3","s4","s5","s6","s7","s8","s9","s10","s11","s12","s13","s14","s15","s16","s17","s18","s19","s20","s21","s22","s23","s24","s25","s26","s27","s28","s29","s30","s31","d0","d1","d2","d3","d4","d5","d6","d7","d8","d9","d10","d11","d12","d13","d14","d15"] [3701] <- (gdb) [3704] -> 19-stack-info-depth --thread 2 10000 [3706] -> 20-data-list-register-values N [3712] <- 19^done,depth="9" [3712] <- (gdb) [3712] -> 21-stack-list-frames --thread 2 0 8 [3728] <- 20^done,register-values=[{number="0",value="32"},{number="1",value="0"},{number="2",value="1"},{number="3",value="0"},{number="4",value="1"},{number="5",value="2"},{number="6",value="32"},{number="7",value="0"},{number="8",value="0"},{number="9",value="0"},{number="10",value="0"},{number="11",value="0"},{number="12",value="0"},{number="13",value="0x20001c70 "},{number="14",value="4933"},{number="15",value="0x198a"},{number="25",value="1627389952"},{number="91",value="536878432"},{number="92",value="536878192"},{number="93",value="1"},{number="94",value="0"},{number="95",value="0"},{number="96",value="2"},{number="97",value="0"},{number="98",value="0"},{number="99",value="0"},{number="100",value="0"},{number="101",value="0"},{number="102",value="0"},{number="103",value="0"},{number="104",value="0"},{number="105",value="0"},{number="106",value="0"},{number="107",value="0"},{number="108",value="0"},{number="109",value="0"},{number="110",value="0"},{number="111",value="0"},{number="112",value="0"},{number="113",value="0"},{number="114",value="0"},{number="115",value="0"},{number="116",value="0"},{number="117",value="0"},{number="118",value="0"},{number="119",value="0"},{number="120",value="0"},{number="121",value="0"},{number="122",value="0"},{number="123",value="0"},{number="124",value="0"},{number="125",value="0"},{number="126",value="0"},{number="127",value="0"},{number="128",value="0"},{number="129",value="0"},{number="130",value="0"},{number="131",value="0"},{number="132",value="0"},{number="133",value="0"},{number="134",value="0"},{number="135",value="0"},{number="136",value="0"},{number="137",value="0"},{number="138",value="0"},{number="139",value="0"},{number="140",value="0"},{number="141",value="0"},{number="142",value="0"},{number="143",value="0"},{number="144",value="0"},{number="145",value="0"}] [3728] <- (gdb) [3730] <- 21^done,stack=[frame={level="0",addr="0x0000198a",func="??",file="/home/trond/ncs/zephyr/arch/arm/core/aarch32/cpu_idle.S",fullname="/home/trond/ncs/zephyr/arch/arm/core/aarch32/cpu_idle.S",line="158",arch="arm"},frame={level="1",addr="0x00001344",func="k_cpu_atomic_idle",file="/home/trond/ncs/zephyr/include/kernel.h",fullname="/home/trond/ncs/zephyr/include/kernel.h",line="5441",arch="arm"},frame={level="2",addr="0x00001344",func="lfclk_spinwait",file="/home/trond/ncs/zephyr/drivers/clock_control/clock_control_nrf.c",fullname="/home/trond/ncs/zephyr/drivers/clock_control/clock_control_nrf.c",line="505",arch="arm"},frame={level="3",addr="0x000013f6",func="z_nrf_clock_control_lf_on",file="/home/trond/ncs/zephyr/drivers/clock_control/clock_control_nrf.c",fullname="/home/trond/ncs/zephyr/drivers/clock_control/clock_control_nrf.c",line="561",arch="arm"},frame={level="4",addr="0x0000154a",func="sys_clock_driver_init",file="/home/trond/ncs/zephyr/drivers/timer/nrf_rtc_timer.c",fullname="/home/trond/ncs/zephyr/drivers/timer/nrf_rtc_timer.c",line="349",arch="arm"},frame={level="5",addr="0x00002edc",func="z_sys_init_run_level",file="/home/trond/ncs/zephyr/kernel/device.c",fullname="/home/trond/ncs/zephyr/kernel/device.c",line="84",arch="arm"},frame={level="6",addr="0x00003134",func="z_cstart",file="/home/trond/ncs/zephyr/kernel/init.c",fullname="/home/trond/ncs/zephyr/kernel/init.c",line="437",arch="arm"},frame={level="7",addr="0x000019cc",func="z_arm_prep_c",file="/home/trond/ncs/zephyr/arch/arm/core/aarch32/prep_c.c",fullname="/home/trond/ncs/zephyr/arch/arm/core/aarch32/prep_c.c",line="189",arch="arm"},frame={level="8",addr="0x00001a5c",func="??",file="/home/trond/ncs/zephyr/arch/arm/core/aarch32/cortex_m/reset.S",fullname="/home/trond/ncs/zephyr/arch/arm/core/aarch32/cortex_m/reset.S",line="154",arch="arm"}] [3730] <- (gdb) ```

56/158 runs failed in my trials, and the reset command was always issued some time between 99 and 302 ms after the JLink "ready" line, while the successful runs always saw the reset command issued 500-1000ms after the line was fired.

@haneefdm
Copy link
Collaborator

Btw, I made a bunch of changes to start/stop/disconnect areas, now that VSCode/Debug-Protocol supports different ways to Stop/Disconnect for both Launch/Attach sessions. I have used most of my changes to the master. You may want to make sure my changes are compatible with yours.

It is getting super complicated for my taste. This whole thing should be refactored and re-written at some point. The bandaids are piling up.

@haneefdm haneefdm merged commit bb2fd43 into Marus:master Sep 23, 2021
@trond-snekvik trond-snekvik deleted the deadbeef branch September 24, 2021 14:04
@trond-snekvik
Copy link
Contributor Author

You may want to make sure my changes are compatible with yours.
I'll double check this when I have time, but it works well on current master, at least 👍

It is getting super complicated for my taste. This whole thing should be refactored and re-written at some point. The bandaids are piling up.

Yeah, I see what you mean. I'm not super eager to do big changes to parts that are this central though, out of fear of breaking adapter/board combinations I can't test locally. I'd be happy to assist in this though, if I can.

@haneefdm
Copy link
Collaborator

Me too. Super scared of breaking stuff. But at some point, we have to do this. I am waiting for the insiders/beta release mechanism to be adopted universally by VSCode extensions. It is getting super close to getting adopted.

We are also using an obsolete linting method. It has been abandoned and it is a HUGE pain to move our code base to the new one. I mean thousands of errors (some serious) and warnings and auto-fixes did not work or it did the wrong thing.

mayjs pushed a commit to mayjs/cortex-debug that referenced this pull request Apr 6, 2022
Although the extension is looking for the JLink GDB server to emit a log
line saying that it's waiting for the GDB Client to connect, it's
actually not ready until about 400 milliseconds later.

Occasionally, the JLink Server to loses track of the debug chip's state
right at the beginning of the session, putting the entire server into an
invalid state until the debugger chip is reset with "monitor reset".
This behavior only occurs if the monitor reset command is issued before
400 milliseconds have passed since the "Waiting for GDB connection..."
line is emitted by the JLink Server.

The issue is easiest to recognize through a series of errors in the
Adapter output, where the JLink server reads 0xdeadbeef from all
registers, which is propagated to the UI, making it look like the
program counter is stuck at address 0xdeadbeef or 0xdeadbeee.

To prevent the JLink server from ever ending up in this state, delay the
GDB Client launch with 500ms in the serverLaunchCompleted call. To avoid
breaking the git blame for the entire GDBDebugSession's attachOrLaunch
function, make the server init handler asynchronous, and add an await on
the call to serverLaunchCompleted.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants