twr_nranges_tdma consistent crash on startup #8

Open
altonova opened this issue Aug 17, 2020 · 10 comments

@altonova

altonova commented Aug 17, 2020

When running the example app twr_nranges_tdma (1 master, 3 slaves and 1 tag node), I always get a crash exception on startup. It occurs on at least the master and the tag, which I am monitoring from a PC. My hardware platform is the Decawave MDEK1001 evaluation kit. The GDB exception and RTT output are included below. I'd greatly appreciate any help in discovering the source of the issue. Thanks!

Reading symbols from bin/targets/nrng_tag/app/apps/twr_nranges_tdma/twr_nranges_tdma.elf...
os_tick_idle (ticks=33) at repos/apache-mynewt-core/hw/mcu/nordic/nrf52xxx/src/hal_os_tick.c:160
160 if (ticks > 0) {
Resetting target
0x000000dc in ?? ()
(gdb) continue
Continuing.

Program received signal SIGTRAP, Trace/breakpoint trap.
hal_system_reset () at repos/apache-mynewt-core/hw/mcu/nordic/nrf52xxx/src/hal_system.c:56
56 asm("bkpt");
(gdb) quit
A debugging session is active.


{"utime": 34566581,"seq": 30,"uid": 7211,"ouid": [16663,16556],"rng": [0.112,1.252]}
{"utime": 34593475,"seq": 31,"uid": 7211,"ouid": [16663,16556],"rng": [0.082,1.235]}
{"utime": 34620354,"seq": 32,"uid": 7211,"ouid": [16663,16556],"rng": [0.098,1.266]}
{"utime": 34647248,"seq": 33,"uid": 7211,"ouid": [16663,16556],"rng": [0.114,1.228]}
{"utime": 34674128,"seq": 34,"uid": 7211,"ouid": [16663,16556],"rng": [0.063,1.268]}
{"utime": 34701015,"seq": 35,"uid": 7211,"ouid": [16663,16556],"rng": [0.107,1.292]}
{"utime": 34727909,"seq": 36,"uid": 7211,"ouid": [16663,16556],"rng": [0.114,1.214]}
{"utime": 34754789,"seq": 37,"uid": 7211,"ouid": [16663,16556],"rng": [0.089,1.310]}
{"utime": 34781683,"seq": 38,"uid": 7211,"ouid": [16663,16556],"rng": [0.082,1.264]}
{"utime": 34830405,"seq": 0,"ouid": [0,0,2599,0,0],"rng": [0.000,0.000,0.000,0.000,0.000]}
{"utime": 34830405,"seq": 0,"ouid": [325,0,325,0,325,0,325],"rng": [0.000,0.000,0.000,0.000,0.000,0.000,0.000]}
{"utime": 34830405,"seq": 0,"ouid": [325,0,325,0],"rng": [0.000,0.000,0.000,0.000]}
{"utime": 34830405,"seq": 0,"ouid": [325,0,325,0],"rng": [0.000,0.000,0.000,0.000]}
{"utime": 34830405,"seq": 0,"ouid": [20688,56571,18441,18698],"rng": [0.000,9259532.000,-1.470,0.000]}
{"utime": 34830405,"seq": 0,"ouid": [597,0,1077,0],"rng": [32840.031,0.000,0.000,0.000]}
004461 Unhandled interrupt (3), exception sp 0x20000c98
004461 r0:0x20000890 r1:0x20002ccc r2:0x00000000 r3:0x30333734
004461 r4:0x20000890 r5:0x20002ccc r6:0x200008dc r7:0x200008c4
004461 r8:0x00000000 r9:0x00000100 r10:0x00000000 r11:0x00000000
004461 r12:0x00000000 lr:0x0001bfc3 pc:0x30333734 psr:0x20000000
004461 ICSR:0x00421803 HFSR:0x40000000 CFSR:0x00000100
004461 BFAR:0xe000ed38 MMFAR:0xe000ed34

@chepo92

chepo92 commented Oct 9, 2020

I can reproduce this with both {1 master, 1 slave, 1 tag} and {1 master only}.
Here is the gdb list output:

(gdb) C
Continuing.

Program received signal SIGTRAP, Trace/breakpoint trap.
hal_system_reset () at repos/apache-mynewt-core/hw/mcu/nordic/nrf52xxx/src/hal_system.c:56
56                  asm("bkpt");
(gdb) list
51              if (hal_debugger_connected()) {
52                  /*
53                   * If debugger is attached, breakpoint here.
54                   */
55      #if !MYNEWT_VAL(MCU_DEBUG_IGNORE_BKPT)
56                  asm("bkpt");
57      #endif
58              }
59              NVIC_SystemReset();
60          }

and the backtrace:

(gdb) bt
#0  hal_system_reset () at repos/apache-mynewt-core/hw/mcu/nordic/nrf52xxx/src/hal_system.c:56
#1  0x00009288 in os_default_irq (tf=0x2000ffc0)
    at repos/apache-mynewt-core/kernel/os/src/arch/cortex_m4/os_fault.c:249
#2  0x0000aeb6 in os_default_irq_asm () at repos/apache-mynewt-core/kernel/os/src/arch/cortex_m4/m4/HAL_CM4.s:260
#3  0xffffffec in ?? ()
Backtrace stopped: previous frame identical to this frame (corrupt stack?)

@chepo92

chepo92 commented Oct 20, 2020

With only 1 master node I get the following:

{"utime": 10,"msg": "dw1000_dev_init"}
{"utime": 1223322,"msg": "dw1000_pkg_init"}
{"utime": 1231604,"msg": "uwb_ccp_pkg_init"}
{"utime": 1232447,"msg": "uwb_wcs_pkg_init"}
{"utime": 1233193,"msg": "tdma_pkg_init"}
{"utime": 1234052,"msg": "pan_pkg_init"}
{"utime": 1234643,"msg": "rng_pkg_init"}
{"utime": 1235243,"msg": "wcs_timescale_pkg_init"}
{"utime": 1236297,"msg": "nmgr_uwb_init"}
{"utime": 1236883,"msg": "nrng_pkg_init"}
{"utime": 1237684,"msg": "ss_nrng_pkg_init"}
{"utime": 1238295,"msg": "survey_pkg_init"}
{"utime": 1244058,"exec": "apps/twr_nranges_tdma/src/main.c"}
{"device_id":"DECA0130","panid":"DECA","addr":"FBA","part_id":"CC500FBA","lot_id":"82B3107"}
{"utime": 1244058,"msg": "frame_duration = 201 usec"}
{"utime": 1244058,"msg": "SHR_duration = 139 usec"}
{"utime": 1244058,"msg": "holdoff = 821 usec"}
{"utime": 153989005,"seq": 7,"ouid": [0,0,2619,0,0],"rng": [0.000,0.000,0.000,0.000,0.000]}
{"utime": 153989005,"seq": 7,"ouid": [325,0,325,0],"rng": [0.000,0.000,0.000,0.000]}
{"utime": 153989005,"seq": 7,"ouid": [325,0,325,0],"rng": [0.000,0.000,0.000,0.000]}
{"utime": 153989005,"seq": 7,"ouid": [325,0,325,0],"rng": [0.000,0.000,0.000,0.000]}
{"utime": 153989005,"seq": 7,"ouid": [20688,56571,18441,18698],"rng": [0.000,9259532.000,-1.470,0.000]}
{"utime": 153989005,"seq": 7,"ouid": [601,0,1081,0],"rng": [32840.031,0.000,0.000,0.000]}
{"utime": 153989005,"seq": 7,"ouid": [9592,18288,19208,26651,2011,54528,48640,62399,36687,18694,19206,26826,62466],"rng": [-2147483648.2147483648,-0.503,-2147483648.2147483648,2147483647.2147483647,-0.501,0.000,257.877,-2147483648.2147483648,61477.468,2147
{"utime": 153989005,"seq": 7,"ouid": [19202,26648,61440,1,18288,48896,60912,57344,61519,17280],"rng": [2147483647.2147483647,2147483647.2147483647,2147483647.2147483647,-0.000,-2147483648.2147483648,-2147483648.2147483648,-2147483648.2147483648,0.000,21474
019713 Unhandled interrupt (3), exception sp 0x20002448
019713  r0:0x00000101  r1:0x2000218f  r2:0x0000007f  r3:0x0000d069
019713  r4:0x000000a0  r5:0x20005c00  r6:0x20006ad0  r7:0x200063d0
019713  r8:0x0000fb5d  r9:0x20006470 r10:0x00000000 r11:0x00000000
019713 r12:0x00000000  lr:0x0000ac9f  pc:0x34363338 psr:0x60000000
019713 ICSR:0x00421803 HFSR:0x40000000 CFSR:0x00000100
019713 BFAR:0xe000ed38 MMFAR:0xe000ed34

Note that the last two utime lines are missing the closing "]}" and are longer than the others. Together with the possibly corrupt stack in the backtrace, this looks like a buffer overflow.
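
The register dump above points the same way. As a minimal sketch (the helper names are made up for illustration; only the bit positions come from the ARMv7-M architecture): the low bits of ICSR give the active exception number (3, HardFault), HFSR bit 30 marks a forced (escalated) fault, and CFSR bit 8 is an instruction bus error. The faulting pc 0x34363338, read as little-endian bytes in memory, is the ASCII text "8364". That digit run occurs inside 2147483648, a value printed in the truncated lines, which would be consistent with a saved return address having been overwritten by JSON text.

```c
#include <ctype.h>
#include <stdint.h>
#include <stdio.h>

/* Standard ARMv7-M (Cortex-M4) fault status bits. */
#define ICSR_VECTACTIVE 0x1FFu       /* active exception number */
#define HFSR_FORCED     (1u << 30)   /* escalated configurable fault */
#define CFSR_IBUSERR    (1u << 8)    /* instruction bus error */

/* Render a 32-bit word as the four bytes it occupies in little-endian
 * memory, lowest address first. Non-printable bytes become '.'. */
static void word_to_ascii(uint32_t word, char out[5])
{
    for (int i = 0; i < 4; i++) {
        unsigned char c = (unsigned char)(word >> (8 * i));
        out[i] = isprint(c) ? (char)c : '.';
    }
    out[4] = '\0';
}

/* Returns 1 if all four bytes of the word are printable ASCII. */
static int looks_like_text(uint32_t word)
{
    for (int i = 0; i < 4; i++) {
        if (!isprint((unsigned char)(word >> (8 * i)))) {
            return 0;
        }
    }
    return 1;
}

/* Print a short summary of the values taken from a fault dump. */
static void report_fault(uint32_t pc, uint32_t icsr, uint32_t hfsr, uint32_t cfsr)
{
    char text[5];

    word_to_ascii(pc, text);
    printf("active exception: %u\n", (unsigned)(icsr & ICSR_VECTACTIVE));
    printf("forced hard fault: %s\n", (hfsr & HFSR_FORCED) ? "yes" : "no");
    printf("instruction bus error: %s\n", (cfsr & CFSR_IBUSERR) ? "yes" : "no");
    printf("pc 0x%08x as bytes: \"%s\"%s\n", (unsigned)pc, text,
           looks_like_text(pc) ? " (printable text)" : "");
}
```

Calling report_fault(0x34363338, 0x00421803, 0x40000000, 0x00000100) with the values from this dump reports exception 3 and the text "8364". The pc from the first dump in this thread, 0x30333734, decodes to "4730" in the same way.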

The two sources I found that print lines like these are uwb-core/lib/uwb_rng/src/rng_encode.c and uwb-core/lib/survey/src/survey_encode.c. They are compiled in according to SURVEY_VERBOSE and NRNG_VERBOSE, and main.c also uses SURVEY_ENABLED. I tried disabling them one at a time, and the error is gone with SURVEY_VERBOSE: 0, SURVEY_ENABLED: 1, NRNG_VERBOSE: 0.

@ncasaril @pkettle do you have any hints?

@EpsilonZ

Getting the same. Any hints on this, @ncasaril @pkettle? Have you fixed it, @chepo92?

@chepo92

chepo92 commented Jan 12, 2021

I haven't implemented a fix, as the workaround is enough for now. But I think it is a matter of fixing the survey print routine, which is overflowing and causing the crash (the problem is in uwb-core, not in this code).
Probably here:
https://github.com/Decawave/uwb-core/blob/591b800d81fe435dc103bd00d23eebc9d8e23ea9/lib/survey/src/survey_encode.c

@EpsilonZ

@chepo92 okay! I'll check that out. Thank you 👍

@EpsilonZ

EpsilonZ commented Jan 19, 2021

@chepo92 After checking that, I'll just leave it with the argument SURVEY_ENABLED as you mentioned! It works like a charm now.

Nonetheless, I'm currently having issues setting up the IDE. Do you use VS Code? If not, what do you use? I'm trying to use VS Code, but I cannot find the appropriate .vscode/ configuration. It would really help me if you could share any guide you used to set it up. Thanks a lot!

Edit: I've tried SEGGER Embedded Studio for ARM V4.12 (as mentioned in the DW1001 Decawave docs), but when you try to compile, a message saying "The evaluation period for this release has now expired" pops up because it's an old version. That's why I moved to VS Code.

@chepo92

chepo92 commented Jan 20, 2021

I use VSC to edit the code (well, more for studying and commenting). For compiling and debugging I open a separate msys terminal, as I couldn't integrate it into VSC or launch a task using msys from VSC. Running in the terminal, cmd or PowerShell always throws a newt, GitHub, or some other error, so I gave up on it and got used to the command line, as usual.
I followed the official docs: https://mynewt.apache.org/latest/misc/ide.html

@EpsilonZ

EpsilonZ commented Jan 20, 2021

Thanks for the quick response! Does your VS Code resolve the libraries correctly? I'm finding that even though the libraries are "correctly" resolved, VS Code still reports errors. Moreover, VS Code does not seem to resolve the MY_NEWT variables, so everything is shadowed because the IDE thinks that part is not being included.

Do you develop on Windows or Linux (I can switch to either if needed, but at the moment I'm on Linux)? Would it be possible for you to share your .vscode/ folder so I can copy it and see if everything works? I think this is purely a matter of me including wrong paths to the libraries of incorrect architectures.

From what you say, it seems you haven't set up debugging or building the project from VS Code. Funnily enough, I've set it up and it seems to work. I'll copy it here so you can try it:

launch.json -> this is used for debugging

{
    // Use IntelliSense to learn about possible attributes.
    // Hover to view descriptions of existing attributes.
    // For more information, visit: https://go.microsoft.com/fwlink/?linkid=830387
    "version": "0.2.0",
    "configurations": [
        {
            "name": "gdb_dw1001_tag",
            "type": "gdb",
            "request": "attach",
            "executable": "${workspaceRoot}/bin/targets/twr_tag_tdma/app/apps/twr_tag_tdma/twr_tag_tdma.elf", // this can be any path you want
            "target": ":3333",
            "gdbpath": "arm-none-eabi-gdb",
            "cwd": "${workspaceRoot}",
            "remote": true,
            "valuesFormatting": "parseText"
        }
    ]
}

tasks.json -> with this you can launch commands by pressing F1 and then selecting the task. The debug command references the launch.json configuration

{
    // See https://go.microsoft.com/fwlink/?LinkId=733558
    // for the documentation about the tasks.json format
    "version": "2.0.0",
    "tasks": [
        {
            "label": "load_bootloader",
            "type": "shell",
            "command": "newt load dwm1001_boot"
        },
        {
            "label": "build_twrTag_tdma",
            "type": "shell",
            "command": "newt build twr_tag_tdma"
        },
        {
            "label": "load_twrTag_tdma",
            "type": "shell",
            "command": "newt load twr_tag_tdma"
        },
        {
            "label": "debug_tag_tdma",
            "type": "shell",
            "command": "newt debug twr_tag_tdma -n"
        },
        {
            "label": "run_tag_tdma",
            "type": "shell",
            "command": "newt run twr_tag_tdma 0"
        },
        {
            "label": "clean memory",
            "type": "shell",
            "command": "JLinkExe -device nRF52 -speed 4000 -if SWD"
        }
    ]
}

Note: I've got the Native Debugger installed! Hope it helps!

@EpsilonZ

@chepo92 have you got it working with other preamble lengths or data rates? I can't get it working if I change the default data rate and preamble length.

After changing those parameters, the nodes can no longer communicate. Do you have any hints on this? I changed them using UWBCFG_DEF_DATARATE and UWBCFG_DEF_TX_PREAM_LEN. I've also tried modifying the default RX timeout values NRNG_RX_TIMEOUT, TWR_SS_RX_TIMEOUT and TWR_SS_NRNG_RX_TIMEOUT, but none of them seems to help.

Thanks!

@rouming

rouming commented Jun 3, 2022

Had the same issue. There are two problems here.

Problem one: different NRNG_NNODES values for the master/slave and tag targets

This output (which looks like garbage to me, by the way: negative ranges are weird):

{"utime": 153989005,"seq": 7,"ouid": [0,0,2619,0,0],"rng": [0.000,0.000,0.000,0.000,0.000]}
{"utime": 153989005,"seq": 7,"ouid": [325,0,325,0],"rng": [0.000,0.000,0.000,0.000]}
{"utime": 153989005,"seq": 7,"ouid": [325,0,325,0],"rng": [0.000,0.000,0.000,0.000]}
{"utime": 153989005,"seq": 7,"ouid": [325,0,325,0],"rng": [0.000,0.000,0.000,0.000]}
{"utime": 153989005,"seq": 7,"ouid": [20688,56571,18441,18698],"rng": [0.000,9259532.000,-1.470,0.000]}
{"utime": 153989005,"seq": 7,"ouid": [601,0,1081,0],"rng": [32840.031,0.000,0.000,0.000]}
{"utime": 153989005,"seq": 7,"ouid": [9592,18288,19208,26651,2011,54528,48640,62399,36687,18694,19206,26826,62466],"rng": [-2147483648.2147483648,-0.503,-2147483648.2147483648,2147483647.2147483647,-0.501,0.000,257.877,-2147483648.2147483648,61477.468,2147
{"utime": 153989005,"seq": 7,"ouid": [19202,26648,61440,1,18288,48896,60912,57344,61519,17280],"rng": [2147483647.2147483647,2147483647.2147483647,2147483647.2147483647,-0.000,-2147483648.2147483648,-2147483648.2147483648,-2147483648.2147483648,0.000,21474

overflows the nrng_json_t structure, which is defined in nrng_json.h. The nrng_json_t structure has ouid and rng arrays whose size depends on NRNG_NNODES. NRNG_NNODES is defined differently for the nrng_master/slave_node targets (NRNG_NNODES == 16) and the nrng_tag target (NRNG_NNODES == 8).

The fix is to make the values equal. In my case I removed the following lines:

NRNG_NFRAMES: 16
NRNG_NNODES: 8

from the uwb-apps/targets/nrng_tag/syscfg.yml config, so that the default values apply:

NRNG_NFRAMES: 32
NRNG_NNODES: 16

This misconfiguration originally comes from the description on the main page, https://github.com/Decawave/uwb-apps/tree/master/apps/twr_nranges_tdma , where the "Building target for tags" section says to specify different config values: "syscfg=NRNG_NTAGS=4:NRNG_NNODES=8:NRNG_NFRAMES=16".

This only partially fixes the issue.
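
Independently of aligning the configuration, code that copies per-node results into fixed-size arrays can clamp the element count to the array capacity, so that a peer built with a larger node count cannot write past the end. The following is only a sketch of that idea: MAX_NODES, struct range_report and report_fill are hypothetical names used for illustration, not part of uwb-core.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical capacity; plays the role NRNG_NNODES has for the
 * build that stores the results. */
#define MAX_NODES 8

/* Hypothetical stand-in for a structure with ouid and rng arrays. */
struct range_report {
    uint16_t ouid[MAX_NODES];
    float    rng[MAX_NODES];
    uint16_t nsize;              /* number of entries actually stored */
};

/* Copy at most MAX_NODES entries and return how many were stored.
 * Entries beyond the capacity are dropped instead of overflowing. */
static uint16_t report_fill(struct range_report *r, const uint16_t *ouid,
                            const float *rng, size_t count)
{
    size_t n = count < MAX_NODES ? count : MAX_NODES;

    for (size_t i = 0; i < n; i++) {
        r->ouid[i] = ouid[i];
        r->rng[i] = rng[i];
    }
    r->nsize = (uint16_t)n;
    return r->nsize;
}
```

With this shape, a report carrying 16 entries into a build sized for 8 is truncated to 8 entries rather than corrupting adjacent memory.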

Problem two: overflow of nrng_json_t.iobuf, and thus stack corruption

After a bit of debugging I narrowed the issue down to nrng_write_line, which fills ->iobuf without checking its size, and the buffer is too small (256 bytes) for the whole JSON output. The fix is to increase the size of ->iobuf:

diff --git a/lib/nrng/include/nrng/nrng_json.h b/lib/nrng/include/nrng/nrng_json.h
index fdcfc0ff7cf9..382c0b8eb1bf 100644
--- a/lib/nrng/include/nrng/nrng_json.h
+++ b/lib/nrng/include/nrng/nrng_json.h
@@ -34,7 +34,7 @@
 #include <uwb/uwb.h>
 
 #ifndef PAGE_SIZE
-#define PAGE_SIZE 256
+#define PAGE_SIZE 512
 #endif

and adding a proper size check for the ->iobuf:

diff --git a/lib/nrng/src/nrng_json.c b/lib/nrng/src/nrng_json.c
index 940c130de076..aa41c65f73ce 100644
--- a/lib/nrng/src/nrng_json.c
+++ b/lib/nrng/src/nrng_json.c
@@ -37,6 +37,8 @@ static int
 nrng_write_line(void *buf, char* data, int len)
 {
     nrng_json_t * json = buf;
+
+    len = min(len, sizeof(json->iobuf) - json->idx);
     for (uint16_t i=0; i < len; i++){
         json->iobuf[json->idx++] = data[i];
         if (data[i]=='\0'){
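
For illustration, here is a self-contained sketch of the same bounds-checking idea outside of uwb-core. The struct line_buf type, LINE_BUF_SIZE and line_buf_append are hypothetical stand-ins (only the iobuf and idx field names are borrowed from the patch), and the sketch does not reproduce whatever nrng_write_line does after seeing '\0'. It uses size_t throughout so that the remaining-room computation never mixes signed and unsigned values, and it always keeps one byte free for a terminator.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical capacity, chosen to mirror the patched PAGE_SIZE. */
#define LINE_BUF_SIZE 512

/* Hypothetical stand-in for the buffer fields used by the patch. */
struct line_buf {
    char     iobuf[LINE_BUF_SIZE];
    uint16_t idx;                /* next free position in iobuf */
};

/* Append up to len bytes of data and return how many were stored.
 * Excess input is dropped; the buffer stays NUL-terminated. */
static size_t line_buf_append(struct line_buf *lb, const char *data, size_t len)
{
    size_t room;
    size_t n;

    if (lb->idx >= sizeof(lb->iobuf) - 1) {
        return 0;                /* buffer already full */
    }
    room = sizeof(lb->iobuf) - 1 - lb->idx;
    n = len < room ? len : room;

    memcpy(&lb->iobuf[lb->idx], data, n);
    lb->idx += (uint16_t)n;
    lb->iobuf[lb->idx] = '\0';
    return n;
}
```

A caller can compare the return value with len to detect truncation and decide whether to flush the buffer or report an error.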

If the maintainers still care, I'm happy to discuss the patches and make a proper pull request.

Thanks, regards.

--
Roman

rouming added a commit to rouming/uwb-core that referenced this issue Jun 9, 2022