tests/thread_float: crashes on avr-rss2 #16908

Closed
benpicco opened this issue Sep 28, 2021 · 5 comments · Fixed by #18632

@benpicco
Contributor

benpicco commented Sep 28, 2021

Description

The thread_float test crashes on avr-rss2 (ATmega256rfr2).

Steps to reproduce the issue

make -C tests/thread_float BOARD=avr-rss2 flash term

Expected results

main(): This is RIOT! (Version: 2021.10-devel-801-g6d9db)
THREADS CREATED

THREAD 3 start
T(3): 141.466812
T(3): 141.490616
T(3): 141.521576
T(3): 141.559769
T(3): 141.605148
T(3): 141.657516
T(3): 141.717255
T(3): 141.783844
T(3): 141.857513
T(3): 141.938278
T(3): 142.026108
T(3): 142.120789
T(3): 142.222488
T(3): 142.331390
T(3): 142.446899
T(3): 142.569565
T(3): 142.699020
T(3): 142.835480
THREAD 4 start
THREAD 5 start
T(3): 142.978668
T(5): 141.521576
T(3): 143.128845
T(5): 141.559769
T(3): 143.285721
T(5): 141.605148
T(3): 143.449615
T(5): 141.657516
T(3): 143.620071
T(5): 141.717255
T(3): 143.797531
T(5): 141.783844
T(3): 143.981476
T(5): 141.857513
T(3): 144.172333
T(5): 141.938278

Actual results

2021-09-28 15:32:17,608 # main(): This is RIOT! (Version: 2021.10-devel-801-g6d9db)
2021-09-28 15:32:17,611 # THREADS CREATED
2021-09-28 15:32:17,611 # 
2021-09-28 15:32:17,907 # IRQ_STATUS 0x80
2021-09-28 15:32:17,908 # IRQ_STATUS1 00
2021-09-28 15:32:17,910 # SCIRQS 00
2021-09-28 15:32:17,910 # BATMON 0x22
2021-09-28 15:32:17,910 # EIFR 00
2021-09-28 15:32:17,912 # PCIFR 00
2021-09-28 15:32:17,913 # *** RIOT kernel panic:
2021-09-28 15:32:17,916 # i): %f
2021-09-28 15:32:17,916 # 
2021-09-28 15:32:17,916 # 
2021-09-28 15:32:17,916 # *** halted.
2021-09-28 15:32:17,916 # 

Versions

@benpicco benpicco added the Type: bug The issue reports a bug / The PR fixes a bug (including spelling errors) label Sep 28, 2021
@benpicco benpicco changed the title from "tests/thread_float: crashes on" to "tests/thread_float: crashes on avr-rss2" Sep 28, 2021
@emmanuelsearch
Member

It is also flaky on native (I'm currently testing on 2022.01-RC1):

INFO:native.tests/thread_float:Run test.flash
WARNING:native.tests/thread_float:make RIOT_CI_BUILD=1 CC_NOCOLOR=1 --no-print-directory BUILD_IN_DOCKER=1 DOCKER="sudo docker" -C ./tests/thread_float test
r
/home/emmanuel/RIOT/tests/thread_float/bin/native/tests_thread_float.elf /dev/ttyACM0 
RIOT native interrupts/signals initialized.
LED_RED_OFF
LED_GREEN_ON
RIOT native board initialized.
RIOT native hardware initialization complete.

Help: Press s to start test, r to print it is ready
READY
s
START
main(): This is RIOT! (Version: buildtest)
THREADS CREATED

THREAD t1 start
t1: 141.443787
t1: 141.443787
t1: 141.443787
t1: 141.443787
t1: 141.443787
t1: 141.443787
THREAD t2 start
THREAD t3 start
t3: 141.466812
t3: 141.466812
t3: 141.466812
t3: 141.466812
t3: 141.466812
t3: 141.466812
t3: 141.466812
t1: -nan
t1: 141.412857
t1: 141.412857
t1: 141.412857
t1: 141.412857
t1: 141.412857
t1: 141.412857
t1: 141.412857
t1: 141.412857
t1: 141.412857
t1: 141.412857
t1: 141.412857
t1: 141.412857
t1: 141.412857
t1: 141.412857
t3: -nan
t3: 141.436065
t3: 141.436065

Traceback (most recent call last):
  File "/home/emmanuel/RIOT/tests/thread_float/tests/01-run.py", line 81, in <module>
    sys.exit(run(testfunc))
  File "/home/emmanuel/RIOT/dist/pythonlibs/testrunner/__init__.py", line 30, in run
    testfunc(child)
  File "/home/emmanuel/RIOT/tests/thread_float/tests/01-run.py", line 67, in testfunc
    assert result == first_result, "same calculation but different result"
AssertionError: same calculation but different result
make: *** [/home/emmanuel/RIOT/makefiles/tests/tests.inc.mk:22: test] Error 1
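
The assertion in the traceback compares each printed result against the first one seen. A minimal sketch of that kind of check, run offline against captured log lines, could look as follows. The regular expression, the function name `find_inconsistencies` and the `sample` list are illustrative assumptions; this is not the code of `01-run.py`.

```python
import re

# Matches result lines such as "t1: 141.443787" or "t3: -nan" (assumed format,
# taken from the log above).
LINE_RE = re.compile(r"^(t\d+): (\S+)$")


def find_inconsistencies(log_lines):
    """Return (thread, first_value, deviating_value) for every deviating line."""
    first_seen = {}
    mismatches = []
    for line in log_lines:
        match = LINE_RE.match(line.strip())
        if match is None:
            continue  # skip banners such as "THREAD t1 start"
        thread, value = match.groups()
        # Remember the first value per thread; later values must equal it.
        expected = first_seen.setdefault(thread, value)
        if value != expected:
            mismatches.append((thread, expected, value))
    return mismatches


# Hypothetical excerpt modelled on the log above.
sample = [
    "THREAD t1 start",
    "t1: 141.443787",
    "t1: 141.443787",
    "THREAD t3 start",
    "t3: 141.466812",
    "t1: -nan",
    "t1: 141.412857",
]
print(find_inconsistencies(sample))
```

Comparing the printed strings rather than parsed floats keeps `-nan` from needing special handling, since NaN never compares equal to itself.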

@fjmolinas
Contributor

@emmanuelsearch the failure you mentioned is already tracked in #495 and #17170

@maribu
Member

maribu commented Jan 21, 2022

@emmanuelsearch The issue is exactly what #17170 is about. A fix would be for glibc's implementation of ucontext to also back up and restore the FPU state on context switches. We should probably create a minimal example (one that uses only ucontext and nothing from RIOT) that breaks, and report it upstream to glibc.

@maribu
Member

maribu commented Jan 21, 2022

@benpicco: have you tried increasing stack sizes?

@benpicco
Contributor Author

> @benpicco: have you tried increasing stack sizes?

Even with

CFLAGS += -DTHREAD_STACKSIZE_MAIN=4096
CFLAGS += -DTHREAD_STACKSIZE_IDLE=2048

it still crashes.

bors bot added a commit that referenced this issue Jan 3, 2023
18632: tests/thread_float: do not overload slow MCUs with IRQs r=benpicco a=maribu

### Contribution description

If the regular context switches are triggered too fast, slow MCUs will be able to spend little time on actually progressing in the test. This will scale the IRQ rate with the CPU clock as a crude way to keep load within limits.
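
As a rough illustration of scaling the switch period with the CPU clock, the sketch below keeps the number of CPU cycles between two forced context switches constant. The budget of 25,000 cycles and the 8 MHz clock are assumptions chosen only because together they reproduce the "Context switch every 3125 µs" line in the log of the testing procedure; neither value is taken from the PR itself, and the function name is hypothetical.

```python
US_PER_SEC = 1_000_000

# Hypothetical budget: CPU cycles granted between two forced context switches.
CYCLES_PER_SWITCH = 25_000


def context_switch_period_us(core_clock_hz, cycles_per_switch=CYCLES_PER_SWITCH):
    """Period in microseconds such that each period spans a fixed cycle count."""
    return cycles_per_switch * US_PER_SEC // core_clock_hz


# Slower clocks get a longer period, so the share of time spent in the
# periodic IRQ stays roughly the same across MCUs.
for clock_hz in (8_000_000, 16_000_000, 64_000_000):
    print(clock_hz, context_switch_period_us(clock_hz))
```

With a fixed period instead, an MCU running at a quarter of the clock would spend about four times the share of its cycles on context switches, which is the overload the description refers to.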

### Testing procedure

The unit test should now pass on the Microduino CoreRF

```
$ make BOARD=microduino-corerf AVRDUDE_PROGRAMMER=dragon_jtag -C tests/thread_float flash test
make: Entering directory '/home/maribu/Repos/software/RIOT/tests/thread_float'
Building application "tests_thread_float" for "microduino-corerf" with MCU "atmega128rfa1".
[...]
   text	  data	   bss	   dec	   hex	filename
  12834	   520	  3003	 16357	  3fe5	/home/maribu/Repos/software/RIOT/tests/thread_float/bin/microduino-corerf/tests_thread_float.elf
avrdude -c dragon_jtag -p m128rfa1  -U flash:w:/home/maribu/Repos/software/RIOT/tests/thread_float/bin/microduino-corerf/tests_thread_float.hex
[...]
Welcome to pyterm!
Type '/exit' to exit.
READY
s
START
main(): This is RIOT! (Version: 2022.10-devel-858-g18566-tests/thread_float)
THREADS CREATED

Context switch every 3125 µs
{ "threads": [{ "name": "idle", "stack_size": 192, "stack_used": 88 }]}
{ "threads": [{ "name": "main", "stack_size": 640, "stack_used": 220 }]}
THREAD t1 start
THREAD t2 start
THREAD t3 start
t1: 141.443770
t3: 141.466810
t1: 141.443770
t3: 141.466810
t1: 141.443770
t3: 141.466810
t1: 141.443770
t3: 141.466810
t1: 141.443770
t3: 141.466810
t1: 141.443770
t3: 141.466810
t1: 141.443770

make: Leaving directory '/home/maribu/Repos/software/RIOT/tests/thread_float'
```

(~~Note: The idle thread exiting is something that should never occur. I guess the culprit may be `cpu_switch_context_exit()` messing things up when the main thread exits. But that is not directly related to what this PR aims to fix. Adding a `thread_sleep()` at the end of `main()` does indeed prevent the idle thread from exiting.~~
Update: That's expected. The idle thread stats are printed on exit of the main thread, the idle thread does not actually exit.)

### Issues/PRs references

Fixes #16908 maybe?

18950: tests/unittests: add unit tests for core_mbox r=benpicco a=maribu

### Contribution description

As the title says

### Testing procedure

The test cases are run on `native` by Murdock anyway.

### Issues/PRs references

Split out of #18949

19030: tests/periph_timer_short_relative_set: improve test r=benpicco a=maribu

### Contribution description

Reduce the number of lines of output by only testing intervals 0..15 to speed up the test.

In addition, run each test case for 128 repetitions (it is still faster than before) to give some confidence that the short relative set actually succeeded.

### Testing procedure

The test application should consistently fail or succeed, rather than occasionally passing.

### Issues/PRs references

None

19085: makefiles/tests/tests.inc.mk: fix test/available target r=benpicco a=maribu

### Contribution description

`dist/tools/compile_and_test_for_board/compile_and_test_for_board.py` relies on `make test/available` to check if a test is available. However, this so far did not take `TEST_ON_CI_BLACKLIST` and `TEST_ON_CI_WHITELIST` into account, resulting in tests being executed for boards for which they are not available. This should fix the issue.
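
The intended filtering can be sketched as a small predicate. This is only a model under assumed semantics (a blacklisted board is excluded, and a non-empty whitelist restricts the test to the listed boards); the actual logic lives in the makefiles and may differ. The function name and the example list contents are hypothetical.

```python
def is_test_available(board, has_test_scripts, blacklist=(), whitelist=()):
    """Model of `make test/available` under the assumed list semantics."""
    if not has_test_scripts:
        return False  # nothing to run for this application
    if board in blacklist:
        return False  # explicitly excluded board
    if whitelist and board not in whitelist:
        return False  # a whitelist exists and the board is not on it
    return True


# Hypothetical list contents, for illustration only.
print(is_test_available("microbit", True, blacklist=("microbit",)))
print(is_test_available("nrf52840dk", True, whitelist=("native",)))
print(is_test_available("native", True, whitelist=("native",)))
```

Before the fix, the check behaved as if both lists were always empty, so only the presence of test scripts mattered.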

### Testing procedure


#### Expected to fail

```
$ make BOARD=nrf52840dk -C tests/gcoap_fileserver test/available
$ make BOARD=microbit -C tests/log_color test/available
```

(On `master`, they succeed, but fail in this PR.)

#### Expected to succeed

```
$ make BOARD=native -C tests/gcoap_fileserver test/available
$ make BOARD=nrf52840dk -C tests/pkg_edhoc_c test/available
$ make BOARD=nrf52840dk -C tests/log_color test/available
```

(Succeed in both `master` and this PR.)

### Issues/PRs references

None

Co-authored-by: Marian Buschsieweke <marian.buschsieweke@ovgu.de>
bors bot added a commit that referenced this issue Jan 4, 2023
18632: tests/thread_float: do not overload slow MCUs with IRQs r=kaspar030 a=maribu

19031: cpu/stm32/periph_timer: implement timer_set() r=benpicco a=maribu

### Contribution description

The fallback implementation of timer_set() in `drivers/periph_common` is known to fail on short relative sets. This adds a robust implementation.

### Testing procedure

Run `tests/periph_timer_short_relative_set` at least a few dozen times (or use #19030 to have a few dozen repetitions of the test case in a single run of the test application). It should now succeed.

### Issues/PRs references

None

Co-authored-by: Marian Buschsieweke <marian.buschsieweke@ovgu.de>
@bors bors bot closed this as completed in 9a45f4b Jan 4, 2023