Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

tests: kernel: timer: Test timeout_abs from tests/kernel/timer/timer_api hangs causing test scenarios to fail #33667

Closed
PerMac opened this issue Mar 24, 2021 · 7 comments · Fixed by #35510
Assignees
Labels
area: Tests Issues related to a particular existing or missing test bug The issue is a bug, or the PR is fixing a bug platform: nRF Nordic nRFx priority: low Low impact/importance bug
Milestone

Comments

@PerMac
Copy link
Member

PerMac commented Mar 24, 2021

Describe the bug
The test timeout_abs from kernel.timer.tickless and kernel.timer scenarios at tests/kernel/timer/timer_api hangs on nrf5340dk_nrf5340_cpuappns causing this scenarios to fail (test sleep_abs is not executed)

To Reproduce
Steps to reproduce the behavior:

  1. have nrf5340dk connected
  2. go to your zephyr dir
  3. run ./scripts/twister -T tests/kernel/timer/timer_api/ -p nrf5340dk_nrf5340_cpuappns --device-testing --device-serial /dev/ttyACM2 --jobs 1 -v -v --inline-logs
  4. See error

Expected behavior
Test passes

Impact
Not clear

Logs and console output

...
START - test_timer_remaining
PASS - test_timer_remaining in 0.51 seconds
===================================================================
START - test_timeout_abs

and hangs. Then is terminated with a timeout

Environment (please complete the following information):

  • OS: Ubuntu 18.04
  • Toolchain zephyr sdk 0.12.2
  • Commit SHA or Version used zephyr-v2.5.0-1463-gc59cf

Additional context
This is a different issue than #32839. There the test fails due to an assertion error. Here it hangs.

@PerMac PerMac added bug The issue is a bug, or the PR is fixing a bug priority: low Low impact/importance bug platform: nRF Nordic nRFx area: Tests Issues related to a particular existing or missing test labels Mar 24, 2021
@PerMac PerMac changed the title tests: kernel: timer: Test timeout_abs from kernel/timer/timeout_abs hangs causing test scenarios to fail tests: kernel: timer: Test timeout_abs from kernel/timer/timer_api hangs causing test scenarios to fail Mar 24, 2021
@PerMac PerMac changed the title tests: kernel: timer: Test timeout_abs from kernel/timer/timer_api hangs causing test scenarios to fail tests: kernel: timer: Test timeout_abs from tests/kernel/timer/timer_api hangs causing test scenarios to fail Mar 24, 2021
@ioannisg
Copy link
Member

@anangl I wonder if you could take a look at this bug.

@ioannisg
Copy link
Member

This seems to fail also for nrf9160dk_nrf9160ns, in the same way (@PerMac confirm?)

@PerMac
Copy link
Member Author

PerMac commented Apr 15, 2021

yes, it fails in the same way on nrf9160ns.

@ioannisg
Copy link
Member

@anangl did you have a chance to pick this bug investigation?

@anangl
Copy link
Member

anangl commented Apr 15, 2021

@anangl did you have a chance to pick this bug investigation?

The problem is that this loop:

do {
t0 = k_uptime_ticks();
rem_ticks = k_timer_remaining_ticks(&remain_timer);
t1 = k_uptime_ticks();
} while (t0 != t1);
has no chance to finish because the execution of rem_ticks = k_timer_remaining_ticks(&remain_timer); always takes more than one tick (for the non-secure image; for the secure one, it is a bit shorter, so the test passes).

@carlescufi
Copy link
Member

@andyross what do you suggest we do for configurations where the execution of the loop will always exceed one tick given that the calls are slow to execute when we are in TEE non-secure mode?

@andyross
Copy link
Contributor

andyross commented Apr 22, 2021

Just noticing this comment. What's the reason that the remaining check is taking longer than a tick? That call is just a wrapper around z_timeout_remaining(), which is a pure kernel thing that just queries state. The only interaction with the timer driver is a call to sys_clock_elapsed() to get the current ticks since the last interrupt.

The intent is that sys_clock_elapsed() should be implementable with a single register read, it shouldn't require any blocking on hardware state. If it actually requires waiting for a full 30+ us tick, that's.... actually a pretty severe performance bug. The kernel queries this function a lot and assumes it's fast.

(FWIW: there's an Intel-internal platform with a similar glitch right now where it needs to wait for that kind of period on register writes only, and that's causing a bunch of problems too. Doing it on a read seems like it's going to be a non-starter for any kind of performance-senstive work.)

Broadly: what is this trying to achieve? And if this is unavoidable, maybe this configuration really just doesn't want to be doing tickless and would be happy with 1024 Hz ticks or something (the kernel only needs to constantly check the current time with tickless, obviously).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area: Tests Issues related to a particular existing or missing test bug The issue is a bug, or the PR is fixing a bug platform: nRF Nordic nRFx priority: low Low impact/importance bug
Projects
None yet
5 participants