Update idle loop to reduce calls to suspend #6534
```diff
@@ -95,21 +95,24 @@ uint32_t OS_Tick_GetInterval (void) {
 
 static void default_idle_hook(void)
 {
-    uint32_t elapsed_ticks = 0;
-
-    core_util_critical_section_enter();
-    uint32_t ticks_to_sleep = svcRtxKernelSuspend();
-    if (ticks_to_sleep) {
-        os_timer->schedule_tick(ticks_to_sleep);
-
-        sleep();
-
-        os_timer->cancel_tick();
-        // calculate how long we slept
-        elapsed_ticks = os_timer->update_tick();
+    uint32_t ticks_to_sleep = osKernelSuspend();
+    os_timer->suspend(ticks_to_sleep);
+
+    bool event_pending = false;
+    while (!os_timer->suspend_time_passed() && !event_pending) {
+
+        core_util_critical_section_enter();
+        if (osRtxInfo.kernel.pendSV) {
```
Might help with the review, details are here #6273 (comment)

Agree that there should be a better way - pendSV peeking was the best I could think of with current APIs. The "simple" API change I would suggest is to permit osKernelSuspend/Resume to be callable from a critical section (which is what happens via direct svcKernelSuspend calls at the moment, and this replaces), or to have them or equivalents actually return from the suspend call such that the thread is now in a critical section.

I guess in general one also needs to think about privileged/unprivileged operation, in case a system is using User mode - system suspension, critical sections and the WFI instruction are privileged operations anyway, so this whole "normal thread code does suspend" only works if the idle thread is privileged (or the suspend call makes you privileged?), or you have a WFE-only version (which I believe is less flexible and can reduce power saving opportunities).

@JonatanAntoni when I was initially working on tickless I proposed an extension to the RTOS to allow this, which you might be able to use:

As for this PR, how do you want to proceed? Can we move forward using this internal API until RTX has a public API which supports this?

Sure, for the time being you can access this internal bit. But as it is not a public API, it might change in the future without warning. So we should align the low power operation soon. We need to address this privileged/unprivileged issue anyway when it comes to MPU protection.
```diff
+            event_pending = true;
+        } else {
+            sleep();
+        }
+        core_util_critical_section_exit();
```
There might be call for an ISB here - you're wanting pending interrupts to trigger between this line and the next one, and an ISB should make sure that happens. In theory I think you could redisable interrupts before any have a chance to trigger - formally the sequence "enable IRQs / disable IRQs" needs an ISB somewhere in the middle to guarantee they have time to trigger. Relevant ARM ARM quote:

(In practice they're likely to trigger before you hit the next disable, which means you'd see pendSV set by the next time you hit line 104.)
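The ordering this comment asks for can be sketched as below. This is a host-compilable illustration only, not code from the PR: the three `*_stub` functions and `run_interrupt_window` are hypothetical names that merely record call order. On a Cortex-M target the stubs would presumably be replaced by the CMSIS intrinsics `__enable_irq()`, `__ISB()` and `__disable_irq()`.

```cpp
#include <string>
#include <vector>

// Hypothetical host-side stand-ins; each one only records that it was called.
// On target these would be __enable_irq(), __ISB() and __disable_irq().
static std::vector<std::string> g_trace;
static void enable_irq_stub()  { g_trace.push_back("enable"); }
static void isb_stub()         { g_trace.push_back("isb"); }
static void disable_irq_stub() { g_trace.push_back("disable"); }

// Briefly open an interrupt window between two masked regions.
// The barrier sits between the unmask and the re-mask, which is the
// placement the review comment describes.
static void interrupt_window()
{
    enable_irq_stub();
    isb_stub();          // pending interrupts are meant to be taken here
    disable_irq_stub();
}

// Returns the recorded call order for one window.
std::vector<std::string> run_interrupt_window()
{
    g_trace.clear();
    interrupt_window();
    return g_trace;
}
```

The stubs exist only so the ordering can be checked off-target; they say nothing about timing on real hardware.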
```diff
+
+        // Ensure interrupts get a chance to fire
+        __ISB();
     }
-    svcRtxKernelResume(elapsed_ticks);
-    core_util_critical_section_exit();
+    osKernelResume(os_timer->resume());
 }
 
 #elif defined(FEATURE_UVISOR)
```
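The control flow of the reworked idle loop can be exercised off-target with a small sketch. It assumes nothing about RTX internals: `IdleHooks` and `idle_wait` are hypothetical names, and the three callbacks stand in for `os_timer->suspend_time_passed()`, a read of `osRtxInfo.kernel.pendSV`, and `sleep()`. The critical section and the `__ISB()` are only marked by comments.

```cpp
#include <functional>

// Hypothetical hooks standing in for the calls used by the idle loop in the diff.
struct IdleHooks {
    std::function<bool()> suspend_time_passed;   // has the requested sleep time elapsed?
    std::function<bool()> kernel_event_pending;  // stand-in for reading the pendSV flag
    std::function<void()> sleep;                 // stand-in for sleep()
};

// Runs the wait loop and returns how many times sleep was entered.
int idle_wait(const IdleHooks &hooks)
{
    int sleeps = 0;
    bool event_pending = false;
    while (!hooks.suspend_time_passed() && !event_pending) {
        // On target: critical section entered here, so the check and the
        // sleep cannot be separated by an interrupt.
        if (hooks.kernel_event_pending()) {
            event_pending = true;   // the kernel has work, leave the loop
        } else {
            hooks.sleep();
            ++sleeps;
        }
        // On target: critical section exit, then __ISB(), would follow here.
    }
    return sleeps;
}
```

With a pending event the function returns without sleeping; otherwise it keeps sleeping until the time hook reports that the suspend period has passed.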
@c1728p9 Why is there no remove() call here? I'm seeing issues on my system because of this.
The SysTimer is based on a TimerEvent class, which has a static member for the timer's event data struct. Additionally, the call to suspend may happen before the previously set timer expires (for example when all threads go idle and the idle handler gets activated). At that point, the timer event data is still present in the timer queue. Calling schedule_tick here will insert the same event structure again in the queue. And since the queue is by reference, that'll lead to a recursive timer that breaks the system.
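The failure mode described here can be reproduced in isolation with a minimal intrusive list. The types and functions below (`Event`, `EventQueue`, `insert_event`, `remove_event`, `demo_double_insert`) are hypothetical and only loosely modelled on the description; they are not the mbed OS ticker implementation.

```cpp
#include <cstdint>

// Hypothetical intrusive event node and queue, not the mbed OS types.
struct Event {
    uint64_t timestamp = 0;
    Event *next = nullptr;
};

struct EventQueue {
    Event *head = nullptr;
};

// Sorted insert that assumes `ev` is not already in the queue.
void insert_event(EventQueue &q, Event *ev, uint64_t timestamp)
{
    Event *prev = nullptr;
    Event *p = q.head;
    while (p != nullptr && p->timestamp <= timestamp) {
        prev = p;
        p = p->next;
    }
    ev->timestamp = timestamp;
    ev->next = p;
    if (prev == nullptr) {
        q.head = ev;
    } else {
        prev->next = ev;
    }
}

// Unlink `ev` if a walk from the head finds it.
void remove_event(EventQueue &q, Event *ev)
{
    Event *prev = nullptr;
    for (Event *p = q.head; p != nullptr; prev = p, p = p->next) {
        if (p == ev) {
            if (prev == nullptr) {
                q.head = p->next;
            } else {
                prev->next = p->next;
            }
            return;
        }
    }
}

struct DemoResult {
    bool self_loop;            // node points at itself after the second insert
    bool queued_after_remove;  // remove leaves the node at the head
};

// Queue the same node twice without removing it in between.
DemoResult demo_double_insert()
{
    EventQueue q;
    Event tick;
    insert_event(q, &tick, 100);
    insert_event(q, &tick, 50);   // tick is still queued from the first call
    DemoResult r;
    r.self_loop = (tick.next == &tick);
    remove_event(q, &tick);       // head becomes tick.next, which is tick again
    r.queued_after_remove = (q.head == &tick);
    return r;
}
```

In this model the second insert links the node to itself, and a later remove replaces the head with the node's own `next` pointer, so the node never leaves the queue.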
Hi @stevew817 I gave this a try but was unable to reproduce the failure. Do you have example code which reproduces the issue?
From both code analysis and running the code, I can't find a sequence which causes this. The basic flow of the idle loop should unconditionally call remove before each insert as indicated below:
With a debugger I can confirm the following sequence is being met by setting a breakpoint in TimerEvent::remove, TimerEvent::insert and TimerEvent::insert_absolute. Below is the callstack from each:
I ran across this when running cloud-client-example using Thread as a sleepy node in tickless mode. The code would hang, and attaching a debugger confirmed it was stuck in the tick handler IRQ because somehow the tick event had managed to add itself recursively (the next pointer of the tick event data was pointing to the same event data, causing the remove call to not do anything since the head just always ends up being the tick event data).
I fixed the issue by adding a cancel_tick() call before the schedule_tick call in SysTimer::suspend. I didn't investigate further after solving my issue, so I'm afraid I don't have a reproducer ready.
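The shape of that workaround, cancel first and then schedule, can be sketched on a minimal intrusive list. Everything here is hypothetical (`Event`, `EventQueue`, `schedule_event`, `queue_length`) and is not the actual `SysTimer` code; it only shows why an unconditional remove makes re-scheduling the same node safe.

```cpp
#include <cstdint>

// Hypothetical intrusive event node and queue, not the mbed OS types.
struct Event {
    uint64_t timestamp = 0;
    Event *next = nullptr;
};

struct EventQueue {
    Event *head = nullptr;
};

// Unlink `ev` if it is queued; does nothing otherwise.
void remove_event(EventQueue &q, Event *ev)
{
    Event **link = &q.head;
    while (*link != nullptr) {
        if (*link == ev) {
            *link = ev->next;
            ev->next = nullptr;
            return;
        }
        link = &(*link)->next;
    }
}

// Sorted insert that assumes `ev` is not already queued.
void insert_event(EventQueue &q, Event *ev, uint64_t timestamp)
{
    Event **link = &q.head;
    while (*link != nullptr && (*link)->timestamp <= timestamp) {
        link = &(*link)->next;
    }
    ev->timestamp = timestamp;
    ev->next = *link;
    *link = ev;
}

// Cancel-then-schedule: safe to call whether or not `ev` is already queued.
void schedule_event(EventQueue &q, Event *ev, uint64_t timestamp)
{
    remove_event(q, ev);
    insert_event(q, ev, timestamp);
}

// Counts nodes, giving up at `limit` so a corrupted (cyclic) list cannot hang.
int queue_length(const EventQueue &q, int limit)
{
    int n = 0;
    for (const Event *p = q.head; p != nullptr && n < limit; p = p->next) {
        ++n;
    }
    return n;
}
```

Calling `schedule_event` twice on the same node leaves exactly one entry in the queue, carrying the most recent timestamp.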
@stevew817 I'm still unable to reproduce this problem, but I created #7027 just to be safe. Could you provide further details on how to replicate this failure (such as git shas used in cloud-client-example and mbed-os along with config, any local mods you made and the hardware you were using) or an example application?