sched/semaphore: Optimize critical section scope in sem_clockwait() #18199
base: master
Conversation
Optimize critical section protection by narrowing the scope to only protect necessary data access in sem_clockwait(). This improves latency and reduces interrupt disable time while maintaining all synchronization guarantees and thread safety for semaphore clock-based wait operations. Signed-off-by: hujun5 <hujun5@xiaomi.com>
linguini1 left a comment:
Looks good. Can you please share what system you tested on and provide an OStest log from the testing?
done

```c
  ret = nxsem_wait(sem);
  leave_critical_section(flags);
```
This should include wd_cancel.
There is no need to place wd_cancel inside the critical section:
1. The watchdog is bound to the current TCB; we don't have to worry about other tasks modifying the watchdog (e.g., by calling wd_start) after we leave the critical section.
2. Even if a watchdog timer interrupt occurs after we leave the critical section, this logic is handled correctly in the nxsem_timeout function.
So why do we need to trigger the nxsem_timeout() interrupt again?
BTW, have you considered the case where the semaphore is waited on by other threads before wd_cancel is called, which would cause wd_cancel to incorrectly cancel other pending events?
> BTW, have you considered the case where the semaphore is waited on by other threads before wd_cancel is called, which would cause wd_cancel to incorrectly cancel other pending events?
This won't happen. When other threads call this, they call wd_start(&rtcb->waitdog, ...), and rtcb is always this_task, so wd_cancel won't incorrectly cancel another task's watchdog.
```
T1                                      T2
nxsem_wait
|
leave_critical_section
|
--------------------------------------->nxsem_clockwait
                                        |
                                        wd_start
                                        |
                                        nxsem_wait
                                        |
|<---------------------------------------
|
wd_cancel
```

I mean two tasks are contending for the same lock.
```
T1                                      T2
nxsem_wait
|
leave_critical_section
|
--------------------------------------->nxsem_wait
                                        |
                                        ->wd_start(&rtcb_t2->waitdog)
|<---------------------------------------
|
wd_cancel(&rtcb_t1->waitdog)
```
Even if this happens, it won't cause any issues. Since rtcb_t1->waitdog and rtcb_t2->waitdog are different, both can run normally without contention.
OK, but wd_cancel() still needs to be moved into the critical section, which avoids an extra timer callback.
Generally speaking, the smaller the critical section, the better:
1. On single-core systems, this improves real-time performance.
2. On multi-core systems, this enhances concurrency and reduces the likelihood of deadlocks caused by nested locks.
3. Regarding the additional timer callback you mentioned: this is a rare edge case requiring multiple specific conditions to be met, so it is not appropriate to handle it by enlarging the critical section.
- This interface requires greater determinism, rather than an occasional spurious extra callback.
- The implementation of wd_cancel also invokes critical-section code. If the critical section is implemented with a reference counter, it's unclear which approach offers better performance.
- If you do not delete the node, other WD timers will incur an extra traversal when being added to the global linked list, leading to even higher overhead.
I think it got mixed up in your table formatting
ok
Summary
This PR optimizes the semaphore clock-wait implementation by reducing the scope of critical section protection. The changes narrow the protected region to only essential data access operations, improving system responsiveness and reducing interrupt disable time while maintaining complete thread safety and synchronization guarantees.
Changes:
Motivation
Original Design: The critical section protected a larger scope than necessary, including operations that don't require protection from concurrent access.
Issue: Excessive critical section scope increases interrupt disable time, which can impact system latency and responsiveness.
Solution: Carefully analyze the function to identify the minimal protected region that maintains thread safety, then move non-critical operations outside the critical section.
Impact
Analysis
Critical Section Optimization:
Benefits:
Testing
### Test log on hardware with esp32s3-devkit:nsh
```
nsh> uname -a
NuttX 12.12.0 30028ce Jan 28 2026 08:51:51 xtensa esp32s3-devkit
nsh> ostest
stdio_test: write fd=1
stdio_test: Standard I/O Check: printf
stdio_test: write fd=2
stdio_test: Standard I/O Check: fprintf to stderr
ostest_main: putenv(Variable1=BadValue3)
ostest_main: setenv(Variable1, GoodValue1, TRUE)
ostest_main: setenv(Variable2, BadValue1, FALSE)
ostest_main: setenv(Variable2, GoodValue2, TRUE)
ostest_main: setenv(Variable3, GoodValue3, FALSE)
ostest_main: setenv(Variable3, BadValue2, FALSE)
show_variable: Variable=Variable1 has value=GoodValue1
show_variable: Variable=Variable2 has value=GoodValue2
show_variable: Variable=Variable3 has value=GoodValue3
ostest_main: Started user_main at PID=3
user_main: Begin argument test
user_main: Started with argc=5
user_main: argv[0]="ostest"
user_main: argv[1]="Arg1"
......................
End of test memory usage:
VARIABLE  BEFORE   AFTER
======== ======== ========
arena       5d8d0    5d8d0
ordblks         7        6
mxordblk    54920    54920
uordblks     4fc8     4fc8
fordblks    58908    58908
Final memory usage:
VARIABLE  BEFORE   AFTER
======== ======== ========
arena       5d8d0    5d8d0
ordblks         1        6
mxordblk    59278    54920
uordblks     4658     4fc8
fordblks    59278    58908
user_main: Exiting
ostest_main: Exiting with status 0
nsh>
```
Metrics:
Build: ARM GCC 10.x, 0 warnings, All tests PASS
Code Changes
File: `sched/semaphore/sem_clockwait.c`

Key modifications:

Pattern: