Eliminating recursive locks in SB could allow for using more efficient resource #948
Comments
Note any fix will conflict with #947, so hold off for now.
Tested on Linux: enabled all event types, then stopped SAMPLE_APP, which then reported
Sent a few noops and they all worked, so the double lock didn't seem to hang on this system.
I was mistaken. I had thought we were using fast mutexes, but we are in fact using recursive mutexes on POSIX, which is probably why it works. Using RECURSIVE is something of a cheat: it means using a slower resource all the time to avoid the effort of eliminating the double lock. Ideally we should fix the code so it works with normal (non-recursive) mutexes.
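To make the difference between the mutex types concrete, here is a minimal sketch using plain POSIX threads, not the OSAL API. The helper name `second_lock_result` is hypothetical. It locks a mutex twice from one thread and returns the result of the second attempt: a `PTHREAD_MUTEX_RECURSIVE` mutex accepts the relock, while a `PTHREAD_MUTEX_ERRORCHECK` mutex reports `EDEADLK`. A `PTHREAD_MUTEX_NORMAL` mutex would simply deadlock, so the helper must not be called with that type.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700 /* expose the PTHREAD_MUTEX_* type constants */
#endif
#include <errno.h>
#include <pthread.h>

/*
 * Hypothetical helper (not part of cFE or OSAL): lock a mutex of the given
 * type twice from the calling thread and return the second lock result.
 * Do NOT pass PTHREAD_MUTEX_NORMAL: relocking such a mutex deadlocks.
 */
int second_lock_result(int mutex_type)
{
    pthread_mutexattr_t attr;
    pthread_mutex_t     mutex;
    int                 second;

    pthread_mutexattr_init(&attr);
    pthread_mutexattr_settype(&attr, mutex_type);
    pthread_mutex_init(&mutex, &attr);
    pthread_mutexattr_destroy(&attr);

    pthread_mutex_lock(&mutex);          /* first lock always succeeds */
    second = pthread_mutex_lock(&mutex); /* nested lock attempt */

    if (second == 0)
    {
        pthread_mutex_unlock(&mutex);    /* undo the nested lock */
    }
    pthread_mutex_unlock(&mutex);        /* undo the first lock */
    pthread_mutex_destroy(&mutex);

    return second;
}
```

Building a debug configuration with error-checking mutexes is one way to surface nested locks as error codes instead of hangs, although whether that fits the OSAL implementations is a separate question.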
At least from the documentation I've come across, all three OSAL implementations in the framework support nested locks. I don't see it mentioned in the API that the implementation MUST support nested locks, though. There is a test that checks it. Given that, I wouldn't think we'd want to change the behavior of this API. As you mention, we could eliminate all the recursive locks and add a new API to improve performance, but at this point it is not a priority.
I agree WRT not changing the OSAL behavior at this time, but we definitely should work toward removing the requirement/dependency on recursive mutexes in CFE. It is just bad design to have tasks locking the same resources more than once. I wrote issue nasa/osal#623 for some things we can easily do in OSAL to facilitate this. The recommendation is to start with a debug message in the event that a task is double locking (this can be done at the shared layer).
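As an illustration of the debug-message idea, the sketch below wraps a recursive POSIX mutex with a depth counter and reports whenever the owning thread takes the lock again. It is not the OSAL shared layer; `tracked_mutex_t`, `tracked_init`, `tracked_lock`, and `tracked_unlock` are hypothetical names, and the message goes to `stderr` as a stand-in for a real debug log.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700 /* expose PTHREAD_MUTEX_RECURSIVE */
#endif
#include <pthread.h>
#include <stdio.h>

/* Hypothetical wrapper: a recursive mutex plus nesting bookkeeping. */
typedef struct
{
    pthread_mutex_t mutex;
    unsigned int    depth;        /* current nesting level of the owner */
    unsigned int    nested_count; /* how many nested locks were observed */
} tracked_mutex_t;

int tracked_init(tracked_mutex_t *m)
{
    pthread_mutexattr_t attr;
    int                 rc;

    m->depth        = 0;
    m->nested_count = 0;

    pthread_mutexattr_init(&attr);
    pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_RECURSIVE);
    rc = pthread_mutex_init(&m->mutex, &attr);
    pthread_mutexattr_destroy(&attr);
    return rc;
}

void tracked_lock(tracked_mutex_t *m)
{
    pthread_mutex_lock(&m->mutex);

    /* The caller owns the mutex here, so depth is safe to read and write. */
    if (m->depth > 0)
    {
        m->nested_count++;
        fprintf(stderr, "DEBUG: nested lock, depth now %u\n", m->depth + 1);
    }
    m->depth++;
}

void tracked_unlock(tracked_mutex_t *m)
{
    m->depth--; /* still protected: update before releasing */
    pthread_mutex_unlock(&m->mutex);
}
```

Because behavior is unchanged (the lock still nests), a wrapper like this could be enabled to find offending call sites before any switch to non-recursive mutexes.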
Yes, it does - at least for all the cases listed in the description. The general pattern implemented now will defer sending events (including debug events) until the function is finishing - after unlocking - so there is no nested/double lock. However - I did not explicitly test for nested locking as part of #1092.
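The deferred-event pattern described above can be sketched roughly as follows. This is not the actual cFE SB code: `bus_init`, `unsubscribe`, `send_event`, and the `EVT_*` identifiers are hypothetical stand-ins. The function records which event it wants to send while holding the lock and only sends it after unlocking. The mutex is created as `PTHREAD_MUTEX_ERRORCHECK` so that any nested lock would be counted instead of going unnoticed.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700 /* expose PTHREAD_MUTEX_ERRORCHECK */
#endif
#include <pthread.h>

/* Hypothetical event identifiers, for illustration only. */
enum { EVT_NONE = 0, EVT_UNSUB_OK = 1, EVT_UNSUB_NO_SUBS = 2 };

pthread_mutex_t bus_lock;
int             subscriptions;
int             nested_lock_errors; /* counts relock attempts by the owner */
int             last_event;

int bus_init(int initial_subs)
{
    pthread_mutexattr_t attr;
    int                 rc;

    subscriptions      = initial_subs;
    nested_lock_errors = 0;
    last_event         = EVT_NONE;

    pthread_mutexattr_init(&attr);
    pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
    rc = pthread_mutex_init(&bus_lock, &attr);
    pthread_mutexattr_destroy(&attr);
    return rc;
}

/* Stand-in for event sending that itself needs the bus lock. */
void send_event(int event_id)
{
    if (pthread_mutex_lock(&bus_lock) != 0)
    {
        nested_lock_errors++; /* caller still held the lock */
        return;
    }
    last_event = event_id;
    pthread_mutex_unlock(&bus_lock);
}

int unsubscribe(void)
{
    int pending = EVT_NONE;

    pthread_mutex_lock(&bus_lock);
    if (subscriptions > 0)
    {
        subscriptions--;
        pending = EVT_UNSUB_OK;
    }
    else
    {
        pending = EVT_UNSUB_NO_SUBS; /* remember it, do not send yet */
    }
    pthread_mutex_unlock(&bus_lock);

    /* Deferred: the event goes out only after the lock is released. */
    if (pending != EVT_NONE)
    {
        send_event(pending);
    }
    return pending;
}
```

With this shape the lock is never held across the event call, so a non-recursive mutex is sufficient.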
Closed by #1073
Describe the request
Recursive locks are possible in the following code (and may also occur in other locations):
cFE/fsw/cfe-core/src/sb/cfe_sb_api.c
Lines 1110 to 1113 in dc3d62b
cFE/fsw/cfe-core/src/sb/cfe_sb_api.c
Lines 1121 to 1124 in dc3d62b
cFE/fsw/cfe-core/src/sb/cfe_sb_task.c
Lines 1172 to 1173 in dc3d62b
cFE/fsw/cfe-core/src/sb/cfe_sb_task.c
Lines 1200 to 1201 in dc3d62b
Related: the locking in the SendPrevSubs command handling doesn't look like it really helps, since it has to unlock to send the message (the same issue as the commands that record route/map info to file). The typical use case is to enable subscription reporting and then send all previous subscriptions, so it may make sense to refactor (and possibly throttle).
To Reproduce
Clear filters on the debug messages and trigger (I stopped SAMPLE_APP to cause the pipe deletion), or just subscribe and unsubscribe twice to trigger CFE_SB_UNSUB_NO_SUBS_EID.
Expected behavior
Avoiding recursive locks could allow for using a more efficient resource on platforms where it's supported.
Code snips
See above.
System observed on:
From code analysis, tested on Ubuntu 18.04.
Additional context
From analysis during #928 and #947
Reporter Info
Jacob Hageman - NASA/GSFC