Ensure that only 1 signal handler runs at a time #95
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
When I ran qdrouterd with perf and pressed Ctrl+C, perf would send SIGTERM as well as SIGINT right after each other to the router.
Scenario:
Tthe first thread would pick up and handle the first signal in handle_signals_LH(). However, within the function, it unlocks the qd_server->lock before calling the registered handler.
When it unlocks this lock, some other thread will pick up the second signal and jump into the qd_server_pause() code, where it will wait indefinitely for threads to pause (I saw this in GDB, where 1 thread was 'missing', and all others marked as canceled). The original handler will have canceled the thread trying to pause all others, but it is not able to jump out.
This patch ensures that only 1 signal handler can run at a time, which fixes the issue for me. Maybe another approach involving a signal handler lock would be better, but this signal_handler_running variable is protected by the qd_server->lock lock.