New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Race condition in wtp #1274
Race condition in wtp #1274
Conversation
cancelation cleanup handlers. Previously, trying to shutdown a thread that has not yet been properly initialized could lead to a deadlock. This change makes wtpStartWrkr() synchronous in regard to the initialization of the newly created thread. Thanks to Rado Sroka for the analysis and an initial patch.
Note: the debian CI failuer is due to a change in CI environment, for which your version of master did not have the necesary patch (we changed CI to detect a previously undetected error condition). IAW: you do not need to care about that failure. All is well. |
I did an analysis for this race.
|
What fix do you have @rgerhards ? This cause the problem with system restart that rsyslog hangs, we had this problem in RHEL long ago and it's reproducible with newer versions of rsyslog. |
The problem on the Debian Buildbot Slave is unrelated. This is the issue: #1271 I need to review this PR, but quick overview made me think it is good ... and I appreciate this fix, especially as we were quite puzzled on the original issue. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks good (actually excellent ;-))
/* wait for the new thread to initialize its signal mask and | ||
* cancelation cleanup handler before proceeding | ||
*/ | ||
d_pthread_cond_wait(&pThis->condThrdInitDone, &pThis->mutWtp); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So I realize that I am a bit late to the game here, but the condition wait call does not appear to be in a loop. Perhaps I am missing the loop? POSIX allows for spurious wake ups to occur for callers of pbench_cond_wait()
, so the recommended sequence/hand shake is to:
WAITER:
lock
loop
check predicate (memory location value)
if not expected state of predicate
then call condition wait (which unlocks mutex and reacquires on wakeup)
else exit loop
unlock
SIGNALER:
lock
change wait predicate
signal or broadcast
unlock
This code does not appear to do that, so it is likely wide open to race conditions.
This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs. |
The shutdown sequence in wtp relies for its operation on signals and cancelation cleanup handlers. Previously, trying to shutdown a thread that has not yet been properly initialized could lead to a deadlock. This change makes wtpStartWrkr() synchronous in regard to the initialization of the newly created thread.
This issue extends an older one here: #966