race condition in AbstractAsyncWriter causing "java.lang.RuntimeException: Queue should be empty but is size:" #564
Comments
I think reversing the order of operations in the while loop will fix it, i.e. checking `isClosed` before checking whether the queue is empty.
Can someone confirm that there is indeed a race condition and that the proposed fix makes it go away?
Discussed in person with @akiezun, and we convinced ourselves that there is indeed a potential race here, because the writer thread checks for emptiness while the main thread can still add items. Checking for closure first, and only checking emptiness once the writer is in a state where no more items can be added (i.e., it's already closed), seems like it should fix the race, but empirical testing and additional sets of eyes are required. The way the class is currently constructed seems like an invitation to races of this sort, since you have an
@akiezun Should we disable async I/O in GATK4 until this is resolved?
@droazen If possible I'd rather fix this. But yes, if we have no fix for this by the time we close the milestone, we'll turn off async I/O.
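For anyone who wants to experiment with the reordering discussed above, here is a minimal sketch of a writer thread whose loop checks closure before emptiness. `SketchAsyncWriter`, its queue size, and the 50 ms poll timeout are all made up for illustration; this is not the htsjdk class, and it assumes `write()` and `close()` are only ever called from a single thread.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical stand-in for AbstractAsyncWriter, not the htsjdk class itself.
// Assumes write() and close() are called from one thread, as in the reported scenario.
class SketchAsyncWriter {
    private final BlockingQueue<String> queue = new ArrayBlockingQueue<>(16);
    private final AtomicBoolean isClosed = new AtomicBoolean(false);
    private final List<String> written = new ArrayList<>(); // writer thread only, until join()
    private final Thread writer = new Thread(this::drain, "sketch-writer");

    SketchAsyncWriter() {
        writer.setDaemon(true);
        writer.start();
    }

    void write(String item) {
        if (isClosed.get()) throw new IllegalStateException("writer is closed");
        try {
            queue.put(item); // blocks while the queue is full
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while queueing", e);
        }
    }

    private void drain() {
        // Closure first: emptiness is only consulted once no more items can be added.
        while (!isClosed.get() || !queue.isEmpty()) {
            try {
                String item = queue.poll(50, TimeUnit.MILLISECONDS);
                if (item != null) written.add(item); // stand-in for the real synchronous write
            } catch (InterruptedException e) {
                // Woken by close(); fall through and re-check the loop condition.
            }
        }
    }

    // Returns the number of items the writer thread handled.
    int close() {
        if (!isClosed.getAndSet(true)) {
            writer.interrupt(); // wake the writer if it is parked in poll()
            try {
                writer.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while closing", e);
            }
            if (!queue.isEmpty()) {
                throw new IllegalStateException("Queue should be empty but is size: " + queue.size());
            }
        }
        return written.size();
    }

    public static void main(String[] args) {
        SketchAsyncWriter w = new SketchAsyncWriter();
        for (int i = 0; i < 1000; i++) w.write("record " + i);
        System.out.println("written: " + w.close());
    }
}
```

With this ordering, the emptiness check only decides the outcome after `isClosed` has been observed as true, at which point the queue can only shrink. A passing run is not proof that the race is gone, so this still needs the empirical testing mentioned above.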
Subject of the issue
I think there's a race condition in AbstractAsyncWriter that is causing broadinstitute/gatk#1699
The race is exposed by a context switch in the middle of evaluating the `||` condition of the writer thread.
The scenario is like this:
1. Let's say at a specific moment the queue is empty (but the writer is not closed).
2. The writer thread starts evaluating the loop condition `while (!queue.isEmpty() || !isClosed.get()) {`
3. It evaluates the first condition, and the queue is empty, so it moves on to evaluating the second condition. But before it gets to it, ...
4. The main thread calls `write` a few times, so the queue becomes non-empty.
5. The main thread calls `close()`.
6. Main runs `!this.isClosed.getAndSet(true)` and thus flips `isClosed` to true.
7. Then main checks `this.queue.isEmpty()`, which is false, so it moves on.
8. Then main blocks on `writer.join()` and waits.
9. The writer thread resumes and evaluates `isClosed.get()`, which now returns true, so the loop is not entered (because the first part of the condition is still remembered as false).
10. The writer thread ends the `run()` method and dies.
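The interleaving above is hard to trigger on demand, so here is a single-threaded replay of it as a sketch. `RaceInterleaving` and its method names are hypothetical; each step is annotated with the thread that would run it in the real scenario, and `queue.clear()` / `queue.poll()` stand in for the real loop body. The second method replays the same context switch with the operands swapped.

```java
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.concurrent.atomic.AtomicBoolean;

// Single-threaded replay of the reported interleaving (hypothetical helper class).
class RaceInterleaving {

    // Original order: !queue.isEmpty() || !isClosed.get()
    static int leftoverWithOriginalOrder() {
        Queue<String> queue = new ArrayDeque<>();
        AtomicBoolean isClosed = new AtomicBoolean(false);

        // writer: evaluates the left operand; the queue is empty, so this is false
        boolean queueHasItems = !queue.isEmpty();

        // context switch to main
        queue.add("record 1");     // main: write
        queue.add("record 2");     // main: write
        isClosed.getAndSet(true);  // main: close() flips the flag, then would block in join()

        // writer resumes and evaluates the right operand; the flag is set, so this is false
        boolean stillOpen = !isClosed.get();

        if (queueHasItems || stillOpen) {
            queue.clear(); // stand-in for the loop body draining the queue
        }
        return queue.size(); // items stranded when run() returns
    }

    // Reordered: !isClosed.get() || !queue.isEmpty()
    static int leftoverWithReorderedCheck() {
        Queue<String> queue = new ArrayDeque<>();
        AtomicBoolean isClosed = new AtomicBoolean(false);

        // writer: evaluates the left operand; not closed yet, so || short-circuits to true
        boolean stillOpen = !isClosed.get();

        // context switch to main, same actions as before
        queue.add("record 1");
        queue.add("record 2");
        isClosed.getAndSet(true);

        // writer resumes inside the loop and keeps going until closed AND empty
        while (stillOpen || !queue.isEmpty()) {
            queue.poll();                 // stand-in for writing one item
            stillOpen = !isClosed.get();  // re-evaluated on every pass
        }
        return queue.size();
    }

    public static void main(String[] args) {
        System.out.println("original order leftover: " + leftoverWithOriginalOrder());
        System.out.println("reordered leftover: " + leftoverWithReorderedCheck());
    }
}
```

In the replay, the original order leaves two items in the queue when the writer exits, while the reordered check leaves none. This only illustrates the one interleaving described here and does not rule out other races in the class.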