LUCENE-9084: circular synchronization wait (potential deadlock) in AnalyzingInfixSuggester #1064
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Issue Report On JIRA
I also reported this at:
https://issues.apache.org/jira/browse/LUCENE-9084#
Bug description
Detailed code (snippets and links) are in the sections after this overview (section Detailed Code and This Patch's Code).
Method
ensureOpen()
issynchronized
(acquiresthis
) and its body contains asynchronized (searcherMgrLock)
block (i.e., then acquiressearcherMgrLock
).Method
ensureOpen()
is called two times from public methodsadd()
andupdate()
.A thread calling public methods
add()
orupdate()
will acquire locks in order:Public method
build()
has asynchronized (searcherMgrLock)
block in which it calls methodadd()
. Methodadd()
, as described above, calls methodensureOpen()
.Therefore, a thread calling public method
build()
will acquire locks in order:2 threads can acquire locks in different order which may cause a circular wait (deadlock).
I do not know which threads call these methods, but there is a lot of synchronization in these methods and in this file, so I think these methods must be called concurrently.
One thread can acquire:
this -> searcherMgrLock
(the first order above)and the other thread can acquire:
searcherMgrLock -> this
(the second order above).Note how the above 2 orders lead to a circular wait.
Detailed Code
Method
ensureOpen()
issynchronized
and its body contains asynchronized (searcherMgrLock)
:https://github.com/apache/lucene-solr/blob/master/lucene/suggest/src/java/org/apache/lucene/search/suggest/analyzing/AnalyzingInfixSuggester.java#L371-L379
Method
ensureOpen()
is called two times from public methodsadd()
andupdate()
:https://github.com/apache/lucene-solr/blob/master/lucene/suggest/src/java/org/apache/lucene/search/suggest/analyzing/AnalyzingInfixSuggester.java#L394-L395
https://github.com/apache/lucene-solr/blob/master/lucene/suggest/src/java/org/apache/lucene/search/suggest/analyzing/AnalyzingInfixSuggester.java#L406-L407
Public method
build()
has asynchronized (searcherMgrLock)
block in which it calls methodadd()
:https://github.com/apache/lucene-solr/blob/master/lucene/suggest/src/java/org/apache/lucene/search/suggest/analyzing/AnalyzingInfixSuggester.java#L278-L310
Method
add()
is the same one I linked above.This Patch's Code
Note that method
ensureOpen()
(inlined above) is the only place (method or synchronization block) that issynchronized
onthis
.All the other synchronizations in this file are on
searcherMgrLock
.This CR removes the
synchronized
onthis
(again, being the onlysynchronized
onthis
, we can do this change safely).And moves
synchronized (searcherMgrLock)
a few lines above, to protect the entire code (that otherwise was protected bysynchronized
onthis
).The above breaks the lock cycle I described earlier.
The fix looks big because it changes indentation.
But only one line is moved by a few lines up.
I.e., from this:
To this:
Here are all the places where
synchronized (searcherMgrLock)
appears in this file (and again, no othersynchronized
on other objects is done):I.e., doing the synchronization like above is safe and consistent with the rest of the file.