SOLR-16412 : Race condition in SizeLimitedDistributedMap for cleanup #1032
@noblepaul @chatman Could you review this, please? 😊
solr/core/src/test/org/apache/solr/cloud/TestSizeLimitedDistributedMap.java
Removed unused import
@risdenk would you mind taking a second look, please? Many thanks! 🙇🏼
zookeeper.delete(dir + "/" + child, -1, true);
if (onOverflowObserver != null)
  onOverflowObserver.onChildDelete(child.substring(PREFIX.length()));
} catch (KeeperException.NoNodeException e) {
nit: naming the exception ignore or ignored, I think, makes IDEs not care that the exception isn't used. It doesn't change the behavior at all; it just makes clear that we meant to ignore the exception.
👍🏼 changed
Can someone with write access please merge if it's all good? 😊
Thanks @patsonluk
https://issues.apache.org/jira/browse/SOLR-16412
Description
Details of the issue are described in the JIRA issue linked above.
Although we could enforce synchronization to prevent threads from purging the same set of child nodes, adding extra blocking might not be desirable.
Instead, SizeLimitedDistributedMap#shrinkIfNeeded should be more forgiving if any items/nodes in the map no longer exist.
Solution
Catch KeeperException.NoNodeException in SizeLimitedDistributedMap#shrinkIfNeeded if the node has already been deleted, which is most likely caused by a concurrent shrinkIfNeeded call.
Tests
Added unit test case TestSizeLimitedDistributedMap#testConcurrentCleanup, which shows that the exception is thrown without this fix.
Checklist
Please review the following and check all that apply:
- main branch
- ./gradlew check
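For readers who want to see the idea from the Solution section in isolation, here is a minimal sketch of tolerant concurrent cleanup. It does not use ZooKeeper or Solr code: FakeStore, purge, and runConcurrentCleanup are hypothetical names invented for this illustration, and NoSuchElementException stands in for KeeperException.NoNodeException. Several threads purge from the same stale snapshot of children, and each one simply skips nodes that another thread has already deleted.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class TolerantCleanupSketch {

  /** Hypothetical in-memory stand-in for the ZooKeeper-backed node store. */
  static final class FakeStore {
    private final Set<String> nodes = ConcurrentHashMap.newKeySet();

    void create(String name) {
      nodes.add(name);
    }

    List<String> children() {
      return new ArrayList<>(nodes);
    }

    /** Mimics a delete that fails when the node is already gone. */
    void delete(String name) {
      if (!nodes.remove(name)) {
        throw new NoSuchElementException("no node: " + name);
      }
    }

    int size() {
      return nodes.size();
    }
  }

  /** Deletes each child in the snapshot; returns how many this caller removed. */
  static int purge(FakeStore store, List<String> snapshot) {
    int deleted = 0;
    for (String child : snapshot) {
      try {
        store.delete(child);
        deleted++;
      } catch (NoSuchElementException ignored) {
        // Another cleanup thread removed this node first. The node is gone,
        // which is the desired end state, so there is nothing to do.
      }
    }
    return deleted;
  }

  /** Returns {total deletes across all threads, nodes remaining}. */
  static int[] runConcurrentCleanup(int nodeCount, int threads) throws Exception {
    FakeStore store = new FakeStore();
    for (int i = 0; i < nodeCount; i++) {
      store.create("node-" + i);
    }
    // Every thread works from the same snapshot, so their deletes overlap.
    List<String> snapshot = store.children();
    ExecutorService pool = Executors.newFixedThreadPool(threads);
    try {
      CountDownLatch start = new CountDownLatch(1);
      List<Future<Integer>> results = new ArrayList<>();
      for (int t = 0; t < threads; t++) {
        results.add(pool.submit(() -> {
          start.await(); // release all threads at once to maximize contention
          return purge(store, snapshot);
        }));
      }
      start.countDown();
      int total = 0;
      for (Future<Integer> f : results) {
        total += f.get(); // would rethrow if a purge had failed
      }
      return new int[] {total, store.size()};
    } finally {
      pool.shutdown();
    }
  }

  public static void main(String[] args) throws Exception {
    int[] r = runConcurrentCleanup(100, 4);
    System.out.println("deleted=" + r[0] + " remaining=" + r[1]);
  }
}
```

Because the stand-in store removes each node atomically, every node is deleted by exactly one thread, so the total equals the node count and nothing remains. Removing the catch block in purge would make the overlapping deletes surface as exceptions from f.get(), which is roughly the failure mode the PR's test is described as reproducing.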