-
Notifications
You must be signed in to change notification settings - Fork 1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
PrimaryNode: add configurable timeout to waitForAllRemotesToClose #11822
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for making the change, overall LGTM, just have some small comments.
Also please don't forget to add an entry in CHANGES.txt
lucene/replicator/src/java/org/apache/lucene/replicator/nrt/PrimaryNode.java
Outdated
Show resolved
Hide resolved
lucene/replicator/src/java/org/apache/lucene/replicator/nrt/PrimaryNode.java
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This makes total sense -- tying Primary to the behavior of Replicas is dangerous coupling. It would be great to fix/improve the design so that Primary never has to wait for Replicas to do anything, but let's start with this great baby step.
lucene/replicator/src/java/org/apache/lucene/replicator/nrt/PrimaryNode.java
Outdated
Show resolved
Hide resolved
lucene/replicator/src/java/org/apache/lucene/replicator/nrt/PrimaryNode.java
Outdated
Show resolved
Hide resolved
a4ae16a
to
4280bb7
Compare
I updated this PR to rename the field to include I don't think this is a great candidate for a random test - randomization is wonderful for perturbing data and finding edge cases in algorithms. In this case, it is just an int we compare to a clock, so randomizing doesn't seem likely to uncover any helpful edge cases. Please let me know if this is still desired. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @stevenschlansker LGTM, I think this change contains no backward breaking API change so we should make it in 9.5?
4280bb7
to
b8c22d7
Compare
I merged it but seems there're test failure
But seems not related to this PR, I'll create another issue if that failure is reproducible. |
OK - I did run |
I tried to reproduce the issue but couldn't. So likely a transient or
extremely rare test failure, and should not be related to the PR
…On Wed, Oct 19, 2022, 16:44 Steven Schlansker ***@***.***> wrote:
OK - I did run ./gradlew check so I don't think I broke anything, but
please let me know if it does end up being related!
—
Reply to this email directly, view it on GitHub
<#11822 (comment)>, or
unsubscribe
<https://github.com/notifications/unsubscribe-auth/AEFSB7EMFIIHD5IKGTMNEKLWECBXJANCNFSM6AAAAAAQWFSERU>
.
You are receiving this because you modified the open/close state.Message
ID: ***@***.***>
|
Adds a configurable timeout for PrimaryNode waiting for remotes to close.
The default matches existing behavior. In the long run, maybe this wait loop goes away entirely, but I elected to make the most compatible change first.
Fixes #11674