New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
LUCENE-8962: Merge segments on getReader #1623
Conversation
@mikemccand @msokolov @msfroh I tried to build the minimal necessary code to enable this feature. I do actually think that the refresh / getReader change is much simpler than the commit part since we don't need to keep a cloned SegmentInfos in sync. I opened this really as a question if it can be that simple? |
Wow! This is incredibly simple! Thanks to the clean approach @s1monw worked out for commit-on-merge, awesome! I think tests would exercise this due to |
So this would merge small commits on refresh? That's cool. I wonder if it would be more obvious to users if we call the MergeTrigger REFRESH? I see that the trigger method is IndexWriter.getReader, but it seems like ultimately the higher level event that is more familiar is refresh. |
Small segments, yes.
+1, refresh is more recognized in the outside world :) I have been beasting this and uncovering small test failures, in tests that are confused that they do not have the expected number of segments. I'll push some fixes for those ... |
Here's a fun tragic failure test that repros:
|
A different, reproducing, test failure, likely from the same cause:
|
this assertion checks that we don't hold the full flush lock. I am trying to remember why I added this assertion. I understand the assertion above which also has a comment but I am unsure about the full flush lock it should not cause any deadlocks or any issues. I will keep thinking about it |
@mikemccand I do understand the issue now why holding the flushLock is illegal here. The problem is again the lock ordering in combination with the commitLock. One option that we have here is to remove the flushLock altogether and replace it's usage with the commitLock. I guess we need to find a better or new name for it but I don't see where having two different locks buys us much since they are both really just used to sync on administration of the IW. I personally also don't see why it would buys us anything in terms of concurrency. WDYT |
@mikemccand @msokolov @msfroh I pushed a new and slightly more complex but afaik correct approach to do the merge during getReader. Would be great to get some feedback. I think it's still pretty contained. |
Awesome, thanks @s1monw! I will try to have a look soon. I kicked off beasting of all Lucene (core + modules) tests with this change ... no failures yet after 31 iterations. |
OK, it has run 1061 iterations now of all Lucene (core + modules) tests, and some interesting (~15) failures:
This one does reproduce with this PR (but not on mainline). BTW, we need to fix mainline's This one also repros, only with this PR, but is maybe a test issue? Not sure:
|
Ugh, also these test failures that I put onto the wrong PR: #1743 (comment) |
I went through the code yesterday to try to understand what's going on. I'm not familiar with the pooled reader stuff, but it seems reasonable to me. I trust the tests (especially w/ added test coverage) far more than I trust my knowledge of that part of the code. |
I pushed a new commit and fixed yet I can't reproduce this one: |
@mikemccand nevermind I think I found the issue. I think it's ready now. |
Another failure that reproduces only on the PR:
|
Those were the only two failures found after 1033 test iterations! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The PR looks great to me! But looks like there are a couple very rare test failures...
@@ -404,7 +404,7 @@ private PendingDeletes newPendingDeletes(SegmentReader reader, SegmentCommitInfo | |||
private boolean noDups() { | |||
Set<String> seen = new HashSet<>(); | |||
for(SegmentCommitInfo info : readerMap.keySet()) { | |||
assert !seen.contains(info.info.name); | |||
assert !seen.contains(info.info.name) : "seen twice: " + info.info.name ; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I love seeing diffs like this one, adding a String
message to an otherwise cryptic assert
! It makes me realize you must have had a hellacious debugging session!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
many fun issues in this PR to be honest. IW is tricky as hell in some places like we are incRefing files in StandardDirectoryReader but not in IW for NRT readers is mindblowing :D
@mikemccand I pushed a fix for the failures. I will add a dedicated test to make sure it's covered too |
…ed away before we can reopen the reader on it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Thanks Simon. I left two small comments.
… register and merge execution
I will restart beasting on the latest PR now. Thanks @s1monw! |
@mikemccand I think we are ready here WDYT? |
+1! Thanks @s1monw! I will continue beasting. The failures are very rare now ... I ran 196 iters of slow + nightly Lucene core + modules tests, and hit only ~4 interesting failures. E.g. this failing seed repros on the PR but not on mainline:
|
Hmm, not that I am using the larger ( |
@mikemccand this test reproduces on master too I opened https://issues.apache.org/jira/browse/LUCENE-9477 |
Great, thanks @s1monw! Given that beasting is now uncovering pre-existing issues I think we should push this PR! Thank you! |
Add IndexWriter merge-on-refresh feature to selectively merge small segments on getReader, subject to a configurable timeout, to improve search performance by reducing the number of small segments for searching. Co-authored-by: Mike McCandless <mikemccand@apache.org>
Add IndexWriter merge-on-refresh feature to selectively merge small segments on getReader, subject to a configurable timeout, to improve search performance by reducing the number of small segments for searching. Co-authored-by: Mike McCandless <mikemccand@apache.org>
Add IndexWriter merge-on-refresh feature to selectively merge small segments on getReader, subject to a configurable timeout, to improve search performance by reducing the number of small segments for searching.