New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Alternative approach to LUCENE-8962 #1576
Conversation
see #1552 for reference |
|
||
boolean await(long timeout, TimeUnit unit) { | ||
for (OneMerge merge : merges) { | ||
if (merge.await(timeout, unit) == false) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks suspicious when there is more than one merge. Shouldn't the timeout decrease as time is used up by earlier merges?
In practice, when is there more than one? I've been confused on this matter when I developed a custom MP/MS.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in a real change that's correct. in a prototype as this is it's really just there to visualize the idea. I didn't do this on purpose to not discuss impl details. that's not the point of this it's really just a PR to make commenting simpler.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I fixed this in the followup PR #1585
This approach makes sense to me. I like how much simpler the addition of MergeSpecification.await() makes things, versus the CountDownLatch hackery of the previous approach. Also, updatePendingMerges returning the MergeSpecification is much cleaner than explicitly computing a merge from within prepareCommitInternal. |
superseded by #1585 |
final int numMerges = spec.merges.size(); | ||
for(int i=0;i<numMerges;i++) { | ||
final MergePolicy.OneMerge merge = spec.merges.get(i); | ||
merge.maxNumSegments = maxNumSegments; | ||
} | ||
} | ||
} else { | ||
spec = mergePolicy.findMerges(trigger, segmentInfos, this); | ||
switch (trigger) { | ||
case COMMIT: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is an inconsistency here that suggests something is wrong, or at least confusing enough to deserve a comment. For case COMMIT, we call a findFullFlushMerges. Shouldn't it be on case FULL_FLUSH to be consistent with the method we are calling? Or should findFullFlushMerges be called findCommitMerges?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Full flush happens for refresh and commit.
But we have only implemented "merge on commit" so far.
I would love to also add "merge on refresh", but until we do so, I think it's correct for IndexWriter
to separate out the COMMIT
case here like this.
this is just a prototype - only intended for discussion