-
Notifications
You must be signed in to change notification settings - Fork 790
SOLR-17725: Add merge policy to block older segments from participating in merges #3883
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
|
Thank you for sharing this! 🥳 |
dsmiley
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If no merge is found, perhaps it should seek a possible merge of the previous major version? Or just leave that as a future improvement possibility, I suppose.
I could see this being contributed to Lucene.
| * Only allows latest version segments to be considered for merges. That way a snapshot of older | ||
| * segments can remain consistent | ||
| */ | ||
| public class LatestVersionFilterMergePolicy extends MergePolicy { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Have you seen FilterMergePolicy for delegation?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great suggestion. Much better for delegation.
| import org.apache.lucene.util.Version; | ||
|
|
||
| /** | ||
| * Only allows latest version segments to be considered for merges. That way a snapshot of older |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suggest this wording instead:
Prevents merging segments with differing Lucene major version number origins. This assists in upgrading to a future Lucene major version.
| @Override | ||
| public MergeSpecification findMerges( | ||
| MergeTrigger mergeTrigger, SegmentInfos infos, MergeContext mergeContext) throws IOException { | ||
| /*we don't want to remove from the original SegmentInfos, else the segments may not carry forward upon a commit. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you could merely say that input SegmentInfos should be treated as immutable, and that modifying it is dangerous (ideally would be impossible). An FYI. Your text had me thinking for a moment that the concern was specific to this MP but it's a general statement across Lucene, I see. When I see other usages in Lucene, I don't see dire warnings.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rephrased to not sound alarming
| Map<SegmentCommitInfo, Boolean> segmentsToMerge, | ||
| MergeContext mergeContext) | ||
| throws IOException { | ||
| return delegatePolicy.findForcedMerges( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the constraint isn't applied, but shouldn't it?
| @Override | ||
| public MergeSpecification findForcedDeletesMerges( | ||
| SegmentInfos segmentInfos, MergeContext mergeContext) throws IOException { | ||
| return delegatePolicy.findForcedDeletesMerges(segmentInfos, mergeContext); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the constraint isn't applied, but shouldn't it?
|
Appreciate the review @dsmiley ! Still working through some of the details of this draft and will revise soon. Thanks. |
Including previous major version could mean an older version segment could potentially merge with a current version segment. Which means when we upgrade to the next Lucene version, the index may not open, since the check in Lucene now checks for SegmentCommitInfo.minVersion (oldest of the different versions of segments merging to form a given segment) and that may now fall below the N-1 version compatibility.
Makes sense. I'll try asking on the dev list. |
dsmiley
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When I said this:
If no merge is found, perhaps it should seek a possible merge of the previous major version? Or just leave that as a future improvement possibility, I suppose.
I wasn't clear, based on your response. I mean, if there is no current version merge, then perhaps we could look to merge segments of the previous version. I definitely don't mean merging segments across major versions, since that's precisely what we want to avoid.
But on second thought, it's not so simple, based on the way Lucene works. Once Lucene decides that the previous version's segments don't need another merge (i.e. there aren't quite too many of them), it's never going to change. But maybe this strategy makes sense for a force merge (i.e. optimize).
| return in.findFullFlushMerges(mergeTrigger, getFilteredInfosClone(infos), mergeContext); | ||
| } | ||
|
|
||
| private SegmentInfos getFilteredInfosClone(SegmentInfos infos) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: clone is an implementation detail that doesn't belong in the method. For example, maybe we see that rebuilding this thing is expensive-ish and we decide to first check if all segments are current to not clone. Or if no segments are updated, and return an empty one.
https://issues.apache.org/jira/browse/SOLR-17725
Description
Please provide a short description of the changes you're making with this pull request.
Solution
Please provide a short description of the approach taken to implement your solution.
Tests
Please describe the tests you've developed or run to confirm this patch implements the feature or solves the problem.
Checklist
Please review the following and check all that apply:
mainbranch../gradlew check.