-
Notifications
You must be signed in to change notification settings - Fork 1.3k
Backport #14607 to branch_10x (Index open performs version check on each segment, ignores indexCreatedVersionMajor) #15431
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Backport #14607 to branch_10x (Index open performs version check on each segment, ignores indexCreatedVersionMajor) #15431
Conversation
…ent, ignores indexCreatedVersionMajor) to brnach_10x
|
gradlew check passes fine, but the nightly tests in TestBinaryBackwardsCompatibility are failing. Specifically TestBinaryBackwardsCompatibility.testReadNMinusTwoCommit and TestBinaryBackwardsCompatibility.testReadNMinusTwoSegmentInfos. These try to test opening N-2 version (=8.x) and expect success. This used to work earlier with VERSION_74 in checkHeaderNoMagic(), but fails now (for version < 8.6.0) since we moved to VERSION_86. Staying with VERSION_74 complains about missing Lucene_70 codec (same as main). "main" doesn't have this problem because N-2=9.x I have not yet understood why we test for N-2 when we anyway don't support the index. And hence still contemplating the right way forward on these failures. To reproduce: |
|
I'm trying to understand what's going on here. One thing that confuses me is why we have: and also what is the difference between "supported" and "binary supported"? |
|
I think what happened is that CheckIndex is now able to read some more of the back-compat indexes that we previously said were incompatible. But this doesn't really make sense since the 10x branch does not include any additional backwards codecs that were removed from main. Some of these indexes cannot be opened, but CheckIndex is able to check them and reports they are clean. I was able to get tests passing by relaxing a few version numbers and by I changing the exception type when we are unable to read the segments file from IllegalArgumentException to IndexFormatTooOldException to match the expectations of the test. Maybe that was bad, but it seems pretty harmless to me? I'm not sure if this change is safe or not. |
|
Thanks for taking a look and for your thoughts on this @msokolov .
Both main and branch_10 include backward codecs starting 8.x. The tests in TestBinaryBackwardsCompatibility test for binary compatibility of indexes for version X-2. The list of versions tested is in https://github.com/apache/lucene/blob/main/lucene/backward-codecs/src/test/org/apache/lucene/backward_index/versions.txt. It begins with 9.x for main, and 8.x for branch_10x, which explains why we don't hit this issue on main, but encounter this on 10x the moment we bump up the min version of SegmentInfos to VERSION_86 in checkHeaderNoMagic().
I think this is since CheckIndex seems to be testing for whether it can actually read individual segments while ignoring the MIN_SUPPORTED_VERSION and is able to do so since the codecs are present (https://github.com/apache/lucene/blob/branch_10x/lucene/core/src/java/org/apache/lucene/index/CheckIndex.java#L637) |
…odec() throws IndexFormatTooOldException if a default Lucene codec is not found
|
I think I agree with your solution here. Since certain tests are checking for N-2 version , might be prudent to not break that compatibility on a minor version. So I reverted the header check on SegmentInfos to VERSION_74. Looks like a lot of these N-2 tests were introduced in 9.10 in #13046. I get it that we probably want to see if we can read such unsupported indexes with the supplied backward-codes, but the motivation behind that is still not super clear to me. Also, one could argue that if a default backward Lucene codec is not found on the class path, the probability of it being a too old index is much higher than a missing backward-codecs.jar . So throwing an IndexFormatTooOldException in readCodec() should be acceptable. I have re-worked this patch. I may or may not have spent an unhealthy amount of time reasoning about this problem 😁 |
|
Thanks for merging this Mike. I was thinking maybe I can create another PR for main and push the SegmentInfos min version in checkHeaderNoMagic() back to VERSION_74 ? And throw the same IndexFormatTooOldException in readCodec() as here. That would help continue the support for 8.x indexes since the backward codec is present anyway? |
|
I think it makes sense, also to keep similar code between the two branches will help people in the future! |
|
Created #15454 |
Description
Backport #14607 to branch_10x