New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
LUCENE-9669: Add an expert API to allow opening indices created < N-1 #2212
Conversation
Today we force indices that were created with N-2 and older versions of lucene to fail on open. This check doesn't even check if the codecs are available. In order to allow users to open older indices and for us to support N-2 versions this change adds an API on DirectoryReader to specify a mininum index version on a per reader basis. This doesn't apply for the IndexWriter which will fail on opening older indices.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With this change, expert users would have to pass the minimum major version that is required to both DirectoryReader#listCommits
and DirectoryReader#open
. I wonder if we really need it in listCommits
, maybe we could always return commits regardless of whether the index creation version is too old or not and only have the check in DirectoryReader#open
? What do you think?
...ne/backward-codecs/src/test/org/apache/lucene/backward_index/TestBackwardsCompatibility.java
Outdated
Show resolved
Hide resolved
lucene/core/src/java/org/apache/lucene/index/DirectoryReader.java
Outdated
Show resolved
Hide resolved
lucene/core/src/java/org/apache/lucene/index/StandardDirectoryReader.java
Outdated
Show resolved
Hide resolved
@jpountz thanks, great feedback. I pushed a new commit |
expectThrows(IllegalArgumentException.class, () -> DirectoryReader.listCommits(dir)); | ||
// TODO fix this once we have the codec for 7.0 recreated | ||
assertEquals( | ||
"Could not load codec 'Lucene70'. Did you forget to add lucene-backward-codecs.jar?", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm surprised that this works, not all 7.x indices use Lucene70 as a codec, e.g. there must be some indices that use Lucene75 or something like that as a codec? What am I missing?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it seems like we only had LUCENE70
in this release https://github.com/apache/lucene-solr/tree/branch_8x/lucene/backward-codecs/src/java/org/apache/lucene/codecs
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just a few documentation nits
lucene/core/src/java/org/apache/lucene/index/DirectoryReader.java
Outdated
Show resolved
Hide resolved
* Expert: returns an IndexReader reading the index in the given {@link IndexCommit}. This method | ||
* allows to open indices that were created wih a Lucene version older than N-1 provided that all | ||
* all codecs for this index are available in the classpath and the segment file format used was | ||
* created with Lucene 7 or older. Users of this API must be aware that Lucene doesn't guarantee |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of "Lucene 7" should we reference the minSupportedMajorVersion
parameter here? Or .. is it Version.MIN_SUPPORTED_MAJOR
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is due to the fact that the segments info format only supports 7.0 and upwards
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The javadoc should read 'Lucene 7 or newer' I think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great! I left some minorish comments.
@@ -3900,6 +3907,12 @@ public static Options parseOptions(String[] args) { | |||
} | |||
i++; | |||
opts.dirImpl = args[i]; | |||
} else if ("-min_version_created".equals(args[i])) { | |||
if (i == args.length - 1) { | |||
throw new IllegalArgumentException("ERROR: missing value for -min_version_created"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, we should also update the Usage: ...
exception (around line 3928 in this modified version) to document this new option?
If a user tries to CheckIndex
a too-old index without this option they'll see a IndexFormatTooOldException
right? Should we catch that and rethrow w/ better message suggesting to use this option? Should we maybe by default just set this option (always allow CheckIndex
on a too-old index as long as you have the old Codecs around...)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am all for old indices and remove this option. WDOT?
lucene/core/src/java/org/apache/lucene/index/DirectoryReader.java
Outdated
Show resolved
Hide resolved
lucene/core/src/java/org/apache/lucene/index/DirectoryReader.java
Outdated
Show resolved
Hide resolved
@mikemccand pushed changes |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great, thanks @s1monw!
I plan to merge this during the next 24 hours thanks for the reviews |
…apache#2212) Today we force indices that were created with N-2 and older versions of Lucene to fail on open. This check doesn't even check if the codecs are available. In order to allow users to open older indices and for us to support N-2 versions this change adds an API on DirectoryReader to specify a minimum index version on a per reader basis. This doesn't apply for the IndexWriter which will fail on opening older indices.
Today we force indices that were created with N-2 and older versions of lucene
to fail on open. This check doesn't even check if the codecs are available. In order
to allow users to open older indices and for us to support N-2 versions this change
adds an API on DirectoryReader to specify a mininum index version on a per reader basis.
This doesn't apply for the IndexWriter which will fail on opening older indices.