Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

LUCENE-9669: Add an expert API to allow opening indices created < N-1 #2212

Merged
merged 14 commits into from Jan 19, 2021

Conversation

s1monw
Copy link
Member

@s1monw s1monw commented Jan 16, 2021

Today we force indices that were created with N-2 and older versions of lucene
to fail on open. This check doesn't even check if the codecs are available. In order
to allow users to open older indices and for us to support N-2 versions this change
adds an API on DirectoryReader to specify a mininum index version on a per reader basis.
This doesn't apply for the IndexWriter which will fail on opening older indices.

Today we force indices that were created with N-2 and older versions of lucene
to fail on open. This check doesn't even check if the codecs are available. In order
to allow users to open older indices and for us to support N-2 versions this change
adds an API on DirectoryReader to specify a mininum index version on a per reader basis.
This doesn't apply for the IndexWriter which will fail on opening older indices.
Copy link
Contributor

@jpountz jpountz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With this change, expert users would have to pass the minimum major version that is required to both DirectoryReader#listCommits and DirectoryReader#open. I wonder if we really need it in listCommits, maybe we could always return commits regardless of whether the index creation version is too old or not and only have the check in DirectoryReader#open? What do you think?

@s1monw
Copy link
Member Author

s1monw commented Jan 17, 2021

@jpountz thanks, great feedback. I pushed a new commit

@s1monw s1monw requested a review from jpountz January 17, 2021 13:08
expectThrows(IllegalArgumentException.class, () -> DirectoryReader.listCommits(dir));
// TODO fix this once we have the codec for 7.0 recreated
assertEquals(
"Could not load codec 'Lucene70'. Did you forget to add lucene-backward-codecs.jar?",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm surprised that this works, not all 7.x indices use Lucene70 as a codec, e.g. there must be some indices that use Lucene75 or something like that as a codec? What am I missing?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

@msokolov msokolov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just a few documentation nits

* Expert: returns an IndexReader reading the index in the given {@link IndexCommit}. This method
* allows to open indices that were created wih a Lucene version older than N-1 provided that all
* all codecs for this index are available in the classpath and the segment file format used was
* created with Lucene 7 or older. Users of this API must be aware that Lucene doesn't guarantee
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of "Lucene 7" should we reference the minSupportedMajorVersion parameter here? Or .. is it Version.MIN_SUPPORTED_MAJOR?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is due to the fact that the segments info format only supports 7.0 and upwards

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The javadoc should read 'Lucene 7 or newer' I think?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

Copy link
Member

@mikemccand mikemccand left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great! I left some minorish comments.

@@ -3900,6 +3907,12 @@ public static Options parseOptions(String[] args) {
}
i++;
opts.dirImpl = args[i];
} else if ("-min_version_created".equals(args[i])) {
if (i == args.length - 1) {
throw new IllegalArgumentException("ERROR: missing value for -min_version_created");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, we should also update the Usage: ... exception (around line 3928 in this modified version) to document this new option?

If a user tries to CheckIndex a too-old index without this option they'll see a IndexFormatTooOldException right? Should we catch that and rethrow w/ better message suggesting to use this option? Should we maybe by default just set this option (always allow CheckIndex on a too-old index as long as you have the old Codecs around...)?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am all for old indices and remove this option. WDOT?

@s1monw
Copy link
Member Author

s1monw commented Jan 18, 2021

@mikemccand pushed changes

Copy link
Member

@mikemccand mikemccand left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great, thanks @s1monw!

@s1monw
Copy link
Member Author

s1monw commented Jan 18, 2021

I plan to merge this during the next 24 hours thanks for the reviews

@s1monw s1monw merged commit c1ae6dc into apache:master Jan 19, 2021
@s1monw s1monw deleted the LUCENE-9669 branch January 19, 2021 08:24
ctargett pushed a commit to ctargett/lucene-solr that referenced this pull request Jan 19, 2021
…apache#2212)

Today we force indices that were created with N-2 and older versions of Lucene
to fail on open. This check doesn't even check if the codecs are available. In order
to allow users to open older indices and for us to support N-2 versions this change
adds an API on DirectoryReader to specify a minimum index version on a per reader basis.
This doesn't apply for the IndexWriter which will fail on opening older indices.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
5 participants