LUCENE-8585: Index-time jump-tables for DocValues #525

Closed
wants to merge 24 commits into from

Contributor

@tokee tokee commented Dec 12, 2018

First complete attempt at LUCENE-8585, with all jump-tables built at index time and the no-longer-used lucene70 codec classes moved to backward-codecs.

…edDISI creation) and introduced secondary slices for jump-tables to avoid cache thrashing. The DENSE block rank structure handling is still in flux: the current implementation uses an explicit slice, but this might change depending on review discussion.
…es a slice. BaseDocValuesFormatTestCase.testSparseGCDCompression does not pass yet
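For readers unfamiliar with the term, the general idea of a jump-table written at index time can be sketched as follows. This is a generic illustration only, not the on-disk layout used by this patch: the class `JumpTableSketch`, its methods, and the footer layout are all hypothetical. Variable-length blocks are written first, a table of their start offsets is appended, and a reader uses that table to go straight to block `i` without scanning the earlier blocks.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.util.Arrays;

// Hypothetical illustration of a jump-table; not the Lucene file format.
final class JumpTableSketch {

    // Layout: [block 0][block 1]...[offset table][tableStart: long][blockCount: int]
    static byte[] write(byte[][] blocks) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        long[] offsets = new long[blocks.length];
        for (int i = 0; i < blocks.length; i++) {
            offsets[i] = out.size();              // remember where this block starts
            out.write(blocks[i], 0, blocks[i].length);
        }
        long tableStart = out.size();
        ByteBuffer tail = ByteBuffer.allocate(
            offsets.length * Long.BYTES + Long.BYTES + Integer.BYTES);
        for (long offset : offsets) {
            tail.putLong(offset);                 // the jump-table itself
        }
        tail.putLong(tableStart);                 // fixed-size footer
        tail.putInt(blocks.length);
        out.write(tail.array(), 0, tail.capacity());
        return out.toByteArray();
    }

    // Reads one block by looking up its start offset in the jump-table.
    static byte[] readBlock(byte[] data, int blockIndex) {
        ByteBuffer buf = ByteBuffer.wrap(data);
        int footer = data.length - Long.BYTES - Integer.BYTES;
        int tableStart = (int) buf.getLong(footer);
        int blockCount = buf.getInt(footer + Long.BYTES);
        if (blockIndex < 0 || blockIndex >= blockCount) {
            throw new IndexOutOfBoundsException("block " + blockIndex);
        }
        int start = (int) buf.getLong(tableStart + blockIndex * Long.BYTES);
        // A block ends where the next one starts; the last one ends at the table.
        int end = blockIndex + 1 < blockCount
            ? (int) buf.getLong(tableStart + (blockIndex + 1) * Long.BYTES)
            : tableStart;
        return Arrays.copyOfRange(data, start, end);
    }
}
```

Because the table is produced by the writer, the reader does not need to rebuild it when the data is opened, which is the motivation the issue title points at.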
Contributor

@jpountz jpountz left a comment

Thanks @tokee, I left some comments but I think it's close to being mergeable!

Contributor

jpountz commented Dec 21, 2018

I forgot to mention: precommit complains about unused imports.


// Returns the offset to the jump-table for vBPV
private long writeValuesMultipleBlocks(SortedNumericDocValues values, long gcd) throws IOException {
long[] offsets = new long[ArrayUtil.oversize(100, Long.BYTES)]; // 100 blocks = 1.6M values
Contributor

Can you reduce the initial size so that testing exercises this line of code?
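The concern can be illustrated with a small stand-in, assuming the offsets array is grown on demand once it fills up. The class `GrowableOffsets` and its `growCount` counter are hypothetical and use plain `Arrays.copyOf` instead of Lucene's `ArrayUtil`. With a large initial capacity the growth branch is never reached by a small test; with a small one it is.

```java
import java.util.Arrays;

// Hypothetical stand-in for the offsets bookkeeping; not the Lucene code.
final class GrowableOffsets {
    private long[] offsets;
    private int size = 0;
    int growCount = 0; // how many times the growth branch actually ran

    GrowableOffsets(int initialCapacity) {
        offsets = new long[Math.max(1, initialCapacity)];
    }

    void add(long offset) {
        if (size == offsets.length) {
            // Growth branch: only reached once the initial capacity is exhausted.
            offsets = Arrays.copyOf(offsets, offsets.length * 2);
            growCount++;
        }
        offsets[size++] = offset;
    }

    long get(int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("index " + index);
        }
        return offsets[index];
    }

    int size() {
        return size;
    }
}
```

Adding 50 offsets to `new GrowableOffsets(100)` leaves `growCount` at 0, while the same 50 offsets added to `new GrowableOffsets(2)` force several resizes, so an ordinary-sized test covers the resize path.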

@@ -1216,7 +1219,7 @@ private void doTestNumericsVsStoredFields(double density, LongSupplier longs) th
doc.add(dvField);

// index some docs
-int numDocs = atLeast(300);
+int numDocs = atLeast((int) (minDocs*1.172));
Contributor

Why this 1.172 factor?

Contributor Author

Yeah, that should have been explained. It is because I turned minDocs into a variable. Originally the test used a fixed minDocs of 256, and 256*1.172≈300, so with this trick the numDocs/minDocs factor stays constant for the old tests. I have probably been overthinking this. Should I set it to something less puzzling, like 1.5 or 2?
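The arithmetic behind the factor can be checked with a short sketch. The class and method names here are hypothetical, and the `atLeast(...)` wrapper from the diff is left out, so only the value passed to it is computed.

```java
// Hypothetical helper reproducing only the expression from the diff.
final class NumDocsFactor {

    // Same expression as in the patch: scale minDocs, then truncate to int.
    static int scaledNumDocs(int minDocs) {
        return (int) (minDocs * 1.172);
    }

    public static void main(String[] args) {
        // 256 * 1.172 = 300.032, which the int cast truncates to 300.
        System.out.println(scaledNumDocs(256));
    }
}
```

For the original minDocs of 256 this yields 300, matching the previous hard-coded `atLeast(300)`.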

Contributor Author

tokee commented Jan 21, 2019

Closing pull request as this has been committed to master.
