Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

LUCENE-9877: Allow up to 7 exceptions in PForUtil (instead of 3) #48

Merged
merged 3 commits into from
Mar 30, 2021

Conversation

gsmiller
Copy link
Contributor

Description

This change allows PForUtil to encode up to 7 exceptions per encoded block (instead of the 3 allowed today) in order to reduce index size.

Solution

I tested the impact of 7 exceptions vs. 3 using a new tool I wrote to directly inspect the contents of PFOR encoded index blocks (details in the Jira) as well as inspected overall index size reduction. I also tested QPS impact using luceneutil benchmark tools. All details are in the Jira issue.

Note that this change should be fully backwards compatible, even though I don't think that needs to be a requirement. Details of this are also mentioned in the Jira issue.

Tests

All existing tests pass. No new tests seemed necessary as TestPForUtil already exercises the code.

Checklist

Please review the following and check all that apply:

  • I have reviewed the guidelines for How to Contribute and my code conforms to the standards described there to the best of my ability.
  • I have created a Jira issue and added the issue ID to my pull request title.
  • I have given Lucene maintainers access to contribute to my PR branch. (optional but recommended)
  • I have developed this patch against the main branch.
  • I have run ./gradlew check.
  • I have added tests for my changes.

Copy link
Contributor

@jpountz jpountz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The change looks good to me. Can you add a CHANGES entry? Also, should we update the 8.x postings format too?

@gsmiller
Copy link
Contributor Author

Thanks @jpountz. I updated CHANGES.txt just now. As for updating 8.x, I'm a bit naive on how to go about suggesting that change. The change is backwards-compatible so this should be possible. I think this would involve opening an PR against the old repo and then also updating the backwards_codecs version of PForUtil in this PR?

@gsmiller
Copy link
Contributor Author

I've opened a separate PR to backport this into branch_8x as well since it's fully backwards compatible. I updated this PR to also include the change in "backwards_codecs", assuming that we want to backport. If we decide not to, I can remove that change. Thanks!

@jpountz jpountz merged commit fd79f97 into apache:main Mar 30, 2021
gsmiller added a commit to gsmiller/lucene that referenced this pull request Mar 30, 2021
gsmiller added a commit to gsmiller/lucene that referenced this pull request Mar 30, 2021
gsmiller added a commit to gsmiller/lucene that referenced this pull request Mar 30, 2021
janhoy pushed a commit to cominvent/lucene that referenced this pull request May 12, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants