LUCENE-10378 Implement Weight#count for PointRangeQuery #658

Merged
merged 11 commits into from Feb 16, 2022

gautamworah96
Contributor

Description

Implement Weight#count on PointRangeQuery. Also fixed a small style inconsistency that I noticed

Issue: https://issues.apache.org/jira/browse/LUCENE-10378

Solution

Use an approach similar to what we've done for TermQuery or NormsFieldExistsQuery. The change first validates the input, then checks that all documents have at least one point, that the field is single-dimensional, and that the number of points equals the number of documents.
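The checks described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code from this PR: the class name, the method name, and its parameters are hypothetical stand-ins for the per-segment statistics a points field exposes.

```java
/** Hypothetical sketch of the eligibility checks; not the merged Lucene code. */
final class CountEligibilitySketch {

    /**
     * Returns true when a range count could be derived from point counts alone.
     *
     * @param numDims       number of dimensions of the point field (assumed input)
     * @param maxDoc        number of documents in the segment (assumed input)
     * @param docsWithField number of documents that have at least one point
     * @param numPoints     total number of indexed points for the field
     */
    static boolean canCountFromPoints(int numDims, int maxDoc, int docsWithField, long numPoints) {
        boolean everyDocHasAPoint = docsWithField == maxDoc;
        boolean singleDimension = numDims == 1;
        // If points == docs, each document contributes exactly one point,
        // so counting matching points is the same as counting matching docs.
        boolean onePointPerDoc = numPoints == docsWithField;
        return everyDocHasAPoint && singleDimension && onePointPerDoc;
    }
}
```

When a check like this fails, the natural fallback is to report that no fast count is available and let the caller count matches the usual way.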

Note: @jpountz I think I've misinterpreted your "by only counting matches on the two leaves that cross with the query" comment and have implemented a brute-force approach.
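To make the difference between the two approaches concrete, here is a small self-contained sketch. It assumes a toy model in which sorted 1-D values are chunked into fixed-size leaves; the class LeafCountSketch and all of its members are hypothetical and do not correspond to Lucene's actual points tree. A real tree would also skip whole subtrees through its internal nodes rather than looping over every leaf.

```java
import java.util.Arrays;

/** Hypothetical stand-in for a 1-D points index: sorted values chunked into leaves. */
final class LeafCountSketch {
    private final long[][] leaves; // leaves are in ascending order, each one sorted
    int leavesScanned;             // leaves inspected value by value in the last countByLeaves call

    LeafCountSketch(long[] values, int leafSize) {
        if (leafSize <= 0) {
            throw new IllegalArgumentException("leafSize must be positive");
        }
        long[] sorted = values.clone();
        Arrays.sort(sorted);
        int numLeaves = (sorted.length + leafSize - 1) / leafSize;
        leaves = new long[numLeaves][];
        for (int i = 0; i < numLeaves; i++) {
            int from = i * leafSize;
            int to = Math.min(sorted.length, from + leafSize);
            leaves[i] = Arrays.copyOfRange(sorted, from, to);
        }
    }

    /** Brute force: test every value against the inclusive range. */
    int countBruteForce(long lower, long upper) {
        int count = 0;
        for (long[] leaf : leaves) {
            for (long v : leaf) {
                if (v >= lower && v <= upper) {
                    count++;
                }
            }
        }
        return count;
    }

    /** Leaf-aware: only leaves that straddle a range boundary are scanned. */
    int countByLeaves(long lower, long upper) {
        leavesScanned = 0;
        int count = 0;
        for (long[] leaf : leaves) {
            long min = leaf[0];
            long max = leaf[leaf.length - 1];
            if (max < lower || min > upper) {
                continue;             // leaf entirely outside the range
            }
            if (min >= lower && max <= upper) {
                count += leaf.length; // leaf entirely inside: add its size without scanning
                continue;
            }
            leavesScanned++;          // leaf crosses a boundary: check each value
            for (long v : leaf) {
                if (v >= lower && v <= upper) {
                    count++;
                }
            }
        }
        return count;
    }
}
```

Because the values are sorted in one dimension, at most one leaf can straddle the lower bound and at most one the upper bound, so at most two leaves are ever scanned value by value.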

Tests

Tests have not been added yet. I'm in the process of writing them.

Checklist

Please review the following and check all that apply:

  • I have reviewed the guidelines for How to Contribute and my code conforms to the standards described there to the best of my ability.
  • I have created a Jira issue and added the issue ID to my pull request title.
  • I have given Lucene maintainers access to contribute to my PR branch. (optional but recommended)
  • I have developed this patch against the main branch.
  • I have run ./gradlew check.
  • I have added tests for my changes.

Contributor

@jpountz jpountz left a comment

I left some minor comments in addition to Ignacio's.

Still needs some logic to handle leaf nodes
Refactored common check args to a separate function
…at matched, fix leafNode counting logic, edit comment about the condition under which our optimization is fired
Contributor Author

@gautamworah96 gautamworah96 left a comment

I have not added any new tests. The new Weight#count implementation only changes the code path used to compute the count; it should not change the result. The upstream caller, IndexSearcher#count, still has numerous tests for PointRangeQuery in TestPointQueries.

…BiFunction and Predicate for internal node and leaf node respectively. Simplify code overall
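One way to read the split mentioned in this commit message is a two-argument function that classifies an internal node from its minimum and maximum values, and a one-argument predicate that tests a single leaf value. The sketch below assumes that reading; the class ComparatorSketch, the Relation enum, and the use of Long values are illustrative choices, not the types used in the PR.

```java
import java.util.function.BiFunction;
import java.util.function.Predicate;

/** Hypothetical sketch of node-level and leaf-level range checks. */
final class ComparatorSketch {

    /** How a node's [min, max] interval relates to the query range (illustrative enum). */
    enum Relation { INSIDE, OUTSIDE, CROSSES }

    /** Classifies an internal node from its min and max values. */
    static BiFunction<Long, Long, Relation> nodeComparator(long lower, long upper) {
        return (min, max) -> {
            if (max < lower || min > upper) {
                return Relation.OUTSIDE;
            }
            if (min >= lower && max <= upper) {
                return Relation.INSIDE;
            }
            return Relation.CROSSES;
        };
    }

    /** Tests whether a single leaf value falls inside the inclusive range. */
    static Predicate<Long> leafComparator(long lower, long upper) {
        return v -> v >= lower && v <= upper;
    }
}
```

Keeping the two checks separate lets the same range logic serve both the traversal (which nodes to descend into) and the per-value filtering on leaves.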
@gautamworah96 gautamworah96 marked this pull request as ready for review February 10, 2022 10:16
Contributor

@iverase iverase left a comment

That is very close. I think we should add a test for this change and then we are ready to merge.

Contributor

@iverase iverase left a comment

LGTM

@iverase
Contributor

iverase commented Feb 15, 2022

I will push this tomorrow if it is ok with you @gautamworah96

@gautamworah96
Contributor Author

Yes. I was wondering why you had kept it open :)

@iverase iverase merged commit dd25fab into apache:main Feb 16, 2022
iverase pushed a commit to iverase/lucene that referenced this pull request Feb 16, 2022
Implement Weight#count for PointRangeQuery to provide a faster way to calculate
the number of matching range docs when each doc has at-most one point and the 
points are 1-dimensional.
@gautamworah96
Contributor Author

gautamworah96 commented Feb 16, 2022

@iverase I see you already opened a backport PR. Thanks! I'll approve that as well

iverase added a commit that referenced this pull request Feb 16, 2022
Implement Weight#count for PointRangeQuery to provide a faster way to calculate
the number of matching range docs when each doc has at-most one point and the 
points are 1-dimensional.
@gautamworah96 gautamworah96 deleted the LUCENE-10378 branch February 23, 2022 21:56
dantuzi pushed a commit to SeaseLtd/lucene that referenced this pull request Mar 10, 2022
Implement Weight#count for PointRangeQuery to provide a faster way to calculate
the number of matching range docs when each doc has at-most one point and the 
points are 1-dimensional.