Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

LUCENE-10456: Implement Weight#count for MultiRangeQuery #731

Merged
merged 13 commits into from Apr 5, 2022

Conversation

wjp719
Copy link
Contributor

@wjp719 wjp719 commented Mar 5, 2022

Description

for one dimension MultiRangeQuery, we can firstly merge overlapping ranges to have some unconnected ranges, then the doc count of this multiRangeQuery is the sum of each doc count of these unconnected range. For each unconnected range, we can take advantage of PointRangeQuery count() method.

Checklist

Please review the following and check all that apply:

  • I have reviewed the guidelines for How to Contribute and my code conforms to the standards described there to the best of my ability.
  • I have created a Jira issue and added the issue ID to my pull request title.
  • I have given Lucene maintainers access to contribute to my PR branch. (optional but recommended)
  • I have developed this patch against the main branch.
  • I have run ./gradlew check.
  • I have added tests for my changes.

@wjp719 wjp719 changed the title LUCENE-10456: support MultiRangeQuery#count() LUCENE-10456: Implement Weight#count for MultiRangeQuery Mar 5, 2022
@wjp719
Copy link
Contributor Author

wjp719 commented Mar 15, 2022

@iverase Hi, can you help to review this pr, thanks

Copy link
Contributor

@gautamworah96 gautamworah96 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for spotting this opportunity and submitting a patch @wjp719. I've added some comments regarding tests, general code flow etc. The range merging logic, idea overall looks sound to me.

I would also pre-emptively run ./gradlew cleanTest test --tests TestClassName.TestMethod -Ptests.iters=<some_large_num> to catch any edge cases we may have missed in the review by forcing gradle to "beast" out multiple runs of the same test method

* main: (38 commits)
  remove obsolete image/description from luke/README.md
  Upgrade to forbiddenapis 3.3 (apache#768)
  LUCENE-10393: Unify binary dictionary and dictionary writer in kuromoji and nori (apache#740)
  LUCENE-9651 Update benchmark module docs (apache#759)
  LUCENE-10458: BoundedDocSetIdIterator may supply error count in Weigth#count(LeafReaderContext) when missingValue enables (apache#736)
  LUCENE-10481: FacetsCollector will not request scores if it does not use them (apache#760)
  LUCENE-10477: mention 'call multiple times' in Query.rewrite javadoc (apache#758)
  Add back-compat indices for 9.1.0.
  Synchronize CHANGES.
  LUCENE-10464, LUCENE-10477: WeightedSpanTermExtractor.extractWeightedSpanTerms to rewrite sufficiently (apache#737)
  Add version 9.1.0.
  DOAP changes for release 9.1.0
  LUCENE-10422: Make errorprone happy
  LUCENE-10478: mark Test4GBStoredFields as @monster (apache#757)
  LUCENE-10422: Read-only monitor implementation (apache#679)
  LUCENE-10473: Make tests a bit faster when running nightly. (apache#754)
  LUCENE-9905: Fix check in TestPerFieldKnnVectorsFormat#testMergeUsesNewFormat
  LUCENE-9614: Fix rare TestKnnVectorQuery failures
  LUCENE-10472: Fix TestMatchAllDocsQuery#testEarlyTermination (apache#753)
  LUCENE-10418: Move CHANGES to the correct section.
  ...
@wjp719
Copy link
Contributor Author

wjp719 commented Mar 28, 2022

@gautamworah96 Hi, I have refactor the code as you suggest, please help to review it again, thanks.

Copy link
Contributor

@gautamworah96 gautamworah96 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added some comments regarding javadocs and tests. We are getting closer! Thanks for persisting @wjp719. Could you also resolve some comments of mine that you have already addressed?

@wjp719
Copy link
Contributor Author

wjp719 commented Mar 29, 2022

@gautamworah96 I refactor the code again, please review it again, thanks

Copy link
Contributor

@gautamworah96 gautamworah96 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks close to completion! I've added some minor comments on the javadoc etc.
Can you verify that the gradle beast run for the two new tests we are adding here passes successfully (command posted in a previous review comment)? Just run them both for say 2000 iterations...

Tagging some project maintainers for more feedback @iverase @jpountz

@wjp719
Copy link
Contributor Author

wjp719 commented Mar 29, 2022

@gautamworah96 thank you for your patient reviews. I have run the two new added tests 2000 iterations and all passed.

@jpountz @iverase please help to review this pr, thanks.

@wjp719 wjp719 requested a review from jpountz March 30, 2022 03:26
@wjp719
Copy link
Contributor Author

wjp719 commented Mar 30, 2022

@jpountz Hi, I have refactor the code according to your suggestions, please review it again, thanks.

@jpountz
Copy link
Contributor

jpountz commented Mar 30, 2022

I just looked at the cloning logic, it looks good to me, thanks for making this change!

@wjp719 wjp719 requested a review from jpountz March 30, 2022 11:26
@wjp719
Copy link
Contributor Author

wjp719 commented Mar 30, 2022

@jpountz I refactor the test cases, please help to review again, thanks a lot.

@wjp719 wjp719 requested a review from madrob March 30, 2022 15:09
Copy link
Contributor

@jpountz jpountz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left some minor comments but otherwise it looks good to me.

@jpountz
Copy link
Contributor

jpountz commented Mar 31, 2022

Can you also add a CHANGES entry under version 9.2?

@wjp719 wjp719 requested a review from jpountz March 31, 2022 10:55
@wjp719
Copy link
Contributor Author

wjp719 commented Mar 31, 2022

@jpountz I have added the CHANGES log, thanks

@wjp719
Copy link
Contributor Author

wjp719 commented Apr 5, 2022

@jpountz please help to merge this pr, thanks

@jpountz jpountz merged commit 898ec16 into apache:main Apr 5, 2022
@wjp719 wjp719 deleted the feature/multi_range_count branch April 5, 2022 07:29
wjp719 added a commit to wjp719/lucene that referenced this pull request Apr 5, 2022
* main:
  LUCENE-10456: Implement Weight#count for MultiRangeQuery (apache#731)
  LUCENE-10484: Move CHANGES entry to 9.2.
  LUCENE-10484: Add support for concurrent facets random sampling (apache#765)
  LUCENE-10002: Replaces usages of FacetsCollector with FacetsCollectorManager (apache#764)
  LUCENE-10466: Move changelog entry to 9.2
  LUCENE-10466: Ensure IndexSortSortedNumericDocValuesRangeQuery handles sort types besides LONG
  LUCENE-10498: don't count num hits nor accumulate scores unless necessary (apache#782)
  LUCENE-10184: mention of opening a Jira issue (apache#781)
  Update link to contribution guide
  Remove redundant index (apache#776)
  Implement method to bulk add all collection elements to a PriorityQueue (apache#770)
  LUCENE-10491: Fix correctness bug in TaxonomyFacetSumValueSource score providing (apache#775)
  Replace usages of search(Query, Collector) in CheckHits (apache#763)
  LUCENE-10002: Add FixedBitSetCollector and corresponding collector manager to test framework (apache#766)
  LUCENE-10475: Merge o.a.l.a.[ja|ko].util into o.a.l.a.[ja|ko].dict (apache#772)
  LUCENE-10184: add CONTRIBUTING.md; reorganize README. (apache#771)
  Add CHANGES entry for LUCENE-10325
  LUCENE-10325: Add getTopDims functionality to Facets (apache#747)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants