Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make ConjunctionDISI package-private and add ConjunctionUtils factory class #148

Merged
merged 6 commits into from
May 25, 2021

Conversation

romseygeek
Copy link
Contributor

ConjunctionDISI is really an internal implementation of DocIdSetIterator,
and would ideally be package-private. However, it is used in a few
other places:

  • directly in ConjunctionSpans
  • as a utility in the facet and join modules

This commit refactors Spans to use their own copy of ConjunctionDISI,
and adds a public helper class ConjunctionUtils that allows easy
intersection of iterators for use by other modules. This means that
ConjunctionDISI itself can become package-private. It also removes
a reference to Spans from core classes, which will make it easier to
migrate Spans to the queries module.

@romseygeek romseygeek requested a review from jpountz May 22, 2021 15:20
@romseygeek romseygeek self-assigned this May 22, 2021
Copy link
Contributor

@jpountz jpountz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Let's add tests for the Spans fork of ConjunctionDISI?

@mikemccand
Copy link
Member

+1 to this change -- looks great! But:

refactors Spans to use their own copy of ConjunctionDISI

it looks like you didn't have to do this (fork / make a full copy of ConjunctionDISI for spans)? It looks like spans just also use the new (public) ConjunctionUtils, which provide (safe) public access to the package-private ConjunctionDISI?

@romseygeek
Copy link
Contributor Author

It looks like spans just also use the new (public) ConjunctionUtils, which provide (safe) public access to the package-private ConjunctionDISI?

Unfortunately no, if we do this then Spans don't use two-phase iteration any more. What might work instead though would be to make ConjunctionDISI.createConjunction(List<DocIdSetIterator>,List<TwoPhaseIterator>) public through ConjunctionUtils - I'll give that a try and report back!

@romseygeek
Copy link
Contributor Author

We need to add three more methods to ConjunctionUtils but it does mean that we can remove the specialised Spans class, so I think it's a win. It should also be possible to re-use these for intervals as well, which have a similar specialised implementation.

@romseygeek
Copy link
Contributor Author

I've pushed the intervals change as well, turns out it's very simple and allows us to remove more duplication.

Copy link
Member

@mikemccand mikemccand left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great, thanks @romseygeek!

@romseygeek romseygeek merged commit 5e0e7a5 into apache:main May 25, 2021
@romseygeek romseygeek deleted the conjunctionutils branch May 25, 2021 11:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants