Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add WrappedCandidateMatcher for composing matchers #13109

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

bjacobowitz
Copy link

Description

Create a WrappedCandidateMatcher, along with a factory for creating these objects and a default implementation of the factory, to assist users in creating their own composite matchers.

Problem

The current access protections on CandidateMatcher make it infeasible for a user of the Lucene Monitor library to create their own composite CandidateMatcher (for example, an alternate version of ParallelMatcher) on top of existing CandidateMatcher instances if they are outside the package.

As an example, the existing implementation of ParallelMatcher makes use of delegate CandidateMatcher instances, but a user would not be able to write their own version of this in their own package, because matchQuery(), finish() and reportError() all have package-private access (see this gist, comments labeled "PROBLEM").

Solution

The simplest solution to this problem would be to increase the visibility of those functions in CandidateMatcher (either to protected or public), but it runs the risk of violating encapsulation, so I've tried to avoid that here.

The solution I've put together here introduces a WrappedCandidateMatcher, which can be used to increase visibility when creating a composite matcher. WrappedCandidateMatcher does NOT extend from CandidateMatcher but has nearly the same interface, with increased visibility on matchQuery(), finish() and reportError(). A user building a new CandidateMatcher by composing existing CandidateMatcher instances could make full use of those CandidateMatchers by wrapping them with WrappedCandidateMatcher, allowing them to be used internally to the composite matcher (see this gist, comments labeled "SOLUTION").

On a certain level, this solution also violates existing encapsulation, but it requires the user to jump through a few hoops to get there, so it would limit the potential for misuse of a CandidateMatcher. If we would instead prefer to increase the visibility on CandidateMatcher, let me know and I can do that instead.

Merge Request

If this PR gets merged, can you please use my bjacobowitz1@bloomberg.net email address for the squash+merge. Thank you.

Copy link

github-actions bot commented Mar 1, 2024

This PR has not had activity in the past 2 weeks, labeling it as stale. If the PR is waiting for review, notify the dev@lucene.apache.org list. Thank you for your contribution!

@github-actions github-actions bot added the Stale label Mar 1, 2024
@cpoerschke
Copy link
Contributor

Thanks @bjacobowitz for a very detailed pull request here!

I'm not really familiar with this area of the code but intuitively would be curious what the alternative (you mentioned) of increasing the visibility on CandidateMatcher would look like.

@romseygeek
Copy link
Contributor

The protected visibility on matchQuery() should already be fine here - you can override or call protected methods from within subclasses. I think making reportError() and finish() protected final would be the neatest solution.

@github-actions github-actions bot removed the Stale label Mar 2, 2024
Copy link

This PR has not had activity in the past 2 weeks, labeling it as stale. If the PR is waiting for review, notify the dev@lucene.apache.org list. Thank you for your contribution!

@github-actions github-actions bot added the Stale label Mar 16, 2024
@bjacobowitz
Copy link
Author

The protected visibility on matchQuery() should already be fine here - you can override or call protected methods from within subclasses. I think making reportError() and finish() protected final would be the neatest solution.

@romseygeek On further reflection, I think protected visibility may not be sufficient for a practical solution here, even when using subclasses. I'll again use ParallelMatcher as an example, since I think it's a fairly representative use case.

ParallelMatcher takes in a MatcherFactory to create instances of CandidateMatcher, with the intent to take advantage of the matchQuery implementation in the factory's generated instances of CandidateMatcher. If we try to reimplement ParallelMatcher outside the Lucene Monitor package, we can't call matchQuery on the factory-built CandidateMatcher from outside (i.e. on an instance other than this) because of matchQuery's protected access -- the only reason the existing ParallelMatcher implementation can do this is because it is in the Lucene Monitor and therefore has package-local access to the protected functions on another object in the same package.

To access and invoke the implementation of matchQuery, we would need to create a subclass of the concrete CandidateMatcher implementation generated by the factory, which isn't directly possible, because the factory pattern creates a situation where the concrete class isn't known.

We could approach a solution using an alternate factory implementation that would return a subclass of a known concrete type with hidden functions exposed. For example, we could create a subclass of CollectingMatcher (let's call it ExposedCollectingMatcher) with a public function exposedMatchQuery which calls this.matchQuery(), and then we could create a factory to return that subclass (it might be called ExposedCollectingMatcherFactory).

The problem with that approach is that it breaks down in cases where the concrete implementation is not available to be subclassed, as is the case for several anonymous CandidateMatcher implementations (e.g. instances generated by MATCHER factory in ExplainingMatch).

Ideally we would like a factory that returns a more generic class that extends the functionality of a CandidateMatcher, which is what WrappedCandidateMatcher provides.

Of course another solution would be to simply make these functions public, but I understand the desire to avoid that if possible.

@github-actions github-actions bot removed the Stale label Mar 26, 2024
Copy link

github-actions bot commented Apr 9, 2024

This PR has not had activity in the past 2 weeks, labeling it as stale. If the PR is waiting for review, notify the dev@lucene.apache.org list. Thank you for your contribution!

@github-actions github-actions bot added the Stale label Apr 9, 2024
@cpoerschke
Copy link
Contributor

... Of course another solution would be to simply make these functions public, but I understand the desire to avoid that if possible.

Again, I'm not really familiar with this area of the code but would be curious what making the functions public would look like and if making them public is necessarily problematic.

And echo-ing @mkhludnev's apache/solr#2315 (comment) comment w.r.t. just-compile test to retain extensibility in future.

@github-actions github-actions bot removed the Stale label Apr 11, 2024
Copy link

This PR has not had activity in the past 2 weeks, labeling it as stale. If the PR is waiting for review, notify the dev@lucene.apache.org list. Thank you for your contribution!

@github-actions github-actions bot added the Stale label Apr 25, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants