Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

LUCENE-10414: Add fn:fuzzyTerm interval function to flexible query parser #668

Merged
merged 4 commits into from
Feb 10, 2022

Conversation

dweiss
Copy link
Contributor

@dweiss dweiss commented Feb 9, 2022

No description provided.

* <li>{@link #multiterm(CompiledAutomaton, String)}
* </ul>
*/
public static final int DEFAULT_MAX_EXPANSIONS = 128;

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've extracted this as a constant. It's functionally the same but it's good to have a reference to the defaults that you can rely on in external code (other than javadoc just mentioning it).


public Wildcard(String wildcard) {
public Wildcard(String wildcard, int maxExpansions) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This commit piggybacks an additional parameter maxExpansions to fn:wildcard function.

@@ -144,7 +144,7 @@ public ScoreMode scoreMode() {
}

/**
* Tests that a query matches the an expected set of documents using Hits.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is just a test code cleanup. I changed the requested hits count to be twice the requested size (and not an arbitrary 1000) and used smaller iterators in the code.


@Override
public IntervalsSource toIntervalSource(String field, Analyzer analyzer) {
var fuzzyQuery = new FuzzyQuery(new Term(field, term), maxEdits);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could move this bit to Intervals... the problem is that I wanted to reuse the same logic FuzzyQuery does and it requires a field reference (internally it doesn't but the fuzzy automaton builder is package-private). So we'd have to open up something to be able to reference this machinery from within a method like Intervals.fuzzyTerm().

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep we do this all over the place in ES as well. Maybe we should just make FuzzyAutomatonBuilder public?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we should just make FuzzyAutomatonBuilder public?

I'm +1 for this as it'd make the API more complete - we could move this logic to Intervals.fuzzyTerm(). I can make FuzzyAutomatonBuilder public or make a public static method exposing its result (an automaton for a term, maxEdits pair) on FuzzyQuery. Let me know which one works for you better and I'll reshape the code under this PR.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A static method on FuzzyQuery fits with what we have elsewhere for PrefixQuery and WildcardQuery, let's do that

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have already - see commit 86c9756

Copy link
Contributor

@romseygeek romseygeek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

Copy link
Contributor

@romseygeek romseygeek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice, thanks @dweiss!

@dweiss dweiss merged commit f6cebac into apache:main Feb 10, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants