Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Move addNode to FSTCompiler #12646

Merged
merged 5 commits into from
Oct 10, 2023
Merged

Move addNode to FSTCompiler #12646

merged 5 commits into from
Oct 10, 2023

Conversation

dungba88
Copy link
Contributor

@dungba88 dungba88 commented Oct 10, 2023

Description

Currently FSTCompiler and FST has a circular dependencies to each other. FSTCompiler creates an instance of FST, and on adding node (add(IntsRef input, T output)), it delegates to FST.addNode() and passing itself as a variable. This introduces a circular dependency and mixes up the FST constructing and traversing code.

To make matter worse, this implies one can call FST.addNode with an arbitrary FSTCompiler (as it's a parameter), but in reality it should be the compiler which creates the FST.

This PR move the addNode method to FSTCompiler instead.

@romseygeek
Copy link
Contributor

Thanks for opening @dungba88! This FST building code is very hairy and this is a nice start at cleaning it up.

Given how expert this code is and that the relevant methods are all package-private I don't see a problem with backporting this to 9x - what do you think @mikemccand?

Copy link
Member

@mikemccand mikemccand left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a great cleanup! I just had one question about a newly public method.

Thanks @dungba88!

/**
* Gets the number of bytes required to flag the presence of each arc in the given label range,
* one bit per arc.
*/
private static int getNumPresenceBytes(int labelRange) {
public static int getNumPresenceBytes(int labelRange) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we make this package private instead (remove public)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah good catch! I have removed it

@mikemccand
Copy link
Member

Given how expert this code is and that the relevant methods are all package-private I don't see a problem with backporting this to 9x - what do you think @mikemccand?

+1 to backport to 9x -- this is nice refactoring that does not change any API and is low risk.

In general, we should strive to backport all changes to 9.x by default unless there is a solid reason not to (e.g. unavoidable / complex API break).

Thanks @dungba88 and @romseygeek!

@dungba88
Copy link
Contributor Author

+1 to backport to 9x -- this is nice refactoring that does not change any API and is low risk.

I can do this. Wondering if creating a PR to lucene-9_x branch would suffice?

@romseygeek
Copy link
Contributor

I'm happy to merge and backport @dungba88. Can you also add an entry to CHANGES.txt in the 9.9 section?

@dungba88
Copy link
Contributor Author

Thanks @romseygeek I have added the entry (under API change section).

@romseygeek romseygeek merged commit 04f38dd into apache:main Oct 10, 2023
4 checks passed
@romseygeek
Copy link
Contributor

Thanks @dungba88!

romseygeek pushed a commit that referenced this pull request Oct 10, 2023
Currently FSTCompiler and FST have circular dependencies to each
other. FSTCompiler creates an instance of FST, and on adding node
(add(IntsRef input, T output)), it delegates to FST.addNode() and passes
itself as a variable. This introduces a circular dependency and mixes up
the FST constructing and traversing code.

To make matter worse, this implies one can call FST.addNode with an
arbitrary FSTCompiler (as it's a parameter), but in reality it should be
the compiler which creates the FST.

This commit moves the addNode method to FSTCompiler instead.

Co-authored-by: Anh Dung Bui <buidun@amazon.com>
@dungba88 dungba88 deleted the fst-refactor branch November 18, 2023 13:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants