Conversation

rmuir
Member

@rmuir rmuir commented Feb 14, 2025

When the user opts in to tolerateAffixRuleCountMismatches(), the parser tolerates some, but not all, affix rule count mismatches. This PR adds support for two more cases:

"EOF case": the count is wrong and we then hit the end of the file. Previously, the exception looked like this:

java.text.ParseException: Premature end of rules for SFX B Y 2

"stampede case": the count is wrong and another instruction follows (such as another affix's rule).
Previously, the exception looked like this:

java.text.ParseException: Affix rule mismatch. Header: SFX A Y 2; rule: SFX B Y 2

This PR should fix the CI failure with the Spanish dictionary (I think; we'll see). I will follow up with upstream.

Member Author

rmuir commented Feb 14, 2025

Using mark()/reset() like this can be an invitation for trouble, but the situation is contained: the parser always makes forward progress, so it can't go crazy or loop forever. Also, the user opted into the lenient parse.
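To make the mark()/reset() idea concrete, here is a minimal, self-contained sketch of a lenient affix-rule reader. It is not the code from this PR: the class `LenientAffixSketch`, the methods `parse` and `readRules`, the `tolerant` flag, and the 8192-character read-ahead limit are all hypothetical names and values chosen for illustration. It only reuses the two exception messages quoted above and the standard `BufferedReader.mark`/`reset` calls.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.text.ParseException;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of lenient affix parsing; not Lucene's actual Dictionary code. */
final class LenientAffixSketch {

  /** Parses every SFX/PFX block in the text and maps each header line to the rules found for it. */
  static Map<String, List<String>> parse(String aff, boolean tolerant)
      throws IOException, ParseException {
    Map<String, List<String>> result = new LinkedHashMap<>();
    BufferedReader in = new BufferedReader(new StringReader(aff));
    String line;
    while ((line = in.readLine()) != null) {
      // Rule lines are consumed by readRules, so an SFX/PFX line seen here is treated as a header.
      if (line.startsWith("SFX ") || line.startsWith("PFX ")) {
        result.put(line, readRules(in, line, tolerant));
      }
    }
    return result;
  }

  /** Reads up to the declared number of rules for a header such as "SFX B Y 2". */
  static List<String> readRules(BufferedReader in, String header, boolean tolerant)
      throws IOException, ParseException {
    String[] h = header.trim().split("\\s+");
    int declared = Integer.parseInt(h[3]);
    List<String> rules = new ArrayList<>();
    for (int i = 0; i < declared; i++) {
      // Remember the position before the next line (assumes lines shorter than 8192 chars).
      in.mark(8192);
      String line = in.readLine();
      if (line == null) {
        // "EOF case": the header promised more rules than the file contains.
        if (tolerant) {
          break;
        }
        throw new ParseException("Premature end of rules for " + header, 0);
      }
      String[] r = line.trim().split("\\s+");
      boolean sameAffix = r.length >= 2 && r[0].equals(h[0]) && r[1].equals(h[1]);
      if (!sameAffix) {
        // "Stampede case": the next line belongs to something else.
        if (tolerant) {
          in.reset(); // un-read the line so the caller can parse it as its own instruction
          break;
        }
        throw new ParseException(
            "Affix rule mismatch. Header: " + header + "; rule: " + line, 0);
      }
      rules.add(line);
    }
    return rules;
  }
}
```

In this sketch the header line itself is always consumed before `readRules` is called, and `reset()` only rewinds a single line that the outer loop then consumes, so each pass through the outer loop moves forward in the input. The sketch only covers counts that are too high; a count that is too low is not handled.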

Contributor

@donnerpeter donnerpeter left a comment

Great, thank you!

@rmuir rmuir added this to the 10.2.0 milestone Feb 14, 2025
@rmuir rmuir merged commit 5b3b6af into apache:main Feb 15, 2025
7 checks passed
rmuir added a commit to rmuir/lucene that referenced this pull request Feb 15, 2025
…ems (apache#14239)
