lookupCompound() cannot look up two correctly spelled terms separated only by a missing space #53

Open
aTan-aka-Xellos opened this issue Jun 10, 2020 · 0 comments

Precondition:
SpellCheckSettings is initialized with maxEditDistance > 0.

I want to handle the missing-space corner case separately, but only between correctly spelled words (maxEditDistance = 0 for each word individually).
This is impossible with a SymSpell instance that was created from SpellCheckSettings with maxEditDistance > 0.

To handle a missing space, lookupCompound() uses the method lookupSplitWords().
It splits a word into part1 and part2.
lookup() is then called for each part, and lookup() contains the following code:

    if (maxEditDistance <= 0) {
      maxEditDistance = spellCheckSettings.getMaxEditDistance();
    }
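A minimal stand-alone model of this check shows what happens to a requested distance of 0. This is only a sketch, not the library's code: the class `EditDistanceFallback`, the method `effectiveMaxEditDistance`, and the `int` parameter types are hypothetical, and `configuredMax` stands in for `spellCheckSettings.getMaxEditDistance()`.

```java
public class EditDistanceFallback {

    // Hypothetical model of the fallback quoted above.
    // configuredMax stands in for spellCheckSettings.getMaxEditDistance().
    static int effectiveMaxEditDistance(int requested, int configuredMax) {
        if (requested <= 0) {
            // A request for "exact match only" (0) cannot be told apart
            // from "no value given", so the configured value is used.
            return configuredMax;
        }
        return requested;
    }

    public static void main(String[] args) {
        // The caller asks for 0, but the configured value is what comes back.
        System.out.println(effectiveMaxEditDistance(0, 2));
        // A positive request is passed through unchanged.
        System.out.println(effectiveMaxEditDistance(1, 2));
    }
}
```

In this model, the only way to get an effective distance of 0 is to configure 0 globally, which also disables fuzzy matching everywhere else.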

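Because of this check, a requested maxEditDistance of 0 is replaced by the value from SpellCheckSettings, so an exact-match lookup cannot be requested.

Scenario:
query: {applewatch}

I want to find a missing space between correctly spelled words only, which means a total maxEditDistance = 1 (the missing space).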

With the current implementation, SymSpell looks for the missing space between the two words and also allows an additional edit distance of 1 for each word. There is no way to prevent this.
Total maxEditDistance:
lookup(part1, maxEditDistance) = 1
lookup(part2, maxEditDistance) = 1
lookupCompound(part1 + part2) = 1
Total = 3

Depending on the dictionary, the following results are possible:
{apple watch}, editDistance = 1
{apple patch}, editDistance = 2 (if watch is not in the dictionary)
{apply patch}, editDistance = 3 (if neither apple nor watch is in the dictionary)
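For comparison, the behaviour I am asking for can be sketched without SymSpell at all. The code below is only an illustration under assumptions: `ExactSplitLookup` and `splitExact` are hypothetical names, and the dictionary is modelled as a plain `Set<String>` rather than the library's own data structure. It tries every split position and accepts a split only when both parts are exact dictionary entries.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class ExactSplitLookup {

    // Hypothetical helper: returns every "part1 part2" split of the input
    // where both parts match a dictionary word exactly (edit distance 0).
    static List<String> splitExact(String input, Set<String> dictionary) {
        List<String> results = new ArrayList<>();
        for (int i = 1; i < input.length(); i++) {
            String part1 = input.substring(0, i);
            String part2 = input.substring(i);
            if (dictionary.contains(part1) && dictionary.contains(part2)) {
                results.add(part1 + " " + part2);
            }
        }
        return results;
    }

    public static void main(String[] args) {
        Set<String> dictionary =
            new HashSet<>(Arrays.asList("apple", "apply", "watch", "patch"));
        // Both parts are in the dictionary, so the split is returned.
        System.out.println(splitExact("applewatch", dictionary));

        dictionary.remove("watch");
        // Without "watch" there is no exact split, so nothing is returned
        // instead of a fuzzy result such as "apple patch".
        System.out.println(splitExact("applewatch", dictionary));
    }
}
```

With this behaviour the only edit counted is the missing space, so {apple patch} and {apply patch} would never be suggested for {applewatch}.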
