Issue with compound search #49

nagavardhan1 · 2018-10-31T17:49:56Z

If I search for "whatareyou", it returns "what you".

wolfgarbe · 2018-10-31T21:12:05Z

LookupCompound can insert only a single space into a token (a string fragment delimited by existing spaces). It is intended for spelling correction of word-segmented text, but it can also fix an occasional missing space. Because of the single-space restriction per token, there are fewer variants to generate and evaluate, so it is faster and the correction quality is usually better.

WordSegmentation can insert as many spaces as required into a token, so it is also suitable for long strings without any spaces. The drawback is lower speed and lower correction quality, because many more potential variants exist that need to be generated, evaluated, and chosen from.
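
To make the difference concrete, here is a minimal Python sketch. It is not SymSpell's implementation: it does no spelling correction at all, and the word counts, `split_once`, and `segment` are hypothetical names and data made up for illustration. It only contrasts "at most one inserted space per token" with "any number of inserted spaces".

```python
import math
from functools import lru_cache

# Hypothetical toy word counts (assumption); a real frequency dictionary is far larger.
WORD_COUNTS = {"what": 50, "are": 80, "you": 90, "a": 100, "re": 1}
TOTAL = sum(WORD_COUNTS.values())


def log_prob(word):
    """Return the log probability of a known word, or None if it is unknown."""
    count = WORD_COUNTS.get(word)
    return math.log(count / TOTAL) if count else None


def split_once(token):
    """Insert at most one space: return [token], the best two-word split, or None."""
    if token in WORD_COUNTS:
        return [token]
    best, best_score = None, -math.inf
    for i in range(1, len(token)):
        left, right = token[:i], token[i:]
        lp, rp = log_prob(left), log_prob(right)
        if lp is None or rp is None:
            continue  # both halves must be dictionary words
        if lp + rp > best_score:
            best, best_score = [left, right], lp + rp
    return best


def segment(token):
    """Insert any number of spaces: return the most probable split, or None."""

    @lru_cache(maxsize=None)
    def best_from(start):
        # Best (score, words) for token[start:], or None if it cannot be split.
        if start == len(token):
            return (0.0, ())
        best = None
        for end in range(start + 1, len(token) + 1):
            word = token[start:end]
            lp = log_prob(word)
            if lp is None:
                continue
            rest = best_from(end)
            if rest is None:
                continue
            candidate = (lp + rest[0], (word,) + rest[1])
            if best is None or candidate[0] > best[0]:
                best = candidate
        return best

    result = best_from(0)
    return list(result[1]) if result else None


print(split_once("whatareyou"))  # None: one space cannot produce two known words
print(segment("whatareyou"))     # ['what', 'are', 'you']
```

With this toy dictionary, `split_once` finds no valid split for "whatareyou" because the token needs two spaces, while `segment` has to weigh more candidates (for example "what a re you" against "what are you") before picking the most probable one. That extra search is the speed and quality trade-off described above.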

nagavardhan1 · 2018-11-01T12:14:10Z

Is there a way I can increase it to 2 spaces instead of one?

aashish-amber-abz · 2018-11-02T20:17:00Z

@wolfgarbe I can't find any Python port of WordSegmentation; all the ports listed on the README page are for LookupCompound. Do you know of any Python port of WordSegmentation?

wolfgarbe · 2018-11-03T19:42:38Z

@aashish-amber-abz No, I don't know of a Python port of SymSpell that includes WordSegmentation.
However, other word segmentation approaches are available in Python (word segmentation only, without spelling correction), e.g. https://github.com/grantjenks/python-wordsegment

wolfgarbe · 2018-11-03T19:53:13Z

@nagavardhan1 There is no easy way to increase the number of spaces for LookupCompound (for performance reasons). The algorithm would require significant modification. The modified algorithm, which can deal with an unlimited number of spaces to be inserted, is called WordSegmentation. Please note that it can still correct spelling errors.

wolfgarbe closed this as completed Oct 31, 2018
