Broken keyboard groups logic for initial edits #41

tvquizphd · 2021-01-08T05:55:35Z

To save time, you can skip the examples and go straight to the solution.

I've found a longstanding bug in nspell/main in the suggest function. In short, lines 93 and 94 are to blame.

Unexpectedly, a position from each group is used to slice the value. This is like slicing oranges with an apple slicer. I could show the problem for any word, at any point as we iterate through value and iterate through groups.

#

A problem in long words

set value === "Frivolouy"
loop to value[8] === "y" and group === "sy"
- we find position === 1 because group[1] === "y"
so, we slice "Frivolouy" at position 1 and get "Fsivolouy".

Here is the main problem with edits like "Fsivolouy" instead of "Frivolous". In our replacement for value[8], we actually replaced value[1]! So, we sliced at the wrong index. What's the big deal? Well, none of the replacements in the edit phase will ever change that "y" end of "Frivolouy".

In fact, the "edit phase" only touches the first few characters in words.

#

A more severe problem in short words

set value === "Bo"
iterate to value[0] === "B" and group === "yxcvbnm"
- we find position === 4 because group[4] === "b"
so, we slice "Bo" at position 4... (past the end) and get words like "BoY" and "BoX".

This problem actually results in "BoY" and "BoX" in the final result of suggest, falling in line with the capitalization issue I found a few days ago. But it shows another problem beyond capitalization...

The "edit phase" often appends to short words instead of replacing characters.

#

So what?

I'll admit that the generate function tends to operate just fine despite the strange and widespread bugs in edits. However, I'm a big nerdo and wanted to do something to add method to the madness.

A logical fix

This one commit simply altered two lines of code to fix the long-word problem, the short-word problem, and the capitalization issue.

However, the fix caused the failure of three tests. Two of the tests only failed with changes to the order of results, but test #46 failed big time due to the test's reliance on the combination of the long-word problem and the capitalization issue.

A compromise

I've added a new commit that matches all the tests (except #46) by reproducing the long-word problem without the confusing slice of value based on position within each group. I believe it more clearly expresses what's really going on with the keyboard replacements. As I've mentioned in a comment on PR #39, I plan to rework this approach to be even more explicit about the fact that the "edit phase" only touches the first few characters in words.

What are the benefits?

Namely, we can completely fix the short-word problem as well as the capitalization issue. To keep compatibility with the tests, I believe it is wise to allow the long-word problem to remain as an "optimization": we can be explicit that we only visit the first few characters of every word (characters 0-8 with defaultKeyboardLayout) when making the edits.

The text was updated successfully, but these errors were encountered:

wooorm · 2021-01-08T07:57:17Z

Thanks for finding that! I’m not sure how that bug slipped in!

I’m fine with removing the broken code, changing a few orderings of the suggestions, and even otherwise changing suggestions if they‘re much better?

tvquizphd changed the title ~~Broken keyboard groups logic for suggesting edits~~ Broken keyboard groups logic for initial edits Jan 8, 2021

This was referenced Jan 8, 2021

Unexpected capital letters returned for certain capitalized misspellings #37

Closed

Fix caps, keyboard group logic #39

Merged

Unexpected capital letters returned for certain capitalized misspellings retextjs/retext-spell#20

Closed

This was referenced Jan 9, 2021

Reduce backwards compatibility in favor of more logical edits tvquizphd/nspell#1

Merged

Fix keyboard edits, fix capitalization, and update tests #45

Merged

Fix keyboard edits #42

Closed

wooorm closed this as completed in 9c002d2 Jan 17, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Broken keyboard groups logic for initial edits #41

Broken keyboard groups logic for initial edits #41

tvquizphd commented Jan 8, 2021 •

edited

Loading

wooorm commented Jan 8, 2021

Broken keyboard groups logic for initial edits #41

Broken keyboard groups logic for initial edits #41

Comments

tvquizphd commented Jan 8, 2021 • edited Loading

A problem in long words

In fact, the "edit phase" only touches the first few characters in words.

A more severe problem in short words

The "edit phase" often appends to short words instead of replacing characters.

So what?

A logical fix

A compromise

What are the benefits?

wooorm commented Jan 8, 2021

tvquizphd commented Jan 8, 2021 •

edited

Loading