-
-
Notifications
You must be signed in to change notification settings - Fork 20
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Unexpected capital letters returned for certain capitalized misspellings #37
Comments
Awesome, thanks for digging in and doing work on these two PRs! I see that both PRs are WIP: what can we do to make them actionable? How can I help? |
The most helpful feedback you could give right now would be how I should approach the test cases:
After that, I'd like to hear your thoughts on what seems to be broken keyboard groups logic. For example, the input I've only set the WIP status because the tests are failing. If backwards compatibility for certain arbitrary test cases is not an issue, then we would just need to update the tests. PR #38 is failing the tests less severely than PR #39, though I think PR #39 makes more logical sense. I've spent some time today looking into how to make PR #39 match the tests more closely, and it really seems that one test case in particular (test #46) is really reliant on replacements made by what seems like a buggy implementation of replacements based on keyboard groups. |
In relation to a new issue I've raised, check out updates I've made on PR #39. I think it's almost ready! |
For steps to quickly reproduce with a related gist, see the original issue I've opened on retext-spell.
TLDR: see PR 38 and PR 39 that I've opened against this repo.
Background
I have found over 200 5-letter dictionary words for which
retext-spell
produces suggestions such as the following:/^Thre[f-ln-racuvxyz]$/
/^Ma[d-ho-raluwxz]or$/
/^Ch[d-hj-np-xabz]me$/
There is nothing to suggest this problem is limited to 5-letter dictionary words, as I've stumbled across the following:
/^Tinp[bcdfhlmqvx]$/
One example I've tested
a
is "t" at index 0before
="Twin" andafter
= ""b
="e"This is how we arrive at the confusing suggestion of "TwinE".
Problem
When creating the initial array of edits based on keyboard groups, the current logic of
nspell
is as follows:a
a
is uppercase in the input worda
a
within the groupa
's position within the group (note, this defaults to splitting at the very end of the word when the character's position in the group exceeds the length of the input word)b
!=a
b
if the input charactera
is uppercaseThis results in uppercase characters inserted at seemingly random positions in the text because the position of character
a
within the group is not the index of charactera
within the input word.Note, the notion of "Keyboard groups" doesn't really make sense in the current approach in
main
. Thesuggest
function inserts each replacement character at a position (ofa
in group) that is unrelated to the original position (ofa
in word). This "root problem" is ignored in PR 38 and addressed in PR 39.The text was updated successfully, but these errors were encountered: