This repository has been archived by the owner on May 22, 2019. It is now read-only.
While using TokenParser to correct typos in identifiers, I constantly run into bad splits like `HTMLElement` -> `htmle lement`.
To me it looks like in that case (several uppercase letters in a row) it would be better to attach the last uppercase letter to the next token. I've seen many cases where this would be the right call, and almost none where it would break the logic.
E.g. the token 'lement' is one of the most frequent misspelled tokens that gets split out. Here is where it comes from (top-10 examples):
And here are the correct parses for comparison:
TL;DR: Can I add this case to the TokenParser? It could be switched off by default at first, and I would like to try it out on the typo correction task.
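To make the proposal concrete, here is a minimal sketch of the rule. It is not the actual TokenParser code: `split_identifier`, the `acronym_aware` flag, and both regexes are hypothetical names made up for illustration, and the "naive" pattern is only a simplified stand-in that reproduces the split reported above.

```python
import re

# Simplified stand-in for the current behaviour (an assumption, not the
# real TokenParser logic): an uppercase run swallows the capital that
# actually starts the next word.
_NAIVE = re.compile(r"[A-Z]?[a-z]+|[A-Z]+|[0-9]+")

# Proposed rule. Alternatives are tried in order at each position:
#   1. an uppercase run that stops right before a capital followed by a
#      lowercase letter (that capital belongs to the next word)
#   2. an optional capital followed by lowercase letters
#   3. any remaining uppercase run
#   4. a run of digits
_ACRONYM_AWARE = re.compile(r"[A-Z]+(?=[A-Z][a-z])|[A-Z]?[a-z]+|[A-Z]+|[0-9]+")


def split_identifier(name, acronym_aware=True):
    """Split an identifier into lowercase tokens.

    `acronym_aware` is a hypothetical switch so the new rule can be
    turned off, as suggested in the issue.
    """
    pattern = _ACRONYM_AWARE if acronym_aware else _NAIVE
    return [part.lower() for part in pattern.findall(name)]


print(split_identifier("HTMLElement", acronym_aware=False))  # ['htmle', 'lement']
print(split_identifier("HTMLElement"))                       # ['html', 'element']
print(split_identifier("XMLHttpRequest"))                    # ['xml', 'http', 'request']
```

One case where this sketch does worse is a plural acronym: `IDs` becomes `i ds` instead of `id s`, which is presumably the kind of rare breakage to look for when trying it on real identifiers.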