Spell checker is active for partial words. #166

Closed
vithiri opened this issue Oct 18, 2017 · 8 comments

@vithiri
vithiri commented Oct 18, 2017

I am not certain if this is a feature or a bug, but the spell checker is active even when a word has not been completed.

If I type "paragraph", the spell checker will kick in and highlight the word at "parag" since it's not a proper word (yet).

Would it be better to check a word's spelling only once a space or a special character has been typed after it?

@olivierkes olivierkes added this to the Future milestone Oct 18, 2017
@olivierkes
Owner

Good point. I tried a few options, but wasn't able to find a satisfying one.

The relevant code is this:

# Spell checking
# Based on http://john.nachtimwald.com/2009/08/22/qplaintextedit-with-in-line-spell-check/
WORDS = '(?iu)[\w\']+'
if hasattr(self.editor, "spellcheck") and self.editor.spellcheck:
    for word_object in re.finditer(WORDS, text):
        if self.editor._dict and not self.editor._dict.check(word_object.group()):

WORDS = '(?iu)[\w\']+' is the regex used to match words ((?iu) means case-insensitive, Unicode). If I use '(?iu)[\w\']+\W' instead, i.e. add \W ("followed by a non-word character"; the code has to be modified slightly, see below), it works as desired while typing, but it no longer spellchecks words at the end of a line:

[image]

(The code has to be modified like this:)

WORDS = '(?iu)([\w\']+)\W'  # ← added a capturing group on the word
if hasattr(self.editor, "spellcheck") and self.editor.spellcheck:
    for word_object in re.finditer(WORDS, text):
        if self.editor._dict and not self.editor._dict.check(word_object.group(1)):  # ← spell check the group

Maybe somebody has a better idea?
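
The trade-off described above can be reproduced outside the editor. The following is a minimal sketch, assuming only Python's standard `re` module; the helper names `words_plain` and `words_terminated` are hypothetical, and the patterns are the two from this comment written as raw strings.

```python
import re

# The two patterns discussed above, written as raw strings.
WORDS_PLAIN = r"(?iu)[\w']+"            # matches every run of word characters
WORDS_TERMINATED = r"(?iu)([\w']+)\W"   # requires a trailing non-word character


def words_plain(text):
    """Return every word, including one that is still being typed."""
    return [m.group() for m in re.finditer(WORDS_PLAIN, text)]


def words_terminated(text):
    """Return only words that are followed by a non-word character."""
    return [m.group(1) for m in re.finditer(WORDS_TERMINATED, text)]


if __name__ == "__main__":
    line = "This is a parag"
    print(words_plain(line))       # includes the unfinished word
    print(words_terminated(line))  # skips the last word on the line
```

The second pattern skips the unfinished word, but for the same reason it also skips a finished word that happens to sit at the very end of a line.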

@olivierkes
Owner

olivierkes commented Oct 19, 2017

> Maybe somebody has a better idea?

Well actually I do... I add a space at the end of every checked line, except if the text cursor is there:

# The following algorithm would not check words at the end of a line.
# This hack adds a space to every line where the text cursor is not,
# so that it doesn't spellcheck while typing, but still spellchecks at
# the end of lines. See GitHub issue #166.
textedText = text
if self.currentBlock().position() + len(text) != \
        self.editor.textCursor().position():
    textedText = text + " "
Please test (on branch future) and confirm whether that behavior is better or not :)
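
For anyone who wants to try the idea without Qt, here is a minimal sketch of the same trick. It assumes the block position and the cursor position are passed in as plain integers; the function name `words_to_check` and its parameters are hypothetical and not part of Manuskript.

```python
import re

WORDS = r"(?iu)([\w']+)\W"  # a word followed by a non-word character


def words_to_check(text, block_position, cursor_position):
    """Return (start, word) pairs that should be sent to the spell checker.

    A space is appended unless the cursor sits at the very end of the line,
    so a word still being typed there is left alone.
    """
    padded = text
    if block_position + len(text) != cursor_position:
        padded = text + " "
    return [(m.start(1), m.group(1)) for m in re.finditer(WORDS, padded)]


if __name__ == "__main__":
    # Cursor elsewhere: the last word on the line is checked.
    print(words_to_check("a tezt", block_position=0, cursor_position=0))
    # Cursor at the end of the line: the word in progress is skipped.
    print(words_to_check("a parag", block_position=0, cursor_position=7))
```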

@vithiri
Author

vithiri commented Oct 19, 2017

The behavior is clearly much better now. One observation: the wavy line now extends to include the word terminator as well, whether that's the trailing space or a non-word character -- but seemingly only the first one.

[image: spellchecker]

Here is a word that behaves strangely, most likely because the "special character" is actually part of the word -- the wavy line goes away after adding a space at the end of the word:

[image: shouldnt]

@olivierkes
Owner

> it extends the wavy line to include the word terminator as well (...)

Yep, stupid mistake. Fixed :)
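
The fix itself is not shown in this thread. One plausible reading, assuming the underline was being applied to the span of the whole regex match rather than to the captured word, can be sketched as follows; `underline_spans` is a hypothetical helper used only for illustration.

```python
import re

WORDS = r"(?iu)([\w']+)\W"  # group 1 is the word, the trailing \W is the terminator


def underline_spans(text, use_group):
    """Return (start, end) spans that would be underlined.

    With use_group=False the span covers the whole match, terminator included.
    With use_group=True the span covers only the captured word.
    """
    spans = []
    for m in re.finditer(WORDS, text):
        spans.append(m.span(1) if use_group else m.span())
    return spans


if __name__ == "__main__":
    line = "tezt, ok "
    print(underline_spans(line, use_group=False))  # spans include "," and " "
    print(underline_spans(line, use_group=True))   # spans cover the words only
```

Under that assumption, only the first terminating character is swallowed by the match, which would be consistent with the observation quoted above.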

> Here is a word that is behaving strangely (...)

Previous commit should fix that too:

[image: test]

It showed me another issue: in some cases, right-click does not select the same word to suggest corrections:

[image: test]

Memo: I need to fix that here. It seems WordUnderCursor does not treat the apostrophe ' as part of a word:

# Select the word under the cursor.
# But only if there is no selection (otherwise it's impossible to select more text to copy/cut)
cursor = self.textCursor()
if not cursor.hasSelection():
    cursor.select(QTextCursor.WordUnderCursor)
    self.setTextCursor(cursor)
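
One way to make the right-click selection agree with the spell checker would be to locate the word with the same regex instead of relying on WordUnderCursor. The sketch below is plain Python with no Qt; `word_at` is a hypothetical helper and not an existing Manuskript or Qt function. The returned span could then be applied to a text cursor.

```python
import re

WORD = re.compile(r"(?iu)[\w']+")  # same word definition as the spell checker


def word_at(text, pos):
    """Return the (start, end) span of the word touching position pos.

    Apostrophes count as part of the word. Returns None when the position
    is not inside or directly next to a word.
    """
    for m in WORD.finditer(text):
        if m.start() <= pos <= m.end():
            return m.span()
    return None


if __name__ == "__main__":
    line = "I shouldn't go"
    start, end = word_at(line, 9)  # position of the apostrophe
    print(line[start:end])
```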

@vithiri
Author

vithiri commented Oct 20, 2017

I noticed that both FocusWriter and Gedit suffer from the same issue: the word is underlined as misspelled if, e.g., shouldn' is typed, so perhaps this behavior is simply inherent to spell checking.

@olivierkes olivierkes modified the milestones: Future, 0.6.0 Nov 14, 2017
@olivierkes
Owner

(This was fixed in 0.5.0, but I forgot to close the issue.)

@olivierkes olivierkes modified the milestones: 0.6.0, 0.5.0 Nov 14, 2017
@vithiri
Author

vithiri commented Sep 18, 2019

@gedakc @olivierkes Did this issue work its way back in at some point?

@gedakc
Collaborator

gedakc commented Sep 18, 2019

Good catch @vithiri. I performed a git bisect, and it appears that the problem with the spell checker being active while typing a word came back in 0.7.0 with commit 63b471e, which fixed issue #283.

I will take a look at this.

gedakc added a commit to gedakc/manuskript that referenced this issue Sep 19, 2019
See PR #<to-be-inserted-later>

This commit restores the functionality that prevents spell checking a
word that is being actively typed at the end of a paragraph.

The goals for the spell check word match regexp are:

A. Words should include those with an apostrophe
   *E.g., can't*
B. Words should exclude underscore
   *E.g., hello_world is two words*
C. Words in other languages should be recognized
   *E.g., French word familiarisé*
D. Spell check should include word at absolute end of line with no
   trailing space or punctuation
   *E.g., tezt*
E. Spell check should ignore partial words in progress (user typing)
   *E.g., paragr while midway through typing paragraph*

This commit addresses all five of the above goals.

HISTORY:
- See issue olivierkes#166 and commit 6ec0c19 in the 0.5.0 release.
- See issue olivierkes#283 and commit 63b471e in the 0.7.0 release.

Also fix minor incorrect utf-8 encoding at top of source file.
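
The regexp from the commit is not quoted in this thread. As an illustration only, the sketch below shows one pattern that would satisfy goals A to E from the commit message above, combined with the trailing-space trick from earlier in the thread. The names `WORD_RE` and `find_words` and the `cursor_at_end` flag are assumptions made for this example, not Manuskript's actual code.

```python
import re

# A word is a run of letters, digits, or apostrophes (goals A and C).
# [^\W_] means "a word character that is not an underscore" (goal B).
# The lookahead requires a terminator: any character that is neither a
# word character nor an apostrophe, or an underscore.
WORD_RE = re.compile(r"((?:[^\W_]|')+)(?=[^\w']|_)")


def find_words(line, cursor_at_end=False):
    """Return the words on a line that should be spell checked.

    A space is appended unless the cursor is at the end of the line, so a
    finished last word is checked (goal D) and a word in progress is not
    (goal E).
    """
    padded = line if cursor_at_end else line + " "
    return [m.group(1) for m in WORD_RE.finditer(padded)]


if __name__ == "__main__":
    print(find_words("I can't"))                         # goal A
    print(find_words("hello_world"))                     # goal B
    print(find_words("mot familiarisé"))                 # goal C
    print(find_words("a tezt"))                          # goal D
    print(find_words("new paragr", cursor_at_end=True))  # goal E
```

Using a lookahead instead of consuming the terminator keeps the match span equal to the word itself, so no terminator character ends up underlined.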
gedakc added a commit that referenced this issue Sep 22, 2019
See PR #651