Spellchecker uses incorrect codepage #40

Closed
xarx00 opened this issue Nov 16, 2015 · 9 comments

Comments

@xarx00

xarx00 commented Nov 16, 2015

I'm using the cs_CZ dictionaries, gImageReader 3.0.2 on Win7 x64.

In the Output window, if a word gets marked as incorrect, the spellchecker rarely offers (on right-click) an alternative word that contains accented characters (unaccented words seem to be preferred). But if it does, the word is presented in an incorrect codepage, and if the word is clicked, the same incorrect characters are inserted into the output text.

Details:
The words in the dictionary are encoded in UTF-8 (e.g. "psychickým"). But the spellchecker presents and inserts the words as if their bytes were in a single-byte encoding ("psychickĂ˝m").
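
The garbling described above can be reproduced outside gImageReader. The following is a minimal sketch, not code from gImageReader or QtSpell. It assumes the wrong single-byte codec is Windows-1250 (`cp1250`), which the report does not name; ISO-8859-2 maps the two bytes involved to the same characters.

```python
# Sketch: reproduce the garbling by decoding UTF-8 bytes with a single-byte codec.
# Assumption: the wrong codec is Windows-1250 (cp1250); the report does not name it.
word = "psychickým"

utf8_bytes = word.encode("utf-8")      # 'ý' becomes the two bytes 0xC3 0xBD
garbled = utf8_bytes.decode("cp1250")  # each byte is read as its own character
restored = garbled.encode("cp1250").decode("utf-8")  # undo the wrong decode

print(garbled)   # the garbled form quoted in the report
print(restored)  # the original word again
```

Because the wrong decode loses no bytes here, reversing it recovers the original word. That points to a conversion problem rather than damaged dictionary data.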

@manisandro
Owner

Oh bummer, this is caused by an incorrect conversion from std::string to QString in QtSpell.

@manisandro
Owner

I've released a new version of qtspell which fixes this. However, Windows users will only get it with the next version of gImageReader. In the meantime, you should be able to just replace the qtspell-qt5-0.dll in the gImageReader/bin folder with the one here: http://smani.fedorapeople.org/tmp/qtspell.zip
(Back up the old one just in case.)

@xarx00
Author

xarx00 commented Nov 16, 2015

That was fast! But the fix does not work. Now:

  1. All correctly spelled Czech words are marked as incorrect
  2. In the list of alternatives, the words are still displayed with an incorrect codepage

Please, reopen the issue.

@manisandro
Owner

Uhm, this is very odd, since the only thing I've changed is how the strings returned by the enchant speller are converted to QString, so it should have no effect whatsoever on whether a word is detected as correctly or incorrectly spelled. Admittedly, I've only tested it on Linux; I'll quickly check on Windows too.

manisandro reopened this Nov 16, 2015
@manisandro
Owner

Working on Windows also as far as I can see...

[Screenshot: screen]

@xarx00
Author

xarx00 commented Nov 16, 2015

This is the original DLL, before the fix:
[Screenshot: orig]

Here is the same document with the fixed DLL you pointed me to. Note that almost everything is marked as incorrectly spelled:
[Screenshot: fixed]

And here is what the spellchecker menu looks like on this text:
[Screenshot: fixed_menu]

@manisandro
Owner

I'll need to find a Windows 7 machine to test this on. In the meantime, could you please specify whether you are using the spelling dictionary from [1] or from somewhere else, and, just in case, also attach the image you are working with?

[1] http://cgit.freedesktop.org/libreoffice/dictionaries/tree/cs_CZ

@xarx00
Author

xarx00 commented Nov 16, 2015

You're right. I used a year-old dictionary, and it contained a bug: cs_CZ.aff specified the charset ISO-8859-2, while the dictionary was in fact in UTF-8. The dictionary from your link works fine, and after fixing the charset specification in my dictionary, it works fine too. I've been using my dictionary with an office suite (not OO) and never had a problem with it; perhaps the suite ignores the charset specification header. Thank you, please close the bug now.
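
A mismatch like this can be spotted before loading a dictionary. The sketch below is illustrative only and is not part of gImageReader, QtSpell, or Hunspell. It assumes the charset is declared on a `SET` line in the .aff file, as Hunspell affix files do; the function names are hypothetical, and the sample data is made up for the demonstration.

```python
import re
from typing import Optional

def declared_charset(aff_bytes: bytes) -> Optional[str]:
    """Return the charset named on the SET line of an .aff file, if present."""
    if aff_bytes.startswith(b"\xef\xbb\xbf"):  # skip a UTF-8 byte order mark
        aff_bytes = aff_bytes[3:]
    for raw in aff_bytes.splitlines():
        line = raw.decode("ascii", errors="replace").strip()
        match = re.match(r"SET\s+(\S+)", line)
        if match:
            return match.group(1)
    return None

def looks_like_utf8(data: bytes) -> bool:
    """True if data decodes as UTF-8 and contains at least one non-ASCII byte."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return any(byte >= 0x80 for byte in data)  # pure ASCII proves nothing

def charset_mismatch(aff_bytes: bytes, dic_bytes: bytes) -> bool:
    """True if the word list looks like UTF-8 but the .aff declares another charset."""
    declared = declared_charset(aff_bytes)
    if declared is None:
        return False  # nothing declared, so nothing to contradict
    declares_utf8 = declared.upper().replace("-", "").replace("_", "") == "UTF8"
    return looks_like_utf8(dic_bytes) and not declares_utf8

# Made-up sample data mirroring the situation described above.
bad_aff = b"SET ISO8859-2\nTRY abc\n"
good_aff = b"SET UTF-8\nTRY abc\n"
dic = "1\npsychick\u00fdm\n".encode("utf-8")

print(charset_mismatch(bad_aff, dic))   # declared charset contradicts the data
print(charset_mismatch(good_aff, dic))  # declaration and data agree
```

The check is a heuristic: text in a single-byte encoding that happens to form valid UTF-8 would be misjudged, though that is rare for real word lists with accented characters.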

@manisandro
Owner

Ah cool, that explains it. Thanks for reporting the issue!
