Spellchecker uses incorrect codepage #40

xarx00 · 2015-11-16T11:21:52Z

I'm using the cs_CZ dictionaries, gImageReader 3.0.2 on Win7 x64.

In the Output window, if a word gets marked as incorrect, the spellchecker rarely offers (on right-click) an alternative word that contains accented characters (unaccented words seem to be preferred). But if it does, the word is presented in an incorrect codepage, and if the word is clicked, the same incorrect characters are inserted into the output text.

Details:
The words in the dictionary are encoded in UTF-8 (e.g. "psychickým"). But the spellchecker presents and inserts the words as if they were in a single-byte encoding ("psychickĂ˝m").
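
The garbling can be reproduced outside gImageReader. The snippet below is only a sketch of the symptom, not of the QtSpell code: it assumes the UTF-8 bytes were reinterpreted as Windows-1250 (the usual ANSI code page on a Czech Windows install), which is a guess that happens to match the string shown above.

```python
# Assumption: the UTF-8 bytes of the suggestion were read back as
# Windows-1250 (cp1250). The code page is a guess, not taken from QtSpell.
word = "psychickým"

utf8_bytes = word.encode("utf-8")        # 'ý' becomes the two bytes C3 BD
garbled = utf8_bytes.decode("cp1250")    # each byte is shown as one character

# Reversing the wrong step recovers the original word.
repaired = garbled.encode("cp1250").decode("utf-8")

print(garbled)
print(repaired)
```

Because the round trip is lossless here, text that was already inserted with the wrong characters can in principle be repaired the same way.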

manisandro · 2015-11-16T12:14:31Z

Oh bummer, this is caused by an incorrect conversion from std::string to QString in QtSpell.

manisandro · 2015-11-16T12:42:13Z

I've released a new version of qtspell which fixes this. However, Windows users will only get it with the next version of gImageReader. In the meantime, you should be able to just replace the qtspell-qt5-0.dll in the gImageReader/bin folder with the one here: http://smani.fedorapeople.org/tmp/qtspell.zip
(Back up the old one just in case)

xarx00 · 2015-11-16T12:56:40Z

That was fast! But the fix does not work. Now:

- All correctly spelled Czech words are marked as incorrect.
- In the list of alternatives, the words are still displayed with an incorrect codepage.

Please, reopen the issue.

manisandro · 2015-11-16T13:00:22Z

Uhm, this is very odd since the only thing I've changed is how the strings returned by the enchant speller are converted to QString, so it should have no effect whatsoever on whether a word is detected as correctly or incorrectly spelled. Admittedly, I've only tested it on Linux, will quickly check on Windows too.

manisandro · 2015-11-16T13:09:11Z

Working on Windows also as far as I can see...

xarx00 · 2015-11-16T16:54:34Z

This is the original DLL, before the fix:

Here is the same document with the fixed DLL you pointed me to. Note that almost everything is marked as incorrectly spelled:

And here is what the spellchecker menu looks like on this text:

manisandro · 2015-11-16T21:25:40Z

I'll need to find a Windows 7 machine to test this on. In the meantime, could you please specify whether you are using the spelling dictionary from [1] or from somewhere else, and, just in case, also attach the image you are working with?

[1] http://cgit.freedesktop.org/libreoffice/dictionaries/tree/cs_CZ

xarx00 · 2015-11-16T22:14:04Z

You're right. I used a year-old dictionary, and it contained a bug. In cs_CZ.aff, the charset was specified as ISO-8859-2, while the dictionary was in fact in UTF-8. The dictionary from your link works fine, and after fixing the charset specification in my dictionary, it works fine too. I've been using my dictionary with an office suite (not OO), and never had a problem with it. Perhaps the suite ignores the charset specification header. Thank you, please close the bug now.
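
For anyone hitting the same symptom, a quick check for this kind of mismatch might look like the sketch below. It assumes the .aff file declares its charset on a line starting with `SET`, and it uses a simple heuristic: if the .dic bytes contain non-ASCII data that decodes cleanly as UTF-8, the declared charset should be UTF-8. The helper names and the sample data are hypothetical and not part of gImageReader, QtSpell, or any dictionary package.

```python
import re


def declared_charset(aff_text):
    """Return the charset named on the SET line of an .aff file, or None."""
    match = re.search(r"^SET\s+(\S+)", aff_text, flags=re.MULTILINE)
    return match.group(1) if match else None


def looks_like_utf8(raw):
    """True if the bytes hold non-ASCII data that decodes cleanly as UTF-8."""
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return any(byte >= 0x80 for byte in raw)


def charset_mismatch(aff_text, dic_bytes):
    """True if the .dic content looks like UTF-8 but the .aff says otherwise."""
    declared = declared_charset(aff_text)
    if declared is None:
        return False
    declared_is_utf8 = declared.upper().replace("-", "") == "UTF8"
    return looks_like_utf8(dic_bytes) and not declared_is_utf8


# Hypothetical sample data mimicking the broken dictionary described above.
aff_text = "SET ISO8859-2\n"
dic_bytes = "1\npsychickým\n".encode("utf-8")

print(declared_charset(aff_text))
print(charset_mismatch(aff_text, dic_bytes))
```

With real files, `aff_text` would be read from cs_CZ.aff (the header is plain ASCII) and `dic_bytes` from cs_CZ.dic opened in binary mode. The heuristic only flags the UTF-8 case; it cannot tell two single-byte encodings apart.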

manisandro · 2015-11-16T22:15:14Z

Ah cool, that explains it. Thanks for reporting the issue!

manisandro closed this as completed Nov 16, 2015

manisandro reopened this Nov 16, 2015

manisandro closed this as completed Nov 16, 2015

SantosSi mentioned this issue Nov 27, 2017

hOCR PDF export: prevent users from overwriting any input image PDF file #243

Closed

napasa mentioned this issue Dec 26, 2017

newest master code occur exception when export pdf #276

Closed

SantosSi mentioned this issue Dec 27, 2017

Qt5,Debian,libtesseract4: Crash on recognition #279

Closed

TeoColuccio mentioned this issue Apr 19, 2020

Glibmm-error, detected trace/breakpoint #445

Closed
