Encoding issue when using system hunspell dictionary #55

Closed
tillea opened this issue Nov 15, 2023 · 1 comment

Comments


tillea commented Nov 15, 2023

Hi,
the Debian-packaged version of the hunspell R package received a bug report asking it to use the dictionaries provided by Debian. I followed that hint, but this leads to two failures in the test suite that look like encoding issues:

── Failure ('test-encodings.R:16:3'): Dictionaries are found ───────────────────
hunspell_info("en_US")$wordchars not equal to "’".
1/1 mismatches
x[1]: "0123456789’"
y[1]: "’"
── Failure ('test-encodings.R:17:3'): Dictionaries are found ───────────────────
hunspell_info("en_GB")$wordchars not equal to "’".
1/1 mismatches
x[1]: "0123456789’"
y[1]: "’"
[ FAIL 2 | WARN 0 | SKIP 0 | PASS 126 ]

You can also find a full build log reproducing these failures.

I wonder if you see any solution for this issue, since I agree with the bug submitter that it is better to use the dictionaries installed on the system.
Kind regards, Andreas.

Member

jeroen commented Nov 15, 2023

This is not really an encoding issue, but simply a different dictionary with different values. Specifically, in your dictionary numbers are ignored, so words like P2P or 2nd cannot exist. This will affect spell-checking results as well.

This test is intended to verify that the dictionaries get parsed correctly. Given that you are patching the package to use other dictionaries than the ones we ship, you should probably just skip or adapt these tests (as you already did iiuc).
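
For anyone adapting the test downstream, a relaxed check might look roughly like the sketch below. It assumes the only property worth asserting against a system dictionary is that `wordchars` contains the typographic apostrophe (U+2019), whatever else it contains. The helper name `has_typographic_apostrophe` is hypothetical and not part of hunspell or its test suite; the two sample strings are copied from the failure output above.

```r
# Hypothetical helper, not part of the hunspell package or its tests.
# Returns TRUE when the wordchars string contains U+2019 (right single
# quotation mark), regardless of other characters such as digits.
has_typographic_apostrophe <- function(wordchars) {
  grepl("\u2019", wordchars, fixed = TRUE)
}

# Values taken from the failure output in this issue.
bundled_wordchars <- "\u2019"            # what the upstream test expects
debian_wordchars  <- "0123456789\u2019"  # what the system dictionary reports

has_typographic_apostrophe(bundled_wordchars)
has_typographic_apostrophe(debian_wordchars)
```

In a patched test file this could be wrapped as `expect_true(has_typographic_apostrophe(hunspell_info("en_US")$wordchars))`. That only confirms the apostrophe survived parsing; it no longer verifies the exact contents of the bundled dictionary, which is what the original test was written to do.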

I don't see any meaningful way to run tests of the parser, if you replace our testing subject with something else...

jeroen closed this as completed Nov 15, 2023