aff-regex #18

Closed
doublex opened this issue Dec 29, 2021 · 6 comments
Labels
question Further information is requested

Comments

@doublex

doublex commented Dec 29, 2021

The Czech AFF file contains an invalid regex:
https://github.com/wooorm/dictionaries/blob/main/dictionaries/cs/index.aff#L2119

As a result, this line fails with re.error: unterminated character set at position 36
https://github.com/zverok/spylls/blob/master/spylls/hunspell/data/aff.py#L266
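
For illustration, here is a minimal sketch of how this class of error arises in Python. The condition strings and the helper name `try_compile_condition` are hypothetical: they are neither the actual line from cs/index.aff nor spylls's code. The sketch only assumes that a suffix condition is turned into a regex anchored at the end of the word.

```python
import re

def try_compile_condition(condition: str):
    """Compile a suffix condition as a regex anchored at the end of a word.

    Returns (pattern, None) on success or (None, error message) on failure.
    Hypothetical helper for illustration only, not part of spylls.
    """
    try:
        return re.compile(condition + "$"), None
    except re.error as exc:
        return None, str(exc)

# A well-formed condition compiles and can be matched against a stem.
ok_pattern, ok_error = try_compile_condition("[^aeiou]y")
print(bool(ok_pattern.search("happy")), ok_error)

# A condition with an unclosed bracket (hypothetical example) is rejected by `re`.
bad_pattern, bad_error = try_compile_condition("[^aeiou")
print(bad_pattern, bad_error)
```

Catching `re.error` at the point where the condition is compiled would let a loader report which line of the dictionary is malformed instead of failing with a bare traceback.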

@zverok
Owner

zverok commented Jan 15, 2022

What are you suggesting here? What's the desired behavior for definitely-wrong dictionary files?

@zverok added the "question" label Jan 15, 2022
@doublex
Author

doublex commented Jan 15, 2022

You are right: the problem is the affix file.
But there may still be an issue in spylls. This affix looks correct but fails:
https://github.com/wooorm/dictionaries/blob/main/dictionaries/uk/index.aff#L1464

@zverok
Owner

zverok commented Jan 16, 2022

@doublex Ugh, this is more complicated. It seems I've never encountered dictionaries with () in conditions before, even when running smoke tests on all the dictionaries that were available when spylls was finalized (I'm not even sure Hunspell supports this syntax). I'll try to take a closer look in the next few days.

@doublex
Author

doublex commented Jan 16, 2022

They are a rare (strange?) case. Maybe simply remove ()?
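
A minimal sketch of that idea, under loud assumptions: the condition string below is made up (it is not the line from uk/index.aff), `strip_groups` is a hypothetical helper, and this is not what spylls does. Dropping the parentheses only preserves the meaning when they are plain grouping, with no `|` alternation inside and no quantifier applied to the group.

```python
import re

def strip_groups(condition: str) -> str:
    """Naively drop every '(' and ')' from a condition string.

    Hypothetical helper: safe only when the parentheses are plain grouping
    (no alternation inside, no quantifier after the group).
    """
    return condition.replace("(", "").replace(")", "")

cond = "(ab)[cd]"          # hypothetical condition containing a group
flat = strip_groups(cond)  # same condition with the parentheses removed

# For this simple case both forms accept and reject the same strings.
for word in ("abc", "abd", "abx"):
    same = bool(re.fullmatch(cond, word)) == bool(re.fullmatch(flat, word))
    print(word, same)
```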

@zverok
Owner

zverok commented Jan 23, 2022

Surprisingly enough, this case, while indeed rare, made me rethink a bit why it is a problem... and simplify the code so that it no longer is one :)
See f92f74b: there are significant simplifications in spylls/hunspell/data/aff.py, which drop the hacky regexp construction.
Released as 0.1.7; it works with uk_UA as expected.

@doublex
Author

doublex commented Jan 23, 2022

Thanks a lot for all your efforts!

@doublex closed this as completed Jan 23, 2022