Support for more document languages #488
Comments
Docspell 0.17.1 installed with the docker-compose method seems not to have the french language installed for tesseract.
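One way to confirm a report like this is to list the language packs tesseract actually has available. The sketch below assumes the `tesseract` binary is on the PATH inside the container being checked; the helper names are hypothetical and not part of Docspell.

```python
import subprocess


def parse_tesseract_langs(output: str) -> set[str]:
    """Parse the text printed by `tesseract --list-langs` into language codes."""
    lines = [line.strip() for line in output.splitlines()]
    # The first line is a header such as "List of available languages (3):",
    # so only the remaining non-empty lines are language codes.
    return {line for line in lines[1:] if line}


def installed_tesseract_langs() -> set[str]:
    """Run tesseract and return its installed language codes (hypothetical helper)."""
    result = subprocess.run(
        ["tesseract", "--list-langs"],
        capture_output=True,
        text=True,
        check=True,
    )
    # Assumption: older tesseract versions print the list to stderr instead of stdout.
    return parse_tesseract_langs(result.stdout or result.stderr)
```

If `"fra"` is missing from the result of `installed_tesseract_langs()`, the French traineddata is most likely not installed in that image.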
Thanks @mrtnggnn, this is indeed missing! I'll create a new issue from your comment to fix this bug in the docker file.
The next release will include the following languages for document processing: Spanish, Italian, Portuguese, Czech, Dutch, Danish, Finnish, Norwegian, Swedish, Russian, Romanian. If you'd like others to be included, please let me know.
Please consider adding Polish language processing support.
The supported languages are currently English, German and French. This is what is supported by stanford-nlp. But other languages could be added easily without the NLP support. If a fallback were provided for these, adding more languages would not be hard. Maybe get rid of NLP altogether.
See: #461
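
The fallback idea could look roughly like the sketch below. It only illustrates the dispatch: the names `NLP_LANGUAGES`, `select_analysis` and `analyse` are hypothetical, the result dictionary is a placeholder, and none of this is Docspell's actual implementation.

```python
# Languages with NLP support according to this issue (stanford-nlp).
NLP_LANGUAGES = frozenset({"english", "german", "french"})


def select_analysis(language: str) -> str:
    """Return "nlp" for languages with NLP support, otherwise "fallback"."""
    return "nlp" if language.strip().lower() in NLP_LANGUAGES else "fallback"


def analyse(text: str, language: str) -> dict:
    """Tag a document with the analysis mode that would be used for it.

    Placeholder only: a real implementation would call the NLP pipeline
    or a simpler non-NLP routine at this point.
    """
    return {"language": language, "mode": select_analysis(language), "text": text}
```

With a dispatch like this, adding a language such as Polish would not require an NLP model: it would simply take the fallback path.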