Add Latvian language #679

Closed
lord-lawnmower opened this issue Mar 5, 2021 · 3 comments · Fixed by #694

Comments

@lord-lawnmower

Could the Latvian language be added for Tesseract OCR? Its short code is `lav`.

@eikek
Owner

eikek commented Mar 5, 2021

Yes sure!

Docspell has a date extractor that looks for month names in every supported language to find dates like "23. september 2019". Would it be possible for you to provide these names for Latvian? And while I'm at it :) maybe also which date format is most common, like dd.mm.yyyy or a different one? That would make it easier for me to add support for these patterns in Latvian date extraction. Thank you!

Edit: For reference, the current list is here

@lord-lawnmower
Author

lord-lawnmower commented Mar 6, 2021

These would be the month names:

English - Latvian - Latvian short (with the dot at the end)

January - Janvāris - janv.
February - Februāris - febr.
March - Marts
April - Aprīlis - apr.
May - Maijs
June - Jūnijs - jūn.
July - Jūlijs - jūl.
August - Augusts - aug.
September - Septembris - sept.
October - Oktobris - okt.
November - Novembris - nov.
December - Decembris - dec.

Regarding the date format, the most widely used short date format is dd.mm.yyyy, and the long format is YYYY. [gada] D. MMMM. 'gada' means 'year' (in a specific grammatical case), and it is common to include it in the full date format in Latvia, like this:

2020. gada 30. jūlijs
2018. gada 10. maijs
2020.gada 30.oktobris

https://www.localeplanet.com/icu/lv-LV/index.html
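
To make the formats above concrete, here is a minimal Python sketch of how the month names and the two date patterns could be matched. It is not Docspell's implementation; the names `MONTHS`, `SHORT_RE`, `LONG_RE`, and `extract_dates` are made up for illustration, and only the exact month forms listed in this comment are recognised.

```python
import re
from datetime import date

# Month names and abbreviations as listed in this thread, lower-cased.
# March ("marts") and May ("maijs") have no short form.
MONTHS = {
    "janvāris": 1, "janv.": 1,
    "februāris": 2, "febr.": 2,
    "marts": 3,
    "aprīlis": 4, "apr.": 4,
    "maijs": 5,
    "jūnijs": 6, "jūn.": 6,
    "jūlijs": 7, "jūl.": 7,
    "augusts": 8, "aug.": 8,
    "septembris": 9, "sept.": 9,
    "oktobris": 10, "okt.": 10,
    "novembris": 11, "nov.": 11,
    "decembris": 12, "dec.": 12,
}

# Build a regex alternation of all month forms, longest first.
_MONTH_ALT = "|".join(
    re.escape(m) for m in sorted(MONTHS, key=len, reverse=True)
)

# Short format: dd.mm.yyyy
SHORT_RE = re.compile(r"\b(\d{1,2})\.(\d{1,2})\.(\d{4})\b")

# Long format: YYYY. [gada] D. MMMM
# Whitespace after the dots is optional, so "2020.gada 30.oktobris" also matches.
LONG_RE = re.compile(
    r"\b(\d{4})\.\s*(?:gada\s+)?(\d{1,2})\.\s*(" + _MONTH_ALT + r")",
    re.IGNORECASE,
)


def extract_dates(text):
    """Return all dates found in text, in order of appearance."""
    found = []
    for m in LONG_RE.finditer(text):
        year, day = int(m.group(1)), int(m.group(2))
        month = MONTHS[m.group(3).lower()]
        found.append((m.start(), year, month, day))
    for m in SHORT_RE.finditer(text):
        day, month, year = (int(g) for g in m.groups())
        found.append((m.start(), year, month, day))

    result = []
    for _, year, month, day in sorted(found):
        try:
            result.append(date(year, month, day))
        except ValueError:
            pass  # skip impossible dates such as 31.02.2020
    return result
```

For example, `extract_dates("2020.gada 30.oktobris")` yields a single date, 30 October 2020, and `extract_dates("05.03.2021")` yields 5 March 2021.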

@eikek
Owner

eikek commented Mar 6, 2021

That's great! Thank you very much!
