Multiple Languages #154

Closed
kazuser opened this issue Jul 25, 2023 · 4 comments

kazuser commented Jul 25, 2023

Hi! Thanks a lot for your "lingua"!

Could you please test these two inputs:

English language Английский язык

and

English language - Английский язык

?

[image: lingua]

My code is:

from lingua import Language, LanguageDetectorBuilder
languages = [Language.ENGLISH, Language.RUSSIAN]
detector = LanguageDetectorBuilder.from_languages(*languages).build()
sentence = '%text_from_memo%'
for result in detector.detect_multiple_languages_of(sentence):
  print(f"{result.language.name} {sentence[result.start_index:result.end_index]}")

But I'm calling it from Delphi 11 (+ Python 3.10.9), so I'm not sure which of them is the source of the problem :)

kazuser commented Jul 27, 2023

And here is another case, where the result is empty:

from lingua import Language, LanguageDetectorBuilder
languages = [Language.ENGLISH, Language.KAZAKH, Language.RUSSIAN]
detector = LanguageDetectorBuilder.from_languages(*languages).build()
sentence = 'V төзімділік спорт'
for result in detector.detect_multiple_languages_of(sentence):
  print(f"{result.language.name} {sentence[result.start_index:result.end_index]}")

[image: empty]

kazuser commented Jul 27, 2023

Maybe something is wrong with the order? 😕

[image: order]

pemistahl (Owner) commented

@kazuser I think I've fixed the underlying problem now. The fix will be part of the next release.

@pemistahl pemistahl added the bug Something isn't working label Sep 11, 2023

pchr8 commented Sep 29, 2023

For completeness: I can reproduce this with the following input:

"Das ist mein Text, mit lange deutsche Wörter. Here is my text, it's clearly english text. Ось це мій текст.\n\n-\n\n\tSome more text.\n\nStop processing here - \n\n\nEND OF TEXT."

Detection stops at the "-" in 1.3.2, but works nicely in 1.3.3, which was released two days ago!
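
One way to check for this kind of truncation is to verify that the reported spans cover the whole input. The following is only a minimal sketch: uncovered_ranges and check_with_lingua are hypothetical helper names, not part of lingua, and the only lingua calls used are the ones already shown earlier in this thread.

```python
from typing import Iterable, List, Tuple


def uncovered_ranges(text: str, spans: Iterable[Tuple[int, int]]) -> List[Tuple[int, int]]:
    """Return (start, end) ranges of text that no span covers.

    Gaps that contain only whitespace are ignored.
    """
    gaps = []
    cursor = 0
    for start, end in sorted(spans):
        # Anything between the previous span and this one is a gap.
        if start > cursor and text[cursor:start].strip():
            gaps.append((cursor, start))
        cursor = max(cursor, end)
    # Text left over after the last span is also a gap.
    if cursor < len(text) and text[cursor:].strip():
        gaps.append((cursor, len(text)))
    return gaps


def check_with_lingua(sentence, languages):
    """Hypothetical wrapper: run lingua and report uncovered ranges."""
    # Imported lazily so uncovered_ranges works without lingua installed.
    from lingua import LanguageDetectorBuilder

    detector = LanguageDetectorBuilder.from_languages(*languages).build()
    results = detector.detect_multiple_languages_of(sentence)
    spans = [(r.start_index, r.end_index) for r in results]
    return uncovered_ranges(sentence, spans)
```

If check_with_lingua returns a non-empty list for one of the inputs above, part of the text was not assigned to any language, which would match the behaviour reported here.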
