from lingua import Language, LanguageDetectorBuilder
languages = [Language.ENGLISH, Language.FRENCH, Language.GERMAN, Language.SPANISH]
detector = LanguageDetectorBuilder.from_languages(*languages).build()
confidence_values = detector.compute_language_confidence_values("Cereal Churros Sabor A Canela Kellogg´S 260 Gr")
for language, value in confidence_values:
    print(f"{language.name}: {value:.2f}")
The documentation explains that the probabilities sum to 1, which makes sense to me. But here it seems that a separate binary classification is done for each language, and the languages are then ranked by those binary probabilities. Is this a bug?
Also, if I have fewer languages to classify into, does that make the results more accurate?
Obviously, you are not using the latest release 1.3.* of the library. I've reworked the computation of confidence scores in this version. With Lingua 1.3.1, your code returns the following probabilities which sum to 1.0.
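A quick way to verify that claim on your own installation is to add up the returned values. The helper below is a minimal sketch and not part of Lingua: `sums_to_one` is a hypothetical name, and the sample numbers are made up for illustration rather than real detector output. It assumes the results can be read as (language, value) pairs, as in the loop in the question.

```python
import math


def sums_to_one(pairs, tol=1e-6):
    """Return True if the values in (label, value) pairs sum to 1 within tol."""
    total = math.fsum(value for _, value in pairs)
    return math.isclose(total, 1.0, abs_tol=tol)


# Made-up numbers for illustration only, NOT real Lingua output.
# Relative confidences: the values share a total of 1.
relative = [("SPANISH", 0.7), ("FRENCH", 0.2), ("ENGLISH", 0.1)]
# Independent per-language scores: each value is in [0, 1] on its own.
independent = [("SPANISH", 1.0), ("FRENCH", 0.8), ("ENGLISH", 0.6)]

print(sums_to_one(relative))
print(sums_to_one(independent))
```

If the check fails on the output of `compute_language_confidence_values`, that suggests an older release is installed, which matches the behaviour described in the question.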