French not treated as possible language for character 'â' #115

Closed
bdecarne opened this issue Nov 12, 2021 · 2 comments

@bdecarne

Hello!

Look at this example: the result should clearly be Language.FRENCH, but it is Language.UNKNOWN:

LanguageDetector detector = LanguageDetectorBuilder.fromLanguages(Language.ENGLISH, Language.FRENCH).build();
Language detectedLanguage = detector.detectLanguageOf("Découverte du château grâce à l'application visite virtuelle");
assertEquals(Language.FRENCH, detectedLanguage);
Expected :FRENCH
Actual   :UNKNOWN
@pemistahl added the bug label on Nov 14, 2021
@pemistahl
Owner

Thank you for this report, @bdecarne. This is a bug in the rule-based filter engine, which is supposed to classify the character â as a possible indicator for French. Unfortunately, I forgot to include French as a possible language for this character. This is easy to fix, and I will do that soon.
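
To illustrate how a missing entry in a character rule can produce UNKNOWN, here is a minimal sketch of a character-based candidate filter. It is not Lingua's actual implementation: the class CharRuleFilterSketch, the Lang enum, the filter method, and both rule tables (including which other languages are listed for â) are hypothetical and chosen only for illustration. The assumption is that each rule maps a character to the languages it may indicate, and that candidates not listed for a character seen in the text are discarded.

```java
import java.util.EnumSet;
import java.util.Map;
import java.util.Set;

public class CharRuleFilterSketch {
    // Illustrative subset of languages; not Lingua's Language enum.
    enum Lang { ENGLISH, FRENCH, PORTUGUESE, ROMANIAN }

    // Hypothetical rule table before the fix: U+00E2 (a with circumflex)
    // does not list FRENCH as a possible language.
    static final Map<Character, Set<Lang>> BUGGY_RULES =
        Map.of('\u00E2', EnumSet.of(Lang.PORTUGUESE, Lang.ROMANIAN));

    // Hypothetical rule table after the fix: FRENCH is included.
    static final Map<Character, Set<Lang>> FIXED_RULES =
        Map.of('\u00E2', EnumSet.of(Lang.FRENCH, Lang.PORTUGUESE, Lang.ROMANIAN));

    // Keeps only the candidates that every rule-covered character allows.
    // Characters without a rule do not restrict the candidates.
    static Set<Lang> filter(String text, Set<Lang> candidates,
                            Map<Character, Set<Lang>> rules) {
        Set<Lang> remaining = EnumSet.noneOf(Lang.class);
        remaining.addAll(candidates);
        for (char c : text.toCharArray()) {
            Set<Lang> allowed = rules.get(c);
            if (allowed != null) {
                remaining.retainAll(allowed);
            }
        }
        return remaining;
    }

    public static void main(String[] args) {
        Set<Lang> candidates = EnumSet.of(Lang.ENGLISH, Lang.FRENCH);
        String text = "ch\u00E2teau";
        // With the buggy table no candidate survives, which a detector
        // would have to report as "unknown".
        System.out.println(filter(text, candidates, BUGGY_RULES));
        // With the fixed table only FRENCH survives.
        System.out.println(filter(text, candidates, FIXED_RULES));
    }
}
```

In this sketch, a detector restricted to English and French ends up with an empty candidate set under the buggy table, which matches the UNKNOWN result reported above. Adding French to the rule for â leaves French as the only remaining candidate.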

@pemistahl changed the title from "False negative example" to "French not treated as possible language for character 'â'" on Nov 14, 2021
@pemistahl added this to the Lingua 1.1.1 milestone on Nov 14, 2021
@bdecarne
Author

Nice! Thank you very much!
