Error: ZeroDivisionError: float division by zero #102

Closed
jordimas opened this issue Jan 4, 2023 · 3 comments
Labels
bug Something isn't working

Comments

jordimas commented Jan 4, 2023

Hello.

I get an error when running this code with lingua_language_detector version 1.3.0:

from lingua import LanguageDetectorBuilder

# Read the text that triggers the error
with open('text.txt') as fh:
    text = fh.read()

detector = LanguageDetectorBuilder.from_all_languages().build()
print(text)
result = detector.detect_language_of(text)
print(result)

This is the traceback:

Traceback (most recent call last):
  File "/home/jordi/sc/crux-top-lists-catalan/bug.py", line 9, in <module>
    result = detector.detect_language_of(text)
  File "/home/jordi/.local/lib/python3.10/site-packages/lingua/detector.py", line 272, in detect_language_of
    confidence_values = self.compute_language_confidence_values(text)
  File "/home/jordi/.local/lib/python3.10/site-packages/lingua/detector.py", line 499, in compute_language_confidence_values
    normalized_probability = probability / denominator
ZeroDivisionError: float division by zero

I attached the text file that triggers the problem. It works fine with other texts.
This happens often in a crawling application that I'm testing.

jordimas commented Jan 4, 2023

text.txt

pemistahl added the bug label Jan 4, 2023

pemistahl (Owner) commented

Thank you @jordimas for using my library and for reporting this bug. It is fixed now in Lingua 1.3.1. The cause of the ZeroDivisionError was an internal numerical underflow of probabilities for long texts. Switching from floats to Decimals in the right spot fixed it.
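
To illustrate the kind of underflow described above, here is a minimal sketch. It is not Lingua's actual code: the function names and the log-probability inputs are hypothetical, and it assumes the per-language scores are summed log probabilities that get exponentiated before normalization. For a long text those sums become very negative, `math.exp` returns 0.0 for all of them, and the denominator ends up as zero. `Decimal` has a much wider exponent range, so the same values stay non-zero.

```python
import math
from decimal import Decimal


def normalize_with_floats(log_probs):
    """Hypothetical float-based normalization (illustration only)."""
    # math.exp() silently underflows to 0.0 for very negative inputs
    probs = [math.exp(lp) for lp in log_probs]
    denominator = sum(probs)
    # Raises ZeroDivisionError when every probability underflowed to 0.0
    return [p / denominator for p in probs]


def normalize_with_decimals(log_probs):
    """Same normalization, but exponentiating with Decimal (illustration only)."""
    # Decimal.exp() keeps tiny values such as e**-800 representable
    probs = [Decimal(lp).exp() for lp in log_probs]
    denominator = sum(probs)
    return [float(p / denominator) for p in probs]


if __name__ == "__main__":
    # Assumed summed log probabilities for two languages on a long text
    scores = [-800.0, -802.0]
    try:
        normalize_with_floats(scores)
    except ZeroDivisionError as err:
        print("floats:", err)
    print("decimals:", normalize_with_decimals(scores))
```

The float version fails on these inputs with the same error as in the traceback, while the Decimal version returns confidence values that sum to 1.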

jordimas commented Jan 5, 2023

Thanks so much for the quick fix!
