Normalized probabilities: only 1.0 in output values #6

aleksandra-miletic · 2022-06-09T10:38:24Z

Hi Adrien,

I am currently testing py3langid and I noticed something strange: the normalized probability values in the output are systematically 1.0. I tested texts of different lengths (1 word to several paragraphs) in different languages. I'm using it with Python. Is this something you noticed before?

Thanks,
Aleksandra

adbar · 2022-06-14T12:38:28Z

Hi @aleksandra-miletic, thanks for your feedback. You're right, this is a bug.

I changed the formula to normalize probabilities along the way and apparently didn't check it properly.

The change in NumPy data type also affects the values slightly, and I never implemented the option to call classify(self, datatype='uint32'), even though the README shows such an example. I'm going to fix this.
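For readers unfamiliar with the normalization step discussed above, here is a minimal sketch of turning per-language log-probability scores into a distribution that sums to 1 (a softmax). It is an illustration under stated assumptions, not the py3langid code: the function name `normalize_log_probs` and the sample scores are hypothetical, and the formula actually restored in the fix may be written differently.

```python
import numpy as np

def normalize_log_probs(log_probs):
    """Softmax over log-probability scores (hypothetical helper, not the library's API)."""
    log_probs = np.asarray(log_probs, dtype=np.float64)
    # Subtract the maximum before exponentiating to avoid underflow/overflow;
    # this shift does not change the resulting distribution.
    shifted = log_probs - log_probs.max()
    probs = np.exp(shifted)
    return probs / probs.sum()

# Made-up scores for three candidate languages (assumption, not real model output).
scores = np.array([-10.0, -10.5, -14.0])
probs = normalize_log_probs(scores)
print(probs, probs.sum())
```

With scores this close together, the top value should stay visibly below 1.0, so a result of exactly 1.0 for every input, as reported in this issue, points to a problem in the normalization rather than to genuine certainty.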

aleksandra-miletic · 2022-06-28T07:10:59Z

Great, thank you!

adbar added the bug label Jun 14, 2022

adbar added a commit that referenced this issue Jun 14, 2022

fix: restore old formula for prob normalization (#6)

bfa6121

adbar closed this as completed Jun 14, 2022
