Wordnet lch_similarity method raises a ZeroDivisionError when synset is compared with itself #301
The current implementation of Leacock & Chodorow's WordNet similarity metric does not correctly handle comparing a synset with itself. According to the method's docstring, "If a Synset is compared with itself, the maximum score is returned, which varies depending on the taxonomy depth." Currently, however, it raises a ZeroDivisionError instead.
>>> from nltk.corpus import wordnet
>>> tardy_synsets = wordnet.synsets('tardy')
>>> tardy_synsets
[Synset('belated.s.01')]
>>> tardy_synsets[0].lch_similarity(tardy_synsets[0])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/opt/python/2.7/lib/python2.7/site-packages/nltk/corpus/reader/wordnet.py", line 650, in lch_similarity
    return -math.log((distance + 1) / (2.0 * depth))
ZeroDivisionError: float division by zero
This should be resolved with pull request #421. Strictly speaking, the issue is not with comparing a synset to itself, but with doing so when that synset also has no hypernyms, as is the case with "tardy". The traceback shows a division by 2.0 * depth, so depth must be zero here. With a synset that does have hypernyms, the function behaves appropriately.
With the change in pull request #421, the function returns None when the synset has zero hypernyms.
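To illustrate the behavior described above, here is a minimal sketch of the scoring step with a guard for the zero-depth case. It is not the code from pull request #421: the function name `lch_score` and its two parameters are hypothetical, chosen only to mirror the `distance` and `depth` variables visible in the traceback.

```python
import math


def lch_score(distance, depth):
    """Hypothetical stand-in for the final step of lch_similarity.

    `distance` and `depth` mirror the variables in the traceback;
    this is an illustrative sketch, not the actual NLTK implementation.
    """
    # With depth == 0 the denominator 2.0 * depth is zero, so return
    # None instead of raising ZeroDivisionError.
    if depth == 0:
        return None
    # Same expression as the line shown in the traceback.
    return -math.log((distance + 1) / (2.0 * depth))
```

Under these assumptions, `lch_score(0, 0)` returns `None` rather than raising, while a self-comparison with a nonzero depth (`distance` of 0) still yields a score that grows with `depth`, in line with the docstring's remark about the maximum score.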