sim.py - computes Lin similarity of a given noun and all other nouns in a given file
python sim.py INPUTFILE INPUTWORD
sim.py is a simple program which computes the Lin similarity between a given input noun and all other nouns in the given input file, where similarity is defined as
the ratio between the amount of information in the commonality and the amount of information in the description of the two objects.
Dependency triples are extracted from the input file and stored as features of the nouns. The amount of information contained in each feature is then calculated. Pairwise similarity is computed between the input noun and every noun that shares at least one feature with it. The fifty most similar words are displayed in descending order of similarity.
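The steps described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in sim.py: the function names are hypothetical, a feature is taken to be a (relation, head) pair, the information in a feature f is estimated as -log P(f) with P(f) being the share of nouns that carry f, and similarity is computed as sim(A, B) = 2 * I(F(A) ∩ F(B)) / (I(F(A)) + I(F(B))). The script may estimate these quantities differently.

```python
import math
from collections import Counter


def feature_information(features_by_noun):
    """Map each feature to -log P(f).

    Assumption of this sketch: P(f) is the share of nouns carrying f.
    """
    counts = Counter(f for feats in features_by_noun.values() for f in feats)
    total = len(features_by_noun)
    return {f: -math.log(c / total) for f, c in counts.items()}


def lin_similarity(feats_a, feats_b, info):
    """Information in the shared features over information in both descriptions."""
    denom = sum(info[f] for f in feats_a) + sum(info[f] for f in feats_b)
    if denom == 0:
        return 0.0
    return 2 * sum(info[f] for f in feats_a & feats_b) / denom


def most_similar(word, features_by_noun, top_n=50):
    """Rank all nouns that share at least one feature with `word`."""
    info = feature_information(features_by_noun)
    target = features_by_noun[word]
    scores = [
        (other, lin_similarity(target, feats, info))
        for other, feats in features_by_noun.items()
        if other != word and target & feats
    ]
    # Highest similarity first; ties are broken alphabetically.
    scores.sort(key=lambda pair: (-pair[1], pair[0]))
    return scores[:top_n]
```

Here features_by_noun is assumed to be a dict that maps each noun to a set of features. Nouns without any feature in common with the input word are skipped before scoring, which mirrors the restriction described above.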
The input file must be in CoNLL09 format.
The input word must be a noun.
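As a rough illustration of what these two requirements imply, the sketch below reads CoNLL09 input and collects one (relation, head lemma) feature per noun token. It is not taken from sim.py: the function names are hypothetical, and it assumes tab-separated columns with LEMMA, POS, HEAD and DEPREL at zero-based positions 2, 4, 8 and 10, blank lines between sentences, and the STTS tags NN and NE for nouns.

```python
from collections import defaultdict

# Zero-based column positions assumed for CoNLL09 input.
LEMMA, POS, HEAD, DEPREL = 2, 4, 8, 10
# Assumption: nouns carry the STTS tags NN (common) or NE (proper).
NOUN_TAGS = {"NN", "NE"}


def read_sentences(lines):
    """Yield each sentence as a list of column lists; blank lines separate sentences."""
    sentence = []
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip():
            if sentence:
                yield sentence
                sentence = []
        else:
            sentence.append(line.split("\t"))
    if sentence:
        yield sentence


def noun_features(lines):
    """Map each noun lemma to the set of (relation, head lemma) pairs it occurs with."""
    features = defaultdict(set)
    for sentence in read_sentences(lines):
        for token in sentence:
            if token[POS] not in NOUN_TAGS:
                continue
            if not token[HEAD].isdigit():
                continue  # no usable head annotation
            head = int(token[HEAD])
            if head == 0 or head > len(sentence):
                continue  # root token or malformed head index
            head_lemma = sentence[head - 1][LEMMA]
            features[token[LEMMA]].add((token[DEPREL], head_lemma))
    return features
```

With a file handle opened in text mode, noun_features(handle) would return the noun-to-features mapping; an input word that is not among its keys was never tagged as a noun in the file.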
$ python sim.py tiger_release_aug07.corrected.16012013.conll09 Mann
Mensch Frau Teil Regierung Million Prozent Land Experte Zahl Präsident ...
Melanie Tosik, firstname.lastname@example.org