# Python WordNet using NLTK

**(C) 2017 by [Damir Cavar](http://cavar.me/damir/)**

**Version:** 1.0, January 2017

**License:** [Creative Commons Attribution-ShareAlike 4.0 International License](https://creativecommons.org/licenses/by-sa/4.0/) ([CC BY-SA 4.0](https://creativecommons.org/licenses/by-sa/4.0/))

This tutorial accompanies the discussion of word sense disambiguation and of the various machine learning strategies covered in the textbook [Machine Learning: The Art and Science of Algorithms that Make Sense of Data](https://www.cs.bris.ac.uk/~flach/mlbook/) by [Peter Flach](https://www.cs.bris.ac.uk/~flach/).

This tutorial was developed as part of my course material for the course Machine Learning for Computational Linguistics in the [Computational Linguistics Program](http://cl.indiana.edu/) of the [Department of Linguistics](http://www.indiana.edu/~lingdept/) at [Indiana University](https://www.indiana.edu/).

## Using WordNet

Importing *wordnet* from the NLTK module:

In [None]:
from nltk.corpus import wordnet

Asking for a synset in WordNet:

In [None]:
wordnet.synsets('dog')

A synset is identified by a 3-part name of the form `word.pos.nn`: the word, its part-of-speech tag, and a two-digit sense number. Except for the last synset, all synsets of *dog* above are nouns with the *part-of-speech* tag *n*. We can pick the synsets with a specific PoS:

In [None]:
wordnet.synsets('dog', pos=wordnet.VERB)

Besides VERB, the other parts of speech are NOUN, ADJ, and ADV.
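
The 3-part naming scheme can be illustrated without touching the corpus at all. The following is a minimal sketch in plain Python; `parse_synset_name` and `POS_TAGS` are hypothetical names introduced here for illustration and are not part of NLTK. The single-letter tags are the values behind the NLTK constants (`wordnet.NOUN` is `'n'`, `wordnet.VERB` is `'v'`, `wordnet.ADJ` is `'a'`, `wordnet.ADV` is `'r'`, and satellite adjectives use `'s'`).

```python
# Hypothetical helper, not part of NLTK: split a synset name into its parts.
POS_TAGS = {
    "n": "noun",
    "v": "verb",
    "a": "adjective",
    "s": "adjective satellite",
    "r": "adverb",
}


def parse_synset_name(name):
    """Split a name of the form 'word.pos.nn' into (word, pos, sense_number)."""
    # Split from the right so that a word containing dots stays in one piece.
    word, pos, number = name.rsplit(".", 2)
    if pos not in POS_TAGS:
        raise ValueError(f"unknown part-of-speech tag: {pos!r}")
    return word, pos, int(number)


print(parse_synset_name("dog.n.01"))    # ('dog', 'n', 1)
print(parse_synset_name("chase.v.01"))  # ('chase', 'v', 1)
```

In practice NLTK resolves these names itself; the sketch only makes the structure of the name explicit.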

We can select a specific synset from the list using the full 3-part name notation:

In [None]:
wordnet.synset('dog.n.01')

For this particular synset we can fetch the definition:

In [None]:
print(wordnet.synset('dog.n.01').definition())

Synsets might also have examples. We can count the number of examples for this particular synset this way:

In [None]:
len(wordnet.synset('dog.n.01').examples())

We can print out the first example using:

In [None]:
print(wordnet.synset('dog.n.01').examples()[0])
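
The calls above can be combined into a small summary function. This is only a sketch: `describe` and `demo` are hypothetical names, and `describe` assumes nothing beyond the `name()`, `definition()` and `examples()` methods already used in this tutorial. The import sits inside `demo` so that `describe` can be tried on any object offering these three methods.

```python
def describe(synset):
    """Return a short text summary of a synset: name, definition, examples."""
    lines = [f"{synset.name()}: {synset.definition()}"]
    for example in synset.examples():
        lines.append(f"  e.g. {example}")
    return "\n".join(lines)


def demo(word="dog"):
    """Print a summary for every synset of the given word."""
    # Needs NLTK and the WordNet data, e.g. after nltk.download('wordnet').
    from nltk.corpus import wordnet

    for synset in wordnet.synsets(word):
        print(describe(synset))
```

Calling `demo()` prints one block per synset of *dog*; synsets without examples simply show the definition line.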

We can also output the lemmata for a specific synset:

In [None]:
wordnet.synset('dog.n.01').lemmas()

Using a list comprehension, we can reduce this list to just the lemma names:

In [None]:
[lemma.name() for lemma in wordnet.synset('dog.n.01').lemmas()]

We can also reference a specific lemma by appending its name to the synset name:

In [None]:
wordnet.lemma('dog.n.01.dog')
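
Since the lemmas of a synset are its synonymous word forms, collecting the lemma names over all synsets of a word gives a rough list of synonym candidates. The sketch below assumes only the `lemma_names()` method of NLTK synsets; the function name `synonym_candidates` is hypothetical. Multiword lemma names in WordNet use underscores (for example `domestic_dog`), which the sketch turns into spaces.

```python
def synonym_candidates(synsets):
    """Collect distinct lemma names from an iterable of synsets, in first-seen order."""
    names = []
    for synset in synsets:
        for name in synset.lemma_names():
            readable = name.replace("_", " ")  # 'domestic_dog' -> 'domestic dog'
            if readable not in names:
                names.append(readable)
    return names
```

With the WordNet data installed, a call such as `synonym_candidates(wordnet.synsets('dog', pos=wordnet.NOUN))` would gather the names across all noun senses. Note that the result mixes senses, so not every entry is a synonym of every other.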

### Multilingual Functions

The current version of WordNet in NLTK is multilingual. To see which languages are supported, use this command:

In [None]:
sorted(wordnet.langs())

We can ask for the Polish lemma names of a synset:

In [None]:
wordnet.synset('dog.n.01').lemma_names('pol')

We can also look up the lemmas for a word in another language, here the Italian word *cane*; each returned lemma carries the name of the English synset it belongs to:

In [None]:
wordnet.lemmas('cane', lang='ita')
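
To compare several languages at once, the `lemma_names` call shown above can be wrapped in a dictionary comprehension. This is a sketch under assumptions: `translations` is a hypothetical helper, and the language codes must be ones reported by `wordnet.langs()`. In recent NLTK releases the multilingual data appears to be a separate download (`nltk.download('omw-1.4')`), so the usage line in the comment may need that step first.

```python
def translations(synset, langs):
    """Map each language code to the synset's lemma names in that language.

    Languages for which the synset has no lemma names are left out.
    """
    result = {}
    for lang in langs:
        names = synset.lemma_names(lang)
        if names:
            result[lang] = list(names)
    return result


# Usage, assuming the WordNet and Open Multilingual WordNet data are installed:
#   from nltk.corpus import wordnet
#   translations(wordnet.synset('dog.n.01'), ['eng', 'ita', 'pol'])
```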

### Hypernyms, hyponyms, holonyms, antonyms

In [None]:
dog = wordnet.synset('dog.n.01')

In [None]:
dog.hypernyms()

In [None]:
dog.hyponyms()

In [None]:
dog.member_holonyms()

In [None]:
dog.root_hypernyms()

In [None]:
wordnet.synset('dog.n.01').lowest_common_hypernyms(wordnet.synset('cat.n.01'))
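
The cells above query the hierarchy one relation at a time. To see how a synset connects to its root, one can follow `hypernyms()` repeatedly. The following is a minimal sketch that always takes the first hypernym; `hypernym_chain` is a hypothetical name. NLTK synsets also provide `hypernym_paths()`, which returns all paths to the root and is likely the better choice in real code.

```python
def hypernym_chain(synset):
    """Return synset names from the given synset up to a root.

    Only the first hypernym is followed at each step, so for synsets with
    several hypernyms this is one path among possibly many.
    """
    chain = [synset.name()]
    current = synset
    while current.hypernyms():
        current = current.hypernyms()[0]
        if current.name() in chain:  # defensive guard against cycles
            break
        chain.append(current.name())
    return chain


# Usage, assuming the WordNet data is installed:
#   from nltk.corpus import wordnet
#   print(" -> ".join(hypernym_chain(wordnet.synset('dog.n.01'))))
```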

In [None]:
good = wordnet.synset('good.a.01')

In [None]:
good.lemmas()[0].antonyms()
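
Antonymy in WordNet is a relation between lemmas rather than between synsets, which is why the cell above goes through `lemmas()[0]`. To avoid depending on the position of a lemma in the list, the antonyms of all lemmas of a synset can be collected. This is a sketch with a hypothetical function name, `antonym_names`, using only the `lemmas()`, `antonyms()` and `name()` methods shown in this tutorial.

```python
def antonym_names(synset):
    """Collect the names of the antonyms of all lemmas of a synset, sorted."""
    names = set()
    for lemma in synset.lemmas():
        for antonym in lemma.antonyms():
            names.add(antonym.name())
    return sorted(names)


# Usage, assuming the WordNet data is installed:
#   from nltk.corpus import wordnet
#   antonym_names(wordnet.synset('good.a.01'))
```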