# Semantic Word Networks: The English WordNet
See http://wordnetweb.princeton.edu/perl/webwn for online lookup.

In [None]:
import nltk
from nltk.corpus import wordnet as wn

With `as` you can bind a short alias to an imported object or module. Here, `wn` is the alias for the WordNet corpus reader.

## Find synsets (synonym sets) of a word
How many different meanings (=synsets) does the token "hearts" have?

In [None]:
wn.synsets('hearts')

The `Synset` class **abstracts** the relevant **data** and **methods** of a synset.

OK, but what exactly is a synset as a data structure?

In [4]:
sense1 = wn.synsets('hearts')[0]
help(type(sense1))

Help on class Synset in module nltk.corpus.reader.wordnet:

class Synset(_WordNetObject)
 |  Synset(wordnet_corpus_reader)
 |  
 |  Create a Synset from a "<lemma>.<pos>.<number>" string where:
 |  <lemma> is the word's morphological stem
 |  <pos> is one of the module attributes ADJ, ADJ_SAT, ADV, NOUN or VERB
 |  <number> is the sense number, counting from 0.
 |  
 |  Synset attributes, accessible via methods with the same name:
 |  
 |  - name: The canonical name of this synset, formed using the first lemma
 |    of this synset. Note that this may be different from the name
 |    passed to the constructor if that string used a different lemma to
 |    identify the synset.
 |  - pos: The synset's part of speech, matching one of the module level
 |    attributes ADJ, ADJ_SAT, ADV, NOUN or VERB.
 |  - lemmas: A list of the Lemma objects for this synset.
 |  - definition: The definition for this synset.
 |  - examples: A list of example strings for this synset.
 |  - offset: The offset 
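
The help text above describes synset names as `<lemma>.<pos>.<number>` strings. As a minimal sketch of that naming scheme, the following pure-Python helper splits such a name into its three parts without touching NLTK. The names `SynsetName` and `parse_synset_name` are hypothetical and only serve this illustration. Note that the names printed in this notebook (for example `hearts.n.01`) use `01` for the first sense.

```python
from typing import NamedTuple


class SynsetName(NamedTuple):
    """The three parts of a "<lemma>.<pos>.<number>" synset name."""
    lemma: str
    pos: str
    number: int


def parse_synset_name(name: str) -> SynsetName:
    # Split from the right so that a lemma containing dots stays in one piece.
    lemma, pos, number = name.rsplit(".", 2)
    return SynsetName(lemma, pos, int(number))


for name in ["hearts.n.01", "soft_spot.n.02", "heart.n.10"]:
    print(name, "->", parse_synset_name(name))
```

Splitting from the right is a design choice: the part of speech and the sense number never contain dots, while the lemma might.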

But what do these synsets actually mean? The method `definition()` returns a human-readable gloss for each synset.
As expected, the search word is not always the canonical lemma of a synset: `center.n.01` and `affection.n.01`, for example, are also returned for "hearts".

In [5]:
for synset in wn.synsets('hearts'):
    print(synset,':',synset.definition())

Synset('hearts.n.01') : a form of whist in which players avoid winning tricks containing hearts or the queen of spades
Synset('heart.n.01') : the locus of feelings and intuitions
Synset('heart.n.02') : the hollow muscular organ located behind the sternum and between the lungs; its rhythmic contractions move the blood through the body
Synset('heart.n.03') : the courage to carry on
Synset('center.n.01') : an area that is approximately central within some larger region
Synset('kernel.n.03') : the choicest or most essential or most vital part of some idea or experience
Synset('heart.n.06') : an inclination or tendency of a certain kind
Synset('heart.n.07') : a plane figure with rounded sides curving inward at the top and intersecting at the bottom; conventionally used on playing cards and valentines
Synset('heart.n.08') : a firm rather dry variety meat (usually beef or veal)
Synset('affection.n.01') : a positive feeling of liking
Synset('heart.n.10') : a playing card in the major suit that

The type `Synset` is not to be confused with the lookup method `synsets()`.

In [6]:
help(wn.synsets)

Help on method synsets in module nltk.corpus.reader.wordnet:

synsets(lemma, pos=None, lang='eng', check_exceptions=True) method of nltk.corpus.reader.wordnet.WordNetCorpusReader instance
    Load all synsets with a given lemma and part of speech tag.
    If no pos is specified, all synsets for all parts of speech
    will be loaded.
    If lang is specified, all the synsets associated with the lemma name
    of that language will be returned.



## Calculate all synonym lemmas of a synset
Note: Synonyms are defined on the level of meanings, not words!

In [7]:
wn.synset('affection.n.01').lemma_names()

['affection',
 'affectionateness',
 'fondness',
 'tenderness',
 'heart',
 'warmness',
 'warmheartedness',
 'philia']
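
To make the note about meanings concrete, here is a small pure-Python sketch that does not call NLTK. It uses hand-copied lemma lists: the list for `affection.n.01` comes from the output above, while the lists for `center.n.01` and `kernel.n.03` are deliberately partial and contain only the canonical lemma plus `heart`. The function name `synonyms_by_sense` is hypothetical.

```python
# Lemma names per synset. Only affection.n.01 is complete (copied from the
# output above); the other two lists are partial and for illustration only.
SYNSET_LEMMAS = {
    "affection.n.01": [
        "affection", "affectionateness", "fondness", "tenderness",
        "heart", "warmness", "warmheartedness", "philia",
    ],
    "center.n.01": ["center", "heart"],  # partial list
    "kernel.n.03": ["kernel", "heart"],  # partial list
}


def synonyms_by_sense(word, synset_lemmas):
    """Map each synset containing `word` to the other lemmas of that synset."""
    result = {}
    for synset_name, lemmas in synset_lemmas.items():
        if word in lemmas:
            result[synset_name] = [lemma for lemma in lemmas if lemma != word]
    return result


for synset_name, synonyms in synonyms_by_sense("heart", SYNSET_LEMMAS).items():
    print(synset_name, ":", synonyms)
```

In this toy data, "fondness" is a synonym of "heart" only in the `affection.n.01` sense, not in the `center.n.01` sense, which is why synonym lists are attached to synsets rather than to words.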

What are lemmas technically as a data structure?

In [8]:
wn.synset('affection.n.01').lemmas()

[Lemma('affection.n.01.affection'),
 Lemma('affection.n.01.affectionateness'),
 Lemma('affection.n.01.fondness'),
 Lemma('affection.n.01.tenderness'),
 Lemma('affection.n.01.heart'),
 Lemma('affection.n.01.warmness'),
 Lemma('affection.n.01.warmheartedness'),
 Lemma('affection.n.01.philia')]

In [9]:
lemma1 = wn.synset('affection.n.01').lemmas()[0]
lemma1

Lemma('affection.n.01.affection')

In [10]:
help(type(lemma1))

Help on class Lemma in module nltk.corpus.reader.wordnet:

class Lemma(_WordNetObject)
 |  Lemma(wordnet_corpus_reader, synset, name, lexname_index, lex_id, syntactic_marker)
 |  
 |  The lexical entry for a single morphological form of a
 |  sense-disambiguated word.
 |  
 |  Create a Lemma from a "<word>.<pos>.<number>.<lemma>" string where:
 |  <word> is the morphological stem identifying the synset
 |  <pos> is one of the module attributes ADJ, ADJ_SAT, ADV, NOUN or VERB
 |  <number> is the sense number, counting from 0.
 |  <lemma> is the morphological form of interest
 |  
 |  Note that <word> and <lemma> can be different, e.g. the Synset
 |  'salt.n.03' has the Lemmas 'salt.n.03.salt', 'salt.n.03.saltiness' and
 |  'salt.n.03.salinity'.
 |  
 |  Lemma attributes, accessible via methods with the same name:
 |  
 |  - name: The canonical name of this lemma.
 |  - synset: The synset that this lemma belongs to.
 |  - syntactic_marker: For adjectives, the WordNet string identifying t

## Calculate all hyponyms of a lemma
Remember that a lemma is always bound to a synset, so `lemma1.synset()` gives us the synset whose hyponyms we list.

In [11]:
lemma1.synset().hyponyms()

[Synset('attachment.n.01'),
 Synset('protectiveness.n.01'),
 Synset('regard.n.06'),
 Synset('soft_spot.n.02')]

## Calculate all definitions of all hyponyms of a lemma

In [12]:
for synset in lemma1.synset().hyponyms():
    print(synset, ':',synset.definition())

Synset('attachment.n.01') : a feeling of affection for a person or an institution
Synset('protectiveness.n.01') : a feeling of protective affection
Synset('regard.n.06') : a feeling of friendship and esteem
Synset('soft_spot.n.02') : a sentimental affection
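
The hyponym relation can be pictured as a plain mapping from a synset name to the names of its direct hyponyms. The sketch below is a toy model in pure Python, not NLTK's implementation: the mapping contains only the four hyponyms printed above, and the helper names `all_hyponyms` and `hypernym_index` are hypothetical. `all_hyponyms` would also follow deeper levels if the dictionary contained them.

```python
# Direct hyponyms copied from the cell output above; deeper levels are omitted.
HYPONYMS = {
    "affection.n.01": [
        "attachment.n.01",
        "protectiveness.n.01",
        "regard.n.06",
        "soft_spot.n.02",
    ],
}


def all_hyponyms(name, hyponyms):
    """Collect direct and indirect hyponyms of `name` in depth-first order."""
    found = []
    seen = set()
    stack = list(reversed(hyponyms.get(name, [])))
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        found.append(current)
        stack.extend(reversed(hyponyms.get(current, [])))
    return found


def hypernym_index(hyponyms):
    """Invert the relation: map each hyponym to its direct hypernyms."""
    index = {}
    for parent, children in hyponyms.items():
        for child in children:
            index.setdefault(child, []).append(parent)
    return index


print(all_hyponyms("affection.n.01", HYPONYMS))
print(hypernym_index(HYPONYMS)["regard.n.06"])
```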


## Some semantic relations are defined at the lemma level
Examples are antonyms and pertainyms.


In [13]:
smart_lemma1 = wn.lemmas('smart',pos='a')[0]
smart_lemma1.antonyms()

[Lemma('stupid.a.01.stupid')]

## More advanced methods
The following cell compares every noun sense of two words. Both scores are derived from paths in the word network, and higher values mean more closely related senses.

In [14]:
word1 = 'computer'
word2 = 'mouse'
for synset1 in wn.synsets(word1, pos="n"):
    for synset2 in wn.synsets(word2, pos="n"):
        print('SENSE 1:',synset1.definition())
        print('SENSE 2:',synset2.definition())
        print('SIMILARITY:',synset1.path_similarity(synset2,verbose=True))
        print('WUP_SIMILARITY',synset1.wup_similarity(synset2))
        print()

SENSE 1: a machine for performing calculations automatically
SENSE 2: any of numerous small rodents typically resembling diminutive rats having pointed snouts and small ears on elongated bodies with slender usually hairless tails
SIMILARITY: 0.06666666666666667
WUP_SIMILARITY 0.36363636363636365

SENSE 1: a machine for performing calculations automatically
SENSE 2: a swollen bruise caused by a blow to the eye
SIMILARITY: 0.05263157894736842
WUP_SIMILARITY 0.1

SENSE 1: a machine for performing calculations automatically
SENSE 2: person who is quiet or timid
SIMILARITY: 0.1
WUP_SIMILARITY 0.47058823529411764

SENSE 1: a machine for performing calculations automatically
SENSE 2: a hand-operated electronic device that controls the coordinates of a cursor on your computer screen as you move it around on a pad; on the bottom of the device is a ball that rolls on the surface of the pad
SIMILARITY: 0.2
WUP_SIMILARITY 0.7777777777777778

SENSE 1: an expert at calculation (or at operating calcu
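
To see where numbers such as 0.2 and 0.7777777777777778 can come from, here is a toy re-implementation of both measures in pure Python. It rests on two assumptions that should be checked against the NLTK documentation: path similarity is taken as 1 / (d + 1), where d is the number of edges on the shortest path between two senses, and Wu-Palmer similarity as 2 · depth(lcs) / (depth(a) + depth(b)), where lcs is the deepest shared ancestor and the root has depth 1. The taxonomy is hand-built for illustration and is not read from WordNet. Real synsets can have several hypernyms, so NLTK's own code is more involved.

```python
# Hand-built toy taxonomy (an assumption for illustration): child -> hypernym.
HYPERNYM = {
    "physical_entity": "entity",
    "object": "physical_entity",
    "whole": "object",
    "artifact": "whole",
    "instrumentality": "artifact",
    "device": "instrumentality",
    "machine": "device",
    "computer": "machine",
    "electronic_device": "device",
    "mouse": "electronic_device",
}


def ancestors(node):
    """Return the chain from `node` up to the root, starting with `node`."""
    chain = [node]
    while chain[-1] in HYPERNYM:
        chain.append(HYPERNYM[chain[-1]])
    return chain


def depth(node):
    """Number of nodes on the path from the root to `node` (root = 1)."""
    return len(ancestors(node))


def lowest_common_subsumer(a, b):
    """Deepest node that is an ancestor of both `a` and `b`, or None."""
    seen = set(ancestors(a))
    for node in ancestors(b):
        if node in seen:
            return node
    return None


def path_similarity(a, b):
    lcs = lowest_common_subsumer(a, b)
    if lcs is None:
        return None
    # In a tree, the shortest path between two nodes runs through their LCS.
    distance = (depth(a) - depth(lcs)) + (depth(b) - depth(lcs))
    return 1 / (distance + 1)


def wup_similarity(a, b):
    lcs = lowest_common_subsumer(a, b)
    if lcs is None:
        return None
    return 2 * depth(lcs) / (depth(a) + depth(b))


print("SIMILARITY:", path_similarity("computer", "mouse"))
print("WUP_SIMILARITY", wup_similarity("computer", "mouse"))
```

In this toy tree, "computer" and "mouse" are four edges apart and share the ancestor "device" at depth 7, which reproduces the two values printed above for the machine and pointing-device senses.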

# The Open Multilingual WordNet
Languages are selected with 3-letter ISO 639 codes, e.g. `ita` for Italian.

In [15]:
for synset in wn.synsets("cavallo", pos="n", lang="ita"):
    print(synset.definition())

solid-hoofed herbivorous quadruped domesticated since prehistoric times
a padded gymnastic apparatus on legs
a chessman shaped to resemble the head of a horse; can move two squares horizontally and one vertically (or vice versa)
a unit of power equal to 746 watts


In [None]:
sorted(wn.langs())

More at http://www.nltk.org/howto/wordnet.html