# Wordnet

The Estonian WordNet API provides a way to query the Estonian WordNet. WordNet is a network of synsets: each synset is a set of synonymous words and is connected to other synsets via relations.

First, let's import the module and create a WordNet object:

In [1]:
from estnltk.wordnet import Wordnet

In [2]:
wn = Wordnet()

## Synsets

The most common use of the API is to query synsets, which can be done in several ways. The simplest is to retrieve all synsets that match some condition. To retrieve every synset, call all_synsets() without arguments:

In [3]:
wn.all_synsets()

["Synset('0-tüüpi grammatika.n.01')",
 "Synset('10.n.01')",
 "Synset('100..n.01')",
 "Synset('1000..n.01')",
 "Synset('10000.n.01')",
 "Synset('1000000..n.01')",
 "Synset('1000Base-T.n.01')",
 "Synset('11.n.01')",
 "Synset('12.n.01')",
 "Synset('12..n.01')",
 "Synset('13.n.01')",
 "Synset('13..n.01')",
 "Synset('13-silbik.n.01')",
 "Synset('14.n.01')",
 "Synset('14..n.01')",
 "Synset('15.n.01')",
 "Synset('15..n.01')",
 "Synset('16.n.01')",
 "Synset('16..n.01')",
 "Synset('16-bitine programm.n.01')",
 "Synset('17.n.01')",
 "Synset('17..n.01')",
 "Synset('18.n.01')",
 "Synset('18..n.01')",
 "Synset('19.n.01')",
 "Synset('19..n.01')",
 "Synset('1G.n.01')",
 "Synset('1. jõulupüha.n.01')",
 "Synset('1-tüüpi grammatika.n.01')",
 "Synset('20.n.01')",
 "Synset('20..n.01')",
 "Synset('2.5G.n.01')",
 "Synset('2-tüüpi grammatika.n.01')",
 "Synset('30.n.01')",
 "Synset('30..n.01')",
 "Synset('32-bitine programm.n.01')",
 "Synset('3D-film.n.01')",
 "Synset('3D-pilt.n.01')",
 "Synset('3-tüüpi gramm

This returns every synset in WordNet. To restrict the result to one part of speech, pass a pos argument:

In [4]:
wn.all_synsets('v')

["Synset('aadeldama.v.01')",
 "Synset('aaderdama.v.01')",
 "Synset('ääristama.v.01')",
 "Synset('ääristama.v.02')",
 "Synset('äärmustama.v.01')",
 "Synset('aasama.v.01')",
 "Synset('aatlema.v.01')",
 "Synset('abielluma.v.01')",
 "Synset('abielu sõlmima.v.01')",
 "Synset('abistama.v.01')",
 "Synset('abistama.v.02')",
 "Synset('ablakteerima.v.01')",
 "Synset('ablakteerima.v.02')",
 "Synset('ablastuma.v.01')",
 "Synset('aboneerima.v.01')",
 "Synset('aborteerima.v.01')",
 "Synset('aborteeruma.v.01')",
 "Synset('abortima.v.02')",
 "Synset('abortima.v.03')",
 "Synset('absolutiseerima.v.01')",
 "Synset('abstraheerima.v.01')",
 "Synset('abstraheeruma.v.01')",
 "Synset('adapteerima.v.01')",
 "Synset('adapteerima.v.02')",
 "Synset('adjektiveeruma.v.01')",
 "Synset('adresseerima.v.01')",
 "Synset('adsorbeerima.v.01')",
 "Synset('adsorbeeruma.v.01')",
 "Synset('aega võtma.v.01')",
 "Synset('aega võtma.v.02')",
 "Synset('aeglustama.v.01')",
 "Synset('aeglustuma.v.01')",
 "Synset('aeguma.v.02')",
 "

This returns all synsets whose part of speech is “verb”. We can also query synsets by providing a lemma:

In [5]:
wn['laulma']

["Synset('laulma.v.01')", "Synset('laulma.v.02')"]

or by providing both a lemma and a pos (the two forms below are equivalent, since both pass the same tuple as the key):

In [6]:
wn['laulma', 'v']

["Synset('laulma.v.01')", "Synset('laulma.v.02')"]

In [7]:
wn[('laulma', 'v')]

["Synset('laulma.v.01')", "Synset('laulma.v.02')"]

The queries above return a list of synsets. It is also possible to request a single synset by its position in that list. For example, to get only the second synset for the lemma 'laulma', pass the position together with the lemma. The position is counted from 1 here, and the result is a single synset object rather than a list:

In [8]:
wn['laulma', 2]

"Synset('laulma.v.02')"

It's also possible to retrieve a synset's details, like name and pos:

In [9]:
synset = wn['laulma'][0]
print(synset.name)
print(synset.pos)

laulma.v.01
v
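
The name printed above appears to follow a lemma.pos.sense pattern. As a minimal sketch, assuming that pattern holds for every synset (it is inferred from the outputs shown here, not from the API documentation), a name can be split into its parts with plain Python. The helper split_synset_name is hypothetical and not part of EstNLTK:

```python
def split_synset_name(name):
    """Split a name such as 'laulma.v.01' into (lemma, pos, sense).

    Hypothetical helper; assumes the 'lemma.pos.sense' pattern seen in
    the outputs above. Splitting from the right keeps lemmas that
    themselves contain dots (e.g. '100.') intact.
    """
    lemma, pos, sense = name.rsplit(".", 2)
    return lemma, pos, int(sense)


print(split_synset_name("laulma.v.01"))
print(split_synset_name("100..n.01"))
print(split_synset_name("1. jõulupüha.n.01"))
```

Splitting from the right matters because several lemmas in the listing above contain dots or spaces of their own.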


## Relations

We can also query related synsets. Some common relations have dedicated methods:

In [10]:
synset.hypernyms()

["Synset('häälitsema.v.01')"]

In [11]:
synset.hyponyms()

["Synset('tremoleerima.v.01')",
 "Synset('aiduraidutama.v.01')",
 "Synset('trallitama.v.01')",
 "Synset('joodeldama.v.01')",
 "Synset('leelotama.v.01')",
 "Synset('kõõrutama.v.02')",
 "Synset('kaasitama.v.01')",
 "Synset('joiguma.v.01')",
 "Synset('helletama.v.01')",
 "Synset('ümisema.v.02')",
 "Synset('üles laulma.v.01')"]

In [12]:
synset.holonyms()

[]

In [13]:
synset.meronyms()

[]

In [14]:
synset.member_holonyms()

[]

Other, more specific relations can be queried with the general-purpose method get_related_synset, which takes the relation name as its argument:

In [15]:
synset.get_related_synset("involved_agent")

["Synset('laulja.n.01')"]

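We can also collect every synset reachable from a synset by following a given relation repeatedly. With the "hyponym" relation, this yields all of its descendants, that is, its direct and indirect hyponyms: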

In [16]:
wn["jalats"][0].closure("hyponym")

["Synset('soome suss.n.01')",
 "Synset('papu.n.01')",
 "Synset('vahetusjalats.n.01')",
 "Synset('plätu.n.01')",
 "Synset('tanksaabas.n.01')",
 "Synset('vildik.n.01')",
 "Synset('kirsa.n.01')",
 "Synset('kroomsaabas.n.01')",
 "Synset('kummik.n.01')",
 "Synset('suss.n.02')",
 "Synset('sõidusaabas.n.01')",
 "Synset('ratsasaabas.n.01')",
 "Synset('kamass.n.01')",
 "Synset('unta.n.01')",
 "Synset('sukksaabas.n.01')",
 "Synset('patinka.n.01')",
 "Synset('tohusaabas.n.01')",
 "Synset('venekas.n.02')",
 "Synset('kauboisaabas.n.01')",
 "Synset('lumesaabas.n.01')",
 "Synset('matkasaabas.n.01')",
 "Synset('alpinistisaabas.n.01')",
 "Synset('mootorratturisaabas.n.01')",
 "Synset('seitsmepenikoormasaapad.n.01')",
 "Synset('stileto.n.01')",
 "Synset('rihmik.n.01')",
 "Synset('kõpsking.n.01')",
 "Synset('pätt 3.n.01')",
 "Synset('lapseking.n.01')",
 "Synset('hiiuranti king.n.01')",
 "Synset('libik.n.01')",
 "Synset('puukas.n.01')",
 "Synset('sandaal.n.01')",
 "Synset('botik.n.01')",
 "Synset('kaloss.

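To make the idea of following a relation repeatedly concrete, here is a minimal sketch of a transitive closure over a toy relation graph. It is not EstNLTK's implementation: the dictionary toy_hyponyms and the function closure are invented for illustration, the graph contains made-up English entries rather than Estonian WordNet data, and the breadth-first visiting order is an assumption.

```python
from collections import deque

# Toy hyponym graph, invented for illustration (not Estonian WordNet data).
toy_hyponyms = {
    "footwear": ["boot", "shoe"],
    "boot": ["rubber boot", "riding boot"],
    "shoe": ["sandal"],
}


def closure(start, relation):
    """Return every node reachable from start by following relation."""
    seen = []
    queue = deque(relation.get(start, []))
    while queue:
        node = queue.popleft()
        if node in seen:
            continue  # guard against cycles and repeated nodes
        seen.append(node)
        queue.extend(relation.get(node, []))
    return seen


print(closure("footwear", toy_hyponyms))
```

The result contains direct hyponyms ("boot", "shoe") as well as indirect ones ("sandal"), which mirrors the mix of general and specific footwear in the output above.
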
## Similarities

The similarity between two synsets can be measured in several ways. The API provides path, Leacock-Chodorow and Wu-Palmer similarity:

In [17]:
synset = wn['aprill'][0]
target_synset = wn['mai'][0]

In [18]:
synset.path_similarity(target_synset)

0.3333333333333333

In [19]:
synset.lch_similarity(target_synset)

2.159484249353372

In [20]:
synset.wup_similarity(target_synset)

0.8
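
The text does not state which formulas are used, but the three scores above are numerically consistent with the definitions used in NLTK's WordNet interface. The sketch below assumes those definitions, a shortest path of 2 edges between the two synsets, a taxonomy depth of 13, and depths of 5 for both synsets and 4 for their lowest common hypernym. All of these numbers are assumptions chosen because they reproduce the values shown, not facts read from Estonian WordNet.

```python
import math


def path_similarity(distance):
    """1 / (shortest path length in edges + 1)."""
    return 1.0 / (distance + 1)


def lch_similarity(distance, taxonomy_depth):
    """Leacock-Chodorow: -log((distance + 1) / (2 * taxonomy depth))."""
    return -math.log((distance + 1) / (2.0 * taxonomy_depth))


def wup_similarity(depth_a, depth_b, depth_lcs):
    """Wu-Palmer: 2 * depth(LCS) / (depth(a) + depth(b))."""
    return 2.0 * depth_lcs / (depth_a + depth_b)


# Assumed values (see the text above); not read from Estonian WordNet.
print(path_similarity(2))
print(lch_similarity(2, 13))
print(wup_similarity(5, 5, 4))
```

Note that the three measures live on different scales: under these definitions path and Wu-Palmer similarity stay between 0 and 1, while Leacock-Chodorow is a logarithm and can exceed 1.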

We can also find the lowest (closest) common ancestor of two synsets in the hypernym hierarchy:

In [21]:
synset.lowest_common_hypernyms(target_synset)

["Synset('kuu.n.01')"]
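
As a rough illustration of what a lowest common hypernym is, the sketch below walks up a toy hypernym map. It is not EstNLTK's implementation. Only the fact that 'kuu' is the common hypernym of 'aprill' and 'mai' comes from the output above; the placeholder root "TOP", the extra entry 'esmaspäev', and the one-parent-per-node simplification are assumptions made for this example. Real synsets may have several hypernyms, which is presumably why the API returns a list.

```python
# Toy hypernym map (child -> parent). Only 'kuu' as the common hypernym of
# 'aprill' and 'mai' comes from the output above; the rest is assumed.
toy_hypernym = {
    "aprill": "kuu",
    "mai": "kuu",
    "kuu": "TOP",        # placeholder root, invented for this sketch
    "esmaspäev": "TOP",  # invented sibling branch
}


def ancestors(node, parent):
    """Return node followed by its chain of hypernyms up to the root."""
    chain = [node]
    while chain[-1] in parent:
        chain.append(parent[chain[-1]])
    return chain


def lowest_common_hypernym(a, b, parent):
    """Return the first node on a's chain that also lies on b's chain."""
    b_chain = set(ancestors(b, parent))
    for node in ancestors(a, parent):
        if node in b_chain:
            return node
    return None


print(lowest_common_hypernym("aprill", "mai", toy_hypernym))
print(lowest_common_hypernym("aprill", "esmaspäev", toy_hypernym))
```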