
In [1]:
# WordNet

# WordNet is a lexical database of the English language developed at Princeton University.
# NLTK provides access to it as one of its corpora.
# You can use WordNet to look up word meanings, synonyms, antonyms, and more.

In [2]:
from nltk.corpus import wordnet
import nltk
nltk.download('wordnet')

[nltk_data] Downloading package wordnet to /root/nltk_data...
[nltk_data]   Package wordnet is already up-to-date!


True

In [3]:
# Next, we look up the synsets for the word "program".

# In metadata management, a synset is a group of data elements that are considered
# semantically equivalent for the purposes of information retrieval.
# These data elements are frequently found in different metadata registries.
# Because the terms in a group are treated as equivalent, a metadata registry stores
# the synonyms at a central location called the preferred data element.
# In WordNet, a synset (synonym set) is a set of one or more synonyms
# that are interchangeable in some context without changing the truth value of the
# proposition in which they are embedded.
print("Group of Synset")
syns = wordnet.synsets("program")
print(syns)
print("\n")

print("An example of a synset:")
print(syns[0].name())

Group of Synset
[Synset('plan.n.01'), Synset('program.n.02'), Synset('broadcast.n.02'), Synset('platform.n.02'), Synset('program.n.05'), Synset('course_of_study.n.01'), Synset('program.n.07'), Synset('program.n.08'), Synset('program.v.01'), Synset('program.v.02')]


An example of a synset:
plan.n.01
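
The synset names printed above follow a `lemma.pos.sense_number` pattern, where the part-of-speech code is `n`, `v`, `a`, `s`, or `r`. As a minimal sketch, the names can be taken apart with plain string handling. The helper `parse_synset_name` and the `POS_NAMES` table are illustrative names introduced here and are not part of NLTK.

```python
# Hypothetical helper (not an NLTK function): split a synset name into its parts.
POS_NAMES = {
    "n": "noun",
    "v": "verb",
    "a": "adjective",
    "s": "adjective satellite",
    "r": "adverb",
}

def parse_synset_name(name):
    """Split e.g. 'plan.n.01' into ('plan', 'n', 1)."""
    # Split from the right so a lemma that itself contains dots stays intact.
    lemma, pos, sense = name.rsplit(".", 2)
    return lemma, pos, int(sense)

for name in ["plan.n.01", "course_of_study.n.01", "program.v.02"]:
    lemma, pos, sense = parse_synset_name(name)
    print(f"{name}: lemma={lemma!r}, pos={POS_NAMES.get(pos, pos)}, sense={sense}")
```

This explains why the first result for "program" is named `plan.n.01`: the synset is labelled by one of its lemmas, which need not be the word that was searched for.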


In [4]:
print("An example of a synset:")
print(syns[0].name())
print("\n")

print("Just the word:")
print(syns[0].lemmas()[0].name())
print("\n")

print("Definition of that first synset:")
print(syns[0].definition())
print("\n")

print("Examples of the word in use:")
print(syns[0].examples())

An example of a synset:
plan.n.01


Just the word:
plan


Definition of that first synset:
a series of steps to be carried out or goals to be accomplished


Examples of the word in use:
['they drew up a six-step plan', 'they discussed plans for a new bond issue']
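
The cell above inspects only the first synset. A small sketch of how the same three calls (`name()`, `definition()`, `examples()`) could be applied to every sense of a word is shown below. The function name `summarize_senses` is an assumption made for illustration. It accepts any iterable of synset-like objects, so it does not import NLTK itself.

```python
def summarize_senses(synsets):
    """Return one (name, definition, first_example_or_None) tuple per synset."""
    rows = []
    for syn in synsets:
        examples = syn.examples()
        # Some synsets have no usage examples, so fall back to None.
        first_example = examples[0] if examples else None
        rows.append((syn.name(), syn.definition(), first_example))
    return rows
```

In this notebook it could be called as `summarize_senses(wordnet.synsets("program"))`, which would yield one row for each of the ten synsets listed earlier.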


In [5]:
# Next, how can we find the synonyms and antonyms of a word?
# The lemmas of each synset serve as synonyms, and calling .antonyms() on a lemma returns its antonyms.
# With that, we can populate two lists:

synonyms = []
antonyms = []

for syn in wordnet.synsets("good"):
    for lemma in syn.lemmas():
        synonyms.append(lemma.name())
        # Not every lemma has an antonym; keep the first one when it exists.
        if lemma.antonyms():
            antonyms.append(lemma.antonyms()[0].name())

print("Synonyms are ")
print(set(synonyms))
print("\n")
print("Antonyms are")
print(set(antonyms))

# Inference:
# As you can see, we got many more synonyms than antonyms,
# since only some lemmas have antonyms recorded and we keep just the first antonym of each.
# You could balance this by running the same process for the term "bad".

Synonyms are 
{'just', 'full', 'unspoilt', 'salutary', 'in_force', 'estimable', 'unspoiled', 'serious', 'well', 'undecomposed', 'commodity', 'good', 'goodness', 'near', 'sound', 'soundly', 'dear', 'skillful', 'expert', 'respectable', 'right', 'thoroughly', 'beneficial', 'honorable', 'upright', 'practiced', 'trade_good', 'effective', 'ripe', 'honest', 'adept', 'safe', 'secure', 'in_effect', 'skilful', 'dependable', 'proficient'}


Antonyms are
{'evilness', 'badness', 'evil', 'ill', 'bad'}
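
The loop above keeps only the first antonym of each lemma and removes duplicates at print time. A possible variant, sketched below, collects every antonym and uses sets from the start. The function name `collect_synonyms_and_antonyms` is hypothetical. It relies only on the `lemmas()`, `name()`, and `antonyms()` calls already used in this notebook and takes the synsets as an argument.

```python
def collect_synonyms_and_antonyms(synsets):
    """Gather lemma names and all of their antonyms from an iterable of synsets."""
    synonyms = set()
    antonyms = set()
    for syn in synsets:
        for lemma in syn.lemmas():
            synonyms.add(lemma.name())
            # A lemma can have zero, one, or several antonyms; keep them all.
            for antonym in lemma.antonyms():
                antonyms.add(antonym.name())
    # Sort so the result is stable from run to run, unlike printing a set.
    return sorted(synonyms), sorted(antonyms)
```

In this notebook it could be called as `collect_synonyms_and_antonyms(wordnet.synsets("good"))`. Sorting the results avoids the arbitrary ordering seen in the set output above.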


In [6]:
# Next, we can use WordNet to measure how similar two word senses are,
# using the Wu-Palmer method for semantic relatedness.

# Let's compare the nouns "ship" and "boat":
print("Similarity between 'Ship' & 'Boat' ")
w1 = wordnet.synset('ship.n.01')
w2 = wordnet.synset('boat.n.01')
print(w1.wup_similarity(w2))
print('\n')

print("Similarity between 'Ship' & 'Car' ")
w1 = wordnet.synset('ship.n.01')
w2 = wordnet.synset('car.n.01')
print(w1.wup_similarity(w2))
print("\n")

print("Similarity between 'Ship' & 'Cat' ")
w1 = wordnet.synset('ship.n.01')
w2 = wordnet.synset('cat.n.01')
print(w1.wup_similarity(w2))

Similarity between 'Ship' & 'Boat' 
0.9090909090909091


Similarity between 'Ship' & 'Car' 
0.6956521739130435


Similarity between 'Ship' & 'Cat' 
0.32
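
To make the scores above easier to interpret: the Wu-Palmer measure is 2 * depth(LCS) / (depth(a) + depth(b)), where the LCS is the deepest ancestor the two senses share in the hypernym hierarchy. The sketch below applies that formula to a tiny hand-made taxonomy. The `PARENT` table is an assumption invented for illustration and is not WordNet's real hierarchy, so the values differ from the output above while the ordering is the same.

```python
# Toy taxonomy (hypothetical, NOT WordNet's real hierarchy): child -> parent.
PARENT = {
    "vehicle": "entity",
    "animal": "entity",
    "watercraft": "vehicle",
    "car": "vehicle",
    "ship": "watercraft",
    "boat": "watercraft",
    "cat": "animal",
}

def path_to_root(node):
    """Return [node, parent, ..., root]."""
    path = [node]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path

def depth(node):
    """Number of nodes from the root down to `node` (the root has depth 1)."""
    return len(path_to_root(node))

def lowest_common_subsumer(a, b):
    """Deepest node that is an ancestor of (or equal to) both a and b."""
    ancestors_of_a = set(path_to_root(a))
    # Walking upward from b, the first shared node is the deepest one.
    for node in path_to_root(b):
        if node in ancestors_of_a:
            return node
    raise ValueError("nodes share no common ancestor")

def wu_palmer(a, b):
    """Wu-Palmer similarity: 2 * depth(lcs) / (depth(a) + depth(b))."""
    lcs = lowest_common_subsumer(a, b)
    return 2 * depth(lcs) / (depth(a) + depth(b))

for other in ("boat", "car", "cat"):
    print(f"ship vs {other}: {wu_palmer('ship', other):.3f}")
```

In the toy tree, "ship" and "boat" share the deep ancestor "watercraft", so they score highest. "ship" and "cat" meet only at the root, so they score lowest.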
