
# WordNet - old fashioned NLP

* Classical approaches to word representation fall into two broad classes:
  * approaches that use external resources for representing words and 
  * approaches that do not. 
* An example of the first class is WordNet, one of the most popular external resource-based approaches for representing words. 
* We then proceed to more localized methods (that is, those that do not rely on external resources), such as **one-hot encoding** and **Term Frequency-Inverse Document Frequency (TF-IDF)**.

## Imports

In [1]:
import nltk
nltk.download('wordnet')

from nltk.corpus import wordnet as wn

[nltk_data] Downloading package wordnet to /root/nltk_data...
[nltk_data]   Unzipping corpora/wordnet.zip.


# WordNet – using an external lexical knowledge base for learning word representations

* WordNet is one of the most popular approaches in classical or statistical NLP that deal with word representations. 
* It relies on an external lexical knowledge base that encodes the information about the definition, synonyms, ancestors, descendants, and so forth of a given word.
* First, WordNet uses the term synset to denote a group or set of synonyms. 
* Next, each synset has a definition that explains what the synset represents. 
* Synonyms contained within a synset are called lemmas.
* In WordNet, the word representations are modeled hierarchically, forming a complex graph of associations between a given synset and other synsets.
* These associations can be of two different categories: an is-a relationship or an is-made-of relationship. 
* First, we will discuss the is-a association.
* For a given synset, there exist two categories of relations: 
  * hypernyms and 
  * hyponyms.
* Hypernyms of a synset are the synsets that carry a more general (high-level) meaning than the considered synset. For example, vehicle is a hypernym of the synset car. 
* Next, hyponyms are synsets that are more specific than the corresponding synset. For example, Toyota car is a hyponym of the synset car. 
* Now let's discuss the is-made-of relationships for a synset. 
  * Holonyms of a synset are the synsets that represent the whole entity to which the considered synset belongs. For example, a holonym of the tires synset is the cars synset. 
  * Meronyms are the opposite of holonyms: they are the synsets for the parts or substances that make up the corresponding synset.
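
The relations described above can be sketched without WordNet itself. The following is a minimal toy model, not NLTK's implementation: the `ToySynset` class, the `link_is_a` and `link_part_of` helpers, and the four hand-written entries are all hypothetical and exist only for illustration. It shows that hypernym/hyponym and holonym/meronym are inverse pairs of edges on the same graph.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)  # eq=False: compare by identity, since the graph has cycles
class ToySynset:
    """Hand-built stand-in for a WordNet synset (illustration only)."""
    name: str
    definition: str
    lemmas: list
    hypernyms: list = field(default_factory=list)  # more general synsets
    hyponyms: list = field(default_factory=list)   # more specific synsets
    holonyms: list = field(default_factory=list)   # wholes this synset is part of
    meronyms: list = field(default_factory=list)   # parts that make up this synset


def link_is_a(specific, general):
    """Record an is-a edge in both directions."""
    specific.hypernyms.append(general)
    general.hyponyms.append(specific)


def link_part_of(part, whole):
    """Record an is-made-of edge in both directions."""
    part.holonyms.append(whole)
    whole.meronyms.append(part)


def hypernym_chain(synset):
    """Follow the first hypernym upward until a synset has none."""
    chain = [synset.name]
    while synset.hypernyms:
        synset = synset.hypernyms[0]
        chain.append(synset.name)
    return chain


# Hand-written toy entries (assumptions, not WordNet data)
vehicle = ToySynset("vehicle", "something that transports people or goods", ["vehicle"])
car = ToySynset("car", "a motor vehicle with four wheels", ["car", "auto", "automobile"])
toyota = ToySynset("toyota_car", "a car made by Toyota", ["Toyota car"])
tire = ToySynset("tire", "a rubber ring fitted around a wheel", ["tire", "tyre"])

link_is_a(car, vehicle)
link_is_a(toyota, car)
link_part_of(tire, car)

print(hypernym_chain(toyota))           # ['toyota_car', 'car', 'vehicle']
print([s.name for s in car.hyponyms])   # ['toyota_car']
print([s.name for s in car.meronyms])   # ['tire']
print([s.name for s in tire.holonyms])  # ['car']
```

On a real NLTK synset, the corresponding lookups are methods such as `hypernyms()`, `hyponyms()`, `part_meronyms()`, and `part_holonyms()`.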
  

In [3]:
word = "car"
car_syns = wn.synsets(word)
car_syns

[Synset('car.n.01'),
 Synset('car.n.02'),
 Synset('car.n.03'),
 Synset('car.n.04'),
 Synset('cable_car.n.01')]
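
Each name in the output above follows NLTK's `lemma.pos.sense_number` convention, so `car.n.01` is the first noun sense of *car*. As a small sketch, the hypothetical helper `split_synset_name` below (not part of NLTK) pulls the three parts out of such a string using only the standard library.

```python
def split_synset_name(name):
    """Split 'lemma.pos.nn' into (lemma, pos, sense_number).

    rsplit is used so that a lemma that itself contains dots stays intact.
    """
    lemma, pos, sense = name.rsplit(".", 2)
    return lemma, pos, int(sense)


for name in ["car.n.01", "car.n.04", "cable_car.n.01"]:
    print(split_synset_name(name))
# ('car', 'n', 1)
# ('car', 'n', 4)
# ('cable_car', 'n', 1)
```

This also makes it easier to see why `cable_car.n.01` is returned for the query "car": the synset is named after its first lemma, but the lookup matches any lemma it contains.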

In [6]:
# The definitions of all synsets returned for the word
syns_defs = [syn.definition() for syn in car_syns]
for syn, definition in zip(car_syns, syns_defs):
    print(syn.name(),': ',definition)

car.n.01 :  a motor vehicle with four wheels; usually propelled by an internal combustion engine
car.n.02 :  a wheeled vehicle adapted to the rails of railroad
car.n.03 :  the compartment that is suspended from an airship and that carries personnel and the cargo and the power plant
car.n.04 :  where passengers ride up and down
cable_car.n.01 :  a conveyance for passengers or freight on a cable railway


In [7]:
for syn in car_syns:
  print(syn.name(),": ", syn.lemmas())

car.n.01 :  [Lemma('car.n.01.car'), Lemma('car.n.01.auto'), Lemma('car.n.01.automobile'), Lemma('car.n.01.machine'), Lemma('car.n.01.motorcar')]
car.n.02 :  [Lemma('car.n.02.car'), Lemma('car.n.02.railcar'), Lemma('car.n.02.railway_car'), Lemma('car.n.02.railroad_car')]
car.n.03 :  [Lemma('car.n.03.car'), Lemma('car.n.03.gondola')]
car.n.04 :  [Lemma('car.n.04.car'), Lemma('car.n.04.elevator_car')]
cable_car.n.01 :  [Lemma('cable_car.n.01.cable_car'), Lemma('cable_car.n.01.car')]


In [12]:
# lemmas() is a method; name() returns each lemma's word form
for lemma in car_syns[0].lemmas():
  print(lemma.name())

car
auto
automobile
machine
motorcar
