## Day 45 - DIY Solution

**Q1. Problem Statement: Stemming, Lemmatization and Word Sense Disambiguation**

Perform the following tasks to get an understanding of stemming,
lemmatization and Word Sense Disambiguation (WSD) using the Natural Language
Toolkit (NLTK):

1. Declare a list of words and perform stemming on each word using PorterStemmer() and LancasterStemmer().
2. Declare a sentence and perform lemmatization on each word of the sentence using WordNetLemmatizer().
3. Declare two different sentences with homonyms and perform WSD to fetch the meanings of the homonyms in the context of their respective sentences.

**Note:** Homonyms are words that have the same spelling and pronunciation, but different meanings.


**Step-1:** Importing necessary libraries.

In [None]:
from nltk.stem import PorterStemmer
from nltk.stem import LancasterStemmer

**Step-2:** Creating PorterStemmer() and LancasterStemmer() objects.

In [None]:
porter = PorterStemmer()
lancaster=LancasterStemmer()

**Step-3:** Stemming the words using PorterStemmer() and LancasterStemmer().

In [None]:
# A list of words to be stemmed
word_list = ["friend", "friendship", "friends", "friendships", "stabil", "destabilize", "misunderstanding", "railroad", "moonlight", "football"]
print("The original words are:")
print(word_list)
print("After stemming the words using the Porter and Lancaster stemmers:")
# Print a three-column table: each word and its stem from both stemmers
print("{0:20}{1:20}{2:20}".format("Word", "Porter Stemmer", "Lancaster Stemmer"))
for word in word_list:
    print("{0:20}{1:20}{2:20}".format(word, porter.stem(word), lancaster.stem(word)))

The original words are:
['friend', 'friendship', 'friends', 'friendships', 'stabil', 'destabilize', 'misunderstanding', 'railroad', 'moonlight', 'football']
After stemming the words using the Porter and Lancaster stemmers:
Word                Porter Stemmer      Lancaster Stemmer   
friend              friend              friend              
friendship          friendship          friend              
friends             friend              friend              
friendships         friendship          friend              
stabil              stabil              stabl               
destabilize         destabil            dest                
misunderstanding    misunderstand       misunderstand       
railroad            railroad            railroad            
moonlight           moonlight           moonlight           
football            footbal             footbal             
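
The table above shows that the Lancaster stemmer is more aggressive than the Porter stemmer: it reduces all four "friend" variants to one stem, while Porter keeps "friendship" separate. A minimal sketch of how that difference shows up when words are grouped by stem follows. The helper name `group_by_stem` is hypothetical and is not part of NLTK.

```python
from collections import defaultdict

from nltk.stem import LancasterStemmer, PorterStemmer


def group_by_stem(words, stemmer):
    """Group words that share the same stem (hypothetical helper, not an NLTK API)."""
    groups = defaultdict(list)
    for word in words:
        groups[stemmer.stem(word)].append(word)
    return dict(groups)


words = ["friend", "friendship", "friends", "friendships"]
porter_groups = group_by_stem(words, PorterStemmer())
lancaster_groups = group_by_stem(words, LancasterStemmer())

print("Porter:   ", porter_groups)
print("Lancaster:", lancaster_groups)
```

Fewer, larger groups mean more words are conflated, which can help recall in search-style tasks but may merge words with different meanings.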


**Step-4:** Downloading and importing the modules needed to lemmatize sentences.

In [None]:
import nltk
nltk.download('wordnet')
nltk.download('punkt')
from nltk.stem import WordNetLemmatizer


[nltk_data] Downloading package wordnet to /root/nltk_data...
[nltk_data]   Package wordnet is already up-to-date!
[nltk_data] Downloading package punkt to /root/nltk_data...
[nltk_data]   Package punkt is already up-to-date!


**Step-5:** Creating a WordNetLemmatizer() object.

In [None]:
wordnet_lemmatizer = WordNetLemmatizer()

**Step-6:** Declaring a sentence as a string and lemmatizing it using WordNetLemmatizer().

In [None]:
sentence = "He was running and eating at same time. He has bad habit of swimming after playing long hours in the Sun."
punctuations = "?:!.,;"
# Tokenize, then build a new list without the punctuation tokens
# (removing items from a list while iterating over it can skip elements)
sentence_words = [word for word in nltk.word_tokenize(sentence) if word not in punctuations]

print("The Original Sentence is:")
print(sentence)
print("After Lemmatizing words using the WordNetLemmatizer:")

print("{0:20}{1:20}".format("Words", "Lemmatized words"))
for word in sentence_words:
    # lemmatize() treats each word as a noun unless a pos argument is given
    print("{0:20}{1:20}".format(word, wordnet_lemmatizer.lemmatize(word)))

The Original Sentence is:
He was running and eating at same time. He has bad habit of swimming after playing long hours in the Sun.
After Lemmatizing words using the WordNetLemmatizer:
Words               Lemmatized words    
He                  He                  
was                 wa                  
running             running             
and                 and                 
eating              eating              
at                  at                  
same                same                
time                time                
He                  He                  
has                 ha                  
bad                 bad                 
habit               habit               
of                  of                  
swimming            swimming            
after               after               
playing             playing             
long                long                
hours               hour                
in                  in                  
the                 the                 
Sun                 Sun                 

**Step-7:** Importing necessary libraries for Word Sense Disambiguation (WSD).

In [None]:
from nltk.wsd import lesk
from nltk.tokenize import word_tokenize

**Step-8:** Declaring two different sentences with homonyms and performing WSD to fetch the meanings of the homonyms in the context of their respective sentences.

In [None]:
sentence1 = "This device is used to jam the signal"
print("Sentence-1:")
print(sentence1)
print("Meaning of the jam word in Sentence-1 is:")
# Pass the tokenized sentence (the variable, not a string literal) as the context
a1 = lesk(word_tokenize(sentence1), 'jam')

print(a1, a1.definition())

sentence2 = "I am stuck in a traffic jam"
print("Sentence-2:")
print(sentence2)
print("Meaning of the jam word in Sentence-2 is:")
a2 = lesk(word_tokenize(sentence2), 'jam')

print(a2, a2.definition())

Sentence-1:
This device is used to jam the signal
Meaning of the jam word in Sentence-1 is:
Synset('throng.v.01') press tightly together or cram
Sentence-2:
I am stuck in a traffic jam
Meaning of the jam word in Sentence-2 is:
Synset('jam.v.05') get stuck and immobilized
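
lesk() picks the sense whose dictionary definition shares the most tokens with the context sentence. A toy sketch of that overlap idea in plain Python follows. Both `simple_lesk` and the two-entry sense inventory are hypothetical and written only for illustration; the real function scores WordNet synsets instead.

```python
def simple_lesk(context_tokens, senses):
    """Return the sense name whose definition overlaps most with the context.

    `senses` maps a sense name to its definition string. Both this function
    and the inventory below are hypothetical, for illustration only.
    """
    context = set(context_tokens)
    best_sense, best_overlap = None, -1
    for name, definition in senses.items():
        # Count the tokens shared by the context and this definition
        overlap = len(context & set(definition.split()))
        if overlap > best_overlap:
            best_sense, best_overlap = name, overlap
    return best_sense


# Hypothetical hand-written sense inventory for the word "jam"
jam_senses = {
    "jam (block signals)": "interfere with the reception of a radio signal",
    "jam (get stuck)": "get stuck and immobilized in traffic",
}

signal_sense = simple_lesk("This device is used to jam the signal".split(), jam_senses)
traffic_sense = simple_lesk("I am stuck in a traffic jam".split(), jam_senses)
print(signal_sense)
print(traffic_sense)
```

Because the score is a plain token overlap, common words such as "the" or "a" count as much as content words, which is one reason the sense returned can differ from the intuitive one.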


**Step-9:** Testing WSD with different sentences.

In [None]:
# Testing WSD on "season" in a cooking context

b1 = lesk(word_tokenize('Apply the salt and other spices to the chicken to season it'), 'season')

print(b1, b1.definition())

Synset('season.n.02') one of the natural periods into which the year is divided by the equinoxes and solstices or atmospheric conditions


In [None]:
# Testing WSD on "season" in a weather context

b2 = lesk(word_tokenize("It'll be too humid inside in the rainy season"), 'season')

print(b2, b2.definition())

Synset('season.n.01') a period of the year marked by special events or activities in some field
