
In [None]:
'''
Develop a decision list model for disambiguating the word "car".
Assume there are two morphological forms: ["car", "cars"]. WordNet
lists five senses for "car". Focus on identifying the word
features that can be used to disambiguate the senses
["car.n.01", "car.n.02", "car.n.04", "cable_car.n.01"] in the
provided sentences. For brevity, ignore the word sense
"car.n.03", which relates to airships.

Start by identifying the features (words) that will be used to
disambiguate the senses. Then create a set of if-else statements
that perform the disambiguation using the identified features.

'''

ambiguous_word = 'car'

In [None]:
import sys
from nltk.corpus import wordnet as wn
import nltk
from nltk.tokenize import word_tokenize, sent_tokenize, RegexpTokenizer


In [None]:

!{sys.executable} -m nltk.downloader 'punkt'
!{sys.executable} -m nltk.downloader 'wordnet'
!{sys.executable} -m nltk.downloader 'omw-1.4'


[nltk_data] Downloading package punkt to /root/nltk_data...
[nltk_data]   Unzipping tokenizers/punkt.zip.
[nltk_data] Downloading package wordnet to /root/nltk_data...
[nltk_data] Downloading package omw-1.4 to /root/nltk_data...


In [None]:
sentences = [
  'I drove the car to the store.',
  'My car go to flat tire.',
  'The train had many cars.',
  'The last train car is the caboose.',
  'The elevator ride was scary because the car kept vibrating.',
  'Someone pressed all the buttons in the elevator car.',
  'We took the cable car to the top of the mountain.',
  'I got motion sick in the gondola, because the car kept swinging back and forth.',]
for i, sent in enumerate(sentences):
  print(i, sent)

0 I drove the car to the store.
1 My car go to flat tire.
2 The train had many cars.
3 The last train car is the caboose.
4 The elevator ride was scary because the car kept vibrating.
5 Someone pressed all the buttons in the elevator car.
6 We took the cable car to the top of the mountain.
7 I got motion sick in the gondola, because the car kept swinging back and forth.


In [None]:
# Get all WordNet senses (synsets) of the ambiguous word
senses = wn.synsets(ambiguous_word)

print(len(senses), senses)

5 [Synset('car.n.01'), Synset('car.n.02'), Synset('car.n.03'), Synset('car.n.04'), Synset('cable_car.n.01')]


In [None]:
# Print all sense definitions
for s in senses:
  print(s, s.definition())

Synset('car.n.01') a motor vehicle with four wheels; usually propelled by an internal combustion engine
Synset('car.n.02') a wheeled vehicle adapted to the rails of railroad
Synset('car.n.03') the compartment that is suspended from an airship and that carries personnel and the cargo and the power plant
Synset('car.n.04') where passengers ride up and down
Synset('cable_car.n.01') a conveyance for passengers or freight on a cable railway


In [None]:
# Tokenize sentences
tokenized = [word_tokenize(sent) for sent in sentences]

for i, sent in enumerate(tokenized):
  print(i, sent)

0 ['I', 'drove', 'the', 'car', 'to', 'the', 'store', '.']
1 ['My', 'car', 'go', 'to', 'flat', 'tire', '.']
2 ['The', 'train', 'had', 'many', 'cars', '.']
3 ['The', 'last', 'train', 'car', 'is', 'the', 'caboose', '.']
4 ['The', 'elevator', 'ride', 'was', 'scary', 'because', 'the', 'car', 'kept', 'vibrating', '.']
5 ['Someone', 'pressed', 'all', 'the', 'buttons', 'in', 'the', 'elevator', 'car', '.']
6 ['We', 'took', 'the', 'cable', 'car', 'to', 'the', 'top', 'of', 'the', 'mountain', '.']
7 ['I', 'got', 'motion', 'sick', 'in', 'the', 'gondola', ',', 'because', 'the', 'car', 'kept', 'swinging', 'back', 'and', 'forth', '.']
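
Before choosing feature words, it can help to look at the tokens that sit near the target word. The following is a minimal sketch, not part of the original assignment: `context_window` and its `size` parameter are hypothetical names introduced here, and the token lists are copied from the output above.

```python
def context_window(tokens, target_forms, size=3):
    """Return lowercased word tokens within `size` positions of the first target form."""
    lowered = [t.lower() for t in tokens]
    for i, tok in enumerate(lowered):
        if tok in target_forms:
            left = lowered[max(0, i - size):i]
            right = lowered[i + 1:i + 1 + size]
            # Keep alphabetic tokens only, dropping punctuation such as '.' and ','
            return [t for t in left + right if t.isalpha()]
    # The target word does not occur in this sentence
    return []


forms = {'car', 'cars'}

train_tokens = ['The', 'last', 'train', 'car', 'is', 'the', 'caboose', '.']
train_window = context_window(train_tokens, forms)
print(train_window)

gondola_tokens = ['I', 'got', 'motion', 'sick', 'in', 'the', 'gondola', ',',
                  'because', 'the', 'car', 'kept', 'swinging', 'back', 'and', 'forth', '.']
gondola_window = context_window(gondola_tokens, forms)
print(gondola_window)
```

With a window of three tokens, "train" and "caboose" are captured for the first sentence, but "gondola" falls outside the window for the second. That is one reason a rule set for short sentences like these may reasonably match against the whole sentence instead.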


In [None]:
# Create function that resolves the word sense for an ambiguous word in context
def disambiguate_sense(sentence, ambiguous_word):
    # Lowercase the tokens so that feature matching is case-insensitive
    context = {token.lower() for token in sentence}

    # Only disambiguate sentences containing "car" or its plural "cars"
    if not context & {ambiguous_word, ambiguous_word + 's'}:
        return 'unknown'

    # Decision list: rules are tested in order and the first match wins,
    # so the most specific cues ("cable", "gondola") come first
    if context & {"cable", "gondola"}:
        return 'cable_car.n.01'
    elif context & {"elevator", "lift"}:
        return 'car.n.04'
    elif context & {"train", "railway", "caboose"}:
        return 'car.n.02'
    elif context & {"drove", "store", "flat", "tire"}:
        return 'car.n.01'

    return 'unknown'

In [None]:
# Apply sense disambiguation function

for i, sent in enumerate(tokenized):
    s = disambiguate_sense(sent, ambiguous_word=ambiguous_word)
    print(f'{i}: {ambiguous_word}-sense = {s}; sentence={sent}')

0: car-sense = car.n.01; sentence=['I', 'drove', 'the', 'car', 'to', 'the', 'store', '.']
1: car-sense = car.n.01; sentence=['My', 'car', 'go', 'to', 'flat', 'tire', '.']
2: car-sense = car.n.02; sentence=['The', 'train', 'had', 'many', 'cars', '.']
3: car-sense = car.n.02; sentence=['The', 'last', 'train', 'car', 'is', 'the', 'caboose', '.']
4: car-sense = car.n.04; sentence=['The', 'elevator', 'ride', 'was', 'scary', 'because', 'the', 'car', 'kept', 'vibrating', '.']
5: car-sense = car.n.04; sentence=['Someone', 'pressed', 'all', 'the', 'buttons', 'in', 'the', 'elevator', 'car', '.']
6: car-sense = cable_car.n.01; sentence=['We', 'took', 'the', 'cable', 'car', 'to', 'the', 'top', 'of', 'the', 'mountain', '.']
7: car-sense = cable_car.n.01; sentence=['I', 'got', 'motion', 'sick', 'in', 'the', 'gondola', ',', 'because', 'the', 'car', 'kept', 'swinging', 'back', 'and', 'forth', '.']
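
An if-else chain of this kind can also be written as an ordered table of rules, which is closer to how a decision list is usually described: each rule pairs a set of feature words with a sense, and the first matching rule decides. The sketch below is only an illustration under stated assumptions. `RULES`, `apply_decision_list`, and the gold labels are hypothetical: the labels were assigned by hand for this example, and the feature words are one plausible choice rather than the only valid one. The sentences are written pre-tokenized so the block runs without NLTK.

```python
# Hypothetical rule table: (feature words, sense), tried from top to bottom
RULES = [
    ({"cable", "gondola"}, "cable_car.n.01"),
    ({"elevator", "lift"}, "car.n.04"),
    ({"train", "railway", "caboose"}, "car.n.02"),
    ({"drove", "store", "flat", "tire"}, "car.n.01"),
]


def apply_decision_list(tokens, rules=RULES, default="unknown"):
    """Return the sense of the first rule whose features overlap the sentence."""
    context = {t.lower() for t in tokens}
    for features, sense in rules:
        if context & features:
            return sense
    return default


# Gold labels are an assumption: assigned by hand for this sketch
examples = [
    ("I drove the car to the store .", "car.n.01"),
    ("My car go to flat tire .", "car.n.01"),
    ("The train had many cars .", "car.n.02"),
    ("The last train car is the caboose .", "car.n.02"),
    ("The elevator ride was scary because the car kept vibrating .", "car.n.04"),
    ("Someone pressed all the buttons in the elevator car .", "car.n.04"),
    ("We took the cable car to the top of the mountain .", "cable_car.n.01"),
    ("I got motion sick in the gondola , because the car kept swinging back and forth .",
     "cable_car.n.01"),
]

predictions = [apply_decision_list(text.split()) for text, _ in examples]
correct = sum(pred == gold for pred, (_, gold) in zip(predictions, examples))
print(f"{correct}/{len(examples)} correct")
```

Keeping the rules in a list makes their order explicit and lets new features be added without touching the control flow. A perfect score on these eight sentences says little about unseen text, since the features were picked by looking at the same sentences.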
