# Week 6 (Part 2): Supervised WSD

In this part of the week we will be looking at corpus-based methods for carrying out word sense disambiguation (WSD).  In particular, we will:
* introduce SemCor, a sense-tagged subsection of the Brown Corpus
* build Naive Bayes classifiers to carry out sense disambiguation for words with two senses

First, some preliminary imports:

In [49]:
#from google.colab import drive
#drive.mount('/content/drive')
import nltk
nltk.download('wordnet')
nltk.download('wordnet_ic')
nltk.download('punkt')
nltk.download('stopwords')
nltk.download('semcor')

from nltk.corpus import wordnet as wn
from nltk.corpus import wordnet_ic as wn_ic
from nltk.corpus import semcor
from nltk.stem.wordnet import WordNetLemmatizer
import sys
import operator

#make sure that the path to your utils.py file is correct for your computer
#sys.path.append('/content/drive/My Drive/NLE Notebooks/Week4LabsSolutions/')
#from utils import *
#from sussex_nltk.corpus_readers import AmazonReviewCorpusReader



[nltk_data] Downloading package wordnet to /root/nltk_data...
[nltk_data]   Package wordnet is already up-to-date!
[nltk_data] Downloading package wordnet_ic to /root/nltk_data...
[nltk_data]   Package wordnet_ic is already up-to-date!
[nltk_data] Downloading package punkt to /root/nltk_data...
[nltk_data]   Package punkt is already up-to-date!
[nltk_data] Downloading package stopwords to /root/nltk_data...
[nltk_data]   Package stopwords is already up-to-date!
[nltk_data] Downloading package semcor to /root/nltk_data...
[nltk_data]   Package semcor is already up-to-date!


## SemCor
SemCor is a collection of 352 documents which have been annotated in various ways (annotations include POS tags and WordNet synsets for individual words).

`semcor.fileids()` returns a list of all of the individual document ids in SemCor

In [50]:
allfiles=semcor.fileids() #list of fileids
len(allfiles)

352

`semcor.raw(fileid)` returns the raw text of the given file.  Note that this is marked-up using XML and is probably best avoided unless there is no other way to access the information you require from the file!

In [51]:
semcor.raw(allfiles[0])



Other potentially useful SemCor functions include:

* `semcor.words(fileid)`: returns a list of the tokens in the file
* `semcor.chunks(fileid)`: returns a list of *chunks* for the file, where a chunk identifies multiword (generally non-compositional) phrases
* `semcor.tagged_chunks(fileid,tag)`: returns the tagged chunks of the file, where `tag` can be *pos* or *sem*.  We are interested in the *sem* tags, which are the WordNet synsets
* `semcor.tagged_sents(fileid,tag)`: maintains the sentence boundaries within the file and therefore returns a list of lists (one for each sentence)


In [52]:
semcor.words(allfiles[0])

['The', 'Fulton', 'County', 'Grand', 'Jury', 'said', ...]

In [53]:
len(semcor.words(allfiles[0]))

2255

In [54]:
semcor.chunks(allfiles[0])

[['The'], ['Fulton', 'County', 'Grand', 'Jury'], ...]

In [55]:
semcor.tagged_chunks(allfiles[0],tag='sem')

[['The'], Tree(Lemma('group.n.01.group'), [Tree('NE', ['Fulton', 'County', 'Grand', 'Jury'])]), ...]

In [56]:
tagged_sentences=semcor.tagged_sents(allfiles[0],tag='sem')
tagged_sentences[0]

[['The'],
 Tree(Lemma('group.n.01.group'), [Tree('NE', ['Fulton', 'County', 'Grand', 'Jury'])]),
 Tree(Lemma('state.v.01.say'), ['said']),
 Tree(Lemma('friday.n.01.Friday'), ['Friday']),
 ['an'],
 Tree(Lemma('probe.n.01.investigation'), ['investigation']),
 ['of'],
 Tree(Lemma('atlanta.n.01.Atlanta'), ['Atlanta']),
 ["'s"],
 Tree(Lemma('late.s.03.recent'), ['recent']),
 Tree(Lemma('primary.n.01.primary_election'), ['primary', 'election']),
 Tree(Lemma('produce.v.04.produce'), ['produced']),
 ['``'],
 ['no'],
 Tree(Lemma('evidence.n.01.evidence'), ['evidence']),
 ["''"],
 ['that'],
 ['any'],
 Tree(Lemma('abnormality.n.04.irregularity'), ['irregularities']),
 Tree(Lemma('happen.v.01.take_place'), ['took', 'place']),
 ['.']]

For the purposes of this exercise, we are interested in single words which have been tagged with a WordNet Lemma or synset.  We now define a couple of functions to help us extract this information.

In [57]:
def extract_tags(taggedsentence):
    '''
    For a tagged sentence in SemCor, identify single words which have been tagged with a WN synset
    taggedsentence: a list of items, some of which are of type nltk.tree.Tree
    :return: a list of pairs, (word,synset)
    
    '''
    alist=[]
    for item in taggedsentence:
        if isinstance(item,nltk.tree.Tree):   #check whether this is a Tree
            if isinstance(item.label(),nltk.corpus.reader.wordnet.Lemma) and len(item.leaves())==1:
                #check whether the tree's label is Lemma and whether the tree has a single leaf
                #if so add the pair (lowercased leaf,synsetlabel) to output list
                alist.append((item.leaves()[0].lower(),item.label().synset()))
    return alist
            

def extract_senses(fileid_list):
    '''
    apply extract_tags to all sentences in all documents in a list of file ids
    fileid_list: list of ids
    :return: list of list of (token,tag) pairs, one for each sentence in corpus
    '''
    sentences=[]
    for fileid in fileid_list:
        print("Processing {}".format(fileid))
        sentences+=[extract_tags(taggedsentence) for taggedsentence in semcor.tagged_sents(fileid,tag='sem')]
    return sentences

Let's test this on the first document in the fileid list.  Notice that it takes a while to process a single file in this way.

In [58]:
some_sentences=extract_senses([allfiles[0]])
some_sentences

Processing brown1/tagfiles/br-a01.xml


[[('said', Synset('state.v.01')),
  ('friday', Synset('friday.n.01')),
  ('investigation', Synset('probe.n.01')),
  ('atlanta', Synset('atlanta.n.01')),
  ('recent', Synset('late.s.03')),
  ('produced', Synset('produce.v.04')),
  ('evidence', Synset('evidence.n.01')),
  ('irregularities', Synset('abnormality.n.04'))],
 [('jury', Synset('jury.n.01')),
  ('further', Synset('far.r.02')),
  ('said', Synset('state.v.01')),
  ('term', Synset('term.n.02')),
  ('end', Synset('end.n.02')),
  ('presentments', Synset('presentment.n.01')),
  ('had', Synset('own.v.01')),
  ('over-all', Synset('overall.s.02')),
  ('charge', Synset('mission.n.03')),
  ('election', Synset('election.n.01')),
  ('deserves', Synset('deserve.v.01')),
  ('praise', Synset('praise.n.01')),
  ('thanks', Synset('thanks.n.01')),
  ('manner', Synset('manner.n.01')),
  ('election', Synset('election.n.01')),
  ('conducted', Synset('conduct.v.01'))],
 [('september', Synset('september.n.01')),
  ('october', Synset('october.n.01')),


### Exercise 1.1
Write a function `find_sense_distributions()` which finds the distribution of senses for every word in a list of sentences (in the format returned by `extract_senses()`).  Your output should be a dictionary of dictionaries.  The key to the outermost dictionary should be the word_form and the key to the inner dictionaries should be the sense tag.

Test your function on `some_sentences`

In [59]:
def find_sense_distributions(some_sentences):
    allwords={}
    for sentence in some_sentences:
        for (word,sense) in sentence:
            thisword=allwords.get(word,{})
            thisword[sense]=thisword.get(sense,0)+1
            allwords[word]=thisword
    return allwords
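
The same nested counts can also be built with `collections.defaultdict` and `collections.Counter`. The sketch below is only an illustration: the name `sense_distributions_counter` and the toy sentences (plain strings standing in for `Synset` objects) are assumptions, not part of the lab code.

```python
from collections import Counter, defaultdict

def sense_distributions_counter(sentences):
    """Count how often each word form occurs with each sense tag."""
    dists = defaultdict(Counter)
    for sentence in sentences:
        for word, sense in sentence:
            dists[word][sense] += 1
    # convert to plain dicts to match the dictionary-of-dictionaries format
    return {word: dict(counts) for word, counts in dists.items()}

# toy data: strings stand in for the Synset objects returned by extract_senses()
toy_sentences = [
    [("act", "act.n.01"), ("said", "state.v.01")],
    [("act", "act.v.01"), ("said", "state.v.01")],
]
toy_dists = sense_distributions_counter(toy_sentences)
print(toy_dists)
```

Because `Counter` returns 0 for unseen keys, the explicit `get(..., 0)` bookkeeping is not needed in this version.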
    

In [60]:
sense_dists=find_sense_distributions(some_sentences)
sense_dists

{'10': {Synset('ten.s.01'): 1},
 '100': {Synset('hundred.s.01'): 1},
 '13th': {Synset('thirteenth.s.01'): 1},
 '29': {Synset('twenty-nine.n.01'): 1},
 '3': {Synset('three.s.01'): 1},
 '30': {Synset('thirty.s.01'): 1},
 '4': {Synset('four.s.01'): 1},
 '5': {Synset('five.n.01'): 1},
 '50': {Synset('fifty.s.01'): 1},
 'achieve': {Synset('achieve.v.01'): 1},
 'act': {Synset('act.n.01'): 1, Synset('act.v.01'): 1},
 'action': {Synset('action.n.01'): 3},
 'actions': {Synset('action.n.01'): 1},
 'added': {Synset('add.v.02'): 3},
 'adjournment': {Synset('adjournment.n.01'): 2},
 'adjustments': {Synset('adjustment.n.01'): 1},
 'administration': {Synset('administration.n.01'): 2},
 'administrators': {Synset('administrator.n.01'): 1},
 'afternoon': {Synset('afternoon.n.01'): 1},
 'age': {Synset('age.n.01'): 2},
 'agriculture': {Synset('agribusiness.n.01'): 1},
 'aid': {Synset('aid.n.03'): 1},
 'airport': {Synset('airport.n.01'): 2},
 'all': {Synset('all.a.01'): 1},
 'allotted': {Synset('accord.v.0

### Exercise 1.2
Write a function which returns a list of words which only occur with one sense in the corpus, ordered by frequency (most frequent first).

Test your function on `some_sentences`.  You should find that the fourth most frequently occurring seemingly monosemous word is *georgia* which occurs 6 times in this sample.

In [61]:

def find_monosemous(sense_dists):
    mono=[]
    for key,worddict in sense_dists.items():
        if len(worddict.keys())==1:
            mono.append((key,sum(worddict.values())))
    return sorted(mono,key=operator.itemgetter(1),reverse=True)

find_monosemous(sense_dists)
            
    
            

[('jury', 15),
 ('election', 12),
 ('resolution', 9),
 ('georgia', 6),
 ('new', 6),
 ('not', 6),
 ('atlanta', 5),
 ('fulton', 5),
 ('mayor', 5),
 ('republicans', 5),
 ('county', 5),
 ('campaign', 5),
 ('monday', 5),
 ('bonds', 5),
 ('friday', 4),
 ('voters', 4),
 ('recommended', 4),
 ('legislators', 4),
 ('one', 4),
 ('night', 4),
 ('petition', 4),
 ('candidate', 4),
 ('vote', 4),
 ('expected', 4),
 ('house', 4),
 ('irregularities', 3),
 ('term', 3),
 ('primary', 3),
 ('practices', 3),
 ('however', 3),
 ('two', 3),
 ('law', 3),
 ('also', 3),
 ('legislature', 3),
 ('court', 3),
 ('added', 3),
 ('listed', 3),
 ('race', 3),
 ('highway', 3),
 ('bond', 3),
 ('vandiver', 3),
 ('action', 3),
 ('increase', 3),
 ('pelham', 3),
 ('polls', 3),
 ('anonymous', 3),
 ('manner', 2),
 ('reports', 2),
 ('laws', 2),
 ('commented', 2),
 ('other', 2),
 ('administration', 2),
 ('personnel', 2),
 ('urged', 2),
 ('implementation', 2),
 ('provide', 2),
 ('program', 2),
 ('counties', 2),
 ('exception', 2),
 ('m

### Exercise 1.3
Write a function `find_candidates()` which will find words which
* have 2 senses in the sample,
* have occurrences that are roughly balanced between the two senses (each sense accounts for between 30% and 70% of occurrences)
* are as frequent as possible

Test it on `some_sentences`

In [62]:
def find_candidates(sense_dists):
    cands=[]
    for key,worddict in sense_dists.items():
        if len(worddict.keys())==2:
            freq=sum(worddict.values())
            p=list(worddict.values())[0]/freq
            if p>0.3 and p<0.7:
                cands.append((key,freq,p))
    return sorted(cands,key=operator.itemgetter(1),reverse=True)
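
`find_candidates()` measures balance with the share of whichever sense happens to be stored first. With exactly two senses the other share is `1-p`, so the test is symmetric. A possible variant makes this explicit by reporting the share of the rarer sense. The names `minority_share` and `balanced_two_sense_words`, and the toy counts with made-up sense labels, are assumptions for illustration.

```python
def minority_share(sense_counts):
    """Share of occurrences belonging to the least frequent sense."""
    total = sum(sense_counts.values())
    return min(sense_counts.values()) / total

def balanced_two_sense_words(sense_dists, low=0.3):
    """Words with exactly two senses whose rarer sense has a share above `low`."""
    cands = []
    for word, counts in sense_dists.items():
        if len(counts) == 2 and minority_share(counts) > low:
            cands.append((word, sum(counts.values()), minority_share(counts)))
    # most frequent first, as in find_candidates()
    return sorted(cands, key=lambda item: item[1], reverse=True)

# toy counts; the sense labels s1/s2 are placeholders, not real synsets
toy_dists = {
    "said": {"s1": 13, "s2": 11},   # roughly balanced
    "had": {"s1": 9, "s2": 1},      # dominated by one sense
    "jury": {"s1": 15},             # only one sense observed
}
toy_candidates = balanced_two_sense_words(toy_dists)
print(toy_candidates)
```

Requiring the minority share to exceed 0.3 selects the same words as requiring `0.3 < p < 0.7` for either sense.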


    

In [63]:
find_candidates(sense_dists)

[('said', 24, 0.5416666666666666),
 ('million', 6, 0.3333333333333333),
 ('number', 3, 0.6666666666666666),
 ('are', 3, 0.6666666666666666),
 ('some', 3, 0.6666666666666666),
 ('elected', 3, 0.6666666666666666),
 ('be', 3, 0.3333333333333333),
 ('being', 3, 0.3333333333333333),
 ('worth', 3, 0.6666666666666666),
 ('end', 2, 0.5),
 ('charged', 2, 0.5),
 ('interest', 2, 0.5),
 ('act', 2, 0.5),
 ('operated', 2, 0.5),
 ('follow', 2, 0.5),
 ('take', 2, 0.5),
 ('political', 2, 0.5),
 ('permit', 2, 0.5),
 ('home', 2, 0.5),
 ('been', 2, 0.5),
 ('asked', 2, 0.5),
 ('voted', 2, 0.5),
 ('time', 2, 0.5),
 ('privilege', 2, 0.5),
 ('got', 2, 0.5)]

We now need to apply our functions to larger samples.  Here we will define two sets of sentences, `training_sentences` and `testing_sentences`.  We are going to use a random sample of the documents for testing.  We can achieve this by randomly shuffling the fileids and then assigning documents in the first part of the list to training and documents in the second part of the list to testing.  By setting the random seed, we ensure reproducibility of our results (since the random shuffle will be the same each time we run the cell).
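
The effect of the seed can be checked on a toy list before running the split on the real corpus. The sketch below uses a local `random.Random(seed)` generator rather than the global `random.seed()` call used in this lab. The function name `split_ids` and the toy ids are assumptions for illustration.

```python
import random

def split_ids(ids, n_train, seed=37):
    """Shuffle a copy of ids with a dedicated seeded generator and split it."""
    rng = random.Random(seed)      # local generator; the global random state is untouched
    shuffled = list(ids)
    rng.shuffle(shuffled)
    return shuffled[:n_train], shuffled[n_train:]

toy_ids = ["doc{:02d}".format(i) for i in range(10)]
train_a, test_a = split_ids(toy_ids, 8)
train_b, test_b = split_ids(toy_ids, 8)

print(train_a == train_b and test_a == test_b)   # same seed, same split
print(set(train_a) & set(test_a))                # no document appears in both parts
```

Splitting by document rather than by sentence keeps sentences from the same text out of both training and testing data.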



In [64]:
import random
random.seed(37)
shuffled=list(allfiles)
random.shuffle(shuffled)
print(shuffled)

['brownv/tagfiles/br-a29.xml', 'brownv/tagfiles/br-l06.xml', 'brown2/tagfiles/br-e31.xml', 'brownv/tagfiles/br-c06.xml', 'brown2/tagfiles/br-j34.xml', 'brownv/tagfiles/br-e11.xml', 'brownv/tagfiles/br-a21.xml', 'brown1/tagfiles/br-j01.xml', 'brownv/tagfiles/br-a17.xml', 'brown1/tagfiles/br-l12.xml', 'brownv/tagfiles/br-e09.xml', 'brown2/tagfiles/br-g17.xml', 'brown2/tagfiles/br-g18.xml', 'brownv/tagfiles/br-g09.xml', 'brownv/tagfiles/br-l04.xml', 'brownv/tagfiles/br-l05.xml', 'brown2/tagfiles/br-l18.xml', 'brownv/tagfiles/br-d08.xml', 'brown1/tagfiles/br-k16.xml', 'brown2/tagfiles/br-f21.xml', 'brown2/tagfiles/br-n11.xml', 'brown1/tagfiles/br-j04.xml', 'brownv/tagfiles/br-e16.xml', 'brownv/tagfiles/br-a25.xml', 'brown2/tagfiles/br-n17.xml', 'brownv/tagfiles/br-g06.xml', 'brownv/tagfiles/br-e06.xml', 'brownv/tagfiles/br-a42.xml', 'brown1/tagfiles/br-g01.xml', 'brown1/tagfiles/br-j19.xml', 'brown2/tagfiles/br-n15.xml', 'brown1/tagfiles/br-j05.xml', 'brownv/tagfiles/br-b16.xml', 'brown1/t

In [65]:
#this cell will take 1-5 minutes to run - avoid rerunning it unnecessarily
training_sentences=extract_senses(shuffled[:300])
testing_sentences=extract_senses(shuffled[300:])

Processing brownv/tagfiles/br-a29.xml
Processing brownv/tagfiles/br-l06.xml
Processing brown2/tagfiles/br-e31.xml
Processing brownv/tagfiles/br-c06.xml
Processing brown2/tagfiles/br-j34.xml
Processing brownv/tagfiles/br-e11.xml
Processing brownv/tagfiles/br-a21.xml
Processing brown1/tagfiles/br-j01.xml
Processing brownv/tagfiles/br-a17.xml
Processing brown1/tagfiles/br-l12.xml
Processing brownv/tagfiles/br-e09.xml
Processing brown2/tagfiles/br-g17.xml
Processing brown2/tagfiles/br-g18.xml
Processing brownv/tagfiles/br-g09.xml
Processing brownv/tagfiles/br-l04.xml
Processing brownv/tagfiles/br-l05.xml
Processing brown2/tagfiles/br-l18.xml
Processing brownv/tagfiles/br-d08.xml
Processing brown1/tagfiles/br-k16.xml
Processing brown2/tagfiles/br-f21.xml
Processing brown2/tagfiles/br-n11.xml
Processing brown1/tagfiles/br-j04.xml
Processing brownv/tagfiles/br-e16.xml
Processing brownv/tagfiles/br-a25.xml
Processing brown2/tagfiles/br-n17.xml
Processing brownv/tagfiles/br-g06.xml
Processing b

### Exercise 1.4
Use the functionality you have already developed to identify:
* the ten most frequent monosemous words in the data
* the ten best candidates for evaluating binary classification algorithms for WSD

In [66]:
training_dist=find_sense_distributions(training_sentences)

In [67]:
find_monosemous(training_dist)

[('not', 1550),
 ('also', 366),
 ('many', 279),
 ('never', 202),
 ('again', 181),
 ('always', 152),
 ('almost', 149),
 ('nothing', 119),
 ('wife', 105),
 ('probably', 95),
 ('already', 92),
 ('perhaps', 90),
 ('usually', 84),
 ('available', 83),
 ('therefore', 83),
 ('sometimes', 74),
 ('ago', 73),
 ('few', 72),
 ('ideas', 68),
 ('anode', 65),
 ('merely', 64),
 ('before', 58),
 ('indeed', 58),
 ('husband', 58),
 ('especially', 57),
 ('difficult', 56),
 ('person', 56),
 ('century', 55),
 ('normal', 55),
 ('died', 54),
 ('car', 54),
 ('particularly', 52),
 ('dictionary', 52),
 ('jewish', 51),
 ('text', 51),
 ('summer', 50),
 ('soon', 50),
 ('certainly', 49),
 ('dominant', 49),
 ('hair', 49),
 ('vocational', 49),
 ('nearly', 48),
 ('maybe', 48),
 ('existence', 46),
 ('jess', 46),
 ('attitude', 45),
 ('kate', 45),
 ('income', 44),
 ('achieved', 43),
 ('trees', 42),
 ('elections', 42),
 ('equipment', 41),
 ('killed', 41),
 ('latter', 41),
 ('smiled', 41),
 ('college', 41),
 ('scotty', 40),


In [68]:
find_candidates(training_dist)

[('too', 232, 0.6379310344827587),
 ('thus', 119, 0.6974789915966386),
 ('really', 88, 0.5795454545454546),
 ('described', 60, 0.5333333333333333),
 ('basis', 57, 0.42105263157894735),
 ('months', 56, 0.5535714285714286),
 ('caused', 49, 0.6530612244897959),
 ('instead', 47, 0.3829787234042553),
 ('radiation', 46, 0.32608695652173914),
 ('cells', 43, 0.37209302325581395),
 ('won', 41, 0.34146341463414637),
 ('labor', 38, 0.5263157894736842),
 ('agreed', 37, 0.5675675675675675),
 ('becoming', 37, 0.6756756756756757),
 ('discussion', 36, 0.4722222222222222),
 ('asking', 36, 0.6666666666666666),
 ('department', 34, 0.5882352941176471),
 ('particles', 34, 0.6764705882352942),
 ('churches', 34, 0.5882352941176471),
 ('destroy', 33, 0.6363636363636364),
 ('spend', 32, 0.6875),
 ('extent', 32, 0.625),
 ('funds', 32, 0.59375),
 ('quickly', 30, 0.4),
 ('wondered', 29, 0.3103448275862069),
 ('neighborhood', 28, 0.6785714285714286),
 ('objective', 26, 0.6923076923076923),
 ('sleep', 26, 0.3461538

## Building Naive Bayes Classifiers for WSD
We are going to train and use a NB classifier to identify the correct sense of a word.

The functions below will get all of the sentences containing a word of choice and generate a Bernoulli bag-of-words representation suitable for a Naive Bayes classifier.

Try it out on one of the words you identified above.

In [69]:
def contains(sentence,astring):
    '''
    check whether sentence contains astring
    '''
    if len(sentence)>0:
        tokens,tags=zip(*sentence)
        return astring in tokens
    else:
        return False
    
def get_label(sentence,word):
    '''
    get the synset label for the word in this sentence
    '''
    count=0
    label="none"
    for token,tag in sentence:
        if token==word:
            count+=1
            label=str(tag)
    if count !=1:
        #print("Warning: {} occurs {} times in {}".format(word,count,sentence))
        pass
    return label
    
def get_word_data(sentences,word):
    '''
    select sentences containing word and construct labelled data set where each sentence is represented using Bernoulli event model
    '''
    selected_sentences=[sentence for sentence in sentences if contains(sentence,word)]
    word_data=[({token:True for (token,tag) in sentence},get_label(sentence,word)) for sentence in selected_sentences] 
    return word_data
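
Under the Bernoulli event model a feature records only whether a word is present, not how often it occurs. Note that `get_word_data()` keeps the target word itself as a feature; since it is `True` in every selected sentence, it cannot help to separate the senses. The sketch below shows the representation on a toy sentence, with an optional switch to drop the target word. The name `bernoulli_features`, the `target` parameter and the toy tags are assumptions, not part of the lab code.

```python
def bernoulli_features(sentence, target=None):
    """Map a list of (token, tag) pairs to {token: True}, optionally dropping the target word."""
    return {token: True for token, _tag in sentence if token != target}

# toy sentence; the tags are placeholders rather than real synsets
toy_sentence = [("thermal", "t1"), ("radiation", "r2"), ("thermal", "t1"), ("observed", "o1")]

with_target = bernoulli_features(toy_sentence)
without_target = bernoulli_features(toy_sentence, target="radiation")

print(with_target)      # 'thermal' appears once even though it occurs twice
print(without_target)   # the target word itself is left out
```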

In [70]:
get_word_data(training_sentences,"radiation")

[({'be': True,
   'coincidence': True,
   'combination': True,
   'components': True,
   'is': True,
   'merely': True,
   'observed': True,
   'planet': True,
   'possibility': True,
   'radiation': True,
   'result': True,
   'solid': True,
   'spectrum': True,
   'suggests': True,
   'surface': True,
   'thermal': True,
   'very': True},
  "Synset('radiation.n.02')"),
 ({'atmosphere': True,
   'black-body': True,
   'case': True,
   'combination': True,
   'components': True,
   'definitely': True,
   'earth': True,
   'is': True,
   'jupiter': True,
   'likely': True,
   'not': True,
   'radiation': True,
   'radiator': True,
   'reaching': True,
   'seems': True,
   'spectrum': True,
   'thermal': True,
   'very': True},
  "Synset('radiation.n.02')"),
 ({'3': True,
   'about': True,
   'agreement': True,
   'basis': True,
   'cm': True,
   'intensity': True,
   'is': True,
   'known': True,
   'mars': True,
   'observed': True,
   'predicted': True,
   'radiation': True,
   'reaso

We can now train and test a NaiveBayesClassifier.  Here we are going to use the nltk one, but feel free to try out your own developed in earlier labs.

In [77]:
from nltk.classify.naivebayes import NaiveBayesClassifier

training=get_word_data(training_sentences,"thus")
testing=get_word_data(testing_sentences,"thus")
aclassifier=NaiveBayesClassifier.train(training)

In [78]:
len(training)

119

In [79]:
len(testing)

12
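
To see what such a classifier computes, here is a minimal from-scratch sketch of Bernoulli Naive Bayes with add-one smoothing. It is not NLTK's implementation, and its estimates may differ from NLTK's in details such as smoothing. The function names and the toy data are assumptions for illustration.

```python
import math
from collections import Counter, defaultdict

def train_bernoulli_nb(labelled_data):
    """Estimate class priors and per-class word-presence probabilities (add-one smoothed)."""
    label_counts = Counter(label for _features, label in labelled_data)
    word_counts = defaultdict(Counter)
    vocab = set()
    for features, label in labelled_data:
        for word in features:
            word_counts[label][word] += 1
            vocab.add(word)
    total = sum(label_counts.values())
    priors = {label: n / total for label, n in label_counts.items()}
    # P(word present | label), smoothed over the two outcomes present/absent
    cond = {label: {w: (word_counts[label][w] + 1) / (label_counts[label] + 2) for w in vocab}
            for label in label_counts}
    return priors, cond, vocab

def classify_bernoulli_nb(model, features):
    """Return the label with the highest log posterior for a {word: True} feature dict."""
    priors, cond, vocab = model
    scores = {}
    for label, prior in priors.items():
        score = math.log(prior)
        for w in vocab:              # words outside the training vocabulary are ignored
            p = cond[label][w]
            score += math.log(p) if w in features else math.log(1 - p)
        scores[label] = score
    return max(scores, key=scores.get)

toy_training = [
    ({"heat": True, "sun": True}, "sense_a"),
    ({"heat": True, "thermal": True}, "sense_a"),
    ({"x-ray": True, "dose": True}, "sense_b"),
    ({"dose": True, "patient": True}, "sense_b"),
]
toy_model = train_bernoulli_nb(toy_training)
toy_prediction = classify_bernoulli_nb(toy_model, {"heat": True, "unseen": True})
print(toy_prediction)
```

The Bernoulli model scores absent vocabulary words as well as present ones, which is why the loop runs over the whole vocabulary rather than only over the features of the test sentence.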

### Exercise 2.1
Write a function to evaluate the accuracy of your classifier on some test data.

Test it using `testing`

In [110]:
def evaluate(cls,test_data):
    correct=0
    wrong=0
   
    for doc,label in test_data:
        prediction=cls.classify(doc)
        if prediction==label:
            correct+=1
        else:
            wrong+=1
    if correct+wrong > 0:
      acc=correct/(correct+wrong)
    else:
      acc=1
    print("Accuracy of NB classification on testing data is {} out of {} = {}".format(correct,correct+wrong,acc))

In [81]:
evaluate(aclassifier,testing)

Accuracy of NB classification on testing data is 7 out of 12 = 0.5833333333333334


In [82]:
evaluate(aclassifier,training)

Accuracy of NB classification on testing data is 118 out of 119 = 0.9915966386554622
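
The gap between near-perfect accuracy on the training data and much lower accuracy on the testing data suggests that the classifier has fitted the small training set closely. One way to put the test accuracy in context is to compare it with a baseline that always predicts the most frequent training sense. The sketch below assumes data in the `(features, label)` format produced by `get_word_data()`. The function name and the toy data are assumptions.

```python
from collections import Counter

def most_frequent_sense_accuracy(training, testing):
    """Accuracy obtained by always predicting the most common training label."""
    train_labels = [label for _features, label in training]
    majority_label, _count = Counter(train_labels).most_common(1)[0]
    if not testing:
        return majority_label, None        # accuracy is undefined without test items
    correct = sum(1 for _features, label in testing if label == majority_label)
    return majority_label, correct / len(testing)

# toy data with empty feature dicts, since the baseline ignores the features
toy_training = [({}, "sense_a")] * 7 + [({}, "sense_b")] * 3
toy_testing = [({}, "sense_a")] * 3 + [({}, "sense_b")] * 2
baseline_label, baseline_acc = most_frequent_sense_accuracy(toy_training, toy_testing)
print(baseline_label, baseline_acc)
```

A classifier is only useful for a given word if it beats this baseline on the testing data.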


### Exercise 2.2 (**EXTENSION**)
Write some code to determine the precision of each class.  You might want to adapt / reuse the ConfusionMatrix class from Lab_4_1

In [102]:
class ConfusionMatrix:
  def __init__(self,predictions,goldstandard,classes=("P","N")):
    (self.c1,self.c2)=classes 
    self.TP=0
    self.FP=0
    self.FN=0
    self.TN=0
    for p,g in zip(predictions,goldstandard):
      
      if g==self.c1:
        if p==self.c1:
            self.TP+=1
        else:
            self.FN+=1
      elif p==self.c1:
          self.FP+=1
      else:
          self.TN+=1        
    
  def precision(self):
    #put your code to compute precision here
    if self.TP + self.FP > 0:
      p = self.TP / (self.TP + self.FP)
    else:
      p = 1
    
    return p
  
  def recall(self):
    r=0
    #put your code to compute recall here
    
    return r
  
  def f1(self):
    f1=0
    #put your code to compute f1 here
      
    return f1 
  def display(self):
    print("TP = {}, FN = {}".format(self.TP,self.FN))
    print("FP = {}, TN = {}".format(self.FP, self.TN))

In [103]:
docs,labels=zip(*testing)
predictions=[aclassifier.classify(doc) for doc in docs]
myCM=ConfusionMatrix(predictions,labels,classes=set(labels))

In [104]:
myCM.display()

TP = 3, FN = 4
FP = 1, TN = 4


In [105]:
myCM.precision()

0.75
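
Recall and F1 follow from the same counts. As a worked example, the sketch below applies the standard definitions to the counts displayed above (TP = 3, FP = 1, FN = 4). Unlike the `precision()` method above, it returns `None` rather than 1 when a measure is undefined. The function name `precision_recall_f1` is an assumption.

```python
def precision_recall_f1(tp, fp, fn):
    """Compute precision, recall and F1 from confusion-matrix counts (None when undefined)."""
    precision = tp / (tp + fp) if tp + fp > 0 else None
    recall = tp / (tp + fn) if tp + fn > 0 else None
    if precision is None or recall is None or precision + recall == 0:
        f1 = None
    else:
        f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# counts displayed above for the first sense of "thus"
p, r, f1 = precision_recall_f1(tp=3, fp=1, fn=4)
print(p, r, f1)
```

Here precision is 3/4, recall is 3/7 and F1 is 6/11, so the classifier is fairly precise for this sense but misses more than half of its occurrences.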

In [106]:
classes=list(set(labels))
reversed(classes)

<list_reverseiterator at 0x7f259bc2e358>

In [115]:
def evaluate_precision(cls,test_data,classes=[]):
    if len(test_data)>0:
      docs,labels=zip(*test_data)
      predictions=[cls.classify(doc) for doc in docs]
      if len(classes)==0:
        classes=list(set(labels))
      print(classes)
      cm1 = ConfusionMatrix(predictions,labels,classes=classes)
      cm1.display()
      print("Precision for sense {} is {}".format(classes[0],cm1.precision()))
      cm2 = ConfusionMatrix(predictions,labels,classes=list(reversed(classes)))
      cm2.display()
      print("Precision for sense {} is {}".format(classes[1],cm2.precision()))
      return (cm1.precision(),cm2.precision())
    else:
      return(1,1)
      
evaluate_precision(aclassifier,testing)

["Synset('thus.r.02')", "Synset('therefore.r.01')"]
TP = 3, FN = 4
FP = 1, TN = 4
Precision for sense Synset('thus.r.02') is 0.75
TP = 4, FN = 1
FP = 4, TN = 3
Precision for sense Synset('therefore.r.01') is 0.5


(0.75, 0.5)
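
`evaluate_precision()` builds one confusion matrix per sense, which works when there are exactly two labels. If more labels can occur (for example, a sense that appears in the testing data but not in the training data), per-class precision can be computed in a single pass with counters. This is a sketch under that assumption. The name `per_class_precision` and the toy labels are hypothetical.

```python
from collections import Counter

def per_class_precision(predictions, gold):
    """Precision of each label: correct predictions of it / all predictions of it."""
    predicted = Counter(predictions)
    correct = Counter(p for p, g in zip(predictions, gold) if p == g)
    labels = set(predictions) | set(gold)
    # None marks labels that were never predicted, where precision is undefined
    return {label: (correct[label] / predicted[label] if predicted[label] else None)
            for label in labels}

toy_gold = ["a", "a", "a", "b", "b", "c"]
toy_pred = ["a", "a", "b", "b", "a", "a"]
toy_precision = per_class_precision(toy_pred, toy_gold)
print(toy_precision)
```

Returning `None` for a label that is never predicted keeps undefined cases visible; treating them as 1, as the lab code does, raises any average computed over them.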

### Exercise 2.3 (**EXTENSION**)
Write a function `train_and_test()` which gets the appropriate training and testing data for a given word, builds a classifier and outputs the precision with which each class is predicted

In [114]:
def train_and_test(word):
    training=get_word_data(training_sentences,word)
    testing=get_word_data(testing_sentences,word)
    print(len(training),len(testing))
    classifier=NaiveBayesClassifier.train(training)
    evaluate(classifier,testing)
    #docs,labels=zip(*testing)
    #print(set(labels))
    train_docs,train_labels=zip(*training)
    print(set(train_labels))
    p1,p2=evaluate_precision(classifier,testing,classes=list(set(train_labels)))
    return (p1,p2)
    
train_and_test("too")

221 25
Accuracy of NB classification on testing data is 15 out of 25 = 0.6
{"Synset('excessively.r.01')", "Synset('besides.r.02')"}
["Synset('excessively.r.01')", "Synset('besides.r.02')"]
TP = 14, FN = 4
FP = 6, TN = 1
Precision for sense Synset('excessively.r.01') is 0.7
TP = 1, FN = 6
FP = 4, TN = 14
Precision for sense Synset('besides.r.02') is 0.2


(0.7, 0.2)

### Exercise 2.4 (**EXTENSION**)
* Run `train_and_test()` on the top 50 candidate words identified earlier in the exercise.  
* Display results in a pandas dataframe
* Calculate average precision

In [99]:
allwords=[word for (word,f,s) in find_candidates(find_sense_distributions(training_sentences))]
top50=allwords[:50]

In [100]:
top50

['too',
 'thus',
 'really',
 'described',
 'basis',
 'months',
 'caused',
 'instead',
 'radiation',
 'cells',
 'won',
 'labor',
 'agreed',
 'becoming',
 'discussion',
 'asking',
 'department',
 'particles',
 'churches',
 'destroy',
 'spend',
 'extent',
 'funds',
 'quickly',
 'wondered',
 'neighborhood',
 'objective',
 'sleep',
 'procedure',
 'containing',
 'university',
 'downtown',
 'listen',
 'minimum',
 'concrete',
 'payment',
 'flowers',
 'leaders',
 'intensity',
 'skill',
 'resulted',
 'sing',
 'reasonable',
 'democratic',
 'situations',
 'plants',
 'scientific',
 'exception',
 'task',
 'curve']

In [116]:
results=[]
for word in top50:
  print(word)
  p1,p2=train_and_test(word)
  results.append((word,p1,p2,(p1+p2)/2))



too
221 25
Accuracy of NB classification on testing data is 15 out of 25 = 0.6
{"Synset('excessively.r.01')", "Synset('besides.r.02')"}
["Synset('excessively.r.01')", "Synset('besides.r.02')"]
TP = 14, FN = 4
FP = 6, TN = 1
Precision for sense Synset('excessively.r.01') is 0.7
TP = 1, FN = 6
FP = 4, TN = 14
Precision for sense Synset('besides.r.02') is 0.2
thus
119 12
Accuracy of NB classification on testing data is 7 out of 12 = 0.5833333333333334
{"Synset('thus.r.02')", "Synset('therefore.r.01')"}
["Synset('thus.r.02')", "Synset('therefore.r.01')"]
TP = 3, FN = 4
FP = 1, TN = 4
Precision for sense Synset('thus.r.02') is 0.75
TP = 4, FN = 1
FP = 4, TN = 3
Precision for sense Synset('therefore.r.01') is 0.5
really
88 13
Accuracy of NB classification on testing data is 6 out of 13 = 0.46153846153846156
{"Synset('actually.r.01')", "Synset('truly.r.01')"}
["Synset('actually.r.01')", "Synset('truly.r.01')"]
TP = 1, FN = 3
FP = 4, TN = 5
Precision for sense Synset('actually.r.01') is 0.2
TP

In [118]:
results

[('too', 0.7, 0.2, 0.44999999999999996),
 ('thus', 0.75, 0.5, 0.625),
 ('really', 0.2, 0.625, 0.4125),
 ('described', 0.5833333333333334, 0.875, 0.7291666666666667),
 ('basis', 0.5714285714285714, 0.5, 0.5357142857142857),
 ('months', 1, 0.0, 0.5),
 ('caused', 0.0, 1.0, 0.5),
 ('instead', 0.5, 0.8571428571428571, 0.6785714285714286),
 ('radiation', 1, 1, 1.0),
 ('cells', 0.5, 0.2222222222222222, 0.3611111111111111),
 ('won', 0.8181818181818182, 1, 0.9090909090909092),
 ('labor', 1.0, 1, 1.0),
 ('agreed', 0.8571428571428571, 1, 0.9285714285714286),
 ('becoming', 1, 0.75, 0.875),
 ('discussion', 0.0, 0.3333333333333333, 0.16666666666666666),
 ('asking', 1.0, 1.0, 1.0),
 ('department', 0.0, 1.0, 0.5),
 ('particles', 1, 1, 1.0),
 ('churches', 1, 1, 1.0),
 ('destroy', 1, 0.6666666666666666, 0.8333333333333333),
 ('spend', 0.3333333333333333, 0.8888888888888888, 0.611111111111111),
 ('extent', 1.0, 0.6, 0.8),
 ('funds', 1, 1, 1.0),
 ('quickly', 1.0, 0.8, 0.9),
 ('wondered', 0.571428571428571

In [119]:
import pandas as pd
df=pd.DataFrame(results,columns=['word','sense 1 precision','sense 2 precision','average precision'])
display(df)

| | word | sense 1 precision | sense 2 precision | average precision |
|---|---|---|---|---|
| 0 | too | 0.7 | 0.2 | 0.45 |
| 1 | thus | 0.75 | 0.5 | 0.625 |
| 2 | really | 0.2 | 0.625 | 0.4125 |
| 3 | described | 0.583333 | 0.875 | 0.729167 |
| 4 | basis | 0.571429 | 0.5 | 0.535714 |
| 5 | months | 1.0 | 0.0 | 0.5 |
| 6 | caused | 0.0 | 1.0 | 0.5 |
| 7 | instead | 0.5 | 0.857143 | 0.678571 |
| 8 | radiation | 1.0 | 1.0 | 1.0 |
| 9 | cells | 0.5 | 0.222222 | 0.361111 |


In [120]:
df.describe()

| | sense 1 precision | sense 2 precision | average precision |
|---|---|---|---|
| count | 50.0 | 50.0 | 50.0 |
| mean | 0.705307 | 0.676365 | 0.690836 |
| std | 0.373347 | 0.396284 | 0.249866 |
| min | 0.0 | 0.0 | 0.166667 |
| 25% | 0.5 | 0.375 | 0.5 |
| 50% | 0.954545 | 0.881944 | 0.691558 |
| 75% | 1.0 | 1.0 | 0.982143 |
| max | 1.0 | 1.0 | 1.0 |
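
The `mean` row is a macro-average: each word contributes equally, however many test sentences it has. The same calculation can be sketched in plain Python. The example below uses only the ten rows displayed above, so its result differs from the mean over all 50 words. The variable names are assumptions.

```python
from statistics import mean

# (word, sense 1 precision, sense 2 precision) for the ten rows displayed above
shown_rows = [
    ("too", 0.7, 0.2), ("thus", 0.75, 0.5), ("really", 0.2, 0.625),
    ("described", 0.583333, 0.875), ("basis", 0.571429, 0.5),
    ("months", 1.0, 0.0), ("caused", 0.0, 1.0), ("instead", 0.5, 0.857143),
    ("radiation", 1.0, 1.0), ("cells", 0.5, 0.222222),
]

# average the two per-sense precisions for each word, then average over words
per_word = {word: (p1 + p2) / 2 for word, p1, p2 in shown_rows}
macro_average = mean(per_word.values())
print(round(macro_average, 3))
```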
