# CS1671 Assignment 4: Vector Space Models
### Jacob Emmerson
Due: November 20th, 2023 @ 11:59pm

This assignment focuses primarily on the material from Chapter 6 of *Speech and Language Processing* (3rd Ed.).

**Primary Question:** How good are vector space representations built using Shakespeare data?


*Sources*
- https://numpy.org/doc/stable/reference/routines.array-creation.html (NumPy Documentation for Array Initialization)
- https://docs.scipy.org/doc/scipy/reference/generated/scipy.sparse.csr_matrix.html (Compressed Sparse Rows for Memory Optimization during PPMI calculation)

---

In [39]:
from hw4_skeleton_jte27 import *
from scipy.stats import kendalltau
import pandas as pd

In [91]:
vocab_to_index = dict(zip(vocab, range(0, len(vocab))))
def get_ranks(matrix, words, vocab, vocab_to_index):
    rank_df = pd.DataFrame({'rank' : range(1,11)}).set_index('rank')
    for word in words:
        ranks, scores = rank_words(vocab_to_index[word], matrix)
        t10_words = [vocab[r] for r in ranks[:10]]
        t10_scores = np.round(scores[:10], 4)

        rank_df[word] = tuple(zip(t10_words,t10_scores))
    return rank_df

## Formatting Data

Due to the limited memory on my machine (WSL2, max 6GB of memory with 4 processors), I subset the data.

The same subset is used for every experiment. `subset_data` selects either the first $N$ documents or a random sample of $N$ documents from the given document list; to maintain reproducibility, I use the first $N$ documents (here $N = 12$). Additionally, the vocabulary is restricted to the words that occur within the document subset, which reduces memory usage when computing PPMI values.
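The subsetting step described above can be sketched roughly as follows. This is a hypothetical re-creation, not the code in `hw4_skeleton_jte27`: the name `subset_data_sketch`, the `seed` parameter, and the toy data are assumptions made for illustration. It assumes each tuple is `(document_name, tokenized_line)`.

```python
import random as rnd


def subset_data_sketch(line_tuples, document_names, n_docs, random=False, seed=0):
    """Hypothetical sketch: keep n_docs documents and rebuild the vocabulary."""
    if random:
        # A fixed seed keeps the random variant reproducible as well (assumption).
        kept_docs = rnd.Random(seed).sample(document_names, n_docs)
    else:
        kept_docs = document_names[:n_docs]
    kept = set(kept_docs)

    # Keep only the lines that belong to a retained document.
    kept_tuples = [(doc, tokens) for doc, tokens in line_tuples if doc in kept]

    # The vocabulary shrinks to the words that actually occur in the subset.
    vocab = sorted({tok for _, tokens in kept_tuples for tok in tokens})
    return kept_tuples, kept_docs, vocab


# Toy data, invented for this example.
toy_tuples = [
    ("Hamlet", ["to", "be", "or", "not"]),
    ("Macbeth", ["out", "damned", "spot"]),
    ("Hamlet", ["the", "rest", "is", "silence"]),
]
toy_docs = ["Hamlet", "Macbeth"]
sub_tuples, sub_docs, sub_vocab = subset_data_sketch(toy_tuples, toy_docs, 1)
print(sub_docs, len(sub_vocab))
```

Taking the first `n_docs` entries is deterministic, which is why it is the simpler choice for reproducibility.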

In [74]:
t, dn, v = read_in_shakespeare()
tuples, document_names, vocab = subset_data(t, dn, 12, random = False)

In [75]:
print(f"Original |V| = {len(v)}")
print(f"Subset |V| = {len(vocab)}")
print(f"{len(v) - len(vocab)} words not included.")

Original |V| = 22602
Subset |V| = 14592
8010 words not included.


The following documents are used:

In [76]:
document_names

['Henry IV',
 'Alls well that ends well',
 'Loves Labours Lost',
 'Taming of the Shrew',
 'Antony and Cleopatra',
 'Coriolanus',
 'Hamlet',
 'A Midsummer nights dream',
 'Merry Wives of Windsor',
 'Romeo and Juliet',
 'Richard II',
 'King John']

---
## Creating Matrices

### Raw Frequency Matrices

In [7]:
print("Computing term document matrix...")
td_matrix = create_term_document_matrix(tuples, document_names, vocab)

Computing term document matrix...


In [8]:
print("Computing term context matrix...")
tc_matrix = create_term_context_matrix(tuples, vocab, context_window_size=2)

Computing term context matrix...


**Question:** Can a word co-occur with itself in a Term-Context Matrix? 

While it is possible to allow a word to co-occur with itself in a term-context matrix, I do not permit this. The goal of a term-context matrix is to represent the relationships between a target word and the words around it, so counting the target's occurrence with itself is not necessary.

The situation changes if another instance of the target word falls within its own context window. An example is a sentence like "He is ***very*** *very* excited", where the target word is emphasized in bold. This sentence is unlikely to occur in one of Shakespeare's plays; however, it shows how a target word can legitimately co-occur with itself. With a large context window, we may see words such as "the" or forms of "to be" occurring with themselves.

Additionally, because cosine similarity is the metric of interest in these experiments, we do not want self-counts to inflate the frequencies of the more common words.
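One plausible reading of the exclusion rule is that only the target's own position is skipped, so a repeated neighbor still counts. The sketch below illustrates that reading on the example sentence. The function name `cooccurrence_counts` is hypothetical, and counts are stored in a `Counter` keyed by `(target, context)` pairs rather than in the dense $n \times n$ array used in the notebook.

```python
from collections import Counter


def cooccurrence_counts(sentences, context_window_size=1):
    """Count (target, context) pairs, skipping the target's own position."""
    counts = Counter()
    for tokens in sentences:
        for i, target in enumerate(tokens):
            lo = max(0, i - context_window_size)
            hi = min(len(tokens), i + context_window_size + 1)
            for j in range(lo, hi):
                if j == i:
                    continue  # position i is the target, not part of its context
                counts[(target, tokens[j])] += 1
    return counts


toy_counts = cooccurrence_counts([["he", "is", "very", "very", "excited"]], 1)
print(toy_counts[("very", "very")], toy_counts[("he", "he")])
```

Under this reading, "very" co-occurs with itself only because a second "very" sits next to it, while "he" never does.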

In [77]:
def ctcm(line_tuples, vocab, context_window_size=1):
    """Returns a numpy array containing the term context matrix for the input lines.

    Inputs:
      line_tuples: A list of tuples, containing the name of the document and
      a tokenized line from that document.
      vocab: A list of the tokens in the vocabulary

    # NOTE: THIS DOCSTRING WAS UPDATED ON JAN 24, 12:39 PM.

    Let n = len(vocab).

    Returns:
      tc_matrix: A nxn numpy array where A_ij contains the frequency with which
          word j was found within context_window_size to the left or right of
          word i in any sentence in the tuples.
    """
    n = len(vocab)
    cws = context_window_size

    word_index = dict(zip(vocab, range(len(vocab))))
    tc_matrix = np.zeros(shape = (n,n), dtype = np.int32)

    for line in line_tuples:
        sentence = line[1] # don't care about documents

        for i in range(len(sentence)):
            target_index = word_index[sentence[i]] # target word (row index)
            
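            L_win = sentence[max(0, i - cws):i] # left window; upper bound is exclusive
            # Slicing past the end of a list is safe in Python, so no bounds check
            # is needed. Starting the right window at i (not i + 1) is what makes
            # this version self-inclusive: the target is counted in its own context.
            U_win = sentence[i:(i + cws + 1)]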

            window = L_win + U_win
            for word in window: # add the word instances to the tc_matrix
                wj = word_index[word] #context index
                tc_matrix[target_index,wj] += 1

    return tc_matrix

In [78]:
bad_tc = ctcm(tuples, vocab, context_window_size=2)

*`ctcm` is a version of `create_term_context_matrix` with self-inclusion.

**Top 10 Similar Words using a Self-Inclusive Term Context Matrix**

In [92]:
get_ranks(bad_tc, ['juliet'], vocab, vocab_to_index)

rank,juliet
1,"(pined, 0.5597)"
2,"(waken, 0.4198)"
3,"(and, 0.2526)"
4,"(drinkings, 0.2021)"
5,"(scamble, 0.1999)"
6,"(metheglins, 0.1999)"
7,"(wills, 0.1941)"
8,"(bleeding, 0.1933)"
9,"(wakes, 0.1871)"
10,"(enter, 0.186)"


**Exclusive Term Context Matrix**

In [93]:
get_ranks(tc_matrix, ['juliet'], vocab, vocab_to_index)

rank,juliet
1,"(helena, 0.7085)"
2,"(therefore, 0.6935)"
3,"(nurse, 0.6871)"
4,"(demetrius, 0.6843)"
5,"(bertram, 0.681)"
6,"(costard, 0.6781)"
7,"(falstaff, 0.6771)"
8,"(caesar, 0.676)"
9,"(her, 0.6709)"
10,"(others, 0.6697)"


From the ranks above for the word 'juliet', we can see that a term-context matrix which allows words to co-occur with themselves tends to rank irrelevant words much higher. Some of these are very common words, such as "and" and "enter", and for most of them there is no intuitive reason for the similarity.

With the exclusive term-context matrix, the top-ranked words have higher cosine similarities and make more sense as being related to "juliet". Explanations for these ranks are expanded on in **Model Evaluation**.

### Weighted Matrices

In [9]:
print("Computing tf-idf matrix...")
tf_idf_matrix = create_tf_idf_matrix(td_matrix)

Computing tf-idf matrix...


To save memory, I directly calculate the probabilities where possible.

Let $N$ denote the total count summed over all cells of the term-context matrix, whose rows are indexed by target words $w$ and whose columns are indexed by context words $c$.

\begin{align*}
PMI(w,c) &= \log_2 \frac{P(w,c)}{P(w)P(c)} \\
         &= \log_2 \frac{counts(w,c)/N}{(counts(w)/N) \cdot (counts(c)/N)} \\
         &= \log_2 \frac{counts(w,c)}{(1/N)counts(w)counts(c)} \\
         &= \log_2 \frac{N \cdot counts(w,c)}{counts(w) counts(c)}
\end{align*}

PPMI, defined as $\max(PMI(w,c), 0)$, can now be calculated efficiently using vector arithmetic and NumPy's functions for handling negative values and NaNs.
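As a minimal sketch of the final line of the derivation, the dense NumPy version below computes $\log_2 \frac{N \cdot counts(w,c)}{counts(w)\,counts(c)}$ for every cell at once and then clips negatives. The function name `ppmi_sketch` and the toy matrix are assumptions for illustration; the notebook's `create_ppmi_matrix` appears to work with SciPy sparse matrices instead, which this sketch does not attempt to reproduce.

```python
import numpy as np


def ppmi_sketch(counts):
    """Dense PPMI from a term-context count matrix (illustrative only)."""
    counts = np.asarray(counts, dtype=np.float64)
    total = counts.sum()                                # N
    word_counts = counts.sum(axis=1, keepdims=True)     # counts(w), shape (n, 1)
    context_counts = counts.sum(axis=0, keepdims=True)  # counts(c), shape (1, n)

    # log2(0) gives -inf and 0/0 gives nan; silence the warnings, then clean up.
    with np.errstate(divide="ignore", invalid="ignore"):
        pmi = np.log2((counts * total) / (word_counts * context_counts))

    pmi = np.nan_to_num(pmi, nan=0.0, posinf=0.0, neginf=0.0)
    return np.maximum(pmi, 0.0)  # keep only positive PMI values


# Toy counts, invented for this example.
toy = np.array([[0, 2, 1],
                [2, 0, 0],
                [1, 0, 0]])
print(ppmi_sketch(toy))
```

Broadcasting the `(n, 1)` row totals against the `(1, n)` column totals produces the full denominator without an explicit loop.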

In [10]:
print("Computing ppmi matrix...")
ppmi_matrix = create_ppmi_matrix(tc_matrix)

Computing ppmi matrix...


  return np.true_divide(self.todense(), other)
  ppmi_mat = np.log2((tcm * total) / (word_counts * context_counts))


*Ignore the runtime warnings above. They come from taking log(0), and the resulting values are handled when the negative PMI values are removed.

---
## Model Evaluation

To evaluate each vector space model, I find the 10 words most similar to each of the following:
- juliet
- romeo
- royal
- evil
- wicked
- he

Note that these rankings will likely change if the full dataset is used; they were computed using a subset of 12 documents from the 36-document pool.

In [105]:
words = ['juliet', 'romeo', 'royal', 'evil', 'wicked', 'he']

The function `get_ranks` computes the most similar words using the `rank_words` function, as outlined in the assignment guidelines, and stores them in a DataFrame in which each column is one of the chosen words and each row holds a (word, cosine similarity) tuple for that rank.
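For reference, a ranking function with the same return shape as `rank_words` (indices first, then scores) might look like the sketch below. The names `cosine_similarity_sketch` and `rank_rows_sketch` are hypothetical, and dropping the target from its own ranking is an assumption; the skeleton's actual behavior is not shown in this notebook.

```python
import numpy as np


def cosine_similarity_sketch(u, v):
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    denom = np.linalg.norm(u) * np.linalg.norm(v)
    return float(u @ v / denom) if denom > 0 else 0.0


def rank_rows_sketch(target_index, matrix):
    """Return row indices sorted by decreasing similarity to the target row."""
    scores = np.array(
        [cosine_similarity_sketch(matrix[target_index], row) for row in matrix]
    )
    order = np.argsort(-scores, kind="stable")
    order = order[order != target_index]  # assumption: the target is not ranked
    return order, scores[order]


# Toy matrix, invented for this example: row 1 is parallel to row 0.
toy_matrix = np.array([[1, 1, 0],
                       [2, 2, 0],
                       [0, 1, 1],
                       [0, 0, 3]])
toy_order, toy_scores = rank_rows_sketch(0, toy_matrix)
print(toy_order, np.round(toy_scores, 4))
```

Because cosine similarity ignores vector length, row 1 ranks first even though its counts are twice those of row 0.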

### Term-Document Matrix

In [106]:
print(
    '\nThe 10 most similar words to "%s" using cosine-similarity on term-document frequency matrix are:'
    % (words)
)
get_ranks(td_matrix, words, vocab, vocab_to_index)


The 10 most similar words to "['juliet', 'romeo', 'royal', 'evil', 'wicked', 'he']" using cosine-similarity on term-document frequency matrix are:


rank,juliet,romeo,royal,evil,wicked,he
1,"(displant, 1.0)","(gossamer, 1.0)","(sovereign, 0.9676)","(suit, 0.922)","(offence, 0.9118)","(him, 0.9928)"
2,"(heretics, 1.0)","(dowdy, 1.0)","(forego, 0.9416)","(beam, 0.9152)","(question, 0.9032)","(have, 0.9894)"
3,"(shapen, 1.0)","(rooteth, 1.0)","(boast, 0.9395)","(employment, 0.9058)","(conscience, 0.9003)","(had, 0.9886)"
4,"(hopeful, 1.0)","(engrossing, 1.0)","(crown, 0.9385)","(foolery, 0.9014)","(image, 0.8997)","(cannot, 0.9874)"
5,"(tilts, 1.0)","(scaring, 1.0)","(highness, 0.9354)","(object, 0.887)","(top, 0.8986)","(would, 0.986)"
6,"(vowel, 1.0)","(carelessly, 1.0)","(burthen, 0.9301)","(rags, 0.8825)","(gifts, 0.8943)","(known, 0.9796)"
7,"(grievance, 1.0)","(hildings, 1.0)","(throw, 0.927)","(physic, 0.8825)","(hell, 0.8918)","(them, 0.9791)"
8,"(puffs, 1.0)","(stakes, 1.0)","(sorrow, 0.9238)","(quoted, 0.8825)","(distemper, 0.8911)","(they, 0.975)"
9,"(behests, 1.0)","(heareth, 1.0)","(kingly, 0.9174)","(breach, 0.8825)","(ambition, 0.8858)","(fellow, 0.9729)"
10,"(stakes, 1.0)","(earthen, 1.0)","(glory, 0.917)","(german, 0.8825)","(effect, 0.8813)","(than, 0.9712)"


**Question**: In our term-document matrix, the rows are word vectors of $D$ dimensions, where $D$ is the number of documents. Do you think that’s enough to represent the meaning of words?

I do not think that is enough to give a good approximation of the meaning of words. Document-level counts, regardless of the number of documents, cannot capture every nuance of the English language, particularly Shakespearean language, in which words are frequently invented. However, for a simple word such as "royal", the model is able to select generally similar words. For proper nouns such as 'juliet' or 'romeo', 12 documents are not enough to find similar words.
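A likely reason for the 1.0 scores in the 'juliet' and 'romeo' columns is that any two words occurring in exactly the same single play have parallel term-document vectors, whatever their counts. The sketch below illustrates this with invented counts. It assumes, without verifying it here, that 'juliet' occurs in only one of the 12 plays.

```python
import math


def cosine(u, v):
    """Plain-Python cosine similarity between two equal-length count vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


# Hypothetical counts over 12 plays; only index 9 (one play) is nonzero.
name_row = [0] * 12
rare_row = [0] * 12
name_row[9] = 180  # a character name used often in that one play
rare_row[9] = 1    # a rare word that appears once in the same play

spread_row = [1] * 12  # a word that appears once in every play

print(cosine(name_row, rare_row), round(cosine(name_row, spread_row), 4))
```

With only 12 dimensions, a word confined to one play cannot be distinguished from any other word confined to that play, which is consistent with the ties at 1.0 in the table.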

### Term-Context Matrix

In [107]:
print(
    '\nThe 10 most similar words to "%s" using cosine-similarity on term-context frequency matrix are:'
    % (words)
)
get_ranks(tc_matrix, words, vocab, vocab_to_index)


The 10 most similar words to "['juliet', 'romeo', 'royal', 'evil', 'wicked', 'he']" using cosine-similarity on term-context frequency matrix are:


rank,juliet,romeo,royal,evil,wicked,he
1,"(helena, 0.7085)","(she, 0.7231)","(presence, 0.7248)","(only, 0.5112)","(most, 0.6687)","(she, 0.9206)"
2,"(therefore, 0.6935)","(dead, 0.6864)","(hand, 0.7151)","(this, 0.4996)","(wall, 0.661)","(it, 0.8611)"
3,"(nurse, 0.6871)","(there, 0.6861)","(head, 0.6831)","(which, 0.4959)","(great, 0.6495)","(who, 0.7644)"
4,"(demetrius, 0.6843)","(here, 0.6851)","(honour, 0.6822)","(t, 0.49)","(such, 0.642)","(there, 0.7499)"
5,"(bertram, 0.681)","(he, 0.6815)","(state, 0.6804)","(loves, 0.4854)","(sweet, 0.6391)","(so, 0.7484)"
6,"(costard, 0.6781)","(tybalt, 0.6634)","(sight, 0.6795)","(but, 0.4851)","(pit, 0.6332)","(that, 0.7287)"
7,"(falstaff, 0.6771)","(caesar, 0.6598)","(blood, 0.6755)","(aught, 0.4834)","(fair, 0.6249)","(dead, 0.7256)"
8,"(caesar, 0.676)","(lucentio, 0.6394)","(fair, 0.6754)","(something, 0.4806)","(woodcock, 0.6233)","(this, 0.7239)"
9,"(her, 0.6709)","(juliet, 0.6382)","(great, 0.6713)","(himself, 0.4801)","(commoner, 0.6228)","(which, 0.7124)"
10,"(others, 0.6697)","(hamlet, 0.6164)","(company, 0.6693)","(has, 0.4794)","(form, 0.6228)","(indeed, 0.7111)"


Using the exclusive term-context matrix, we can see that this representation does slightly better at finding words similar to proper nouns. The word most similar to "juliet" is "helena", another fictional character involved in a complicated love story. It is slightly surprising that Romeo's top word is "she"; however, in Shakespeare's play he is quite the romantic poet, so it is plausible that his name would co-occur with "she" somewhat frequently. Looking at the other words, 'royal' has reasonably similar words in both representations. In the term-document matrix, words such as 'sovereign', 'crown', and 'highness' are ones I would consider very similar to 'royal'. These do not appear in the term-context ranking, which instead finds words that are similar in a slightly different sense: 'royal' is closer to words associated with ruling or governing positions, such as "head", "state", and "presence".

Both representations struggle with "evil": neither finds words that I would consider truly similar to it.

Overall, both vector representations have their issues, and it is difficult to say whether one is objectively better than the other. The results might differ with more documents; in any case, depending on the type of similarity a person is looking for, one representation may be preferred over the other.

To elaborate on the earlier question of whether words can co-occur with themselves, we can get the ranks using the "bad" term-context matrix as well.

In [114]:
get_ranks(bad_tc, words, vocab, vocab_to_index)

rank,juliet,romeo,royal,evil,wicked,he
1,"(pined, 0.5597)","(retorts, 0.5655)","(couplement, 0.5771)","(excels, 0.3483)","(converts, 0.3434)","(heareth, 0.8155)"
2,"(waken, 0.4198)","(jour, 0.497)","(chasing, 0.4883)","(is, 0.2083)","(dissembling, 0.3235)","(capers, 0.7866)"
3,"(and, 0.2526)","(hist, 0.4324)","(presences, 0.4883)","(this, 0.2029)","(commoner, 0.2873)","(numbered, 0.7407)"
4,"(drinkings, 0.2021)","(snatching, 0.3885)","(selves, 0.4078)","(he, 0.1994)","(corpulent, 0.2752)","(parallels, 0.7286)"
5,"(scamble, 0.1999)","(lengthens, 0.3729)","(occupation, 0.3987)","(unbuckles, 0.1947)","(copatain, 0.2682)","(stirreth, 0.6895)"
6,"(metheglins, 0.1999)","(exiled, 0.3415)","(trumpeters, 0.3844)","(trusts, 0.1947)","(a, 0.2635)","(nill, 0.646)"
7,"(wills, 0.1941)","(rosemary, 0.2875)","(fronts, 0.3332)","(recorded, 0.1947)","(o, 0.2614)","(replenished, 0.6291)"
8,"(bleeding, 0.1933)","(slew, 0.2864)","(bail, 0.3206)","(default, 0.1947)","(fiend, 0.2573)","(covetous, 0.6291)"
9,"(wakes, 0.1871)","(banished, 0.2321)","(empress, 0.247)","(drivelling, 0.1947)","(charitable, 0.2538)","(tilts, 0.6044)"
10,"(enter, 0.186)","(doff, 0.2143)","(lists, 0.2177)","(abhominable, 0.1947)","(combination, 0.2448)","(enjoys, 0.6044)"


We can see again that the most similar words do not make much sense, so it is better to use a matrix which does not allow words to co-occur with themselves.

### Term Frequency Inverse Document Frequency Matrix

In [108]:
print(
    '\nThe 10 most similar words to "%s" using cosine-similarity on tf-idf matrix are:'
    % (words)
)
get_ranks(tf_idf_matrix, words, vocab, vocab_to_index)


The 10 most similar words to "['juliet', 'romeo', 'royal', 'evil', 'wicked', 'he']" using cosine-similarity on tf-idf matrix are:


rank,juliet,romeo,royal,evil,wicked,he
1,"(gadding, 1.0)","(worshipp, 1.0)","(confound, 0.9622)","(beam, 0.9656)","(owner, 0.9214)","(worthy, 0)"
2,"(prevails, 1.0)","(scathe, 1.0)","(highness, 0.9533)","(physic, 0.9124)","(damned, 0.9063)","(clearness, 0)"
3,"(advances, 1.0)","(blubbering, 1.0)","(sovereign, 0.9495)","(shoot, 0.9048)","(hole, 0.9005)","(meet, 0)"
4,"(collar, 1.0)","(cheering, 1.0)","(gracious, 0.9445)","(foolery, 0.9016)","(step, 0.8971)","(pain, 0)"
5,"(festering, 1.0)","(easter, 1.0)","(kingly, 0.9423)","(suit, 0.8957)","(everlasting, 0.8969)","(pruning, 0)"
6,"(benedicite, 1.0)","(angelica, 1.0)","(crown, 0.9376)","(employment, 0.8956)","(start, 0.8822)","(dainty, 0)"
7,"(cherishing, 1.0)","(riddling, 1.0)","(ends, 0.935)","(meed, 0.894)","(manners, 0.8791)","(reproach, 0)"
8,"(absolved, 1.0)","(hoarse, 1.0)","(majesty, 0.9308)","(object, 0.8784)","(cursed, 0.8766)","(buds, 0)"
9,"(heartless, 1.0)","(perjuries, 1.0)","(superfluous, 0.9228)","(rags, 0.8706)","(fee, 0.8763)","(slily, 0)"
10,"(poultice, 1.0)","(usest, 1.0)","(war, 0.9132)","(behaviors, 0.8706)","(laid, 0.8755)","(william, 0)"


### Positive Pointwise Mutual Information

In [109]:
print(
    '\nThe 10 most similar words to "%s" using cosine-similarity on PPMI matrix are:'
    % (words)
)
get_ranks(ppmi_matrix, words, vocab, vocab_to_index)


The 10 most similar words to "['juliet', 'romeo', 'royal', 'evil', 'wicked', 'he']" using cosine-similarity on PPMI matrix are:


rank,juliet,romeo,royal,evil,wicked,he
1,"(capulet, 0.1819)","(mercutio, 0.2296)","(encountering, 0.1827)","(discoveries, 0.3079)","(spectacle, 0.2403)","(hath, 0.1525)"
2,"(tybalt, 0.1721)","(hist, 0.2165)","(carters, 0.1406)","(ministering, 0.2747)","(bombast, 0.2196)","(that, 0.1416)"
3,"(katarina, 0.1595)","(bon, 0.1892)","(dismantled, 0.1226)","(clerks, 0.2669)","(absurd, 0.1914)","(it, 0.1257)"
4,"(paris, 0.1534)","(jour, 0.1836)","(peril, 0.1184)","(crawl, 0.2594)","(disperse, 0.1841)","(is, 0.1251)"
5,"(executioners, 0.1521)","(booted, 0.1673)","(prescription, 0.1175)","(interrupt, 0.25)","(thwarted, 0.1778)","(when, 0.1185)"
6,"(spit, 0.1503)","(kinsman, 0.1543)","(itches, 0.1173)","(unfit, 0.2277)","(angelical, 0.1754)","(she, 0.1136)"
7,"(demurely, 0.1469)","(tybalt, 0.1538)","(commended, 0.1166)","(unseemly, 0.2229)","(diverted, 0.1718)","(him, 0.1105)"
8,"(stabbed, 0.1466)","(slew, 0.1457)","(detestable, 0.1144)","(hunting, 0.2194)","(cools, 0.1652)","(me, 0.1105)"
9,"(rosencrantz, 0.1447)","(pined, 0.145)","(usurped, 0.114)","(conspires, 0.2166)","(resolutely, 0.1626)","(his, 0.1099)"
10,"(behove, 0.1437)","(friar, 0.1343)","(maculate, 0.1124)","(brine, 0.2037)","(obscene, 0.1574)","(himself, 0.1022)"


---
## Extra Credit

In [110]:
simlex = pd.read_csv('./SimLex-999/SimLex-999.txt', sep = '\t')
simlex.head()

Unnamed: 0,word1,word2,POS,SimLex999,conc(w1),conc(w2),concQ,Assoc(USF),SimAssoc333,SD(SimLex)
0,old,new,A,1.58,2.72,2.81,2,7.25,1,0.41
1,smart,intelligent,A,9.2,1.75,2.46,1,7.11,1,0.67
2,hard,difficult,A,8.77,3.76,2.21,2,5.94,1,1.19
3,happy,cheerful,A,9.55,2.56,2.34,1,5.85,1,2.18
4,hard,easy,A,0.95,3.76,2.07,2,5.82,1,0.93


We are primarily interested in the SimLex999 column, which contains the human-annotated similarity scores on a scale from 0 to 10.

In [111]:
print(f"{simlex.shape = }")

simlex.shape = (999, 10)


In [112]:
sims = {
    'words' : [],
    'Term-Document' : [],
    'Term-Context' : [],
    'TF-IDF' : [],
    'PPMI' : [],
    'simlex' : []
}
missed = []

for row in range(simlex.shape[0]):
    word1 = simlex['word1'][row]
    word2 = simlex['word2'][row]
    if (word1 not in vocab) or (word2 not in vocab):
        missed.append((word1,word2))
        continue

    w1i = vocab_to_index[word1]
    w2i = vocab_to_index[word2]

    sims['words'].append((word1,word2))
    sims['Term-Document'].append(compute_cosine_similarity(td_matrix[w1i,:],td_matrix[w2i,:]))
    sims['Term-Context'].append(compute_cosine_similarity(tc_matrix[w1i,:],tc_matrix[w2i,:]))
    sims['TF-IDF'].append(compute_cosine_similarity(tf_idf_matrix[w1i,:],tf_idf_matrix[w2i,:]))
    sims['PPMI'].append(compute_cosine_similarity(ppmi_matrix[w1i,:],ppmi_matrix[w2i,:]))
    sims['simlex'].append(simlex['SimLex999'][row])

In [113]:
print(f"H0 : tau = 0")
print(f"H1 : tau != 0\n")

for k,v in sims.items():
    if k in ['words', 'simlex']: continue

    kt = kendalltau(v, sims['simlex'], alternative='two-sided')
    print(f"Corr. between Cosine Similarity with {k} and SimLex999 Human Annotations = {kt.statistic}")
    print(f"p-value = {round(kt.pvalue,6)}\n")

H0 : tau = 0
H1 : tau != 0

Corr. between Cosine Similarity with Term-Document and SimLex999 Human Annotations = -0.10189564170793766
p-value = 0.000297

Corr. between Cosine Similarity with Term-Context and SimLex999 Human Annotations = -0.07814316791152155
p-value = 0.005397

Corr. between Cosine Similarity with TF-IDF and SimLex999 Human Annotations = -0.03419435165347337
p-value = 0.255945

Corr. between Cosine Similarity with PPMI and SimLex999 Human Annotations = -0.024001290161916693
p-value = 0.393026
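To make the reported statistic concrete, the sketch below computes the simplest form of Kendall's tau (tau-a) by hand: the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs. The function name `kendall_tau_a` and the toy scores are invented for illustration. `scipy.stats.kendalltau` uses the tie-corrected tau-b variant by default, which matches tau-a only when there are no ties.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """Tau-a: (concordant - discordant) / number of pairs; no tie correction."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        s = (x1 - x2) * (y1 - y2)
        if s > 0:
            concordant += 1   # both rankings order this pair the same way
        elif s < 0:
            discordant += 1   # the rankings disagree on this pair
    n_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / n_pairs


model_scores = [0.9, 0.1, 0.5, 0.7]  # hypothetical cosine similarities
human_scores = [9.2, 1.6, 8.8, 0.9]  # hypothetical ratings on the 0 to 10 scale
print(round(kendall_tau_a(model_scores, human_scores), 4))
```

A value near 0, as in the results above, means the model's ordering of word pairs agrees with the human ordering about as often as it disagrees.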

