# SI630 Homework 2: Word2vec Vector Analysis

*Important Note:* Start this notebook only after you've gotten your word2vec model up and running!

Many NLP packages support working with word embeddings. In this notebook you can work through the various problems assigned in Task 3. We've provided the basic functionality for loading and working with word vectors using [Gensim](https://radimrehurek.com/gensim/models/keyedvectors.html), a good library for learning and using word vectors.

One of the fun parts of word vectors is getting a sense of what they learned. Feel free to explore the vectors here! 

In [1]:
from gensim.models import KeyedVectors
from gensim.test.utils import datapath

In [2]:
word_vectors = KeyedVectors.load_word2vec_format('word2vec.kv', binary=False)

In [4]:
word_vectors.similar_by_word("books")

[('novels', 0.8401198983192444),
 ('trilogies', 0.8302552103996277),
 ('volumes', 0.7945647239685059),
 ('offerings', 0.7857593297958374),
 ('comics', 0.7842567563056946),
 ('biographies', 0.782633900642395),
 ('novellas', 0.7824888229370117),
 ("lamb's", 0.7815375328063965),
 ('anthologies', 0.7699565887451172),
 ('westerns', 0.7669244408607483)]
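
The scores in the output above are cosine similarities between the query word's vector and each neighbor's vector. As a minimal sketch of that ranking, assuming a tiny made-up vocabulary (the `toy_vectors` values and the `rank_neighbors` helper are hypothetical and are not taken from the trained model or from Gensim):

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v)

# Hypothetical 3-dimensional vectors, invented for illustration only.
toy_vectors = {
    "books": [1.0, 2.0, 0.0],
    "novels": [2.0, 4.0, 0.0],   # points in the same direction as "books"
    "banana": [0.0, 0.0, 3.0],   # orthogonal to "books"
}

def rank_neighbors(word, vectors):
    """Return (other_word, similarity) pairs, most similar first."""
    query = vectors[word]
    scores = [(other, cosine_similarity(query, vec))
              for other, vec in vectors.items() if other != word]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)

neighbors = rank_neighbors("books", toy_vectors)
print(neighbors)
```

Because cosine similarity ignores vector length, "novels" ranks first here even though its vector is twice as long as the one for "books".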

In [10]:
target_words = ["school", "ponder", "positive", "banana", "entrepreneur",
                "enthusiasm", "facilitate", "hierarchy", "illusion", "narcissism"]

for word in target_words:
    print(word + ": " + ", ".join([w[0] for w in word_vectors.most_similar(word)]))

school: college, freshman, schoolers, junior, schooler, grad, med, elementary, sophomore, preschool
ponder: bureaucrats, dint, westerners, immaterial, motivates, rationally, encouraged, comforting, deem, heretofore
positive: negative, favorable, constructive, encouraged, ponder, glowing, raving, positively, honest, unbiased
banana: pudding, beef, caramel, blackberry, stroganoff, chips, roast, cookies, chair, advent
entrepreneur: adventurers, adoptees, swimmer, pioneers, doting, pathologist, befriends, senators, wizardry, gymnast
enthusiasm: trajectory, transformative, affirmative, underpinnings, flare, candor, ebb, soul-searching, motivation, manifestations
facilitate: broaden, solidify, hypnosis, counseling, maximize, strict, memorization, guiding, mindfulness, retain
hierarchy: commanders, semitic, degradation, flavored, perpetrators, belittles, drawl, upheavals, statesman, starvation
illusion: permit, situated, sting, culprits, minimizing, befall, migraines, perpetrators, fabricatio

In [11]:
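def get_analogy(a, b, c):
    # Solve "a is to b as c is to ?": rank words by similarity to (b - a + c)
    # and return the top-ranked word (gensim leaves the input words out of the results).
    return word_vectors.most_similar(positive=[b, c], negative=[a])[0][0]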
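
To make the arithmetic behind `get_analogy` concrete, here is a simplified sketch on hand-made vectors. The `TOY` values and the `toy_analogy` helper are hypothetical and invented for illustration; Gensim's `most_similar` also normalizes the input vectors to unit length before combining them, which this sketch skips.

```python
import math

# Hypothetical 2-dimensional vectors, invented for illustration only;
# they are NOT taken from the trained model.
TOY = {
    "color": [1.0, 0.0],
    "feeling": [0.0, 1.0],
    "blue": [1.0, 0.5],
    "depressed": [0.2, 1.0],
    "banana": [1.0, -1.0],
}

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    return dot / (math.sqrt(sum(x * x for x in u)) *
                  math.sqrt(sum(y * y for y in v)))

def toy_analogy(a, b, c, vectors=TOY):
    """Return the word closest to (b - a + c), skipping the input words."""
    target = [vb - va + vc
              for va, vb, vc in zip(vectors[a], vectors[b], vectors[c])]
    candidates = [w for w in vectors if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(target, vectors[w]))

result = toy_analogy("color", "feeling", "blue")
print(result)
```

Skipping the input words matters: in this toy space the target vector points in exactly the same direction as "feeling", so without the exclusion the analogy would just return one of its own inputs.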

In [16]:
get_analogy('team', 'profession', 'soccer')

'practitioner'

In [32]:
get_analogy('color', 'feeling', 'blue')

'depressed'

In [37]:
get_analogy('drink', 'food', 'coffee')

'combo'

In [51]:
get_analogy('justice', 'court', 'lawyer')

'scandal'

In [54]:
get_analogy('soft', 'art', 'paint')

"artist's"

In [56]:
get_analogy('hitler', 'dictator', 'trump')

'naturalist'

- soccer - team + profession = practitioner: 
The analogy could suggest that someone who plays soccer professionally could be thought of as a practitioner of the sport.

- blue - color + feeling = depressed:
The analogy suggests that feeling "blue" is equivalent to feeling "depressed", which matches the common English idiom. That makes sense to me.

- coffee - drink + food = combo:
This analogy may work in the context of fast food or restaurant combos, where a combination meal typically includes a main dish, side dish, and drink.

- lawyer - justice + court = scandal:
This analogy doesn't seem to hold a valid logical connection. While "lawyer" is associated with "justice", and subtracting "justice" could suggest a different element of the legal profession, adding "court" doesn't logically lead to "scandal".

- paint - soft + art = artist's:
This analogy seems valid. "Paint" is typically associated with "art", and subtracting "soft" could suggest a different quality related to painting or art in general. After adding "art", the result "artist's" seems to be a valid completion of the analogy, as it connects the idea of art with the idea of someone who creates it.
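
The interpretations above each rest on a single top-ranked word. One way to check how stable a result is would be to look at the runner-up candidates and their scores. The helper below is a hypothetical sketch (the name `analogy_candidates` is not part of the provided code); it only assumes that `kv` is a Gensim `KeyedVectors` object such as `word_vectors`, whose `most_similar` method accepts a `topn` argument.

```python
def analogy_candidates(kv, a, b, c, topn=5):
    """Return the topn (word, score) pairs for "a is to b as c is to ?".

    kv is assumed to be a gensim KeyedVectors instance (e.g. word_vectors).
    """
    return kv.most_similar(positive=[b, c], negative=[a], topn=topn)

# Example usage with the model loaded earlier in this notebook:
# analogy_candidates(word_vectors, 'justice', 'court', 'lawyer')
```

If the top few candidates have nearly identical scores, the first-ranked word is probably not a strong signal on its own.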