# Word Vectors using pre-trained Word2Vec

The purpose of this exercise is to find analogies, such as `king - man = queen - woman`, and nearest neighbors, using pre-trained Word2Vec word vectors from https://github.com/mmihaltz/word2vec-GoogleNews-vectors.
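
To make the analogy arithmetic concrete, here is a minimal sketch on hand-made 3-dimensional vectors. The toy vectors, the `cosine` helper, and the `analogy` function are assumptions made up for illustration; they are not part of the pre-trained model or of gensim. The idea is that `king - man = queen - woman` rearranges to `queen ≈ king - man + woman`, so we build that target vector and pick the closest remaining word by cosine similarity. A library implementation may differ in details such as normalizing the input vectors first.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical toy embedding; the axes loosely mean (royalty, male, female).
toy_vectors = {
    'king':  (1.0, 1.0, 0.0),
    'queen': (1.0, 0.0, 1.0),
    'man':   (0.0, 1.0, 0.0),
    'woman': (0.0, 0.0, 1.0),
    'child': (0.0, 0.5, 0.5),
}

def analogy(w1, w2, w3, vectors):
    """Return the word closest to vectors[w1] - vectors[w2] + vectors[w3]."""
    target = tuple(
        a - b + c for a, b, c in zip(vectors[w1], vectors[w2], vectors[w3])
    )
    # Exclude the query words themselves, otherwise they tend to win.
    candidates = (w for w in vectors if w not in (w1, w2, w3))
    return max(candidates, key=lambda w: cosine(vectors[w], target))

best = analogy('king', 'man', 'woman', toy_vectors)
print("king - man = %s - woman" % best)
```

With real 300-dimensional vectors the same search runs over the whole vocabulary, which is why the closest word is sometimes surprising.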

In [1]:
# We need this library to be able to parse the binary
# pre-trained Word2Vec GoogleNews
from gensim.models import KeyedVectors

Next, we open and parse a large binary file containing vectors for 3 million words and phrases, each of dimension D = 300.

In [2]:
# 3 million words and phrases
# D = 300

print('Parsing GoogleNews-vectors-negative300.bin...')
word_vectors = KeyedVectors.load_word2vec_format(
  'GoogleNews-vectors-negative300.bin',
  binary=True
)
print('Done!!')

Parsing GoogleNews-vectors-negative300.bin...
Done!!
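
For intuition about what "parsing the binary" involves, the sketch below reads a tiny in-memory file laid out the way the word2vec binary format is commonly described: a text header `vocab_size dim`, then for each entry the word, a space, and `dim` little-endian 32-bit floats. The function `read_word2vec_binary` and the two-word sample data are hypothetical, and the layout is stated here as an assumption; for real work the `KeyedVectors.load_word2vec_format` call above is the right tool.

```python
import io
import struct

def read_word2vec_binary(stream, limit=None):
    """Read (word, vector) pairs from a word2vec-style binary stream."""
    # Header line: "<vocab_size> <dim>\n"
    vocab_size, dim = (int(x) for x in stream.readline().split())
    count = vocab_size if limit is None else min(limit, vocab_size)
    vectors = {}
    for _ in range(count):
        chars = []
        while True:
            ch = stream.read(1)
            if ch == b' ' or ch == b'':
                break  # a space terminates the word
            if ch != b'\n':  # skip the newline ending the previous record
                chars.append(ch)
        word = b''.join(chars).decode('utf-8')
        # dim float32 values, 4 bytes each, little-endian (assumed)
        vectors[word] = struct.unpack('<%df' % dim, stream.read(4 * dim))
    return vocab_size, dim, vectors

# Tiny in-memory stand-in for the real multi-gigabyte file.
buf = io.BytesIO()
buf.write(b'2 3\n')
buf.write(b'king ' + struct.pack('<3f', 1.0, 0.5, 0.0) + b'\n')
buf.write(b'queen ' + struct.pack('<3f', 0.5, 1.0, 0.0) + b'\n')
buf.seek(0)

vocab_size, dim, toy = read_word2vec_binary(buf)
print(vocab_size, dim, sorted(toy))
```

The optional `limit` argument in this sketch stops after the first few entries, which is handy when only a sample of a large file is needed.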


In [3]:
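def find_analogies(w1, w2, w3):
    # Solve w1 - w2 = ? - w3 by ranking words near (w1 + w3 - w2);
    # r is a list of (word, similarity) pairs, best match first
    r = word_vectors.most_similar(positive=[w1, w3], negative=[w2])
    print("%s - %s = %s - %s" % (w1, w2, r[0][0], w3))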

In [4]:
def nearest_neighbors(w):
    # most_similar returns the 10 closest words by default,
    # as (word, similarity) pairs
    r = word_vectors.most_similar(positive=[w])
    print("neighbors of: %s" % w)
    for word, score in r:
        print("\t%s" % word)
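
As a rough picture of what a nearest-neighbor query computes, the sketch below ranks words in a made-up 2-dimensional embedding by the dot product of unit-normalized vectors, which equals cosine similarity. `toy_most_similar`, `unit`, and the toy vectors are hypothetical names and data chosen for illustration; the real `most_similar` call works over the full vocabulary and is far more efficient.

```python
import heapq
import math

def unit(v):
    """Scale a non-zero vector to length 1."""
    norm = math.sqrt(sum(x * x for x in v))
    return tuple(x / norm for x in v)

def toy_most_similar(word, vectors, topn=2):
    """Return the topn (word, similarity) pairs closest to `word`."""
    query = unit(vectors[word])
    scored = (
        (other, sum(q * x for q, x in zip(query, unit(vec))))
        for other, vec in vectors.items()
        if other != word  # never return the query word itself
    )
    return heapq.nlargest(topn, scored, key=lambda pair: pair[1])

# Hypothetical toy embedding: month names point one way, 'norway' another.
toy_vectors = {
    'january':  (1.0, 0.0),
    'february': (0.9, 0.1),
    'march':    (0.8, 0.3),
    'norway':   (0.0, 1.0),
}

print("neighbors of: february")
for word, score in toy_most_similar('february', toy_vectors):
    print("\t%s (%.3f)" % (word, score))
```

Returning pairs rather than bare words keeps the similarity score available, so a caller can print it or apply a cutoff.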

### Finding analogies

In [5]:
find_analogies('king', 'man', 'woman')
find_analogies('france', 'paris', 'london')
find_analogies('france', 'paris', 'rome')
find_analogies('paris', 'france', 'italy')
find_analogies('france', 'french', 'english')
find_analogies('japan', 'japanese', 'chinese')
find_analogies('japan', 'japanese', 'italian')
find_analogies('japan', 'japanese', 'australian')
find_analogies('december', 'november', 'june')
find_analogies('miami', 'florida', 'texas')
find_analogies('einstein', 'scientist', 'painter')
find_analogies('china', 'rice', 'bread')
find_analogies('man', 'woman', 'she')
find_analogies('man', 'woman', 'aunt')
find_analogies('man', 'woman', 'sister')
find_analogies('man', 'woman', 'wife')
find_analogies('man', 'woman', 'actress')
find_analogies('man', 'woman', 'mother')
find_analogies('heir', 'heiress', 'princess')
find_analogies('nephew', 'niece', 'aunt')
find_analogies('france', 'paris', 'tokyo')
find_analogies('france', 'paris', 'beijing')
find_analogies('february', 'january', 'november')
find_analogies('france', 'paris', 'rome')
find_analogies('paris', 'france', 'italy')

king - man = queen - woman
france - paris = england - london
france - paris = italy - rome
paris - france = lohan - italy
france - french = england - english
japan - japanese = tibet - chinese
japan - japanese = italy - italian
japan - japanese = queensland - australian
december - november = september - june
miami - florida = dallas - texas
einstein - scientist = jude - painter
china - rice = dinnerware - bread
man - woman = he - she
man - woman = uncle - aunt
man - woman = brother - sister
man - woman = son - wife
man - woman = actor - actress
man - woman = father - mother
heir - heiress = prince - princess
nephew - niece = uncle - aunt
france - paris = japan - tokyo
france - paris = chinese - beijing
february - january = april - november
france - paris = italy - rome
paris - france = lohan - italy


### Finding nearest neighbors

In [6]:
nearest_neighbors('king')
nearest_neighbors('france')
nearest_neighbors('japan')
nearest_neighbors('einstein')
nearest_neighbors('woman')
nearest_neighbors('nephew')
nearest_neighbors('february')
nearest_neighbors('rome')

neighbors of: king
	kings
	queen
	monarch
	crown_prince
	prince
	sultan
	ruler
	princes
	Prince_Paras
	throne
neighbors of: france
	spain
	french
	germany
	europe
	italy
	england
	european
	belgium
	usa
	serbia
neighbors of: japan
	japanese
	tokyo
	america
	europe
	germany
	chinese
	india
	hawaii
	usa
	korea
neighbors of: einstein
	nikki
	lmfao
	albert
	armstrong
	joan
	becky
	mcmahon
	conrad
	lori
	haley
neighbors of: woman
	man
	girl
	teenage_girl
	teenager
	lady
	teenaged_girl
	mother
	policewoman
	boy
	Woman
neighbors of: nephew
	son
	uncle
	brother
	grandson
	cousin
	father
	niece
	younger_brother
	nephews
	stepson
neighbors of: february
	january
	april
	september
	december
	july
	october
	november
	june
	feb
	norway
neighbors of: rome
	athens
	albert
	holmes
	donnie
	italy
	toni
	spain
	jh
	pablo
	malta
