# Lesson 2: Word Vectors

褚则伟 zeweichu@gmail.com

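Learning objectives for Lesson 2
- Learn the concept of word vectors
- Train word vectors with the Skip-gram model
- Learn to use the PyTorch dataset and dataloader
- Learn to define a PyTorch model
- Learn common Modules in torch.nn
    - Embedding
- Learn common PyTorch operations
    - bmm
    - logsigmoid
- Save and load PyTorch models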

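In this notebook we try (as far as possible) to reproduce the word-vector training method from the paper [Distributed Representations of Words and Phrases and their Compositionality](http://papers.nips.cc/paper/5021-distributed-representations-of-words-and-phrases-and-their-compositionality.pdf). We implement the Skip-gram model and use the paper's negative sampling objective, which is a simplified form of noise contrastive estimation.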
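
For reference, the negative sampling objective from the paper (its Equation 4), for one pair of center word $w_I$ and context word $w_O$, can be written as below. Here $v_{w_I}$ is the input vector of the center word, $v'_{w_O}$ is the output vector of the context word, $k$ is the number of negative samples, and $P_n(w)$ is the noise distribution. The notation follows the paper rather than the variable names used in this notebook.

$$
\log \sigma\left({v'_{w_O}}^{\top} v_{w_I}\right) + \sum_{i=1}^{k} \mathbb{E}_{w_i \sim P_n(w)}\left[\log \sigma\left(-{v'_{w_i}}^{\top} v_{w_I}\right)\right]
$$

Training maximizes this quantity, which is why the model later in the notebook returns its negative as the loss.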

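The paper contains many implementation details that are crucial to the quality of the word vectors. We cannot fully reproduce its experimental results, mainly because of limited compute and various other details, but we can still show roughly how word vectors are trained.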

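Details we did not implement:
- subsampling: see Section 2.3 of the paper
- tokenization: we only use `split()`; more capable tokenizers, such as those that ship with transformer models, could be used instead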
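
Subsampling is not implemented in this notebook, but the rule from Section 2.3 of the paper is short: each occurrence of a word $w$ is discarded with probability $1 - \sqrt{t / f(w)}$, where $f(w)$ is the word's relative frequency and $t$ is a threshold (the paper suggests around 1e-5 for large corpora). The following is a minimal sketch of that rule. The function names `discard_prob` and `subsample` and the toy corpus are illustrative assumptions, not part of the notebook.

```python
import math
import random
from collections import Counter

def discard_prob(freq, t=1e-5):
    """Probability of dropping one occurrence of a word with relative frequency `freq`."""
    return max(0.0, 1.0 - math.sqrt(t / freq))

def subsample(tokens, t=1e-5, seed=0):
    """Return a copy of `tokens` with frequent words randomly thinned out."""
    rng = random.Random(seed)
    counts = Counter(tokens)
    total = len(tokens)
    return [w for w in tokens if rng.random() >= discard_prob(counts[w] / total, t)]

# Toy corpus: "the" is very frequent, "monster" is rare.
tokens = ["the"] * 9000 + ["monster"] * 10 + ["cave"] * 990
kept = subsample(tokens, t=1e-3)  # a larger t than the paper's, because the corpus is tiny
print(Counter(kept))
```

With this threshold the rare word is always kept, while most occurrences of the frequent word are dropped.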

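We train two matrices, an input matrix and an output matrix: input_embedding and output_embedding.

The input matrix is used as the word vectors.

The output matrix is discarded after training.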
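
A minimal sketch of this idea, using a tiny hypothetical vocabulary of 5 words and 3-dimensional vectors (the sizes and variable names are illustrative only): both tables take part in scoring a word pair, but only the rows of the input table are kept as the final word vectors.

```python
import torch
import torch.nn as nn

vocab_size, embed_size = 5, 3  # toy sizes, chosen only for illustration

in_embed = nn.Embedding(vocab_size, embed_size)   # input matrix: kept as the word vectors
out_embed = nn.Embedding(vocab_size, embed_size)  # output matrix: only used during training

center = torch.tensor([2])    # index of a center word
context = torch.tensor([4])   # index of a context word
# Score of the pair: dot product between the input vector and the output vector.
score = (in_embed(center) * out_embed(context)).sum(dim=1)

# After training, the word vectors are simply the rows of the input matrix.
word_vectors = in_embed.weight.detach().numpy()
print(word_vectors.shape)
```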

In [2]:
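# Core PyTorch packages
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.utils.data as tud
from torch.nn.parameter import Parameter

# Counter builds the vocabulary and tells us how often each word occurs
from collections import Counter
import numpy as np
import random
import math

import pandas as pd
import scipy
import scipy.stats             # spearmanr, used in evaluate()
import scipy.spatial.distance  # cosine, used in find_nearest()
import sklearn
from sklearn.metrics.pairwise import cosine_similarity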

USE_CUDA = torch.cuda.is_available()

# To make the results reproducible, we fix the various random seeds to a constant value
random.seed(53113)
np.random.seed(53113)
torch.manual_seed(53113)
if USE_CUDA:
    torch.cuda.manual_seed(53113)

# Hyperparameters

K = 100 # number of negative samples drawn per positive word; with a very large training set this can be set below 20
C = 3 # context window radius: 3 words on each side of the center word
NUM_EPOCHS = 2 # number of training epochs
MAX_VOCAB_SIZE = 30000 # vocabulary size
BATCH_SIZE = 128 # batch size
LEARNING_RATE = 0.2 # initial learning rate; the author found that other values barely worked
EMBEDDING_SIZE = 100 # dimensionality of the word vectors
       
    
LOG_FILE = "word-embedding.log"

# tokenize function: split a text into individual words
def word_tokenize(text):
    return text.split()

In [3]:
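with open("text8.train.txt", "r") as fin:
    text = fin.read()  # training good word vectors generally needs a very large corpus

text = [w for w in word_tokenize(text.lower())]
# Count every word and keep the MAX_VOCAB_SIZE-1 most frequent ones, leaving one slot for <unk>
vocab = dict(Counter(text).most_common(MAX_VOCAB_SIZE-1))

# <unk> stands for all remaining (rare) words; its count is whatever is left over
vocab["<unk>"] = len(text) - np.sum(list(vocab.values()))

# Drawback: the vocabulary keeps very frequent function words such as "and" and "or".
# Weighting schemes such as TF-IDF were proposed to downweight such words,
# though the author considers TF-IDF dated.
idx_to_word = [word for word in vocab.keys()]
word_to_idx = {word:i for i, word in enumerate(idx_to_word)}

word_counts = np.array([count for count in vocab.values()], dtype=np.float32)
word_freqs = word_counts / np.sum(word_counts)

# As in the paper: raise the frequencies to the 3/4 power, then normalize
word_freqs = word_freqs ** (3./4.)
word_freqs = word_freqs / np.sum(word_freqs) # used for negative sampling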




VOCAB_SIZE = len(idx_to_word)
VOCAB_SIZE

30000
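
To see what the 3/4 power does to the noise distribution, here is a small worked example on three made-up word counts (the numbers are illustrative, not taken from text8). Raising the frequencies to the 3/4 power and renormalizing flattens the distribution, so rare words are drawn as negatives somewhat more often than their raw frequency would suggest.

```python
import numpy as np

counts = np.array([900.0, 90.0, 10.0])   # toy counts for a frequent, a medium and a rare word
freqs = counts / counts.sum()            # raw unigram distribution

smoothed = freqs ** 0.75                 # raise to the 3/4 power
smoothed = smoothed / smoothed.sum()     # renormalize so the values sum to 1 again

print(freqs)
print(smoothed)
```

In this toy case the rare word's probability rises from 0.01 to roughly 0.03, while the frequent word's probability drops.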

# DataLoader: serving the data batch by batch

### Implementing the DataLoader with PyTorch's built-in utilities



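A dataloader needs to do the following:

- Encode all of the text as integers, then preprocess it with subsampling. (Subsampling is not implemented here.)
- Store the vocabulary, the word counts, and the normalized word frequencies
- Sample one center word per iteration
- Return the context words for the current center word
- Sample some negative words for the center word
- Return the word counts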

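A good tutorial on how to use the [PyTorch dataloader](https://pytorch.org/tutorials/beginner/data_loading_tutorial.html) is available here.
To use a dataloader, we define a dataset class with the following two functions:

- ```__len__``` returns the number of items in the whole dataset
- ```__getitem__``` returns one item for a given index

With a dataloader we can then easily shuffle the whole dataset, fetch one batch of data at a time, and so on.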
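
The two functions can be seen in isolation in the following minimal sketch. `SquaresDataset` is a hypothetical toy dataset made up for this illustration; it is not the dataset used in the notebook.

```python
import torch
import torch.utils.data as tud

class SquaresDataset(tud.Dataset):
    """Toy dataset (hypothetical): item i is the pair (i, i*i)."""
    def __init__(self, n):
        self.n = n

    def __len__(self):
        # number of items in the whole dataset
        return self.n

    def __getitem__(self, idx):
        # one item for a given index
        return torch.tensor(idx), torch.tensor(idx * idx)

# The DataLoader shuffles the indices and stacks items into batches of up to 4.
loader = tud.DataLoader(SquaresDataset(10), batch_size=4, shuffle=True)
batches = list(loader)
for x, y in batches:
    print(x.tolist(), y.tolist())
```

Ten items with a batch size of 4 give three batches, the last one holding the remaining two items.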

In [4]:
class WordEmbeddingDataset(tud.Dataset):
    def __init__(self, text, word_to_idx, idx_to_word, word_freqs, word_counts):
        ''' text: a list of words, all text from the training dataset
            word_to_idx: the dictionary from word to idx
            idx_to_word: idx to word mapping
            word_freqs: the (smoothed, normalized) frequency of each word
            word_counts: the word counts
        '''
        super(WordEmbeddingDataset, self).__init__()
        # Encode every word as its index; out-of-vocabulary words map to <unk>
        self.text_encoded = [word_to_idx.get(t, word_to_idx["<unk>"]) for t in text]
        self.text_encoded = torch.tensor(self.text_encoded, dtype=torch.long)
        self.word_to_idx = word_to_idx
        self.idx_to_word = idx_to_word
        self.word_freqs = torch.Tensor(word_freqs)
        self.word_counts = torch.Tensor(word_counts)

    def __len__(self):
        # number of items in the dataset: one per token of the text
        return len(self.text_encoded)

    def __getitem__(self, idx):
        # Given an index, return the center word at that position together with
        # its positive (context) words and the sampled negative words
        center_word = self.text_encoded[idx]  # index with [], not (): a Tensor is not callable

        # indices of the words inside the window around idx
        pos_indices = list(range(idx-C, idx)) + list(range(idx+1, idx+C+1))

        # the indices may fall outside the text, so wrap them around
        pos_indices = [i % len(self.text_encoded) for i in pos_indices]

        # positive examples
        pos_words = self.text_encoded[pos_indices]

        # negative examples: draw K per positive word from the multinomial
        # distribution given by word_freqs (sampling with replacement)
        neg_words = torch.multinomial(self.word_freqs, K * pos_words.shape[0], True)
        return center_word, pos_words, neg_words
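
The two less obvious steps in `__getitem__`, the wrap-around of window indices and the call to `torch.multinomial`, can be tried on toy data. In this sketch the window radius, the six-token "text" and the four-word noise distribution are all made up for illustration, and `context_indices` is a hypothetical helper.

```python
import torch

C = 2                                          # toy window radius (the notebook uses 3)
text_encoded = torch.tensor([10, 11, 12, 13, 14, 15])
n = len(text_encoded)

def context_indices(idx):
    """Indices of the C words on each side of idx, wrapped around the text."""
    raw = list(range(idx - C, idx)) + list(range(idx + 1, idx + C + 1))
    return [i % n for i in raw]

print(context_indices(0))   # [4, 5, 1, 2]
print(context_indices(5))   # [3, 4, 0, 1]

# Negative sampling: draw word indices in proportion to their weights, with replacement.
weights = torch.tensor([0.7, 0.2, 0.1, 0.0])   # toy noise distribution over 4 words
neg = torch.multinomial(weights, 8, replacement=True)
print(neg)   # eight indices; index 3 has zero weight and is never drawn
```

One consequence of the modulo is that the first and last few words of the text get context words from the opposite end of the corpus. For a corpus of this size that affects only a handful of positions.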
        
        
        
        

Try out the dataset

In [5]:
dataset = WordEmbeddingDataset(text, word_to_idx, idx_to_word, word_freqs, word_counts)
dataloader = tud.DataLoader(dataset, batch_size=BATCH_SIZE, shuffle=True, num_workers=4)     

In [6]:
next(iter(dataloader))

[tensor([   48,  1495,  1799,  2312,  2329,  4869,     8,     0,  8746,    71,
           772,     7,  2837,     0, 11805,  1746,  7122,     6,    10,  1956,
          9429,    33,    17,   230, 29999,     5,   291,   640, 13454,   103,
          2505,   540,  1683,    25,   330,    52,  1890,   644,   219,     6,
         29999,    82, 27040,  2008,     1,  1832,  5685,     0,    53,  1214,
            58, 29999,  7677,     9,    14,     5,  7035, 12651,    73,   788,
          1159,     7,     0,    40,     3, 29999,    53,    57,   714,  5871,
          1056,  2756,     5,  5272,    26, 29999,    44,     8,    35,  6009,
            41,     5,   692,    17,   854,  2473,     4,    33,     1, 12482,
            48,     7,    30,   306,  7595,     6,     5,    56,  1461,     2,
           131,   167,   232,    11,  9131, 18990,     4,  4395,    32,    19,
             0,   847,    17,  6116,   608,  2573,  6378, 29999,    52, 17768,
            20,     0,  4113,  8883,  2380,   121,  

In [7]:
for i ,(center_word,pos_words,neg_words) in enumerate(dataloader):
    print(center_word,pos_words,neg_words)
    print("end of batch", i)
    if i > 5:
        break

tensor([  555,     6, 18239,    17,  1122,     6, 14662,   447,  1592,   843,
          787,   885,     1,    28,     0,   311, 26011, 29999,     0,    90,
          110,   155,    20, 29999,  6412,   982,  2002,    29,    38,    13,
         3553,   674,  1389,     3,  1139,     2, 29999,  2287,  2392,    23,
          719,   138,  5689,   284,    12,     4,    21,   387,     4,     7,
          534,     0,   886,  1611,  9551,     1,  6185,     0,   564,  4473,
          132,     0,    13,     2,  4853,  1826, 29999,   217,   770,   764,
            1,    83,   108,     2,     2, 29999,     0,  1342,    41,  2249,
            0, 23918,  1286,  2191,   295,     3,     2,     6,     2,    34,
            7,  2073,     2,   931, 14223,     8,    88,  3847,    15,  7286,
         1851,   638,     5,  3123,    40,    26,  4216,  1104,     0,  1139,
            2,   447,  2657,   958,   761,  8885,     0,     6,     8,     0,
         7186,   418,     7,   112,   211,    36,   635,    12])

tensor([    1,     3,  3348,    12,  6685,    18,   108,  5071,  1325,    20,
           22,  1873,   635,     5,  3010,    32,    88,   370,  2712,    91,
           15,  1766,   266,  2658,   419,  3650,     4,   141,     0,     4,
            0,  1536,   879, 29999,     6,    17,    77,   509,     5,  2588,
           13,     0,     6, 18407,    16, 15209,  7037,  2035,     0,    98,
           53,  3026,     1,  7948,     0,     6,   116,     3,     9,    11,
        16877,     0,  7018,  8348,  1184,   152,  2242,    50,     0,  3545,
          977, 29999, 29999, 29290,   247,     3,  2120,  1145,     9,     1,
         6937,     5,     6,     7,     0,    47, 29999,     0,   427,    13,
         7996,  6009,  1004,   112,  8114,  5801,     0,    76, 11567,     8,
         3200, 29999,     9,     4,  8451,  3069,   277,   131, 29999,    10,
           21,  3933,   214, 29999,   472, 15535,    41,     5,    24,    25,
         3453,  5838,  1056,     1,  4641,     2, 18511,     0])

# Building the PyTorch model

In [8]:
class EmbeddingModel(nn.Module):
    # Two embeddings are trained: an input embedding and an output embedding
    def __init__(self, vocab_size, embed_size):
        super(EmbeddingModel, self).__init__()
        self.vocab_size = vocab_size
        self.embed_size = embed_size

        initrange = 0.5 / self.embed_size
        self.out_embed = nn.Embedding(self.vocab_size, self.embed_size, sparse=False)
        self.out_embed.weight.data.uniform_(-initrange, initrange)

        self.in_embed = nn.Embedding(self.vocab_size, self.embed_size, sparse=False)
        self.in_embed.weight.data.uniform_(-initrange, initrange)

    def forward(self, input_labels, pos_labels, neg_labels):
        '''
        input_labels: center words, [batch_size]
        pos_labels: words that appear in the context window of the center word, [batch_size, window_size * 2]
        neg_labels: negative words drawn by negative sampling, [batch_size, window_size * 2 * K]
        return: loss, [batch_size]
        '''
        batch_size = input_labels.size(0)

        # One lookup in the input embedding: each index becomes a vector,
        # so [batch_size] turns into [batch_size, embed_size]
        input_embedding = self.in_embed(input_labels)
        # Two lookups in the output embedding
        pos_embedding = self.out_embed(pos_labels) # [batch_size, 2*C, embed_size]
        neg_embedding = self.out_embed(neg_labels) # [batch_size, 2*C*K, embed_size]

        # The dot products in the loss are computed with bmm (batch matrix multiplication):
        # for batch1 of shape [b, n, m] and batch2 of shape [b, m, p] it returns [b, n, p],
        # i.e. the first dimension is left alone and the last two are matrix-multiplied.
        # unsqueeze(2) adds a trailing dimension: [batch_size, embed_size] -> [batch_size, embed_size, 1];
        # squeeze(2) removes that dimension again from the result.
        log_pos = torch.bmm(pos_embedding, input_embedding.unsqueeze(2)).squeeze(2) # [batch_size, 2*C]
        log_neg = torch.bmm(neg_embedding, -input_embedding.unsqueeze(2)).squeeze(2) # [batch_size, 2*C*K]

        # F.logsigmoid applies log(sigmoid(x)) elementwise;
        # .sum(1) sums over dimension 1 (dimension 0 is the batch)
        log_pos = F.logsigmoid(log_pos).sum(1)
        log_neg = F.logsigmoid(log_neg).sum(1) # [batch_size]

        # add the two terms as in the paper's objective, then negate to get a loss to minimize
        loss = log_pos + log_neg
        return -loss

    def input_embeddings(self):
        # convenience accessor: the input embedding matrix as a NumPy array
        return self.in_embed.weight.data.cpu().numpy()
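
To check that the `bmm` and `logsigmoid` calls in `forward` compute what the objective asks for, the following sketch compares the batched computation against a naive loop that takes one dot product at a time. The tensor sizes are toy values and the random tensors stand in for the embedding lookups; none of this is part of the notebook.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
B, P, N, D = 2, 4, 6, 5           # toy sizes: batch, positives, negatives, embedding dim

center = torch.randn(B, D)        # stands in for input_embedding
pos = torch.randn(B, P, D)        # stands in for pos_embedding
neg = torch.randn(B, N, D)        # stands in for neg_embedding

# Batched version: [B, P, D] x [B, D, 1] -> [B, P, 1] -> [B, P]
pos_score = torch.bmm(pos, center.unsqueeze(2)).squeeze(2)
neg_score = torch.bmm(neg, -center.unsqueeze(2)).squeeze(2)
loss_bmm = -(F.logsigmoid(pos_score).sum(1) + F.logsigmoid(neg_score).sum(1))

# Naive version: one dot product at a time
loss_loop = torch.zeros(B)
for b in range(B):
    for p in range(P):
        loss_loop[b] -= F.logsigmoid(torch.dot(pos[b, p], center[b]))
    for n in range(N):
        loss_loop[b] -= F.logsigmoid(-torch.dot(neg[b, n], center[b]))

print(loss_bmm)
print(loss_loop)
print(torch.allclose(loss_bmm, loss_loop, atol=1e-4))
```

The two results agree up to floating-point rounding, and the batched version avoids the Python loops.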


Define a model and move it to the GPU.

In [9]:
model = EmbeddingModel(VOCAB_SIZE, EMBEDDING_SIZE)
if USE_CUDA:
    # this notebook hard-codes GPU index 3; change the index to match your machine
    model = model.cuda(device = torch.device("cuda:3"))
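
A more portable alternative to a hard-coded GPU index is to pick the device once and move both the model and the inputs to it with `.to(device)`. The following is a sketch of that pattern; `toy_model` is a stand-in for `EmbeddingModel` so that the snippet runs on its own.

```python
import torch
import torch.nn as nn

# Use the default GPU if one is available, otherwise fall back to the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

toy_model = nn.Embedding(10, 4).to(device)      # stand-in for EmbeddingModel
indices = torch.tensor([1, 2, 3]).to(device)    # inputs must live on the same device as the model
vectors = toy_model(indices)
print(vectors.device, vectors.shape)
```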

# Training

Code used for model evaluation

In [12]:
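# The Spearman correlation (agreement with human word-similarity ratings) lies between -1 and 1.
# If the loss stops decreasing but the Spearman correlation keeps rising, the model is still improving.
# The author's rough rule of thumb: 0.2 is fairly good, 0.3 is very good, 0.4 is landmark-paper territory.
def evaluate(filename, embedding_weights):
    # the benchmark files are either comma-separated (.csv) or tab-separated
    if filename.endswith(".csv"):
        data = pd.read_csv(filename, sep=",")
    else:
        data = pd.read_csv(filename, sep="\t")
    human_similarity = []
    model_similarity = []
    for i in data.iloc[:, 0:2].index:
        word1, word2 = data.iloc[i, 0], data.iloc[i, 1]
        # skip pairs that contain an out-of-vocabulary word
        if word1 not in word_to_idx or word2 not in word_to_idx:
            continue
        else:
            word1_idx, word2_idx = word_to_idx[word1], word_to_idx[word2]
            word1_embed, word2_embed = embedding_weights[[word1_idx]], embedding_weights[[word2_idx]]
            # cosine_similarity returns a 1x1 matrix here; take its single entry
            model_similarity.append(float(cosine_similarity(word1_embed, word2_embed)[0, 0]))
            human_similarity.append(float(data.iloc[i, 2]))

    return scipy.stats.spearmanr(human_similarity, model_similarity)

def find_nearest(word):
    # uses the global embedding_weights, which the training loop sets before calling this
    index = word_to_idx[word]
    embedding = embedding_weights[index]
    cos_dis = np.array([scipy.spatial.distance.cosine(e, embedding) for e in embedding_weights])
    # the 10 words with the smallest cosine distance (the query word itself comes first)
    return [idx_to_word[i] for i in cos_dis.argsort()[:10]]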
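
To make the evaluation metric concrete, here is a dependency-free sketch of the two quantities `evaluate` relies on: cosine similarity between two vectors, and Spearman's rank correlation between two lists of scores. It uses the textbook formula for untied ranks, whereas `scipy.stats.spearmanr` in the notebook also handles ties. The helper names and the toy scores are made up for illustration.

```python
import math

def cosine(u, v):
    """Cosine similarity between two vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def ranks(values):
    """Rank of each value (0 = smallest); assumes there are no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0] * len(values)
    for rank, i in enumerate(order):
        r[i] = rank
    return r

def spearman_no_ties(x, y):
    """Spearman's rho via 1 - 6*sum(d^2) / (n*(n^2-1)); valid only without ties."""
    n = len(x)
    d2 = sum((a - b) ** 2 for a, b in zip(ranks(x), ranks(y)))
    return 1 - 6 * d2 / (n * (n * n - 1))

# Toy data (made up): human ratings for four word pairs and a model's cosine scores.
human = [9.0, 7.5, 3.0, 1.0]
model = [0.80, 0.85, 0.20, 0.10]   # the model swaps the order of the first two pairs
rho = spearman_no_ties(human, model)
print(rho)                          # approximately 0.8
print(cosine([1.0, 0.0], [1.0, 1.0]))
```

Only the ordering of the scores matters: the model is rewarded for ranking word pairs in the same order as the human raters, not for matching their numeric scale.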

Code used for model training

In [13]:
# The loss is already defined by the model, so we only need an optimizer
optimizer = torch.optim.SGD(model.parameters(), lr=LEARNING_RATE)
for e in range(NUM_EPOCHS):
    for i, (input_labels, pos_labels, neg_labels) in enumerate(dataloader):
        # make sure the index tensors have the right dtype
        input_labels = input_labels.long()
        pos_labels = pos_labels.long()
        neg_labels = neg_labels.long()
        if USE_CUDA:
            input_labels = input_labels.cuda(device = torch.device("cuda:3"))
            pos_labels = pos_labels.cuda(device = torch.device("cuda:3"))
            neg_labels = neg_labels.cuda(device = torch.device("cuda:3"))
        # training step
        optimizer.zero_grad()
        # the model returns one loss per example in the batch, so take the mean
        loss = model(input_labels, pos_labels, neg_labels).mean()
        loss.backward()
        optimizer.step()
        # print the loss every 200 iterations
        if i % 200 == 0:
            print("epoch", e, "iteration", i, loss.item())
        # evaluate every 2000 iterations
        if i % 2000 == 0:
            embedding_weights = model.input_embeddings()
            sim_simlex = evaluate("simlex-999.txt", embedding_weights)
            sim_men = evaluate("men.txt", embedding_weights)
            sim_353 = evaluate("wordsim353.csv", embedding_weights)
            # build the message once, then print it and append it to the log file
            message = "epoch: {}, iteration: {}, simlex-999: {}, men: {}, sim353: {}, nearest to monster: {}\n".format(
                e, i, sim_simlex, sim_men, sim_353, find_nearest("monster"))
            print(message)
            with open(LOG_FILE, "a") as fout:
                fout.write(message)

    # at the end of every epoch, save the embeddings and the model weights
    embedding_weights = model.input_embeddings()
    np.save("embedding-{}".format(EMBEDDING_SIZE), embedding_weights)
    torch.save(model.state_dict(), "embedding-{}.th".format(EMBEDDING_SIZE))

epoch 0 iteration 0 420.046875
epoch: 0, iteration: 0, simlex-999: SpearmanrResult(correlation=0.03169986455925229, pvalue=0.32727716108768656), men: SpearmanrResult(correlation=0.015259689878281857, pvalue=0.43750276184046677), sim353: SpearmanrResult(correlation=0.03607202389177821, pvalue=0.5215669617469392), nearest to monster: ['monster', 'disperse', 'persuading', 'microscopic', 'contour', 'supplanted', 'vantage', 'gilmore', 'turing', 'employment']

epoch 0 iteration 200 226.89720153808594
epoch 0 iteration 400 152.55532836914062
epoch 0 iteration 600 121.14224243164062
epoch 0 iteration 800 103.59605407714844
epoch 0 iteration 1000 109.61994934082031
epoch 0 iteration 1200 96.37550354003906
epoch 0 iteration 1400 76.53982543945312
epoch 0 iteration 1600 85.844482421875
epoch 0 iteration 1800 66.93041229248047
epoch 0 iteration 2000 62.5546760559082
epoch: 0, iteration: 2000, simlex-999: SpearmanrResult(correlation=-0.00195758099445206, pvalue=0.9517737470114536), men: SpearmanrRe

epoch 0 iteration 19400 32.360198974609375
epoch 0 iteration 19600 32.73289489746094
epoch 0 iteration 19800 33.03521728515625
epoch 0 iteration 20000 31.7994384765625
epoch: 0, iteration: 20000, simlex-999: SpearmanrResult(correlation=0.0718285975896491, pvalue=0.026282353115635596), men: SpearmanrResult(correlation=0.0772941549192476, pvalue=8.202298137448714e-05), sim353: SpearmanrResult(correlation=0.07367835660270598, pvalue=0.19003312520266452), nearest to monster: ['monster', 'transformation', 'diamond', 'wheel', 'ceremony', 'operator', 'plural', 'neck', 'deck', 'mounted']

epoch 0 iteration 20200 32.7657470703125
epoch 0 iteration 20400 32.10132598876953
epoch 0 iteration 20600 33.130836486816406
epoch 0 iteration 20800 31.551170349121094
epoch 0 iteration 21000 32.61425018310547
epoch 0 iteration 21200 32.73402404785156
epoch 0 iteration 21400 32.406166076660156
epoch 0 iteration 21600 32.575843811035156
epoch 0 iteration 21800 31.652023315429688
epoch 0 iteration 22000 31.759

epoch 0 iteration 38600 31.174962997436523
epoch 0 iteration 38800 31.51828956604004
epoch 0 iteration 39000 30.942855834960938
epoch 0 iteration 39200 31.49140167236328
epoch 0 iteration 39400 31.196636199951172
epoch 0 iteration 39600 30.657360076904297
epoch 0 iteration 39800 30.632293701171875
epoch 0 iteration 40000 30.79309844970703
epoch: 0, iteration: 40000, simlex-999: SpearmanrResult(correlation=0.08560565751640395, pvalue=0.008057501171038624), men: SpearmanrResult(correlation=0.08974775762483074, pvalue=4.754035800373102e-06), sim353: SpearmanrResult(correlation=0.1329738041679164, pvalue=0.017669616689674594), nearest to monster: ['monster', 'blade', 'affair', 'angel', 'plain', 'cave', 'door', 'bomb', 'nickname', 'leg']

epoch 0 iteration 40200 31.03276824951172
epoch 0 iteration 40400 31.897254943847656
epoch 0 iteration 40600 31.37371063232422
epoch 0 iteration 40800 31.628379821777344
epoch 0 iteration 41000 31.060909271240234
epoch 0 iteration 41200 31.229782104492188


epoch 0 iteration 58200 30.59186553955078
epoch 0 iteration 58400 30.581134796142578
epoch 0 iteration 58600 31.087482452392578
epoch 0 iteration 58800 31.02667236328125
epoch 0 iteration 59000 31.14072036743164
epoch 0 iteration 59200 31.556076049804688
epoch 0 iteration 59400 31.03043556213379
epoch 0 iteration 59600 30.776065826416016
epoch 0 iteration 59800 31.071584701538086
epoch 0 iteration 60000 31.202863693237305
epoch: 0, iteration: 60000, simlex-999: SpearmanrResult(correlation=0.10347148806296254, pvalue=0.0013489494335774885), men: SpearmanrResult(correlation=0.10518303474377463, pvalue=8.035913497411819e-08), sim353: SpearmanrResult(correlation=0.1539777698085401, pvalue=0.0059335486349967785), nearest to monster: ['monster', 'blade', 'leg', 'angel', 'door', 'arm', 'brand', 'nickname', 'tube', 'arrow']

epoch 0 iteration 60200 30.53045082092285
epoch 0 iteration 60400 30.792306900024414
epoch 0 iteration 60600 31.097381591796875
epoch 0 iteration 60800 31.195804595947266


epoch 0 iteration 78200 30.46540069580078
epoch 0 iteration 78400 30.781465530395508
epoch 0 iteration 78600 30.938081741333008
epoch 0 iteration 78800 30.963685989379883
epoch 0 iteration 79000 31.161176681518555
epoch 0 iteration 79200 31.17608070373535
epoch 0 iteration 79400 31.125865936279297
epoch 0 iteration 79600 31.021949768066406
epoch 0 iteration 79800 30.903221130371094
epoch 0 iteration 80000 30.198482513427734
epoch: 0, iteration: 80000, simlex-999: SpearmanrResult(correlation=0.1165218499907706, pvalue=0.0003035493039724401), men: SpearmanrResult(correlation=0.1181538307476582, pvalue=1.6145305207405442e-09), sim353: SpearmanrResult(correlation=0.17918835010655496, pvalue=0.0013328434235869145), nearest to monster: ['monster', 'blade', 'leg', 'angel', 'robot', 'arrow', 'runner', 'camera', 'factory', 'brand']

epoch 0 iteration 80200 30.791236877441406
epoch 0 iteration 80400 31.01446533203125
epoch 0 iteration 80600 30.507797241210938
epoch 0 iteration 80800 31.562120437

epoch 0 iteration 98200 30.98788070678711
epoch 0 iteration 98400 30.881149291992188
epoch 0 iteration 98600 31.145442962646484
epoch 0 iteration 98800 30.11575698852539
epoch 0 iteration 99000 30.693880081176758
epoch 0 iteration 99200 30.78475570678711
epoch 0 iteration 99400 30.73187255859375
epoch 0 iteration 99600 31.150127410888672
epoch 0 iteration 99800 30.73774528503418
epoch 0 iteration 100000 30.577566146850586
epoch: 0, iteration: 100000, simlex-999: SpearmanrResult(correlation=0.12421892748834178, pvalue=0.0001168355028842539), men: SpearmanrResult(correlation=0.1306289354413064, pvalue=2.4779589444664676e-11), sim353: SpearmanrResult(correlation=0.21104666901683555, pvalue=0.00014975661989092388), nearest to monster: ['monster', 'triangle', 'robot', 'clown', 'finger', 'blade', 'arrow', 'demon', 'giant', 'leg']

epoch 0 iteration 100200 30.786365509033203
epoch 0 iteration 100400 30.708812713623047
epoch 0 iteration 100600 30.864727020263672
epoch 0 iteration 100800 31.036

epoch 0 iteration 118200 30.6373348236084
epoch 0 iteration 118400 30.62057876586914
epoch 0 iteration 118600 30.982507705688477
epoch 0 iteration 118800 30.471500396728516
epoch 0 iteration 119000 30.734054565429688
epoch 0 iteration 119200 31.21413230895996
epoch 0 iteration 119400 30.852935791015625
epoch 1 iteration 0 30.565412521362305
epoch: 1, iteration: 0, simlex-999: SpearmanrResult(correlation=0.13192316955653116, pvalue=4.24601237763605e-05), men: SpearmanrResult(correlation=0.1382764646887521, pvalue=1.5592348611828356e-12), sim353: SpearmanrResult(correlation=0.22419212991027176, pvalue=5.492017569933576e-05), nearest to monster: ['monster', 'clown', 'triangle', 'robot', 'reed', 'enigma', 'arrow', 'demon', 'giant', 'finger']

epoch 1 iteration 200 30.724403381347656
epoch 1 iteration 400 30.76205062866211
epoch 1 iteration 600 30.78719711303711
epoch 1 iteration 800 31.027742385864258
epoch 1 iteration 1000 30.763656616210938
epoch 1 iteration 1200 30.93928337097168
epoch 

epoch 1 iteration 18200 30.892539978027344
epoch 1 iteration 18400 30.31290626525879
epoch 1 iteration 18600 30.72000503540039
epoch 1 iteration 18800 30.339500427246094
epoch 1 iteration 19000 30.500125885009766
epoch 1 iteration 19200 30.31265640258789
epoch 1 iteration 19400 30.95498275756836
epoch 1 iteration 19600 30.07970428466797
epoch 1 iteration 19800 31.297805786132812
epoch 1 iteration 20000 30.649730682373047
epoch: 1, iteration: 20000, simlex-999: SpearmanrResult(correlation=0.13715278776466197, pvalue=2.067554223578452e-05), men: SpearmanrResult(correlation=0.14528244339731583, pvalue=1.0775025699044939e-13), sim353: SpearmanrResult(correlation=0.23596356062686044, pvalue=2.1239247721611573e-05), nearest to monster: ['monster', 'clown', 'giant', 'triangle', 'robot', 'blade', 'bird', 'flower', 'finger', 'mine']

epoch 1 iteration 20200 30.550304412841797
epoch 1 iteration 20400 30.282207489013672
epoch 1 iteration 20600 30.851329803466797
epoch 1 iteration 20800 30.3792743

epoch 1 iteration 38200 31.019838333129883
epoch 1 iteration 38400 31.035503387451172
epoch 1 iteration 38600 30.5206356048584
epoch 1 iteration 38800 30.81256103515625
epoch 1 iteration 39000 30.471511840820312
epoch 1 iteration 39200 30.366579055786133
epoch 1 iteration 39400 30.057628631591797
epoch 1 iteration 39600 30.147171020507812
epoch 1 iteration 39800 30.410802841186523
epoch 1 iteration 40000 30.384536743164062
epoch: 1, iteration: 40000, simlex-999: SpearmanrResult(correlation=0.14460309507589034, pvalue=7.084467006278349e-06), men: SpearmanrResult(correlation=0.14969009794649746, pvalue=1.8736022245220972e-14), sim353: SpearmanrResult(correlation=0.2447663845925468, pvalue=1.0102913264712465e-05), nearest to monster: ['monster', 'clown', 'giant', 'triangle', 'blade', 'demon', 'robot', 'bird', 'bull', 'finger']

epoch 1 iteration 40200 31.112213134765625
epoch 1 iteration 40400 30.52859878540039
epoch 1 iteration 40600 30.040828704833984
epoch 1 iteration 40800 30.56529045

epoch 1 iteration 58200 30.522817611694336
epoch 1 iteration 58400 30.20562744140625
epoch 1 iteration 58600 30.235048294067383
epoch 1 iteration 58800 30.5760440826416
epoch 1 iteration 59000 30.480958938598633
epoch 1 iteration 59200 30.161396026611328
epoch 1 iteration 59400 30.704429626464844
epoch 1 iteration 59600 30.38228988647461
epoch 1 iteration 59800 30.53061294555664
epoch 1 iteration 60000 30.162927627563477
epoch: 1, iteration: 60000, simlex-999: SpearmanrResult(correlation=0.15338881745689104, pvalue=1.8681926130610884e-06), men: SpearmanrResult(correlation=0.15717661237095615, pvalue=8.499750671450249e-16), sim353: SpearmanrResult(correlation=0.25340202617930113, pvalue=4.7414681136653665e-06), nearest to monster: ['monster', 'giant', 'clown', 'finger', 'triangle', 'robot', 'blade', 'demon', 'belt', 'bird']

epoch 1 iteration 60200 29.964580535888672
epoch 1 iteration 60400 29.950088500976562
epoch 1 iteration 60600 30.87106704711914
epoch 1 iteration 60800 30.297519683

epoch 1 iteration 78200 30.03192901611328
epoch 1 iteration 78400 29.79343605041504
epoch 1 iteration 78600 30.235443115234375
epoch 1 iteration 78800 30.23079490661621
epoch 1 iteration 79000 30.746177673339844
epoch 1 iteration 79200 29.896738052368164
epoch 1 iteration 79400 31.110950469970703
epoch 1 iteration 79600 30.324316024780273
epoch 1 iteration 79800 30.5319766998291
epoch 1 iteration 80000 30.08377456665039
epoch: 1, iteration: 80000, simlex-999: SpearmanrResult(correlation=0.158700479484904, pvalue=8.041727171198161e-07), men: SpearmanrResult(correlation=0.16420244993976324, pvalue=4.053884152924828e-17), sim353: SpearmanrResult(correlation=0.26257447993590743, pvalue=2.0595106936228086e-06), nearest to monster: ['monster', 'giant', 'clown', 'triangle', 'finger', 'reed', 'robot', 'blade', 'elf', 'demon']

epoch 1 iteration 80200 30.710166931152344
epoch 1 iteration 80400 30.348176956176758
epoch 1 iteration 80600 30.47200584411621
epoch 1 iteration 80800 30.27964401245117

epoch 1 iteration 98200 30.354122161865234
epoch 1 iteration 98400 29.967790603637695
epoch 1 iteration 98600 30.438945770263672
epoch 1 iteration 98800 30.764007568359375
epoch 1 iteration 99000 30.828716278076172
epoch 1 iteration 99200 30.4229736328125
epoch 1 iteration 99400 30.367151260375977
epoch 1 iteration 99600 30.698068618774414
epoch 1 iteration 99800 30.42706298828125
epoch 1 iteration 100000 30.186492919921875
epoch: 1, iteration: 100000, simlex-999: SpearmanrResult(correlation=0.1647565703784685, pvalue=2.971984104700138e-07), men: SpearmanrResult(correlation=0.1705108303213819, pvalue=2.3476412698123815e-18), sim353: SpearmanrResult(correlation=0.26438251258306295, pvalue=1.7408293170465155e-06), nearest to monster: ['monster', 'clown', 'giant', 'robot', 'reed', 'warrior', 'hammer', 'bull', 'finger', 'triangle']

epoch 1 iteration 100200 30.562053680419922
epoch 1 iteration 100400 30.17049789428711
epoch 1 iteration 100600 30.62445068359375
epoch 1 iteration 100800 30.7

epoch 1 iteration 118200 30.33205795288086
epoch 1 iteration 118400 30.305835723876953
epoch 1 iteration 118600 30.182973861694336
epoch 1 iteration 118800 30.15907859802246
epoch 1 iteration 119000 29.960296630859375
epoch 1 iteration 119200 29.859825134277344
epoch 1 iteration 119400 30.475967407226562
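
The training loop saves the weights with `torch.save(model.state_dict(), ...)`. Reading them back is the mirror image: build a model of the same shape and call `load_state_dict`. The sketch below uses a small stand-in module and a temporary file so that it runs on its own. With the notebook's model, one would instead construct `EmbeddingModel(VOCAB_SIZE, EMBEDDING_SIZE)` and load the file written by the training loop (`"embedding-100.th"` for `EMBEDDING_SIZE = 100`).

```python
import os
import tempfile
import torch
import torch.nn as nn

class TinyEmbeddingModel(nn.Module):
    """Stand-in (hypothetical) with the same two embedding tables as EmbeddingModel."""
    def __init__(self, vocab_size, embed_size):
        super().__init__()
        self.in_embed = nn.Embedding(vocab_size, embed_size)
        self.out_embed = nn.Embedding(vocab_size, embed_size)

trained = TinyEmbeddingModel(10, 4)   # pretend this model has been trained

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "embedding-4.th")
    torch.save(trained.state_dict(), path)          # save only the parameters

    restored = TinyEmbeddingModel(10, 4)            # same architecture, fresh random weights
    state = torch.load(path, map_location="cpu")    # read the saved tensors onto the CPU
    restored.load_state_dict(state)                 # copy them into the new model

same = torch.equal(trained.in_embed.weight, restored.in_embed.weight)
print(same)
```

Saving the `state_dict` rather than the whole model object keeps the file independent of the class definition's location, at the cost of having to rebuild the model before loading.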
