# Lesson 2: Word Vectors

褚则伟 zeweichu@gmail.com

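Learning objectives for Lesson 2
- Understand the concept of word vectors
- Train word vectors with the Skip-gram model
- Learn to use the PyTorch dataset and dataloader
- Learn to define a PyTorch model
- Learn common Modules in torch.nn
    - Embedding
- Learn common PyTorch operations
    - bmm
    - logsigmoid
- Save and load PyTorch models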
    

The training data used in Lesson 2 can be downloaded from the link below.

Link: https://pan.baidu.com/s/1tFeK3mXuVXEy3EMarfeWvg  Password: v2z5

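In this notebook, we try to reproduce (as closely as we can) the word-vector training method from the paper [Distributed Representations of Words and Phrases and their Compositionality](http://papers.nips.cc/paper/5021-distributed-representations-of-words-and-phrases-and-their-compositionality.pdf). We implement the Skip-gram model and use the paper's negative sampling objective, a simplified form of noise contrastive estimation.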
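
For reference, the negative sampling objective for a single (input word, output word) pair can be written as below. The notation follows Section 2.2 of the paper: $v_{w_I}$ is the input (center word) vector, $v'_{w_O}$ is the output (context word) vector, $k$ is the number of negative samples, and $P_n(w)$ is the noise distribution, which the paper takes to be the unigram distribution raised to the 3/4 power.

$$
\log \sigma\left({v'_{w_O}}^{\top} v_{w_I}\right) + \sum_{i=1}^{k} \mathbb{E}_{w_i \sim P_n(w)}\left[\log \sigma\left(-{v'_{w_i}}^{\top} v_{w_I}\right)\right]
$$

Training maximizes this quantity, so the loss to minimize is its negative, summed over all context words of a center word.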

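The paper contains many implementation details that are critical to the quality of the word vectors. We cannot fully reproduce the paper's experimental results, mainly because of limited compute and the details we leave out, but we can still show roughly how word vectors are trained.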

Here are some details that we did not implement
- subsampling: see Section 2.3 of the paper

In [1]:
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.utils.data as tud
from torch.nn.parameter import Parameter

from collections import Counter
import numpy as np
import random
import math

import pandas as pd
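import scipy
import scipy.stats  # provides spearmanr, used in evaluate()
import scipy.spatial.distance  # provides cosine, used in find_nearest()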
import sklearn
from sklearn.metrics.pairwise import cosine_similarity

USE_CUDA = torch.cuda.is_available()

# To make the experiments reproducible, we fix all the random seeds to a constant value
random.seed(53113)
np.random.seed(53113)
torch.manual_seed(53113)
if USE_CUDA:
    torch.cuda.manual_seed(53113)
    
# Set some hyperparameters
    
K = 10 # number of negative samples
C = 2 # nearby words threshold
NUM_EPOCHS = 2 # The number of epochs of training
MAX_VOCAB_SIZE = 9000 # the vocabulary size
BATCH_SIZE = 32 # the batch size
LEARNING_RATE = 0.2 # the initial learning rate
EMBEDDING_SIZE = 25
       
    
LOG_FILE = "word-embedding.log"

# tokenize function: split a piece of text into individual words
def word_tokenize(text):
    return text.split()

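- Read all the text from the file and build a vocabulary from it
- Since the number of distinct words may be too large, we keep only the most common words, so that the vocabulary has MAX_VOCAB_SIZE entries including UNK
- We add an UNK token to represent all the uncommon words
- We need to record the word-to-index mapping, the index-to-word mapping, the word counts, the (normalized) word frequencies, and the total number of words.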

In [2]:
with open("text8.train.txt", "r") as fin:
    text = fin.read()
    
text = [w for w in word_tokenize(text.lower())]
vocab = dict(Counter(text).most_common(MAX_VOCAB_SIZE-1))
vocab["<unk>"] = len(text) - np.sum(list(vocab.values()))
idx_to_word = [word for word in vocab.keys()] 
word_to_idx = {word:i for i, word in enumerate(idx_to_word)}

word_counts = np.array([count for count in vocab.values()], dtype=np.float32)
word_freqs = word_counts / np.sum(word_counts)
word_freqs = word_freqs ** (3./4.)
word_freqs = word_freqs / np.sum(word_freqs) # used for negative sampling
VOCAB_SIZE = len(idx_to_word)
VOCAB_SIZE

9000
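
To see what raising the unigram frequencies to the 3/4 power does, here is a minimal sketch on made-up counts. The array `toy_counts` is hypothetical and not taken from text8; the point is only that the smoothed distribution moves probability mass from frequent words to rare ones, so rare words are drawn as negative samples more often than their raw frequency would suggest.

```python
import numpy as np

# Hypothetical toy counts for three words (not taken from text8).
toy_counts = np.array([900.0, 90.0, 10.0])

# Plain unigram distribution.
unigram = toy_counts / toy_counts.sum()

# Same two steps as in the notebook: raise to the 3/4 power, then renormalize.
smoothed = unigram ** (3.0 / 4.0)
smoothed = smoothed / smoothed.sum()

print("unigram: ", unigram)
print("smoothed:", smoothed)
```

The most frequent word ends up with a smaller share than before and the rarest word with a larger one, while the ranking of the words is unchanged.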

### Implementing the DataLoader

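A dataloader needs to do the following:

- Encode all the text as integer indices (the paper's subsampling step would be applied here, but this notebook does not implement it)
- Store the vocabulary, the word counts, and the normalized word frequencies
- Sample one center word per iteration
- Return the context words for the current center word
- Sample some negative words for the center word
- Keep the word counts (they are stored on the dataset, not returned per item)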

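A good tutorial on how to use the [PyTorch dataloader](https://pytorch.org/tutorials/beginner/data_loading_tutorial.html) is available here.
To use a dataloader, our dataset needs to define the following two functions:

- ```__len__``` returns the number of items in the whole dataset
- ```__getitem__``` returns one item for a given index

With a dataloader, we can easily shuffle the whole dataset, fetch a batch of data, and so on.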
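
As a minimal sketch of this contract, the toy dataset below implements only `__len__` and `__getitem__` and lets `DataLoader` do the batching. `SquaresDataset` is a hypothetical class made up for illustration and is unrelated to the word-embedding dataset in this notebook.

```python
import torch
import torch.utils.data as tud

class SquaresDataset(tud.Dataset):
    """Hypothetical toy dataset: item i is the pair (i, i * i)."""

    def __init__(self, n):
        self.n = n

    def __len__(self):
        # Number of items in the whole dataset.
        return self.n

    def __getitem__(self, idx):
        # One item for the given index.
        return torch.tensor(idx), torch.tensor(idx * idx)

# shuffle=False keeps the order fixed so the batches are easy to inspect.
toy_loader = tud.DataLoader(SquaresDataset(10), batch_size=4, shuffle=False)
batches = list(toy_loader)
xs, ys = batches[0]
print(xs, ys)
print("number of batches:", len(batches))
```

The default collate function stacks the per-item tensors along a new leading dimension, which is why each element of a batch has the batch size as its first dimension. The last batch is smaller when the dataset size is not a multiple of `batch_size`.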

In [3]:
class WordEmbeddingDataset(tud.Dataset):
    def __init__(self, text, word_to_idx, idx_to_word, word_freqs, word_counts):
        ''' text: a list of words, all text from the training dataset
            word_to_idx: the dictionary from word to idx
            idx_to_word: idx to word mapping
            word_freqs: the frequency of each word
            word_counts: the word counts
        '''
        super(WordEmbeddingDataset, self).__init__()
        self.text_encoded = [word_to_idx.get(t, VOCAB_SIZE-1) for t in text]
        self.text_encoded = torch.Tensor(self.text_encoded).long()
        self.word_to_idx = word_to_idx
        self.idx_to_word = idx_to_word
        self.word_freqs = torch.Tensor(word_freqs)
        self.word_counts = torch.Tensor(word_counts)
        
    def __len__(self):
        ''' Return the length of the whole dataset (the total number of words)
        '''
        return len(self.text_encoded)
        
    def __getitem__(self, idx):
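        ''' This function returns the following data for training
            - the center word
            - the (positive) words near the center word
            - K randomly sampled words per positive word, used as negative samples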
        '''
        center_word = self.text_encoded[idx]
        pos_indices = list(range(idx-C, idx)) + list(range(idx+1, idx+C+1))
        pos_indices = [i%len(self.text_encoded) for i in pos_indices]
        pos_words = self.text_encoded[pos_indices] 
        neg_words = torch.multinomial(self.word_freqs, K * pos_words.shape[0], True)
        
        return center_word, pos_words, neg_words 
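
The index arithmetic in `__getitem__` can be checked in isolation. The helper below is a hypothetical standalone version of those two lines (the name `context_indices` is not part of the notebook); it shows that positions near either end of the text wrap around because of the modulo.

```python
def context_indices(idx, window, n):
    """Positions of the `window` words on each side of idx, wrapped modulo n."""
    raw = list(range(idx - window, idx)) + list(range(idx + 1, idx + window + 1))
    return [i % n for i in raw]

# A text of 10 tokens with a window of 2 on each side.
print(context_indices(5, 2, 10))  # [3, 4, 6, 7]
print(context_indices(0, 2, 10))  # [8, 9, 1, 2]
print(context_indices(9, 2, 10))  # [7, 8, 0, 1]
```

Wrapping means the first few words of the corpus get the last few words as context and vice versa. For a long corpus this affects only a handful of positions, and it guarantees that every item has exactly `2 * C` context words, which batching requires.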

Create the dataset and dataloader

In [4]:
dataset = WordEmbeddingDataset(text, word_to_idx, idx_to_word, word_freqs, word_counts)
dataloader = tud.DataLoader(dataset, batch_size=BATCH_SIZE, shuffle=True, num_workers=0)     

In [5]:
next(iter(dataloader))

[tensor([  48, 1495, 1799, 2312, 2329, 4869,    8,    0, 8746,   71,  772,    7,
         2837,    0, 8999, 1746, 7122,    6,   10, 1956, 8999,   33,   17,  230,
         8999,    5,  291,  640, 8999,  103, 2505,  540]),
 tensor([[   0,  868,  363,   13],
         [8999, 3751, 5168,  932],
         [   3,    5,  345,    1],
         [8075,    2, 1008,    2],
         [8999, 3981, 1436,  582],
         [   4,    5,  213,   27],
         [1761,    3,   22,    8],
         [2813,   19, 8999, 2077],
         [  44, 2189, 5529,   26],
         [   0, 2266, 4809, 2876],
         [4219, 1173, 1262, 8999],
         [   9,    7,    7,  599],
         [   5, 2252, 8999,    0],
         [  70,   38, 4720,    1],
         [4719,    2,   39, 2117],
         [ 254,  387,   23,    0],
         [  54,   11, 6962, 4473],
         [  10, 7479, 5115, 8999],
         [  32,  557,  657,    5],
         [ 216,    6, 1051, 4695],
         [1463,    6,    0, 8999],
         [8999,    1,    0, 8999],
         

### Defining the PyTorch model

In [6]:
class EmbeddingModel(nn.Module):
    def __init__(self, vocab_size, embed_size):
        ''' Initialize the input and output embeddings
        '''
        super(EmbeddingModel, self).__init__()
        self.vocab_size = vocab_size
        self.embed_size = embed_size
        
        initrange = 0.5 / self.embed_size
        self.out_embed = nn.Embedding(self.vocab_size, self.embed_size, sparse=False)
        self.out_embed.weight.data.uniform_(-initrange, initrange)
        
        
        self.in_embed = nn.Embedding(self.vocab_size, self.embed_size, sparse=False)
        self.in_embed.weight.data.uniform_(-initrange, initrange)
        
        
    def forward(self, input_labels, pos_labels, neg_labels):
        '''
        input_labels: center words, [batch_size]
        pos_labels: words that appear in the context window around the center word, [batch_size, (window_size * 2)]
        neg_labels: words that did not appear around the center word, obtained by negative sampling, [batch_size, (window_size * 2 * K)]
        
        return: loss, [batch_size]
        '''
        
        batch_size = input_labels.size(0)
        
        input_embedding = self.in_embed(input_labels) # B * embed_size
        pos_embedding = self.out_embed(pos_labels) # B * (2*C) * embed_size
        neg_embedding = self.out_embed(neg_labels) # B * (2*C * K) * embed_size
      
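        # squeeze(2) drops only the trailing singleton dimension, so a batch of size 1 keeps its batch dimension
        log_pos = torch.bmm(pos_embedding, input_embedding.unsqueeze(2)).squeeze(2) # B * (2*C)
        log_neg = torch.bmm(neg_embedding, -input_embedding.unsqueeze(2)).squeeze(2) # B * (2*C*K)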

        log_pos = F.logsigmoid(log_pos).sum(1)
        log_neg = F.logsigmoid(log_neg).sum(1) # batch_size
       
        loss = log_pos + log_neg
        
        return -loss
    
    def input_embeddings(self):
        return self.in_embed.weight.data.cpu().numpy()
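
The two operations at the heart of `forward`, `bmm` and `logsigmoid`, can be tried on small random tensors. The sketch below uses hypothetical sizes (`B`, `n_ctx`, `D`) and random data rather than the notebook's embeddings; it checks that the batched matrix product equals a per-example dot product and illustrates why `F.logsigmoid` is preferable to composing `log` and `sigmoid` by hand.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)

B, n_ctx, D = 3, 4, 5  # hypothetical batch size, context words per example, embedding size
center = torch.randn(B, D)           # one center vector per example
context = torch.randn(B, n_ctx, D)   # n_ctx context vectors per example

# [B, n_ctx, D] x [B, D, 1] -> [B, n_ctx, 1]; squeeze(2) gives [B, n_ctx].
scores = torch.bmm(context, center.unsqueeze(2)).squeeze(2)

# The same dot products written with broadcasting, as a cross-check.
scores_check = (context * center.unsqueeze(1)).sum(dim=2)

log_prob = F.logsigmoid(scores)
naive = torch.log(torch.sigmoid(scores))

# For very negative inputs, sigmoid underflows to 0 and log returns -inf,
# while logsigmoid stays finite.
extreme = torch.tensor(-200.0)
print(F.logsigmoid(extreme), torch.log(torch.sigmoid(extreme)))
```

Each row of `scores` holds the dot products between one center vector and its own context vectors, which is exactly the quantity the objective passes through the sigmoid.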
        

Create a model and move it to the GPU

In [7]:
model = EmbeddingModel(VOCAB_SIZE, EMBEDDING_SIZE)
if USE_CUDA:
    model = model.cuda()

Below is the code for evaluating the model, followed by the code for training it

In [8]:
def evaluate(filename, embedding_weights): 
    if filename.endswith(".csv"):
        data = pd.read_csv(filename, sep=",")
    else:
        data = pd.read_csv(filename, sep="\t")
    human_similarity = []
    model_similarity = []
    for i in data.iloc[:, 0:2].index:
        word1, word2 = data.iloc[i, 0], data.iloc[i, 1]
        if word1 not in word_to_idx or word2 not in word_to_idx:
            continue
        else:
            word1_idx, word2_idx = word_to_idx[word1], word_to_idx[word2]
            word1_embed, word2_embed = embedding_weights[[word1_idx]], embedding_weights[[word2_idx]]
            # cosine_similarity returns a 1x1 array here; index it explicitly before converting to float
            model_similarity.append(float(sklearn.metrics.pairwise.cosine_similarity(word1_embed, word2_embed)[0, 0]))
            human_similarity.append(float(data.iloc[i, 2]))

    return scipy.stats.spearmanr(human_similarity, model_similarity)# , model_similarity

def find_nearest(word):
    index = word_to_idx[word]
    embedding = embedding_weights[index]
    cos_dis = np.array([scipy.spatial.distance.cosine(e, embedding) for e in embedding_weights])
    return [idx_to_word[i] for i in cos_dis.argsort()[:10]]
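
The nearest-neighbour lookup can also be written with vectorized NumPy operations. The sketch below is an alternative formulation on hypothetical data (`toy_words`, `toy_vectors`, and the helper `nearest` are made up for illustration); it ranks by cosine similarity, which gives the same order as sorting by cosine distance.

```python
import numpy as np

toy_words = ["king", "queen", "apple"]       # hypothetical vocabulary
toy_vectors = np.array([[1.0, 0.9, 0.0],
                        [0.9, 1.0, 0.1],
                        [0.0, 0.1, 1.0]])    # hypothetical embeddings, one row per word

def nearest(word, k=2):
    """Return the k words whose vectors have the highest cosine similarity to `word`."""
    query = toy_vectors[toy_words.index(word)]
    norms = np.linalg.norm(toy_vectors, axis=1) * np.linalg.norm(query)
    cos_sim = toy_vectors @ query / norms
    order = np.argsort(-cos_sim)  # descending similarity
    return [toy_words[i] for i in order[:k]]

print(nearest("king"))
```

As with `find_nearest`, the query word itself comes first because its similarity to itself is the maximum possible value.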

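Training the model:
- The model is typically trained for several epochs
- In each epoch, the whole dataset is split into batches
- Move each batch's center words, context words, and negative samples to the GPU when CUDA is available
- Forward pass: compute the negative sampling loss from the center words, their context words, and the sampled negative words
- Average the per-example losses over the batch
- Clear the model's current gradients
- Backward pass
- Update the model parameters
- Every fixed number of iterations, print the current loss and evaluate the embeddings on the word-similarity datasets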

In [16]:
optimizer = torch.optim.SGD(model.parameters(), lr=LEARNING_RATE)
for e in range(NUM_EPOCHS):
    for i, (input_labels, pos_labels, neg_labels) in enumerate(dataloader):
        
        
        # TODO
        input_labels = input_labels.long()
        pos_labels = pos_labels.long()
        neg_labels = neg_labels.long()
        if USE_CUDA:
            input_labels = input_labels.cuda()
            pos_labels = pos_labels.cuda()
            neg_labels = neg_labels.cuda()
            
        optimizer.zero_grad()
        loss = model(input_labels, pos_labels, neg_labels).mean()
        loss.backward()
        optimizer.step()

        if i % 100 == 0:
#            with open(LOG_FILE, "a") as fout:
#                fout.write("epoch: {}, iter: {}, loss: {}\n".format(e, i, loss.item()))
            print("epoch: {}, iter: {}, loss: {}".format(e, i, loss.item()))

        if i % 2000 == 0:
            embedding_weights = model.input_embeddings()
            # Forward slashes keep the evaluation file paths portable across operating systems
            sim_simlex = evaluate("embedding/simlex-999.txt", embedding_weights)
            sim_men = evaluate("embedding/men.txt", embedding_weights)
            sim_353 = evaluate("embedding/wordsim353.csv", embedding_weights)
#            with open(LOG_FILE, "a") as fout:
            print("epoch: {}, iteration: {}, simlex-999: {}, men: {}, sim353: {}, nearest to monster: {}\n".format(
                    e, i, sim_simlex, sim_men, sim_353, find_nearest("monster")))
#                fout.write("epoch: {}, iteration: {}, simlex-999: {}, men: {}, sim353: {}, nearest to monster: {}\n".format(
#                    e, i, sim_simlex, sim_men, sim_353, find_nearest("monster")))
                
    embedding_weights = model.input_embeddings()
    np.save("embedding-{}".format(EMBEDDING_SIZE), embedding_weights)
    torch.save(model.state_dict(), "embedding-{}.th".format(EMBEDDING_SIZE))
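
The training cell saves the model's `state_dict`, but reading it back is not shown. A minimal round-trip sketch follows, using a small `nn.Embedding` as a hypothetical stand-in for `EmbeddingModel` and a temporary file instead of the notebook's checkpoint path.

```python
import os
import tempfile

import torch
import torch.nn as nn

toy_model = nn.Embedding(10, 4)  # hypothetical stand-in for EmbeddingModel

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "toy-embedding.th")

    # Save only the parameters, not the Python object.
    torch.save(toy_model.state_dict(), path)

    # Loading requires a model of the same architecture to copy the parameters into.
    restored = nn.Embedding(10, 4)
    restored.load_state_dict(torch.load(path, map_location="cpu"))

same = torch.equal(toy_model.weight, restored.weight)
print("weights identical after reload:", same)
```

In this notebook the analogous steps would be to construct `EmbeddingModel(VOCAB_SIZE, EMBEDDING_SIZE)` and call `load_state_dict` on the result of `torch.load("embedding-{}.th".format(EMBEDDING_SIZE))`. Passing `map_location="cpu"` lets a checkpoint saved on a GPU be loaded on a machine without one.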

epoch: 0, iter: 0, loss: 30.498878479003906
epoch: 0, iteration: 0, simlex-999: SpearmanrResult(correlation=0.038419267682371507, pvalue=0.3487076419792051), men: SpearmanrResult(correlation=0.03038333081320989, pvalue=0.2738394063825371), sim353: SpearmanrResult(correlation=-0.02851728381964284, pvalue=0.6575798953240686), nearest to monster: ['monster', 'drawn', 'affects', 'focused', 'guidance', 'push', 'compared', 'bismarck', 'encountered', 'buddhist']

epoch: 0, iter: 100, loss: 30.487579345703125
epoch: 0, iter: 200, loss: 30.173709869384766
epoch: 0, iter: 300, loss: 29.176044464111328
epoch: 0, iter: 400, loss: 28.22866439819336
epoch: 0, iter: 500, loss: 27.254440307617188
epoch: 0, iter: 600, loss: 25.54012680053711
epoch: 0, iter: 700, loss: 23.82415008544922
epoch: 0, iter: 800, loss: 24.10273551940918
epoch: 0, iter: 900, loss: 24.398141860961914
epoch: 0, iter: 1000, loss: 25.676937103271484
epoch: 0, iter: 1100, loss: 21.91301918029785
epoch: 0, iter: 1200, loss: 20.88068

epoch: 0, iter: 12100, loss: 13.25427532196045
epoch: 0, iter: 12200, loss: 12.95503044128418
epoch: 0, iter: 12300, loss: 13.252090454101562
epoch: 0, iter: 12400, loss: 13.105239868164062
epoch: 0, iter: 12500, loss: 13.69926643371582
epoch: 0, iter: 12600, loss: 13.682611465454102
epoch: 0, iter: 12700, loss: 13.328731536865234
epoch: 0, iter: 12800, loss: 13.191139221191406
epoch: 0, iter: 12900, loss: 12.566657066345215
epoch: 0, iter: 13000, loss: 13.497055053710938
epoch: 0, iter: 13100, loss: 13.831032752990723
epoch: 0, iter: 13200, loss: 12.346025466918945
epoch: 0, iter: 13300, loss: 14.12915325164795
epoch: 0, iter: 13400, loss: 13.068685531616211
epoch: 0, iter: 13500, loss: 13.708019256591797
epoch: 0, iter: 13600, loss: 13.4109468460083
epoch: 0, iter: 13700, loss: 12.937657356262207
epoch: 0, iter: 13800, loss: 13.079425811767578
epoch: 0, iter: 13900, loss: 13.839799880981445
epoch: 0, iter: 14000, loss: 13.190155982971191
epoch: 0, iteration: 14000, simlex-999: Spearm

epoch: 0, iter: 24100, loss: 12.252010345458984
epoch: 0, iter: 24200, loss: 11.623882293701172
epoch: 0, iter: 24300, loss: 12.626989364624023
epoch: 0, iter: 24400, loss: 12.15478515625
epoch: 0, iter: 24500, loss: 12.346567153930664
epoch: 0, iter: 24600, loss: 11.919984817504883
epoch: 0, iter: 24700, loss: 12.55777359008789
epoch: 0, iter: 24800, loss: 12.016989707946777
epoch: 0, iter: 24900, loss: 12.572108268737793
epoch: 0, iter: 25000, loss: 11.995954513549805
epoch: 0, iter: 25100, loss: 12.41183853149414
epoch: 0, iter: 25200, loss: 11.38509750366211
epoch: 0, iter: 25300, loss: 12.902754783630371
epoch: 0, iter: 25400, loss: 11.913442611694336
epoch: 0, iter: 25500, loss: 12.687858581542969
epoch: 0, iter: 25600, loss: 11.980302810668945
epoch: 0, iter: 25700, loss: 12.748454093933105
epoch: 0, iter: 25800, loss: 12.460587501525879
epoch: 0, iter: 25900, loss: 12.35887336730957
epoch: 0, iter: 26000, loss: 12.11594295501709
epoch: 0, iteration: 26000, simlex-999: Spearmanr

epoch: 0, iter: 36100, loss: 11.610977172851562
epoch: 0, iter: 36200, loss: 12.651154518127441
epoch: 0, iter: 36300, loss: 12.242701530456543
epoch: 0, iter: 36400, loss: 12.130355834960938
epoch: 0, iter: 36500, loss: 11.517946243286133
epoch: 0, iter: 36600, loss: 12.078970909118652
epoch: 0, iter: 36700, loss: 12.557184219360352
epoch: 0, iter: 36800, loss: 11.657432556152344
epoch: 0, iter: 36900, loss: 12.103632926940918
epoch: 0, iter: 37000, loss: 12.138496398925781
epoch: 0, iter: 37100, loss: 11.97175407409668
epoch: 0, iter: 37200, loss: 12.5131196975708
epoch: 0, iter: 37300, loss: 12.814860343933105
epoch: 0, iter: 37400, loss: 11.409767150878906
epoch: 0, iter: 37500, loss: 12.247803688049316
epoch: 0, iter: 37600, loss: 12.004988670349121
epoch: 0, iter: 37700, loss: 11.94013786315918
epoch: 0, iter: 37800, loss: 11.871380805969238
epoch: 0, iter: 37900, loss: 11.786941528320312
epoch: 0, iter: 38000, loss: 13.039443969726562
epoch: 0, iteration: 38000, simlex-999: Spea

epoch: 0, iter: 48100, loss: 11.51461410522461
epoch: 0, iter: 48200, loss: 11.825233459472656
epoch: 0, iter: 48300, loss: 11.835733413696289
epoch: 0, iter: 48400, loss: 11.552790641784668
epoch: 0, iter: 48500, loss: 11.608691215515137
epoch: 0, iter: 48600, loss: 12.621800422668457
epoch: 0, iter: 48700, loss: 11.954909324645996
epoch: 0, iter: 48800, loss: 11.846529006958008
epoch: 0, iter: 48900, loss: 11.820432662963867
epoch: 0, iter: 49000, loss: 11.999134063720703
epoch: 0, iter: 49100, loss: 11.94824504852295
epoch: 0, iter: 49200, loss: 11.539299964904785
epoch: 0, iter: 49300, loss: 11.879488945007324
epoch: 0, iter: 49400, loss: 12.390554428100586
epoch: 0, iter: 49500, loss: 11.956644058227539
epoch: 0, iter: 49600, loss: 11.819853782653809
epoch: 0, iter: 49700, loss: 12.099848747253418
epoch: 0, iter: 49800, loss: 11.637537002563477
epoch: 0, iter: 49900, loss: 11.650900840759277
epoch: 0, iter: 50000, loss: 12.219585418701172
epoch: 0, iteration: 50000, simlex-999: Sp

epoch: 0, iter: 60100, loss: 11.92955493927002
epoch: 0, iter: 60200, loss: 11.983994483947754
epoch: 0, iter: 60300, loss: 12.198835372924805
epoch: 0, iter: 60400, loss: 11.788212776184082
epoch: 0, iter: 60500, loss: 11.45529556274414
epoch: 0, iter: 60600, loss: 12.190597534179688
epoch: 0, iter: 60700, loss: 12.243377685546875
epoch: 0, iter: 60800, loss: 11.891860961914062
epoch: 0, iter: 60900, loss: 11.839883804321289
epoch: 0, iter: 61000, loss: 12.103757858276367
epoch: 0, iter: 61100, loss: 11.315523147583008
epoch: 0, iter: 61200, loss: 12.33134937286377
epoch: 0, iter: 61300, loss: 11.319604873657227
epoch: 0, iter: 61400, loss: 11.827451705932617
epoch: 0, iter: 61500, loss: 12.247234344482422
epoch: 0, iter: 61600, loss: 11.396427154541016
epoch: 0, iter: 61700, loss: 11.62763786315918
epoch: 0, iter: 61800, loss: 11.621210098266602
epoch: 0, iter: 61900, loss: 10.91627025604248
epoch: 0, iter: 62000, loss: 12.114912033081055
epoch: 0, iteration: 62000, simlex-999: Spear

epoch: 0, iter: 72100, loss: 11.903616905212402
epoch: 0, iter: 72200, loss: 11.547494888305664
epoch: 0, iter: 72300, loss: 11.700858116149902
epoch: 0, iter: 72400, loss: 11.19325065612793
epoch: 0, iter: 72500, loss: 11.431058883666992
epoch: 0, iter: 72600, loss: 11.787359237670898
epoch: 0, iter: 72700, loss: 12.173054695129395
epoch: 0, iter: 72800, loss: 11.30528736114502
epoch: 0, iter: 72900, loss: 12.110675811767578
epoch: 0, iter: 73000, loss: 11.243267059326172
epoch: 0, iter: 73100, loss: 12.117194175720215
epoch: 0, iter: 73200, loss: 11.367107391357422
epoch: 0, iter: 73300, loss: 12.032305717468262
epoch: 0, iter: 73400, loss: 11.906856536865234
epoch: 0, iter: 73500, loss: 11.581188201904297
epoch: 0, iter: 73600, loss: 11.529533386230469
epoch: 0, iter: 73700, loss: 12.36591625213623
epoch: 0, iter: 73800, loss: 11.5829496383667
epoch: 0, iter: 73900, loss: 11.376861572265625
epoch: 0, iter: 74000, loss: 11.302669525146484
epoch: 0, iteration: 74000, simlex-999: Spear

epoch: 0, iter: 84100, loss: 11.645893096923828
epoch: 0, iter: 84200, loss: 11.66749382019043
epoch: 0, iter: 84300, loss: 11.632616996765137
epoch: 0, iter: 84400, loss: 12.082725524902344
epoch: 0, iter: 84500, loss: 11.184089660644531
epoch: 0, iter: 84600, loss: 11.732464790344238
epoch: 0, iter: 84700, loss: 11.856660842895508
epoch: 0, iter: 84800, loss: 12.243293762207031
epoch: 0, iter: 84900, loss: 11.361895561218262
epoch: 0, iter: 85000, loss: 11.812590599060059
epoch: 0, iter: 85100, loss: 11.55022144317627
epoch: 0, iter: 85200, loss: 12.16949462890625
epoch: 0, iter: 85300, loss: 11.362552642822266
epoch: 0, iter: 85400, loss: 11.60274887084961
epoch: 0, iter: 85500, loss: 11.502060890197754
epoch: 0, iter: 85600, loss: 12.215263366699219
epoch: 0, iter: 85700, loss: 11.595083236694336
epoch: 0, iter: 85800, loss: 11.694725036621094
epoch: 0, iter: 85900, loss: 12.204821586608887
epoch: 0, iter: 86000, loss: 12.307565689086914
epoch: 0, iteration: 86000, simlex-999: Spea

epoch: 0, iter: 96200, loss: 11.705784797668457
epoch: 0, iter: 96300, loss: 11.141871452331543
epoch: 0, iter: 96400, loss: 11.687528610229492
epoch: 0, iter: 96500, loss: 11.194098472595215
epoch: 0, iter: 96600, loss: 11.558453559875488
epoch: 0, iter: 96700, loss: 11.489604949951172
epoch: 0, iter: 96800, loss: 12.067207336425781
epoch: 0, iter: 96900, loss: 11.773247718811035
epoch: 0, iter: 97000, loss: 11.652469635009766
epoch: 0, iter: 97100, loss: 11.680973052978516
epoch: 0, iter: 97200, loss: 11.69735336303711
epoch: 0, iter: 97300, loss: 11.451690673828125
epoch: 0, iter: 97400, loss: 11.941868782043457
epoch: 0, iter: 97500, loss: 11.489564895629883
epoch: 0, iter: 97600, loss: 11.740006446838379
epoch: 0, iter: 97700, loss: 12.565382957458496
epoch: 0, iter: 97800, loss: 11.574379920959473
epoch: 0, iter: 97900, loss: 11.068246841430664
epoch: 0, iter: 98000, loss: 11.9818115234375
epoch: 0, iteration: 98000, simlex-999: SpearmanrResult(correlation=0.06652133393953952, pv

epoch: 0, iter: 108100, loss: 11.925078392028809
epoch: 0, iter: 108200, loss: 11.301591873168945
epoch: 0, iter: 108300, loss: 11.854233741760254
epoch: 0, iter: 108400, loss: 11.471846580505371
epoch: 0, iter: 108500, loss: 10.744473457336426
epoch: 0, iter: 108600, loss: 11.227954864501953
epoch: 0, iter: 108700, loss: 12.08677864074707
epoch: 0, iter: 108800, loss: 11.27961254119873
epoch: 0, iter: 108900, loss: 11.187448501586914
epoch: 0, iter: 109000, loss: 11.52439022064209
epoch: 0, iter: 109100, loss: 11.733097076416016
epoch: 0, iter: 109200, loss: 12.1475191116333
epoch: 0, iter: 109300, loss: 11.690363883972168
epoch: 0, iter: 109400, loss: 11.410561561584473
epoch: 0, iter: 109500, loss: 11.264134407043457
epoch: 0, iter: 109600, loss: 11.513944625854492
epoch: 0, iter: 109700, loss: 11.091390609741211
epoch: 0, iter: 109800, loss: 11.972676277160645
epoch: 0, iter: 109900, loss: 11.513616561889648
epoch: 0, iter: 110000, loss: 10.993926048278809
epoch: 0, iteration: 1100

epoch: 0, iter: 120100, loss: 11.7240571975708
epoch: 0, iter: 120200, loss: 12.243743896484375
epoch: 0, iter: 120300, loss: 11.465514183044434
epoch: 0, iter: 120400, loss: 11.78659439086914
epoch: 0, iter: 120500, loss: 11.646685600280762
epoch: 0, iter: 120600, loss: 11.690075874328613
epoch: 0, iter: 120700, loss: 11.784485816955566
epoch: 0, iter: 120800, loss: 11.835149765014648
epoch: 0, iter: 120900, loss: 11.441770553588867
epoch: 0, iter: 121000, loss: 12.045429229736328
epoch: 0, iter: 121100, loss: 11.674468040466309
epoch: 0, iter: 121200, loss: 11.901330947875977
epoch: 0, iter: 121300, loss: 11.655827522277832
epoch: 0, iter: 121400, loss: 12.179803848266602
epoch: 0, iter: 121500, loss: 12.560428619384766
epoch: 0, iter: 121600, loss: 11.750929832458496
epoch: 0, iter: 121700, loss: 11.582982063293457
epoch: 0, iter: 121800, loss: 11.686595916748047
epoch: 0, iter: 121900, loss: 12.155259132385254
epoch: 0, iter: 122000, loss: 11.640657424926758
epoch: 0, iteration: 12

epoch: 0, iter: 132100, loss: 11.355051040649414
epoch: 0, iter: 132200, loss: 11.964483261108398
epoch: 0, iter: 132300, loss: 12.116920471191406
epoch: 0, iter: 132400, loss: 10.713314056396484
epoch: 0, iter: 132500, loss: 11.686285018920898
epoch: 0, iter: 132600, loss: 12.553925514221191
epoch: 0, iter: 132700, loss: 11.027671813964844
epoch: 0, iter: 132800, loss: 11.600082397460938
epoch: 0, iter: 132900, loss: 11.402132987976074
epoch: 0, iter: 133000, loss: 11.613564491271973
epoch: 0, iter: 133100, loss: 11.432049751281738
epoch: 0, iter: 133200, loss: 11.793468475341797
epoch: 0, iter: 133300, loss: 11.942193031311035
epoch: 0, iter: 133400, loss: 11.84040641784668
epoch: 0, iter: 133500, loss: 11.361369132995605
epoch: 0, iter: 133600, loss: 10.886395454406738
epoch: 0, iter: 133700, loss: 10.827814102172852
epoch: 0, iter: 133800, loss: 11.458784103393555
epoch: 0, iter: 133900, loss: 11.306711196899414
epoch: 0, iter: 134000, loss: 11.8717041015625
epoch: 0, iteration: 13

epoch: 0, iter: 144100, loss: 11.4356050491333
epoch: 0, iter: 144200, loss: 11.634811401367188
epoch: 0, iter: 144300, loss: 11.083287239074707
epoch: 0, iter: 144400, loss: 11.311080932617188
epoch: 0, iter: 144500, loss: 12.236970901489258
epoch: 0, iter: 144600, loss: 11.448102951049805
epoch: 0, iter: 144700, loss: 10.947938919067383
epoch: 0, iter: 144800, loss: 11.367523193359375
epoch: 0, iter: 144900, loss: 11.834500312805176
epoch: 0, iter: 145000, loss: 11.755539894104004
epoch: 0, iter: 145100, loss: 11.179652214050293
epoch: 0, iter: 145200, loss: 11.906011581420898
epoch: 0, iter: 145300, loss: 11.234804153442383
epoch: 0, iter: 145400, loss: 11.980571746826172
epoch: 0, iter: 145500, loss: 12.10007095336914
epoch: 0, iter: 145600, loss: 11.743902206420898
epoch: 0, iter: 145700, loss: 11.748456954956055
epoch: 0, iter: 145800, loss: 12.265347480773926
epoch: 0, iter: 145900, loss: 11.490777015686035
epoch: 0, iter: 146000, loss: 11.12562370300293
epoch: 0, iteration: 146

epoch: 0, iter: 156100, loss: 11.745922088623047
epoch: 0, iter: 156200, loss: 11.779789924621582
epoch: 0, iter: 156300, loss: 11.855816841125488
epoch: 0, iter: 156400, loss: 10.929489135742188
epoch: 0, iter: 156500, loss: 11.64958381652832
epoch: 0, iter: 156600, loss: 11.584351539611816
epoch: 0, iter: 156700, loss: 11.642061233520508
epoch: 0, iter: 156800, loss: 12.368048667907715
epoch: 0, iter: 156900, loss: 11.622170448303223
epoch: 0, iter: 157000, loss: 11.496026039123535
epoch: 0, iter: 157100, loss: 11.927581787109375
epoch: 0, iter: 157200, loss: 11.708251953125
epoch: 0, iter: 157300, loss: 11.439584732055664
epoch: 0, iter: 157400, loss: 11.209953308105469
epoch: 0, iter: 157500, loss: 11.851616859436035
epoch: 0, iter: 157600, loss: 11.70179557800293
epoch: 0, iter: 157700, loss: 11.841743469238281
epoch: 0, iter: 157800, loss: 11.623915672302246
epoch: 0, iter: 157900, loss: 11.688606262207031
epoch: 0, iter: 158000, loss: 11.223831176757812
epoch: 0, iteration: 1580

epoch: 0, iter: 168100, loss: 11.039896965026855
epoch: 0, iter: 168200, loss: 11.110939979553223
epoch: 0, iter: 168300, loss: 11.912881851196289
epoch: 0, iter: 168400, loss: 11.161237716674805
epoch: 0, iter: 168500, loss: 10.611534118652344
epoch: 0, iter: 168600, loss: 11.528963088989258
epoch: 0, iter: 168700, loss: 11.71331787109375
epoch: 0, iter: 168800, loss: 10.98074722290039
epoch: 0, iter: 168900, loss: 12.09313678741455
epoch: 0, iter: 169000, loss: 11.26278305053711
epoch: 0, iter: 169100, loss: 11.225979804992676
epoch: 0, iter: 169200, loss: 11.757174491882324
epoch: 0, iter: 169300, loss: 11.647377014160156
epoch: 0, iter: 169400, loss: 11.423784255981445
epoch: 0, iter: 169500, loss: 11.621198654174805
epoch: 0, iter: 169600, loss: 12.080116271972656
epoch: 0, iter: 169700, loss: 11.187460899353027
epoch: 0, iter: 169800, loss: 11.186626434326172
epoch: 0, iter: 169900, loss: 11.547027587890625
epoch: 0, iter: 170000, loss: 11.4912748336792
epoch: 0, iteration: 17000

epoch: 0, iter: 180100, loss: 11.090185165405273
epoch: 0, iter: 180200, loss: 11.953389167785645
epoch: 0, iter: 180300, loss: 11.54367446899414
epoch: 0, iter: 180400, loss: 11.753445625305176
epoch: 0, iter: 180500, loss: 11.46009349822998
epoch: 0, iter: 180600, loss: 11.766695976257324
epoch: 0, iter: 180700, loss: 11.10057544708252
epoch: 0, iter: 180800, loss: 10.722545623779297
epoch: 0, iter: 180900, loss: 11.46096420288086
epoch: 0, iter: 181000, loss: 12.153016090393066
epoch: 0, iter: 181100, loss: 11.785270690917969
epoch: 0, iter: 181200, loss: 11.276440620422363
epoch: 0, iter: 181300, loss: 11.68997573852539
epoch: 0, iter: 181400, loss: 11.880184173583984
epoch: 0, iter: 181500, loss: 10.43293571472168
epoch: 0, iter: 181600, loss: 11.225030899047852
epoch: 0, iter: 181700, loss: 11.082812309265137
epoch: 0, iter: 181800, loss: 12.115926742553711
epoch: 0, iter: 181900, loss: 11.35693359375
epoch: 0, iter: 182000, loss: 11.686400413513184
epoch: 0, iteration: 182000, s

epoch: 0, iter: 192100, loss: 11.137178421020508
epoch: 0, iter: 192200, loss: 11.457162857055664
epoch: 0, iter: 192300, loss: 11.075242042541504
epoch: 0, iter: 192400, loss: 10.966530799865723
epoch: 0, iter: 192500, loss: 11.903085708618164
epoch: 0, iter: 192600, loss: 10.985834121704102
epoch: 0, iter: 192700, loss: 11.185771942138672
epoch: 0, iter: 192800, loss: 11.517904281616211
epoch: 0, iter: 192900, loss: 12.232954025268555
epoch: 0, iter: 193000, loss: 11.539508819580078
epoch: 0, iter: 193100, loss: 10.41205883026123
epoch: 0, iter: 193200, loss: 11.262755393981934
epoch: 0, iter: 193300, loss: 11.246628761291504
epoch: 0, iter: 193400, loss: 11.322210311889648
epoch: 0, iter: 193500, loss: 11.147115707397461
epoch: 0, iter: 193600, loss: 10.661967277526855
epoch: 0, iter: 193700, loss: 11.923043251037598
epoch: 0, iter: 193800, loss: 11.415914535522461
epoch: 0, iter: 193900, loss: 11.389062881469727
epoch: 0, iter: 194000, loss: 11.046623229980469
epoch: 0, iteration: 

epoch: 0, iter: 204100, loss: 12.069547653198242
epoch: 0, iter: 204200, loss: 11.146254539489746
epoch: 0, iter: 204300, loss: 11.893793106079102
epoch: 0, iter: 204400, loss: 11.607210159301758
epoch: 0, iter: 204500, loss: 12.03763198852539
epoch: 0, iter: 204600, loss: 11.474799156188965
epoch: 0, iter: 204700, loss: 11.233381271362305
epoch: 0, iter: 204800, loss: 11.991693496704102
epoch: 0, iter: 204900, loss: 11.771105766296387
epoch: 0, iter: 205000, loss: 11.326339721679688
epoch: 0, iter: 205100, loss: 11.7488431930542
epoch: 0, iter: 205200, loss: 11.5686616897583
epoch: 0, iter: 205300, loss: 12.080113410949707
epoch: 0, iter: 205400, loss: 10.87888240814209
epoch: 0, iter: 205500, loss: 11.283507347106934
epoch: 0, iter: 205600, loss: 11.663151741027832
epoch: 0, iter: 205700, loss: 11.183731079101562
epoch: 0, iter: 205800, loss: 11.23902416229248
epoch: 0, iter: 205900, loss: 11.189193725585938
epoch: 0, iter: 206000, loss: 11.550821304321289
epoch: 0, iteration: 206000

epoch: 0, iter: 216100, loss: 11.1701021194458
epoch: 0, iter: 216200, loss: 12.03540325164795
epoch: 0, iter: 216300, loss: 12.089823722839355
epoch: 0, iter: 216400, loss: 11.287643432617188
epoch: 0, iter: 216500, loss: 11.284204483032227
epoch: 0, iter: 216600, loss: 10.915752410888672
epoch: 0, iter: 216700, loss: 11.872382164001465
epoch: 0, iter: 216800, loss: 11.521516799926758
epoch: 0, iter: 216900, loss: 10.597962379455566
epoch: 0, iter: 217000, loss: 11.078991889953613
epoch: 0, iter: 217100, loss: 11.2544584274292
epoch: 0, iter: 217200, loss: 11.758410453796387
epoch: 0, iter: 217300, loss: 11.50547981262207
epoch: 0, iter: 217400, loss: 11.467378616333008
epoch: 0, iter: 217500, loss: 10.961986541748047
epoch: 0, iter: 217600, loss: 11.891656875610352
epoch: 0, iter: 217700, loss: 11.148626327514648
epoch: 0, iter: 217800, loss: 11.049010276794434
epoch: 0, iter: 217900, loss: 12.007476806640625
epoch: 0, iter: 218000, loss: 11.875362396240234
epoch: 0, iteration: 21800

epoch: 0, iter: 228100, loss: 11.542698860168457
epoch: 0, iter: 228200, loss: 11.453659057617188
epoch: 0, iter: 228300, loss: 10.677862167358398
epoch: 0, iter: 228400, loss: 11.099722862243652
epoch: 0, iter: 228500, loss: 11.446996688842773
epoch: 0, iter: 228600, loss: 11.294355392456055
epoch: 0, iter: 228700, loss: 11.643220901489258
epoch: 0, iter: 228800, loss: 11.088661193847656
epoch: 0, iter: 228900, loss: 11.101150512695312
epoch: 0, iter: 229000, loss: 11.90011215209961
epoch: 0, iter: 229100, loss: 11.389514923095703
epoch: 0, iter: 229200, loss: 10.56958293914795
epoch: 0, iter: 229300, loss: 11.619606971740723
epoch: 0, iter: 229400, loss: 11.546365737915039
epoch: 0, iter: 229500, loss: 10.676097869873047
epoch: 0, iter: 229600, loss: 11.87415599822998
epoch: 0, iter: 229700, loss: 11.591650009155273
epoch: 0, iter: 229800, loss: 12.202205657958984
epoch: 0, iter: 229900, loss: 11.679741859436035
epoch: 0, iter: 230000, loss: 11.528962135314941
epoch: 0, iteration: 23

epoch: 0, iter: 240100, loss: 11.118514060974121
epoch: 0, iter: 240200, loss: 11.2254056930542
epoch: 0, iter: 240300, loss: 11.758247375488281
epoch: 0, iter: 240400, loss: 12.122637748718262
epoch: 0, iter: 240500, loss: 11.088740348815918
epoch: 0, iter: 240600, loss: 11.657621383666992
epoch: 0, iter: 240700, loss: 11.675909996032715
epoch: 0, iter: 240800, loss: 11.690454483032227
epoch: 0, iter: 240900, loss: 11.472369194030762
epoch: 0, iter: 241000, loss: 12.155494689941406
epoch: 0, iter: 241100, loss: 11.84433364868164
epoch: 0, iter: 241200, loss: 11.395380020141602
epoch: 0, iter: 241300, loss: 12.184316635131836
epoch: 0, iter: 241400, loss: 11.35064697265625
epoch: 0, iter: 241500, loss: 10.436685562133789
epoch: 0, iter: 241600, loss: 11.854750633239746
epoch: 0, iter: 241700, loss: 11.880831718444824
epoch: 0, iter: 241800, loss: 10.667790412902832
epoch: 0, iter: 241900, loss: 11.35820198059082
epoch: 0, iter: 242000, loss: 11.643926620483398
epoch: 0, iteration: 2420

epoch: 0, iter: 252100, loss: 11.351543426513672
epoch: 0, iter: 252200, loss: 11.929389953613281
epoch: 0, iter: 252300, loss: 10.69474983215332
epoch: 0, iter: 252400, loss: 11.482152938842773
epoch: 0, iter: 252500, loss: 11.830768585205078
epoch: 0, iter: 252600, loss: 11.284900665283203
epoch: 0, iter: 252700, loss: 11.239938735961914
epoch: 0, iter: 252800, loss: 11.099348068237305
epoch: 0, iter: 252900, loss: 11.666582107543945
epoch: 0, iter: 253000, loss: 11.891585350036621
epoch: 0, iter: 253100, loss: 12.187602996826172
epoch: 0, iter: 253200, loss: 11.698333740234375
epoch: 0, iter: 253300, loss: 11.771513938903809
epoch: 0, iter: 253400, loss: 10.893658638000488
epoch: 0, iter: 253500, loss: 11.124748229980469
epoch: 0, iter: 253600, loss: 10.977779388427734
epoch: 0, iter: 253700, loss: 11.06995677947998
epoch: 0, iter: 253800, loss: 12.125645637512207
epoch: 0, iter: 253900, loss: 11.270247459411621
epoch: 0, iter: 254000, loss: 11.883584022521973
epoch: 0, iteration: 2

epoch: 0, iter: 264100, loss: 11.112356185913086
epoch: 0, iter: 264200, loss: 11.611568450927734
epoch: 0, iter: 264300, loss: 11.513851165771484
epoch: 0, iter: 264400, loss: 10.997224807739258
epoch: 0, iter: 264500, loss: 11.413169860839844
epoch: 0, iter: 264600, loss: 11.203251838684082
epoch: 0, iter: 264700, loss: 11.168595314025879
epoch: 0, iter: 264800, loss: 11.742165565490723
epoch: 0, iter: 264900, loss: 11.674676895141602
epoch: 0, iter: 265000, loss: 11.40509033203125
epoch: 0, iter: 265100, loss: 11.36636734008789
epoch: 0, iter: 265200, loss: 12.40318775177002
epoch: 0, iter: 265300, loss: 11.874512672424316
epoch: 0, iter: 265400, loss: 11.110729217529297
epoch: 0, iter: 265500, loss: 11.24526596069336
epoch: 0, iter: 265600, loss: 11.370089530944824
epoch: 0, iter: 265700, loss: 11.529664993286133
epoch: 0, iter: 265800, loss: 11.289911270141602
epoch: 0, iter: 265900, loss: 10.757589340209961
epoch: 0, iter: 266000, loss: 11.596047401428223
epoch: 0, iteration: 266

epoch: 0, iter: 276100, loss: 11.470839500427246
epoch: 0, iter: 276200, loss: 10.995189666748047
epoch: 0, iter: 276300, loss: 10.977116584777832
epoch: 0, iter: 276400, loss: 11.436588287353516
epoch: 0, iter: 276500, loss: 11.23040771484375
epoch: 0, iter: 276600, loss: 11.454416275024414
epoch: 0, iter: 276700, loss: 11.936423301696777
epoch: 0, iter: 276800, loss: 11.450571060180664
epoch: 0, iter: 276900, loss: 11.35940170288086
epoch: 0, iter: 277000, loss: 11.026724815368652
epoch: 0, iter: 277100, loss: 11.046656608581543
epoch: 0, iter: 277200, loss: 11.651577949523926
epoch: 0, iter: 277300, loss: 11.289670944213867
epoch: 0, iter: 277400, loss: 11.361491203308105
epoch: 0, iter: 277500, loss: 11.371637344360352
epoch: 0, iter: 277600, loss: 11.45529556274414
epoch: 0, iter: 277700, loss: 11.057517051696777
epoch: 0, iter: 277800, loss: 11.462898254394531
epoch: 0, iter: 277900, loss: 11.397003173828125
epoch: 0, iter: 278000, loss: 11.665596961975098
epoch: 0, iteration: 27

epoch: 0, iter: 288100, loss: 10.737722396850586
epoch: 0, iter: 288200, loss: 10.708979606628418
epoch: 0, iter: 288300, loss: 11.679311752319336
epoch: 0, iter: 288400, loss: 11.256463050842285
epoch: 0, iter: 288500, loss: 12.909807205200195
epoch: 0, iter: 288600, loss: 11.213282585144043
epoch: 0, iter: 288700, loss: 11.544684410095215
epoch: 0, iter: 288800, loss: 10.83371639251709
epoch: 0, iter: 288900, loss: 10.897549629211426
epoch: 0, iter: 289000, loss: 10.962357521057129
epoch: 0, iter: 289100, loss: 11.1315279006958
epoch: 0, iter: 289200, loss: 11.68606948852539
epoch: 0, iter: 289300, loss: 11.382966995239258
epoch: 0, iter: 289400, loss: 11.495035171508789
epoch: 0, iter: 289500, loss: 11.255025863647461
epoch: 0, iter: 289600, loss: 11.429162979125977
epoch: 0, iter: 289700, loss: 11.341487884521484
epoch: 0, iter: 289800, loss: 10.897457122802734
epoch: 0, iter: 289900, loss: 11.665961265563965
epoch: 0, iter: 290000, loss: 11.434289932250977
epoch: 0, iteration: 290

epoch: 0, iter: 300100, loss: 11.454395294189453
epoch: 0, iter: 300200, loss: 11.075627326965332
epoch: 0, iter: 300300, loss: 12.213724136352539
epoch: 0, iter: 300400, loss: 11.342933654785156
epoch: 0, iter: 300500, loss: 11.734464645385742
epoch: 0, iter: 300600, loss: 11.523920059204102
epoch: 0, iter: 300700, loss: 11.548796653747559
epoch: 0, iter: 300800, loss: 11.181425094604492
epoch: 0, iter: 300900, loss: 11.509031295776367
epoch: 0, iter: 301000, loss: 10.907533645629883
epoch: 0, iter: 301100, loss: 11.461844444274902
epoch: 0, iter: 301200, loss: 11.310422897338867
epoch: 0, iter: 301300, loss: 11.590632438659668
epoch: 0, iter: 301400, loss: 12.068644523620605
epoch: 0, iter: 301500, loss: 11.865399360656738
epoch: 0, iter: 301600, loss: 11.272879600524902
epoch: 0, iter: 301700, loss: 10.898653984069824
epoch: 0, iter: 301800, loss: 11.158014297485352
epoch: 0, iter: 301900, loss: 11.693648338317871
epoch: 0, iter: 302000, loss: 11.736193656921387
epoch: 0, iteration:

epoch: 0, iter: 312100, loss: 10.9503173828125
epoch: 0, iter: 312200, loss: 11.556374549865723
epoch: 0, iter: 312300, loss: 11.951698303222656
epoch: 0, iter: 312400, loss: 11.758517265319824
epoch: 0, iter: 312500, loss: 11.652471542358398
epoch: 0, iter: 312600, loss: 11.324699401855469
epoch: 0, iter: 312700, loss: 11.414352416992188
epoch: 0, iter: 312800, loss: 11.548324584960938
epoch: 0, iter: 312900, loss: 11.889546394348145
epoch: 0, iter: 313000, loss: 11.512578010559082
epoch: 0, iter: 313100, loss: 11.128235816955566
epoch: 0, iter: 313200, loss: 11.248445510864258
epoch: 0, iter: 313300, loss: 10.363587379455566
epoch: 0, iter: 313400, loss: 11.840630531311035
epoch: 0, iter: 313500, loss: 11.281039237976074
epoch: 0, iter: 313600, loss: 10.941038131713867
epoch: 0, iter: 313700, loss: 11.017096519470215
epoch: 0, iter: 313800, loss: 11.25251579284668
epoch: 0, iter: 313900, loss: 11.785686492919922
epoch: 0, iter: 314000, loss: 10.975617408752441
epoch: 0, iteration: 31

epoch: 0, iter: 324100, loss: 12.024188041687012
epoch: 0, iter: 324200, loss: 11.466096878051758
epoch: 0, iter: 324300, loss: 11.34410572052002
epoch: 0, iter: 324400, loss: 11.86225700378418
epoch: 0, iter: 324500, loss: 11.665178298950195
epoch: 0, iter: 324600, loss: 11.913183212280273
epoch: 0, iter: 324700, loss: 10.93771743774414
epoch: 0, iter: 324800, loss: 12.522050857543945
epoch: 0, iter: 324900, loss: 11.08456802368164
epoch: 0, iter: 325000, loss: 11.031689643859863
epoch: 0, iter: 325100, loss: 10.475017547607422
epoch: 0, iter: 325200, loss: 10.900367736816406
epoch: 0, iter: 325300, loss: 12.149263381958008
epoch: 0, iter: 325400, loss: 11.302902221679688
epoch: 0, iter: 325500, loss: 11.374266624450684
epoch: 0, iter: 325600, loss: 11.396123886108398
epoch: 0, iter: 325700, loss: 11.511282920837402
epoch: 0, iter: 325800, loss: 11.260234832763672
epoch: 0, iter: 325900, loss: 11.329360961914062
epoch: 0, iter: 326000, loss: 11.2615327835083
epoch: 0, iteration: 32600

epoch: 0, iter: 336100, loss: 11.991351127624512
epoch: 0, iter: 336200, loss: 11.344064712524414
epoch: 0, iter: 336300, loss: 11.024457931518555
epoch: 0, iter: 336400, loss: 11.101261138916016
epoch: 0, iter: 336500, loss: 10.766898155212402
epoch: 0, iter: 336600, loss: 11.024117469787598
epoch: 0, iter: 336700, loss: 10.777314186096191
epoch: 0, iter: 336800, loss: 11.182938575744629
epoch: 0, iter: 336900, loss: 11.169828414916992
epoch: 0, iter: 337000, loss: 11.776643753051758
epoch: 0, iter: 337100, loss: 10.846199989318848
epoch: 0, iter: 337200, loss: 11.469167709350586
epoch: 0, iter: 337300, loss: 11.635801315307617
epoch: 0, iter: 337400, loss: 11.143136024475098
epoch: 0, iter: 337500, loss: 10.350590705871582
epoch: 0, iter: 337600, loss: 11.447093963623047
epoch: 0, iter: 337700, loss: 11.46265697479248
epoch: 0, iter: 337800, loss: 11.27100944519043
epoch: 0, iter: 337900, loss: 10.955117225646973
epoch: 0, iter: 338000, loss: 10.785004615783691
epoch: 0, iteration: 3

epoch: 0, iter: 348100, loss: 11.785426139831543
epoch: 0, iter: 348200, loss: 10.857661247253418
epoch: 0, iter: 348300, loss: 11.850385665893555
epoch: 0, iter: 348400, loss: 11.696052551269531
epoch: 0, iter: 348500, loss: 11.23501205444336
epoch: 0, iter: 348600, loss: 11.015312194824219
epoch: 0, iter: 348700, loss: 11.532173156738281
epoch: 0, iter: 348800, loss: 10.420066833496094
epoch: 0, iter: 348900, loss: 10.695056915283203
epoch: 0, iter: 349000, loss: 11.601234436035156
epoch: 0, iter: 349100, loss: 10.954727172851562
epoch: 0, iter: 349200, loss: 11.028566360473633
epoch: 0, iter: 349300, loss: 10.841217994689941
epoch: 0, iter: 349400, loss: 11.176000595092773
epoch: 0, iter: 349500, loss: 11.29881477355957
epoch: 0, iter: 349600, loss: 11.299059867858887
epoch: 0, iter: 349700, loss: 11.2485933303833
epoch: 0, iter: 349800, loss: 11.616020202636719
epoch: 0, iter: 349900, loss: 11.486583709716797
epoch: 0, iter: 350000, loss: 11.816116333007812
epoch: 0, iteration: 350

epoch: 0, iter: 360100, loss: 11.427862167358398
epoch: 0, iter: 360200, loss: 11.34166145324707
epoch: 0, iter: 360300, loss: 11.686708450317383
epoch: 0, iter: 360400, loss: 10.925749778747559
epoch: 0, iter: 360500, loss: 10.956079483032227
epoch: 0, iter: 360600, loss: 11.283712387084961
epoch: 0, iter: 360700, loss: 11.541314125061035
epoch: 0, iter: 360800, loss: 11.599288940429688
epoch: 0, iter: 360900, loss: 11.085792541503906
epoch: 0, iter: 361000, loss: 12.012290000915527
epoch: 0, iter: 361100, loss: 10.876388549804688
epoch: 0, iter: 361200, loss: 11.786643028259277
epoch: 0, iter: 361300, loss: 11.525917053222656
epoch: 0, iter: 361400, loss: 10.861713409423828
epoch: 0, iter: 361500, loss: 11.039424896240234
epoch: 0, iter: 361600, loss: 11.295374870300293
epoch: 0, iter: 361700, loss: 11.437152862548828
epoch: 0, iter: 361800, loss: 12.139970779418945
epoch: 0, iter: 361900, loss: 10.98206615447998
epoch: 0, iter: 362000, loss: 11.424155235290527
epoch: 0, iteration: 3

epoch: 0, iter: 372100, loss: 11.681467056274414
epoch: 0, iter: 372200, loss: 11.486305236816406
epoch: 0, iter: 372300, loss: 11.03354263305664
epoch: 0, iter: 372400, loss: 11.298450469970703
epoch: 0, iter: 372500, loss: 11.841912269592285
epoch: 0, iter: 372600, loss: 11.838630676269531
epoch: 0, iter: 372700, loss: 11.6641263961792
epoch: 0, iter: 372800, loss: 11.945866584777832
epoch: 0, iter: 372900, loss: 11.063522338867188
epoch: 0, iter: 373000, loss: 11.142114639282227
epoch: 0, iter: 373100, loss: 11.235666275024414
epoch: 0, iter: 373200, loss: 10.965405464172363
epoch: 0, iter: 373300, loss: 11.760381698608398
epoch: 0, iter: 373400, loss: 10.888360977172852
epoch: 0, iter: 373500, loss: 11.766383171081543
epoch: 0, iter: 373600, loss: 11.461511611938477
epoch: 0, iter: 373700, loss: 10.727136611938477
epoch: 0, iter: 373800, loss: 11.509536743164062
epoch: 0, iter: 373900, loss: 10.955926895141602
epoch: 0, iter: 374000, loss: 11.14954662322998
epoch: 0, iteration: 374

epoch: 0, iter: 384100, loss: 10.856330871582031
epoch: 0, iter: 384200, loss: 11.33133602142334
epoch: 0, iter: 384300, loss: 10.65456485748291
epoch: 0, iter: 384400, loss: 11.64821720123291
epoch: 0, iter: 384500, loss: 10.674111366271973
epoch: 0, iter: 384600, loss: 10.873263359069824
epoch: 0, iter: 384700, loss: 10.958484649658203
epoch: 0, iter: 384800, loss: 11.347101211547852
epoch: 0, iter: 384900, loss: 11.111138343811035
epoch: 0, iter: 385000, loss: 11.009770393371582
epoch: 0, iter: 385100, loss: 11.189162254333496
epoch: 0, iter: 385200, loss: 10.854365348815918
epoch: 0, iter: 385300, loss: 11.612722396850586
epoch: 0, iter: 385400, loss: 11.193160057067871
epoch: 0, iter: 385500, loss: 11.82893180847168
epoch: 0, iter: 385600, loss: 11.259968757629395
epoch: 0, iter: 385700, loss: 11.22782039642334
epoch: 0, iter: 385800, loss: 11.31762981414795
epoch: 0, iter: 385900, loss: 11.047696113586426
epoch: 0, iter: 386000, loss: 11.559019088745117
epoch: 0, iteration: 38600

epoch: 0, iter: 396100, loss: 11.29081916809082
epoch: 0, iter: 396200, loss: 12.231623649597168
epoch: 0, iter: 396300, loss: 11.342403411865234
epoch: 0, iter: 396400, loss: 12.016568183898926
epoch: 0, iter: 396500, loss: 11.94694995880127
epoch: 0, iter: 396600, loss: 10.597472190856934
epoch: 0, iter: 396700, loss: 11.346927642822266
epoch: 0, iter: 396800, loss: 11.149637222290039
epoch: 0, iter: 396900, loss: 11.944972038269043
epoch: 0, iter: 397000, loss: 11.349061965942383
epoch: 0, iter: 397100, loss: 11.601856231689453
epoch: 0, iter: 397200, loss: 11.069112777709961
epoch: 0, iter: 397300, loss: 10.918126106262207
epoch: 0, iter: 397400, loss: 11.204771041870117
epoch: 0, iter: 397500, loss: 11.376755714416504
epoch: 0, iter: 397600, loss: 11.317876815795898
epoch: 0, iter: 397700, loss: 10.919249534606934
epoch: 0, iter: 397800, loss: 10.720453262329102
epoch: 0, iter: 397900, loss: 10.761446952819824
epoch: 0, iter: 398000, loss: 11.46279525756836
epoch: 0, iteration: 39

epoch: 0, iter: 408100, loss: 11.779541969299316
epoch: 0, iter: 408200, loss: 11.650177001953125
epoch: 0, iter: 408300, loss: 11.791118621826172
epoch: 0, iter: 408400, loss: 10.825400352478027
epoch: 0, iter: 408500, loss: 11.601175308227539
epoch: 0, iter: 408600, loss: 12.147032737731934
epoch: 0, iter: 408700, loss: 10.999972343444824
epoch: 0, iter: 408800, loss: 12.574047088623047
epoch: 0, iter: 408900, loss: 11.579950332641602
epoch: 0, iter: 409000, loss: 11.637733459472656
epoch: 0, iter: 409100, loss: 11.412740707397461
epoch: 0, iter: 409200, loss: 11.277876853942871
epoch: 0, iter: 409300, loss: 11.734307289123535
epoch: 0, iter: 409400, loss: 11.455856323242188
epoch: 0, iter: 409500, loss: 11.052420616149902
epoch: 0, iter: 409600, loss: 11.059600830078125
epoch: 0, iter: 409700, loss: 12.0580415725708
epoch: 0, iter: 409800, loss: 11.26962661743164
epoch: 0, iter: 409900, loss: 11.790458679199219
epoch: 0, iter: 410000, loss: 11.379515647888184
epoch: 0, iteration: 41

epoch: 0, iter: 420100, loss: 10.455055236816406
epoch: 0, iter: 420200, loss: 11.575783729553223
epoch: 0, iter: 420300, loss: 11.021283149719238
epoch: 0, iter: 420400, loss: 11.274637222290039
epoch: 0, iter: 420500, loss: 10.360069274902344
epoch: 0, iter: 420600, loss: 11.636823654174805
epoch: 0, iter: 420700, loss: 10.890934944152832
epoch: 0, iter: 420800, loss: 12.317852973937988
epoch: 0, iter: 420900, loss: 11.6246976852417
epoch: 0, iter: 421000, loss: 11.164423942565918
epoch: 0, iter: 421100, loss: 12.145063400268555
epoch: 0, iter: 421200, loss: 12.04057502746582
epoch: 0, iter: 421300, loss: 12.14256477355957
epoch: 0, iter: 421400, loss: 11.30630874633789
epoch: 0, iter: 421500, loss: 11.030746459960938
epoch: 0, iter: 421600, loss: 11.737908363342285
epoch: 0, iter: 421700, loss: 10.92831802368164
epoch: 0, iter: 421800, loss: 11.715524673461914
epoch: 0, iter: 421900, loss: 11.661860466003418
epoch: 0, iter: 422000, loss: 11.635869979858398
epoch: 0, iteration: 42200

epoch: 0, iter: 432100, loss: 10.966445922851562
epoch: 0, iter: 432200, loss: 11.823445320129395
epoch: 0, iter: 432300, loss: 11.342015266418457
epoch: 0, iter: 432400, loss: 11.226673126220703
epoch: 0, iter: 432500, loss: 11.17452335357666
epoch: 0, iter: 432600, loss: 11.598169326782227
epoch: 0, iter: 432700, loss: 11.33889102935791
epoch: 0, iter: 432800, loss: 10.518770217895508
epoch: 0, iter: 432900, loss: 11.55213737487793
epoch: 0, iter: 433000, loss: 11.06940746307373
epoch: 0, iter: 433100, loss: 11.56338119506836
epoch: 0, iter: 433200, loss: 11.469023704528809
epoch: 0, iter: 433300, loss: 11.329580307006836
epoch: 0, iter: 433400, loss: 10.751382827758789
epoch: 0, iter: 433500, loss: 11.35960865020752
epoch: 0, iter: 433600, loss: 11.261940956115723
epoch: 0, iter: 433700, loss: 12.174213409423828
epoch: 0, iter: 433800, loss: 11.416852951049805
epoch: 0, iter: 433900, loss: 10.941497802734375
epoch: 0, iter: 434000, loss: 11.705933570861816
epoch: 0, iteration: 43400

epoch: 0, iter: 444100, loss: 11.683469772338867
epoch: 0, iter: 444200, loss: 12.01870346069336
epoch: 0, iter: 444300, loss: 11.331128120422363
epoch: 0, iter: 444400, loss: 11.0176362991333
epoch: 0, iter: 444500, loss: 10.556511878967285
epoch: 0, iter: 444600, loss: 11.246403694152832
epoch: 0, iter: 444700, loss: 10.725255012512207
epoch: 0, iter: 444800, loss: 11.01286792755127
epoch: 0, iter: 444900, loss: 11.764869689941406
epoch: 0, iter: 445000, loss: 11.24378776550293
epoch: 0, iter: 445100, loss: 11.032085418701172
epoch: 0, iter: 445200, loss: 10.605826377868652
epoch: 0, iter: 445300, loss: 11.376529693603516
epoch: 0, iter: 445400, loss: 11.80075740814209
epoch: 0, iter: 445500, loss: 11.571449279785156
epoch: 0, iter: 445600, loss: 11.272469520568848
epoch: 0, iter: 445700, loss: 11.605645179748535
epoch: 0, iter: 445800, loss: 11.805532455444336
epoch: 0, iter: 445900, loss: 11.91768741607666
epoch: 0, iter: 446000, loss: 11.170608520507812
epoch: 0, iteration: 446000

epoch: 0, iter: 456100, loss: 11.114847183227539
epoch: 0, iter: 456200, loss: 11.014121055603027
epoch: 0, iter: 456300, loss: 11.698596000671387
epoch: 0, iter: 456400, loss: 11.620346069335938
epoch: 0, iter: 456500, loss: 11.491429328918457
epoch: 0, iter: 456600, loss: 11.071727752685547
epoch: 0, iter: 456700, loss: 11.740540504455566
epoch: 0, iter: 456800, loss: 11.199275970458984
epoch: 0, iter: 456900, loss: 10.765937805175781
epoch: 0, iter: 457000, loss: 10.742925643920898
epoch: 0, iter: 457100, loss: 11.283797264099121
epoch: 0, iter: 457200, loss: 11.571139335632324
epoch: 0, iter: 457300, loss: 10.49561882019043
epoch: 0, iter: 457400, loss: 12.006497383117676
epoch: 0, iter: 457500, loss: 11.3284912109375
epoch: 0, iter: 457600, loss: 11.081747055053711
epoch: 0, iter: 457700, loss: 11.480844497680664
epoch: 0, iter: 457800, loss: 12.322287559509277
epoch: 0, iter: 457900, loss: 11.576096534729004
epoch: 0, iter: 458000, loss: 11.278947830200195
epoch: 0, iteration: 45

epoch: 0, iter: 468100, loss: 10.845634460449219
epoch: 0, iter: 468200, loss: 11.238813400268555
epoch: 0, iter: 468300, loss: 11.498359680175781
epoch: 0, iter: 468400, loss: 11.546070098876953
epoch: 0, iter: 468500, loss: 11.050278663635254
epoch: 0, iter: 468600, loss: 11.497187614440918
epoch: 0, iter: 468700, loss: 12.108531951904297
epoch: 0, iter: 468800, loss: 11.675360679626465
epoch: 0, iter: 468900, loss: 11.11349105834961
epoch: 0, iter: 469000, loss: 11.38863754272461
epoch: 0, iter: 469100, loss: 11.70390796661377
epoch: 0, iter: 469200, loss: 11.999160766601562
epoch: 0, iter: 469300, loss: 11.262419700622559
epoch: 0, iter: 469400, loss: 11.108016967773438
epoch: 0, iter: 469500, loss: 11.351262092590332
epoch: 0, iter: 469600, loss: 11.535988807678223
epoch: 0, iter: 469700, loss: 10.62255859375
epoch: 0, iter: 469800, loss: 11.125840187072754
epoch: 0, iter: 469900, loss: 11.531224250793457
epoch: 0, iter: 470000, loss: 11.057272911071777
epoch: 0, iteration: 470000

epoch: 1, iter: 1800, loss: 10.918845176696777
epoch: 1, iter: 1900, loss: 11.020317077636719
epoch: 1, iter: 2000, loss: 11.269536972045898
epoch: 1, iteration: 2000, simlex-999: SpearmanrResult(correlation=0.15090879941345595, pvalue=0.00021498080422196138), men: SpearmanrResult(correlation=0.33215434827617996, pvalue=7.897089133397721e-35), sim353: SpearmanrResult(correlation=0.35198715596965896, pvalue=1.5869568401419347e-08), nearest to monster: ['monster', 'cow', 'pen', 'arch', 'bull', 'flower', 'loch', 'dam', 'whale', 'mosque']

epoch: 1, iter: 2100, loss: 11.25626277923584
epoch: 1, iter: 2200, loss: 11.687488555908203
epoch: 1, iter: 2300, loss: 10.832658767700195
epoch: 1, iter: 2400, loss: 11.819114685058594
epoch: 1, iter: 2500, loss: 11.393999099731445
epoch: 1, iter: 2600, loss: 11.445016860961914
epoch: 1, iter: 2700, loss: 11.441900253295898
epoch: 1, iter: 2800, loss: 11.33147144317627
epoch: 1, iter: 2900, loss: 11.011962890625
epoch: 1, iter: 3000, loss: 10.942445755

epoch: 1, iter: 14100, loss: 11.300692558288574
epoch: 1, iter: 14200, loss: 11.322765350341797
epoch: 1, iter: 14300, loss: 11.719510078430176
epoch: 1, iter: 14400, loss: 11.035600662231445
epoch: 1, iter: 14500, loss: 11.087489128112793
epoch: 1, iter: 14600, loss: 10.96407413482666
epoch: 1, iter: 14700, loss: 11.198843002319336
epoch: 1, iter: 14800, loss: 11.052695274353027
epoch: 1, iter: 14900, loss: 10.754159927368164
epoch: 1, iter: 15000, loss: 11.385823249816895
epoch: 1, iter: 15100, loss: 11.128301620483398
epoch: 1, iter: 15200, loss: 11.61777114868164
epoch: 1, iter: 15300, loss: 11.487518310546875
epoch: 1, iter: 15400, loss: 11.185871124267578
epoch: 1, iter: 15500, loss: 11.686366081237793
epoch: 1, iter: 15600, loss: 12.15280532836914
epoch: 1, iter: 15700, loss: 11.225166320800781
epoch: 1, iter: 15800, loss: 11.135313034057617
epoch: 1, iter: 15900, loss: 11.344971656799316
epoch: 1, iter: 16000, loss: 11.436079025268555
epoch: 1, iteration: 16000, simlex-999: Spe

epoch: 1, iter: 26300, loss: 10.741165161132812
epoch: 1, iter: 26400, loss: 11.445236206054688
epoch: 1, iter: 26500, loss: 11.764164924621582
epoch: 1, iter: 26600, loss: 11.207797050476074
epoch: 1, iter: 26700, loss: 11.276344299316406
epoch: 1, iter: 26800, loss: 11.390767097473145
epoch: 1, iter: 26900, loss: 10.588213920593262
epoch: 1, iter: 27000, loss: 11.219806671142578
epoch: 1, iter: 27100, loss: 11.30235481262207
epoch: 1, iter: 27200, loss: 11.827703475952148
epoch: 1, iter: 27300, loss: 10.829516410827637
epoch: 1, iter: 27400, loss: 11.697892189025879
epoch: 1, iter: 27500, loss: 11.147656440734863
epoch: 1, iter: 27600, loss: 11.575058937072754
epoch: 1, iter: 27700, loss: 10.97193717956543
epoch: 1, iter: 27800, loss: 11.619051933288574
epoch: 1, iter: 27900, loss: 11.748370170593262
epoch: 1, iter: 28000, loss: 11.37106704711914
epoch: 1, iteration: 28000, simlex-999: SpearmanrResult(correlation=0.15444242860362395, pvalue=0.0001516044754082988), men: SpearmanrResul

epoch: 1, iter: 38400, loss: 10.915035247802734
epoch: 1, iter: 38500, loss: 11.814226150512695
epoch: 1, iter: 38600, loss: 10.76147747039795
epoch: 1, iter: 38700, loss: 10.984491348266602
epoch: 1, iter: 38800, loss: 11.729494094848633
epoch: 1, iter: 38900, loss: 10.968881607055664
epoch: 1, iter: 39000, loss: 10.598572731018066
epoch: 1, iter: 39100, loss: 11.15229320526123
epoch: 1, iter: 39200, loss: 10.862957000732422
epoch: 1, iter: 39300, loss: 11.871477127075195
epoch: 1, iter: 39400, loss: 11.34459114074707
epoch: 1, iter: 39500, loss: 10.897807121276855
epoch: 1, iter: 39600, loss: 10.217729568481445
epoch: 1, iter: 39700, loss: 11.482390403747559
epoch: 1, iter: 39800, loss: 10.861772537231445
epoch: 1, iter: 39900, loss: 11.109482765197754
epoch: 1, iter: 40000, loss: 10.884627342224121
epoch: 1, iteration: 40000, simlex-999: SpearmanrResult(correlation=0.15719335733490786, pvalue=0.00011490476497698774), men: SpearmanrResult(correlation=0.3518337639139103, pvalue=3.7940

epoch: 1, iter: 50500, loss: 11.130485534667969
epoch: 1, iter: 50600, loss: 11.703081130981445
epoch: 1, iter: 50700, loss: 11.171920776367188
epoch: 1, iter: 50800, loss: 11.606524467468262
epoch: 1, iter: 50900, loss: 11.21096134185791
epoch: 1, iter: 51000, loss: 11.486978530883789
epoch: 1, iter: 51100, loss: 10.909452438354492
epoch: 1, iter: 51200, loss: 11.458338737487793
epoch: 1, iter: 51300, loss: 10.677668571472168
epoch: 1, iter: 51400, loss: 11.346306800842285
epoch: 1, iter: 51500, loss: 11.29880428314209
epoch: 1, iter: 51600, loss: 11.05435562133789
epoch: 1, iter: 51700, loss: 11.695789337158203
epoch: 1, iter: 51800, loss: 10.515220642089844
epoch: 1, iter: 51900, loss: 11.463676452636719
epoch: 1, iter: 52000, loss: 11.127321243286133
epoch: 1, iteration: 52000, simlex-999: SpearmanrResult(correlation=0.15524168206615999, pvalue=0.00013994108890254565), men: SpearmanrResult(correlation=0.35392407427120426, pvalue=1.2646494742336553e-39), sim353: SpearmanrResult(corr

epoch: 1, iter: 62600, loss: 10.986783027648926
epoch: 1, iter: 62700, loss: 11.254256248474121
epoch: 1, iter: 62800, loss: 10.825746536254883
epoch: 1, iter: 62900, loss: 11.510576248168945
epoch: 1, iter: 63000, loss: 11.250343322753906
epoch: 1, iter: 63100, loss: 11.198448181152344
epoch: 1, iter: 63200, loss: 11.104626655578613
epoch: 1, iter: 63300, loss: 11.141884803771973
epoch: 1, iter: 63400, loss: 11.258155822753906
epoch: 1, iter: 63500, loss: 11.839920997619629
epoch: 1, iter: 63600, loss: 11.472936630249023
epoch: 1, iter: 63700, loss: 11.532313346862793
epoch: 1, iter: 63800, loss: 11.41724967956543
epoch: 1, iter: 63900, loss: 11.194047927856445
epoch: 1, iter: 64000, loss: 11.166964530944824
epoch: 1, iteration: 64000, simlex-999: SpearmanrResult(correlation=0.15627135221155894, pvalue=0.00012615551179248535), men: SpearmanrResult(correlation=0.36157218684415743, pvalue=2.1153050158873865e-41), sim353: SpearmanrResult(correlation=0.3663686798182753, pvalue=3.637500560

epoch: 1, iter: 74700, loss: 11.530078887939453
epoch: 1, iter: 74800, loss: 11.090250015258789
epoch: 1, iter: 74900, loss: 12.13181209564209
epoch: 1, iter: 75000, loss: 11.295372009277344
epoch: 1, iter: 75100, loss: 10.835882186889648
epoch: 1, iter: 75200, loss: 11.940363883972168
epoch: 1, iter: 75300, loss: 11.269214630126953
epoch: 1, iter: 75400, loss: 11.47337818145752
epoch: 1, iter: 75500, loss: 11.688067436218262
epoch: 1, iter: 75600, loss: 11.160213470458984
epoch: 1, iter: 75700, loss: 11.311019897460938
epoch: 1, iter: 75800, loss: 11.642290115356445
epoch: 1, iter: 75900, loss: 11.299044609069824
epoch: 1, iter: 76000, loss: 12.117277145385742
epoch: 1, iteration: 76000, simlex-999: SpearmanrResult(correlation=0.16189266201393918, pvalue=7.080357761568978e-05), men: SpearmanrResult(correlation=0.3657171292105785, pvalue=2.198199840697219e-42), sim353: SpearmanrResult(correlation=0.3661402639791035, pvalue=3.725750662002795e-09), nearest to monster: ['monster', 'cow', 

epoch: 1, iter: 86800, loss: 10.970561981201172
epoch: 1, iter: 86900, loss: 11.55356502532959
epoch: 1, iter: 87000, loss: 11.633825302124023
epoch: 1, iter: 87100, loss: 11.666046142578125
epoch: 1, iter: 87200, loss: 12.03233528137207
epoch: 1, iter: 87300, loss: 11.547616958618164
epoch: 1, iter: 87400, loss: 10.470470428466797
epoch: 1, iter: 87500, loss: 11.386842727661133
epoch: 1, iter: 87600, loss: 12.083517074584961
epoch: 1, iter: 87700, loss: 10.774431228637695
epoch: 1, iter: 87800, loss: 11.270758628845215
epoch: 1, iter: 87900, loss: 11.031774520874023
epoch: 1, iter: 88000, loss: 10.85326862335205
epoch: 1, iteration: 88000, simlex-999: SpearmanrResult(correlation=0.16548311284355996, pvalue=4.846344836997148e-05), men: SpearmanrResult(correlation=0.3723877125453185, pvalue=5.357403830146028e-44), sim353: SpearmanrResult(correlation=0.3699390713441367, pvalue=2.494648068749578e-09), nearest to monster: ['monster', 'cow', 'bat', 'flower', 'triangle', 'snake', 'whale', 'a

epoch: 1, iter: 98900, loss: 11.6860933303833
epoch: 1, iter: 99000, loss: 10.756143569946289
epoch: 1, iter: 99100, loss: 11.42257308959961
epoch: 1, iter: 99200, loss: 11.56196403503418
epoch: 1, iter: 99300, loss: 11.244466781616211
epoch: 1, iter: 99400, loss: 10.90241527557373
epoch: 1, iter: 99500, loss: 11.694575309753418
epoch: 1, iter: 99600, loss: 11.520233154296875
epoch: 1, iter: 99700, loss: 10.918465614318848
epoch: 1, iter: 99800, loss: 11.152124404907227
epoch: 1, iter: 99900, loss: 11.188076972961426
epoch: 1, iter: 100000, loss: 11.45207691192627
epoch: 1, iteration: 100000, simlex-999: SpearmanrResult(correlation=0.16663169954160023, pvalue=4.285633590587805e-05), men: SpearmanrResult(correlation=0.373970135419778, pvalue=2.1912020736273673e-44), sim353: SpearmanrResult(correlation=0.383330930905853, pvalue=5.815588522636261e-10), nearest to monster: ['monster', 'cow', 'whale', 'flower', 'snake', 'arch', 'triangle', 'bat', 'bull', 'loch']

epoch: 1, iter: 110900, loss: 11.047225952148438
epoch: 1, iter: 111000, loss: 11.546764373779297
epoch: 1, iter: 111100, loss: 11.548361778259277
epoch: 1, iter: 111200, loss: 11.594120979309082
epoch: 1, iter: 111300, loss: 11.421357154846191
epoch: 1, iter: 111400, loss: 10.285360336303711
epoch: 1, iter: 111500, loss: 11.129070281982422
epoch: 1, iter: 111600, loss: 11.197073936462402
epoch: 1, iter: 111700, loss: 11.639020919799805
epoch: 1, iter: 111800, loss: 11.650994300842285
epoch: 1, iter: 111900, loss: 11.028433799743652
epoch: 1, iter: 112000, loss: 11.283931732177734
epoch: 1, iteration: 112000, simlex-999: SpearmanrResult(correlation=0.16334754977785348, pvalue=6.077953473786415e-05), men: SpearmanrResult(correlation=0.38179282821237287, pvalue=2.450688186222278e-46), sim353: SpearmanrResult(correlation=0.39661870223706075, pvalue=1.2828600733150737e-10), nearest to monster: ['monster', 'whale', 'cow', 'snake', 'loch', 'flower', 'triangle', 'arch', 'cloud', 'bat']

epoch: 1, iter: 122700, loss: 11.24117660522461
epoch: 1, iter: 122800, loss: 11.996676445007324
epoch: 1, iter: 122900, loss: 11.07351016998291
epoch: 1, iter: 123000, loss: 11.715804100036621
epoch: 1, iter: 123100, loss: 11.295879364013672
epoch: 1, iter: 123200, loss: 11.744889259338379
epoch: 1, iter: 123300, loss: 10.742799758911133
epoch: 1, iter: 123400, loss: 11.66125774383545
epoch: 1, iter: 123500, loss: 11.26734733581543
epoch: 1, iter: 123600, loss: 12.15109920501709
epoch: 1, iter: 123700, loss: 11.682125091552734
epoch: 1, iter: 123800, loss: 11.005929946899414
epoch: 1, iter: 123900, loss: 11.447067260742188
epoch: 1, iter: 124000, loss: 11.351899147033691
epoch: 1, iteration: 124000, simlex-999: SpearmanrResult(correlation=0.16708528568085484, pvalue=4.0815906477387545e-05), men: SpearmanrResult(correlation=0.38226719775204404, pvalue=1.8587917420283223e-46), sim353: SpearmanrResult(correlation=0.39966989175597994, pvalue=8.980050843858137e-11), nearest to monster: ['m

epoch: 1, iter: 134600, loss: 11.435664176940918
epoch: 1, iter: 134700, loss: 12.275751113891602
epoch: 1, iter: 134800, loss: 11.207732200622559
epoch: 1, iter: 134900, loss: 11.636312484741211
epoch: 1, iter: 135000, loss: 11.592269897460938
epoch: 1, iter: 135100, loss: 11.187093734741211
epoch: 1, iter: 135200, loss: 11.854930877685547
epoch: 1, iter: 135300, loss: 11.399690628051758
epoch: 1, iter: 135400, loss: 10.843225479125977
epoch: 1, iter: 135500, loss: 11.493987083435059
epoch: 1, iter: 135600, loss: 10.807875633239746
epoch: 1, iter: 135700, loss: 11.621132850646973
epoch: 1, iter: 135800, loss: 11.534954071044922
epoch: 1, iter: 135900, loss: 11.408248901367188
epoch: 1, iter: 136000, loss: 11.174983978271484
epoch: 1, iteration: 136000, simlex-999: SpearmanrResult(correlation=0.16301460035005252, pvalue=6.29476546109937e-05), men: SpearmanrResult(correlation=0.3876606331308837, pvalue=7.765919203663574e-48), sim353: SpearmanrResult(correlation=0.39767280209522804, pval

epoch: 1, iter: 146500, loss: 11.610860824584961
epoch: 1, iter: 146600, loss: 11.009058952331543
epoch: 1, iter: 146700, loss: 11.563115119934082
epoch: 1, iter: 146800, loss: 11.55832290649414
epoch: 1, iter: 146900, loss: 10.967169761657715
epoch: 1, iter: 147000, loss: 11.176801681518555
epoch: 1, iter: 147100, loss: 11.84765625
epoch: 1, iter: 147200, loss: 11.501907348632812
epoch: 1, iter: 147300, loss: 11.139493942260742
epoch: 1, iter: 147400, loss: 11.263932228088379
epoch: 1, iter: 147500, loss: 11.251258850097656
epoch: 1, iter: 147600, loss: 11.076189041137695
epoch: 1, iter: 147700, loss: 12.021448135375977
epoch: 1, iter: 147800, loss: 12.01584529876709
epoch: 1, iter: 147900, loss: 11.153575897216797
epoch: 1, iter: 148000, loss: 12.35605239868164
epoch: 1, iteration: 148000, simlex-999: SpearmanrResult(correlation=0.16265257741839256, pvalue=6.538788123021611e-05), men: SpearmanrResult(correlation=0.3947639244505942, pvalue=1.0819349113955852e-49), sim353: SpearmanrRes

epoch: 1, iter: 158400, loss: 10.72252368927002
epoch: 1, iter: 158500, loss: 10.875571250915527
epoch: 1, iter: 158600, loss: 11.805222511291504
epoch: 1, iter: 158700, loss: 11.61132526397705
epoch: 1, iter: 158800, loss: 11.531607627868652
epoch: 1, iter: 158900, loss: 10.762349128723145
epoch: 1, iter: 159000, loss: 11.905500411987305
epoch: 1, iter: 159100, loss: 10.740833282470703
epoch: 1, iter: 159200, loss: 11.882152557373047
epoch: 1, iter: 159300, loss: 11.395297050476074
epoch: 1, iter: 159400, loss: 11.226688385009766
epoch: 1, iter: 159500, loss: 11.077245712280273
epoch: 1, iter: 159600, loss: 10.355518341064453
epoch: 1, iter: 159700, loss: 11.106245994567871
epoch: 1, iter: 159800, loss: 11.667930603027344
epoch: 1, iter: 159900, loss: 11.971274375915527
epoch: 1, iter: 160000, loss: 11.034452438354492
epoch: 1, iteration: 160000, simlex-999: SpearmanrResult(correlation=0.16684982131885448, pvalue=4.1863368087040254e-05), men: SpearmanrResult(correlation=0.400270338899

epoch: 1, iter: 170300, loss: 11.580501556396484
epoch: 1, iter: 170400, loss: 11.051185607910156
epoch: 1, iter: 170500, loss: 11.345072746276855
epoch: 1, iter: 170600, loss: 11.313884735107422
epoch: 1, iter: 170700, loss: 11.398548126220703
epoch: 1, iter: 170800, loss: 11.226531028747559
epoch: 1, iter: 170900, loss: 11.06230640411377
epoch: 1, iter: 171000, loss: 11.642644882202148
epoch: 1, iter: 171100, loss: 10.989036560058594
epoch: 1, iter: 171200, loss: 11.260032653808594
epoch: 1, iter: 171300, loss: 11.007243156433105
epoch: 1, iter: 171400, loss: 11.656438827514648
epoch: 1, iter: 171500, loss: 11.098740577697754
epoch: 1, iter: 171600, loss: 11.663400650024414
epoch: 1, iter: 171700, loss: 11.714648246765137
epoch: 1, iter: 171800, loss: 10.940762519836426
epoch: 1, iter: 171900, loss: 11.930944442749023
epoch: 1, iter: 172000, loss: 11.397117614746094
epoch: 1, iteration: 172000, simlex-999: SpearmanrResult(correlation=0.1677598417291675, pvalue=3.795076449194349e-05),

epoch: 1, iter: 182200, loss: 11.17591381072998
epoch: 1, iter: 182300, loss: 11.738001823425293
epoch: 1, iter: 182400, loss: 10.296578407287598
epoch: 1, iter: 182500, loss: 10.653881072998047
epoch: 1, iter: 182600, loss: 11.827786445617676
epoch: 1, iter: 182700, loss: 11.438946723937988
epoch: 1, iter: 182800, loss: 11.66248607635498
epoch: 1, iter: 182900, loss: 11.088054656982422
epoch: 1, iter: 183000, loss: 11.437849998474121
epoch: 1, iter: 183100, loss: 11.40089225769043
epoch: 1, iter: 183200, loss: 11.962478637695312
epoch: 1, iter: 183300, loss: 11.26458740234375
epoch: 1, iter: 183400, loss: 11.413863182067871
epoch: 1, iter: 183500, loss: 11.311695098876953
epoch: 1, iter: 183600, loss: 11.525810241699219
epoch: 1, iter: 183700, loss: 11.298828125
epoch: 1, iter: 183800, loss: 11.142067909240723
epoch: 1, iter: 183900, loss: 10.991665840148926
epoch: 1, iter: 184000, loss: 12.276593208312988
epoch: 1, iteration: 184000, simlex-999: SpearmanrResult(correlation=0.17071843

epoch: 1, iter: 194100, loss: 11.14745044708252
epoch: 1, iter: 194200, loss: 11.127172470092773
epoch: 1, iter: 194300, loss: 10.920754432678223
epoch: 1, iter: 194400, loss: 11.203548431396484
epoch: 1, iter: 194500, loss: 11.766935348510742
epoch: 1, iter: 194600, loss: 11.234634399414062
epoch: 1, iter: 194700, loss: 11.0404052734375
epoch: 1, iter: 194800, loss: 11.350342750549316
epoch: 1, iter: 194900, loss: 10.865541458129883
epoch: 1, iter: 195000, loss: 11.329682350158691
epoch: 1, iter: 195100, loss: 10.73600959777832
epoch: 1, iter: 195200, loss: 11.107250213623047
epoch: 1, iter: 195300, loss: 11.449899673461914
epoch: 1, iter: 195400, loss: 11.06064510345459
epoch: 1, iter: 195500, loss: 11.826132774353027
epoch: 1, iter: 195600, loss: 11.58769702911377
epoch: 1, iter: 195700, loss: 10.986814498901367
epoch: 1, iter: 195800, loss: 10.722884178161621
epoch: 1, iter: 195900, loss: 11.220098495483398
epoch: 1, iter: 196000, loss: 10.714516639709473
epoch: 1, iteration: 19600

epoch: 1, iter: 206100, loss: 11.375326156616211
epoch: 1, iter: 206200, loss: 11.713299751281738
epoch: 1, iter: 206300, loss: 11.019659996032715
epoch: 1, iter: 206400, loss: 11.061277389526367
epoch: 1, iter: 206500, loss: 11.395835876464844
epoch: 1, iter: 206600, loss: 10.537747383117676
epoch: 1, iter: 206700, loss: 10.476123809814453
epoch: 1, iter: 206800, loss: 11.395556449890137
epoch: 1, iter: 206900, loss: 12.215741157531738
epoch: 1, iter: 207000, loss: 10.860994338989258
epoch: 1, iter: 207100, loss: 11.583778381347656
epoch: 1, iter: 207200, loss: 11.494771003723145
epoch: 1, iter: 207300, loss: 11.62319564819336
epoch: 1, iter: 207400, loss: 10.793098449707031
epoch: 1, iter: 207500, loss: 11.848119735717773
epoch: 1, iter: 207600, loss: 11.71525764465332
epoch: 1, iter: 207700, loss: 11.3838472366333
epoch: 1, iter: 207800, loss: 11.87214469909668
epoch: 1, iter: 207900, loss: 10.935977935791016
epoch: 1, iter: 208000, loss: 11.398731231689453
epoch: 1, iteration: 2080

epoch: 1, iter: 218100, loss: 10.951763153076172
epoch: 1, iter: 218200, loss: 11.476723670959473
epoch: 1, iter: 218300, loss: 11.088994979858398
epoch: 1, iter: 218400, loss: 10.987229347229004
epoch: 1, iter: 218500, loss: 11.198566436767578
epoch: 1, iter: 218600, loss: 11.416081428527832
epoch: 1, iter: 218700, loss: 11.41604995727539
epoch: 1, iter: 218800, loss: 11.623327255249023
epoch: 1, iter: 218900, loss: 10.985546112060547
epoch: 1, iter: 219000, loss: 11.147407531738281
epoch: 1, iter: 219100, loss: 12.084134101867676
epoch: 1, iter: 219200, loss: 11.44672966003418
epoch: 1, iter: 219300, loss: 11.676520347595215
epoch: 1, iter: 219400, loss: 11.285768508911133
epoch: 1, iter: 219500, loss: 11.330821990966797
epoch: 1, iter: 219600, loss: 10.97744369506836
epoch: 1, iter: 219700, loss: 11.342808723449707
epoch: 1, iter: 219800, loss: 11.436417579650879
epoch: 1, iter: 219900, loss: 11.059874534606934
epoch: 1, iter: 220000, loss: 10.928915023803711
epoch: 1, iteration: 22

epoch: 1, iter: 230100, loss: 11.303959846496582
epoch: 1, iter: 230200, loss: 10.64719009399414
epoch: 1, iter: 230300, loss: 11.457016944885254
epoch: 1, iter: 230400, loss: 10.654086112976074
epoch: 1, iter: 230500, loss: 11.081222534179688
epoch: 1, iter: 230600, loss: 11.02998161315918
epoch: 1, iter: 230700, loss: 11.127065658569336
epoch: 1, iter: 230800, loss: 11.409205436706543
epoch: 1, iter: 230900, loss: 11.747581481933594
epoch: 1, iter: 231000, loss: 11.192422866821289
epoch: 1, iter: 231100, loss: 10.823511123657227
epoch: 1, iter: 231200, loss: 11.55473518371582
epoch: 1, iter: 231300, loss: 12.091190338134766
epoch: 1, iter: 231400, loss: 11.521707534790039
epoch: 1, iter: 231500, loss: 11.449785232543945
epoch: 1, iter: 231600, loss: 11.024251937866211
epoch: 1, iter: 231700, loss: 11.10603141784668
epoch: 1, iter: 231800, loss: 11.057670593261719
epoch: 1, iter: 231900, loss: 11.616464614868164
epoch: 1, iter: 232000, loss: 11.597519874572754
epoch: 1, iteration: 232

epoch: 1, iter: 242100, loss: 10.814802169799805
epoch: 1, iter: 242200, loss: 11.216727256774902
epoch: 1, iter: 242300, loss: 11.034337997436523
epoch: 1, iter: 242400, loss: 11.900420188903809
epoch: 1, iter: 242500, loss: 11.16045093536377
epoch: 1, iter: 242600, loss: 11.270302772521973
epoch: 1, iter: 242700, loss: 10.879514694213867
epoch: 1, iter: 242800, loss: 11.322898864746094
epoch: 1, iter: 242900, loss: 12.233163833618164
epoch: 1, iter: 243000, loss: 11.218557357788086
epoch: 1, iter: 243100, loss: 11.068227767944336
epoch: 1, iter: 243200, loss: 10.929821014404297
epoch: 1, iter: 243300, loss: 11.940969467163086
epoch: 1, iter: 243400, loss: 11.069089889526367
epoch: 1, iter: 243500, loss: 10.817985534667969
epoch: 1, iter: 243600, loss: 11.690827369689941
epoch: 1, iter: 243700, loss: 11.340550422668457
epoch: 1, iter: 243800, loss: 10.994140625
epoch: 1, iter: 243900, loss: 11.354438781738281
epoch: 1, iter: 244000, loss: 11.3626708984375
epoch: 1, iteration: 244000, 

epoch: 1, iter: 254100, loss: 10.848477363586426
epoch: 1, iter: 254200, loss: 10.418547630310059
epoch: 1, iter: 254300, loss: 11.950201988220215
epoch: 1, iter: 254400, loss: 11.178282737731934
epoch: 1, iter: 254500, loss: 11.19064998626709
epoch: 1, iter: 254600, loss: 10.831796646118164
epoch: 1, iter: 254700, loss: 11.059998512268066
epoch: 1, iter: 254800, loss: 10.56035327911377
epoch: 1, iter: 254900, loss: 11.586791038513184
epoch: 1, iter: 255000, loss: 11.460409164428711
epoch: 1, iter: 255100, loss: 10.374359130859375
epoch: 1, iter: 255200, loss: 12.178007125854492
epoch: 1, iter: 255300, loss: 10.840160369873047
epoch: 1, iter: 255400, loss: 11.060770034790039
epoch: 1, iter: 255500, loss: 10.436591148376465
epoch: 1, iter: 255600, loss: 11.169668197631836
epoch: 1, iter: 255700, loss: 10.561355590820312
epoch: 1, iter: 255800, loss: 11.202978134155273
epoch: 1, iter: 255900, loss: 10.892671585083008
epoch: 1, iter: 256000, loss: 11.21651840209961
epoch: 1, iteration: 25

epoch: 1, iter: 266100, loss: 11.771434783935547
epoch: 1, iter: 266200, loss: 11.432225227355957
epoch: 1, iter: 266300, loss: 11.461756706237793
epoch: 1, iter: 266400, loss: 11.839859008789062
epoch: 1, iter: 266500, loss: 11.04379653930664
epoch: 1, iter: 266600, loss: 11.449333190917969
epoch: 1, iter: 266700, loss: 10.821914672851562
epoch: 1, iter: 266800, loss: 11.647049903869629
epoch: 1, iter: 266900, loss: 11.579934120178223
epoch: 1, iter: 267000, loss: 11.175274848937988
epoch: 1, iter: 267100, loss: 12.299535751342773
epoch: 1, iter: 267200, loss: 11.19509220123291
epoch: 1, iter: 267300, loss: 11.679340362548828
epoch: 1, iter: 267400, loss: 10.725985527038574
epoch: 1, iter: 267500, loss: 11.426072120666504
epoch: 1, iter: 267600, loss: 11.341987609863281
epoch: 1, iter: 267700, loss: 10.872314453125
epoch: 1, iter: 267800, loss: 11.642166137695312
epoch: 1, iter: 267900, loss: 11.54948616027832
epoch: 1, iter: 268000, loss: 11.196860313415527
epoch: 1, iteration: 26800

epoch: 1, iter: 278100, loss: 11.179596900939941
epoch: 1, iter: 278200, loss: 11.372085571289062
epoch: 1, iter: 278300, loss: 11.359447479248047
epoch: 1, iter: 278400, loss: 11.091362953186035
epoch: 1, iter: 278500, loss: 11.386086463928223
epoch: 1, iter: 278600, loss: 11.248091697692871
epoch: 1, iter: 278700, loss: 11.363455772399902
epoch: 1, iter: 278800, loss: 12.239469528198242
epoch: 1, iter: 278900, loss: 11.08115005493164
epoch: 1, iter: 279000, loss: 11.546113014221191
epoch: 1, iter: 279100, loss: 11.385331153869629
epoch: 1, iter: 279200, loss: 11.72231674194336
epoch: 1, iter: 279300, loss: 11.215499877929688
epoch: 1, iter: 279400, loss: 11.591362953186035
epoch: 1, iter: 279500, loss: 11.76828384399414
epoch: 1, iter: 279600, loss: 10.932838439941406
epoch: 1, iter: 279700, loss: 11.532787322998047
epoch: 1, iter: 279800, loss: 11.874547958374023
epoch: 1, iter: 279900, loss: 10.796367645263672
epoch: 1, iter: 280000, loss: 11.368276596069336
epoch: 1, iteration: 28

epoch: 1, iter: 290100, loss: 11.511242866516113
epoch: 1, iter: 290200, loss: 10.90953254699707
epoch: 1, iter: 290300, loss: 11.021191596984863
epoch: 1, iter: 290400, loss: 11.302078247070312
epoch: 1, iter: 290500, loss: 10.85472583770752
epoch: 1, iter: 290600, loss: 10.917892456054688
epoch: 1, iter: 290700, loss: 10.641409873962402
epoch: 1, iter: 290800, loss: 11.055588722229004
epoch: 1, iter: 290900, loss: 10.764321327209473
epoch: 1, iter: 291000, loss: 10.961349487304688
epoch: 1, iter: 291100, loss: 10.872259140014648
epoch: 1, iter: 291200, loss: 11.819900512695312
epoch: 1, iter: 291300, loss: 10.52360725402832
epoch: 1, iter: 291400, loss: 11.433791160583496
epoch: 1, iter: 291500, loss: 11.470471382141113
epoch: 1, iter: 291600, loss: 11.562607765197754
epoch: 1, iter: 291700, loss: 11.592007637023926
epoch: 1, iter: 291800, loss: 11.161138534545898
epoch: 1, iter: 291900, loss: 10.7814302444458
epoch: 1, iter: 292000, loss: 11.335573196411133
epoch: 1, iteration: 2920

epoch: 1, iter: 302100, loss: 10.91454792022705
epoch: 1, iter: 302200, loss: 10.616466522216797
epoch: 1, iter: 302300, loss: 11.491199493408203
epoch: 1, iter: 302400, loss: 11.438224792480469
epoch: 1, iter: 302500, loss: 11.244717597961426
epoch: 1, iter: 302600, loss: 10.479109764099121
epoch: 1, iter: 302700, loss: 11.023923873901367
epoch: 1, iter: 302800, loss: 11.655069351196289
epoch: 1, iter: 302900, loss: 10.986444473266602
epoch: 1, iter: 303000, loss: 11.506070137023926
epoch: 1, iter: 303100, loss: 11.189078330993652
epoch: 1, iter: 303200, loss: 11.649898529052734
epoch: 1, iter: 303300, loss: 11.703609466552734
epoch: 1, iter: 303400, loss: 10.859373092651367
epoch: 1, iter: 303500, loss: 11.040179252624512
epoch: 1, iter: 303600, loss: 11.137129783630371
epoch: 1, iter: 303700, loss: 11.340540885925293
epoch: 1, iter: 303800, loss: 11.355193138122559
epoch: 1, iter: 303900, loss: 11.812112808227539
epoch: 1, iter: 304000, loss: 11.384217262268066
epoch: 1, iteration: 

epoch: 1, iter: 314100, loss: 11.06897258758545
epoch: 1, iter: 314200, loss: 11.61420726776123
epoch: 1, iter: 314300, loss: 11.439647674560547
epoch: 1, iter: 314400, loss: 11.477725982666016
epoch: 1, iter: 314500, loss: 11.68657112121582
epoch: 1, iter: 314600, loss: 11.608104705810547
epoch: 1, iter: 314700, loss: 11.647395133972168
epoch: 1, iter: 314800, loss: 11.685114860534668
epoch: 1, iter: 314900, loss: 11.623188018798828
epoch: 1, iter: 315000, loss: 11.420181274414062
epoch: 1, iter: 315100, loss: 11.637863159179688
epoch: 1, iter: 315200, loss: 11.624120712280273
epoch: 1, iter: 315300, loss: 11.46505355834961
epoch: 1, iter: 315400, loss: 11.204451560974121
epoch: 1, iter: 315500, loss: 11.445466995239258
epoch: 1, iter: 315600, loss: 11.57599925994873
epoch: 1, iter: 315700, loss: 10.906534194946289
epoch: 1, iter: 315800, loss: 11.487659454345703
epoch: 1, iter: 315900, loss: 12.015453338623047
epoch: 1, iter: 316000, loss: 10.89474105834961
epoch: 1, iteration: 31600

epoch: 1, iter: 326100, loss: 11.957534790039062
epoch: 1, iter: 326200, loss: 11.390605926513672
epoch: 1, iter: 326300, loss: 11.41275691986084
epoch: 1, iter: 326400, loss: 11.740699768066406
epoch: 1, iter: 326500, loss: 11.731061935424805
epoch: 1, iter: 326600, loss: 10.917253494262695
epoch: 1, iter: 326700, loss: 11.239794731140137
epoch: 1, iter: 326800, loss: 11.645939826965332
epoch: 1, iter: 326900, loss: 11.255464553833008
epoch: 1, iter: 327000, loss: 11.826922416687012
epoch: 1, iter: 327100, loss: 11.657106399536133
epoch: 1, iter: 327200, loss: 11.788126945495605
epoch: 1, iter: 327300, loss: 10.886468887329102
epoch: 1, iter: 327400, loss: 11.589963912963867
epoch: 1, iter: 327500, loss: 10.91930866241455
epoch: 1, iter: 327600, loss: 11.226459503173828
epoch: 1, iter: 327700, loss: 11.960453987121582
epoch: 1, iter: 327800, loss: 10.949692726135254
epoch: 1, iter: 327900, loss: 10.880739212036133
epoch: 1, iter: 328000, loss: 11.675814628601074
epoch: 1, iteration: 3

epoch: 1, iter: 338100, loss: 11.131051063537598
epoch: 1, iter: 338200, loss: 11.236504554748535
epoch: 1, iter: 338300, loss: 11.552051544189453
epoch: 1, iter: 338400, loss: 11.290592193603516
epoch: 1, iter: 338500, loss: 11.39600944519043
epoch: 1, iter: 338600, loss: 11.361034393310547
epoch: 1, iter: 338700, loss: 10.401175498962402
epoch: 1, iter: 338800, loss: 10.846454620361328
epoch: 1, iter: 338900, loss: 11.617212295532227
epoch: 1, iter: 339000, loss: 11.857020378112793
epoch: 1, iter: 339100, loss: 11.201360702514648
epoch: 1, iter: 339200, loss: 10.626022338867188
epoch: 1, iter: 339300, loss: 11.004541397094727
epoch: 1, iter: 339400, loss: 11.053577423095703
epoch: 1, iter: 339500, loss: 11.841917991638184
epoch: 1, iter: 339600, loss: 11.726404190063477
epoch: 1, iter: 339700, loss: 11.726951599121094
epoch: 1, iter: 339800, loss: 11.687437057495117
epoch: 1, iter: 339900, loss: 11.657244682312012
epoch: 1, iter: 340000, loss: 10.951449394226074
epoch: 1, iteration: 

epoch: 1, iter: 350100, loss: 11.091301918029785
epoch: 1, iter: 350200, loss: 11.38483715057373
epoch: 1, iter: 350300, loss: 12.401779174804688
epoch: 1, iter: 350400, loss: 11.679975509643555
epoch: 1, iter: 350500, loss: 11.849279403686523
epoch: 1, iter: 350600, loss: 11.454946517944336
epoch: 1, iter: 350700, loss: 10.950664520263672
epoch: 1, iter: 350800, loss: 11.001213073730469
epoch: 1, iter: 350900, loss: 11.33930492401123
epoch: 1, iter: 351000, loss: 11.260547637939453
epoch: 1, iter: 351100, loss: 11.473790168762207
epoch: 1, iter: 351200, loss: 11.556967735290527
epoch: 1, iter: 351300, loss: 10.142210006713867
epoch: 1, iter: 351400, loss: 11.749345779418945
epoch: 1, iter: 351500, loss: 10.56413459777832
epoch: 1, iter: 351600, loss: 11.163341522216797
epoch: 1, iter: 351700, loss: 10.551451683044434
epoch: 1, iter: 351800, loss: 11.467304229736328
epoch: 1, iter: 351900, loss: 10.958245277404785
epoch: 1, iter: 352000, loss: 10.862275123596191
epoch: 1, iteration: 35

epoch: 1, iter: 362100, loss: 11.29149055480957
epoch: 1, iter: 362200, loss: 11.190072059631348
epoch: 1, iter: 362300, loss: 10.865127563476562
epoch: 1, iter: 362400, loss: 11.193901062011719
epoch: 1, iter: 362500, loss: 10.762223243713379
epoch: 1, iter: 362600, loss: 11.348251342773438
epoch: 1, iter: 362700, loss: 11.774222373962402
epoch: 1, iter: 362800, loss: 11.425209045410156
epoch: 1, iter: 362900, loss: 10.773710250854492
epoch: 1, iter: 363000, loss: 10.289616584777832
epoch: 1, iter: 363100, loss: 10.440143585205078
epoch: 1, iter: 363200, loss: 11.209319114685059
epoch: 1, iter: 363300, loss: 11.071725845336914
epoch: 1, iter: 363400, loss: 11.560872077941895
epoch: 1, iter: 363500, loss: 11.306280136108398
epoch: 1, iter: 363600, loss: 11.622930526733398
epoch: 1, iter: 363700, loss: 11.065410614013672
epoch: 1, iter: 363800, loss: 11.029680252075195
epoch: 1, iter: 363900, loss: 10.948576927185059
epoch: 1, iter: 364000, loss: 10.556087493896484
epoch: 1, iteration: 

epoch: 1, iter: 374100, loss: 11.39795207977295
epoch: 1, iter: 374200, loss: 11.763384819030762
epoch: 1, iter: 374300, loss: 11.5939302444458
epoch: 1, iter: 374400, loss: 10.580074310302734
epoch: 1, iter: 374500, loss: 11.087350845336914
epoch: 1, iter: 374600, loss: 11.066162109375
epoch: 1, iter: 374700, loss: 12.662107467651367
epoch: 1, iter: 374800, loss: 11.1595458984375
epoch: 1, iter: 374900, loss: 11.64978313446045
epoch: 1, iter: 375000, loss: 11.717304229736328
epoch: 1, iter: 375100, loss: 11.177095413208008
epoch: 1, iter: 375200, loss: 11.250066757202148
epoch: 1, iter: 375300, loss: 10.737218856811523
epoch: 1, iter: 375400, loss: 11.282071113586426
epoch: 1, iter: 375500, loss: 11.27998161315918
epoch: 1, iter: 375600, loss: 11.006904602050781
epoch: 1, iter: 375700, loss: 11.410905838012695
epoch: 1, iter: 375800, loss: 11.424981117248535
epoch: 1, iter: 375900, loss: 10.648628234863281
epoch: 1, iter: 376000, loss: 10.912236213684082
epoch: 1, iteration: 376000, s

epoch: 1, iter: 386100, loss: 11.388280868530273
epoch: 1, iter: 386200, loss: 11.520695686340332
epoch: 1, iter: 386300, loss: 12.200346946716309
epoch: 1, iter: 386400, loss: 10.970626831054688
epoch: 1, iter: 386500, loss: 11.92538070678711
epoch: 1, iter: 386600, loss: 11.41404914855957
epoch: 1, iter: 386700, loss: 11.422754287719727
epoch: 1, iter: 386800, loss: 11.121648788452148
epoch: 1, iter: 386900, loss: 10.547311782836914
epoch: 1, iter: 387000, loss: 11.80119514465332
epoch: 1, iter: 387100, loss: 11.976757049560547
epoch: 1, iter: 387200, loss: 10.93313217163086
epoch: 1, iter: 387300, loss: 11.177508354187012
epoch: 1, iter: 387400, loss: 11.175080299377441
epoch: 1, iter: 387500, loss: 11.784387588500977
epoch: 1, iter: 387600, loss: 11.147073745727539
epoch: 1, iter: 387700, loss: 11.453468322753906
epoch: 1, iter: 387800, loss: 11.667861938476562
epoch: 1, iter: 387900, loss: 11.933662414550781
epoch: 1, iter: 388000, loss: 10.857882499694824
epoch: 1, iteration: 388

epoch: 1, iter: 398100, loss: 11.26270866394043
epoch: 1, iter: 398200, loss: 11.698651313781738
epoch: 1, iter: 398300, loss: 11.021997451782227
epoch: 1, iter: 398400, loss: 11.332110404968262
epoch: 1, iter: 398500, loss: 10.872905731201172
epoch: 1, iter: 398600, loss: 11.304462432861328
epoch: 1, iter: 398700, loss: 12.264683723449707
epoch: 1, iter: 398800, loss: 10.86986255645752
epoch: 1, iter: 398900, loss: 11.722541809082031
epoch: 1, iter: 399000, loss: 11.43885326385498
epoch: 1, iter: 399100, loss: 11.66014575958252
epoch: 1, iter: 399200, loss: 11.308335304260254
epoch: 1, iter: 399300, loss: 10.966008186340332
epoch: 1, iter: 399400, loss: 11.345585823059082
epoch: 1, iter: 399500, loss: 11.246187210083008
epoch: 1, iter: 399600, loss: 11.107443809509277
epoch: 1, iter: 399700, loss: 11.607418060302734
epoch: 1, iter: 399800, loss: 11.282276153564453
epoch: 1, iter: 399900, loss: 11.230591773986816
epoch: 1, iter: 400000, loss: 11.4681396484375
epoch: 1, iteration: 40000

epoch: 1, iter: 410100, loss: 11.425519943237305
epoch: 1, iter: 410200, loss: 11.041234016418457
epoch: 1, iter: 410300, loss: 11.325824737548828
epoch: 1, iter: 410400, loss: 11.657641410827637
epoch: 1, iter: 410500, loss: 11.570671081542969
epoch: 1, iter: 410600, loss: 11.93261432647705
epoch: 1, iter: 410700, loss: 11.195002555847168
epoch: 1, iter: 410800, loss: 11.240930557250977
epoch: 1, iter: 410900, loss: 11.036304473876953
epoch: 1, iter: 411000, loss: 11.501260757446289
epoch: 1, iter: 411100, loss: 11.414464950561523
epoch: 1, iter: 411200, loss: 11.096514701843262
epoch: 1, iter: 411300, loss: 10.847840309143066
epoch: 1, iter: 411400, loss: 10.924610137939453
epoch: 1, iter: 411500, loss: 10.760747909545898
epoch: 1, iter: 411600, loss: 11.838245391845703
epoch: 1, iter: 411700, loss: 11.143621444702148
epoch: 1, iter: 411800, loss: 11.52128791809082
epoch: 1, iter: 411900, loss: 11.197322845458984
epoch: 1, iter: 412000, loss: 11.49935531616211
epoch: 1, iteration: 41

epoch: 1, iter: 422100, loss: 10.962005615234375
epoch: 1, iter: 422200, loss: 11.717196464538574
epoch: 1, iter: 422300, loss: 11.740622520446777
epoch: 1, iter: 422400, loss: 11.439126968383789
epoch: 1, iter: 422500, loss: 11.251435279846191
epoch: 1, iter: 422600, loss: 11.98227596282959
epoch: 1, iter: 422700, loss: 11.151405334472656
epoch: 1, iter: 422800, loss: 11.325945854187012
epoch: 1, iter: 422900, loss: 10.862709999084473
epoch: 1, iter: 423000, loss: 11.037946701049805
epoch: 1, iter: 423100, loss: 11.99278450012207
epoch: 1, iter: 423200, loss: 10.962024688720703
epoch: 1, iter: 423300, loss: 10.907966613769531
epoch: 1, iter: 423400, loss: 10.773236274719238
epoch: 1, iter: 423500, loss: 11.348245620727539
epoch: 1, iter: 423600, loss: 11.209400177001953
epoch: 1, iter: 423700, loss: 10.71605396270752
epoch: 1, iter: 423800, loss: 10.275297164916992
epoch: 1, iter: 423900, loss: 11.691515922546387
epoch: 1, iter: 424000, loss: 11.235740661621094
epoch: 1, iteration: 42

epoch: 1, iter: 434100, loss: 11.398099899291992
epoch: 1, iter: 434200, loss: 11.275341987609863
epoch: 1, iter: 434300, loss: 11.040188789367676
epoch: 1, iter: 434400, loss: 11.44877815246582
epoch: 1, iter: 434500, loss: 11.126958847045898
epoch: 1, iter: 434600, loss: 11.022445678710938
epoch: 1, iter: 434700, loss: 10.526141166687012
epoch: 1, iter: 434800, loss: 10.883358001708984
epoch: 1, iter: 434900, loss: 11.001891136169434
epoch: 1, iter: 435000, loss: 10.739945411682129
epoch: 1, iter: 435100, loss: 12.499483108520508
epoch: 1, iter: 435200, loss: 12.20584774017334
epoch: 1, iter: 435300, loss: 11.220914840698242
epoch: 1, iter: 435400, loss: 11.303383827209473
epoch: 1, iter: 435500, loss: 10.84464168548584
epoch: 1, iter: 435600, loss: 11.762062072753906
epoch: 1, iter: 435700, loss: 12.285225868225098
epoch: 1, iter: 435800, loss: 11.432060241699219
epoch: 1, iter: 435900, loss: 11.49901294708252
epoch: 1, iter: 436000, loss: 10.682817459106445
epoch: 1, iteration: 436

epoch: 1, iter: 446100, loss: 10.895466804504395
epoch: 1, iter: 446200, loss: 11.517123222351074
epoch: 1, iter: 446300, loss: 11.391008377075195
epoch: 1, iter: 446400, loss: 11.51754093170166
epoch: 1, iter: 446500, loss: 11.363222122192383
epoch: 1, iter: 446600, loss: 11.756290435791016
epoch: 1, iter: 446700, loss: 11.249637603759766
epoch: 1, iter: 446800, loss: 10.764711380004883
epoch: 1, iter: 446900, loss: 11.212006568908691
epoch: 1, iter: 447000, loss: 12.00946044921875
epoch: 1, iter: 447100, loss: 11.911036491394043
epoch: 1, iter: 447200, loss: 11.14554214477539
epoch: 1, iter: 447300, loss: 11.05186939239502
epoch: 1, iter: 447400, loss: 11.011931419372559
epoch: 1, iter: 447500, loss: 11.680644035339355
epoch: 1, iter: 447600, loss: 11.899770736694336
epoch: 1, iter: 447700, loss: 10.516495704650879
epoch: 1, iter: 447800, loss: 10.518952369689941
epoch: 1, iter: 447900, loss: 11.454275131225586
epoch: 1, iter: 448000, loss: 10.575553894042969
epoch: 1, iteration: 448

epoch: 1, iter: 458100, loss: 11.59527587890625
epoch: 1, iter: 458200, loss: 11.086811065673828
epoch: 1, iter: 458300, loss: 10.718034744262695
epoch: 1, iter: 458400, loss: 11.232890129089355
epoch: 1, iter: 458500, loss: 11.93146800994873
epoch: 1, iter: 458600, loss: 10.830869674682617
epoch: 1, iter: 458700, loss: 11.208268165588379
epoch: 1, iter: 458800, loss: 10.195466995239258
epoch: 1, iter: 458900, loss: 11.265641212463379
epoch: 1, iter: 459000, loss: 11.847845077514648
epoch: 1, iter: 459100, loss: 11.914114952087402
epoch: 1, iter: 459200, loss: 11.146728515625
epoch: 1, iter: 459300, loss: 10.803190231323242
epoch: 1, iter: 459400, loss: 11.36327838897705
epoch: 1, iter: 459500, loss: 11.565722465515137
epoch: 1, iter: 459600, loss: 11.303350448608398
epoch: 1, iter: 459700, loss: 11.040358543395996
epoch: 1, iter: 459800, loss: 11.079276084899902
epoch: 1, iter: 459900, loss: 11.18036937713623
epoch: 1, iter: 460000, loss: 11.595579147338867
epoch: 1, iteration: 460000

epoch: 1, iter: 470100, loss: 11.579556465148926
epoch: 1, iter: 470200, loss: 11.333196640014648
epoch: 1, iter: 470300, loss: 11.586036682128906
epoch: 1, iter: 470400, loss: 11.622262954711914
epoch: 1, iter: 470500, loss: 11.52660846710205
epoch: 1, iter: 470600, loss: 11.743627548217773
epoch: 1, iter: 470700, loss: 11.7948579788208
epoch: 1, iter: 470800, loss: 10.743227005004883
epoch: 1, iter: 470900, loss: 11.618430137634277
epoch: 1, iter: 471000, loss: 10.725007057189941
epoch: 1, iter: 471100, loss: 11.477409362792969
epoch: 1, iter: 471200, loss: 11.633712768554688
epoch: 1, iter: 471300, loss: 11.383800506591797
epoch: 1, iter: 471400, loss: 11.573264122009277
epoch: 1, iter: 471500, loss: 11.79548454284668
epoch: 1, iter: 471600, loss: 10.598237037658691
epoch: 1, iter: 471700, loss: 10.672428131103516
epoch: 1, iter: 471800, loss: 10.925034523010254
epoch: 1, iter: 471900, loss: 12.021942138671875
epoch: 1, iter: 472000, loss: 10.825090408325195
epoch: 1, iteration: 472

In [17]:
model.load_state_dict(torch.load("embedding-{}.th".format(EMBEDDING_SIZE)))

<All keys matched successfully>
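The `<All keys matched successfully>` message above is the return value of `load_state_dict`, which reports any parameter names that were missing from or unexpected in the checkpoint. The following is a minimal sketch of the save/load round trip, not the notebook's actual model: `TinyEmbeddingModel`, its layer names, and the temporary file path are hypothetical stand-ins chosen for illustration.

```python
import os
import tempfile

import torch
import torch.nn as nn


# Hypothetical stand-in for the notebook's model: two embedding tables only.
class TinyEmbeddingModel(nn.Module):
    def __init__(self, vocab_size, embed_size):
        super().__init__()
        self.in_embed = nn.Embedding(vocab_size, embed_size)
        self.out_embed = nn.Embedding(vocab_size, embed_size)


trained = TinyEmbeddingModel(vocab_size=50, embed_size=8)
restored = TinyEmbeddingModel(vocab_size=50, embed_size=8)

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "embedding-8.th")
    # Save only the parameters (the state dict), not the whole Python object.
    torch.save(trained.state_dict(), path)
    # map_location="cpu" lets a checkpoint written on a GPU load on a CPU-only machine.
    state = torch.load(path, map_location="cpu")

# load_state_dict returns a record of missing and unexpected parameter names.
result = restored.load_state_dict(state)
weights_match = torch.equal(trained.in_embed.weight, restored.in_embed.weight)
print(result.missing_keys, result.unexpected_keys, weights_match)
```

Both key lists being empty corresponds to the "all keys matched" message; a renamed or missing layer would show up in one of the two lists instead.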

## Evaluating on the MEN, SimLex-999, and WordSim-353 datasets

In [19]:
embedding_weights = model.input_embeddings()
print("simlex-999", evaluate(r".\embedding\simlex-999.txt", embedding_weights))
print("men", evaluate(r".\embedding\men.txt", embedding_weights))
print("wordsim353", evaluate(r".\embedding\wordsim353.csv", embedding_weights))

simlex-999 SpearmanrResult(correlation=0.17346411394508127, pvalue=2.0277156165168663e-05)
men SpearmanrResult(correlation=0.45392756043920385, pvalue=5.045440536459275e-67)
wordsim353 SpearmanrResult(correlation=0.42494185324583417, pvalue=4.055988368438879e-12)
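The `SpearmanrResult` values above suggest that `evaluate` ranks word pairs by model similarity and compares that ranking with human ratings. The sketch below illustrates this kind of evaluation on toy data, assuming cosine similarity is the model score; `toy_vectors`, `toy_pairs`, and `cosine_similarity_of` are hypothetical names introduced here and are not part of the notebook.

```python
import numpy as np
from scipy.stats import spearmanr


def cosine_similarity_of(u, v):
    """Cosine similarity between two 1-D vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))


# Hypothetical 2-D embeddings; real ones would come from the trained model.
toy_vectors = {
    "cat": np.array([1.0, 0.0]),
    "dog": np.array([0.9, 0.1]),
    "car": np.array([0.0, 1.0]),
    "truck": np.array([0.2, 0.8]),
}

# Hypothetical (word1, word2, human similarity rating) triples.
toy_pairs = [
    ("cat", "dog", 9.0),
    ("car", "truck", 8.5),
    ("dog", "truck", 2.0),
    ("cat", "car", 1.0),
]

human_scores = [score for _, _, score in toy_pairs]
model_scores = [cosine_similarity_of(toy_vectors[a], toy_vectors[b]) for a, b, _ in toy_pairs]

# Spearman correlation compares rankings only, so the two score scales need not match.
rho, pvalue = spearmanr(human_scores, model_scores)
print(rho)
```

Because only the rank order matters, a model can score well here even if its raw cosine values are on a very different scale from the human ratings.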


## Finding nearest neighbors

In [20]:
for word in ["good", "fresh", "monster", "green", "like", "america", "chicago", "work", "computer", "language"]:
    print(word, find_nearest(word))

good ['good', 'bad', 'better', 'magical', 'mere', 'conscious', 'worse', 'unique', 'perfect', 'complete']
fresh ['fresh', 'wet', 'milk', 'grain', 'warm', 'shallow', 'mild', 'alcohol', 'breathing', 'kidney']
monster ['monster', 'cow', 'bull', 'snake', 'whale', 'hammer', 'tiger', 'flower', 'bat', 'skull']
green ['green', 'blue', 'yellow', 'purple', 'orange', 'pink', 'white', 'red', 'shaped', 'black']
like ['like', 'resembling', 'bearing', 'etc', 'resemble', 'exhibit', 'bear', 'similarly', 'eating', 'wherein']
america ['america', 'australia', 'africa', 'england', 'korea', 'europe', 'berlin', 'carolina', 'mexico', 'wales']
chicago ['chicago', 'brooklyn', 'boston', 'toronto', 'buffalo', 'cincinnati', 'seattle', 'detroit', 'pittsburgh', 'philadelphia']
work ['work', 'writing', 'comment', 'account', 'vision', 'talent', 'reading', 'criticism', 'dialogue', 'appearance']
computer ['computer', 'animation', 'editing', 'interactive', 'digital', 'micro', 'gaming', 'unix', 'software', 'graphics']
lang
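In the output above, every list starts with the query word itself, which is what a plain cosine ranking over the whole vocabulary would produce. A vectorized lookup of that kind might look like the sketch below. It is an assumption about the general technique rather than the notebook's `find_nearest`; `nearest_neighbors`, `toy_words`, and `toy_weights` are hypothetical.

```python
import numpy as np


def nearest_neighbors(word, weights, word_to_idx, idx_to_word, top_k=10):
    """Return the top_k words ranked by cosine similarity to `word` (query included)."""
    # Normalize every row to unit length so a dot product equals cosine similarity.
    normed = weights / np.linalg.norm(weights, axis=1, keepdims=True)
    sims = normed @ normed[word_to_idx[word]]
    order = np.argsort(-sims)[:top_k]
    return [idx_to_word[i] for i in order]


# Hypothetical toy vocabulary and 2-D embeddings.
toy_words = ["green", "blue", "chicago", "boston"]
toy_word_to_idx = {w: i for i, w in enumerate(toy_words)}
toy_weights = np.array([
    [1.0, 0.1],
    [0.9, 0.2],
    [0.1, 1.0],
    [0.2, 0.9],
])

neighbors = nearest_neighbors("green", toy_weights, toy_word_to_idx, toy_words, top_k=2)
print(neighbors)
```

Computing all similarities with one matrix product avoids a Python loop over the vocabulary, which matters more as the vocabulary grows.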

## Relationships between words

In [21]:
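# Analogy test: woman - man + king should ideally land near "queen"
man_idx = word_to_idx["man"]
king_idx = word_to_idx["king"]
woman_idx = word_to_idx["woman"]
embedding = embedding_weights[woman_idx] - embedding_weights[man_idx] + embedding_weights[king_idx]
# Cosine distance from the analogy vector to every word in the vocabulary
cos_dis = np.array([scipy.spatial.distance.cosine(e, embedding) for e in embedding_weights])
# Print the 20 closest words (the query word "king" is not filtered out)
for i in cos_dis.argsort()[:20]:
    print(idx_to_word[i])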

king
viii
tsar
bonaparte
augustus
constantine
vii
alfonso
baldwin
julius
afonso
empress
clement
byron
son
emperor
xii
charlemagne
frederick
pope
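The list above begins with `king`, one of the three query words, because the query words are not removed from the candidates. A common variant of the analogy test masks them out before ranking. The sketch below shows that variant on toy vectors; it is not the notebook's code, and `analogy`, `toy_words`, and `toy_weights` are hypothetical.

```python
import numpy as np


def analogy(a, b, c, weights, word_to_idx, idx_to_word, top_k=1):
    """Return the words closest to b - a + c, skipping a, b, and c themselves."""
    target = weights[word_to_idx[b]] - weights[word_to_idx[a]] + weights[word_to_idx[c]]
    # Cosine similarity between the target vector and every row of the embedding matrix.
    sims = (weights @ target) / (np.linalg.norm(weights, axis=1) * np.linalg.norm(target))
    # Exclude the three query words so they cannot be returned as the answer.
    sims[[word_to_idx[w] for w in (a, b, c)]] = -np.inf
    return [idx_to_word[i] for i in np.argsort(-sims)[:top_k]]


# Hypothetical toy vocabulary constructed so that woman - man + king equals queen.
toy_words = ["man", "woman", "king", "queen", "apple"]
toy_word_to_idx = {w: i for i, w in enumerate(toy_words)}
toy_weights = np.array([
    [1.0, 0.0, 0.0],
    [1.0, 1.0, 0.0],
    [1.0, 0.0, 1.0],
    [1.0, 1.0, 1.0],
    [0.0, 0.2, -1.0],
])

answer = analogy("man", "woman", "king", toy_weights, toy_word_to_idx, toy_words)
print(answer)
```

With real embeddings as small as the ones trained here, masking the query words does not guarantee that `queen` ranks first; it only removes the trivial matches.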
