# Word2Box

Author: Bigarella Chiara

# 1 - Introduction

The aim of this notebook is to provide an in-depth analysis of the **word2box** algorithm presented in [Word2Box: Capturing Set-Theoretic Semantics of Words using Box Embeddings](https://arxiv.org/abs/2106.14361) (Dasgupta et al., 2022).

The original code is freely available at the following GitHub repository: https://github.com/iesl/word2box



## 1.1 Project Structure




The *word2box* project has the following folder structure. In **section 3** I will go through all the modules and I will explain the behavior of all the classes and functions defined in the project.

```
word2box/
  data/
    ptb/
    similarity_datasets/
  src/
    language_modeling_with_boxes/
      box/
        box_wrapper.py
        modules.py
        utils.py
      datasets/
        utils.py
        word2vecgpu.py
      models/
        BaseModule.py
        word2box.py
      train/
        loss.py
        negative_sampling.py
        train.py
        Trainer.py
```

# 2 - Setup

In this section, we install and import all the dependencies needed to run the project. We also create some constants and add some global settings.

## 2.1 Install dependencies

Before installing the dependencies, we have to uninstall some modules to avoid dependency conflicts due to wrong versions. After that, we can install the required packages.

In [None]:
# uninstall the following modules to avoid dependency conflicts

! pip uninstall torch -y
! pip uninstall torchtext -y
! pip uninstall torchaudio -y
! pip uninstall torchdata -y
! pip uninstall torchsummary -y
! pip uninstall torchvision -y
! pip uninstall fastai -y

In [None]:
! pip install "click>=7.0"
! pip install attrs
! pip install toml
! pip install torchtext==0.6.0
! pip install tqdm
! pip install xopen
! pip install loguru
! pip install "typer[all]"
! pip install wandb==0.10.33
! pip install wandb-utils

To install the `pytorch_utils` package, we need to clone the **word2box** repository and install the package from the local folder.

In [None]:
%cd /content
! rm -rf word2box

! git clone https://github.com/ChiaraBi/word2box.git --recurse-submodules

%cd /content/word2box/lib/pytorch-utils
! pip install .

## 2.2 Imports

In [None]:
import argparse
import attr
import csv
import itertools
import json
import math
from math import ceil
import os
import pickle
import pprint
import random

from multiprocessing import Manager
from pathlib import Path
from tqdm import tqdm
from typing import *
import wandb
from xopen import xopen

from pytorch_utils import TensorDataLoader

import torchtext, torch
from torch import LongTensor, BoolTensor, Tensor
from torch.autograd import Variable
import torch.nn as nn
from torch.utils.data import Dataset

import numpy as np
from scipy import special
from scipy.stats import spearmanr

## 2.3 Constants and settings

In [None]:
tanh_eps = 1e-20
euler_gamma = 0.57721566490153286060

In [None]:
global use_cuda
use_cuda = torch.cuda.is_available()
device = torch.cuda.current_device() if use_cuda else "cpu"

# 3 - Modules

Here I present the **word2box** classes and functions and, for each one of them, I provide a description of their behaviour.

I considered only the code needed for training and testing the `Word2BoxConjunction` model, since it is the only model for which the original code works. For each piece of code, I point to the codebase file it came from.

There might be small differences from the original codebase, since I fixed some bugs and removed unnecessary or non-working code.

## 3.1 datasets

### Word2VecDatasetOnDevice

The `Word2VecDatasetOnDevice` class inherits from `torch.utils.data.Dataset`.

The **constructor** takes the following parameters:
* `corpus: LongTensor`: a chunk of the tokenized dataset, as a tensor of token indices.
* `device: Union[int, str]`: the target device. Note that the constructor does not use this parameter; the corpus is moved to a device with the `to` method.
* `eos_mask: bool`: whether to use an end-of-sentence (EOS) mask, i.e. a boolean mask that excludes the context words lying beyond a sentence boundary (`<eos>`). By default it is set to `True`.
* `subsample_thresh: float`: the subsampling threshold. By default it is set to `1e-3`.
* `vocab: Union[Dict, Any]`: an instance of `torchtext.data.Field` carrying the vocabulary attributes `stoi` (string-to-index), `itos` (index-to-string) and `freqs` (token counts).
`vocab.stoi` is a dictionary that maps each unique word in the vocabulary to a unique index. This is useful for converting a sequence of words into a sequence of indices, which can then be fed to a model. `vocab.itos` is a list where the element at each index is the corresponding word. This is useful for converting a sequence of indices back into a sequence of words.
* `window_size: int`: the number of context words taken on each side of the center word. By default it is set to `10`.

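The constructor sets `pad_id`, `eos_token` and `pad_size` (fixed to `10`), and computes `subsampling_prob`, the per-token probability of being dropped during subsampling: `1 - sqrt(subsample_thresh / unigram_prob)`.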
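To get a feeling for the subsampling formula, here is a minimal plain-Python sketch of the same computation. The function name `subsampling_prob` and the toy counts are assumptions made for illustration; the class itself performs this computation on tensors.

```python
import math

def subsampling_prob(freqs, thresh=1e-3, eps=1e-19):
    """Probability of replacing each token with <pad>: 1 - sqrt(thresh / f)."""
    total = sum(freqs.values())
    return {
        token: 1.0 - math.sqrt(thresh / (count / total + eps))
        for token, count in freqs.items()
    }

# Hypothetical toy counts, chosen only for illustration.
toy_freqs = {"the": 9000, "box": 989, "lattice": 10, "rare": 1}
probs = subsampling_prob(toy_freqs)

for token, p in probs.items():
    # A token is dropped when a uniform draw in [0, 1) falls below p,
    # so a negative p means the token is always kept.
    print(f"{token:8s} p_drop = {p:+.3f}")
```

Frequent tokens get a drop probability close to 1, while tokens whose relative frequency is at or below the threshold are never dropped. Tokens whose count is set to 0 (such as `<pad>`) end up with a very negative value, so they are never touched by subsampling either.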

**Methods**

* `__getitem__(idx: LongTensor)`: this method takes the indices of the center words in the corpus. Then, it performs the following actions:
  1. For each center word index, it computes the corresponding context word indices.
  2. It performs the subsampling for both center and context words by calling the `sub_sample_words` method.
  3. It gets rid of the rows that have `<pad>` as the center word or that have all context words equal to `<pad>`.
  4. It returns a dictionary with `center_word`, `context_words` and `context_mask`.

* `__len__()`: returns the number of valid center positions, i.e. the length of the corpus minus `2 * pad_size`.
* `to(device: Union[torch.device, str])`: takes the `device` and returns a new instance of `Word2VecDatasetOnDevice` whose corpus is `self.corpus.to(device)`.
* `sub_sample_words(_input: LongTensor)`: masks out the subsampled words by replacing them with `pad_id`.
* `get_mask(_input: LongTensor)`: returns a boolean mask over the context words that is `False` for `<pad>` tokens and for tokens separated from the center word by an `<eos>` token.
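The first step of `__getitem__` can be sketched in plain Python as follows. This is only an illustration under simplifying assumptions: the helper `context_windows` and the toy corpus are hypothetical, and the real method does the same thing with tensor broadcasting (`idx.unsqueeze(1) + window_range.unsqueeze(0)`).

```python
def context_windows(corpus, centers, window_size, pad_size):
    """Return one (center, context) pair of token ids per requested center."""
    rows = []
    for c in centers:
        c = c + pad_size  # the first pad_size positions are never centers
        window = [c + off for off in range(-window_size, window_size + 1)]
        center = corpus[window[window_size]]  # the middle slot is the center word
        context_idx = window[:window_size] + window[window_size + 1:]
        rows.append((center, [corpus[i] for i in context_idx]))
    return rows

# Hypothetical toy corpus of token ids.
toy_corpus = [10, 11, 12, 13, 14, 15, 16]
n_centers = len(toy_corpus) - 2 * 2  # mirrors __len__ with pad_size=2
rows = context_windows(toy_corpus, range(n_centers), window_size=2, pad_size=2)

for center, context in rows:
    print(center, context)
```

With `window_size=2` and `pad_size=2`, only the three middle positions can be centers, and each of them sees two context words on each side.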

In [None]:
# datasets/word2vecgpu.py

class Word2VecDatasetOnDevice(Dataset):
    def __init__(
        self,
        corpus: LongTensor,
        window_size: int = 10,
        vocab: Union[Dict, Any] = None,
        subsample_thresh: float = 1e-3,
        eos_mask: bool = True,
        device: Union[int, str] = None,
    ):
        self.corpus = corpus
        self.window_size = window_size
        self.vocab = vocab
        self.subsample_thresh = subsample_thresh
        self.eos_mask = eos_mask
        self.pad_id = torch.tensor(self.vocab.stoi["<pad>"]).to(self.corpus.device)
        self.eos_token = torch.tensor(self.vocab.stoi["<eos>"]).to(self.corpus.device)
        # pad this at the beginning and end with window_size number of padding
        self.pad_size = 10
        total_words = sum(self.vocab.freqs.values())
        unigram_prob = (
            torch.tensor([self.vocab.freqs.get(key, 0) for key in self.vocab.itos])
            / total_words
        )
        self.subsampling_prob = 1.0 - torch.sqrt(
            subsample_thresh / (unigram_prob + 1e-19)
        ).to(
            self.corpus.device
        )  #

    def __getitem__(self, idx: LongTensor) -> Dict[str, Tensor]:
        # idx is a Tensor of indices of the corpus, eg. [2342,12312312,34534,1]
        # we will interpret these as the id of the center word
        # (shifted out of place so that the caller's tensor is not modified)
        idx = idx + self.pad_size
        # Idx is repeated to get the sliding window effect
        # For the sliding window part we add the range with idx
        window_range = torch.arange(-self.window_size, self.window_size + 1)
        idx = idx.unsqueeze(1) + window_range.unsqueeze(0)

        # idx = torch.transpose(idx.repeat(2*self.window_size+1,1), 0, 1)
        # idx = idx + torch.arange(-self.window_size, self.window_size+1)

        # Get the middle slice for the center
        # The rest of them are context
        center = self.corpus[idx[:, self.window_size]]
        context = self.corpus[
            torch.cat(
                (idx[:, : self.window_size], idx[:, self.window_size + 1 :]), dim=1
            )
        ]
        # Do the subsampling.
        center = self.sub_sample_words(center)
        context = self.sub_sample_words(context)

        # Get rid of the dataset that has the center word as <pad>.
        # Or has all context words as <pad>.
        if not self.eos_mask:
            keep = (center != self.pad_id) & (context != self.pad_id).any(dim=-1)
            center = center[keep]
            context = context[keep]
            assert (center != self.pad_id).all()
            context_mask = torch.ones_like(context)
        else:
            keep = (
                (center != self.pad_id)
                & (context != self.pad_id).any(dim=-1)
                & (center != self.eos_token)
                & (context != self.eos_token).any(dim=-1)
            )
            center = center[keep]
            context = context[keep]
            assert (center != self.pad_id).all()
            context_mask = self.get_mask(context)
            # Mask might do away with the whole sentence. In that case remove that
            keep = (context_mask != False).any(dim=1).squeeze()
            center = center[keep]
            context = context[keep]
            context_mask = context_mask[keep]
        return {
            "center_word": center,
            "context_words": context,
            "context_mask": context_mask,
        }

    def __len__(self) -> int:
        return len(self.corpus) - 2 * self.pad_size

    def to(self, device: Union[torch.device, str]):
        # forward eos_mask as well, otherwise the copy silently falls back to True
        return Word2VecDatasetOnDevice(
            self.corpus.to(device),
            self.window_size,
            self.vocab,
            self.subsample_thresh,
            self.eos_mask,
        )

    def sub_sample_words(self, _input: LongTensor) -> LongTensor:
        ## Mask out the subsampled words. We will do so by
        ## replacing them with pad ids.
        mask_prob = torch.rand(_input.shape).to(_input.device)
        _input[mask_prob < self.subsampling_prob[_input]] = self.pad_id
        return _input

    def get_mask(self, _input: LongTensor) -> BoolTensor:
        ## Get the mask for the contexts that has pad token.
        right_mask = ~(
            (_input[:, self.window_size :] == self.eos_token).cumsum(dim=-1) > 0
        )
        l_eos = _input[:, : self.window_size] == self.eos_token
        left_mask = ~(
            l_eos | (l_eos.any(dim=-1, keepdim=True) & (l_eos.cumsum(dim=-1) == 0))
        )
        mask = torch.cat((left_mask, right_mask), dim=-1)
        return (_input != self.pad_id) & mask
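
The `get_mask` logic is compact, so it may help to trace it on a single row. The following plain-Python sketch mirrors the tensor code for one row of context ids; the function name `context_mask` and the ids used for `<pad>` and `<eos>` are assumptions made for illustration.

```python
def context_mask(context, window_size, eos_id, pad_id):
    """Boolean mask over one row of context ids (True = usable context word)."""
    left, right = context[:window_size], context[window_size:]

    # Right of the center: the first <eos> and everything after it are masked.
    right_mask, seen_eos = [], False
    for token in right:
        seen_eos = seen_eos or token == eos_id
        right_mask.append(not seen_eos)

    # Left of the center: if an <eos> occurs, every <eos> and everything
    # before the first <eos> are masked.
    left_has_eos = eos_id in left
    left_mask, eos_count = [], 0
    for token in left:
        eos_count += token == eos_id
        masked = token == eos_id or (left_has_eos and eos_count == 0)
        left_mask.append(not masked)

    # <pad> positions are never usable, whatever the sentence boundaries say.
    return [keep and token != pad_id
            for keep, token in zip(left_mask + right_mask, context)]

PAD, EOS = 0, 1  # hypothetical ids for <pad> and <eos>
row_with_eos = context_mask([5, EOS, 6, 7, EOS, 8], window_size=3, eos_id=EOS, pad_id=PAD)
row_with_pad = context_mask([PAD, 6, 7, 8, 9, PAD], window_size=3, eos_id=EOS, pad_id=PAD)
print(row_with_eos)
print(row_with_pad)
```

In the first row only the two tokens adjacent to the center (`6` and `7`) stay visible, because an `<eos>` separates the remaining ones from the center word.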


### LazyDatasetLoader

The `LazyDatasetLoader` class represents a data loader. The class decorator `@attr.s` adds dunder methods (such as `__init__` and `__repr__`) to the class, and the parameter `auto_attribs=True` collects PEP 526-annotated attributes from the class body.

**Methods**

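* `__attrs_post_init__()`: splits `training_tensor` into 10 contiguous chunks (`training_tensor_chunks`) and computes the total number of batches. The number of chunks is hard-coded: the `n_splits` attribute is not used.
* `__iter__()`: iterates over `training_tensor_chunks`. For each chunk, it creates an instance of `Word2VecDatasetOnDevice`, moves it to `device`, wraps it in a `TensorDataLoader` and yields its batches.
* `__len__()`: returns the length computed in `__attrs_post_init__`.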
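The chunking arithmetic of `__attrs_post_init__` can be reproduced without tensors. The sketch below uses hypothetical helper names (`chunk_bounds`, `total_batches`) and a made-up corpus length.

```python
from math import ceil

def chunk_bounds(n_tokens, n_chunks=10):
    """Start offsets of each chunk, followed by the end of the corpus."""
    step = int(n_tokens / n_chunks)
    bounds = [step * i for i in range(n_chunks)]
    bounds.append(n_tokens)  # the last chunk absorbs the remainder
    return bounds

def total_batches(n_tokens, batch_size, n_chunks=10):
    """Number of batches when every chunk is batched independently."""
    bounds = chunk_bounds(n_tokens, n_chunks)
    return sum(ceil((hi - lo) / batch_size) for lo, hi in zip(bounds[:-1], bounds[1:]))

bounds = chunk_bounds(1005)
n_batches = total_batches(1005, batch_size=64)
print(bounds)
print(n_batches)
```

Note that this count is based on the raw chunk lengths, so it does not account for the `2 * pad_size` positions that each `Word2VecDatasetOnDevice` chunk excludes from its own length.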

In [None]:
# datasets/word2vecgpu.py

@attr.s(auto_attribs=True)
class LazyDatasetLoader:
    training_tensor: Tensor
    n_splits: int
    window_size: int = 10
    vocab: Union[Dict, Any] = None
    subsample_thresh: float = 1e-3
    eos_mask: bool = True
    batch_size: int = 64
    device: Union[int, str] = None

    def __attrs_post_init__(self):
        lng = len(self.training_tensor)
        splits = [int(lng / 10) * i for i in range(10)]
        splits.append(lng)
        self.leng = sum(
            ceil((j - i) / self.batch_size) for i, j in zip(splits[:-1], splits[1:])
        )
        self.training_tensor_chunks = [
            self.training_tensor[splits[i] : splits[i + 1]] for i in range(10)
        ]

    def __iter__(self):
        for chunk in self.training_tensor_chunks:
            train_dataset = Word2VecDatasetOnDevice(
                corpus=chunk,
                window_size=self.window_size,
                vocab=self.vocab,
                subsample_thresh=self.subsample_thresh,
                eos_mask=self.eos_mask,
            ).to(self.device)
            train_iter = TensorDataLoader(train_dataset, self.batch_size, shuffle=True)
            yield from train_iter
            del train_dataset
            del train_iter

    def __len__(self):
        return self.leng

### utils

* `load_vocab(data_dir: Union[str, Path])`: takes the data directory (`data_dir`) and returns two dictionaries: the token-to-index mapping and the token frequencies.
* `load_tokenizer(data_dir)`: takes the data directory and returns the tokenized training dataset, as a list of sentences where each sentence is a list of token indices ending with `<eos>`.
* `load_train_data_as_tensor(data_dir)`: takes the data directory and returns the training dataset as a single flat `torch.Tensor` of token indices.
* `get_iter_on_device`:
  - input parameters:
    - `batch_size`: batch size.
    - `dataset`: dataset name. It should be the name of the corresponding dataset folder inside the `data` folder.
    - `data_device`: `'cpu'` or `'gpu'`.
    - `eos_mask: bool`: whether to use an end-of-sentence (EOS) mask, i.e. a boolean mask that excludes the context words lying beyond a sentence boundary (`<eos>`).
    - `n_gram`: window size.
    - `subsample_thresh`: the subsampling threshold.
  - returns the vocabulary statistics (`TEXT`) and an instance of `LazyDatasetLoader` for the training data.
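As an illustration of the vocabulary handling, the sketch below parses a small in-memory `vocab.tsv` the same way `load_vocab` does and then rebuilds `itos` from `stoi` as in `get_iter_on_device`. The helper name `parse_vocab_tsv`, the header row and the toy entries are assumptions; the real file layout may differ.

```python
import io

def parse_vocab_tsv(handle):
    """Read 'token<TAB>frequency' lines; the line order defines the token ids."""
    next(handle)  # skip the header line
    stoi, freqs = {}, {}
    for token_id, line in enumerate(handle):
        token, frequency = line.split()
        stoi[token] = token_id
        freqs[token] = int(frequency)
    return stoi, freqs

# Hypothetical file content, used only for illustration.
toy_tsv = io.StringIO("token\tfrequency\n<pad>\t0\n<eos>\t42\nthe\t1000\nbox\t17\n")
stoi, freqs = parse_vocab_tsv(toy_tsv)

# itos is the inverse of stoi: sort the tokens by their index.
itos = [token for token, _ in sorted(stoi.items(), key=lambda item: item[1])]
print(stoi)
print(itos)
```

Since `itos` is sorted by index, `stoi[itos[i]] == i` holds for every index `i`, which is what allows converting between words and indices in both directions.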

In [None]:
# datasets/utils.py

def load_vocab(data_dir: Union[str, Path]):
    vocab_tsv = Path(data_dir + "vocab.tsv")
    vocab_stoi = {}
    vocab_freq = {}
    if vocab_tsv.exists():
        with vocab_tsv.open() as vocab_file:
            next(vocab_file)  # skips header line
            for token_id, line in enumerate(vocab_file):
                token, frequency = line.split()
                vocab_stoi[token] = int(token_id)
                vocab_freq[token] = int(frequency)
    elif os.path.isfile(data_dir + "vocab_stoi.json") and os.path.isfile(
        data_dir + "vocab_freq.json"
    ):
        vocab_stoi = json.load(open(data_dir + "vocab_stoi.json", "r"))
        vocab_freq = json.load(open(data_dir + "vocab_freq.json", "r"))
    else:
        TEXT = torchtext.data.Field()
        train_split = torchtext.datasets.LanguageModelingDataset.splits(
            path=data_dir,
            train="train.txt",
            validation=None,
            test=None,
            text_field=TEXT,
        )
        TEXT.build_vocab(train_split[0])
        vocab_stoi_file = open(data_dir + "vocab_stoi.json", "a")
        vocab_freq_file = open(data_dir + "vocab_freq.json", "a")
        json.dump(TEXT.vocab.stoi, vocab_stoi_file)
        json.dump(TEXT.vocab.freqs, vocab_freq_file)
        vocab_stoi_file.close()
        vocab_freq_file.close()
        vocab_stoi = TEXT.vocab.stoi
        vocab_freq = TEXT.vocab.freqs
    return vocab_stoi, vocab_freq


def load_tokenizer(data_dir):
    if os.path.isfile(data_dir + "train_tokenized.pkl"):
        train_tokenized = pickle.load(open(data_dir + "train_tokenized.pkl", "rb"))
    else:
        train_tokenized = []
        vocab_stoi = json.load(open(data_dir + "vocab_stoi.json", "r"))
        vocab_freq = json.load(open(data_dir + "vocab_freq.json", "r"))

        with open(data_dir + "train.txt", "r") as f:
            for line in f:
                words = line.split()
                train_tokenized.append(
                    [vocab_stoi[ele] for ele in words] + [vocab_stoi["<eos>"]]
                )

        pickle.dump(train_tokenized, open(data_dir + "train_tokenized.pkl", "wb"))
    return train_tokenized


def load_train_data_as_tensor(data_dir):
    tensor_file = Path(data_dir + "train.pt")
    if tensor_file.exists():
        return torch.load(tensor_file)
    else:
        train_tensor = torch.tensor(
            list(itertools.chain.from_iterable(load_tokenizer(data_dir)))
        )
        torch.save(train_tensor, tensor_file)
    return train_tensor


def get_iter_on_device(
    batch_size,
    dataset,
    n_gram,
    subsample_thresh,
    data_device,
    eos_mask,
):
    print("Loading VOCAB & Tokenized Training files ...")

    data_dir = "./data/" + dataset + "/"
    vocab_stoi, vocab_freq = load_vocab(data_dir)
    train_tokenized = load_train_data_as_tensor(data_dir)

    ## Create Vocabulary properties
    print("Creating iterable dataset ...")
    TEXT = torchtext.data.Field()
    TEXT.stoi = vocab_stoi
    TEXT.freqs = vocab_freq
    TEXT.itos = [k for k, v in sorted(vocab_stoi.items(), key=lambda item: item[1])]

    # Since we won't train on <pad> and <unk> (nor on <eos> when eos_mask is set),
    # these should not take part in subsampling and negative sampling.
    TEXT.freqs["<pad>"] = 0
    TEXT.freqs["<unk>"] = 0

    if eos_mask:
        TEXT.freqs["<eos>"] = 0

    ## Create data on the device
    print("Creating iterable dataset on GPU/CPU...")
    if data_device == "gpu":
        data_device = device
    train_iter = LazyDatasetLoader(
        training_tensor=train_tokenized,
        n_splits=1000,
        window_size=n_gram,
        vocab=TEXT,
        subsample_thresh=subsample_thresh,
        eos_mask=eos_mask,
        device=data_device,
        batch_size=batch_size,
    )

    return TEXT, train_iter


## 3.2 box

### utils

The `log1mexp` function computes `log(1 - exp(x))` for `x <= 0` in a numerically stable way.

Evaluating the expression directly loses precision at both ends of the domain. When `x` is close to zero, `exp(x)` is close to 1 and the subtraction `1 - exp(x)` suffers from cancellation. When `x` is very negative, `exp(x)` is tiny and `1 - exp(x)` rounds to 1. The `log1mexp` function avoids both problems by switching between `log(-expm1(x))` and `log1p(-exp(x))` at `x = log(1/2)`.
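The difference between the direct and the stable evaluation can be checked on scalars with the standard library alone. This is a simplified sketch: the names `log1mexp_naive` and `log1mexp_scalar` are hypothetical, and the gradient handling of the tensor version is left out.

```python
import math

def log1mexp_naive(x):
    """Direct evaluation of log(1 - exp(x))."""
    return math.log(1.0 - math.exp(x))

def log1mexp_scalar(x):
    """Stable evaluation of log(1 - exp(x)) for x < 0, split at log(1/2)."""
    if x > math.log(0.5):
        return math.log(-math.expm1(x))   # x close to zero
    return math.log1p(-math.exp(x))       # x very negative

# Close to zero, 1 - exp(x) rounds to exactly 0 and the naive log fails.
try:
    naive_near_zero = log1mexp_naive(-1e-20)
except ValueError:
    naive_near_zero = float("-inf")
stable_near_zero = log1mexp_scalar(-1e-20)

# Far from zero, 1 - exp(x) rounds to exactly 1 and the naive result is 0.
naive_far = log1mexp_naive(-50.0)
stable_far = log1mexp_scalar(-50.0)

print(naive_near_zero, stable_near_zero)
print(naive_far, stable_far)
```

In both regimes the stable version returns a finite, non-zero value, while the direct evaluation collapses to `-inf` or to `0`.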

In [None]:
# box/utils.py

_log1mexp_switch = math.log(0.5)


def log1mexp(
    x: torch.Tensor, split_point=_log1mexp_switch, exp_zero_eps=1e-7
) -> torch.Tensor:
    """
    Computes log(1 - exp(x)).

    Splits at x=log(1/2) for x in (-inf, 0] i.e. at -x=log(2) for -x in [0, inf).

    = log1p(-exp(x)) when x <= log(1/2)
    or
    = log(-expm1(x)) when log(1/2) < x <= 0

    For details, see

    https://cran.r-project.org/web/packages/Rmpfr/vignettes/log1mexp-note.pdf

    https://github.com/visinf/n3net/commit/31968bd49c7d638cef5f5656eb62793c46b41d76
    """
    logexpm1_switch = x > split_point
    Z = torch.zeros_like(x)
    # this clamp is necessary because expm1(log_p) will give zero when log_p=0,
    # ie. p=1
    logexpm1 = torch.log((-torch.expm1(x[logexpm1_switch])).clamp_min(1e-38))
    # hack the backward pass
    # if expm1(x) gets very close to zero, then the grad log() will produce inf
    # and inf*0 = nan. Hence clip the grad so that it does not produce inf
    logexpm1_bw = torch.log(-torch.expm1(x[logexpm1_switch]) + exp_zero_eps)
    Z[logexpm1_switch] = logexpm1.detach() + (logexpm1_bw - logexpm1_bw.detach())
    # Z[1 - logexpm1_switch] = torch.log1p(-torch.exp(x[1 - logexpm1_switch]))
    Z[~logexpm1_switch] = torch.log1p(-torch.exp(x[~logexpm1_switch]))

    return Z


### BoxTensor

The `BoxTensor` class is a wrapper for a `torch.Tensor` and represents single or multiple boxes.

The **constructor** takes the following parameters and checks whether the tensor shape is correct:
* `data: torch.Tensor`: a Tensor that represents our box or boxes.
* `learnt_temp: bool`: whether the tensor is expected to carry learned temperatures in addition to the box corners. In the code shown here it only changes the shape check: the second-to-last dimension must have size `4` when `learnt_temp=True` and size `2` otherwise. By default it is set to `False`.

`BoxTensor` exposes the following methods and properties.

**Properties**

* `z: torch.Tensor` : lower left coordinates as Tensor.
* `Z: torch.Tensor` : top right coordinates as Tensor.
* `box_type: String`: the class name.
* `centre: torch.Tensor`: centre coordinates as Tensor.

**Class methods**

* `from_zZ`:
  1. checks if `z` and `Z` have the same shape. If not, it raises a `ValueError`.
  2. creates a `torch.Tensor` by stacking `z` and `Z` along dimension `-2`.
  3. wraps the torch tensor with a `BoxTensor` and returns it.
* `from_split`: creates a `BoxTensor` by splitting a `torch.Tensor` on the dimension `dim` at the midpoint. It raises a `ValueError` if the size of that dimension is odd.
* `_log_soft_volume`: computes the logarithm of the soft volume of a box, i.e. the sum over all dimensions of `log(softplus(Z - z))`, using `nn.functional.softplus`, plus `log(scale)`.
* `_log_soft_volume_adjusted`: same as `_log_soft_volume`, but each side length `Z - z` is shifted by `- 2 * euler_gamma * gumbel_beta` before applying `nn.functional.softplus`.
* `get_wW`: returns the tuple `z, Z`.
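The two volume computations can be illustrated on a single box with plain Python lists. This is a sketch under simplifying assumptions: the scalar `softplus` helper and the function names are hypothetical stand-ins for the tensor code, and, as in the class, `temp` is passed as the `beta` of the softplus.

```python
import math

EULER_GAMMA = 0.57721566490153286060

def softplus(x, beta=1.0):
    """(1 / beta) * log(1 + exp(beta * x)), written to avoid overflow."""
    bx = beta * x
    return (max(bx, 0.0) + math.log1p(math.exp(-abs(bx)))) / beta

def log_soft_volume(z, Z, temp=1.0, scale=1.0):
    sides = (softplus(hi - lo, beta=temp) for lo, hi in zip(z, Z))
    return sum(math.log(s + 1e-23) for s in sides) + math.log(scale)

def log_soft_volume_adjusted(z, Z, temp=1.0, gumbel_beta=1.0, scale=1.0):
    shift = 2 * EULER_GAMMA * gumbel_beta  # shrinks every side before softplus
    sides = (softplus(hi - lo - shift, beta=temp) for lo, hi in zip(z, Z))
    return sum(math.log(s + 1e-23) for s in sides) + math.log(scale)

z, Z = [0.0, 0.0], [2.0, 3.0]            # a 2 x 3 box, hard volume 6
soft = log_soft_volume(z, Z)
sharp = log_soft_volume(z, Z, temp=20.0)  # a large beta approaches the hard volume
adjusted = log_soft_volume_adjusted(z, Z)
flipped = log_soft_volume([0.0, 0.0], [2.0, -1.0])  # second side is negative

print(soft, sharp, math.log(6.0))
print(adjusted, flipped)
```

The softplus keeps the log-volume finite even for a flipped box, where a hard volume would be undefined, which is what makes the quantity usable as a training signal.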

**Methods**

* `_intersection`: computes the intersection between two `BoxTensor` objects. It returns `z` and `Z` of the intersection. Note that it can return flipped boxes, i.e. where `z[i] > Z[i]`.
* `gumbel_intersection_log_volume`: computes a Bayesian (Gumbel) intersection between two `BoxTensor` objects by calling the `_intersection` method with the parameters `bayesian=True` and `gumbel_beta=intersection_temp`. It then computes the logarithm of the volume of the intersection by calling `_log_soft_volume_adjusted` and returns it.
* `intersection`: computes the intersection between two `BoxTensor` objects by calling the `_intersection` method. It returns the intersection as a `BoxTensor`. Note that it can return flipped boxes, i.e. where `z[i] > Z[i]`.
* `join`: returns the join of two `BoxTensor` objects, i.e. the smallest box that encloses both, as a `BoxTensor`.
* `log_soft_volume`: calls `_log_soft_volume` and returns the result.
* `intersection_log_soft_volume`: computes the intersection between two `BoxTensor` objects by calling the `_intersection` method. It then computes the logarithm of the volume of the intersection by calling `_log_soft_volume` and returns it.
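The hard intersection, the join and the Gumbel intersection can be compared on two small boxes. The sketch below works on plain lists of coordinates; all function names are hypothetical, and `logaddexp` is a scalar stand-in for `torch.logaddexp`.

```python
import math

def hard_intersection(z1, Z1, z2, Z2):
    """Largest lower corner and smallest upper corner, per dimension."""
    return ([max(a, b) for a, b in zip(z1, z2)],
            [min(a, b) for a, b in zip(Z1, Z2)])

def join(z1, Z1, z2, Z2):
    """Smallest box enclosing both boxes."""
    return ([min(a, b) for a, b in zip(z1, z2)],
            [max(a, b) for a, b in zip(Z1, Z2)])

def logaddexp(a, b):
    return max(a, b) + math.log1p(math.exp(-abs(a - b)))

def gumbel_intersection(z1, Z1, z2, Z2, beta=1.0):
    """Smoothed max / min; the outer max / min only guard against rounding."""
    z = [max(beta * logaddexp(a / beta, b / beta), a, b) for a, b in zip(z1, z2)]
    Z = [min(-beta * logaddexp(-a / beta, -b / beta), a, b) for a, b in zip(Z1, Z2)]
    return z, Z

A = ([0.0, 0.0], [2.0, 2.0])
B = ([1.0, 1.0], [3.0, 3.0])
hard = hard_intersection(*A, *B)
joined = join(*A, *B)
soft = gumbel_intersection(*A, *B, beta=1.0)
nearly_hard = gumbel_intersection(*A, *B, beta=0.01)

# Disjoint boxes give a flipped "intersection" with z > Z.
flipped = hard_intersection([0.0], [1.0], [2.0], [3.0])

print(hard, joined)
print(soft, nearly_hard, flipped)
```

The Gumbel intersection always lies inside the hard one and approaches it as `beta` goes to zero, which is why the temperature controls how smooth the operation is.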


In [None]:
# box/box_wrapper.py

def _box_shape_ok(t: torch.Tensor, learnt_temp=False) -> bool:
  '''
  Performs the following checks:
  - len(t.shape) should be >= 2
  - if learnt_temp==True -> t.size(-2) should be 4
  - if learnt_temp==False -> t.size(-2) should be 2
  '''
  if len(t.shape) < 2:
      return False
  if not learnt_temp:
      if t.size(-2) != 2:
          return False
      return True
  else:
      if t.size(-2) != 4:
          return False

      return True


def _shape_error_str(tensor_name, expected_shape, actual_shape):
    return "Shape of {} has to be {} but is {}".format(
        tensor_name, expected_shape, tuple(actual_shape)
    )


# see: https://realpython.com/python-type-checking/#type-hints-for-methods
# to know why we need to use TypeVar
TBoxTensor = TypeVar("TBoxTensor", bound="BoxTensor")


class BoxTensor(object):
    """A wrapper which contains a single tensor that
    represents single or multiple boxes.

    Have to use composition instead of inheritance since
    it is not safe to inherit from :class:`torch.Tensor`, because
    creating an instance of such a class will always make it a leaf node.
    This works for :class:`torch.nn.Parameter` but won't work for a general
    box_tensor.
    """

    def __init__(self, data: torch.Tensor, learnt_temp: bool = False) -> None:
        """
        .. todo:: Validate the values of z, Z ? z < Z

        Arguments:
            data: Tensor of shape (**, zZ, num_dims). Here, zZ=2, where
                the 0th dim is for bottom left corner and 1st dim is for
                top right corner of the box
        """

        if _box_shape_ok(data, learnt_temp):
            self.data = data
        else:
            raise ValueError(_shape_error_str("data", "(**,2,num_dims)", data.shape))
        super().__init__()

    def __repr__(self):
        return "box_tensor_wrapper(" + self.data.__repr__() + ")"

    @property
    def z(self) -> torch.Tensor:
        """Lower left coordinate as Tensor"""

        return self.data[..., 0, :]

    @property
    def Z(self) -> torch.Tensor:
        """Top right coordinate as Tensor"""

        return self.data[..., 1, :]

    @property
    def box_type(self):
        return "BoxTensor"

    @property
    def centre(self) -> torch.Tensor:
        """Centre coordinate as Tensor"""

        return (self.z + self.Z) / 2

    @classmethod
    def from_zZ(cls: Type[TBoxTensor], z: torch.Tensor, Z: torch.Tensor) -> TBoxTensor:
        """
        Creates a box by stacking z and Z along -2 dim.
        That is if z.shape == Z.shape == (**, num_dim),
        then the result would be box of shape (**, 2, num_dim)
        """

        if z.shape != Z.shape:
            raise ValueError(
                "Shape of z and Z should be the same but is {} and {}".format(
                    z.shape, Z.shape
                )
            )
        box_val: torch.Tensor = torch.stack((z, Z), -2)

        return cls(box_val)

    @classmethod
    def from_split(cls: Type[TBoxTensor], t: torch.Tensor, dim: int = -1) -> TBoxTensor:
        """Creates a BoxTensor by splitting on the dimension dim at midpoint

        Args:
            t: input
            dim: dimension to split on

        Returns:
            BoxTensor: output BoxTensor

        Raises:
            ValueError: `dim` has to be even
        """
        len_dim = t.size(dim)

        if len_dim % 2 != 0:
            raise ValueError(
                "dim has to be even to split on it but is {}".format(t.size(dim))
            )
        split_point = int(len_dim / 2)
        z = t.index_select(
            dim,
            torch.tensor(list(range(split_point)), dtype=torch.int64, device=t.device),
        )

        Z = t.index_select(
            dim,
            torch.tensor(
                list(range(split_point, len_dim)), dtype=torch.int64, device=t.device
            ),
        )

        return cls.from_zZ(z, Z)

    def _intersection(
        self: TBoxTensor,
        other: TBoxTensor,
        gumbel_beta: float = 1.0,
        bayesian: bool = False,
    ) -> Tuple[torch.Tensor, torch.Tensor]:
        t1 = self
        t2 = other
        z, Z = None, None
        if bayesian:
            try:
                z = gumbel_beta * torch.logaddexp(
                    t1.z / gumbel_beta, t2.z / gumbel_beta
                )
                z = torch.max(z, torch.max(t1.z, t2.z))
                Z = -gumbel_beta * torch.logaddexp(
                    -t1.Z / gumbel_beta, -t2.Z / gumbel_beta
                )
                Z = torch.min(Z, torch.min(t1.Z, t2.Z))
            except Exception as e:
                print("Gumbel intersection is not possible")
                # breakpoint()
        else:
            z = torch.max(t1.z, t2.z)
            Z = torch.min(t1.Z, t2.Z)

        return z, Z

    def gumbel_intersection_log_volume(
        self: TBoxTensor,
        other: TBoxTensor,
        volume_temp=1.0,
        intersection_temp: float = 1.0,
        scale=1.0,
    ) -> torch.Tensor:
        z, Z = self._intersection(other, gumbel_beta=intersection_temp, bayesian=True)
        vol = self._log_soft_volume_adjusted(
            z, Z, temp=volume_temp, gumbel_beta=intersection_temp, scale=scale
        )
        return vol

    def intersection(self: TBoxTensor, other: TBoxTensor) -> TBoxTensor:
        """Gives intersection of self and other.

        .. note:: This function can give flipped boxes, i.e. where z[i] > Z[i]
        """
        z, Z = self._intersection(other)

        return self.from_zZ(z, Z)

    def join(self: TBoxTensor, other: TBoxTensor) -> TBoxTensor:
        """Gives join"""
        z = torch.min(self.z, other.z)
        Z = torch.max(self.Z, other.Z)

        return self.from_zZ(z, Z)

    @classmethod
    def _log_soft_volume(
        cls, z: torch.Tensor, Z: torch.Tensor, temp: float = 1.0, scale: Union[float, torch.Tensor] = 1.0
    ) -> torch.Tensor:
        eps = torch.finfo(z.dtype).tiny  # type: ignore

        if isinstance(scale, float):
            s = torch.tensor(scale)
        else:
            s = scale

        return torch.sum(
            torch.log(nn.functional.softplus(Z - z, beta=temp) + 1e-23), dim=-1
        ) + torch.log(
            s
        )  # need this eps so that the derivative of log does not blow

    def log_soft_volume(
        self, temp: float = 1.0, scale: Union[float, torch.Tensor] = 1.0
    ) -> torch.Tensor:
        res = self._log_soft_volume(self.z, self.Z, temp=temp, scale=scale)

        return res

    @classmethod
    def _log_soft_volume_adjusted(
        cls,
        z: torch.Tensor,
        Z: torch.Tensor,
        temp: float = 1.0,
        gumbel_beta: float = 1.0,
        scale: Union[float, torch.Tensor] = 1.0,
    ) -> torch.Tensor:
        eps = torch.finfo(z.dtype).tiny  # type: ignore

        if isinstance(scale, float):
            s = torch.tensor(scale)
        else:
            s = scale

        return (
            torch.sum(
                torch.log(
                    nn.functional.softplus(Z - z - 2 * euler_gamma * gumbel_beta, beta=temp) + 1e-23
                ),
                dim=-1,
            )
            + torch.log(s)
        )

    def intersection_log_soft_volume(
        self,
        other: TBoxTensor,
        temp: float = 1.0,
        gumbel_beta: float = 1.0,
        bayesian: bool = False,
        scale: Union[float, torch.Tensor] = 1.0,
    ) -> torch.Tensor:
        z, Z = self._intersection(other, gumbel_beta, bayesian)
        vol = self._log_soft_volume(z, Z, temp=temp, scale=scale)

        return vol

    @classmethod
    def get_wW(cls, z, Z):
        return z, Z


### DeltaBoxTensor

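The `DeltaBoxTensor` class inherits from the `BoxTensor` class. Like `BoxTensor`, it is a wrapper for a `torch.Tensor` and represents single or multiple boxes, but it uses a <u>different parametrization</u>: it stores the lower corner `w` and an unconstrained value `W`, from which the corners are obtained as `z = w` and `Z = z + softplus(W, beta=10)`. Since the softplus is always positive, `Z > z` holds by construction.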

`DeltaBoxTensor` overrides the following methods and properties.

**Properties**

* `z: torch.Tensor` : lower left coordinates as Tensor.
* `Z: torch.Tensor` : top right coordinates as Tensor.

**Class methods**

* `from_zZ(z, Z)`:
  1. checks if `z` and `Z` have the same shape. If not, it raises a `ValueError`.
  2. calls `get_wW` to get `w` and `W`.
  3. creates a `torch.Tensor` by stacking `w` and `W` along dimension `-2`.
  4. wraps the torch tensor with a `DeltaBoxTensor` and returns it.
* `get_wW(z, Z)`:
  1. checks if `z` and `Z` have the same shape. If not, it raises a `ValueError`.
  2. sets `w = z` and calls `_softplus_inverse` on `Z - z` to compute `W`.
  3. returns `w` and `W`.
* `from_split(t, dim)`: creates a `DeltaBoxTensor` by splitting the `torch.Tensor` `t` on the dimension `dim` at the midpoint. The two halves are interpreted directly as `w` and `W`.
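The round trip between the two parametrizations can be verified on scalars. This is a minimal sketch with hypothetical helper names; it mirrors `get_wW` and the `Z` property for a single coordinate, with `beta=10` as in the class.

```python
import math

def softplus(x, beta=1.0):
    """(1 / beta) * log(1 + exp(beta * x)), written to avoid overflow."""
    bx = beta * x
    return (max(bx, 0.0) + math.log1p(math.exp(-abs(bx)))) / beta

def softplus_inverse(y, beta=1.0, threshold=20.0):
    """Inverse of softplus; above the threshold softplus is numerically the identity."""
    if beta * y < threshold:
        return math.log(math.exp(beta * y) - 1.0) / beta
    return y

z, Z = 0.3, 0.8

# (z, Z) -> (w, W), as in get_wW
w, W = z, softplus_inverse(Z - z, beta=10.0)

# (w, W) -> (z, Z), as in the z and Z properties
Z_recovered = w + softplus(W, beta=10.0)

# Any real W, even a negative one, yields an upper corner above the lower one.
Z_from_negative_W = w + softplus(-1.0, beta=10.0)

print(W, Z_recovered, Z_from_negative_W)
```

Because `W` is unconstrained, gradient updates can never produce a flipped box in this parametrization, which is the main reason to prefer it over storing `Z` directly.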



In [None]:
# box/box_wrapper.py


def _softplus_inverse(t: torch.Tensor, beta=1.0, threshold=20):
    below_thresh = beta * t < threshold
    # work on a copy so that the caller's tensor is not modified in place
    res = t.clone()
    res[below_thresh] = torch.log(torch.exp(beta * t[below_thresh]) - 1.0) / beta

    return res


class DeltaBoxTensor(BoxTensor):
    """Same as BoxTensor but with a different parameterization: (**,wW, num_dims)

    z = w
    Z = z + delta(which is always positive)
    """

    @property
    def z(self) -> torch.Tensor:
        return self.data[..., 0, :]

    @property
    def Z(self) -> torch.Tensor:
        z = self.z
        Z = z + nn.functional.softplus(self.data[..., 1, :], beta=10)

        return Z

    @classmethod
    def from_zZ(cls: Type[TBoxTensor], z: Tensor, Z: Tensor) -> TBoxTensor:

        if z.shape != Z.shape:
            raise ValueError(
                "Shape of z and Z should be the same but is {} and {}".format(
                    z.shape, Z.shape
                )
            )
        w, W = cls.get_wW(z, Z)  # type:ignore

        box_val: torch.Tensor = torch.stack((w, W), -2)

        return cls(box_val)

    @classmethod
    def get_wW(cls, z, Z):
        if z.shape != Z.shape:
            raise ValueError(
                "Shape of z and Z should be the same but is {} and {}".format(
                    z.shape, Z.shape
                )
            )
        w = z
        W = _softplus_inverse(Z - z, beta=10.0)  # type:ignore

        return w, W

    @classmethod
    def from_split(cls: Type[TBoxTensor], t: torch.Tensor, dim: int = -1) -> TBoxTensor:
        """Creates a BoxTensor by splitting on the dimension dim at midpoint

        Args:
            t: input
            dim: dimension to split on

        Returns:
            BoxTensor: output BoxTensor

        Raises:
            ValueError: `dim` has to be even
        """
        len_dim = t.size(dim)

        if len_dim % 2 != 0:
            raise ValueError(
                "dim has to be even to split on it but is {}".format(t.size(dim))
            )
        split_point = int(len_dim / 2)
        w = t.index_select(
            dim,
            torch.tensor(list(range(split_point)), dtype=torch.int64, device=t.device),
        )

        W = t.index_select(
            dim,
            torch.tensor(
                list(range(split_point, len_dim)), dtype=torch.int64, device=t.device
            ),
        )
        box_val: torch.Tensor = torch.stack((w, W), -2)

        return cls(box_val)

### BoxEmbedding

The `BoxEmbedding` class inherits from `torch.nn.Embedding`, a lookup table that maps each index to a fixed-size vector. It is the model's embedding layer and is similar to the [AllenNLP](https://github.com/allenai/allennlp/tree/main) embedding, but it returns box tensors by splitting each looked-up vector into the two halves that parameterize `z` and `Z`.

The **constructor** takes the following parameters and initializes the layer's weights:
 * `num_embeddings: int`: the size of the dictionary of embeddings, i.e. the number of embeddings.
 * `box_embedding_dim: int`: the dimension of each box. Each embedding stores `box_embedding_dim*2` weights, which are later split into two halves of size `box_embedding_dim`.
 * `box_type`: the type of the box, either `"BoxTensor"` or `"DeltaBoxTensor"`. By default it is set to `"BoxTensor"`.
 * `init_interval_center`: By default it is set to `0.25`. It is stored on the layer, but `init_weights` does not read it.
 * `init_interval_delta`: By default it is set to `0.1`. It is stored on the layer, but `init_weights` does not read it.


`BoxEmbedding` exposes the following methods and properties.

**Properties**

* `all_boxes: TBoxTensor` : returns all the box embeddings.

**Methods**

* `init_weights`: initializes the layer's weights so that each minimum corner `z` is drawn uniformly from roughly `(0, 0.9)` and each maximum corner is `Z = z + 0.1`.
* `forward`:
  - calls the `nn.Embedding.forward` method;
  - calls the `from_split` method of `DeltaBoxTensor` or `BoxTensor`, depending on the value of the `box_type` parameter passed to the `BoxEmbedding` constructor.
* `get_volumes`: returns the `log_soft_volume` of all the box embeddings for a given temperature.
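
The shape bookkeeping of `forward` can be illustrated in isolation. This is a hedged sketch, not the class itself: it uses a plain `nn.Embedding` and `Tensor.chunk` in place of `from_split` (which uses `index_select`), and the sizes are made up.

```python
import torch
from torch import nn

num_embeddings, box_dim = 5, 3  # made-up sizes for illustration

# The lookup table stores 2 * box_dim weights per index.
emb = nn.Embedding(num_embeddings, 2 * box_dim)

idx = torch.tensor([[0, 2], [4, 1]])   # shape (2, 2)
flat = emb(idx)                        # shape (2, 2, 6)

# Split the last dimension at the midpoint and stack the halves on dim -2,
# which gives the (..., 2, box_dim) layout that the box tensors expect.
first_half, second_half = flat.chunk(2, dim=-1)
box_data = torch.stack((first_half, second_half), dim=-2)

print(tuple(flat.shape), tuple(box_data.shape))
```

Index `0` along the second-to-last dimension then holds the first half of each embedding and index `1` the second half.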

In [None]:
# box/modules.py

def _uniform_small(weight, emb_dim, param1, param2, box_type):
    with torch.no_grad():
        temp = torch.zeros_like(weight)
        torch.nn.init.uniform_(temp, 0.0 + 1e-7, 1.0 - 0.1 - 1e-7)
        # z = torch.min(temp[..., :emb_dim], temp[..., emb_dim:])
        z = temp[..., :emb_dim]
        Z = z + 0.1
        w, W = box_type.get_wW(z, Z)
        weight[..., :emb_dim] = w
        weight[..., emb_dim : emb_dim * 2] = W


class BoxEmbedding(nn.Embedding):
    box_types = {
        "DeltaBoxTensor": DeltaBoxTensor,
        "BoxTensor": BoxTensor,
    }

    def init_weights(self):
        _uniform_small(
            self.weight,
            self.box_embedding_dim,
            0.0 + 1e-7,
            1.0 - 1e-7,
            self.box_types[self.box_type],
        )

    def __init__(
        self,
        num_embeddings: int,
        box_embedding_dim: int,
        box_type="BoxTensor",
        init_interval_center=0.25,
        init_interval_delta=0.1,
    ) -> None:
        """Similar to allennlp embeddings but it returns box
        tensor by splitting the output of usual embeddings
        into z and Z

        Arguments:
            box_embedding_dim: Embedding weight would be box_embedding_dim*2
        """
        vector_emb_dim = box_embedding_dim * 2
        if box_type == "BoxTensorLearntTemp":
            # This branch is never taken: box_type can only be "BoxTensor" or
            # "DeltaBoxTensor", otherwise the lookup below raises a ValueError.
            vector_emb_dim = box_embedding_dim * 4

        super().__init__(num_embeddings, vector_emb_dim)
        self.box_type = box_type
        self.init_interval_delta = init_interval_delta
        self.init_interval_center = init_interval_center
        try:
            self.box = self.box_types[box_type]
        except KeyError as ke:
            raise ValueError("Invalid box type {}".format(box_type)) from ke
        self.box_embedding_dim = box_embedding_dim
        self.init_weights()

    def forward(self, inputs: torch.LongTensor):
        emb = super().forward(inputs)  # shape (**, self.box_embedding_dim*2)
        box_emb = self.box.from_split(emb)
        return box_emb

    def get_volumes(self, temp: Union[float, torch.Tensor]) -> torch.Tensor:
        return self.all_boxes.log_soft_volume(temp=temp)

    @property
    def all_boxes(self) -> TBoxTensor:
        all_index = torch.arange(
            0, self.num_embeddings, dtype=torch.long, device=self.weight.device
        )
        all_ = self.forward(all_index)

        return all_


## 3.3 models

### BaseModule

The `BaseModule` class inherits from `torch.nn.Module` and exposes the methods needed to save and load the model's parameters and the training checkpoints.

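To do so, it uses the following methods defined in `torch.nn.Module`:

* `load_state_dict` (https://pytorch.org/docs/stable/generated/torch.nn.Module.html#torch.nn.Module.load_state_dict): copies parameters and buffers from a state dictionary into the module.
* `eval`: sets the module in evaluation mode.
* `state_dict`: returns a dictionary containing the module's parameters and buffers.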
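
The save/load round trip that `save_checkpoint` and `load_checkpoint` rely on can be sketched as follows. This is only an illustration under assumptions: it uses a small `nn.Linear` instead of a `BaseModule` subclass and an in-memory buffer instead of a file path.

```python
import io

import torch
from torch import nn

model = nn.Linear(4, 2)

# Save the state dictionary (here to an in-memory buffer rather than a path).
buffer = io.BytesIO()
torch.save(model.state_dict(), buffer)
buffer.seek(0)

# Load it into a freshly initialised module and switch to evaluation mode.
restored = nn.Linear(4, 2)
restored.load_state_dict(torch.load(buffer, map_location="cpu"))
restored.eval()

same_weights = torch.equal(model.weight, restored.weight)
print(same_weights, restored.training)
```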

In [None]:
# models/BaseModule.py

class BaseModule(nn.Module):
    def __init__(self):
        super(BaseModule, self).__init__()
        self.zero_const = nn.Parameter(torch.Tensor([0]))
        self.zero_const.requires_grad = False
        self.pi_const = nn.Parameter(torch.Tensor([3.14159265358979323846]))
        self.pi_const.requires_grad = False

    def load_checkpoint(self, path):
        self.load_state_dict(torch.load(os.path.join(path), map_location="cpu"))
        self.eval()

    def save_checkpoint(self, path):
        torch.save(self.state_dict(), path)

    def load_parameters(self, path):
        # Parameters are stored as JSON lists and converted back to tensors.
        with open(path, "r") as f:
            parameters = json.loads(f.read())
        for i in parameters:
            parameters[i] = torch.Tensor(parameters[i])
        self.load_state_dict(parameters, strict=False)
        self.eval()

    def save_parameters(self, path):
        with open(path, "w") as f:
            f.write(json.dumps(self.get_parameters("list")))

    def get_parameters(self, mode="numpy", param_dict=None):
        all_param_dict = self.state_dict()
        if param_dict == None:
            param_dict = all_param_dict.keys()
        res = {}
        for param in param_dict:
            if mode == "numpy":
                res[param] = all_param_dict[param].cpu().numpy()
            elif mode == "list":
                res[param] = all_param_dict[param].cpu().numpy().tolist()
            else:
                res[param] = all_param_dict[param]
        return res

    def set_parameters(self, parameters):
        for i in parameters:
            parameters[i] = torch.Tensor(parameters[i])
        self.load_state_dict(parameters, strict=False)
        self.eval()


### Word2Box

The `Word2Box` class inherits from `BaseModule`.

The **constructor** takes the following parameters:
* `TEXT`: an object that contains some text statistics, used to get the size of the vocabulary (`len(TEXT.itos)`).
* `embedding_dim`: the dimension of the `BoxEmbedding` layers. By default it is set to `50`.
* `batch_size`: the batch size. By default it is set to `10`.
* `n_gram`: the window size. By default it is set to `4`.
* `volume_temp`: volume temperature to adjust the "sharpness" of the distribution. A high temperature makes the distribution more uniform, while a low temperature makes it more sharply peaked. By default it is set to `1.0`.
* `intersection_temp`: the beta coefficient of the Gumbel distribution. When it is `0.0`, the score is computed with `intersection_log_soft_volume` instead of `gumbel_intersection_log_volume`. By default it is set to `1.0`.
* `box_type`: the type of the box. By default it is set to `"BoxTensor"`.

It initializes the model and creates the `BoxEmbedding` layers for the target and context words.

`Word2Box` defines the following methods:

* `forward`:
  - applies the `BoxEmbedding` layers to the context and target words;
  - during training, unsqueezes the word boxes so that they broadcast against the context and the negative samples;
  - computes the score, choosing the method based on the value of the `intersection_temp` parameter.
* `word_similarity(w1, w2)`: computes the box embedding of each of the input words `w1` and `w2`, and then computes the log volume of the intersection between the two boxes. The method used to compute this volume is chosen based on the value of the `intersection_temp` parameter.
* `conditional_similarity(w1, w2)`: same as `word_similarity`, but it then subtracts the log volume of the box of `w1` (computed with `_log_soft_volume_adjusted`), which gives the log of the conditional probability `P(w2 | w1)`.
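
To see why subtracting the log volume of `w1` yields a conditional score, consider the following simplified sketch. It is an assumption-laden illustration: it uses hard (exact) box volumes instead of the soft or Gumbel volumes used by the model, and the box coordinates for "queen" and "royalty" are invented.

```python
import math

import torch


def hard_log_volume(z, Z):
    # Log of the product of side lengths; empty boxes get a tiny side length.
    return (Z - z).clamp_min(1e-12).log().sum(dim=-1)


# Invented 2-D boxes: "queen" lies entirely inside "royalty".
queen_z, queen_Z = torch.tensor([0.2, 0.2]), torch.tensor([0.4, 0.4])
royalty_z, royalty_Z = torch.tensor([0.1, 0.1]), torch.tensor([0.6, 0.6])

# Hard intersection: max of the minimum corners, min of the maximum corners.
inter_z = torch.max(queen_z, royalty_z)
inter_Z = torch.min(queen_Z, royalty_Z)
log_joint = hard_log_volume(inter_z, inter_Z)

# log P(w2 | w1) = log vol(w1 ∩ w2) - log vol(w1)
log_p_royalty_given_queen = log_joint - hard_log_volume(queen_z, queen_Z)
log_p_queen_given_royalty = log_joint - hard_log_volume(royalty_z, royalty_Z)

print(math.exp(log_p_royalty_given_queen.item()))
print(math.exp(log_p_queen_given_royalty.item()))
```

The intersection volume alone is symmetric in the two words, whereas the conditional score is not: the contained word predicts the more general one with probability 1, but not the other way round.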

In [None]:
# models/word2box.py

class Word2Box(BaseModule):
    def __init__(
        self,
        TEXT=None,
        embedding_dim=50,
        batch_size=10,
        n_gram=4,
        volume_temp=1.0,
        intersection_temp=1.0,
        box_type="BoxTensor",
        **kwargs
    ):
        super(Word2Box, self).__init__()

        # Model
        self.batch_size = batch_size
        self.n_gram = n_gram
        self.vocab_size = len(TEXT.itos)
        self.embedding_dim = embedding_dim

        # Box features
        self.volume_temp = volume_temp
        self.intersection_temp = intersection_temp
        self.box_type = box_type

        # Create embeddings
        self.embeddings_word = BoxEmbedding(
            self.vocab_size, self.embedding_dim, box_type=box_type
        )
        self.embedding_context = BoxEmbedding(
            self.vocab_size, self.embedding_dim, box_type=box_type
        )

    def forward(self, idx_word, idx_context, context_mask, train=True):
        context_boxes = self.embedding_context(idx_context)  # Batch_size * 2 * dim
        word_boxes = self.embeddings_word(idx_word)  # Batch_size * ns+1 * 2 * dim
        if train == True:
            word_boxes.data.unsqueeze_(
                1
            )  # Broadcast the word vector to the context + negative_samples.

        if self.intersection_temp == 0.0:
            score = word_boxes.intersection_log_soft_volume(
                context_boxes, temp=self.volume_temp
            )
        else:
            score = word_boxes.gumbel_intersection_log_volume(
                context_boxes,
                volume_temp=self.volume_temp,
                intersection_temp=self.intersection_temp,
            )

        return score

    def word_similarity(self, w1, w2):
        with torch.no_grad():
            word1 = self.embeddings_word(w1)
            word2 = self.embeddings_word(w2)
            if self.intersection_temp == 0.0:
                score = word1.intersection_log_soft_volume(word2, temp=self.volume_temp)
            else:
                score = word1.gumbel_intersection_log_volume(
                    word2,
                    volume_temp=self.volume_temp,
                    intersection_temp=self.intersection_temp,
                )
            return score

    def conditional_similarity(self, w1, w2):
        with torch.no_grad():
            word1 = self.embeddings_word(w1)
            word2 = self.embeddings_word(w2)
            if self.intersection_temp == 0.0:
                score = word1.intersection_log_soft_volume(word2, temp=self.volume_temp)
            else:
                score = word1.gumbel_intersection_log_volume(
                    word2,
                    volume_temp=self.volume_temp,
                    intersection_temp=self.intersection_temp,
                )

            # Word1 Word2  queen   royalty 5.93
            # Word2 is more general P(royalty | queen) = 1
            # Thus we need p(w2 | w1)
            score -= word1._log_soft_volume_adjusted(
                word1.z,
                word1.Z,
                temp=self.volume_temp,
                gumbel_beta=self.intersection_temp,
            )

            return score


### Word2BoxConjunction

`Word2BoxConjunction` is the CBOW implementation of the Word2Box algorithm. It inherits from the `Word2Box` class and overrides its `forward` method. Furthermore,
it defines a new method called `intersect_multiple_box`, which masks the input boxes and pools them into a single box that approximates their intersection.

* `forward`: similar to `Word2Box`'s `forward` method, but instead of using the box embeddings of the context as they are, it first pools them with the `intersect_multiple_box` method.
* `intersect_multiple_box`:
    - Masks the excluded positions by setting their `z` to `-inf` and their `Z` to `+inf`, so that they do not contribute to the result.
    - Then, it pools the boxes along the context dimension with `z = beta * logsumexp(z / beta)` (a smooth maximum of the minimum corners) and `Z = -beta * logsumexp(-Z / beta)` (a smooth minimum of the maximum corners), where `beta` is `intersection_temp`.
    - Finally, it returns a new `BoxTensor` with the resulting `z` and `Z`.
    - Note that <u>`intersection_temp` should always be `!=0`</u>, otherwise there would be a division by 0.
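
The pooling step can be tried on a tiny example. This is a minimal sketch with invented numbers (one batch element, three context positions, one dimension), assuming that `False` entries in the mask mark positions to ignore.

```python
import torch

beta = 0.1  # hypothetical intersection temperature, must be non-zero

# Shapes: (batch=1, context=3, dim=1). Values are invented for illustration.
z = torch.tensor([[[0.1], [0.3], [0.9]]])   # minimum corners
Z = torch.tensor([[[0.8], [0.7], [1.0]]])   # maximum corners
mask = torch.tensor([[True, True, False]])  # third context position is ignored

z = z.clone()
Z = Z.clone()
z[~mask] = float("-inf")  # ignored boxes cannot raise the pooled minimum corner
Z[~mask] = float("inf")   # ignored boxes cannot lower the pooled maximum corner

# Smooth max of the minimum corners, smooth min of the maximum corners.
z_pooled = beta * torch.logsumexp(z / beta, dim=1, keepdim=True)
Z_pooled = -beta * torch.logsumexp(-Z / beta, dim=1, keepdim=True)

print(z_pooled.item(), Z_pooled.item())
```

The pooled corners land slightly inside the hard intersection `[0.3, 0.7]` of the two unmasked boxes, and they approach it as `beta` gets smaller.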


In [None]:
# models/word2box.py

class Word2BoxConjunction(Word2Box):

    def intersect_multiple_box(self, boxes, mask):
        beta = self.intersection_temp
        z = boxes.z.clone()
        Z = boxes.Z.clone()

        z[~mask] = float("-inf")
        Z[~mask] = float("inf")
        z = beta * torch.logsumexp(z / beta, dim=1, keepdim=True)
        Z = -beta * torch.logsumexp(-Z / beta, dim=1, keepdim=True)

        return BoxTensor.from_zZ(z, Z)

    def forward(self, idx_word, idx_context, mask_context, train=True):
        word_boxes = self.embeddings_word(idx_word)  # Batch_size * ns+1 * 2 * dim
        context_boxes = self.embedding_context(idx_context)  # Batch_size * 2 * dim

        # Note that the context is not masked yet. We need to mask them as well.
        pooled_context = self.intersect_multiple_box(context_boxes, mask_context)

        if self.intersection_temp == 0.0:
            # Not reachable in practice: intersect_multiple_box above
            # requires intersection_temp != 0.
            score = word_boxes.intersection_log_soft_volume(
                pooled_context, temp=self.volume_temp
            )
        else:
            score = word_boxes.gumbel_intersection_log_volume(
                pooled_context,
                volume_temp=self.volume_temp,
                intersection_temp=self.intersection_temp,
            )
        return score

## 3.4 train

### Loss functions

* **Noise Contrastive Estimation** (`nce`) loss function:
  + can be used for any embeddings;
  + *Word2Vec* uses this loss function;
  + the *unnormalised* scores are passed through a sigmoid to normalise them.
  + input:
    - `pos`: unnormalised similarity score for positives.
    - `neg`: unnormalised similarity score for negatives.
  + output: `loss = -(logsigmoid(pos) + sum(logsigmoid(-neg)))`

* **Negative Log Likelihood** (`nll`) loss for box embeddings:
  + input:
    - `pos`: log probability for positive examples.
    - `neg`: log probability for negative examples.
  + output: `loss = -(pos + sum(log(1 - exp(neg))))`

* **Max Margin** (`max_margin`) loss for box embeddings:
  + encourages the model to rank the correct output above every incorrect output, so that the correct output is clearly separated from the incorrect ones;
  + input scores can be unnormalised;
  + goal: push the positive similarity score above each negative score by at least a margin. Once that margin is satisfied for a negative, its contribution to the loss is zero.
  + input:
    - `pos`: unnormalised similarity score for positives (may be a log score in the case of boxes).
    - `neg`: unnormalised similarity score for negatives (may be a log score in the case of boxes).
  + output: `loss = sum(max(0, neg - pos + margin))`, matching the `return` statement of `max_margin`.
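
As a worked example, the three formulas can be evaluated on one positive score and two negative scores. This is a sketch with invented values, not the repository's `loss.py`: it uses `torch.log1p(-torch.exp(neg))` in place of the `log1mexp` helper and `keepdim=True` to keep one loss value per row.

```python
import torch
import torch.nn.functional as F

# Invented scores: one positive and two negatives for a single example.
pos = torch.tensor([[-0.5]])        # shape (batch, 1)
neg = torch.tensor([[-3.0, -1.0]])  # shape (batch, negatives)
margin = 1.0

# Noise contrastive estimation on unnormalised scores.
nce_loss = -(F.logsigmoid(pos) + F.logsigmoid(-neg).sum(dim=1, keepdim=True))

# Negative log likelihood on log probabilities (all inputs must be < 0).
nll_loss = -(pos + torch.log1p(-torch.exp(neg)).sum(dim=1, keepdim=True))

# Max margin: only negatives closer than `margin` to the positive contribute.
mm_loss = torch.clamp(neg - pos + margin, min=0).sum(dim=1, keepdim=True)

print(nce_loss.item(), nll_loss.item(), mm_loss.item())
```

In the max-margin case the first negative (`-3.0`) is already more than `margin` below the positive and contributes nothing, while the second (`-1.0`) contributes `0.5`.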

In [None]:
# train/loss.py

def nll(pos, neg, **kwargs):
    """
    The Negative Log Likelihood loss function is used for
    box embeddings.

    Args:
        pos = log probability for positive examples
        neg = log probability for negative examples
    Output:
        loss = -(pos + sum(log(1 - exp(neg))))
    """
    assert (pos < 0).all(), "Log probability can not be positive"
    assert (neg < 0).all(), "Log probability can not be positive"
    return -(pos + torch.sum(log1mexp(neg), dim=1)) # log1mexp is defined in box.utils


def nce(pos, neg, **kwargs):
    """
    The Noise Contrastive Estimation loss function can be
    used for any embeddings.
    However, here we pass the unnormalised probabilities
    through sigmoid to normalise the score. Word2vec uses
    this loss function.

    Args:
        pos: Unnormalised similarity score for positives.
        neg: Unnormalised similarity score for negatives.
    Output:
        loss = -(logsigmoid(pos) + sum(logsigmoid(-neg)))
    """
    return -(nn.functional.logsigmoid(pos) + torch.sum(nn.functional.logsigmoid(-neg), dim=1))


def max_margin(pos, neg, margin=5.0):
    """
    This is max margin loss for box embeddings.
    Here, the input scores can be unnormalised. The objective
    is to push the pos similarity score above each negative score
    by at least a margin. If that margin is satisfied then the
    loss is zero.

    Args:
        pos: Unnormalised similarity (may be log in case of Boxes) score for positives.
        neg: Unnormalised similarity (may be log in case of Boxes) score for negatives.
    Output:
        loss = sum(max(0, neg - pos + margin))
    """
    # pos has shape (N, 1) and is broadcast against the negative scores
    zero = torch.tensor(0.0).to(device)
    return torch.sum(torch.max(zero, neg - pos + margin), dim=1)


### Negative Sampling

**RandomNegativeCBOW**

The class `RandomNegativeCBOW` is used to perform the negative sampling for the models based on *CBOW*.

The **constructor** takes the following parameters:
* `number_of_samples: int`: number of negative samples for each positive sample. By default it is set to `5`.
* `sampling_distn`: the sampling distribution, i.e. a tensor containing the probability of drawing each vocabulary index. Despite the `torch.LongTensor` annotation in the code, it holds floating-point probabilities.

`RandomNegativeCBOW` implements the `__call__(batch)` method, which samples the negative examples with replacement by using the `torch.multinomial` function. The negative samples are appended to the `batch`'s `center_word` and then the augmented batch is returned.


**RandomNegativeSkipGram**

The class `RandomNegativeSkipGram` inherits from `RandomNegativeCBOW` and it is used to perform the negative sampling for the models based on *SkipGram*.

It overrides the `__call__(batch)` method. This method appends the negative samples to the `batch`'s `context_words`.
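
The CBOW variant of this augmentation can be sketched on a toy batch. The distribution and the word indices below are invented, and the sketch uses `view` to shape the samples; it is an illustration of the idea rather than the class itself.

```python
import torch

torch.manual_seed(0)

# Invented sampling distribution over a 4-word vocabulary; index 0 plays the
# role of the padding token and therefore has probability 0.
sampling_distn = torch.tensor([0.0, 0.5, 0.3, 0.2])

center_word = torch.tensor([1, 2, 3])  # batch of 3 positive center words
batch_size = center_word.shape[0]
number_of_samples = 5

# Draw all negatives at once, then shape them as (batch, number_of_samples).
negatives = torch.multinomial(
    sampling_distn,
    num_samples=number_of_samples * batch_size,
    replacement=True,
).view(batch_size, number_of_samples)

# Column 0 keeps the positive word, the remaining columns are negatives.
augmented = torch.cat((center_word.unsqueeze(1), negatives), dim=-1)
print(tuple(augmented.shape))
```

This layout is what the training loop relies on when it reads the positive score from position `0` of the last dimension and the negative scores from the remaining positions.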

In [None]:
# train/negative_sampling.py

class RandomNegativeCBOW:
    """
    This augments a batch of data to include randomly sampled target center words.
    Appends the sampled words with 'center_words' of the batch
    """

    def __init__(self, number_of_samples: int = 5, sampling_distn: torch.LongTensor = None):
        self.number_of_samples = number_of_samples
        self.sampling_distn = sampling_distn

    def __call__(self, batch) -> torch.LongTensor:
        x, y = batch["context_words"].shape
        negatives = torch.multinomial(
            self.sampling_distn,
            num_samples=self.number_of_samples * x,
            replacement=True,
        ).reshape(x, self.number_of_samples)  # reshape replaces the deprecated Tensor.resize
        batch["center_word"] = torch.cat(
            (batch["center_word"].unsqueeze(1), negatives), dim=-1
        )
        return batch


class RandomNegativeSkipGram(RandomNegativeCBOW):
    """
    This augments a batch of data to include randomly sampled target context words.
    Appends the sampled words with 'context_words' of the batch
    """

    def __call__(self, batch) -> torch.LongTensor:
        x, y = batch["context_words"].shape
        negatives = torch.multinomial(
            self.sampling_distn,
            num_samples=self.number_of_samples * x * y,
            replacement=True,
        ).reshape(x, y, self.number_of_samples)  # reshape replaces the deprecated Tensor.resize
        batch["context_words"] = torch.cat(
            (batch["context_words"].unsqueeze(-1), negatives), dim=-1
        )
        return batch


### Trainer

The `Trainer` class exposes the methods needed to train the models defined in the `models` folder.

The **constructor** takes the following parameters:
* `train_iter: LazyDatasetLoader`: an instance of `LazyDatasetLoader` for the training dataset.
* `val_iter: LazyDatasetLoader`: an instance of `LazyDatasetLoader` for the validation dataset.
* `vocab`: an instance of `torchtext.data.Field()`. This object has two main attributes: `stoi` (string-to-index) and `itos` (index-to-string). `vocab.stoi` is a dictionary that maps each unique word in the vocabulary to a unique index, which is useful for converting a sequence of words into the sequence of indices fed to a model. `vocab.itos` is a list where the element at each index is the corresponding word, which is useful for converting a sequence of indices back into a sequence of words. The constructor also reads the word counts from `vocab.freqs`.
* `lr: float`: the learning rate. By default it is set to `0.001`.
* `n_gram: int`: the window size. By default it is set to `4`.
* `loss_fn: str`: the loss function. It can be `'nce'`, `'nll'` or `'max_margin'`. By default it is set to `None`, and the base class does not store it.
* `negative_samples: int`: number of negative samples for training. By default it is set to `5`.
* `log_frequency: int`: the log frequency, stored for use by the subclasses. By default it is set to `1000`.

It also sets the frequency of `'<pad>'` to `0`, so that padding is never drawn during negative sampling, and it builds the negative-sampling distribution by raising each word frequency to the power `0.75` and normalising the result.
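
The effect of the `0.75` exponent can be seen on a toy vocabulary. The counts below are invented, and the dictionary and list stand in for `vocab.freqs` and `vocab.itos`.

```python
import torch

# Invented word counts; "<pad>" is forced to 0 as in the constructor.
freqs = {"<pad>": 0, "the": 1000, "box": 100, "gumbel": 10}
itos = ["<pad>", "the", "box", "gumbel"]

counts = torch.tensor([freqs.get(word, 0) for word in itos], dtype=torch.float)

# Smoothed unigram distribution: count ** 0.75, normalised to sum to 1.
sampling = counts.pow(0.75)
sampling = sampling / sampling.sum()

# Plain relative frequencies, for comparison.
unigram = counts / counts.sum()
print(sampling.tolist())
print(unigram.tolist())
```

Compared with the raw relative frequencies, the smoothed distribution gives rare words a larger share and frequent words a smaller one, while padding keeps probability zero.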


**Methods**

* `train_model`: Not implemented.


In [None]:
# train/Trainer.py

torch.autograd.set_detect_anomaly(True)

criterions = {"nll": nll, "nce": nce, "max_margin": max_margin}


class Trainer:
    def __init__(
        self,
        train_iter,
        val_iter,
        vocab,
        lr=0.001,
        n_gram=4,
        loss_fn=None,
        negative_samples=5,
        log_frequency=1000,
    ):
        self.train_iter = train_iter
        self.val_iter = val_iter
        self.n_gram = n_gram
        self.lr = lr
        self.vocab = vocab
        self.vocab_size = len(self.vocab.itos)
        self.negative_samples = negative_samples
        self.log_frequency = log_frequency
        self.vocab.freqs["<pad>"] = 0  # Don't want to negative sample pads.
        sorted_freqs = torch.tensor(
            [self.vocab.freqs.get(key, 0) for key in self.vocab.itos]
        )
        self.sampling = torch.pow(sorted_freqs, 0.75)
        self.sampling = self.sampling / torch.sum(self.sampling)
        if use_cuda:
            self.sampling = self.sampling.cuda()

    def train_model(self, model, num_epochs=100, path="./", save_model=False):
        pass


### TrainerWordSimilarity

The `TrainerWordSimilarity` class inherits from `Trainer`.

The **constructor** takes the following parameters:

* `train_iter: LazyDatasetLoader`: an instance of `LazyDatasetLoader` for the training dataset.
* `val_iter: LazyDatasetLoader`: an instance of `LazyDatasetLoader` for the validation dataset.
* `vocab`: an instance of `torchtext.data.Field()`. This object has two main attributes: `stoi` (string-to-index) and `itos` (index-to-string).
`vocab.stoi` is a dictionary that maps each unique word in the vocabulary to a unique index. This is useful for converting a sequence of words into a sequence of indices, which can then be input to a model. `vocab.itos` is a list where the element at each index is the corresponding word. This is useful for converting a sequence of indices back into a sequence of words.
* `lr: float`: the learning rate. By default it is set to `0.001`.
* `n_gram: int`: the window size. By default it is set to `4`.
* `loss_fn: str`: the loss function. It can be `'nce'`, `'nll'` or `'max_margin'`. By default it is set to `'max_margin'`.
* `negative_samples: int`: number of negative samples for training. By default it is set to `5`.
* `log_frequency: int`: the number of intermediate evaluations per epoch: the model is evaluated every `len(train_iter) / log_frequency` batches. By default it is set to `1000`.
* `model_mode: str`: indicates whether the model is based on *SkipGram* or *CBOW*. It can be `'SkipGram'` or `'CBOW'`. By default it is set to `'CBOW'`.
* `margin: float`: the margin applicable for the `max_margin` loss. By default it is set to `0.0`.
* `similarity_datasets_dir`: path to the directory containing the similarity datasets.
* `subsampling_prob`: optional subsampling probabilities used to rescale the negative-sampling distribution. It is effectively unused, since the `training` method passes a `None` value.

It sets up the negative sampling by instantiating `RandomNegativeCBOW` or `RandomNegativeSkipGram`, depending on the value of `model_mode`.

**Methods**

* `train_model(model, num_epochs, path, save_model)`: trains a given model for a specified number of epochs.
  - It uses the Adam optimizer and the loss function selected by `loss_fn`.
  - For each batch, it adds the negative samples, computes the scores, splits them into the positive score (first element) and the negative scores (remaining elements), computes the loss, and updates the model parameters.
  - It raises a `RuntimeError` if the loss, the parameters or the gradients contain NaN values, or if a parameter becomes infinite, since these indicate problems with training.
  - It evaluates the model at regular intervals within each epoch and once more at the end of each epoch. It computes a metric (Spearman correlation) on the **Simlex-999** dataset and keeps track of the best value of this metric.
  - If the `save_model` flag is set, it saves the current model and the best model (according to the metric) to disk.

* `model_eval(model)`:
  - If the `similarity_datasets_dir` is not `None`, it evaluates the model using `model.word_similarity` on each row of each dataset present in the folder.
  - Then, it computes the `spearmanr` correlation between the predicted scores and the target scores.
  - It returns a dictionary containing the correlation value obtained for each similarity dataset.
* `to(batch, device)`: calls the `.to(device)` method for each item in the batch.
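
To make the evaluation metric concrete, here is a small pure-Python sketch of Spearman's rank correlation. It is a simplified stand-in for the `spearmanr` call used in `model_eval`: it assumes there are no tied values, and the scores are invented.

```python
def ranks(values):
    # Rank 1 goes to the smallest value (assumes no ties).
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def spearman_no_ties(predicted, target):
    # rho = 1 - 6 * sum(d^2) / (n * (n^2 - 1)), where d is the rank difference.
    n = len(predicted)
    d_squared = sum(
        (a - b) ** 2 for a, b in zip(ranks(predicted), ranks(target))
    )
    return 1.0 - 6.0 * d_squared / (n * (n * n - 1))


# Invented model scores (log volumes) and human similarity ratings.
predicted_scores = [-2.1, -0.3, -5.0, -1.2]
real_scores = [6.5, 9.1, 1.2, 3.0]

rho = spearman_no_ties(predicted_scores, real_scores)
print(rho)
```

Only the ordering of the scores matters, so the unnormalised log volumes returned by `word_similarity` can be compared directly with human ratings on a different scale.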


In [None]:
# train/Trainer.py

class TrainerWordSimilarity(Trainer):
    """TrainerWordSimilarity"""

    def __init__(
        self,
        train_iter,
        val_iter,
        vocab,
        lr=0.001,
        n_gram=4,
        loss_fn="max_margin",
        negative_samples=5,
        model_mode="CBOW",
        log_frequency=1000,
        margin=0.0,
        similarity_datasets_dir=None,
        subsampling_prob=None,
    ):
        super(TrainerWordSimilarity, self).__init__(
            train_iter,
            val_iter,
            vocab,
            lr=lr,
            n_gram=n_gram,
            loss_fn=loss_fn,
            negative_samples=negative_samples,
            log_frequency=log_frequency,
        )

        self.similarity_datasets_dir = similarity_datasets_dir
        self.margin = margin
        self.loss_fn = criterions[loss_fn]
        # If subsampling has been done earlier, then word count must have been changed
        # This is an expected word count based on the subsampling prob parameters.
        if subsampling_prob != None:
            # This branch is never taken because subsampling_prob is always None.
            self.sampling = (
                torch.min(torch.tensor(1.0).to(device), 1 - subsampling_prob.to(device))
                * self.sampling
            )
        if model_mode == "CBOW":
            self.add_negatives = RandomNegativeCBOW(negative_samples, self.sampling)
        elif model_mode == "SkipGram":
            self.add_negatives = RandomNegativeSkipGram(negative_samples, self.sampling)

    def to(self, batch, device):
        for k, v in batch.items():
            batch[k] = batch[k].to(device)
        return batch

    def train_model(
        self, model, num_epochs=100, path="./checkpoints", save_model=False
    ):
        ## Setting up the optimizers
        parameters = filter(lambda p: p.requires_grad, model.parameters())
        optimizer = torch.optim.Adam(params=parameters, lr=self.lr)
        metric = {}
        best_simlex_ws = -1
        ## Setting Up the loss function
        for epoch in tqdm(range(num_epochs)):
            epoch_loss = []
            model.train()

            for i, batch in enumerate(tqdm(self.train_iter)):
                # Create negative samples for the batch
                batch = self.to(batch, device)
                batch = self.add_negatives(batch)

                # Start the optimization
                optimizer.zero_grad()
                score = model.forward(
                    batch["center_word"],
                    batch["context_words"],
                    batch["context_mask"],
                    train=True,
                )
                assert (
                    score.shape[-1] == self.negative_samples + 1
                )  # check the shape of the score

                # Score log_intersection_volume (un-normalised) for Word2Box
                pos_score = score[..., 0].reshape(
                    -1, 1
                )  # The first element correspond to the Positive
                neg_score = score[..., 1:].reshape(
                    -1, self.negative_samples
                )  # The rest of the elements are for negative samples
                # Calculate Loss
                loss = self.loss_fn(
                    pos_score, neg_score, margin=self.margin
                )  # Margin is not required for nll or nce
                # Handled through kwargs in loss.py
                total_loss = torch.sum(loss)
                avg_loss = torch.mean(loss)
                if torch.isnan(loss).any():
                    raise RuntimeError("Loss value is nan :(")

                # Back-propagation
                total_loss.backward()
                optimizer.step()

                for param in model.parameters():
                    if torch.isinf(param).any():
                        raise RuntimeError("parameters went to infinity")
                    if torch.isnan(param).any():
                        raise RuntimeError("parameters went to nan")
                    if param.grad is not None:
                        if torch.isnan(param.grad).any():
                            raise RuntimeError("Gradient went to nan")

                epoch_loss.append(avg_loss.data.item())

                # Intermediate eval. max(1, ...) avoids a modulo by zero when
                # there are fewer batches than log_frequency.
                if i % max(1, int(len(self.train_iter) / self.log_frequency)) == 0:
                    # Start model eval
                    model.eval()
                    ws_metric = self.model_eval(
                        model
                    )  # This ws_metric contains correlations
                    metric.update({"epoch_loss": np.mean(epoch_loss)})
                    # Update the metric for wandb login
                    metric.update(ws_metric)

                    simlex_ws = metric["En-Simlex-999.Txt"]
                    best_simlex_ws = max(metric["En-Simlex-999.Txt"], best_simlex_ws)
                    metric.update({"best_simlex_ws": best_simlex_ws})
                    print(
                        "Epoch {0} | Loss: {1}| spearmanr: {2}".format(
                            epoch + 1, np.mean(epoch_loss), simlex_ws
                        )
                    )

                    if save_model:
                        model.save_checkpoint(Path(path) / "model.ckpt")
                        # save the best model
                        if simlex_ws == best_simlex_ws:
                            model.save_checkpoint(Path(path) / "best_model.ckpt")

                    model.train()

            # Logging training loss
            metric.update({"epoch_loss": np.mean(epoch_loss)})

            model.eval()
            ws_metric = self.model_eval(model)  # This ws_metric contains correlations

            # Update the metric
            metric.update(ws_metric)

            simlex_ws = metric["En-Simlex-999.Txt"]
            best_simlex_ws = max(simlex_ws, best_simlex_ws)
            metric.update({"best_simlex_ws": best_simlex_ws})
            print(
                "Epoch {0} | Loss: {1}| spearmanr: {2}".format(
                    epoch + 1, np.mean(epoch_loss), simlex_ws
                )
            )

            if save_model:
                model.save_checkpoint(Path(path) / "model.ckpt")
                # save the best model; best_simlex_ws was already updated with
                # max() above, so equality marks a new (or tied) best score
                if simlex_ws == best_simlex_ws:
                    model.save_checkpoint(Path(path) / "best_model.ckpt")

        print("Model trained.")
        print("Output saved.")

    def model_eval(self, model):
        if self.similarity_datasets_dir == None:
            return 0

        metrics = {}
        correlation = 0.0

        # similarity_file is expected to be in the format word1\tword2\tscore
        if self.similarity_datasets_dir is not None and self.vocab is not None:
            file_list = os.listdir(self.similarity_datasets_dir)
            for file in file_list:
                with xopen(os.path.join(self.similarity_datasets_dir, file)) as f:
                    reader = csv.reader(f, delimiter="\t")
                    real_scores = []
                    predicted_scores = []
                    missing_count = 0
                    total_count = 0
                    for row in reader:
                        row[0] = row[0].lower()
                        row[1] = row[1].lower()
                        if (
                            self.vocab.stoi.get(row[0], "<unk>") != "<unk>"
                            and self.vocab.stoi.get(row[1], "<unk>") != "<unk>"
                        ):
                            word1 = (
                                torch.tensor(self.vocab.stoi[row[0]], dtype=int)
                                .unsqueeze(0)
                                .to(device)
                            )
                            word2 = (
                                torch.tensor(self.vocab.stoi[row[1]], dtype=int)
                                .unsqueeze(0)
                                .to(device)
                            )
                            score = model.word_similarity(word1, word2)
                            if file.title() == "Hyperlex-Dev.Txt":
                                score = model.conditional_similarity(word1, word2)

                            predicted_scores.append(score.item())
                            real_scores.append(float(row[2]))
                        else:
                            missing_count += 1
                        total_count += 1
                    # print(f"{file.title()} missing data point: {missing_count} out of {total_count}")
                    # Calculate spearman's rank correlation coefficient between predicted scores and real scores
                    correlation = spearmanr(predicted_scores, real_scores)[0]
                    metrics[file.title()] = correlation

        pprint.pprint(metrics, width=1)

        return metrics
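
To make the evaluation step concrete, here is a minimal sketch of the quantity that `model_eval` reports for each similarity file: Spearman's rank correlation between predicted and human scores. The helper names (`rank`, `spearman`) and the toy scores are illustrative assumptions and are not part of the repository. The notebook itself relies on `scipy.stats.spearmanr`; this pure-Python stand-in ignores ties for brevity.

```python
def rank(values):
    """Return the 1-based rank of each value (assumes no ties)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0] * len(values)
    for position, index in enumerate(order, start=1):
        ranks[index] = position
    return ranks


def spearman(predicted, real):
    """Spearman's rho via the classic formula, valid when there are no ties."""
    n = len(predicted)
    d_squared = sum((p - r) ** 2 for p, r in zip(rank(predicted), rank(real)))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))


# Hypothetical scores for four word pairs.
real_scores = [9.1, 7.4, 3.2, 0.5]            # human judgements
predicted_scores = [-1.0, -2.5, -4.0, -9.0]   # model scores on a different scale

rho = spearman(predicted_scores, real_scores)
rho_reversed = spearman([1, 2, 3, 4], [4, 3, 2, 1])
```

Because only the ranks matter, the model's raw similarity scores do not need to be on the same scale as the human ratings.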


### Train

The `training` function takes as input the configuration needed to train the model:
- It creates a `LazyDatasetLoader` instance for the training dataset.
- It creates the correct model, based on the `config["model_type"]` value.
- It creates a trainer of type `TrainerWordSimilarity`.
- It calls the method `trainer.train_model` to start the model's training.

I've changed the function to make it return the `model` and the `trainer`, since we will need them later to test the model.

In [None]:
# train/train.py

def training(config):

    # Set the seed
    if config["seed"] is None:
        config["seed"] = random.randint(0, 2**32)
    torch.manual_seed(config["seed"])
    random.seed(config["seed"])

    # get_iter_on_device is defined in datasets.utils
    TEXT, train_iter = get_iter_on_device(
        config["batch_size"],
        config["dataset"],
        config["n_gram"],
        config["subsample_thresh"],
        config["data_device"],
        config["eos_mask"],
    )

    if config["model_type"] == "Word2Box":
        model = Word2Box(
            TEXT=TEXT,
            embedding_dim=config["embedding_dim"],
            batch_size=config["batch_size"],
            n_gram=config["n_gram"],
            intersection_temp=config["int_temp"],
            volume_temp=config["vol_temp"],
            box_type=config["box_type"],
            pooling=config["pooling"],
        )


    elif config["model_type"] == "Word2BoxConjunction":
        model = Word2BoxConjunction(
            TEXT=TEXT,
            embedding_dim=config["embedding_dim"],
            batch_size=config["batch_size"],
            n_gram=config["n_gram"],
            intersection_temp=config["int_temp"],
            volume_temp=config["vol_temp"],
            box_type=config["box_type"],
        )

    else:
        raise ValueError("Model type is not valid. Please enter a valid model type")

    if use_cuda:
        model.cuda()

    # Instance of trainer
    if config["model_type"] == "Word2Box":
        model_mode = "SkipGram"
    elif config["model_type"] == "Word2BoxConjunction":
        model_mode = "CBOW"

    trainer = TrainerWordSimilarity(
        train_iter=train_iter,
        val_iter=None,
        vocab=TEXT,
        lr=config["lr"],
        n_gram=config["n_gram"],
        loss_fn=config["loss_fn"],
        negative_samples=config["negative_samples"],
        model_mode=model_mode,
        log_frequency=config["log_frequency"],
        margin=config["margin"],
        similarity_datasets_dir=config["eval_file"],
        subsampling_prob=None,  # pass: subsampling_prob, when you want to adjust neg_sampling distn
    )

    trainer.train_model(
        model=model,
        num_epochs=config["num_epochs"],
        path=config.get("save_dir", False),
        save_model=config.get("save_model", False),
    )

    return model, trainer
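
The seeding logic at the top of `training` can be isolated as follows. This is a minimal sketch that uses only Python's `random` module; `resolve_seed` is a hypothetical helper name, and the real function additionally calls `torch.manual_seed` with the same value.

```python
import random


def resolve_seed(config):
    """Fill in a random seed if none was given, then seed Python's RNG."""
    if config.get("seed") is None:
        # Store the drawn seed back in the config so the run can be repeated.
        config["seed"] = random.randint(0, 2**32)
    random.seed(config["seed"])
    return config["seed"]


config_a = {"seed": 5}
resolve_seed(config_a)
first_draw = random.random()

resolve_seed(config_a)
second_draw = random.random()  # same seed, so the same number is drawn again

config_b = {"seed": None}
drawn_seed = resolve_seed(config_b)  # an integer, now stored in config_b["seed"]
```

Writing the drawn seed back into `config` is what makes an unseeded run reproducible afterwards, provided the configuration is logged or saved.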


# 4 - Training

In this section, we train the `Word2BoxConjunction` model. Before doing so, we need to:
- upload the training dataset to Colab. We can use an extract of the **Enwik8** dataset (https://mattmahoney.net/dc/textdata.html) that I preprocessed beforehand. Alternatively, we can use the dataset provided by the repository, which is a portion of the **Penn Treebank** dataset located in the `word2box/data/ptb` folder.
- create the `config` dictionary, which contains the configuration needed by the training algorithm.
- create the directory where the trained models will be saved.


## 4.1 Upload the dataset to Colab
> If you want to train the model using the **Penn Treebank** dataset, you can skip this section.


Before uploading the dataset, we need to create the folder that will hold the training data. Two requirements apply:
- the folder must be placed inside the `data` directory;
- the folder name (in this case `enwik8-preprocessed`) must be passed as the dataset name (`config["dataset"]`) in the training configuration.

In [None]:
! rm -rf /content/word2box/data/enwik8-preprocessed
! mkdir -p /content/word2box/data/enwik8-preprocessed

In [None]:
from google_drive_downloader import GoogleDriveDownloader as gdd

gdd.download_file_from_google_drive(file_id='1--8USt54TH2SDL2YzRoBm7EhoRE4EuX9',
                                    dest_path='/content/word2box/data/enwik8-preprocessed/train.txt')

## 4.2 Training parameters and configurations

I initialize most of the configuration parameters using the default values provided by the authors. I use only 1 training epoch due to time and hardware constraints.

In [None]:
SAVED_MODELS_DIR = '/content/saved-models'

# DATASET_NAME = 'ptb' # Penn Treebank
DATASET_NAME = 'enwik8-preprocessed' # Enwik8 preprocessed

config = {
    "alpha_dim": 32,
    "batch_size": 1024,
    "cuda": True,
    "data_device": 'cpu',
    "diag_context": 0,
    "eval_file": "./data/similarity_datasets/",
    "eos_mask": True,
    "embedding_dim": 50,
    "int_temp": 0.01,
    "log_frequency": 100,
    "lr": 0.01,
    "margin": 10,
    "n_gram": 4,
    "negative_samples": 5,
    "sep_output": 0,
    "subsample_thresh": 0.001,
    "vol_temp": 1.0,

    "num_epochs": 1,
    "box_type": "DeltaBoxTensor",
    "dataset": DATASET_NAME,
    "loss_fn": "nce",
    "model_type": "Word2BoxConjunction",
    "save_dir": SAVED_MODELS_DIR,
    "save_model": True,
    "seed": 5,
}
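
Since `training` indexes `config` directly, a missing key only surfaces as a `KeyError` partway through setup. A small pre-flight check can catch that earlier. The sketch below is only an illustration: `REQUIRED_KEYS`, `MODEL_SPECIFIC_KEYS` and `missing_config_keys` are hypothetical names, and the key lists are simply read off the `training` function shown in section 3.

```python
# Keys that `training` reads with config[...] regardless of the model type.
REQUIRED_KEYS = {
    "seed", "batch_size", "dataset", "n_gram", "subsample_thresh",
    "data_device", "eos_mask", "model_type", "embedding_dim", "int_temp",
    "vol_temp", "box_type", "lr", "loss_fn", "negative_samples",
    "log_frequency", "margin", "eval_file",
}
# Keys read only when a specific model type is selected.
MODEL_SPECIFIC_KEYS = {"Word2Box": {"pooling"}}


def missing_config_keys(config):
    """Return the sorted list of keys that `training` would fail to find."""
    extra = MODEL_SPECIFIC_KEYS.get(config.get("model_type"), set())
    return sorted((REQUIRED_KEYS | extra) - set(config))


# Toy configuration: every shared key present, values irrelevant here.
example = {key: None for key in REQUIRED_KEYS}

example["model_type"] = "Word2BoxConjunction"
missing_for_conjunction = missing_config_keys(example)

example["model_type"] = "Word2Box"
missing_for_word2box = missing_config_keys(example)
```

With `model_type` set to `Word2BoxConjunction`, as in the configuration above, `pooling` is never read, which is why it can be left out of the dictionary.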

In [None]:
# create the save_dir folder
! mkdir -p /content/saved-models

%cd /content/word2box
model, trainer = training(config)

/content/word2box
Loading VOCAB & Tokenized Training files ...
Creating iterable dataset ...
Creating iterable dataset on GPU/CPU...




{'En-Mc-30.Txt': -0.015522977529359997,
 'En-Men-Tr-3K.Txt': 0.016517837806777076,
 'En-Mturk-287.Txt': 0.0016186324829713758,
 'En-Mturk-771.Txt': 0.05519493695610718,
 'En-Rg-65.Txt': -0.001639452637944709,
 'En-Rw-Stanford.Txt': 0.033186378737330956,
 'En-Simlex-999.Txt': -0.04636905200715898,
 'En-Simverb-3500.Txt': 0.018994889783990804,
 'En-Verb-143.Txt': 0.03267956472000217,
 'En-Ws-353-All.Txt': -0.04734796087208728,
 'En-Ws-353-Rel.Txt': -0.09566764781278951,
 'En-Ws-353-Sim.Txt': 0.024757073121765068,
 'En-Yp-130.Txt': -0.14308218803755457}
Epoch 1 | Loss: 39.35222244262695| spearmanr: -0.04636905200715898




{'En-Mc-30.Txt': 0.08426759230224,
 'En-Men-Tr-3K.Txt': 0.032680797132741565,
 'En-Mturk-287.Txt': 0.1835035184732608,
 'En-Mturk-771.Txt': -0.001891978299496464,
 'En-Rg-65.Txt': -0.05952270787118613,
 'En-Rw-Stanford.Txt': -0.08810699320690556,
 'En-Simlex-999.Txt': -0.0833181369461044,
 'En-Simverb-3500.Txt': -0.008612081631543705,
 'En-Verb-143.Txt': 0.29526614188151523,
 'En-Ws-353-All.Txt': -0.0641912552397264,
 'En-Ws-353-Rel.Txt': -0.002437886494290873,
 'En-Ws-353-Sim.Txt': -0.039888671659203824,
 'En-Yp-130.Txt': 0.04023926245252218}
Epoch 1 | Loss: 28.16942441785658| spearmanr: -0.0833181369461044




{'En-Mc-30.Txt': -0.055192808993279996,
 'En-Men-Tr-3K.Txt': 0.04749468563171436,
 'En-Mturk-287.Txt': 0.18320619724520243,
 'En-Mturk-771.Txt': 0.018872373693271362,
 'En-Rg-65.Txt': -0.11317511758715088,
 'En-Rw-Stanford.Txt': -0.05703685960149499,
 'En-Simlex-999.Txt': -0.07937292028048337,
 'En-Simverb-3500.Txt': -0.0130754469028293,
 'En-Verb-143.Txt': 0.21273420985342853,
 'En-Ws-353-All.Txt': -0.04083244539824313,
 'En-Ws-353-Rel.Txt': 0.034243065177351595,
 'En-Ws-353-Sim.Txt': -0.034386412737646276,
 'En-Yp-130.Txt': 0.049522362846506025}
Epoch 1 | Loss: 22.508932249886648| spearmanr: -0.07937292028048337




{'En-Mc-30.Txt': -0.04582974318192,
 'En-Men-Tr-3K.Txt': 0.03731127260151803,
 'En-Mturk-287.Txt': 0.17208638331582038,
 'En-Mturk-771.Txt': 0.030730869377538262,
 'En-Rg-65.Txt': -0.10981688395781253,
 'En-Rw-Stanford.Txt': -0.07597118893414309,
 'En-Simlex-999.Txt': -0.05727529447725836,
 'En-Simverb-3500.Txt': -0.007193143987253322,
 'En-Verb-143.Txt': 0.16836107889510438,
 'En-Ws-353-All.Txt': -0.06836832503009081,
 'En-Ws-353-Rel.Txt': 0.03575251130798222,
 'En-Ws-353-Sim.Txt': -0.07907158855648268,
 'En-Yp-130.Txt': 0.08309395202367238}
Epoch 1 | Loss: 18.922114896774293| spearmanr: -0.05727529447725836



{'En-Mc-30.Txt': -0.0012319823435999998,
 'En-Men-Tr-3K.Txt': 0.036197596483796786,
 'En-Mturk-287.Txt': 0.16341036651115748,
 'En-Mturk-771.Txt': 0.03789988537606536,
 'En-Rg-65.Txt': -0.10111720786291237,
 'En-Rw-Stanford.Txt': -0.09319235937937928,
 'En-Simlex-999.Txt': -0.05239542152317089,
 'En-Simverb-3500.Txt': -0.01333882953867641,
 'En-Verb-143.Txt': 0.23375835375473022,
 'En-Ws-353-All.Txt': -0.061787955090395356,
 'En-Ws-353-Rel.Txt': 0.04961644149631373,
 'En-Ws-353-Sim.Txt': -0.07666367389777914,
 'En-Yp-130.Txt': 0.08319390465240976}
Epoch 1 | Loss: 16.382383976373248| spearmanr: -0.05239542152317089




{'En-Mc-30.Txt': -0.027596404496639998,
 'En-Men-Tr-3K.Txt': 0.04492964951077989,
 'En-Mturk-287.Txt': 0.18752403495629116,
 'En-Mturk-771.Txt': 0.02124380160986334,
 'En-Rg-65.Txt': -0.06536656324192454,
 'En-Rw-Stanford.Txt': -0.09235379357292155,
 'En-Simlex-999.Txt': -0.037841019697500325,
 'En-Simverb-3500.Txt': -0.011430556930702707,
 'En-Verb-143.Txt': 0.20338254837064176,
 'En-Ws-353-All.Txt': -0.04224833630498792,
 'En-Ws-353-Rel.Txt': 0.054800955669425,
 'En-Ws-353-Sim.Txt': -0.06045141253758996,
 'En-Yp-130.Txt': 0.08701292800875034}
Epoch 1 | Loss: 14.505314010088561| spearmanr: -0.037841019697500325




{'En-Mc-30.Txt': -0.04582974318192,
 'En-Men-Tr-3K.Txt': 0.04762983693606284,
 'En-Mturk-287.Txt': 0.18517532823314478,
 'En-Mturk-771.Txt': 0.009691418673629165,
 'En-Rg-65.Txt': -0.047570569284879544,
 'En-Rw-Stanford.Txt': -0.0825938237337264,
 'En-Simlex-999.Txt': -0.029304549187209573,
 'En-Simverb-3500.Txt': -0.015306203370961885,
 'En-Verb-143.Txt': 0.14354398154078424,
 'En-Ws-353-All.Txt': -0.04811574930985884,
 'En-Ws-353-Rel.Txt': 0.07371961587757991,
 'En-Ws-353-Sim.Txt': -0.08480574937133432,
 'En-Yp-130.Txt': 0.07499778909594487}
Epoch 1 | Loss: 13.081641162446529| spearmanr: -0.029304549187209573




{'En-Mc-30.Txt': -0.05272884430608,
 'En-Men-Tr-3K.Txt': 0.044135978929682754,
 'En-Mturk-287.Txt': 0.13354333583532296,
 'En-Mturk-771.Txt': 0.0007494900703616115,
 'En-Rg-65.Txt': -0.04680372853487315,
 'En-Rw-Stanford.Txt': -0.05656282678900459,
 'En-Simlex-999.Txt': -0.030815514487894283,
 'En-Simverb-3500.Txt': -0.00774954967022707,
 'En-Verb-143.Txt': 0.32352223732950536,
 'En-Ws-353-All.Txt': -0.06899151748831661,
 'En-Ws-353-Rel.Txt': 0.019035748085577665,
 'En-Ws-353-Sim.Txt': -0.04738915189464538,
 'En-Yp-130.Txt': 0.06369064797002916}
Epoch 1 | Loss: 11.974630227312446| spearmanr: -0.030815514487894283




{'En-Mc-30.Txt': -0.06775902889799999,
 'En-Men-Tr-3K.Txt': 0.04571839796938302,
 'En-Mturk-287.Txt': 0.14482582507979247,
 'En-Mturk-771.Txt': -0.006774119297924033,
 'En-Rg-65.Txt': -0.05431347932803923,
 'En-Rw-Stanford.Txt': -0.05761460279005529,
 'En-Simlex-999.Txt': -0.02171836823887775,
 'En-Simverb-3500.Txt': -0.010400913514419772,
 'En-Verb-143.Txt': 0.21757294050452206,
 'En-Ws-353-All.Txt': -0.06088426706736262,
 'En-Ws-353-Rel.Txt': 0.026398378918215736,
 'En-Ws-353-Sim.Txt': -0.04397478301904553,
 'En-Yp-130.Txt': 0.020157113462037607}
Epoch 1 | Loss: 11.089960466694627| spearmanr: -0.02171836823887775




{'En-Mc-30.Txt': 0.005667118780559999,
 'En-Men-Tr-3K.Txt': 0.037627335208557634,
 'En-Mturk-287.Txt': 0.10490331188391604,
 'En-Mturk-771.Txt': 0.015589143940881029,
 'En-Rg-65.Txt': -0.03450783375028783,
 'En-Rw-Stanford.Txt': -0.058185050013392636,
 'En-Simlex-999.Txt': -0.04816763186956716,
 'En-Simverb-3500.Txt': -0.010493697254246951,
 'En-Verb-143.Txt': 0.09289015406402076,
 'En-Ws-353-All.Txt': -0.050983938062133145,
 'En-Ws-353-Rel.Txt': 0.061794386861032995,
 'En-Ws-353-Sim.Txt': -0.061582320770554096,
 'En-Yp-130.Txt': 0.03105194999441165}
Epoch 1 | Loss: 10.363711568722247| spearmanr: -0.04816763186956716




{'En-Mc-30.Txt': -0.09855858748799999,
 'En-Men-Tr-3K.Txt': 0.03683574125529481,
 'En-Mturk-287.Txt': 0.14021485745311144,
 'En-Mturk-771.Txt': -0.012049814456660717,
 'En-Rg-65.Txt': -0.038104052439972995,
 'En-Rw-Stanford.Txt': -0.0553979921035665,
 'En-Simlex-999.Txt': -0.03966090224945068,
 'En-Simverb-3500.Txt': -2.4876348798352492e-05,
 'En-Verb-143.Txt': 0.10621174667950022,
 'En-Ws-353-All.Txt': -0.06964934812716754,
 'En-Ws-353-Rel.Txt': 0.01828364488671071,
 'En-Ws-353-Sim.Txt': -0.0988164887578517,
 'En-Yp-130.Txt': 0.043179535614546675}
Epoch 1 | Loss: 9.760335755902668| spearmanr: -0.03966090224945068




{'En-Mc-30.Txt': -0.06480227127336,
 'En-Men-Tr-3K.Txt': 0.03424383816521361,
 'En-Mturk-287.Txt': 0.1660377955132942,
 'En-Mturk-771.Txt': -0.03589509300883823,
 'En-Rg-65.Txt': -0.045481589310724185,
 'En-Rw-Stanford.Txt': -0.052714886821401144,
 'En-Simlex-999.Txt': -0.005991113031153622,
 'En-Simverb-3500.Txt': -0.011008804225875152,
 'En-Verb-143.Txt': 0.15664032983667833,
 'En-Ws-353-All.Txt': -0.02689890559962849,
 'En-Ws-353-Rel.Txt': 0.029259273079822186,
 'En-Ws-353-Sim.Txt': -0.025183772605426178,
 'En-Yp-130.Txt': 0.03446283345007463}
Epoch 1 | Loss: 9.514884224875056| spearmanr: -0.005991113031153622




{'En-Mc-30.Txt': -0.17469509632247998,
 'En-Men-Tr-3K.Txt': 0.020933509732785658,
 'En-Mturk-287.Txt': 0.10561688283125606,
 'En-Mturk-771.Txt': -0.005319727768175495,
 'En-Rg-65.Txt': -0.06623917512986284,
 'En-Rw-Stanford.Txt': -0.054030390850420246,
 'En-Simlex-999.Txt': 0.0069452763801586895,
 'En-Simverb-3500.Txt': -0.005142564168931948,
 'En-Verb-143.Txt': 0.2601380835761741,
 'En-Ws-353-All.Txt': -0.04262906149883779,
 'En-Ws-353-Rel.Txt': 0.054271339590480104,
 'En-Ws-353-Sim.Txt': -0.05603831950820361,
 'En-Yp-130.Txt': 0.06389471792036798}
Epoch 1 | Loss: 9.183371411489924| spearmanr: 0.0069452763801586895




{'En-Mc-30.Txt': 0.023900457465839995,
 'En-Men-Tr-3K.Txt': 0.020279380427799414,
 'En-Mturk-287.Txt': 0.07348497168159561,
 'En-Mturk-771.Txt': 0.009518868343324449,
 'En-Rg-65.Txt': -0.015971441827719424,
 'En-Rw-Stanford.Txt': -0.06195102791356305,
 'En-Simlex-999.Txt': -0.027491108297348892,
 'En-Simverb-3500.Txt': -0.01200795601394153,
 'En-Verb-143.Txt': 0.16976684478251686,
 'En-Ws-353-All.Txt': 0.007123304996430261,
 'En-Ws-353-Rel.Txt': 0.12160311648239928,
 'En-Ws-353-Sim.Txt': -0.021409182788909527,
 'En-Yp-130.Txt': 0.051246545692225774}
Epoch 1 | Loss: 8.846037524624874| spearmanr: -0.027491108297348892




{'En-Mc-30.Txt': -0.07441173355343998,
 'En-Men-Tr-3K.Txt': 0.03199774574696021,
 'En-Mturk-287.Txt': 0.13186130975130303,
 'En-Mturk-771.Txt': 0.04423892669165944,
 'En-Rg-65.Txt': -0.08903285535419089,
 'En-Rw-Stanford.Txt': -0.04939967762143172,
 'En-Simlex-999.Txt': 0.021522343313181834,
 'En-Simverb-3500.Txt': 0.016042215379007924,
 'En-Verb-143.Txt': 0.13932266165855425,
 'En-Ws-353-All.Txt': 0.0038640732943752004,
 'En-Ws-353-Rel.Txt': 0.07028597860472482,
 'En-Ws-353-Sim.Txt': -0.031125099111620525,
 'En-Yp-130.Txt': 0.10305116022823524}
Epoch 1 | Loss: 8.52764982981416| spearmanr: 0.021522343313181834




{'En-Mc-30.Txt': 0.0381914526516,
 'En-Men-Tr-3K.Txt': 0.035842395054924266,
 'En-Mturk-287.Txt': 0.09861660514191407,
 'En-Mturk-771.Txt': 0.038403365108321194,
 'En-Rg-65.Txt': -0.033978978060628245,
 'En-Rw-Stanford.Txt': -0.061893920469012854,
 'En-Simlex-999.Txt': 0.013338236741935077,
 'En-Simverb-3500.Txt': 0.01786114124811637,
 'En-Verb-143.Txt': 0.17782536253768386,
 'En-Ws-353-All.Txt': -0.12753136931341094,
 'En-Ws-353-Rel.Txt': -0.05604196625319314,
 'En-Ws-353-Sim.Txt': -0.14250448643552355,
 'En-Yp-130.Txt': 0.06221634669615284}
Epoch 1 | Loss: 8.234414281853795| spearmanr: 0.013338236741935077



{'En-Mc-30.Txt': 0.0850067817084,
 'En-Men-Tr-3K.Txt': 0.03536471763592297,
 'En-Mturk-287.Txt': 0.1314694220357301,
 'En-Mturk-771.Txt': 0.013024810613366024,
 'En-Rg-65.Txt': 0.08147021899205885,
 'En-Rw-Stanford.Txt': -0.06160731772427148,
 'En-Simlex-999.Txt': 0.013296052017197687,
 'En-Simverb-3500.Txt': 0.006111977712460313,
 'En-Verb-143.Txt': 0.09292434293395783,
 'En-Ws-353-All.Txt': 0.013158897422974698,
 'En-Ws-353-Rel.Txt': 0.08720547918459459,
 'En-Ws-353-Sim.Txt': -0.06044059044923624,
 'En-Yp-130.Txt': 0.16380569972910392}
Epoch 1 | Loss: 7.967373321768155| spearmanr: 0.013296052017197687



{'En-Mc-30.Txt': 0.12615499198464,
 'En-Men-Tr-3K.Txt': 0.04731404663285288,
 'En-Mturk-287.Txt': 0.11835716294281341,
 'En-Mturk-771.Txt': 0.06203333545598377,
 'En-Rg-65.Txt': 0.09318437251801863,
 'En-Rw-Stanford.Txt': -0.05398821582752298,
 'En-Simlex-999.Txt': -0.04741770562365888,
 'En-Simverb-3500.Txt': 0.00031589186911576016,
 'En-Verb-143.Txt': 0.19346777608889149,
 'En-Ws-353-All.Txt': -0.002713643508080892,
 'En-Ws-353-Rel.Txt': 0.07554344598201344,
 'En-Ws-353-Sim.Txt': -0.02282919538218028,
 'En-Yp-130.Txt': 0.05492813418405248}
Epoch 1 | Loss: 7.724700844422247| spearmanr: -0.04741770562365888



{'En-Mc-30.Txt': 0.1638536516988,
 'En-Men-Tr-3K.Txt': 0.03020844275829388,
 'En-Mturk-287.Txt': 0.12597382552168954,
 'En-Mturk-771.Txt': 0.05464883495977183,
 'En-Rg-65.Txt': 0.05571494690563713,
 'En-Rw-Stanford.Txt': -0.05987428776977367,
 'En-Simlex-999.Txt': -0.006799409776523653,
 'En-Simverb-3500.Txt': -0.0019052010117591286,
 'En-Verb-143.Txt': 0.011713710173439036,
 'En-Ws-353-All.Txt': 0.007829629088156376,
 'En-Ws-353-Rel.Txt': 0.06903610078066995,
 'En-Ws-353-Sim.Txt': 0.01460363522703924,
 'En-Yp-130.Txt': 0.084809805483664}
Epoch 1 | Loss: 7.5014986058151765| spearmanr: -0.006799409776523653



{'En-Mc-30.Txt': 0.22964150884704,
 'En-Men-Tr-3K.Txt': 0.03536849578933267,
 'En-Mturk-287.Txt': 0.11854760570298821,
 'En-Mturk-771.Txt': 0.04198169606613129,
 'En-Rg-65.Txt': 0.1607721296565134,
 'En-Rw-Stanford.Txt': -0.040708048493946825,
 'En-Simlex-999.Txt': 0.02238832620989688,
 'En-Simverb-3500.Txt': 0.016647084223613787,
 'En-Verb-143.Txt': 0.2501670002145275,
 'En-Ws-353-All.Txt': 0.032968399224231566,
 'En-Ws-353-Rel.Txt': 0.09574543769348688,
 'En-Ws-353-Sim.Txt': 0.001078343803817475,
 'En-Yp-130.Txt': 0.019245045724809045}
Epoch 1 | Loss: 7.295779786123666| spearmanr: 0.02238832620989688



{'En-Mc-30.Txt': 0.24615007225127997,
 'En-Men-Tr-3K.Txt': 0.019658005241232665,
 'En-Mturk-287.Txt': 0.12363140576724223,
 'En-Mturk-771.Txt': 0.025088416620319094,
 'En-Rg-65.Txt': 0.10949957054401677,
 'En-Rw-Stanford.Txt': -0.046145649031635434,
 'En-Simlex-999.Txt': 0.012652582713192553,
 'En-Simverb-3500.Txt': -0.0005720992746437917,
 'En-Verb-143.Txt': 0.10054644981992811,
 'En-Ws-353-All.Txt': 0.012273781360617578,
 'En-Ws-353-Rel.Txt': 0.08819135508193038,
 'En-Ws-353-Sim.Txt': 0.010283302954970517,
 'En-Yp-130.Txt': 0.0766178546200632}
Epoch 1 | Loss: 7.107727367856746| spearmanr: 0.012652582713192553



{'En-Mc-30.Txt': 0.25452755218775996,
 'En-Men-Tr-3K.Txt': 0.02179656427552202,
 'En-Mturk-287.Txt': 0.13493118417563407,
 'En-Mturk-771.Txt': 0.0366970913218075,
 'En-Rg-65.Txt': 0.10061479495773577,
 'En-Rw-Stanford.Txt': -0.04671048903995938,
 'En-Simlex-999.Txt': -0.011775895105110153,
 'En-Simverb-3500.Txt': -0.015716076885593067,
 'En-Verb-143.Txt': 0.2510237330729505,
 'En-Ws-353-All.Txt': 0.018521772162825065,
 'En-Ws-353-Rel.Txt': 0.10637403434628516,
 'En-Ws-353-Sim.Txt': 0.0027217552209615266,
 'En-Yp-130.Txt': 0.02534632077065307}
Epoch 1 | Loss: 7.0132556897254155| spearmanr: -0.011775895105110153



{'En-Mc-30.Txt': 0.10742886036192,
 'En-Men-Tr-3K.Txt': 0.03093906355312371,
 'En-Mturk-287.Txt': 0.22671254455655576,
 'En-Mturk-771.Txt': 0.04923908918683661,
 'En-Rg-65.Txt': 0.01285119325872788,
 'En-Rw-Stanford.Txt': -0.04633695934872325,
 'En-Simlex-999.Txt': -0.03756894423292303,
 'En-Simverb-3500.Txt': -0.0053607461284579955,
 'En-Verb-143.Txt': 0.005607980224677612,
 'En-Ws-353-All.Txt': -0.005493246955862835,
 'En-Ws-353-Rel.Txt': 0.05492066341329611,
 'En-Ws-353-Sim.Txt': -0.05919836930749095,
 'En-Yp-130.Txt': 0.07990379728980444}
Epoch 1 | Loss: 6.895372318939608| spearmanr: -0.03756894423292303



{'En-Mc-30.Txt': -0.041148210276239994,
 'En-Men-Tr-3K.Txt': 0.04155673970576589,
 'En-Mturk-287.Txt': 0.2349028930292185,
 'En-Mturk-771.Txt': 0.028178293448355086,
 'En-Rg-65.Txt': 0.05431347932803923,
 'En-Rw-Stanford.Txt': -0.04256875972184942,
 'En-Simlex-999.Txt': -0.041044798545229896,
 'En-Simverb-3500.Txt': -0.023669801693668942,
 'En-Verb-143.Txt': 0.0801336833575011,
 'En-Ws-353-All.Txt': 0.019317271145096734,
 'En-Ws-353-Rel.Txt': 0.0661752066190635,
 'En-Ws-353-Sim.Txt': 0.03575386090177038,
 'En-Yp-130.Txt': 0.03087703289412124}
Epoch 1 | Loss: 6.7695263353132065| spearmanr: -0.041044798545229896



{'En-Mc-30.Txt': 0.19218924560159997,
 'En-Men-Tr-3K.Txt': 0.050999508715007476,
 'En-Mturk-287.Txt': 0.24021878702127836,
 'En-Mturk-771.Txt': 0.09170694757153315,
 'En-Rg-65.Txt': 0.17436372088076468,
 'En-Rw-Stanford.Txt': -0.046645212593482664,
 'En-Simlex-999.Txt': -0.02869795934043132,
 'En-Simverb-3500.Txt': -0.014719439489260805,
 'En-Verb-143.Txt': 0.26231410459216875,
 'En-Ws-353-All.Txt': 0.11554169473217608,
 'En-Ws-353-Rel.Txt': 0.18009062730104003,
 'En-Ws-353-Sim.Txt': 0.13311709779497916,
 'En-Yp-130.Txt': 0.039031501521945546}
Epoch 1 | Loss: 6.645148755071915| spearmanr: -0.02869795934043132



{'En-Mc-30.Txt': 0.02020451043504,
 'En-Men-Tr-3K.Txt': 0.058557471261172334,
 'En-Mturk-287.Txt': 0.23450602813009214,
 'En-Mturk-771.Txt': 0.08803509986759123,
 'En-Rg-65.Txt': 0.04468830577623482,
 'En-Rw-Stanford.Txt': -0.03783124491633214,
 'En-Simlex-999.Txt': 0.024417594171877305,
 'En-Simverb-3500.Txt': -0.02893808392383041,
 'En-Verb-143.Txt': -0.010203366566219065,
 'En-Ws-353-All.Txt': -0.026251982302761642,
 'En-Ws-353-Rel.Txt': 0.006909393937072215,
 'En-Ws-353-Sim.Txt': -0.037078793718790554,
 'En-Yp-130.Txt': 0.0013035488664499529}
Epoch 1 | Loss: 6.523150530858995| spearmanr: 0.024417594171877305



{'En-Mc-30.Txt': 0.08525317817711998,
 'En-Men-Tr-3K.Txt': 0.0680583803738295,
 'En-Mturk-287.Txt': 0.20640354000245237,
 'En-Mturk-771.Txt': 0.06309215223447573,
 'En-Rg-65.Txt': 0.03651748537099424,
 'En-Rw-Stanford.Txt': -0.03950093298910179,
 'En-Simlex-999.Txt': -0.013474233664473643,
 'En-Simverb-3500.Txt': 0.008281168804107255,
 'En-Verb-143.Txt': 0.09951072817183451,
 'En-Ws-353-All.Txt': 0.060846091370418326,
 'En-Ws-353-Rel.Txt': 0.12813262978445966,
 'En-Ws-353-Sim.Txt': 0.02990606815920464,
 'En-Yp-130.Txt': 0.09206053575998786}
Epoch 1 | Loss: 6.405578569087812| spearmanr: -0.013474233664473643



{'En-Mc-30.Txt': 0.03006036918384,
 'En-Men-Tr-3K.Txt': 0.05619419313208971,
 'En-Mturk-287.Txt': 0.22738420237923557,
 'En-Mturk-771.Txt': 0.05349340952416805,
 'En-Rg-65.Txt': 0.02890196343989624,
 'En-Rw-Stanford.Txt': -0.045746803747009984,
 'En-Simlex-999.Txt': 0.022543635561858313,
 'En-Simverb-3500.Txt': -0.0372902222583903,
 'En-Verb-143.Txt': 0.17078848866063637,
 'En-Ws-353-All.Txt': 0.04756463374663599,
 'En-Ws-353-Rel.Txt': 0.09938906733858725,
 'En-Ws-353-Sim.Txt': 0.036651321228818465,
 'En-Yp-130.Txt': 0.1936082418642984}
Epoch 1 | Loss: 6.294677830008416| spearmanr: 0.022543635561858313



{'En-Mc-30.Txt': 0.043119382026,
 'En-Men-Tr-3K.Txt': 0.04254827386602691,
 'En-Mturk-287.Txt': 0.2902981601074687,
 'En-Mturk-771.Txt': 0.04586921812194103,
 'En-Rg-65.Txt': 0.12367290302689361,
 'En-Rw-Stanford.Txt': -0.05568152674952696,
 'En-Simlex-999.Txt': -0.0039808498558646845,
 'En-Simverb-3500.Txt': -0.03422278597407857,
 'En-Verb-143.Txt': 0.17009264460191714,
 'En-Ws-353-All.Txt': -0.0014108057270359991,
 'En-Ws-353-Rel.Txt': 0.031241907396663847,
 'En-Ws-353-Sim.Txt': -0.00992617403929763,
 'En-Yp-130.Txt': 0.11841887689660688}
Epoch 1 | Loss: 6.188815730008636| spearmanr: -0.0039808498558646845



{'En-Mc-30.Txt': 0.1700135634168,
 'En-Men-Tr-3K.Txt': 0.042100665858766345,
 'En-Mturk-287.Txt': 0.26187844201755694,
 'En-Mturk-771.Txt': 0.08044615353833355,
 'En-Rg-65.Txt': 0.07811198536272049,
 'En-Rw-Stanford.Txt': -0.051054924470629134,
 'En-Simlex-999.Txt': -0.004363343947314506,
 'En-Simverb-3500.Txt': -0.030833778675787467,
 'En-Verb-143.Txt': 0.09119881055713396,
 'En-Ws-353-All.Txt': 0.11566845573361274,
 'En-Ws-353-Rel.Txt': 0.10210768259922903,
 'En-Ws-353-Sim.Txt': 0.1449904747316361,
 'En-Yp-130.Txt': 0.057135421402002874}
Epoch 1 | Loss: 6.088721695995421| spearmanr: -0.004363343947314506



{'En-Mc-30.Txt': 0.22077123597311998,
 'En-Men-Tr-3K.Txt': 0.06863036029285544,
 'En-Mturk-287.Txt': 0.2033195198985485,
 'En-Mturk-771.Txt': 0.036948105673736166,
 'En-Rg-65.Txt': 0.2089244402000185,
 'En-Rw-Stanford.Txt': -0.03962101958448978,
 'En-Simlex-999.Txt': 0.001243777588482483,
 'En-Simverb-3500.Txt': -0.016966931451100755,
 'En-Verb-143.Txt': 0.11036871104184867,
 'En-Ws-353-All.Txt': 0.17002106740892506,
 'En-Ws-353-Rel.Txt': 0.19926925887214736,
 'En-Ws-353-Sim.Txt': 0.2583101078960996,
 'En-Yp-130.Txt': 0.053208116031196795}
Epoch 1 | Loss: 5.993242207100161| spearmanr: 0.001243777588482483



{'En-Mc-30.Txt': 0.18159419744664,
 'En-Men-Tr-3K.Txt': 0.07258107119574723,
 'En-Mturk-287.Txt': 0.19955231586268327,
 'En-Mturk-771.Txt': 0.0523165170048742,
 'En-Rg-65.Txt': 0.20680901744138014,
 'En-Rw-Stanford.Txt': -0.06220949634362129,
 'En-Simlex-999.Txt': 0.024257902480599502,
 'En-Simverb-3500.Txt': -0.0392990953513054,
 'En-Verb-143.Txt': 0.2019808047032219,
 'En-Ws-353-All.Txt': 0.13474068017535085,
 'En-Ws-353-Rel.Txt': 0.17546152481498056,
 'En-Ws-353-Sim.Txt': 0.17169552375707145,
 'En-Yp-130.Txt': 0.07409405074444443}
Epoch 1 | Loss: 5.941067886226169| spearmanr: 0.024257902480599502



{'En-Mc-30.Txt': 0.13083652489031997,
 'En-Men-Tr-3K.Txt': 0.07596133693776083,
 'En-Mturk-287.Txt': 0.19589618160716754,
 'En-Mturk-771.Txt': -0.0021302656551028957,
 'En-Rg-65.Txt': 0.037072783845136806,
 'En-Rw-Stanford.Txt': -0.059950899556573976,
 'En-Simlex-999.Txt': 0.02799416066896055,
 'En-Simverb-3500.Txt': -0.042490902701274806,
 'En-Verb-143.Txt': 0.19213239905134946,
 'En-Ws-353-All.Txt': 0.04471958624694953,
 'En-Ws-353-Rel.Txt': 0.07220815446506594,
 'En-Ws-353-Sim.Txt': 0.07875465596898078,
 'En-Yp-130.Txt': 0.031876559181495005}
Epoch 1 | Loss: 5.88001440455886| spearmanr: 0.02799416066896055



{'En-Mc-30.Txt': 0.15498337882487997,
 'En-Men-Tr-3K.Txt': 0.0752393155116152,
 'En-Mturk-287.Txt': 0.21029962689664775,
 'En-Mturk-771.Txt': 0.05179231604464716,
 'En-Rg-65.Txt': 0.051193230759047685,
 'En-Rw-Stanford.Txt': -0.03360569453497609,
 'En-Simlex-999.Txt': 0.01925906596581489,
 'En-Simverb-3500.Txt': -0.005185509987886529,
 'En-Verb-143.Txt': 0.18739221279007454,
 'En-Ws-353-All.Txt': 0.08883727883184489,
 'En-Ws-353-Rel.Txt': 0.14734471503483063,
 'En-Ws-353-Sim.Txt': 0.11292539994586355,
 'En-Yp-130.Txt': -0.04471214258851978}
Epoch 1 | Loss: 5.811207247769684| spearmanr: 0.01925906596581489



{'En-Mc-30.Txt': 0.13946040129551998,
 'En-Men-Tr-3K.Txt': 0.07670089199593734,
 'En-Mturk-287.Txt': 0.270760095090084,
 'En-Mturk-771.Txt': 0.037088177377737855,
 'En-Rg-65.Txt': 0.0892708404145377,
 'En-Rw-Stanford.Txt': -0.013715799575998399,
 'En-Simlex-999.Txt': 0.02520578740651521,
 'En-Simverb-3500.Txt': -0.024766003063484494,
 'En-Verb-143.Txt': 0.2720297769842855,
 'En-Ws-353-All.Txt': 0.0007062503934694655,
 'En-Ws-353-Rel.Txt': 0.05976499800449757,
 'En-Ws-353-Sim.Txt': 0.07384683890056705,
 'En-Yp-130.Txt': -0.040955589625140044}
Epoch 1 | Loss: 5.742200818121361| spearmanr: 0.02520578740651521



{'En-Mc-30.Txt': 0.02414685393456,
 'En-Men-Tr-3K.Txt': 0.06154462933800555,
 'En-Mturk-287.Txt': 0.21099852825035667,
 'En-Mturk-771.Txt': 0.0846491590022075,
 'En-Rg-65.Txt': 0.10164606355257196,
 'En-Rw-Stanford.Txt': -0.015612239796600919,
 'En-Simlex-999.Txt': 0.004755406355552509,
 'En-Simverb-3500.Txt': -0.02841479114757367,
 'En-Verb-143.Txt': 0.11730704052907755,
 'En-Ws-353-All.Txt': 0.020422450201808426,
 'En-Ws-353-Rel.Txt': 0.07824593898758843,
 'En-Ws-353-Sim.Txt': 0.029719000631947417,
 'En-Yp-130.Txt': 0.06284521531862551}
Epoch 1 | Loss: 5.674052271484769| spearmanr: 0.004755406355552509




{'En-Mc-30.Txt': 0.052482447837359995,
 'En-Men-Tr-3K.Txt': 0.08898938511050769,
 'En-Mturk-287.Txt': 0.23663469094877243,
 'En-Mturk-771.Txt': 0.10761248350835406,
 'En-Rg-65.Txt': 0.1423679516563599,
 'En-Rw-Stanford.Txt': -0.00696945087212354,
 'En-Simlex-999.Txt': -0.02357836363594601,
 'En-Simverb-3500.Txt': -0.024593315892109915,
 'En-Verb-143.Txt': 0.16171033813734614,
 'En-Ws-353-All.Txt': 0.09197358184180907,
 'En-Ws-353-Rel.Txt': 0.1367701279365062,
 'En-Ws-353-Sim.Txt': 0.1021149066868045,
 'En-Yp-130.Txt': 0.10726166471379722}
Epoch 1 | Loss: 5.606983765579984| spearmanr: -0.02357836363594601




{'En-Mc-30.Txt': -0.05272884430608,
 'En-Men-Tr-3K.Txt': 0.05149006879334246,
 'En-Mturk-287.Txt': 0.31355339536815546,
 'En-Mturk-771.Txt': 0.06420951190639158,
 'En-Rg-65.Txt': 0.16087790079444533,
 'En-Rw-Stanford.Txt': -0.026069174370931633,
 'En-Simlex-999.Txt': 0.004121492811489375,
 'En-Simverb-3500.Txt': 0.0010383895809322364,
 'En-Verb-143.Txt': -0.02331580374208352,
 'En-Ws-353-All.Txt': 0.05060424464387637,
 'En-Ws-353-Rel.Txt': 0.061243005737747894,
 'En-Ws-353-Sim.Txt': 0.10137823167244028,
 'En-Yp-130.Txt': 0.002115663974941137}
Epoch 1 | Loss: 5.541747260146631| spearmanr: 0.004121492811489375




{'En-Mc-30.Txt': 0.057656773680479986,
 'En-Men-Tr-3K.Txt': 0.09579035421105073,
 'En-Mturk-287.Txt': 0.299145234852151,
 'En-Mturk-771.Txt': 0.06375420089256095,
 'En-Rg-65.Txt': 0.10952601332849975,
 'En-Rw-Stanford.Txt': 0.018275606472816154,
 'En-Simlex-999.Txt': -0.025337979604661633,
 'En-Simverb-3500.Txt': -0.015955522967740308,
 'En-Verb-143.Txt': 0.207396723923253,
 'En-Ws-353-All.Txt': 0.07558441613164245,
 'En-Ws-353-Rel.Txt': 0.048746242779082356,
 'En-Ws-353-Sim.Txt': 0.12575266667027013,
 'En-Yp-130.Txt': -0.03726567174758523}
Epoch 1 | Loss: 5.479034632605475| spearmanr: -0.025337979604661633




{'En-Mc-30.Txt': 0.07736849117807999,
 'En-Men-Tr-3K.Txt': 0.08289763905038564,
 'En-Mturk-287.Txt': 0.3648330555618025,
 'En-Mturk-771.Txt': 0.07718489262711686,
 'En-Rg-65.Txt': 0.29399087788176254,
 'En-Rw-Stanford.Txt': 0.011812497696053338,
 'En-Simlex-999.Txt': -0.002282950786117447,
 'En-Simverb-3500.Txt': -0.044976803419653565,
 'En-Verb-143.Txt': 0.25726621850146025,
 'En-Ws-353-All.Txt': 0.07528402203754021,
 'En-Ws-353-Rel.Txt': 0.1032176998605793,
 'En-Ws-353-Sim.Txt': 0.10925052794346335,
 'En-Yp-130.Txt': -0.04978057380407758}
Epoch 1 | Loss: 5.418460072091456| spearmanr: -0.002282950786117447




{'En-Mc-30.Txt': -0.07391894061599999,
 'En-Men-Tr-3K.Txt': 0.10835712417669417,
 'En-Mturk-287.Txt': 0.29534476227358675,
 'En-Mturk-771.Txt': 0.10804460518983473,
 'En-Rg-65.Txt': -0.010444899870776774,
 'En-Rw-Stanford.Txt': 0.0028806953777924388,
 'En-Simlex-999.Txt': -0.025272627498724512,
 'En-Simverb-3500.Txt': -0.04416644803607402,
 'En-Verb-143.Txt': 0.17502388631284044,
 'En-Ws-353-All.Txt': 0.0917062045666857,
 'En-Ws-353-Rel.Txt': 0.08277750183049366,
 'En-Ws-353-Sim.Txt': 0.12012827275157771,
 'En-Yp-130.Txt': -0.005359959716041818}
Epoch 1 | Loss: 5.360816467710245| spearmanr: -0.025272627498724512




{'En-Mc-30.Txt': 0.22767033709728,
 'En-Men-Tr-3K.Txt': 0.09962189709264009,
 'En-Mturk-287.Txt': 0.2969482012488688,
 'En-Mturk-771.Txt': 0.10399980250443727,
 'En-Rg-65.Txt': 0.12539168401828726,
 'En-Rw-Stanford.Txt': 0.0028611910355423337,
 'En-Simlex-999.Txt': -0.010349230191246239,
 'En-Simverb-3500.Txt': -0.031059634391287037,
 'En-Verb-143.Txt': 0.1650648696111716,
 'En-Ws-353-All.Txt': 0.10494727554582427,
 'En-Ws-353-Rel.Txt': 0.12312465430433003,
 'En-Ws-353-Sim.Txt': 0.15785098072741482,
 'En-Yp-130.Txt': -0.0317599477813014}
Epoch 1 | Loss: 5.326330688291179| spearmanr: -0.010349230191246239




{'En-Mc-30.Txt': -0.05741037721176,
 'En-Men-Tr-3K.Txt': 0.10357623470215697,
 'En-Mturk-287.Txt': 0.28556145310680703,
 'En-Mturk-771.Txt': 0.08995786166675696,
 'En-Rg-65.Txt': 0.13179083786316823,
 'En-Rw-Stanford.Txt': -0.010121257362881794,
 'En-Simlex-999.Txt': -0.006209150108200821,
 'En-Simverb-3500.Txt': -0.035246336840258616,
 'En-Verb-143.Txt': 0.24962802273551954,
 'En-Ws-353-All.Txt': 0.11910456325162595,
 'En-Ws-353-Rel.Txt': 0.16022115710031967,
 'En-Ws-353-Sim.Txt': 0.10221926253878684,
 'En-Yp-130.Txt': 0.10971883350359106}
Epoch 1 | Loss: 5.288927525932952| spearmanr: -0.006209150108200821




{'En-Mc-30.Txt': 0.25600593100008,
 'En-Men-Tr-3K.Txt': 0.13239416976453475,
 'En-Mturk-287.Txt': 0.3123075277376309,
 'En-Mturk-771.Txt': 0.10791603322491654,
 'En-Rg-65.Txt': 0.12867058929417666,
 'En-Rw-Stanford.Txt': -0.010524546100914089,
 'En-Simlex-999.Txt': -0.0026858999799923952,
 'En-Simverb-3500.Txt': -0.02020760224801259,
 'En-Verb-143.Txt': 0.22420357016231732,
 'En-Ws-353-All.Txt': 0.18529414672039587,
 'En-Ws-353-Rel.Txt': 0.2205228246703539,
 'En-Ws-353-Sim.Txt': 0.16566220949987043,
 'En-Yp-130.Txt': 0.06317422605488604}
Epoch 1 | Loss: 5.245500094799479| spearmanr: -0.0026858999799923952




{'En-Mc-30.Txt': 0.02291487159096,
 'En-Men-Tr-3K.Txt': 0.1425259837199025,
 'En-Mturk-287.Txt': 0.29322526795092585,
 'En-Mturk-771.Txt': 0.1373117937436714,
 'En-Rg-65.Txt': 0.11811991828546799,
 'En-Rw-Stanford.Txt': 0.004766646630148874,
 'En-Simlex-999.Txt': 0.005624886333928325,
 'En-Simverb-3500.Txt': -0.005043596554983078,
 'En-Verb-143.Txt': 0.1497321669993939,
 'En-Ws-353-All.Txt': 0.17446300873368675,
 'En-Ws-353-Rel.Txt': 0.2186167710650796,
 'En-Ws-353-Sim.Txt': 0.21231004834228576,
 'En-Yp-130.Txt': 0.026133447721959913}
Epoch 1 | Loss: 5.20038526569465| spearmanr: 0.005624886333928325




{'En-Mc-30.Txt': 0.14586670948223998,
 'En-Men-Tr-3K.Txt': 0.1531753464127879,
 'En-Mturk-287.Txt': 0.2555629200023511,
 'En-Mturk-771.Txt': 0.14854564204390056,
 'En-Rg-65.Txt': 0.12539168401828726,
 'En-Rw-Stanford.Txt': -0.008954964196988746,
 'En-Simlex-999.Txt': 0.060644258636468926,
 'En-Simverb-3500.Txt': 0.004972694959580587,
 'En-Verb-143.Txt': 0.2667546354639953,
 'En-Ws-353-All.Txt': 0.2335864033549296,
 'En-Ws-353-Rel.Txt': 0.21466077272805967,
 'En-Ws-353-Sim.Txt': 0.3090216409153381,
 'En-Yp-130.Txt': 0.017808226686709255}
Epoch 1 | Loss: 5.154922969154084| spearmanr: 0.060644258636468926




{'En-Mc-30.Txt': 0.23777259231480002,
 'En-Men-Tr-3K.Txt': 0.13138526565660286,
 'En-Mturk-287.Txt': 0.25739819094840993,
 'En-Mturk-771.Txt': 0.15446438688141118,
 'En-Rg-65.Txt': 0.14287036456153648,
 'En-Rw-Stanford.Txt': -0.014796994619542087,
 'En-Simlex-999.Txt': 0.07140211685527444,
 'En-Simverb-3500.Txt': 0.013844236108218904,
 'En-Verb-143.Txt': 0.05643476316112275,
 'En-Ws-353-All.Txt': 0.17128882481980492,
 'En-Ws-353-Rel.Txt': 0.2240592413192599,
 'En-Ws-353-Sim.Txt': 0.19719777496262147,
 'En-Yp-130.Txt': 0.06183319495265958}
Epoch 1 | Loss: 5.11000075472292| spearmanr: 0.07140211685527444




{'En-Mc-30.Txt': -0.04804731140039999,
 'En-Men-Tr-3K.Txt': 0.12456302201042217,
 'En-Mturk-287.Txt': 0.27674764690482373,
 'En-Mturk-771.Txt': 0.16674776266617927,
 'En-Rg-65.Txt': 0.029668804189902635,
 'En-Rw-Stanford.Txt': -0.02977946552253908,
 'En-Simlex-999.Txt': 0.03991207056514042,
 'En-Simverb-3500.Txt': -0.001046494029926947,
 'En-Verb-143.Txt': 0.1996378615575345,
 'En-Ws-353-All.Txt': 0.17377805713638894,
 'En-Ws-353-Rel.Txt': 0.21561924079178188,
 'En-Ws-353-Sim.Txt': 0.2194217064033042,
 'En-Yp-130.Txt': 0.10722418247802071}
Epoch 1 | Loss: 5.065603004260496| spearmanr: 0.03991207056514042




{'En-Mc-30.Txt': -0.053960826649679994,
 'En-Men-Tr-3K.Txt': 0.11582580690928636,
 'En-Mturk-287.Txt': 0.294866166781373,
 'En-Mturk-771.Txt': 0.16196365423123588,
 'En-Rg-65.Txt': -0.058412110922900995,
 'En-Rw-Stanford.Txt': -0.0039014503308242815,
 'En-Simlex-999.Txt': 0.08424970425041088,
 'En-Simverb-3500.Txt': -0.0074335234582544535,
 'En-Verb-143.Txt': 0.23482021983277568,
 'En-Ws-353-All.Txt': 0.12984210445239056,
 'En-Ws-353-Rel.Txt': 0.1632537532783877,
 'En-Ws-353-Sim.Txt': 0.18286701096335828,
 'En-Yp-130.Txt': 0.09672499176773211}
Epoch 1 | Loss: 5.022302140114821| spearmanr: 0.08424970425041088




{'En-Mc-30.Txt': -0.23555502409631995,
 'En-Men-Tr-3K.Txt': 0.13277404779456925,
 'En-Mturk-287.Txt': 0.2964195719640919,
 'En-Mturk-771.Txt': 0.12435099111199652,
 'En-Rg-65.Txt': -0.04267865415552839,
 'En-Rw-Stanford.Txt': -0.02613771539542154,
 'En-Simlex-999.Txt': 0.01777473373587523,
 'En-Simverb-3500.Txt': -0.020562097922080472,
 'En-Verb-143.Txt': 0.011772032363331684,
 'En-Ws-353-All.Txt': 0.1827375541972345,
 'En-Ws-353-Rel.Txt': 0.21507471162690014,
 'En-Ws-353-Sim.Txt': 0.20243643873213477,
 'En-Yp-130.Txt': 0.1861867591805482}
Epoch 1 | Loss: 4.980132772577349| spearmanr: 0.01777473373587523




{'En-Mc-30.Txt': 0.08870272873919999,
 'En-Men-Tr-3K.Txt': 0.11832040371337368,
 'En-Mturk-287.Txt': 0.2846755144343019,
 'En-Mturk-771.Txt': 0.20469603373687895,
 'En-Rg-65.Txt': 0.07726581625926515,
 'En-Rw-Stanford.Txt': -0.019007839230145162,
 'En-Simlex-999.Txt': 0.02042464265550788,
 'En-Simverb-3500.Txt': -0.025548736952816765,
 'En-Verb-143.Txt': 0.12550432487398913,
 'En-Ws-353-All.Txt': 0.18569550742610755,
 'En-Ws-353-Rel.Txt': 0.19778198084223358,
 'En-Ws-353-Sim.Txt': 0.2509495418029455,
 'En-Yp-130.Txt': 0.08206110819338615}
Epoch 1 | Loss: 4.938815254017086| spearmanr: 0.02042464265550788



{'En-Mc-30.Txt': 0.021929285716079997,
 'En-Men-Tr-3K.Txt': 0.10937070449307905,
 'En-Mturk-287.Txt': 0.30893037771813464,
 'En-Mturk-771.Txt': 0.16057742035317796,
 'En-Rg-65.Txt': 0.0384742514227347,
 'En-Rw-Stanford.Txt': -0.03201806676925561,
 'En-Simlex-999.Txt': 0.029604488293457437,
 'En-Simverb-3500.Txt': 0.007317122497010099,
 'En-Verb-143.Txt': 0.05697374064013068,
 'En-Ws-353-All.Txt': 0.07604561982175326,
 'En-Ws-353-Rel.Txt': 0.08965001610910053,
 'En-Ws-353-Sim.Txt': 0.10665090771963671,
 'En-Yp-130.Txt': 0.031472583973681446}
Epoch 1 | Loss: 4.913448519980741| spearmanr: 0.029604488293457437




{'En-Mc-30.Txt': -0.12270544142255999,
 'En-Men-Tr-3K.Txt': 0.11881957221204095,
 'En-Mturk-287.Txt': 0.2671112956313838,
 'En-Mturk-771.Txt': 0.10570736458719263,
 'En-Rg-65.Txt': 0.030488530508874987,
 'En-Rw-Stanford.Txt': 0.002973320222023178,
 'En-Simlex-999.Txt': 0.00478078374167969,
 'En-Simverb-3500.Txt': -0.00810609477081714,
 'En-Verb-143.Txt': 0.18989202251547324,
 'En-Ws-353-All.Txt': 0.13999816640819282,
 'En-Ws-353-Rel.Txt': 0.1829837659727801,
 'En-Ws-353-Sim.Txt': 0.15369761781851782,
 'En-Yp-130.Txt': 0.06421539927090039}
Epoch 1 | Loss: 4.889377675183296| spearmanr: 0.00478078374167969




{'En-Mc-30.Txt': 0.15301220707512,
 'En-Men-Tr-3K.Txt': 0.13473390706729274,
 'En-Mturk-287.Txt': 0.26899581449891835,
 'En-Mturk-771.Txt': 0.15768575129217544,
 'En-Rg-65.Txt': 0.07631387601787791,
 'En-Rw-Stanford.Txt': 0.039331152261735225,
 'En-Simlex-999.Txt': 0.044373963780477026,
 'En-Simverb-3500.Txt': -0.011790193662096079,
 'En-Verb-143.Txt': 0.19484739754635208,
 'En-Ws-353-All.Txt': 0.11687445400542057,
 'En-Ws-353-Rel.Txt': 0.19989157791772644,
 'En-Ws-353-Sim.Txt': 0.09129977538987954,
 'En-Yp-130.Txt': 0.08350625661721406}
Epoch 1 | Loss: 4.861027526424221| spearmanr: 0.044373963780477026




{'En-Mc-30.Txt': 0.055685601930719995,
 'En-Men-Tr-3K.Txt': 0.13059055116118987,
 'En-Mturk-287.Txt': 0.19819956976427505,
 'En-Mturk-771.Txt': 0.14378499416156856,
 'En-Rg-65.Txt': 0.09976862585428044,
 'En-Rw-Stanford.Txt': 0.009019976152191246,
 'En-Simlex-999.Txt': 0.006762756342527984,
 'En-Simverb-3500.Txt': -0.02432069619217826,
 'En-Verb-143.Txt': 0.1455209026671454,
 'En-Ws-353-All.Txt': 0.06287426739340485,
 'En-Ws-353-Rel.Txt': 0.09070723298510698,
 'En-Ws-353-Sim.Txt': 0.09407100301474383,
 'En-Yp-130.Txt': -0.05825572378243431}
Epoch 1 | Loss: 4.831399621378187| spearmanr: 0.006762756342527984




{'En-Mc-30.Txt': 0.07244056180368,
 'En-Men-Tr-3K.Txt': 0.13363279251938026,
 'En-Mturk-287.Txt': 0.2303451026794501,
 'En-Mturk-771.Txt': 0.1341619636768522,
 'En-Rg-65.Txt': 0.17592384516526047,
 'En-Rw-Stanford.Txt': 0.03414441036230095,
 'En-Simlex-999.Txt': 0.010005354680219054,
 'En-Simverb-3500.Txt': -0.022695378082110817,
 'En-Verb-143.Txt': 0.2412618051509189,
 'En-Ws-353-All.Txt': 0.06624478346300022,
 'En-Ws-353-Rel.Txt': 0.10746631712706192,
 'En-Ws-353-Sim.Txt': 0.09662346985360067,
 'En-Yp-130.Txt': -0.009149830222334012}
Epoch 1 | Loss: 4.801021152995972| spearmanr: 0.010005354680219054



{'En-Mc-30.Txt': 0.03523469502696,
 'En-Men-Tr-3K.Txt': 0.13969064023740374,
 'En-Mturk-287.Txt': 0.3060478025696289,
 'En-Mturk-771.Txt': 0.15046025367421112,
 'En-Rg-65.Txt': -0.03358233629338356,
 'En-Rw-Stanford.Txt': 0.012097921565401059,
 'En-Simlex-999.Txt': 0.00498249691869456,
 'En-Simverb-3500.Txt': 0.006078959489240057,
 'En-Verb-143.Txt': 0.2074148239132197,
 'En-Ws-353-All.Txt': 0.11295031663187625,
 'En-Ws-353-Rel.Txt': 0.13897927993703663,
 'En-Ws-353-Sim.Txt': 0.18447872912175214,
 'En-Yp-130.Txt': 0.0654731365158457}
Epoch 1 | Loss: 4.770196891641465| spearmanr: 0.00498249691869456




{'En-Mc-30.Txt': 0.15498337882487997,
 'En-Men-Tr-3K.Txt': 0.14329116269051728,
 'En-Mturk-287.Txt': 0.276634219511212,
 'En-Mturk-771.Txt': 0.13677660836285752,
 'En-Rg-65.Txt': 0.034904475517532514,
 'En-Rw-Stanford.Txt': -0.0017926689583869291,
 'En-Simlex-999.Txt': 0.0007103250922721551,
 'En-Simverb-3500.Txt': 0.008574898757822679,
 'En-Verb-143.Txt': 0.2910407997792927,
 'En-Ws-353-All.Txt': 0.09432897812431809,
 'En-Ws-353-Rel.Txt': 0.12485739366763605,
 'En-Ws-353-Sim.Txt': 0.15064888092801165,
 'En-Yp-130.Txt': 0.07326111217163296}
Epoch 1 | Loss: 4.739660561686236| spearmanr: 0.0007103250922721551




{'En-Mc-30.Txt': 0.15744734351207998,
 'En-Men-Tr-3K.Txt': 0.15878499424339146,
 'En-Mturk-287.Txt': 0.2287115386458185,
 'En-Mturk-771.Txt': 0.11106233742833747,
 'En-Rg-65.Txt': -0.008567462172485253,
 'En-Rw-Stanford.Txt': -0.0043957618516549165,
 'En-Simlex-999.Txt': 0.019485811215687026,
 'En-Simverb-3500.Txt': 0.0027687027697711826,
 'En-Verb-143.Txt': 0.13979125028769177,
 'En-Ws-353-All.Txt': 0.08377361901399037,
 'En-Ws-353-Rel.Txt': 0.07087121646365024,
 'En-Ws-353-Sim.Txt': 0.1897181658975764,
 'En-Yp-130.Txt': 0.04150116439033156}
Epoch 1 | Loss: 4.709816035357389| spearmanr: 0.019485811215687026




{'En-Mc-30.Txt': 0.05001848315016,
 'En-Men-Tr-3K.Txt': 0.12171357723956738,
 'En-Mturk-287.Txt': 0.29956279435657834,
 'En-Mturk-771.Txt': 0.10984576538480033,
 'En-Rg-65.Txt': 0.03138758518129628,
 'En-Rw-Stanford.Txt': -0.0008331248431024132,
 'En-Simlex-999.Txt': -0.04684901861707108,
 'En-Simverb-3500.Txt': 0.007571209693879626,
 'En-Verb-143.Txt': -0.0015817380120885572,
 'En-Ws-353-All.Txt': 0.09704402189927533,
 'En-Ws-353-Rel.Txt': 0.07634552817158752,
 'En-Ws-353-Sim.Txt': 0.16871326540930953,
 'En-Yp-130.Txt': 0.1403251613615486}
Epoch 1 | Loss: 4.68017487052202| spearmanr: -0.04684901861707108




{'En-Mc-30.Txt': 0.12837256020312002,
 'En-Men-Tr-3K.Txt': 0.14582829429531585,
 'En-Mturk-287.Txt': 0.2084900277394433,
 'En-Mturk-771.Txt': 0.11573300798894898,
 'En-Rg-65.Txt': 0.16386593544102196,
 'En-Rw-Stanford.Txt': 0.004074352055226607,
 'En-Simlex-999.Txt': -0.0351136978211759,
 'En-Simverb-3500.Txt': -0.013234831310687747,
 'En-Verb-143.Txt': 0.07157439921325583,
 'En-Ws-353-All.Txt': 0.11844599563020852,
 'En-Ws-353-Rel.Txt': 0.11774385167578026,
 'En-Ws-353-Sim.Txt': 0.15274063600552426,
 'En-Yp-130.Txt': 0.1497290378485901}
Epoch 1 | Loss: 4.651294029675997| spearmanr: -0.0351136978211759




{'En-Mc-30.Txt': 0.27103611559199997,
 'En-Men-Tr-3K.Txt': 0.1290605034626398,
 'En-Mturk-287.Txt': 0.3095813409355135,
 'En-Mturk-771.Txt': 0.06553209038909817,
 'En-Rg-65.Txt': 0.24681695036412762,
 'En-Rw-Stanford.Txt': -0.011680629903622368,
 'En-Simlex-999.Txt': -0.0015106482403187019,
 'En-Simverb-3500.Txt': -0.05422081737633361,
 'En-Verb-143.Txt': 0.1489679452008006,
 'En-Ws-353-All.Txt': 0.1411880984600511,
 'En-Ws-353-Rel.Txt': 0.204529144587696,
 'En-Ws-353-Sim.Txt': 0.15891850244287858,
 'En-Yp-130.Txt': 0.08385192612493082}
Epoch 1 | Loss: 4.632525125050898| spearmanr: -0.0015106482403187019




{'En-Mc-30.Txt': 0.30701000002512,
 'En-Men-Tr-3K.Txt': 0.1388887773931026,
 'En-Mturk-287.Txt': 0.2829287849640877,
 'En-Mturk-771.Txt': 0.09128387108577028,
 'En-Rg-65.Txt': 0.0947180540180314,
 'En-Rw-Stanford.Txt': -0.006879901683567266,
 'En-Simlex-999.Txt': 0.023428277931994927,
 'En-Simverb-3500.Txt': -0.04434528409731383,
 'En-Verb-143.Txt': 0.30838863460736116,
 'En-Ws-353-All.Txt': 0.20869422758560274,
 'En-Ws-353-Rel.Txt': 0.2722809062241058,
 'En-Ws-353-Sim.Txt': 0.21133837940938355,
 'En-Yp-130.Txt': 0.12042209416421847}
Epoch 1 | Loss: 4.615082237402914| spearmanr: 0.023428277931994927



{'En-Mc-30.Txt': 0.17962302569688,
 'En-Men-Tr-3K.Txt': 0.15631470938725028,
 'En-Mturk-287.Txt': 0.19345526600976873,
 'En-Mturk-771.Txt': 0.10605974751182073,
 'En-Rg-65.Txt': 0.214583196079376,
 'En-Rw-Stanford.Txt': -0.007012238003398277,
 'En-Simlex-999.Txt': 0.029870975961485205,
 'En-Simverb-3500.Txt': -0.02148582141377883,
 'En-Verb-143.Txt': 0.3423060046949309,
 'En-Ws-353-All.Txt': 0.12122795742220326,
 'En-Ws-353-Rel.Txt': 0.19148543412588873,
 'En-Ws-353-Sim.Txt': 0.12559497338283016,
 'En-Yp-130.Txt': 0.06791781122704738}
Epoch 1 | Loss: 4.5936770831502| spearmanr: 0.029870975961485205




{'En-Mc-30.Txt': 0.2242207865352,
 'En-Men-Tr-3K.Txt': 0.14837754185452015,
 'En-Mturk-287.Txt': 0.25936077301062416,
 'En-Mturk-771.Txt': 0.08359854663662299,
 'En-Rg-65.Txt': 0.14540887187190252,
 'En-Rw-Stanford.Txt': -0.03332985280664857,
 'En-Simlex-999.Txt': 0.024718996150704987,
 'En-Simverb-3500.Txt': 0.0042365004023651506,
 'En-Verb-143.Txt': 0.23745276281793007,
 'En-Ws-353-All.Txt': 0.14073603335376478,
 'En-Ws-353-Rel.Txt': 0.1289254416773703,
 'En-Ws-353-Sim.Txt': 0.20565601001736755,
 'En-Yp-130.Txt': 0.04326282947182782}
Epoch 1 | Loss: 4.571014025521029| spearmanr: 0.024718996150704987




{'En-Mc-30.Txt': 0.28212395668439993,
 'En-Men-Tr-3K.Txt': 0.16636769012166952,
 'En-Mturk-287.Txt': 0.2658824552077524,
 'En-Mturk-771.Txt': 0.11418389278290536,
 'En-Rg-65.Txt': 0.12060554002686803,
 'En-Rw-Stanford.Txt': -0.019931510771752325,
 'En-Simlex-999.Txt': 0.033721168693648815,
 'En-Simverb-3500.Txt': -0.005047600218378512,
 'En-Verb-143.Txt': 0.30608189144160713,
 'En-Ws-353-All.Txt': 0.17897245766153919,
 'En-Ws-353-Rel.Txt': 0.19116540736281243,
 'En-Ws-353-Sim.Txt': 0.245482841171693,
 'En-Yp-130.Txt': 0.11574930877074611}
Epoch 1 | Loss: 4.54782206340023| spearmanr: 0.033721168693648815



{'En-Mc-30.Txt': 0.26660097915503994,
 'En-Men-Tr-3K.Txt': 0.18977799857678398,
 'En-Mturk-287.Txt': 0.2553596413477491,
 'En-Mturk-771.Txt': 0.1482521952820754,
 'En-Rg-65.Txt': 0.12637006704415749,
 'En-Rw-Stanford.Txt': 0.017349123766808743,
 'En-Simlex-999.Txt': 0.015639128122030364,
 'En-Simverb-3500.Txt': 0.004638554528885481,
 'En-Verb-143.Txt': 0.19751815162143613,
 'En-Ws-353-All.Txt': 0.16571134075659202,
 'En-Ws-353-Rel.Txt': 0.1262777643390225,
 'En-Ws-353-Sim.Txt': 0.24684333227901825,
 'En-Yp-130.Txt': 0.004885184729539279}
Epoch 1 | Loss: 4.524435676363747| spearmanr: 0.015639128122030364




{'En-Mc-30.Txt': 0.39177038526479996,
 'En-Men-Tr-3K.Txt': 0.17808216453272696,
 'En-Mturk-287.Txt': 0.30432517314609436,
 'En-Mturk-771.Txt': 0.142067532539272,
 'En-Rg-65.Txt': 0.14599061313052802,
 'En-Rw-Stanford.Txt': -0.022252353690963192,
 'En-Simlex-999.Txt': 0.06734550212877893,
 'En-Simverb-3500.Txt': -0.0020319778552384407,
 'En-Verb-143.Txt': 0.1657607136698908,
 'En-Ws-353-All.Txt': 0.13811782908804462,
 'En-Ws-353-Rel.Txt': 0.14739993375843446,
 'En-Ws-353-Sim.Txt': 0.18257404157149676,
 'En-Yp-130.Txt': 0.01192768036266027}
Epoch 1 | Loss: 4.5010027147567575| spearmanr: 0.06734550212877893




{'En-Mc-30.Txt': 0.22988790531575998,
 'En-Men-Tr-3K.Txt': 0.15266778999560132,
 'En-Mturk-287.Txt': 0.30101351238387974,
 'En-Mturk-771.Txt': 0.12541745630186008,
 'En-Rg-65.Txt': 0.15400277682887073,
 'En-Rw-Stanford.Txt': -0.006854147790352094,
 'En-Simlex-999.Txt': 0.016936174991214498,
 'En-Simverb-3500.Txt': -0.010971702202641678,
 'En-Verb-143.Txt': 0.209938766958574,
 'En-Ws-353-All.Txt': 0.14023680136322297,
 'En-Ws-353-Rel.Txt': 0.17439745598057074,
 'En-Ws-353-Sim.Txt': 0.1576128947836329,
 'En-Yp-130.Txt': 0.08414345462541482}
Epoch 1 | Loss: 4.477853247382249| spearmanr: 0.016936174991214498




{'En-Mc-30.Txt': 0.3104595505872,
 'En-Men-Tr-3K.Txt': 0.15444253543236572,
 'En-Mturk-287.Txt': 0.3149747742081948,
 'En-Mturk-771.Txt': 0.14825252074638906,
 'En-Rg-65.Txt': 0.21212401712245896,
 'En-Rw-Stanford.Txt': 0.023216038983912588,
 'En-Simlex-999.Txt': 0.042349837847006785,
 'En-Simverb-3500.Txt': 0.004948523177432985,
 'En-Verb-143.Txt': 0.2798831615198301,
 'En-Ws-353-All.Txt': 0.18253458919842253,
 'En-Ws-353-Rel.Txt': 0.24937037566140724,
 'En-Ws-353-Sim.Txt': 0.18503452065934697,
 'En-Yp-130.Txt': 0.08151553342819463}
Epoch 1 | Loss: 4.455141586555096| spearmanr: 0.042349837847006785




{'En-Mc-30.Txt': 0.1010225521752,
 'En-Men-Tr-3K.Txt': 0.15986502538618602,
 'En-Mturk-287.Txt': 0.31283301353805815,
 'En-Mturk-771.Txt': 0.16022345079002065,
 'En-Rg-65.Txt': 0.22571560834671023,
 'En-Rw-Stanford.Txt': 0.013659598958674093,
 'En-Simlex-999.Txt': 0.03724512828366634,
 'En-Simverb-3500.Txt': 0.014181234772500242,
 'En-Verb-143.Txt': 0.25733861846132694,
 'En-Ws-353-All.Txt': 0.17188887602544287,
 'En-Ws-353-Rel.Txt': 0.1537650000588133,
 'En-Ws-353-Sim.Txt': 0.2264104564608768,
 'En-Yp-130.Txt': 0.07564331648987377}
Epoch 1 | Loss: 4.432569207708515| spearmanr: 0.03724512828366634




{'En-Mc-30.Txt': 0.18331897272768,
 'En-Men-Tr-3K.Txt': 0.15873397484047494,
 'En-Mturk-287.Txt': 0.3124678454394561,
 'En-Mturk-771.Txt': 0.13741403022120732,
 'En-Rg-65.Txt': 0.23251140395883593,
 'En-Rw-Stanford.Txt': 0.01048035338743923,
 'En-Simlex-999.Txt': 0.035005049709614565,
 'En-Simverb-3500.Txt': -0.037291497662066415,
 'En-Verb-143.Txt': 0.3172837741209883,
 'En-Ws-353-All.Txt': 0.13976159500434884,
 'En-Ws-353-Rel.Txt': 0.18191163601083687,
 'En-Ws-353-Sim.Txt': 0.13074010338871483,
 'En-Yp-130.Txt': 0.08603838987856094}
Epoch 1 | Loss: 4.415200567447845| spearmanr: 0.035005049709614565




{'En-Mc-30.Txt': 0.11753111557943999,
 'En-Men-Tr-3K.Txt': 0.2096586458011454,
 'En-Mturk-287.Txt': 0.30407762375357006,
 'En-Mturk-771.Txt': 0.12831971651407037,
 'En-Rg-65.Txt': 0.1399352154839258,
 'En-Rw-Stanford.Txt': 0.0028240035623874455,
 'En-Simlex-999.Txt': 0.04003440563903655,
 'En-Simverb-3500.Txt': -0.014591219951996501,
 'En-Verb-143.Txt': 0.18293559303827767,
 'En-Ws-353-All.Txt': 0.15715674191777726,
 'En-Ws-353-Rel.Txt': 0.2111179071771269,
 'En-Ws-353-Sim.Txt': 0.1550380107617576,
 'En-Yp-130.Txt': 0.051837932078921914}
Epoch 1 | Loss: 4.400592603950145| spearmanr: 0.04003440563903655




{'En-Mc-30.Txt': 0.020697303372480002,
 'En-Men-Tr-3K.Txt': 0.1934457985189773,
 'En-Mturk-287.Txt': 0.32695013979470333,
 'En-Mturk-771.Txt': 0.15960920470300777,
 'En-Rg-65.Txt': 0.06898922471609267,
 'En-Rw-Stanford.Txt': 0.031917020249524426,
 'En-Simlex-999.Txt': 0.044821182135555034,
 'En-Simverb-3500.Txt': -0.0007924083583208985,
 'En-Verb-143.Txt': 0.26398131477910003,
 'En-Ws-353-All.Txt': 0.13278929773579282,
 'En-Ws-353-Rel.Txt': 0.15474160565948566,
 'En-Ws-353-Sim.Txt': 0.12856872866117125,
 'En-Yp-130.Txt': 0.20034255022547914}
Epoch 1 | Loss: 4.3835021811623065| spearmanr: 0.044821182135555034




{'En-Mc-30.Txt': 0.06258470305488,
 'En-Men-Tr-3K.Txt': 0.1599207869380512,
 'En-Mturk-287.Txt': 0.3404155169628823,
 'En-Mturk-771.Txt': 0.16082983148945287,
 'En-Rg-65.Txt': 0.0859654923541653,
 'En-Rw-Stanford.Txt': 0.03820726082554591,
 'En-Simlex-999.Txt': 0.101616594761613,
 'En-Simverb-3500.Txt': -0.0034523562617459263,
 'En-Verb-143.Txt': 0.23299413195613689,
 'En-Ws-353-All.Txt': 0.1524565619017167,
 'En-Ws-353-Rel.Txt': 0.17931554988870288,
 'En-Ws-353-Sim.Txt': 0.13537891426090742,
 'En-Yp-130.Txt': 0.06658094481768495}
Epoch 1 | Loss: 4.365424667974765| spearmanr: 0.101616594761613




{'En-Mc-30.Txt': -0.021190096309919998,
 'En-Men-Tr-3K.Txt': 0.1564178637529913,
 'En-Mturk-287.Txt': 0.2886528080075264,
 'En-Mturk-771.Txt': 0.1501034634203352,
 'En-Rg-65.Txt': 0.11499966971647645,
 'En-Rw-Stanford.Txt': 0.031505807626245975,
 'En-Simlex-999.Txt': 0.07104664509680121,
 'En-Simverb-3500.Txt': -0.013498424258867254,
 'En-Verb-143.Txt': 0.29560601947088966,
 'En-Ws-353-All.Txt': 0.13551112175036178,
 'En-Ws-353-Rel.Txt': 0.20609219721642377,
 'En-Ws-353-Sim.Txt': 0.08778105066229737,
 'En-Yp-130.Txt': 0.01740841617175975}
Epoch 1 | Loss: 4.346950324544775| spearmanr: 0.07104664509680121




{'En-Mc-30.Txt': 0.06258470305488,
 'En-Men-Tr-3K.Txt': 0.17782188065210852,
 'En-Mturk-287.Txt': 0.31703716189850606,
 'En-Mturk-771.Txt': 0.14096216080291016,
 'En-Rg-65.Txt': 0.19062603333779687,
 'En-Rw-Stanford.Txt': 0.028507458972788332,
 'En-Simlex-999.Txt': 0.07868358709581048,
 'En-Simverb-3500.Txt': -0.020059494565693857,
 'En-Verb-143.Txt': 0.1798042947740413,
 'En-Ws-353-All.Txt': 0.19314699075939676,
 'En-Ws-353-Rel.Txt': 0.2384822107020332,
 'En-Ws-353-Sim.Txt': 0.16566375551249238,
 'En-Yp-130.Txt': 0.0730320540641098}
Epoch 1 | Loss: 4.328200702345851| spearmanr: 0.07868358709581048




{'En-Mc-30.Txt': 0.11161760033015998,
 'En-Men-Tr-3K.Txt': 0.17864149924932365,
 'En-Mturk-287.Txt': 0.3089274961908143,
 'En-Mturk-771.Txt': 0.17639946113252597,
 'En-Rg-65.Txt': 0.15500760263922395,
 'En-Rw-Stanford.Txt': -0.02089648073660506,
 'En-Simlex-999.Txt': 0.05356728291437633,
 'En-Simverb-3500.Txt': -0.023565236194290136,
 'En-Verb-143.Txt': 0.2578695515003497,
 'En-Ws-353-All.Txt': 0.18691315002246592,
 'En-Ws-353-Rel.Txt': 0.21261405244732742,
 'En-Ws-353-Sim.Txt': 0.20361295433744672,
 'En-Yp-130.Txt': 0.1454893805129797}
Epoch 1 | Loss: 4.309444308662618| spearmanr: 0.05356728291437633




{'En-Mc-30.Txt': -0.04459776083831999,
 'En-Men-Tr-3K.Txt': 0.17246453727382516,
 'En-Mturk-287.Txt': 0.31728837868943816,
 'En-Mturk-771.Txt': 0.1636268853622649,
 'En-Rg-65.Txt': 0.041277186577930496,
 'En-Rw-Stanford.Txt': -0.0027833248044128946,
 'En-Simlex-999.Txt': 0.0738769393902142,
 'En-Simverb-3500.Txt': 0.011591379963247063,
 'En-Verb-143.Txt': 0.13672028532334435,
 'En-Ws-353-All.Txt': 0.19871504144575855,
 'En-Ws-353-Rel.Txt': 0.25076817517570016,
 'En-Ws-353-Sim.Txt': 0.1550581089258431,
 'En-Yp-130.Txt': 0.28483167435861073}
Epoch 1 | Loss: 4.290905476537475| spearmanr: 0.0738769393902142




{'En-Mc-30.Txt': 0.26536899681144,
 'En-Men-Tr-3K.Txt': 0.1613669047132747,
 'En-Mturk-287.Txt': 0.2527733395991852,
 'En-Mturk-771.Txt': 0.22288628847272127,
 'En-Rg-65.Txt': 0.07020559280230972,
 'En-Rw-Stanford.Txt': 0.007229846310036798,
 'En-Simlex-999.Txt': 0.0405157535020283,
 'En-Simverb-3500.Txt': 0.016138959458703653,
 'En-Verb-143.Txt': 0.2526225655200076,
 'En-Ws-353-All.Txt': 0.2548100273047717,
 'En-Ws-353-Rel.Txt': 0.32129215857084886,
 'En-Ws-353-Sim.Txt': 0.24130087702928973,
 'En-Yp-130.Txt': 0.1030553249210993}
Epoch 1 | Loss: 4.272293113802409| spearmanr: 0.0405157535020283




{'En-Mc-30.Txt': 0.25329556984416,
 'En-Men-Tr-3K.Txt': 0.15915428899358566,
 'En-Mturk-287.Txt': 0.3009341394040545,
 'En-Mturk-771.Txt': 0.1683470400595694,
 'En-Rg-65.Txt': 0.03939974887963897,
 'En-Rw-Stanford.Txt': 0.0025270252028032235,
 'En-Simlex-999.Txt': 0.015413155118198218,
 'En-Simverb-3500.Txt': -0.006157955770881766,
 'En-Verb-143.Txt': 0.1790682285153962,
 'En-Ws-353-All.Txt': 0.20313089358766648,
 'En-Ws-353-Rel.Txt': 0.20871770645405469,
 'En-Ws-353-Sim.Txt': 0.21445127582370108,
 'En-Yp-130.Txt': 0.12590282997331795}
Epoch 1 | Loss: 4.254173039002199| spearmanr: 0.015413155118198218



{'En-Mc-30.Txt': 0.17913023275943996,
 'En-Men-Tr-3K.Txt': 0.17864443574726327,
 'En-Mturk-287.Txt': 0.30271099393261813,
 'En-Mturk-771.Txt': 0.17801787311535808,
 'En-Rg-65.Txt': 0.01861572027601734,
 'En-Rw-Stanford.Txt': -0.03159386810677308,
 'En-Simlex-999.Txt': 0.049395207987267556,
 'En-Simverb-3500.Txt': -0.0014186729235918373,
 'En-Verb-143.Txt': 0.21112733296638622,
 'En-Ws-353-All.Txt': 0.21562790696767456,
 'En-Ws-353-Rel.Txt': 0.29047769246164395,
 'En-Ws-353-Sim.Txt': 0.19490040020638813,
 'En-Yp-130.Txt': 0.09926961910767115}
Epoch 1 | Loss: 4.239613209699593| spearmanr: 0.049395207987267556




{'En-Mc-30.Txt': 0.10299372392495998,
 'En-Men-Tr-3K.Txt': 0.15097167540098022,
 'En-Mturk-287.Txt': 0.3236201420404499,
 'En-Mturk-771.Txt': 0.14408476035548445,
 'En-Rg-65.Txt': 0.061426588353960636,
 'En-Rw-Stanford.Txt': -0.034388754958316024,
 'En-Simlex-999.Txt': 0.007316563486394009,
 'En-Simverb-3500.Txt': -0.008784524116519952,
 'En-Verb-143.Txt': 0.2680960458315262,
 'En-Ws-353-All.Txt': 0.15960352403853134,
 'En-Ws-353-Rel.Txt': 0.22882497991701062,
 'En-Ws-353-Sim.Txt': 0.158705925707359,
 'En-Yp-130.Txt': 0.06472765649317945}
Epoch 1 | Loss: 4.229370070037408| spearmanr: 0.007316563486394009




{'En-Mc-30.Txt': 0.16040410113672002,
 'En-Men-Tr-3K.Txt': 0.15831929369282016,
 'En-Mturk-287.Txt': 0.30816074796656157,
 'En-Mturk-771.Txt': 0.19120042543275617,
 'En-Rg-65.Txt': 0.08263370150930993,
 'En-Rw-Stanford.Txt': -0.028904226111687796,
 'En-Simlex-999.Txt': -0.0009415719714993613,
 'En-Simverb-3500.Txt': -0.007570950363678661,
 'En-Verb-143.Txt': 0.08504079174846878,
 'En-Ws-353-All.Txt': 0.19617215679833389,
 'En-Ws-353-Rel.Txt': 0.15933201473338476,
 'En-Ws-353-Sim.Txt': 0.27371148563607045,
 'En-Yp-130.Txt': 0.10125201291096245}
Epoch 1 | Loss: 4.216445575059444| spearmanr: -0.0009415719714993613




{'En-Mc-30.Txt': 0.023161268059679996,
 'En-Men-Tr-3K.Txt': 0.17398672837579,
 'En-Mturk-287.Txt': 0.33932210832330906,
 'En-Mturk-771.Txt': 0.1499264243947043,
 'En-Rg-65.Txt': -0.0904078801473058,
 'En-Rw-Stanford.Txt': -0.014141282910384263,
 'En-Simlex-999.Txt': 0.08096438752201982,
 'En-Simverb-3500.Txt': 0.00881781773843987,
 'En-Verb-143.Txt': 0.3024820045482333,
 'En-Ws-353-All.Txt': 0.2211518860965601,
 'En-Ws-353-Rel.Txt': 0.23558342924101977,
 'En-Ws-353-Sim.Txt': 0.24813811784991024,
 'En-Yp-130.Txt': 0.12248778182479093}
Epoch 1 | Loss: 4.2023023342242665| spearmanr: 0.08096438752201982




{'En-Mc-30.Txt': 0.06406308186719999,
 'En-Men-Tr-3K.Txt': 0.16579699854324892,
 'En-Mturk-287.Txt': 0.3699234045517846,
 'En-Mturk-771.Txt': 0.1409707856072228,
 'En-Rg-65.Txt': -0.05404905148320943,
 'En-Rw-Stanford.Txt': -0.02467374863573776,
 'En-Simlex-999.Txt': 0.03631764195436795,
 'En-Simverb-3500.Txt': 0.0016045719398066287,
 'En-Verb-143.Txt': 0.15644324105704108,
 'En-Ws-353-All.Txt': 0.21119171410693144,
 'En-Ws-353-Rel.Txt': 0.23065808031810756,
 'En-Ws-353-Sim.Txt': 0.19880872011470435,
 'En-Yp-130.Txt': 0.07278633718513043}
Epoch 1 | Loss: 4.187767543525339| spearmanr: 0.03631764195436795




{'En-Mc-30.Txt': 0.12073426967279999,
 'En-Men-Tr-3K.Txt': 0.15118667570232283,
 'En-Mturk-287.Txt': 0.29883245815937426,
 'En-Mturk-771.Txt': 0.21448929561866503,
 'En-Rg-65.Txt': 0.016130098534617297,
 'En-Rw-Stanford.Txt': -0.02308569012708498,
 'En-Simlex-999.Txt': 0.007859904498970073,
 'En-Simverb-3500.Txt': -0.011464197139675428,
 'En-Verb-143.Txt': 0.0978173735549514,
 'En-Ws-353-All.Txt': 0.20747142611127886,
 'En-Ws-353-Rel.Txt': 0.2693668086207788,
 'En-Ws-353-Sim.Txt': 0.20257944489966614,
 'En-Yp-130.Txt': 0.0512048987635852}
Epoch 1 | Loss: 4.172891910791131| spearmanr: 0.007859904498970073




{'En-Mc-30.Txt': 0.15449058588744,
 'En-Men-Tr-3K.Txt': 0.1848843945941484,
 'En-Mturk-287.Txt': 0.3221070782402164,
 'En-Mturk-771.Txt': 0.14743479165825837,
 'En-Rg-65.Txt': 0.1474449662770919,
 'En-Rw-Stanford.Txt': -0.022665557073331847,
 'En-Simlex-999.Txt': 0.02258291337670533,
 'En-Simverb-3500.Txt': 0.010465593912886099,
 'En-Verb-143.Txt': 0.2697773337884315,
 'En-Ws-353-All.Txt': 0.18460536281375228,
 'En-Ws-353-Rel.Txt': 0.24412217858078203,
 'En-Ws-353-Sim.Txt': 0.17861083821510093,
 'En-Yp-130.Txt': 0.1617191886042112}
Epoch 1 | Loss: 4.1578560712881| spearmanr: 0.02258291337670533




{'En-Mc-30.Txt': 0.2192928571608,
 'En-Men-Tr-3K.Txt': 0.1457826374442937,
 'En-Mturk-287.Txt': 0.37147942930479483,
 'En-Mturk-771.Txt': 0.1741960270458471,
 'En-Rg-65.Txt': 0.014014675775978966,
 'En-Rw-Stanford.Txt': 0.001771660801014313,
 'En-Simlex-999.Txt': 0.007217879232275491,
 'En-Simverb-3500.Txt': -0.009863570590069617,
 'En-Verb-143.Txt': 0.17078848866063637,
 'En-Ws-353-All.Txt': 0.22270529795021232,
 'En-Ws-353-Rel.Txt': 0.26431207860095535,
 'En-Ws-353-Sim.Txt': 0.26766580327789374,
 'En-Yp-130.Txt': 0.14572260331336692}
Epoch 1 | Loss: 4.142945835618193| spearmanr: 0.007217879232275491




{'En-Mc-30.Txt': 0.22619195828495997,
 'En-Men-Tr-3K.Txt': 0.1516871729169319,
 'En-Mturk-287.Txt': 0.39081316783946096,
 'En-Mturk-771.Txt': 0.20338475157805982,
 'En-Rg-65.Txt': 0.07597011981959917,
 'En-Rw-Stanford.Txt': 0.015196973382293373,
 'En-Simlex-999.Txt': 0.00043786664388640825,
 'En-Simverb-3500.Txt': 0.026943201685940617,
 'En-Verb-143.Txt': 0.23943973949427272,
 'En-Ws-353-All.Txt': 0.18075330233521092,
 'En-Ws-353-Rel.Txt': 0.2085568869597632,
 'En-Ws-353-Sim.Txt': 0.1966953208604843,
 'En-Yp-130.Txt': 0.03333003699105102}
Epoch 1 | Loss: 4.127982002656325| spearmanr: 0.00043786664388640825




{'En-Mc-30.Txt': 0.14734508829456,
 'En-Men-Tr-3K.Txt': 0.1506003109725812,
 'En-Mturk-287.Txt': 0.2908883492940902,
 'En-Mturk-771.Txt': 0.19413949023443894,
 'En-Rg-65.Txt': 0.036067958034783604,
 'En-Rw-Stanford.Txt': 0.0076122477942995115,
 'En-Simlex-999.Txt': 0.019551872783433243,
 'En-Simverb-3500.Txt': 0.03018932250391621,
 'En-Verb-143.Txt': 0.21602036358737983,
 'En-Ws-353-All.Txt': 0.16935852008164856,
 'En-Ws-353-Rel.Txt': 0.1673774230681025,
 'En-Ws-353-Sim.Txt': 0.20572403457273383,
 'En-Yp-130.Txt': 0.03303850849056701}
Epoch 1 | Loss: 4.113232131926659| spearmanr: 0.019551872783433243




{'En-Mc-30.Txt': 0.31538747996159994,
 'En-Men-Tr-3K.Txt': 0.1425484176547897,
 'En-Mturk-287.Txt': 0.3286237832537912,
 'En-Mturk-771.Txt': 0.18637202672912567,
 'En-Rg-65.Txt': 0.14154822533738753,
 'En-Rw-Stanford.Txt': 0.003189825222203538,
 'En-Simlex-999.Txt': 0.03080698211091585,
 'En-Simverb-3500.Txt': 0.001714389961573665,
 'En-Verb-143.Txt': 0.1709735107802958,
 'En-Ws-353-All.Txt': 0.1165036043779617,
 'En-Ws-353-Rel.Txt': 0.13820218724281608,
 'En-Ws-353-Sim.Txt': 0.1117558413973504,
 'En-Yp-130.Txt': 0.05735198543093386}
Epoch 1 | Loss: 4.100394819997148| spearmanr: 0.03080698211091585




{'En-Mc-30.Txt': 0.27078971912328,
 'En-Men-Tr-3K.Txt': 0.1298462161100475,
 'En-Mturk-287.Txt': 0.363216256778035,
 'En-Mturk-771.Txt': 0.14513205027286677,
 'En-Rg-65.Txt': 0.176532029208369,
 'En-Rw-Stanford.Txt': -0.0016826481452062705,
 'En-Simlex-999.Txt': 0.04376880541416315,
 'En-Simverb-3500.Txt': 0.03153475435778983,
 'En-Verb-143.Txt': 0.19237775447089783,
 'En-Ws-353-All.Txt': 0.0990092596110834,
 'En-Ws-353-Rel.Txt': 0.06578545110282323,
 'En-Ws-353-Sim.Txt': 0.16207236819167808,
 'En-Yp-130.Txt': 0.11238423693658778}
Epoch 1 | Loss: 4.0922316928092295| spearmanr: 0.04376880541416315




{'En-Mc-30.Txt': 0.10890723917423999,
 'En-Men-Tr-3K.Txt': 0.1241403731133242,
 'En-Mturk-287.Txt': 0.34950961718603074,
 'En-Mturk-771.Txt': 0.17694473590706403,
 'En-Rg-65.Txt': 0.09085740748351645,
 'En-Rw-Stanford.Txt': 0.030897530593802076,
 'En-Simlex-999.Txt': 0.06387732632791035,
 'En-Simverb-3500.Txt': 0.013126097190680227,
 'En-Verb-143.Txt': 0.15395348688162389,
 'En-Ws-353-All.Txt': 0.1190616708662561,
 'En-Ws-353-Rel.Txt': 0.1708187184121143,
 'En-Ws-353-Sim.Txt': 0.13191816500664874,
 'En-Yp-130.Txt': 0.01116970626140183}
Epoch 1 | Loss: 4.081337394630084| spearmanr: 0.06387732632791035




{'En-Mc-30.Txt': 0.20672663725607995,
 'En-Men-Tr-3K.Txt': 0.12603494418265054,
 'En-Mturk-287.Txt': 0.27618103385082354,
 'En-Mturk-771.Txt': 0.15880959312833884,
 'En-Rg-65.Txt': 0.1480531503202004,
 'En-Rw-Stanford.Txt': 0.004841777265813556,
 'En-Simlex-999.Txt': 0.05606990002833267,
 'En-Simverb-3500.Txt': 0.018856001973052622,
 'En-Verb-143.Txt': 0.23406203136417125,
 'En-Ws-353-All.Txt': 0.13756140725034305,
 'En-Ws-353-Rel.Txt': 0.15219428935892879,
 'En-Ws-353-Sim.Txt': 0.19139404357978163,
 'En-Yp-130.Txt': 0.10188088153343511}
Epoch 1 | Loss: 4.069591136085002| spearmanr: 0.05606990002833267




{'En-Mc-30.Txt': 0.16927437401064,
 'En-Men-Tr-3K.Txt': 0.16181940398429795,
 'En-Mturk-287.Txt': 0.3212926538366628,
 'En-Mturk-771.Txt': 0.13743701613836123,
 'En-Rg-65.Txt': 0.08435248250070357,
 'En-Rw-Stanford.Txt': -0.017271166852956717,
 'En-Simlex-999.Txt': 0.05879661917604052,
 'En-Simverb-3500.Txt': 0.01398128062757024,
 'En-Verb-143.Txt': 0.20812273463191666,
 'En-Ws-353-All.Txt': 0.10792866482031176,
 'En-Ws-353-Rel.Txt': 0.1739827109689769,
 'En-Ws-353-Sim.Txt': 0.11196609911393705,
 'En-Yp-130.Txt': 0.08615083658589048}
Epoch 1 | Loss: 4.057303349417661| spearmanr: 0.05879661917604052




{'En-Mc-30.Txt': 0.11432796148607999,
 'En-Men-Tr-3K.Txt': 0.17061300056043446,
 'En-Mturk-287.Txt': 0.3616673048648112,
 'En-Mturk-771.Txt': 0.16793295452547208,
 'En-Rg-65.Txt': 0.06785218498332456,
 'En-Rw-Stanford.Txt': 0.013112479865722676,
 'En-Simlex-999.Txt': 0.0363772555816014,
 'En-Simverb-3500.Txt': -0.001078096101815507,
 'En-Verb-143.Txt': 0.17683589641950515,
 'En-Ws-353-All.Txt': 0.14867672571469154,
 'En-Ws-353-Rel.Txt': 0.21408641739130438,
 'En-Ws-353-Sim.Txt': 0.14442463411199855,
 'En-Yp-130.Txt': 0.14897939313305975}
Epoch 1 | Loss: 4.045064143301809| spearmanr: 0.0363772555816014




{'En-Mc-30.Txt': 0.17173833869783997,
 'En-Men-Tr-3K.Txt': 0.1726066383054974,
 'En-Mturk-287.Txt': 0.36737901592788097,
 'En-Mturk-771.Txt': 0.17186076635111325,
 'En-Rg-65.Txt': 0.23071329461399329,
 'En-Rw-Stanford.Txt': -0.004942911172180528,
 'En-Simlex-999.Txt': 0.054684974071166806,
 'En-Simverb-3500.Txt': -0.017862526380595687,
 'En-Verb-143.Txt': 0.29428270909332543,
 'En-Ws-353-All.Txt': 0.18170489422506564,
 'En-Ws-353-Rel.Txt': 0.2122436436371673,
 'En-Ws-353-Sim.Txt': 0.19373547969574084,
 'En-Yp-130.Txt': 0.048439542701851114}
Epoch 1 | Loss: 4.032711417248814| spearmanr: 0.054684974071166806




{'En-Mc-30.Txt': 0.32844649280376,
 'En-Men-Tr-3K.Txt': 0.1673330188963794,
 'En-Mturk-287.Txt': 0.3782719750700396,
 'En-Mturk-771.Txt': 0.16462193825730098,
 'En-Rg-65.Txt': 0.17825081019976263,
 'En-Rw-Stanford.Txt': -0.027420505632266785,
 'En-Simlex-999.Txt': 0.0710669432386491,
 'En-Simverb-3500.Txt': 0.015000587415290157,
 'En-Verb-143.Txt': 0.2927381766161684,
 'En-Ws-353-All.Txt': 0.1866448883682628,
 'En-Ws-353-Rel.Txt': 0.20570082947467686,
 'En-Ws-353-Sim.Txt': 0.20423135938623094,
 'En-Yp-130.Txt': 0.03220973461061959}
Epoch 1 | Loss: 4.0201783430151465| spearmanr: 0.0710669432386491




{'En-Mc-30.Txt': 0.19243564207032,
 'En-Men-Tr-3K.Txt': 0.16889764424023349,
 'En-Mturk-287.Txt': 0.3655361482279775,
 'En-Mturk-771.Txt': 0.15968136285355408,
 'En-Rg-65.Txt': 0.0853308655265738,
 'En-Rw-Stanford.Txt': -0.04517928499062085,
 'En-Simlex-999.Txt': 0.0778555007608628,
 'En-Simverb-3500.Txt': 0.04384192288639279,
 'En-Verb-143.Txt': 0.31843412903887086,
 'En-Ws-353-All.Txt': 0.1750721985231492,
 'En-Ws-353-Rel.Txt': 0.20352634032254005,
 'En-Ws-353-Sim.Txt': 0.1762724941243856,
 'En-Yp-130.Txt': 0.06965448815135929}
Epoch 1 | Loss: 4.007859196198468| spearmanr: 0.0778555007608628




{'En-Mc-30.Txt': 0.13551805779599999,
 'En-Men-Tr-3K.Txt': 0.13838539837552571,
 'En-Mturk-287.Txt': 0.3385543122709399,
 'En-Mturk-771.Txt': 0.16258088374112423,
 'En-Rg-65.Txt': 0.11330733150956578,
 'En-Rw-Stanford.Txt': -0.019866581942378864,
 'En-Simlex-999.Txt': 0.057187672804623293,
 'En-Simverb-3500.Txt': 0.03195668920657461,
 'En-Verb-143.Txt': 0.23287346535635897,
 'En-Ws-353-All.Txt': 0.1411257497349259,
 'En-Ws-353-Rel.Txt': 0.16252583346212096,
 'En-Ws-353-Sim.Txt': 0.18464260645967995,
 'En-Yp-130.Txt': 0.10729914694957374}
Epoch 1 | Loss: 3.995485972744229| spearmanr: 0.057187672804623293




{'En-Mc-30.Txt': 0.09905138042543998,
 'En-Men-Tr-3K.Txt': 0.1439096279099246,
 'En-Mturk-287.Txt': 0.38359101254644906,
 'En-Mturk-771.Txt': 0.1158539179814825,
 'En-Rg-65.Txt': 0.07774178637995878,
 'En-Rw-Stanford.Txt': -0.012910144040946793,
 'En-Simlex-999.Txt': 0.026184719134607792,
 'En-Simverb-3500.Txt': 0.02374412829137752,
 'En-Verb-143.Txt': 0.33542800850759086,
 'En-Ws-353-All.Txt': 0.15030707854828557,
 'En-Ws-353-Rel.Txt': 0.21738623994710485,
 'En-Ws-353-Sim.Txt': 0.17126727826078836,
 'En-Yp-130.Txt': 0.09678746216069298}
Epoch 1 | Loss: 3.98387914075942| spearmanr: 0.026184719134607792
Model trained.
Output saved.





# 5 - Testing

The **word2box** project ships several word similarity datasets in the `word2box/data/similarity_datasets` folder. We use them to evaluate the trained model on the word similarity task.

To test the model, we first load the `best_model.ckpt` checkpoint and then call the `model_eval` method of the previously created `trainer`. The implementation of this method is described in **section 3.4**.
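
The scores reported in this section are Spearman rank correlations between human similarity judgments and the similarities produced by the model. The following is a minimal, dependency-free sketch of that metric, not the repository's own implementation (see **section 3.4** for `model_eval`). The function names and the toy scores are assumptions made for illustration.

```python
import math
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the group while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for position in range(start, end + 1):
            ranks[order[position]] = mean_rank
        start = end + 1
    return ranks


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman correlation: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Toy example (hypothetical values): human ratings vs. model similarities
# for four word pairs.
human_scores = [9.0, 7.5, 3.0, 1.0]
model_scores = [0.8, 0.9, 0.2, 0.1]
print(round(spearman(human_scores, model_scores), 3))
```

A value close to 1 means the model orders the word pairs the same way the human annotators do, while a value near 0 means the two orderings are essentially unrelated.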



## 5.1 Load the model

In [None]:
MODEL_FILENAME = 'best_model.ckpt'

model_path = SAVED_MODELS_DIR + '/' + MODEL_FILENAME

# Load the model's checkpoint
model.load_checkpoint(model_path)

# If you're using a GPU, move the model to the GPU
if torch.cuda.is_available():
    model = model.cuda()
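
If the checkpoint file is missing, a small guard can report it with a clear message before the model tries to load it. This is only a sketch: `resolve_checkpoint` is a hypothetical helper, not part of the *word2box* code base.

```python
from pathlib import Path


def resolve_checkpoint(models_dir: str, filename: str = "best_model.ckpt") -> Path:
    """Build the checkpoint path and make sure the file exists."""
    path = Path(models_dir) / filename
    if not path.is_file():
        raise FileNotFoundError(f"Checkpoint not found: {path}")
    return path


# Hypothetical usage with the names defined in this notebook:
# model.load_checkpoint(str(resolve_checkpoint(SAVED_MODELS_DIR, MODEL_FILENAME)))
```

Using `pathlib` instead of concatenating strings with `'/'` also keeps the path handling independent of the operating system.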

## 5.2 Evaluate the model on the word similarity task

In [None]:
trainer.model_eval(model=model)

{'En-Mc-30.Txt': 0.06258470305488,
 'En-Men-Tr-3K.Txt': 0.1599207869380512,
 'En-Mturk-287.Txt': 0.3404155169628823,
 'En-Mturk-771.Txt': 0.16082983148945287,
 'En-Rg-65.Txt': 0.0859654923541653,
 'En-Rw-Stanford.Txt': 0.03820726082554591,
 'En-Simlex-999.Txt': 0.101616594761613,
 'En-Simverb-3500.Txt': -0.0034523562617459263,
 'En-Verb-143.Txt': 0.23299413195613689,
 'En-Ws-353-All.Txt': 0.1524565619017167,
 'En-Ws-353-Rel.Txt': 0.17931554988870288,
 'En-Ws-353-Sim.Txt': 0.13537891426090742,
 'En-Yp-130.Txt': 0.06658094481768495}
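
To read the scores above more easily, they can be ranked and averaged. The sketch below copies the values from the output above; the variable names are illustrative. In the training log, the `spearmanr` value printed next to the loss coincides with the `En-Simlex-999.Txt` score of each evaluation shown there.

```python
# Scores copied from the evaluation output above.
scores = {
    'En-Mc-30.Txt': 0.06258470305488,
    'En-Men-Tr-3K.Txt': 0.1599207869380512,
    'En-Mturk-287.Txt': 0.3404155169628823,
    'En-Mturk-771.Txt': 0.16082983148945287,
    'En-Rg-65.Txt': 0.0859654923541653,
    'En-Rw-Stanford.Txt': 0.03820726082554591,
    'En-Simlex-999.Txt': 0.101616594761613,
    'En-Simverb-3500.Txt': -0.0034523562617459263,
    'En-Verb-143.Txt': 0.23299413195613689,
    'En-Ws-353-All.Txt': 0.1524565619017167,
    'En-Ws-353-Rel.Txt': 0.17931554988870288,
    'En-Ws-353-Sim.Txt': 0.13537891426090742,
    'En-Yp-130.Txt': 0.06658094481768495,
}

# Rank the datasets from the highest to the lowest correlation.
ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
for name, rho in ranked:
    print(f"{name:<22} {rho:+.3f}")

best_name, best_rho = ranked[0]
worst_name, worst_rho = ranked[-1]

# Unweighted mean over datasets (each dataset counts equally, whatever its size).
mean_rho = sum(scores.values()) / len(scores)
print(f"best: {best_name}, worst: {worst_name}, mean: {mean_rho:.3f}")
```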



# 6 - Conclusions

This notebook provides an in-depth analysis of the **word2box** algorithm, together with ready-to-run code to train and test the model. The model can be trained either on the dataset provided by the *word2box* repository or on a dataset that I preprocessed beforehand. To explore the algorithm further, the training parameters can be tuned.