# Homework and bake-off: pragmatic color descriptions

In [1]:
__author__ = "Christopher Potts"
__version__ = "CS224u, Stanford, Spring 2020"

## Contents

1. [Overview](#Overview)
1. [Set-up](#Set-up)
1. [All two-word examples as a dev corpus](#All-two-word-examples-as-a-dev-corpus)
1. [Dev dataset](#Dev-dataset)
1. [Random train–test split for development](#Random-train–test-split-for-development)
1. [Question 1: Improve the tokenizer [1 point]](#Question-1:-Improve-the-tokenizer-[1-point])
1. [Use the tokenizer](#Use-the-tokenizer)
1. [Question 2: Improve the color representations [1 point]](#Question-2:-Improve-the-color-representations-[1-point])
1. [Use the color representer](#Use-the-color-representer)
1. [Initial model](#Initial-model)
1. [Question 3: GloVe embeddings [1 points]](#Question-3:-GloVe-embeddings-[1-points])
1. [Try the GloVe representations](#Try-the-GloVe-representations)
1. [Question 4: Color context [3 points]](#Question-4:-Color-context-[3-points])
1. [Your original system [3 points]](#Your-original-system-[3-points])
1. [Bakeoff [1 point]](#Bakeoff-[1-point])

## Overview

This homework and associated bake-off are oriented toward building an effective system for generating color descriptions that are pragmatic in the sense that they would help a reader/listener figure out which color was being referred to in a shared context consisting of a target color (whose identity is known only to the describer/speaker) and a set of distractors.

The notebook [colors_overview.ipynb](colors_overview.ipynb) should be studied before work on this homework begins. That notebook provides background on the task, the dataset, and the modeling code that you will be using and adapting.

The homework questions are more open-ended than previous ones have been. Rather than asking you to implement pre-defined functionality, they ask you to try to improve baseline components of the full system in ways that you find to be effective. As usual, this culminates in a prompt asking you to develop a novel system for entry into the bake-off. In this case, though, the work you do for the homework will likely be directly incorporated into that system.

## Set-up

See [colors_overview.ipynb](colors_overview.ipynb) for set-up instructions and other background details.

In [1]:
from colors import ColorsCorpusReader
import os
from sklearn.model_selection import train_test_split
from torch_color_describer import (
    ColorizedNeuralListener, create_example_dataset)
import utils
from utils import START_SYMBOL, END_SYMBOL, UNK_SYMBOL
import numpy as np

In [2]:
utils.fix_random_seeds()

In [3]:
COLORS_SRC_FILENAME = os.path.join(
    "data", "colors", "filteredCorpus.csv")

## All two-word examples as a dev corpus

So that you don't have to sit through excessively long training runs during development, I suggest working with the two-word-only subset of the corpus until you enter into the late stages of system testing.

In [4]:
dev_corpus = ColorsCorpusReader(
    COLORS_SRC_FILENAME, 
    word_count=2, 
    normalize_colors=True)

In [6]:
dev_examples = list(dev_corpus.read())

This subset has about one-third the examples of the full corpus:

In [7]:
len(dev_examples)

13890

We __should__ worry that it's not a fully representative sample. Most of the descriptions in the full corpus are shorter, and a large proportion are longer. So this dataset is mainly for debugging, development, and general hill-climbing. All findings should be validated on the full dataset at some point.
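One quick way to gauge how representative the two-word subset is would be to tabulate description lengths. The following is a minimal sketch on a handful of made-up strings: the `texts` list is hypothetical, and in the notebook you would substitute the `contents` of examples read from the full corpus. It counts whitespace-separated words, which may not match exactly how the reader's `word_count` filter counts them.

```python
from collections import Counter

def length_distribution(texts):
    """Map whitespace word count -> proportion of texts with that count."""
    counts = Counter(len(t.split()) for t in texts)
    total = sum(counts.values())
    return {n: c / total for n, c in sorted(counts.items())}

# Hypothetical stand-in for corpus descriptions.
texts = ["blue", "dark blue", "the brighter green one", "teal", "dull pink"]
print(length_distribution(texts))
```

Comparing this distribution for the full corpus against the dev subset would show how much of the length range the subset leaves out.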

## Dev dataset

The first step is to extract the raw colors and raw texts from the corpus:

In [9]:
dev_rawcols, dev_texts = zip(*[[ex.colors, ex.contents] for ex in dev_examples])

The raw color representations are suitable inputs to a model, but the texts are just strings, so they can't really be processed as-is. Question 1 asks you to do some tokenizing!

## Random train–test split for development

For the sake of development runs, we create a random train–test split:

In [10]:
dev_rawcols_train, dev_rawcols_test, dev_texts_train, dev_texts_test = \
    train_test_split(dev_rawcols, dev_texts)
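If you want the split to stay the same no matter which cells were run beforehand, one option is to pass an explicit `random_state` to `train_test_split`. The sketch below uses small made-up lists (`colors` and `texts` are hypothetical stand-ins for `dev_rawcols` and `dev_texts`), and the seed value 42 is arbitrary.

```python
from sklearn.model_selection import train_test_split

# Hypothetical stand-ins for the raw color contexts and texts.
colors = [[i, i + 1, i + 2] for i in range(8)]
texts = ["text {}".format(i) for i in range(8)]

# The same random_state yields the same partition on every call.
split_a = train_test_split(colors, texts, test_size=0.25, random_state=42)
split_b = train_test_split(colors, texts, test_size=0.25, random_state=42)

cols_train, cols_test, texts_train, texts_test = split_a
print(len(cols_train), len(cols_test))
```

With `test_size=0.25` (also the default when neither size is given), a quarter of the examples are held out for testing.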

## Question 1: Improve the tokenizer [1 point]

This is the first required question – the first required modification to the default pipeline.

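The baseline `tokenize_example` simply splits its string on whitespace and adds the required start and end symbols. The version below instead delegates the splitting to `heuristic_ending_tokenizer` from `colors_utils`: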

In [11]:
from colors_utils import heuristic_ending_tokenizer

def tokenize_example(s):
    
    # Improve me!
    
    return [START_SYMBOL] + heuristic_ending_tokenizer(s) + [END_SYMBOL]

def clean_test_and_training(dev_seqs_train, dev_seqs_test):
    """Replace tokens that occur exactly once (counted over train and
    test together) with UNK_SYMBOL in both sets of sequences."""
    vocab = {}
    for toks in dev_seqs_train + dev_seqs_test:
        for w in toks:
            vocab[w] = vocab.get(w, 0) + 1
    removal_candidates = {w for w, count in vocab.items() if count == 1}

    dev_seqs_train = [
        [UNK_SYMBOL if w in removal_candidates else w for w in toks]
        for toks in dev_seqs_train]
    dev_seqs_test = [
        [UNK_SYMBOL if w in removal_candidates else w for w in toks]
        for toks in dev_seqs_test]
    return dev_seqs_train, dev_seqs_test

In [12]:
tokenize_example(dev_texts_train[376])

['<s>', 'aqua', ',', 'teal', '</s>']

__Your task__: Modify `tokenize_example` so that it does something more sophisticated with the input text. 

__Notes__:

* There are useful ideas for this in [Monroe et al. 2017](https://transacl.org/ojs/index.php/tacl/article/view/1142)
* There is no requirement that you do word-level tokenization. Sub-word and multi-word are options.
* This question can interact with the size of your vocabulary (see just below), and in turn with decisions about how to use `UNK_SYMBOL`.

__Important__: don't forget to add the start and end symbols, else the resulting models will definitely be terrible!
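As one illustration of a more sophisticated tokenizer, here is a minimal sketch in the spirit of the ending-splitting heuristic discussed by Monroe et al. 2017. It lowercases, separates punctuation, and splits off the endings `er`, `est`, and `ish`. The names `split_ending` and `tokenize_example_sketch`, the `+ending` marking convention, and the minimum stem length are all assumptions made for this example, and the symbol values are assumed to match those in `utils`.

```python
import re

START_SYMBOL = "<s>"   # assumed to match utils.START_SYMBOL
END_SYMBOL = "</s>"    # assumed to match utils.END_SYMBOL

ENDINGS = ("er", "est", "ish")

def split_ending(word, min_stem=3):
    """Split a comparative-style ending off `word` if the stem is long enough."""
    for ending in ENDINGS:
        stem = word[:-len(ending)]
        if word.endswith(ending) and len(stem) >= min_stem:
            return [stem, "+" + ending]
    return [word]

def tokenize_example_sketch(s):
    # Alphanumeric runs become words; any other non-space character
    # (punctuation, hyphens) becomes its own token.
    words = re.findall(r"[a-z0-9]+|[^\sa-z0-9]", s.lower())
    toks = []
    for w in words:
        toks.extend(split_ending(w))
    return [START_SYMBOL] + toks + [END_SYMBOL]

print(tokenize_example_sketch("Greenish blue, darker"))
```

A purely suffix-based rule like this over-splits words such as "water" or "other", so a real tokenizer would probably need an exception list or a check against known color terms.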

## Use the tokenizer

Once the tokenizer is working, run the following cell to tokenize your inputs:

In [18]:
dev_seqs_train = [tokenize_example(s) for s in dev_texts_train]

dev_seqs_test = [tokenize_example(s) for s in dev_texts_test]

dev_seqs_train, dev_seqs_test = clean_test_and_training(dev_seqs_train, dev_seqs_test)

We use only the train set to derive a vocabulary for the model:

In [19]:
dev_vocab = sorted({w for toks in dev_seqs_train for w in toks}) + [UNK_SYMBOL]

It's important that the `UNK_SYMBOL` is included somewhere in this list. Test examples with words not seen in training will be mapped to `UNK_SYMBOL`. If your model's vocab is the same as your train vocab, then `UNK_SYMBOL` will never be encountered during training, so it will be a random vector at test time.

In [22]:
len(dev_vocab)

524
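One way to make sure `UNK_SYMBOL` is actually seen during training is to replace rare training words with it before fitting. The sketch below counts words on the training sequences only, which differs from `clean_test_and_training` above (that function counts over train and test together). The function name `unk_rare_tokens`, the `min_count` parameter, and the literal value of `UNK_SYMBOL` are assumptions for illustration.

```python
from collections import Counter

UNK_SYMBOL = "$UNK"  # assumed value; the notebook imports this from utils

def unk_rare_tokens(train_seqs, test_seqs, min_count=2):
    """Map tokens seen fewer than `min_count` times in training to UNK_SYMBOL."""
    counts = Counter(w for toks in train_seqs for w in toks)
    keep = {w for w, c in counts.items() if c >= min_count}

    def mask(seqs):
        return [[w if w in keep else UNK_SYMBOL for w in toks] for toks in seqs]

    vocab = sorted(keep | {UNK_SYMBOL})
    return mask(train_seqs), mask(test_seqs), vocab

train = [["<s>", "blue", "</s>"], ["<s>", "blue", "teal", "</s>"]]
test = [["<s>", "mauve", "</s>"]]
new_train, new_test, vocab = unk_rare_tokens(train, test)
print(new_train, new_test, vocab)
```

Because the vocabulary is built as a set union, `UNK_SYMBOL` appears in it exactly once, and unseen test words such as "mauve" are mapped to a symbol the model has been trained on.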

## Question 2: Improve the color representations [1 point]

This is the second required pipeline improvement for the assignment. 

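The baseline versions of the following functions do nothing at all to the raw input colors we get from the corpus. The version of `represent_color` below instead passes each color through `colorsys.rgb_to_hsv`.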

In [23]:
import colorsys

def represent_color_context(colors):
    
    # Improve me!
    
    return [represent_color(color) for color in colors]


def represent_color(color):
    #import numpy.fft as fft
    # Improve me!
    #return color
    return colorsys.rgb_to_hsv(*color)

In [24]:
represent_color_context(dev_rawcols_train[0])

[(0.2972459639126306, 0.78, 0.5),
 (0.08032596041909196, 0.6386617100371748, 0.7472222222222222),
 (0.5837378640776699, 0.6270928462709285, 0.73)]

__Your task__: Modify `represent_color_context` and/or `represent_color` to represent colors in a new way.
    
__Notes__:

* The Fourier-transform method of [Monroe et al. 2017](https://transacl.org/ojs/index.php/tacl/article/view/1142) is a proven choice.
* You are not required to keep `represent_color`. This might be unnatural if you want to perform an operation on each color trio all at once.
* For that matter, if you want to process all of the color contexts in the entire data set all at once, that is fine too, as long as you can also perform the operation at test time with an unknown number of examples being tested.
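A minimal sketch of a Fourier-style featurization in the spirit of Monroe et al. 2017, assuming each color is a triple of components already scaled to [0, 1] (as `normalize_colors=True` suggests). The function name `fourier_color_features` and the `max_freq` parameter are illustrative, and the details may differ from the paper's exact recipe.

```python
import cmath
from itertools import product

def fourier_color_features(color, max_freq=3):
    """Return real and imaginary parts of exp(-2*pi*i*(j*a + k*b + l*c))
    for all frequency triples (j, k, l) in range(max_freq)."""
    a, b, c = color
    feats = [
        cmath.exp(-2j * cmath.pi * (j * a + k * b + l * c))
        for j, k, l in product(range(max_freq), repeat=3)]
    return [z.real for z in feats] + [z.imag for z in feats]

vec = fourier_color_features((0.30, 0.78, 0.50))
print(len(vec))
```

With `max_freq=3` there are 27 frequency triples, so each color becomes a 54-dimensional vector. A function like this could be called from `represent_color`, leaving `represent_color_context` unchanged.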

## Use the color representer

The following cell just runs your `represent_color_context` on the train and test sets:

In [25]:
dev_cols_train = [represent_color_context(colors) for colors in dev_rawcols_train]

dev_cols_test = [represent_color_context(colors) for colors in dev_rawcols_test]

At this point, our preprocessing steps are complete, and we can fit a first model.

## Initial model

The first model is configured right now to be a small model run for just a few iterations. It should be enough to get traction, but it's unlikely to be a great model. You are free to modify this configuration if you wish; it is here just for demonstration and testing:

In [26]:
dev_mod = ColorizedNeuralListener(
    dev_vocab, 
    embed_dim=10, 
    hidden_dim=10, 
    max_iter=5, 
    batch_size=128)

Using cpu


In [29]:
_ = dev_mod.fit(dev_cols_train, dev_seqs_train)

ColorizedNeuralListenerEncoder cpu
ColorizedNeuralListenerEncoderDecoder cpu
Train: Epoch 1; err = 84.04119199514389; time = 1.5688772201538086
Train: Epoch 2; err = 77.22541165351868; time = 1.4339301586151123
Train: Epoch 3; err = 75.3536547422409; time = 1.4345111846923828
Train: Epoch 4; err = 73.96716660261154; time = 1.4857251644134521
Train: Epoch 5; err = 73.30008709430695; time = 1.533970832824707


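We can also see the model's predictions given color context inputs. Because `ColorizedNeuralListener` acts as a listener, each prediction is the index of the color it selects as the target (the evaluation cells later in this notebook count index 2 as correct):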

In [30]:
dev_mod.predict(dev_cols_test[:1], dev_seqs_train[:1])

[2]

As discussed in [colors_overview.ipynb](colors_overview.ipynb), our primary metric is `listener_accuracy`:

In [32]:
#dev_mod.listener_accuracy(dev_cols_test, dev_seqs_test)

In [22]:
#dev_seqs_train[:1]

## Question 3: GloVe embeddings [1 points]

The above model uses a random initial embedding, as configured by the decoder used by `ContextualColorDescriber`. This homework question asks you to consider using GloVe inputs. 

__Your task__: Complete `create_glove_embedding` so that it creates a GloVe embedding based on your model vocabulary. This isn't meant to be analytically challenging, but rather just to create a basis for you to try out other kinds of rich initialization.

In [33]:
GLOVE_HOME = os.path.join('data', 'glove.6B')

In [34]:
def create_glove_embedding(vocab, glove_base_filename='glove.6B.100d.txt'):
    
    # Use `utils.glove2dict` to read in the GloVe file:    
    ##### YOUR CODE HERE
    glove_dict = utils.glove2dict(os.path.join(GLOVE_HOME, glove_base_filename))

    
    # Use `utils.create_pretrained_embedding` to create the embedding.
    # This function will, by default, ensure that START_TOKEN, 
    # END_TOKEN, and UNK_TOKEN are included in the embedding.
    ##### YOUR CODE HERE
    embedding, new_vocab = utils.create_pretrained_embedding(glove_dict, vocab)

    
    # Be sure to return the embedding you create as well as the
    # vocabulary returned by `utils.create_pretrained_embedding`,
    # which is likely to have been modified from the input `vocab`.
    
    ##### YOUR CODE HERE
    return embedding, new_vocab
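To make the idea concrete without the GloVe files, here is a rough stand-in that builds an embedding matrix from any word-to-vector dictionary, falling back to small random vectors for words that are missing. It is an illustration only, not the implementation of `utils.create_pretrained_embedding`. The name `build_embedding_sketch` and the toy `lookup` are hypothetical.

```python
import numpy as np

def build_embedding_sketch(lookup, vocab, dim, seed=0):
    """Stack one row per vocab item: the pretrained vector if available,
    otherwise a small random vector."""
    rng = np.random.default_rng(seed)
    rows = []
    for w in vocab:
        if w in lookup:
            rows.append(np.asarray(lookup[w], dtype=float))
        else:
            rows.append(rng.normal(loc=0.0, scale=0.01, size=dim))
    return np.vstack(rows)

# Toy 2-dimensional "pretrained" vectors.
lookup = {"blue": [0.1, 0.2], "green": [0.3, 0.4]}
vocab = ["blue", "green", "<s>"]
emb = build_embedding_sketch(lookup, vocab, dim=2)
print(emb.shape)
```

The row order follows `vocab`, so the matrix and the vocabulary have to be passed to the model together.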


## Try the GloVe representations

Let's see if GloVe helped for our development data:

In [25]:
#dev_glove_embedding, dev_glove_vocab = create_glove_embedding(dev_vocab)

In [35]:
embedding = np.random.normal(
    loc=0, scale=0.01, size=(len(dev_vocab), 100))

Calling `create_glove_embedding` might dramatically change your vocabulary, depending on how many items from your vocab are in the GloVe space.
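Before swapping in pretrained vectors, it can help to check how much of the vocabulary they cover. The following sketch assumes you have the set of words in the pretrained space (for example, the keys of the dictionary returned by `utils.glove2dict`). The function name `glove_coverage` and the toy inputs are hypothetical.

```python
def glove_coverage(vocab, glove_words):
    """Return the fraction of `vocab` found in `glove_words`, plus the
    list of vocab items that are missing."""
    glove_words = set(glove_words)
    missing = [w for w in vocab if w not in glove_words]
    covered = (len(vocab) - len(missing)) / len(vocab)
    return covered, missing

vocab = ["<s>", "</s>", "blue", "greenish", "$UNK"]
glove_words = {"blue", "greenish", "red"}
covered, missing = glove_coverage(vocab, glove_words)
print(covered, missing)
```

Special symbols and sub-word tokens produced by a custom tokenizer are the likeliest items to be missing, so low coverage may say more about the tokenizer than about the data.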

## Question 4: Color context [3 points]

In [36]:
toy_color_seqs, toy_word_seqs, toy_vocab = create_example_dataset(
    group_size=50, vec_dim=2)

In [37]:
toy_color_seqs_train, toy_color_seqs_test, toy_word_seqs_train, toy_word_seqs_test = \
    train_test_split(toy_color_seqs, toy_word_seqs)

In [38]:
toy_mod = ColorizedNeuralListener(
    toy_vocab, 
    embed_dim=100, 
    embedding=embedding,
    hidden_dim=100, 
    max_iter=100, 
    batch_size=128)

Using cpu


In [39]:
_ = toy_mod.fit(toy_color_seqs_train, toy_word_seqs_train)

ColorizedNeuralListenerEncoder cpu
ColorizedNeuralListenerEncoderDecoder cpu
Train: Epoch 1; err = 1.0981638431549072; time = 0.02631402015686035
Train: Epoch 2; err = 1.0931386947631836; time = 0.01901078224182129
Train: Epoch 3; err = 1.0778679847717285; time = 0.0189511775970459
Train: Epoch 4; err = 1.041054129600525; time = 0.018870830535888672
Train: Epoch 5; err = 1.0372779369354248; time = 0.0178070068359375
Train: Epoch 6; err = 0.9852323532104492; time = 0.015791893005371094
Train: Epoch 7; err = 0.9883937239646912; time = 0.01615595817565918
Train: Epoch 8; err = 0.9423364996910095; time = 0.014310121536254883
Train: Epoch 9; err = 0.9208185076713562; time = 0.014590024948120117
Train: Epoch 10; err = 0.9072319865226746; time = 0.013576030731201172
Train: Epoch 11; err = 0.8765920400619507; time = 0.014236211776733398
Train: Epoch 12; err = 0.8645085692405701; time = 0.014837026596069336
Train: Epoch 13; err = 0.87237948179245; time = 0.015115022659301758
Train: Epoch 14; er

In [40]:
preds = toy_mod.predict(toy_color_seqs_test, toy_word_seqs_test)
correct = sum([1 if x == 2 else 0 for x in preds])
print(correct, "/", len(preds), correct/len(preds))

26 / 38 0.6842105263157895


If that worked, then you can now try this model on SCC problems!

In [41]:
dev_color_mod = ColorizedNeuralListener(
    dev_vocab, 
    #embedding=dev_glove_embedding, 
    embed_dim=100,
    embedding=embedding,
    hidden_dim=100, 
    max_iter=500,
    batch_size=64,
    dropout_prob=0.,
    eta=0.005,
    lr_rate=0.96,
    warm_start=True,
    force_cpu=False)

Using cpu


In [42]:
_ = dev_color_mod.fit(dev_cols_train, dev_seqs_train)

ColorizedNeuralListenerEncoder cpu
ColorizedNeuralListenerEncoderDecoder cpu
Train: Epoch 1; err = 172.73405945301056; time = 2.005457878112793
Train: Epoch 2; err = 164.6625559926033; time = 2.011399984359741
Train: Epoch 3; err = 159.40583962202072; time = 2.0694618225097656
Train: Epoch 4; err = 156.94196170568466; time = 2.1623833179473877
Train: Epoch 5; err = 153.71348571777344; time = 2.159597158432007
Train: Epoch 6; err = 151.77801221609116; time = 2.248502016067505
Train: Epoch 7; err = 151.15116107463837; time = 2.187623977661133
Train: Epoch 8; err = 150.12537652254105; time = 2.2472281455993652
Train: Epoch 9; err = 148.91111654043198; time = 2.2541627883911133
Train: Epoch 10; err = 147.8804365992546; time = 2.2705888748168945
Train: Epoch 11; err = 147.38216990232468; time = 2.227753162384033
Train: Epoch 12; err = 147.5932126045227; time = 2.412477970123291
Train: Epoch 13; err = 146.85077250003815; time = 2.2686657905578613
Train: Epoch 14; err = 145.15704756975174; ti

Train: Epoch 110; err = 122.83343231678009; time = 2.8297810554504395
Train: Epoch 111; err = 123.14543253183365; time = 2.848889112472534
Train: Epoch 112; err = 123.55892890691757; time = 2.9393112659454346
Train: Epoch 113; err = 122.89370036125183; time = 2.8181040287017822
Train: Epoch 114; err = 122.61023271083832; time = 2.7814948558807373
Train: Epoch 115; err = 122.52508795261383; time = 2.8210718631744385
Train: Epoch 116; err = 122.1898283958435; time = 2.785043954849243
Train: Epoch 117; err = 122.06646406650543; time = 2.6946470737457275
Train: Epoch 118; err = 122.64717185497284; time = 2.776918649673462
Train: Epoch 119; err = 122.15429109334946; time = 2.6055917739868164
0.003606947894919167
tensor([0.0900, 0.0608, 0.8493], grad_fn=<MeanBackward1>) 0.6857959032058716
Train: Epoch 120; err = 122.32560449838638; time = 2.4888927936553955
Train: Epoch 121; err = 122.82581549882889; time = 2.548144817352295
Train: Epoch 122; err = 121.4633594751358; time = 2.581407070159912

Train: Epoch 219; err = 117.8572399020195; time = 2.7460718154907227
Train: Epoch 220; err = 117.59159523248672; time = 2.632805109024048
Train: Epoch 221; err = 118.17557549476624; time = 2.6236510276794434
Train: Epoch 222; err = 117.3917595744133; time = 2.5967681407928467
Train: Epoch 223; err = 117.64069038629532; time = 2.630458116531372
Train: Epoch 224; err = 117.45365226268768; time = 2.551806926727295
0.0027104318993045437
tensor([0.0693, 0.1437, 0.7870], grad_fn=<MeanBackward1>) 0.746882438659668
Train: Epoch 225; err = 117.33862388134003; time = 2.54649019241333
Train: Epoch 226; err = 117.33894658088684; time = 2.519274950027466
Train: Epoch 227; err = 117.19848906993866; time = 2.490694999694824
Train: Epoch 228; err = 116.92347627878189; time = 2.432830810546875
Train: Epoch 229; err = 117.42016166448593; time = 2.4875340461730957
Train: Epoch 230; err = 117.13965398073196; time = 2.486039161682129
Train: Epoch 231; err = 117.7436056137085; time = 2.4469308853149414
Trai

Train: Epoch 328; err = 115.42594057321548; time = 3.2626218795776367
Train: Epoch 329; err = 115.24288266897202; time = 3.181856155395508
0.0020367472153163093
tensor([0.1543, 0.1064, 0.7393], grad_fn=<MeanBackward1>) 0.7997725605964661
Train: Epoch 330; err = 115.53066271543503; time = 3.1442790031433105
Train: Epoch 331; err = 115.10750848054886; time = 3.130434989929199
Train: Epoch 332; err = 115.2725841999054; time = 3.0072782039642334
Train: Epoch 333; err = 115.04211854934692; time = 3.0877339839935303
Train: Epoch 334; err = 115.21911191940308; time = 3.0117359161376953
Train: Epoch 335; err = 115.6122055053711; time = 2.909853935241699
Train: Epoch 336; err = 115.32840079069138; time = 2.903993844985962
Train: Epoch 337; err = 116.18538790941238; time = 2.990070104598999
Train: Epoch 338; err = 115.77827286720276; time = 2.9947142601013184
Train: Epoch 339; err = 115.67922914028168; time = 3.1191461086273193
Train: Epoch 340; err = 115.27143037319183; time = 3.223014116287231

Train: Epoch 436; err = 114.09270018339157; time = 2.5489258766174316
Train: Epoch 437; err = 114.2434458732605; time = 2.588402032852173
Train: Epoch 438; err = 114.07116204500198; time = 2.580483913421631
Train: Epoch 439; err = 114.12103819847107; time = 2.5501868724823
Train: Epoch 440; err = 114.12211203575134; time = 2.570964813232422
Train: Epoch 441; err = 114.15007942914963; time = 2.563708782196045
Train: Epoch 442; err = 114.03636240959167; time = 2.592534065246582
Train: Epoch 443; err = 114.05013114213943; time = 2.643465995788574
Train: Epoch 444; err = 113.90775513648987; time = 2.6327271461486816
Train: Epoch 445; err = 114.3116842508316; time = 2.6236751079559326
Train: Epoch 446; err = 114.86062318086624; time = 2.7655909061431885
Train: Epoch 447; err = 114.53568398952484; time = 2.9142239093780518
Train: Epoch 448; err = 114.62177670001984; time = 3.0528581142425537
Train: Epoch 449; err = 113.87623065710068; time = 3.0456759929656982
0.0014692882161535274
tensor([0
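The bare floating-point values interleaved in the log above look like the current learning rate. They are numerically consistent with `eta` being multiplied by `lr_rate` once every 15 epochs. This is inferred from the printed numbers only, not from the source of `ColorizedNeuralListener`, and the function name `decayed_eta` and the `every` parameter are assumptions for this sketch.

```python
def decayed_eta(eta, lr_rate, epochs_completed, every=15):
    """Step-wise exponential decay: multiply eta by lr_rate once per
    `every` completed epochs."""
    return eta * lr_rate ** (epochs_completed // every)

# Values logged with eta=0.005 and lr_rate=0.96, keyed by the number of
# epochs completed when each value was printed.
logged = {
    120: 0.003606947894919167,
    225: 0.0027104318993045437,
    330: 0.0020367472153163093,
    450: 0.0014692882161535274,
}
for epochs, value in logged.items():
    print(epochs, decayed_eta(0.005, 0.96, epochs), value)
```

If this reading is right, a small `lr_rate` such as 0.05 would shrink the learning rate to a negligible value after the first few decay steps.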

In [43]:
test_preds = dev_color_mod.predict(dev_cols_test, dev_seqs_test)
#dev_color_mod.predict(dev_cols_test, dev_seqs_test, probabilities=True)
train_preds = dev_color_mod.predict(dev_cols_train, dev_seqs_train)
#dev_color_mod.predict(dev_cols_test, dev_seqs_test, probabilities=True)

In [45]:
correct = sum([1 if x == 2 else 0 for x in test_preds])
print("test", correct, "/", len(test_preds), correct/len(test_preds))
correct = sum([1 if x == 2 else 0 for x in train_preds])
print("train", correct, "/", len(train_preds), correct/len(train_preds))

test 2437 / 3473 0.70169881946444
train 8962 / 10417 0.8603244696169723
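The accuracy computation is repeated in several cells, so it could be factored into a small helper. This sketch assumes, as the cells above do, that predictions are color indices and that index 2 is the target. The name `target_accuracy` is hypothetical.

```python
def target_accuracy(preds, target_index=2):
    """Proportion of predictions that pick the target color."""
    preds = list(preds)
    if not preds:
        raise ValueError("No predictions to score.")
    return sum(int(p == target_index) for p in preds) / len(preds)

print(target_accuracy([2, 0, 2, 1]))
```

Applied to `train_preds` and `test_preds`, this would give the two proportions printed above. The gap between them (about 0.86 on train versus about 0.70 on test) suggests that the model is overfitting to some degree.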


In [52]:
# Baseline settings; each named config overrides at most one of them.
base_config = {
    "batch_size": 64,
    "dropout_prob": 0.0,
    "eta": 0.005,
    "lr_rate": 0.96,
}

config_overrides = {
    "batch_medium": {},
    "batch_small": {"batch_size": 32},
    "batch_xsmall": {"batch_size": 16},
    "batch_large": {"batch_size": 128},
    "batch_xlarge": {"batch_size": 256},
    "dropout_small": {"dropout_prob": 0.1},
    "dropout_large": {"dropout_prob": 0.3},
    "eta_small": {"eta": 0.0005},
    "eta_medium": {"eta": 0.01},
    "eta_large": {"eta": 0.05},
    "lr_small": {"lr_rate": 0.05},
    "lr_medium": {"lr_rate": 0.4},
    "lr_large": {"lr_rate": 0.99},
}

hyperparameter_exploration = {
    name: {**base_config, **changes}
    for name, changes in config_overrides.items()}

model_exploration = {}
model_fittings = {}
for config in hyperparameter_exploration:
    print(f'\nTraining for config: {config}')
    model_exploration[config] = ColorizedNeuralListener(
        dev_vocab, 
        embed_dim=100,
        embedding=embedding,
        hidden_dim=100, 
        max_iter=400,
        warm_start=True,
        force_cpu=False, **hyperparameter_exploration[config])
    model_fittings[config] = model_exploration[config].fit(dev_cols_train, dev_seqs_train)


Training for config: batch_medium
Using cpu
ColorizedNeuralListenerEncoder cpu
ColorizedNeuralListenerEncoderDecoder cpu
Train: Epoch 1; err = 172.4085202217102; time = 1.9026310443878174
Train: Epoch 2; err = 163.1066757440567; time = 1.9381968975067139
Train: Epoch 3; err = 159.3544823527336; time = 1.8862369060516357
Train: Epoch 4; err = 156.59143388271332; time = 1.910494089126587
Train: Epoch 5; err = 154.4408074617386; time = 2.0201563835144043
Train: Epoch 6; err = 153.5412660241127; time = 2.045928955078125
Train: Epoch 7; err = 152.05450975894928; time = 2.0667338371276855
Train: Epoch 8; err = 151.10628414154053; time = 2.019474983215332
Train: Epoch 9; err = 150.33194142580032; time = 2.0745432376861572
Train: Epoch 10; err = 149.6664039492607; time = 2.1198348999023438
Train: Epoch 11; err = 150.09579610824585; time = 2.1080482006073
Train: Epoch 12; err = 149.47756612300873; time = 2.155155897140503
Train: Epoch 13; err = 147.82903039455414; time = 2.1725730895996094
Tra

Train: Epoch 110; err = 124.1750248670578; time = 2.770617961883545
Train: Epoch 111; err = 123.77630805969238; time = 2.8097243309020996
Train: Epoch 112; err = 124.31997841596603; time = 2.8579342365264893
Train: Epoch 113; err = 124.07885468006134; time = 2.984311819076538
Train: Epoch 114; err = 124.22640067338943; time = 2.9559879302978516
Train: Epoch 115; err = 124.45721304416656; time = 2.9508988857269287
Train: Epoch 116; err = 123.63563448190689; time = 2.8443820476531982
Train: Epoch 117; err = 122.85251623392105; time = 2.6955888271331787
Train: Epoch 118; err = 123.96957570314407; time = 2.5170552730560303
Train: Epoch 119; err = 123.94109219312668; time = 2.4478278160095215
0.003606947894919167
tensor([0.1234, 0.1469, 0.7297], grad_fn=<MeanBackward1>) 0.8117360472679138
Train: Epoch 120; err = 123.46891552209854; time = 2.358455181121826
Train: Epoch 121; err = 122.72969734668732; time = 2.3089840412139893
Train: Epoch 122; err = 123.51391178369522; time = 2.2747161388397

Train: Epoch 219; err = 119.80273312330246; time = 2.4867289066314697
Train: Epoch 220; err = 119.26295530796051; time = 2.4214260578155518
Train: Epoch 221; err = 119.87914246320724; time = 2.4540328979492188
Train: Epoch 222; err = 120.74695187807083; time = 2.363023281097412
Train: Epoch 223; err = 119.94810795783997; time = 2.3910257816314697
Train: Epoch 224; err = 119.91062599420547; time = 2.322575092315674
0.0027104318993045437
tensor([0.1132, 0.0853, 0.8015], grad_fn=<MeanBackward1>) 0.7346795797348022
Train: Epoch 225; err = 119.6625167131424; time = 2.327266216278076
Train: Epoch 226; err = 119.35956281423569; time = 2.312695026397705
Train: Epoch 227; err = 118.94033843278885; time = 2.2695250511169434
Train: Epoch 228; err = 118.6839325428009; time = 2.2539830207824707
Train: Epoch 229; err = 118.97984826564789; time = 2.2742412090301514
Train: Epoch 230; err = 118.8381689786911; time = 2.31502103805542
Train: Epoch 231; err = 119.01726430654526; time = 2.2976088523864746


Train: Epoch 328; err = 117.1567901968956; time = 2.3762741088867188
Train: Epoch 329; err = 116.9958900809288; time = 2.3632149696350098
0.0020367472153163093
tensor([0.1284, 0.0486, 0.8230], grad_fn=<MeanBackward1>) 0.7169910073280334
Train: Epoch 330; err = 117.15183299779892; time = 2.3852882385253906
Train: Epoch 331; err = 116.96791714429855; time = 2.3623158931732178
Train: Epoch 332; err = 117.26266252994537; time = 2.3805341720581055
Train: Epoch 333; err = 116.92461436986923; time = 2.364043712615967
Train: Epoch 334; err = 117.20121556520462; time = 2.3708059787750244
Train: Epoch 335; err = 117.37174379825592; time = 2.444483995437622
Train: Epoch 336; err = 116.98427188396454; time = 2.4567487239837646
Train: Epoch 337; err = 117.26569133996964; time = 2.466336965560913
Train: Epoch 338; err = 117.00214296579361; time = 2.5048630237579346
Train: Epoch 339; err = 117.70784336328506; time = 2.5522191524505615
Train: Epoch 340; err = 117.52807360887527; time = 2.5126247406005

Train: Epoch 36; err = 266.4420014619827; time = 2.988229751586914
Train: Epoch 37; err = 266.4993517398834; time = 2.8967559337615967
Train: Epoch 38; err = 264.99391639232635; time = 2.9232068061828613
Train: Epoch 39; err = 266.83353340625763; time = 2.9137189388275146
Train: Epoch 40; err = 271.2175948023796; time = 2.9002459049224854
Train: Epoch 41; err = 262.7966698408127; time = 2.917358875274658
Train: Epoch 42; err = 263.34832590818405; time = 2.8705012798309326
Train: Epoch 43; err = 264.0669021010399; time = 2.9269349575042725
Train: Epoch 44; err = 263.39954620599747; time = 2.904902935028076
0.004423679999999999
tensor([0.1006, 0.0822, 0.8172], grad_fn=<MeanBackward1>) 0.7098647952079773
Train: Epoch 45; err = 260.7729706168175; time = 3.002161979675293
Train: Epoch 46; err = 260.3976284265518; time = 3.0469791889190674
Train: Epoch 47; err = 260.45979714393616; time = 3.115813970565796
Train: Epoch 48; err = 260.04351633787155; time = 3.1853020191192627
Train: Epoch 49; 

Train: Epoch 146; err = 239.0605805516243; time = 3.1115882396698
Train: Epoch 147; err = 238.67195719480515; time = 3.0716261863708496
Train: Epoch 148; err = 238.44701021909714; time = 3.0308780670166016
Train: Epoch 149; err = 237.5946220755577; time = 3.072249174118042
0.003324163179957504
tensor([0.0626, 0.1528, 0.7846], grad_fn=<MeanBackward1>) 0.758070707321167
Train: Epoch 150; err = 236.86959666013718; time = 3.1048901081085205
Train: Epoch 151; err = 237.95092153549194; time = 3.16526198387146
Train: Epoch 152; err = 237.28438472747803; time = 3.0334560871124268
Train: Epoch 153; err = 237.84713715314865; time = 3.0583808422088623
Train: Epoch 154; err = 238.28770965337753; time = 3.057784080505371
Train: Epoch 155; err = 240.91111624240875; time = 3.0348198413848877
Train: Epoch 156; err = 239.54686039686203; time = 3.020603895187378
Train: Epoch 157; err = 238.98741537332535; time = 3.1733310222625732
Train: Epoch 158; err = 236.97763234376907; time = 3.192422866821289
Trai

0.0024979340383990676
tensor([0.1224, 0.0244, 0.8532], grad_fn=<MeanBackward1>) 0.6913303136825562
Train: Epoch 255; err = 231.0133991241455; time = 3.231912136077881
Train: Epoch 256; err = 230.83122205734253; time = 3.1993248462677
Train: Epoch 257; err = 230.87837398052216; time = 3.2422430515289307
Train: Epoch 258; err = 230.4906341433525; time = 3.1964218616485596
Train: Epoch 259; err = 230.51456427574158; time = 3.217777729034424
Train: Epoch 260; err = 232.08217829465866; time = 3.2184839248657227
Train: Epoch 261; err = 232.03289413452148; time = 3.164186954498291
Train: Epoch 262; err = 231.82072961330414; time = 3.1761910915374756
Train: Epoch 263; err = 231.61049658060074; time = 3.180086851119995
Train: Epoch 264; err = 230.95741373300552; time = 3.240063190460205
Train: Epoch 265; err = 231.29740542173386; time = 3.2208118438720703
Train: Epoch 266; err = 231.5162469148636; time = 3.229968309402466
Train: Epoch 267; err = 232.26828479766846; time = 3.2363061904907227
Tra

Train: Epoch 362; err = 228.4721458554268; time = 3.2530171871185303
Train: Epoch 363; err = 228.22254329919815; time = 3.212351083755493
Train: Epoch 364; err = 227.5647241473198; time = 3.2314438819885254
Train: Epoch 365; err = 229.42934775352478; time = 3.2427940368652344
Train: Epoch 366; err = 228.74811601638794; time = 3.1409730911254883
Train: Epoch 367; err = 229.24781346321106; time = 3.2748379707336426
Train: Epoch 368; err = 227.931758582592; time = 3.2607240676879883
Train: Epoch 369; err = 227.1858925819397; time = 3.1778719425201416
Train: Epoch 370; err = 227.66336411237717; time = 3.2565629482269287
Train: Epoch 371; err = 227.53134089708328; time = 3.181697130203247
Train: Epoch 372; err = 227.78206753730774; time = 3.2030510902404785
Train: Epoch 373; err = 228.07389426231384; time = 3.2265257835388184
Train: Epoch 374; err = 227.81866878271103; time = 3.2445061206817627
0.00180198358429009
tensor([0.1502, 0.0579, 0.7918], grad_fn=<MeanBackward1>) 0.7530935406684875


Train: Epoch 72; err = 506.9566775560379; time = 4.647888898849487
Train: Epoch 73; err = 513.3322265744209; time = 4.382638216018677
Train: Epoch 74; err = 502.7636184692383; time = 4.438126802444458
0.004076863487999999
tensor([2.0352e-12, 1.0065e-18, 1.0000e+00], grad_fn=<MeanBackward1>) 0.5514447093009949
Train: Epoch 75; err = 498.5547252893448; time = 4.436722040176392
Train: Epoch 76; err = 499.3145307302475; time = 4.402345180511475
Train: Epoch 77; err = 499.6700849533081; time = 4.453946828842163
Train: Epoch 78; err = 496.1575842499733; time = 4.460522890090942
Train: Epoch 79; err = 495.7898290157318; time = 4.402024984359741
Train: Epoch 80; err = 495.62304651737213; time = 4.448755979537964
Train: Epoch 81; err = 500.16941863298416; time = 4.432720899581909
Train: Epoch 82; err = 495.7653725743294; time = 4.42321515083313
Train: Epoch 83; err = 501.2105879187584; time = 4.421267032623291
Train: Epoch 84; err = 508.12695026397705; time = 4.441049814224243
Train: Epoch 85; 

Train: Epoch 181; err = 474.21362894773483; time = 4.591544151306152
Train: Epoch 182; err = 473.8145781159401; time = 4.726147174835205
Train: Epoch 183; err = 476.9692293405533; time = 4.66487193107605
Train: Epoch 184; err = 484.145891726017; time = 4.620847940444946
Train: Epoch 185; err = 475.0236847400665; time = 4.581180810928345
Train: Epoch 186; err = 475.22868824005127; time = 4.606388092041016
Train: Epoch 187; err = 473.18153989315033; time = 4.851457118988037
Train: Epoch 188; err = 473.65484887361526; time = 4.869959115982056
Train: Epoch 189; err = 472.64319360256195; time = 4.8405938148498535
Train: Epoch 190; err = 473.26510632038116; time = 4.557305812835693
Train: Epoch 191; err = 473.5797002315521; time = 4.605724096298218
Train: Epoch 192; err = 473.3297877907753; time = 4.558164834976196
Train: Epoch 193; err = 474.05201798677444; time = 4.51834511756897
Train: Epoch 194; err = 471.11244225502014; time = 4.558001756668091
0.002941006835182882
tensor([5.5592e-02, 6

Train: Epoch 290; err = 463.16593861579895; time = 4.791747808456421
Train: Epoch 291; err = 463.4593493938446; time = 4.514563083648682
Train: Epoch 292; err = 463.8458684682846; time = 4.529588937759399
Train: Epoch 293; err = 462.3357181549072; time = 4.521372079849243
Train: Epoch 294; err = 461.3580062389374; time = 4.59481406211853
Train: Epoch 295; err = 462.7314594388008; time = 4.627978086471558
Train: Epoch 296; err = 462.5102521777153; time = 4.633054256439209
Train: Epoch 297; err = 461.2955161333084; time = 4.53462815284729
Train: Epoch 298; err = 462.90609961748123; time = 4.537220001220703
Train: Epoch 299; err = 462.0858837366104; time = 4.5847649574279785
0.002210012169397037
tensor([0.0000e+00, 1.4051e-11, 1.0000e+00], grad_fn=<MeanBackward1>) 0.5514447093009949
Train: Epoch 300; err = 466.00640296936035; time = 4.544975280761719
Train: Epoch 301; err = 462.00154834985733; time = 4.510313987731934
Train: Epoch 302; err = 460.1344481706619; time = 4.50146222114563
Trai

Train: Epoch 399; err = 455.3525087237358; time = 4.556226015090942
Train: Epoch 400; err = 455.41910725831985; time = 4.59233283996582

Training for config: batch_large
Using cpu
ColorizedNeuralListenerEncoder cpu
ColorizedNeuralListenerEncoderDecoder cpu
Train: Epoch 1; err = 87.37073820829391; time = 1.799379825592041
Train: Epoch 2; err = 84.7578387260437; time = 1.7968549728393555
Train: Epoch 3; err = 81.96560603380203; time = 1.8042271137237549
Train: Epoch 4; err = 80.1120982170105; time = 1.768920660018921
Train: Epoch 5; err = 79.09020602703094; time = 1.771657943725586
Train: Epoch 6; err = 79.82344061136246; time = 1.8069818019866943
Train: Epoch 7; err = 77.61268502473831; time = 1.794360876083374
Train: Epoch 8; err = 77.28653520345688; time = 1.7840838432312012
Train: Epoch 9; err = 76.88307726383209; time = 1.7857038974761963
Train: Epoch 10; err = 76.40439343452454; time = 1.8228230476379395
Train: Epoch 11; err = 77.37224900722504; time = 1.781484842300415
Train: Epoc

Train: Epoch 109; err = 63.18418312072754; time = 1.8503057956695557
Train: Epoch 110; err = 64.05094754695892; time = 1.889854907989502
Train: Epoch 111; err = 63.82304668426514; time = 1.8813908100128174
Train: Epoch 112; err = 63.82453155517578; time = 1.91019606590271
Train: Epoch 113; err = 63.86874508857727; time = 1.8533248901367188
Train: Epoch 114; err = 63.46858114004135; time = 1.827901840209961
Train: Epoch 115; err = 63.440477669239044; time = 1.8341259956359863
Train: Epoch 116; err = 63.06795132160187; time = 1.8230090141296387
Train: Epoch 117; err = 63.52537500858307; time = 1.837231159210205
Train: Epoch 118; err = 63.189080595970154; time = 1.8577373027801514
Train: Epoch 119; err = 62.84174728393555; time = 1.814673900604248
0.003606947894919167
tensor([0.1364, 0.1356, 0.7279], grad_fn=<MeanBackward1>) 0.8096221089363098
Train: Epoch 120; err = 63.07891637086868; time = 1.8315849304199219
Train: Epoch 121; err = 63.05487269163132; time = 1.8064937591552734
Train: Ep

Train: Epoch 219; err = 61.427520871162415; time = 1.878688097000122
Train: Epoch 220; err = 60.8333854675293; time = 1.8822531700134277
Train: Epoch 221; err = 60.78331995010376; time = 1.8346848487854004
Train: Epoch 222; err = 60.41019994020462; time = 1.8578288555145264
Train: Epoch 223; err = 60.36415421962738; time = 1.8852617740631104
Train: Epoch 224; err = 60.32142263650894; time = 1.8938589096069336
0.0027104318993045437
tensor([0.1471, 0.1451, 0.7079], grad_fn=<MeanBackward1>) 0.8297202587127686
Train: Epoch 225; err = 60.29588484764099; time = 1.913844108581543
Train: Epoch 226; err = 60.36318665742874; time = 1.9170057773590088
Train: Epoch 227; err = 60.247194945812225; time = 1.8847160339355469
Train: Epoch 228; err = 60.12211334705353; time = 1.8945751190185547
Train: Epoch 229; err = 60.0177218914032; time = 1.8926799297332764
Train: Epoch 230; err = 59.977897346019745; time = 1.9033019542694092
Train: Epoch 231; err = 62.09295493364334; time = 1.846747875213623
Train:

Train: Epoch 328; err = 59.19956690073013; time = 1.9315288066864014
Train: Epoch 329; err = 59.165461122989655; time = 1.9307239055633545
0.0020367472153163093
tensor([0.1290, 0.1280, 0.7430], grad_fn=<MeanBackward1>) 0.8013822436332703
Train: Epoch 330; err = 59.08205020427704; time = 1.9293098449707031
Train: Epoch 331; err = 58.8167182803154; time = 1.9425568580627441
Train: Epoch 332; err = 60.454160928726196; time = 1.9178760051727295
Train: Epoch 333; err = 60.07485610246658; time = 1.9477078914642334
Train: Epoch 334; err = 59.69580012559891; time = 1.922166109085083
Train: Epoch 335; err = 59.23557251691818; time = 1.937690019607544
Train: Epoch 336; err = 59.14467012882233; time = 1.9567978382110596
Train: Epoch 337; err = 58.88284695148468; time = 1.9396109580993652
Train: Epoch 338; err = 59.39306443929672; time = 1.9620401859283447
Train: Epoch 339; err = 59.66187208890915; time = 1.9521470069885254
Train: Epoch 340; err = 58.862776815891266; time = 1.8948187828063965
Trai

Train: Epoch 36; err = 36.24827438592911; time = 1.5677270889282227
Train: Epoch 37; err = 36.191568315029144; time = 1.5780560970306396
Train: Epoch 38; err = 36.00677663087845; time = 1.5447862148284912
Train: Epoch 39; err = 36.16826939582825; time = 1.5606558322906494
Train: Epoch 40; err = 35.99359887838364; time = 1.643967866897583
Train: Epoch 41; err = 35.85054391622543; time = 1.5596661567687988
Train: Epoch 42; err = 35.94732481241226; time = 1.5881478786468506
Train: Epoch 43; err = 35.779068410396576; time = 1.6016321182250977
Train: Epoch 44; err = 35.73152709007263; time = 1.5661792755126953
0.004423679999999999
tensor([0.1839, 0.1781, 0.6380], grad_fn=<MeanBackward1>) 0.8743948936462402
Train: Epoch 45; err = 35.66848981380463; time = 1.5895040035247803
Train: Epoch 46; err = 35.57622140645981; time = 1.5906548500061035
Train: Epoch 47; err = 35.38685703277588; time = 1.583441972732544
Train: Epoch 48; err = 35.382991313934326; time = 1.5507609844207764
Train: Epoch 49; 

Train: Epoch 146; err = 31.94400030374527; time = 1.6302812099456787
Train: Epoch 147; err = 31.758779048919678; time = 1.674004077911377
Train: Epoch 148; err = 31.72279864549637; time = 1.6174159049987793
Train: Epoch 149; err = 31.689376711845398; time = 1.6121931076049805
0.003324163179957504
tensor([0.1092, 0.1169, 0.7739], grad_fn=<MeanBackward1>) 0.7552047967910767
Train: Epoch 150; err = 31.788796961307526; time = 1.6071031093597412
Train: Epoch 151; err = 31.703272342681885; time = 1.6181349754333496
Train: Epoch 152; err = 31.753090858459473; time = 1.6390042304992676
Train: Epoch 153; err = 31.683014631271362; time = 1.6115999221801758
Train: Epoch 154; err = 31.536756455898285; time = 1.5980322360992432
Train: Epoch 155; err = 31.44390118122101; time = 1.5985510349273682
Train: Epoch 156; err = 31.458579421043396; time = 1.6263539791107178
Train: Epoch 157; err = 31.571496784687042; time = 1.6156821250915527
Train: Epoch 158; err = 31.526220083236694; time = 1.6072061061859

0.0024979340383990676
tensor([0.1065, 0.0917, 0.8017], grad_fn=<MeanBackward1>) 0.7357764840126038
Train: Epoch 255; err = 30.70400959253311; time = 1.625124216079712
Train: Epoch 256; err = 30.46892422437668; time = 1.610450029373169
Train: Epoch 257; err = 30.50845068693161; time = 1.6330430507659912
Train: Epoch 258; err = 30.43673539161682; time = 1.630760908126831
Train: Epoch 259; err = 30.384515821933746; time = 1.6555750370025635
Train: Epoch 260; err = 30.38917076587677; time = 1.699164867401123
Train: Epoch 261; err = 30.346210420131683; time = 1.6425130367279053
Train: Epoch 262; err = 30.403152644634247; time = 1.6363048553466797
Train: Epoch 263; err = 30.293351113796234; time = 1.6450369358062744
Train: Epoch 264; err = 30.333603620529175; time = 1.634488821029663
Train: Epoch 265; err = 30.387780964374542; time = 1.6190659999847412
Train: Epoch 266; err = 30.596715688705444; time = 1.6263530254364014
Train: Epoch 267; err = 30.608726143836975; time = 1.6605429649353027
T

Train: Epoch 362; err = 29.85229468345642; time = 1.7135870456695557
Train: Epoch 363; err = 29.90709751844406; time = 1.7226810455322266
Train: Epoch 364; err = 29.91406399011612; time = 1.676335096359253
Train: Epoch 365; err = 29.805643558502197; time = 1.6810181140899658
Train: Epoch 366; err = 29.750112533569336; time = 1.6815330982208252
Train: Epoch 367; err = 29.762202322483063; time = 1.6767268180847168
Train: Epoch 368; err = 29.88146311044693; time = 1.6691770553588867
Train: Epoch 369; err = 29.910292208194733; time = 1.7097859382629395
Train: Epoch 370; err = 29.881310880184174; time = 1.670008659362793
Train: Epoch 371; err = 29.781645596027374; time = 1.654313325881958
Train: Epoch 372; err = 29.76809734106064; time = 1.6835689544677734
Train: Epoch 373; err = 29.79858100414276; time = 1.6530380249023438
Train: Epoch 374; err = 29.838784992694855; time = 1.6573939323425293
0.00180198358429009
tensor([0.1321, 0.1046, 0.7633], grad_fn=<MeanBackward1>) 0.7745950222015381
Tr

Train: Epoch 72; err = 138.17186826467514; time = 2.220407247543335
Train: Epoch 73; err = 138.03243273496628; time = 2.193997859954834
Train: Epoch 74; err = 138.16805535554886; time = 2.2439889907836914
0.004076863487999999
tensor([0.1735, 0.0762, 0.7503], grad_fn=<MeanBackward1>) 0.7751052379608154
Train: Epoch 75; err = 137.8197284936905; time = 2.1962480545043945
Train: Epoch 76; err = 137.33520311117172; time = 2.251849889755249
Train: Epoch 77; err = 137.51989340782166; time = 2.2449138164520264
Train: Epoch 78; err = 137.18295496702194; time = 2.2439849376678467
Train: Epoch 79; err = 137.55106616020203; time = 2.259885787963867
Train: Epoch 80; err = 137.107859313488; time = 2.267648935317993
Train: Epoch 81; err = 136.8421002626419; time = 2.254256010055542
Train: Epoch 82; err = 136.680444419384; time = 2.2631120681762695
Train: Epoch 83; err = 138.43661165237427; time = 2.246368646621704
Train: Epoch 84; err = 137.025026679039; time = 2.2027552127838135
Train: Epoch 85; err

Train: Epoch 181; err = 130.50498795509338; time = 2.2448933124542236
Train: Epoch 182; err = 130.58989983797073; time = 2.2699408531188965
Train: Epoch 183; err = 131.32978910207748; time = 2.2766778469085693
Train: Epoch 184; err = 130.85800057649612; time = 2.2780568599700928
Train: Epoch 185; err = 129.97735315561295; time = 2.2789220809936523
Train: Epoch 186; err = 130.36985212564468; time = 2.297783136367798
Train: Epoch 187; err = 129.91520589590073; time = 2.2440550327301025
Train: Epoch 188; err = 129.89654737710953; time = 2.2726328372955322
Train: Epoch 189; err = 129.78745752573013; time = 2.250831127166748
Train: Epoch 190; err = 129.87127989530563; time = 2.2711827754974365
Train: Epoch 191; err = 130.13989132642746; time = 2.3051559925079346
Train: Epoch 192; err = 130.67887139320374; time = 2.31597900390625
Train: Epoch 193; err = 130.2488141655922; time = 2.338326930999756
Train: Epoch 194; err = 129.95470428466797; time = 2.31162691116333
0.002941006835182882
tensor(

Train: Epoch 290; err = 127.69450241327286; time = 2.2865841388702393
Train: Epoch 291; err = 127.68731158971786; time = 2.323927879333496
Train: Epoch 292; err = 127.79021990299225; time = 2.319132089614868
Train: Epoch 293; err = 127.54123598337173; time = 2.3305299282073975
Train: Epoch 294; err = 126.75446206331253; time = 2.3274638652801514
Train: Epoch 295; err = 127.01136595010757; time = 2.3138248920440674
Train: Epoch 296; err = 127.14024257659912; time = 2.3113369941711426
Train: Epoch 297; err = 127.30875462293625; time = 2.269798755645752
Train: Epoch 298; err = 126.9938822388649; time = 2.2712900638580322
Train: Epoch 299; err = 127.19734424352646; time = 2.2602550983428955
0.002210012169397037
tensor([0.0600, 0.1859, 0.7541], grad_fn=<MeanBackward1>) 0.780566394329071
Train: Epoch 300; err = 126.98234361410141; time = 2.2928168773651123
Train: Epoch 301; err = 127.06040120124817; time = 2.308912754058838
Train: Epoch 302; err = 127.22604835033417; time = 2.278548955917358

Train: Epoch 399; err = 125.4786929488182; time = 2.3699707984924316
Train: Epoch 400; err = 125.68136900663376; time = 2.295217275619507

Training for config: dropout_large
Using cpu
ColorizedNeuralListenerEncoder cpu
ColorizedNeuralListenerEncoderDecoder cpu
Train: Epoch 1; err = 173.32180392742157; time = 2.1666648387908936
Train: Epoch 2; err = 167.6329665184021; time = 2.201144218444824
Train: Epoch 3; err = 163.09282290935516; time = 2.203233003616333
Train: Epoch 4; err = 160.47885262966156; time = 2.1597187519073486
Train: Epoch 5; err = 159.20955634117126; time = 2.183666229248047
Train: Epoch 6; err = 157.46846640110016; time = 2.177856922149658
Train: Epoch 7; err = 156.67616075277328; time = 2.2031030654907227
Train: Epoch 8; err = 155.92827928066254; time = 2.198446750640869
Train: Epoch 9; err = 155.53340935707092; time = 2.208916187286377
Train: Epoch 10; err = 154.76886028051376; time = 2.150759220123291
Train: Epoch 11; err = 154.62547147274017; time = 2.11170578002929

Train: Epoch 108; err = 139.01806008815765; time = 2.21156907081604
Train: Epoch 109; err = 139.29745733737946; time = 2.2051620483398438
Train: Epoch 110; err = 139.06802207231522; time = 2.235517978668213
Train: Epoch 111; err = 139.1582260131836; time = 2.228804111480713
Train: Epoch 112; err = 139.3508147597313; time = 2.1810622215270996
Train: Epoch 113; err = 139.16580080986023; time = 2.242374897003174
Train: Epoch 114; err = 138.95052027702332; time = 2.2317099571228027
Train: Epoch 115; err = 138.87312293052673; time = 2.190887928009033
Train: Epoch 116; err = 138.40402340888977; time = 2.24493408203125
Train: Epoch 117; err = 138.0894819498062; time = 2.1998720169067383
Train: Epoch 118; err = 138.74777227640152; time = 2.192845106124878
Train: Epoch 119; err = 138.93647253513336; time = 2.1876001358032227
0.003606947894919167
tensor([0.1234, 0.2815, 0.5951], grad_fn=<MeanBackward1>) 0.9287171363830566
Train: Epoch 120; err = 138.78544771671295; time = 2.25256609916687
Train:

Train: Epoch 217; err = 135.19915342330933; time = 2.1993300914764404
Train: Epoch 218; err = 134.96877843141556; time = 2.2220370769500732
Train: Epoch 219; err = 135.733662545681; time = 2.234041929244995
Train: Epoch 220; err = 135.22948390245438; time = 2.185433864593506
Train: Epoch 221; err = 134.964681327343; time = 2.177103042602539
Train: Epoch 222; err = 135.12178963422775; time = 2.194782018661499
Train: Epoch 223; err = 135.16930878162384; time = 2.1564319133758545
Train: Epoch 224; err = 134.836461186409; time = 2.241990804672241
0.0027104318993045437
tensor([0.1932, 0.1200, 0.6868], grad_fn=<MeanBackward1>) 0.8302444815635681
Train: Epoch 225; err = 134.79494416713715; time = 2.1780290603637695
Train: Epoch 226; err = 134.80968171358109; time = 2.1994502544403076
Train: Epoch 227; err = 134.8948226571083; time = 2.224207878112793
Train: Epoch 228; err = 134.37443125247955; time = 2.2367448806762695
Train: Epoch 229; err = 134.89284843206406; time = 2.203735113143921
Train

Train: Epoch 326; err = 132.48383170366287; time = 2.2887017726898193
Train: Epoch 327; err = 132.2242909669876; time = 2.294069766998291
Train: Epoch 328; err = 132.12499582767487; time = 2.2713160514831543
Train: Epoch 329; err = 133.14007312059402; time = 2.23044490814209
0.0020367472153163093
tensor([0.1518, 0.1273, 0.7209], grad_fn=<MeanBackward1>) 0.8068379163742065
Train: Epoch 330; err = 132.05640137195587; time = 2.2338240146636963
Train: Epoch 331; err = 132.59054243564606; time = 2.22422194480896
Train: Epoch 332; err = 132.34220427274704; time = 2.222167730331421
Train: Epoch 333; err = 132.33116227388382; time = 2.231531858444214
Train: Epoch 334; err = 132.01626980304718; time = 2.2451210021972656
Train: Epoch 335; err = 132.0142759680748; time = 2.2155230045318604
Train: Epoch 336; err = 132.74975109100342; time = 2.240872859954834
Train: Epoch 337; err = 132.4687227010727; time = 2.226261854171753
Train: Epoch 338; err = 132.11166620254517; time = 2.2529120445251465
Tra

Train: Epoch 34; err = 151.31159526109695; time = 2.1690940856933594
Train: Epoch 35; err = 150.8862009048462; time = 2.1759090423583984
Train: Epoch 36; err = 150.903582572937; time = 2.142073154449463
Train: Epoch 37; err = 151.37847757339478; time = 2.1227519512176514
Train: Epoch 38; err = 150.54011487960815; time = 2.166853189468384
Train: Epoch 39; err = 150.84447944164276; time = 2.1857988834381104
Train: Epoch 40; err = 150.41001081466675; time = 2.1482558250427246
Train: Epoch 41; err = 150.33770221471786; time = 2.1746256351470947
Train: Epoch 42; err = 149.85742831230164; time = 2.122730016708374
Train: Epoch 43; err = 150.03810441493988; time = 2.1622519493103027
Train: Epoch 44; err = 150.31632781028748; time = 2.137789011001587
0.00044236799999999995
tensor([0.2988, 0.1460, 0.5552], grad_fn=<MeanBackward1>) 0.9344802498817444
Train: Epoch 45; err = 149.83323109149933; time = 2.119863986968994
Train: Epoch 46; err = 150.0646631717682; time = 2.1401572227478027
Train: Epoch

Train: Epoch 144; err = 142.89966535568237; time = 2.149587869644165
Train: Epoch 145; err = 142.60997760295868; time = 2.1016359329223633
Train: Epoch 146; err = 142.23328864574432; time = 2.159675121307373
Train: Epoch 147; err = 142.42229825258255; time = 2.1480939388275146
Train: Epoch 148; err = 142.2211212515831; time = 2.1858692169189453
Train: Epoch 149; err = 142.58596086502075; time = 2.145942211151123
0.0003324163179957504
tensor([0.1601, 0.2790, 0.5608], grad_fn=<MeanBackward1>) 0.9580060839653015
Train: Epoch 150; err = 142.5980304479599; time = 2.1461281776428223
Train: Epoch 151; err = 142.25664722919464; time = 2.1900901794433594
Train: Epoch 152; err = 142.3813797235489; time = 2.213020086288452
Train: Epoch 153; err = 142.23427516222; time = 2.181260108947754
Train: Epoch 154; err = 142.55189168453217; time = 2.2171599864959717
Train: Epoch 155; err = 142.73984825611115; time = 2.148405075073242
Train: Epoch 156; err = 142.16852056980133; time = 2.1609280109405518
Tra

Train: Epoch 253; err = 138.97482633590698; time = 2.180622100830078
Train: Epoch 254; err = 139.0799262523651; time = 2.1789519786834717
0.00024979340383990666
tensor([0.1239, 0.2725, 0.6036], grad_fn=<MeanBackward1>) 0.9208188652992249
Train: Epoch 255; err = 139.3042578101158; time = 2.151921272277832
Train: Epoch 256; err = 138.96373224258423; time = 2.140744924545288
Train: Epoch 257; err = 139.00957864522934; time = 2.1707777976989746
Train: Epoch 258; err = 138.81687778234482; time = 2.1798670291900635
Train: Epoch 259; err = 138.88202291727066; time = 2.1665937900543213
Train: Epoch 260; err = 138.78374511003494; time = 2.164757013320923
Train: Epoch 261; err = 138.85375893115997; time = 2.189059019088745
Train: Epoch 262; err = 138.84209787845612; time = 2.202069044113159
Train: Epoch 263; err = 138.5425969362259; time = 2.2119810581207275
Train: Epoch 264; err = 138.82846474647522; time = 2.1858901977539062
Train: Epoch 265; err = 138.5621992945671; time = 2.2030088901519775


Train: Epoch 361; err = 136.57344311475754; time = 2.1893818378448486
Train: Epoch 362; err = 136.31658524274826; time = 2.173459053039551
Train: Epoch 363; err = 136.668177485466; time = 2.1528189182281494
Train: Epoch 364; err = 136.40878373384476; time = 2.171135187149048
Train: Epoch 365; err = 136.4281229376793; time = 2.1638541221618652
Train: Epoch 366; err = 136.37763333320618; time = 2.2089099884033203
Train: Epoch 367; err = 136.34638285636902; time = 2.1588120460510254
Train: Epoch 368; err = 136.4579005241394; time = 2.1650607585906982
Train: Epoch 369; err = 136.68502908945084; time = 2.208899974822998
Train: Epoch 370; err = 136.38200986385345; time = 2.1555399894714355
Train: Epoch 371; err = 136.73563063144684; time = 2.1809170246124268
Train: Epoch 372; err = 136.26801198720932; time = 2.195681095123291
Train: Epoch 373; err = 136.6114137172699; time = 2.156008005142212
Train: Epoch 374; err = 136.73225861787796; time = 2.2037408351898193
0.00018019835842900898
tensor(

Train: Epoch 71; err = 124.64714002609253; time = 2.228607177734375
Train: Epoch 72; err = 125.23513317108154; time = 2.1998250484466553
Train: Epoch 73; err = 124.47666227817535; time = 2.186373233795166
Train: Epoch 74; err = 123.87888443470001; time = 2.213836193084717
0.008153726975999998
tensor([0.1750, 0.1134, 0.7116], grad_fn=<MeanBackward1>) 0.8136171698570251
Train: Epoch 75; err = 124.82277250289917; time = 2.2253291606903076
Train: Epoch 76; err = 124.23177641630173; time = 2.260913848876953
Train: Epoch 77; err = 123.28556180000305; time = 2.2673699855804443
Train: Epoch 78; err = 123.66631597280502; time = 2.3100879192352295
Train: Epoch 79; err = 124.42530626058578; time = 2.2297680377960205
Train: Epoch 80; err = 124.08712208271027; time = 2.2376556396484375
Train: Epoch 81; err = 123.6665386557579; time = 2.2262279987335205
Train: Epoch 82; err = 123.0581186413765; time = 2.247460126876831
Train: Epoch 83; err = 123.77562856674194; time = 2.240886926651001
Train: Epoch 

0.006127097573297671
tensor([0.1363, 0.0744, 0.7892], grad_fn=<MeanBackward1>) 0.7401146292686462
Train: Epoch 180; err = 121.04452782869339; time = 2.2958950996398926
Train: Epoch 181; err = 118.55714291334152; time = 2.316622018814087
Train: Epoch 182; err = 118.18845689296722; time = 2.3198912143707275
Train: Epoch 183; err = 118.05534595251083; time = 2.3191330432891846
Train: Epoch 184; err = 117.65892398357391; time = 2.31479811668396
Train: Epoch 185; err = 117.5989716053009; time = 2.3039727210998535
Train: Epoch 186; err = 117.73140096664429; time = 2.3469879627227783
Train: Epoch 187; err = 117.80647039413452; time = 2.346580982208252
Train: Epoch 188; err = 118.27440148591995; time = 2.307461738586426
Train: Epoch 189; err = 118.67751878499985; time = 2.33738112449646
Train: Epoch 190; err = 117.90562564134598; time = 2.3278591632843018
Train: Epoch 191; err = 117.81627994775772; time = 2.271713972091675
Train: Epoch 192; err = 118.23843771219254; time = 2.3011980056762695
T

Train: Epoch 287; err = 115.89232569932938; time = 2.356295108795166
Train: Epoch 288; err = 116.38474726676941; time = 2.3431360721588135
Train: Epoch 289; err = 115.43492567539215; time = 2.3507280349731445
Train: Epoch 290; err = 116.28684514760971; time = 2.3007400035858154
Train: Epoch 291; err = 115.58036243915558; time = 2.3497421741485596
Train: Epoch 292; err = 115.22952729463577; time = 2.38747501373291
Train: Epoch 293; err = 115.20905131101608; time = 2.409301996231079
Train: Epoch 294; err = 115.19783616065979; time = 2.30938720703125
Train: Epoch 295; err = 115.08892500400543; time = 2.310945987701416
Train: Epoch 296; err = 115.36126816272736; time = 2.3380658626556396
Train: Epoch 297; err = 115.21046960353851; time = 2.3123860359191895
Train: Epoch 298; err = 114.80564445257187; time = 2.2980220317840576
Train: Epoch 299; err = 118.92657721042633; time = 2.2972159385681152
0.004420024338794074
tensor([0.1135, 0.1141, 0.7724], grad_fn=<MeanBackward1>) 0.7674746513366699

Train: Epoch 396; err = 113.83559739589691; time = 2.336297035217285
Train: Epoch 397; err = 113.92890828847885; time = 2.3518640995025635
Train: Epoch 398; err = 113.9797842502594; time = 2.3348188400268555
Train: Epoch 399; err = 114.34223538637161; time = 2.3462228775024414
Train: Epoch 400; err = 113.75961029529572; time = 2.3359811305999756

Training for config: eta_large
Using cpu
ColorizedNeuralListenerEncoder cpu
ColorizedNeuralListenerEncoderDecoder cpu
Train: Epoch 1; err = 179.1927808523178; time = 2.17958402633667
Train: Epoch 2; err = 169.2687411904335; time = 2.1765382289886475
Train: Epoch 3; err = 164.95541840791702; time = 2.1822750568389893
Train: Epoch 4; err = 165.09772473573685; time = 2.160039186477661
Train: Epoch 5; err = 163.63577431440353; time = 2.1279191970825195
Train: Epoch 6; err = 161.34953099489212; time = 2.1412353515625
Train: Epoch 7; err = 158.41087347269058; time = 2.210663080215454
Train: Epoch 8; err = 155.2488602399826; time = 2.196949005126953


Train: Epoch 106; err = 160.39034658670425; time = 2.4330968856811523
Train: Epoch 107; err = 157.48823767900467; time = 2.373028039932251
Train: Epoch 108; err = 157.57078045606613; time = 2.374772071838379
Train: Epoch 109; err = 158.72213965654373; time = 2.3865950107574463
Train: Epoch 110; err = 170.95581722259521; time = 2.3627262115478516
Train: Epoch 111; err = 173.13240110874176; time = 2.4848358631134033
Train: Epoch 112; err = 168.32893884181976; time = 2.419332981109619
Train: Epoch 113; err = 164.90175992250443; time = 2.4216980934143066
Train: Epoch 114; err = 162.7339290380478; time = 2.415879011154175
Train: Epoch 115; err = 160.11667376756668; time = 2.3822081089019775
Train: Epoch 116; err = 160.3437162041664; time = 2.415508985519409
Train: Epoch 117; err = 161.29170942306519; time = 2.4349138736724854
Train: Epoch 118; err = 161.0210840702057; time = 2.403493881225586
Train: Epoch 119; err = 159.0758137702942; time = 2.388801097869873
0.036069478949191665
tensor([0.

Train: Epoch 215; err = 140.60640400648117; time = 2.3944289684295654
Train: Epoch 216; err = 141.3951171040535; time = 2.376310110092163
Train: Epoch 217; err = 141.99020332098007; time = 2.4316558837890625
Train: Epoch 218; err = 141.27340483665466; time = 2.42887806892395
Train: Epoch 219; err = 143.6957955956459; time = 2.4476208686828613
Train: Epoch 220; err = 142.62359058856964; time = 2.4686710834503174
Train: Epoch 221; err = 141.47102427482605; time = 2.4676058292388916
Train: Epoch 222; err = 140.3037422299385; time = 2.3913049697875977
Train: Epoch 223; err = 139.4313735961914; time = 2.39254093170166
Train: Epoch 224; err = 138.5564662218094; time = 2.364107131958008
0.02710431899304544
tensor([0.1577, 0.0902, 0.7521], grad_fn=<MeanBackward1>) 0.7790684103965759
Train: Epoch 225; err = 138.8305622935295; time = 2.386760950088501
Train: Epoch 226; err = 137.5719987154007; time = 2.4699220657348633
Train: Epoch 227; err = 137.63700103759766; time = 2.4593310356140137
Train: 

Train: Epoch 324; err = 138.7066868543625; time = 2.474958658218384
Train: Epoch 325; err = 137.77894389629364; time = 2.5183069705963135
Train: Epoch 326; err = 137.4399333000183; time = 2.486684799194336
Train: Epoch 327; err = 140.97387582063675; time = 2.4897782802581787
Train: Epoch 328; err = 136.5360325574875; time = 2.506295919418335
Train: Epoch 329; err = 136.51279824972153; time = 2.522526979446411
0.020367472153163094
tensor([0.1207, 0.1244, 0.7549], grad_fn=<MeanBackward1>) 0.7777178287506104
Train: Epoch 330; err = 135.6824390888214; time = 2.4638171195983887
Train: Epoch 331; err = 135.93106043338776; time = 2.446100950241089
Train: Epoch 332; err = 136.13166671991348; time = 2.4798479080200195
Train: Epoch 333; err = 135.38853430747986; time = 2.497473955154419
Train: Epoch 334; err = 135.75492733716965; time = 2.4606709480285645
Train: Epoch 335; err = 135.05714374780655; time = 2.484076976776123
Train: Epoch 336; err = 136.20013773441315; time = 2.521890163421631
Trai

Train: Epoch 32; err = 143.13072991371155; time = 2.225480079650879
Train: Epoch 33; err = 143.1061195731163; time = 2.234607219696045
Train: Epoch 34; err = 143.10864824056625; time = 2.171977996826172
Train: Epoch 35; err = 143.11559599637985; time = 2.1656861305236816
Train: Epoch 36; err = 143.10645353794098; time = 2.1663670539855957
Train: Epoch 37; err = 143.075021982193; time = 2.1537249088287354
Train: Epoch 38; err = 143.05051332712173; time = 2.1573588848114014
Train: Epoch 39; err = 143.04362493753433; time = 2.184044122695923
Train: Epoch 40; err = 143.0406945347786; time = 2.1473357677459717
Train: Epoch 41; err = 143.02474105358124; time = 2.172424077987671
Train: Epoch 42; err = 143.02228063344955; time = 2.136367082595825
Train: Epoch 43; err = 143.02303808927536; time = 2.1491079330444336
Train: Epoch 44; err = 142.99016451835632; time = 2.159183979034424
6.25e-07
tensor([0.2316, 0.1411, 0.6274], grad_fn=<MeanBackward1>) 0.8774058818817139
Train: Epoch 45; err = 143.0

Train: Epoch 142; err = 142.9829579591751; time = 2.1213598251342773
Train: Epoch 143; err = 142.96501541137695; time = 2.1957850456237793
Train: Epoch 144; err = 142.9636219739914; time = 2.2134060859680176
Train: Epoch 145; err = 142.96150541305542; time = 2.202277183532715
Train: Epoch 146; err = 142.98401111364365; time = 2.169451951980591
Train: Epoch 147; err = 142.96254760026932; time = 2.19185733795166
Train: Epoch 148; err = 142.95946884155273; time = 2.1648011207580566
Train: Epoch 149; err = 142.9898682832718; time = 2.1482789516448975
4.882812500000003e-16
tensor([0.2082, 0.1644, 0.6274], grad_fn=<MeanBackward1>) 0.8723357319831848
Train: Epoch 150; err = 142.9765629172325; time = 2.182497978210449
Train: Epoch 151; err = 142.99827599525452; time = 2.1576130390167236
Train: Epoch 152; err = 142.97769594192505; time = 2.1403918266296387
Train: Epoch 153; err = 142.98970818519592; time = 2.132154941558838
Train: Epoch 154; err = 142.97433584928513; time = 2.14260196685791
Tra

Train: Epoch 251; err = 142.95922142267227; time = 2.1749391555786133
Train: Epoch 252; err = 142.98548525571823; time = 2.195323944091797
Train: Epoch 253; err = 142.96198374032974; time = 2.173403739929199
Train: Epoch 254; err = 142.97133713960648; time = 2.214329957962036
3.814697265625004e-25
tensor([0.1725, 0.1614, 0.6662], grad_fn=<MeanBackward1>) 0.8438409566879272
Train: Epoch 255; err = 142.9698829650879; time = 2.2122528553009033
Train: Epoch 256; err = 142.9953938126564; time = 2.221292018890381
Train: Epoch 257; err = 142.98217648267746; time = 2.2245900630950928
Train: Epoch 258; err = 142.98078191280365; time = 2.2074978351593018
Train: Epoch 259; err = 142.99588686227798; time = 2.2280828952789307
Train: Epoch 260; err = 143.00030320882797; time = 2.2595880031585693
Train: Epoch 261; err = 142.99052572250366; time = 2.1864359378814697
Train: Epoch 262; err = 142.97439831495285; time = 2.2241158485412598
Train: Epoch 263; err = 142.97900688648224; time = 2.23682618141174

2.9802322387695356e-34
tensor([0.1891, 0.2057, 0.6052], grad_fn=<MeanBackward1>) 0.899335503578186
Train: Epoch 360; err = 142.9828879237175; time = 2.3094751834869385
Train: Epoch 361; err = 142.97710591554642; time = 2.3263847827911377
Train: Epoch 362; err = 142.9868819117546; time = 2.340353012084961
Train: Epoch 363; err = 142.9934064745903; time = 2.3629560470581055
Train: Epoch 364; err = 142.96210569143295; time = 2.3344597816467285
Train: Epoch 365; err = 142.9728416800499; time = 2.3077280521392822
Train: Epoch 366; err = 142.9545032978058; time = 2.3252670764923096
Train: Epoch 367; err = 142.97257965803146; time = 2.3007619380950928
Train: Epoch 368; err = 142.97099512815475; time = 2.340456962585449
Train: Epoch 369; err = 142.95577257871628; time = 2.326366901397705
Train: Epoch 370; err = 142.97860658168793; time = 2.319687843322754
Train: Epoch 371; err = 142.99058252573013; time = 2.36299991607666
Train: Epoch 372; err = 142.97964429855347; time = 2.3106439113616943
Tr

Train: Epoch 69; err = 130.0984480381012; time = 2.240189790725708
Train: Epoch 70; err = 130.0279976129532; time = 2.2496771812438965
Train: Epoch 71; err = 130.06054055690765; time = 2.15608811378479
Train: Epoch 72; err = 129.99361288547516; time = 2.166968822479248
Train: Epoch 73; err = 129.9272045493126; time = 2.1513710021972656
Train: Epoch 74; err = 129.90110456943512; time = 2.1962029933929443
5.120000000000001e-05
tensor([0.1517, 0.1194, 0.7289], grad_fn=<MeanBackward1>) 0.7953416705131531
Train: Epoch 75; err = 129.84533697366714; time = 2.191321849822998
Train: Epoch 76; err = 129.67325669527054; time = 2.1794021129608154
Train: Epoch 77; err = 129.67768514156342; time = 2.180391311645508
Train: Epoch 78; err = 129.65168017148972; time = 2.1969878673553467
Train: Epoch 79; err = 129.67360019683838; time = 2.222698926925659
Train: Epoch 80; err = 129.61202955245972; time = 2.1789920330047607
Train: Epoch 81; err = 129.63108211755753; time = 2.181619882583618
Train: Epoch 82

Train: Epoch 178; err = 129.24572896957397; time = 2.2014122009277344
Train: Epoch 179; err = 129.24508172273636; time = 2.2232820987701416
8.388608000000007e-08
tensor([0.1100, 0.1428, 0.7472], grad_fn=<MeanBackward1>) 0.7785813212394714
Train: Epoch 180; err = 129.24343484640121; time = 2.278815984725952
Train: Epoch 181; err = 129.23873645067215; time = 2.1905159950256348
Train: Epoch 182; err = 129.24748253822327; time = 2.178159236907959
Train: Epoch 183; err = 129.2437989115715; time = 2.180389165878296
Train: Epoch 184; err = 129.24177807569504; time = 2.2065300941467285
Train: Epoch 185; err = 129.24406665563583; time = 2.2065558433532715
Train: Epoch 186; err = 129.2396201491356; time = 2.173042058944702
Train: Epoch 187; err = 129.25988215208054; time = 2.196537971496582
Train: Epoch 188; err = 129.2570685148239; time = 2.1923179626464844
Train: Epoch 189; err = 129.275266289711; time = 2.2006499767303467
Train: Epoch 190; err = 129.26936584711075; time = 2.299712896347046
Tr

Train: Epoch 286; err = 129.24376326799393; time = 2.170297861099243
Train: Epoch 287; err = 129.2612493634224; time = 2.1540610790252686
Train: Epoch 288; err = 129.24444091320038; time = 2.1940441131591797
Train: Epoch 289; err = 129.24168121814728; time = 2.1950840950012207
Train: Epoch 290; err = 129.25026762485504; time = 2.2105119228363037
Train: Epoch 291; err = 129.24892270565033; time = 2.1729800701141357
Train: Epoch 292; err = 129.2515977025032; time = 2.199542999267578
Train: Epoch 293; err = 129.25606137514114; time = 2.219412088394165
Train: Epoch 294; err = 129.25438559055328; time = 2.1623270511627197
Train: Epoch 295; err = 129.22810804843903; time = 2.193969249725342
Train: Epoch 296; err = 129.2596788406372; time = 2.1721081733703613
Train: Epoch 297; err = 129.24147087335587; time = 2.169823169708252
Train: Epoch 298; err = 129.25138294696808; time = 2.1699719429016113
Train: Epoch 299; err = 129.228131711483; time = 2.14973521232605
5.497558138880007e-11
tensor([0.

Train: Epoch 395; err = 129.22668832540512; time = 2.151151180267334
Train: Epoch 396; err = 129.25816023349762; time = 2.1639890670776367
Train: Epoch 397; err = 129.25356394052505; time = 2.1721198558807373
Train: Epoch 398; err = 129.2359681725502; time = 2.152987241744995
Train: Epoch 399; err = 129.23490422964096; time = 2.2081918716430664
Train: Epoch 400; err = 129.2484648823738; time = 2.1798830032348633

Training for config: lr_large
Using cpu
ColorizedNeuralListenerEncoder cpu
ColorizedNeuralListenerEncoderDecoder cpu
Train: Epoch 1; err = 176.57816982269287; time = 2.1914620399475098
Train: Epoch 2; err = 165.9784476161003; time = 2.220134973526001
Train: Epoch 3; err = 160.83957117795944; time = 2.174172878265381
Train: Epoch 4; err = 158.21571403741837; time = 2.1849608421325684
Train: Epoch 5; err = 155.59450215101242; time = 2.210139274597168
Train: Epoch 6; err = 153.746131837368; time = 2.1834776401519775
Train: Epoch 7; err = 152.86804401874542; time = 2.2215700149536

Train: Epoch 106; err = 127.36018490791321; time = 2.1582720279693604
Train: Epoch 107; err = 127.70705789327621; time = 2.169701099395752
Train: Epoch 108; err = 125.78749519586563; time = 2.2320430278778076
Train: Epoch 109; err = 126.50278323888779; time = 2.220720052719116
Train: Epoch 110; err = 126.02510166168213; time = 2.1894540786743164
Train: Epoch 111; err = 125.8369511961937; time = 2.159778118133545
Train: Epoch 112; err = 125.29536157846451; time = 2.2101712226867676
Train: Epoch 113; err = 126.76042973995209; time = 2.24767804145813
Train: Epoch 114; err = 125.88960874080658; time = 2.1537699699401855
Train: Epoch 115; err = 125.06680399179459; time = 2.258060932159424
Train: Epoch 116; err = 126.2240480184555; time = 2.202198028564453
Train: Epoch 117; err = 125.50231283903122; time = 2.231013059616089
Train: Epoch 118; err = 125.42477136850357; time = 2.270094871520996
Train: Epoch 119; err = 124.59617614746094; time = 2.259223699569702
0.0046137234721396
tensor([0.106

Train: Epoch 215; err = 118.99971926212311; time = 2.2440390586853027
Train: Epoch 216; err = 119.18001371622086; time = 2.215074062347412
Train: Epoch 217; err = 120.37658071517944; time = 2.2680959701538086
Train: Epoch 218; err = 120.52789342403412; time = 2.2233500480651855
Train: Epoch 219; err = 119.53441953659058; time = 2.23516583442688
Train: Epoch 220; err = 119.00841856002808; time = 2.2215330600738525
Train: Epoch 221; err = 118.80336767435074; time = 2.257906198501587
Train: Epoch 222; err = 119.3019151687622; time = 2.245568037033081
Train: Epoch 223; err = 119.09828305244446; time = 2.2609200477600098
Train: Epoch 224; err = 118.93870228528976; time = 2.262495994567871
0.004300291773206443
tensor([0.1267, 0.0344, 0.8389], grad_fn=<MeanBackward1>) 0.6998190879821777
Train: Epoch 225; err = 118.94570994377136; time = 2.295281171798706
Train: Epoch 226; err = 119.46868258714676; time = 2.2526116371154785
Train: Epoch 227; err = 120.11539924144745; time = 2.2507503032684326


Train: Epoch 324; err = 117.91102516651154; time = 2.345062017440796
Train: Epoch 325; err = 117.42412900924683; time = 2.299931049346924
Train: Epoch 326; err = 117.34259700775146; time = 2.308371067047119
Train: Epoch 327; err = 116.91366821527481; time = 2.31630277633667
Train: Epoch 328; err = 116.78759241104126; time = 2.2712080478668213
Train: Epoch 329; err = 116.87813329696655; time = 2.3009021282196045
0.00400815294769523
tensor([0.1051, 0.1093, 0.7856], grad_fn=<MeanBackward1>) 0.7515583634376526
Train: Epoch 330; err = 116.95521938800812; time = 2.2894909381866455
Train: Epoch 331; err = 116.82769793272018; time = 2.2857301235198975
Train: Epoch 332; err = 116.5975952744484; time = 2.256218910217285
Train: Epoch 333; err = 117.0042929649353; time = 2.2956011295318604
Train: Epoch 334; err = 118.07951009273529; time = 2.311537981033325
Train: Epoch 335; err = 117.25062131881714; time = 2.290397882461548
Train: Epoch 336; err = 116.77962332963943; time = 2.3208131790161133
Tra

In [53]:
for config, model in model_exploration.items():
    print(f'\nResults for {config}')
    # Predict the index of the target color for each held-out and training example.
    test_preds = model.predict(dev_cols_test, dev_seqs_test)
    train_preds = model.predict(dev_cols_train, dev_seqs_train)
    # Index 2 is treated as the target here (the last of the three context colors),
    # so a prediction counts as correct exactly when it equals 2.
    correct = sum(1 for pred in test_preds if pred == 2)
    print("test", correct, "/", len(test_preds), correct/len(test_preds))
    correct = sum(1 for pred in train_preds if pred == 2)
    print("train", correct, "/", len(train_preds), correct/len(train_preds))


Results for batch_medium
test 2385 / 3473 0.6867261733371725
train 8806 / 10417 0.8453489488336373

Results for batch_small
test 2409 / 3473 0.6936366253959113
train 8928 / 10417 0.85706057406163

Results for batch_xsmall
test 2417 / 3473 0.6959401094154909
train 8959 / 10417 0.8600364788326773

Results for batch_large
test 2422 / 3473 0.6973797869277282
train 8818 / 10417 0.8465009119708169

Results for batch_xlarge
test 2399 / 3473 0.6907572703714367
train 8701 / 10417 0.8352692713833157

Results for dropout_small
test 2214 / 3473 0.6374892024186583
train 8330 / 10417 0.7996544110588462

Results for dropout_large
test 2175 / 3473 0.6262597178232076
train 8066 / 10417 0.7743112220408946

Results for eta_small
test 2268 / 3473 0.6530377195508206
train 7526 / 10417 0.7224728808678123

Results for eta_medium
test 2407 / 3473 0.6930607543910164
train 8935 / 10417 0.8577325525583182

Results for eta_large
test 2282 / 3473 0.657068816585085
train 7583 / 10417 0.7279447057694154

Results fo
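
The per-configuration counts printed above are easier to compare once they are ranked. The following is a minimal sketch, not part of the original experiment code: the `results` dictionary and the `summarize` helper are hypothetical names introduced here, and the counts are copied by hand from the output above, covering only the configurations whose results are fully shown.

```python
# Counts copied from the printed output above: (test_correct, train_correct).
# `results` and `summarize` are illustrative names, not part of the course code.
TEST_TOTAL, TRAIN_TOTAL = 3473, 10417

results = {
    "batch_medium": (2385, 8806),
    "batch_small": (2409, 8928),
    "batch_xsmall": (2417, 8959),
    "batch_large": (2422, 8818),
    "batch_xlarge": (2399, 8701),
    "dropout_small": (2214, 8330),
    "dropout_large": (2175, 8066),
    "eta_small": (2268, 7526),
    "eta_medium": (2407, 8935),
    "eta_large": (2282, 7583),
}


def summarize(results, test_total, train_total):
    """Return (config, test_acc, train_acc, gap) rows, best test accuracy first."""
    rows = []
    for config, (test_correct, train_correct) in results.items():
        test_acc = test_correct / test_total
        train_acc = train_correct / train_total
        # A large train-test gap is one rough indicator of overfitting.
        rows.append((config, test_acc, train_acc, train_acc - test_acc))
    return sorted(rows, key=lambda row: row[1], reverse=True)


summary = summarize(results, TEST_TOTAL, TRAIN_TOTAL)
for config, test_acc, train_acc, gap in summary:
    print(f"{config:15s} test={test_acc:.3f} train={train_acc:.3f} gap={gap:.3f}")
```

On the counts shown, `batch_large` has the highest test accuracy (2422 / 3473), although the batch-size settings differ from one another by only about one percentage point.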

In [41]:
# Count examples (totals) and correct predictions (scores) per condition
# ("close", "far", "split").
totals = {}
scores = {}
for ex in dev_examples:
    totals[ex.condition] = totals.get(ex.condition, 0) + 1
    scores.setdefault(ex.condition, 0)
    # Predict the target index for this single example; index 2 is the target.
    pred = dev_color_mod.predict(
        [represent_color_context(ex.colors)],
        [tokenize_example(ex.contents)])[0]
    if pred == 2:
        scores[ex.condition] += 1

In [42]:
for condition in scores:
    print(condition, ":", scores[condition], "/", totals[condition], "=", scores[condition]/totals[condition])

close : 4071 / 5776 = 0.7048130193905817
far : 2009 / 2657 = 0.756115920210764
split : 3869 / 5457 = 0.7089976177386843
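
The per-condition tally above can also be computed from lists of conditions and predictions, which lets all predictions come from one batched `predict` call rather than one call per example. The sketch below is illustrative only: `accuracy_by_condition` is a hypothetical helper, the toy `conditions` and `predictions` lists are made up for demonstration, and the default `target_index=2` mirrors the convention used in the cells above.

```python
from collections import Counter


def accuracy_by_condition(conditions, predictions, target_index=2):
    """Map each condition to (n_correct, n_total, accuracy).

    `conditions` and `predictions` are parallel sequences; a prediction is
    counted as correct when it equals `target_index`.
    """
    totals = Counter(conditions)
    correct = Counter(
        cond for cond, pred in zip(conditions, predictions)
        if pred == target_index)
    return {cond: (correct[cond], totals[cond], correct[cond] / totals[cond])
            for cond in totals}


# Toy data for illustration only; these are not results from the model.
conditions = ["close", "far", "close", "split", "far"]
predictions = [2, 2, 0, 2, 1]

by_condition = accuracy_by_condition(conditions, predictions)
for condition, (n_correct, n_total, acc) in by_condition.items():
    print(condition, ":", n_correct, "/", n_total, "=", acc)
```

With the real data, `conditions` would be `[ex.condition for ex in dev_examples]` and `predictions` would be the output of a single `predict` call over the matching color contexts and tokenized descriptions.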


In [43]:
#dev_perp = dev_color_mod.perplexities(dev_cols_test, dev_seqs_test)
#dev_perp[0]

In [44]:
#dev_color_mod.to_pickle(os.path.join('data', 'colors', 'color_describer_unigram_20e.pt'))