# Homework and bake-off: pragmatic color descriptions

In [1]:
__author__ = "Christopher Potts"
__version__ = "CS224u, Stanford, Spring 2020"

## Contents

1. [Overview](#Overview)
1. [Set-up](#Set-up)
1. [All two-word examples as a dev corpus](#All-two-word-examples-as-a-dev-corpus)
1. [Dev datasets](#Dev-datasets)
1. [Random train–test split for development](#Random-train–test-split-for-development)
1. [Question 1: Improve the tokenizer [1 point]](#Question-1:-Improve-the-tokenizer-[1-point])
1. [Use the tokenizer](#Use-the-tokenizer)
1. [Question 2: Improve the color representations [1 point]](#Question-2:-Improve-the-color-representations-[1-point])
1. [Use the color representer](#Use-the-color-representer)
1. [Initial model](#Initial-model)
1. [Question 3: GloVe embeddings [1 points]](#Question-3:-GloVe-embeddings-[1-points])
1. [Try the GloVe representations](#Try-the-GloVe-representations)
1. [Question 4: Color context [3 points]](#Question-4:-Color-context-[3-points])
1. [Your original system [3 points]](#Your-original-system-[3-points])
1. [Bakeoff [1 point]](#Bakeoff-[1-point])

## Overview

This homework and associated bake-off are oriented toward building an effective system for generating color descriptions that are pragmatic in the sense that they would help a reader/listener figure out which color was being referred to in a shared context consisting of a target color (whose identity is known only to the describer/speaker) and a set of distractors.

The notebook [colors_overview.ipynb](colors_overview.ipynb) should be studied before work on this homework begins. That notebook provides background on the task, the dataset, and the modeling code that you will be using and adapting.

The homework questions are more open-ended than previous ones have been. Rather than asking you to implement pre-defined functionality, they ask you to try to improve baseline components of the full system in ways that you find to be effective. As usual, this culminates in a prompt asking you to develop a novel system for entry into the bake-off. In this case, though, the work you do for the homework will likely be directly incorporated into that system.

## Set-up

See [colors_overview.ipynb](colors_overview.ipynb) for set-up instructions and other background details.

In [2]:
from colors import ColorsCorpusReader
import os
from sklearn.model_selection import train_test_split
from torch_listener_with_attention import (
    AttentionalColorizedNeuralListener, create_example_dataset)
import utils
from utils import START_SYMBOL, END_SYMBOL, UNK_SYMBOL
import numpy as np

In [3]:
utils.fix_random_seeds()

In [4]:
COLORS_SRC_FILENAME = os.path.join(
    "data", "colors", "filteredCorpus.csv")

## Dev datasets

In [5]:
import pickle

def load_from_pickle():
    """Load the cached dev vocabulary, train/test splits, and embedding."""
    names = [
        'dev_vocab', 'dev_seqs_test', 'dev_seqs_train',
        'dev_cols_test', 'dev_cols_train', 'embedding']
    objs = []
    for name in names:
        # Each object lives in its own file, e.g. 'dev_vocab.pickle'.
        with open(name + '.pickle', 'rb') as handle:
            objs.append(pickle.load(handle))
    return tuple(objs)

dev_vocab, dev_seqs_test, dev_seqs_train, dev_cols_test, dev_cols_train, embedding = load_from_pickle()
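
Before training, it can be worth confirming that the unpickled splits line up. The following is a minimal sketch and not part of the original notebook: `check_split` is a hypothetical helper, and it assumes that each split is a pair of equal-length sequences in which every color context holds three colors (an assumption suggested by the three-way outputs later in this notebook).

```python
def check_split(color_seqs, word_seqs, n_colors=3):
    """Return the number of examples after checking that a split is aligned.

    Assumption (not confirmed by the notebook): each element of
    `color_seqs` is a context of `n_colors` color vectors, and each
    element of `word_seqs` is a non-empty list of tokens.
    """
    if len(color_seqs) != len(word_seqs):
        raise ValueError(
            "Got {} color contexts but {} utterances".format(
                len(color_seqs), len(word_seqs)))
    for i, (colors, words) in enumerate(zip(color_seqs, word_seqs)):
        if len(colors) != n_colors:
            raise ValueError("Example {} has {} colors".format(i, len(colors)))
        if len(words) == 0:
            raise ValueError("Example {} has an empty utterance".format(i))
    return len(color_seqs)


# Made-up data purely for illustration.
toy_cols = [[[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]] * 2
toy_words = [["<s>", "blue", "</s>"], ["<s>", "green", "</s>"]]
print(check_split(toy_cols, toy_words))
```

With the real data, the calls would be `check_split(dev_cols_train, dev_seqs_train)` and `check_split(dev_cols_test, dev_seqs_test)`.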

## Question 4: Color context [3 points]

In [6]:
toy_color_seqs, toy_word_seqs, toy_vocab = create_example_dataset(
    group_size=50, vec_dim=2)

In [7]:
toy_color_seqs_train, toy_color_seqs_test, toy_word_seqs_train, toy_word_seqs_test = \
    train_test_split(toy_color_seqs, toy_word_seqs)

In [8]:
toy_mod = AttentionalColorizedNeuralListener(
    toy_vocab, 
    embed_dim=100, 
    embedding=embedding,
    hidden_dim=100, 
    max_iter=100, 
    batch_size=128)

Using cuda


In [9]:
_ = toy_mod.fit(toy_color_seqs_train, toy_word_seqs_train)

AttentionalColorizedListenerEncoder cpu
AttentionalColorizedListenerEncoderDecoder cpu
Train: Epoch 1; err = 1.0995502471923828; time = 2.5735819339752197
Train: Epoch 2; err = 1.0893242359161377; time = 0.03000664710998535
Train: Epoch 3; err = 1.0737110376358032; time = 0.02800607681274414
Train: Epoch 4; err = 1.0362563133239746; time = 0.029006481170654297
Train: Epoch 5; err = 0.996231198310852; time = 0.029006481170654297
Train: Epoch 6; err = 1.0165879726409912; time = 0.027005672454833984
Train: Epoch 7; err = 0.9940729141235352; time = 0.029006481170654297
Train: Epoch 8; err = 0.9703787565231323; time = 0.029006242752075195
Train: Epoch 9; err = 0.9713678359985352; time = 0.025005578994750977
Train: Epoch 10; err = 0.9723686575889587; time = 0.02600574493408203
Train: Epoch 11; err = 0.9564589858055115; time = 0.029006242752075195
Train: Epoch 12; err = 0.9368780851364136; time = 0.0290069580078125
Train: Epoch 13; err = 0.9309271574020386; time = 0.025005817413330078
Train: 

In [10]:
preds = toy_mod.predict(toy_color_seqs_test, toy_word_seqs_test)
# A prediction is correct when it picks index 2, the target color's position.
correct = sum([1 if x == 2 else 0 for x in preds])
print(correct, "/", len(preds), correct/len(preds))

27 / 38 0.7105263157894737


If that worked, then you can now try this model on the Stanford Colors in Context (SCC) problems!

In [11]:
dev_color_mod = AttentionalColorizedNeuralListener(
    dev_vocab,
    embed_dim=100,
    embedding=embedding,
    hidden_dim=100,
    max_iter=500,
    batch_size=32,
    dropout_prob=0.7,
    eta=0.001,
    lr_rate=0.96,
    warm_start=True,
    device='cuda')
# Uncomment the next line to continue training a previously saved model:
# dev_color_mod.load_model("literal_listener_with_attention.pt")


Using cuda


In [None]:
_ = dev_color_mod.fit(dev_cols_train, dev_seqs_train)

Train: Epoch 464; err = 697.3633873462677; time = 27.110013008117676
0.0002821033375014772
tensor([0.0264, 0.0795, 0.8941], device='cuda:0', grad_fn=<MeanBackward1>) 0.6501845717430115
Train: Epoch 465; err = 697.6089966893196; time = 27.228989362716675
Train: Epoch 466; err = 696.6519541740417; time = 27.16316533088684
Train: Epoch 467; err = 697.1027517318726; time = 27.149153470993042
Train: Epoch 468; err = 696.5312654972076; time = 27.068135023117065
Train: Epoch 469; err = 696.9392926692963; time = 26.99011754989624
Train: Epoch 470; err = 696.673243522644; time = 27.08813977241516
Train: Epoch 471; err = 696.9639156460762; time = 27.028084754943848
Train: Epoch 472; err = 695.9203844666481; time = 27.209177017211914
Train: Epoch 473; err = 696.4069340229034; time = 27.04012942314148
Train: Epoch 474; err = 697.8989734649658; time = 27.032428979873657
Train: Epoch 475; err = 695.7367730140686; time = 27.13715100288391
Train: Epoch 476; err = 696.8365647792816; time = 27.025105476

0.00021198584153138007
tensor([1.4491e-04, 1.2839e-05, 9.9984e-01], device='cuda:0',
       grad_fn=<MeanBackward1>) 0.5515450239181519
Train: Epoch 570; err = 692.0776328444481; time = 27.08012890815735
Train: Epoch 571; err = 691.9771090745926; time = 26.993127822875977
Train: Epoch 572; err = 692.0697584748268; time = 27.06413459777832
Train: Epoch 573; err = 693.3424463868141; time = 27.137150287628174
Train: Epoch 574; err = 691.9604591727257; time = 27.010121822357178
Train: Epoch 575; err = 692.3866448998451; time = 27.038127660751343
Train: Epoch 576; err = 692.2289009094238; time = 27.175158977508545
Train: Epoch 577; err = 692.8636382818222; time = 26.91910171508789
Train: Epoch 578; err = 691.6376814842224; time = 27.11291265487671
Train: Epoch 579; err = 692.8581780791283; time = 27.093148708343506
Train: Epoch 580; err = 691.8309137225151; time = 27.083146333694458
Train: Epoch 581; err = 691.4247442483902; time = 27.00898289680481
Train: Epoch 582; err = 691.9700686335564
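
The bare floating-point values interleaved with the log (about 0.000282 just before epoch 465 and about 0.000212 just before epoch 570) are consistent with a step-decayed learning rate that starts at `eta=0.001` and is multiplied by `lr_rate=0.96` once every 15 epochs. That interval is inferred from the printed numbers and is not stated anywhere in this notebook, so the sketch below is a guess at the schedule rather than the model's actual code. `step_decay_lr` and `step_size` are hypothetical names.

```python
def step_decay_lr(eta, lr_rate, epoch, step_size=15):
    """Learning rate after `epoch` epochs under a step-decay schedule.

    Assumption: the rate is multiplied by `lr_rate` once every
    `step_size` epochs. The value 15 is inferred from the log output,
    not read from the model's source.
    """
    n_decays = epoch // step_size
    return eta * lr_rate ** n_decays


# Compare against the values printed near epochs 465 and 570.
for epoch in (465, 570):
    print(epoch, step_decay_lr(0.001, 0.96, epoch))
```

Under this assumption, the rate would have shrunk to roughly a fifth of its initial value by epoch 570, which may partly explain why the training error changes so slowly in the later epochs.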

In [23]:
# Predict the index of the chosen color for every test and train example.
# To free cached GPU memory first, run: import torch; torch.cuda.empty_cache()
# To get probabilities instead of indices, pass probabilities=True to predict.
test_preds = dev_color_mod.predict(dev_cols_test, dev_seqs_test)
train_preds = dev_color_mod.predict(dev_cols_train, dev_seqs_train)

In [24]:
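# Accuracy on held-out data: a prediction of 2 means the target was chosen.
correct = sum([1 if x == 2 else 0 for x in test_preds])
print("test", correct, "/", len(test_preds), correct/len(test_preds))
# Accuracy on the training data, for comparison with the test figure.
correct = sum([1 if x == 2 else 0 for x in train_preds])
print("train", correct, "/", len(train_preds), correct/len(train_preds))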

test 9425 / 11749 0.8021959315686441
train 32484 / 35245 0.9216626471839977
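
The accuracy cells in this notebook repeat the same count for each set of predictions. A small sketch of that calculation as a reusable function follows. `listener_accuracy` and its `target_index` parameter are hypothetical names introduced here, and the assumption that the target always sits at index 2 is taken from the `x == 2` comparison in the cells above.

```python
def listener_accuracy(preds, target_index=2):
    """Return (correct, total, accuracy) for a sequence of predicted indices.

    Assumption: a prediction counts as correct when it equals
    `target_index`, mirroring the `x == 2` test used in this notebook.
    """
    preds = list(preds)
    if not preds:
        raise ValueError("No predictions to score")
    correct = sum(1 for p in preds if int(p) == target_index)
    return correct, len(preds), correct / len(preds)


# Made-up predictions purely for illustration.
example_preds = [2, 2, 0, 2, 1]
correct, total, acc = listener_accuracy(example_preds)
print(correct, "/", total, acc)
```

Applied to `test_preds` and `train_preds`, this would make it easy to report both figures side by side and to track the gap between them across training runs.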


In [21]:
dev_color_mod.save_model("literal_listener_with_attention.pt")

In [None]:
def save_to_pickle():
    """Cache the dev vocabulary, train/test splits, and embedding to disk."""
    import pickle

    objs = {
        'dev_vocab': dev_vocab,
        'dev_seqs_test': dev_seqs_test,
        'dev_seqs_train': dev_seqs_train,
        'dev_cols_test': dev_cols_test,
        'dev_cols_train': dev_cols_train,
        'embedding': embedding}
    for name, obj in objs.items():
        # Each object is written to its own file, e.g. 'dev_vocab.pickle'.
        with open(name + '.pickle', 'wb') as handle:
            pickle.dump(obj, handle, protocol=pickle.HIGHEST_PROTOCOL)
#save_to_pickle()