# Homework and bake-off: pragmatic color descriptions

In [1]:
__author__ = "Christopher Potts"
__version__ = "CS224u, Stanford, Spring 2020"

## Contents

1. [Overview](#Overview)
1. [Set-up](#Set-up)
1. [All two-word examples as a dev corpus](#All-two-word-examples-as-a-dev-corpus)
1. [Dev dataset](#Dev-dataset)
1. [Random train–test split for development](#Random-train–test-split-for-development)
1. [Question 1: Improve the tokenizer [1 point]](#Question-1:-Improve-the-tokenizer-[1-point])
1. [Use the tokenizer](#Use-the-tokenizer)
1. [Question 2: Improve the color representations [1 point]](#Question-2:-Improve-the-color-representations-[1-point])
1. [Use the color representer](#Use-the-color-representer)
1. [Initial model](#Initial-model)
1. [Question 3: GloVe embeddings [1 points]](#Question-3:-GloVe-embeddings-[1-points])
1. [Try the GloVe representations](#Try-the-GloVe-representations)
1. [Question 4: Color context [3 points]](#Question-4:-Color-context-[3-points])
1. [Your original system [3 points]](#Your-original-system-[3-points])
1. [Bakeoff [1 point]](#Bakeoff-[1-point])

## Overview

This homework and associated bake-off are oriented toward building an effective system for generating color descriptions that are pragmatic in the sense that they would help a reader/listener figure out which color was being referred to in a shared context consisting of a target color (whose identity is known only to the describer/speaker) and a set of distractors.

The notebook [colors_overview.ipynb](colors_overview.ipynb) should be studied before work on this homework begins. That notebook provides background on the task, the dataset, and the modeling code that you will be using and adapting.

The homework questions are more open-ended than previous ones have been. Rather than asking you to implement pre-defined functionality, they ask you to try to improve baseline components of the full system in ways that you find to be effective. As usual, this culminates in a prompt asking you to develop a novel system for entry into the bake-off. In this case, though, the work you do for the homework will likely be directly incorporated into that system.

## Set-up

See [colors_overview.ipynb](colors_overview.ipynb) for set-up instructions and other background details.

In [2]:
from colors import ColorsCorpusReader
import os
from sklearn.model_selection import train_test_split
from torch_listener_with_attention import (
    AttentionalColorizedNeuralListener, create_example_dataset)
import utils
from utils import START_SYMBOL, END_SYMBOL, UNK_SYMBOL
import numpy as np

In [3]:
utils.fix_random_seeds()

In [4]:
COLORS_SRC_FILENAME = os.path.join(
    "data", "colors", "filteredCorpus.csv")

## Dev datasets

In [5]:
# This variable will toggle whether we're training the listener or the listener-hallucinating speaker
agent = 'speaker'

In [6]:
import pickle


def _unpickle(path):
    """Load a single pickled object from `path`."""
    with open(path, 'rb') as handle:
        return pickle.load(handle)


def load_from_pickle():
    # The vocabulary and the training split are agent-specific; the test
    # split and the embedding are shared across agents.
    dev_vocab = _unpickle('dev_vocab_' + agent + '.pickle')
    dev_seqs_test = _unpickle('dev_seqs_test.pickle')
    dev_seqs_train = _unpickle('dev_seqs_train_' + agent + '.pickle')
    dev_cols_test = _unpickle('dev_cols_test.pickle')
    dev_cols_train = _unpickle('dev_cols_train_' + agent + '.pickle')
    embedding = _unpickle('embedding.pickle')
    return (dev_vocab, dev_seqs_test, dev_seqs_train,
            dev_cols_test, dev_cols_train, embedding)


dev_vocab, dev_seqs_test, dev_seqs_train, dev_cols_test, dev_cols_train, embedding = load_from_pickle()
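
The cell above assumes that the pickle files already exist on disk. As a minimal sketch of how such files could be written and read back, the block below uses two hypothetical helpers, `save_pickle` and `load_pickle` (neither is part of the course code), with stand-in data and a temporary directory instead of the real file names:

```python
import os
import pickle
import tempfile


def save_pickle(obj, path):
    """Hypothetical helper: write `obj` to `path` with pickle."""
    with open(path, 'wb') as handle:
        pickle.dump(obj, handle, protocol=pickle.HIGHEST_PROTOCOL)


def load_pickle(path):
    """Hypothetical helper: read one pickled object back from `path`."""
    with open(path, 'rb') as handle:
        return pickle.load(handle)


# Illustrative stand-in data; the real vocabulary comes from the corpus.
example_vocab = ['blue', 'green', 'dark']

with tempfile.TemporaryDirectory() as tmp_dir:
    # Assumed file name, used only inside the temporary directory.
    path = os.path.join(tmp_dir, 'dev_vocab_example.pickle')
    save_pickle(example_vocab, path)
    restored_vocab = load_pickle(path)
```

Pickle files should only be loaded from sources you trust, since unpickling can execute arbitrary code.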

## Question 4: Color context [3 points]

In [7]:
toy_color_seqs, toy_word_seqs, toy_vocab = create_example_dataset(
    group_size=50, vec_dim=2)

In [8]:
toy_color_seqs_train, toy_color_seqs_test, toy_word_seqs_train, toy_word_seqs_test = \
    train_test_split(toy_color_seqs, toy_word_seqs)

In [9]:
toy_mod = AttentionalColorizedNeuralListener(
    toy_vocab, 
    embed_dim=100, 
    embedding=embedding,
    hidden_dim=100, 
    max_iter=100, 
    batch_size=128)

Using cuda


In [10]:
_ = toy_mod.fit(toy_color_seqs_train, toy_word_seqs_train)

AttentionalColorizedListenerEncoder cpu
AttentionalColorizedListenerEncoderDecoder cpu
Train: Epoch 1; err = 1.0995502471923828; time = 2.011448621749878
Train: Epoch 2; err = 1.0893242359161377; time = 0.02800607681274414
Train: Epoch 3; err = 1.0737110376358032; time = 0.028006792068481445
Train: Epoch 4; err = 1.0362563133239746; time = 0.028005599975585938
Train: Epoch 5; err = 0.996231198310852; time = 0.03000664710998535
Train: Epoch 6; err = 1.0165879726409912; time = 0.028006553649902344
Train: Epoch 7; err = 0.9940729141235352; time = 0.02800464630126953
Train: Epoch 8; err = 0.9703787565231323; time = 0.027005672454833984
Train: Epoch 9; err = 0.9713678359985352; time = 0.028005599975585938
Train: Epoch 10; err = 0.9723686575889587; time = 0.0290069580078125
Train: Epoch 11; err = 0.9564589858055115; time = 0.02700519561767578
Train: Epoch 12; err = 0.9368780851364136; time = 0.029005765914916992
Train: Epoch 13; err = 0.9309271574020386; time = 0.02600574493408203
Train: Epo

In [11]:
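preds = toy_mod.predict(toy_color_seqs_test, toy_word_seqs_test)
# A prediction is correct when it picks index 2, the position of the
# target color (the last of the three colors in each context).
correct = sum([1 if x == 2 else 0 for x in preds])
print(correct, "/", len(preds), correct/len(preds))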

27 / 38 0.7105263157894737
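
The same accuracy computation is repeated several times in this notebook. A minimal sketch of a reusable helper follows. The name `listener_accuracy` and the `target_index` parameter are hypothetical, and the sketch assumes that each prediction can be converted to an integer index with `int`:

```python
def listener_accuracy(preds, target_index=2):
    """Return (n_correct, n_total, accuracy) for predicted color indices.

    Assumes each prediction is an integer-like index into the color
    context and that the target color sits at `target_index`.
    """
    preds = [int(p) for p in preds]
    if not preds:
        raise ValueError("preds must contain at least one prediction")
    n_correct = sum(p == target_index for p in preds)
    return n_correct, len(preds), n_correct / len(preds)


# Illustrative predictions, not model output.
example_preds = [2, 0, 2, 1]
n_correct, n_total, accuracy = listener_accuracy(example_preds)
print(n_correct, "/", n_total, accuracy)
```

With such a helper, the evaluation lines in the cells below could be reduced to one call per split.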


If that worked, then you can now try this model on the Stanford Colors in Context (SCC) problems!

In [12]:
dev_color_mod = AttentionalColorizedNeuralListener(
    dev_vocab, 
    #embedding=dev_glove_embedding, 
    embed_dim=100,
    embedding=embedding,
    hidden_dim=100, 
    max_iter=10,
    batch_size=16,
    dropout_prob=0.7,
    eta=0.001,
    lr_rate=0.96,
    warm_start=True,
    device='cuda')
# Uncomment the next line to continue training a previously saved model
# dev_color_mod.load_model("literal_listener_with_attention_"+agent+"_split.pt")


Using cuda


In [18]:
#_ = dev_color_mod.fit(dev_cols_train, dev_seqs_train)

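# Train in rounds: with warm_start=True, each call to fit is intended to
# continue from the current parameters rather than start over. Accuracy
# on both splits is reported after every round.
for _ in range(9):
    dev_color_mod.fit(dev_cols_train, dev_seqs_train)

    test_preds = dev_color_mod.predict(dev_cols_test, dev_seqs_test)
    train_preds = dev_color_mod.predict(dev_cols_train, dev_seqs_train)
    # Index 2 is the position of the target color in each context.
    correct = sum([1 if x == 2 else 0 for x in test_preds])
    print("test", correct, "/", len(test_preds), correct/len(test_preds))
    correct = sum([1 if x == 2 else 0 for x in train_preds])
    print("train", correct, "/", len(train_preds), correct/len(train_preds))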

Train: Epoch 181; err = 726.3978577852249; time = 30.76386284828186
Train: Epoch 182; err = 725.076132774353; time = 29.941669702529907
Train: Epoch 183; err = 725.7876468300819; time = 34.857786417007446
Train: Epoch 184; err = 724.8396295309067; time = 33.4764678478241
Train: Epoch 185; err = 726.7567889094353; time = 33.89456129074097
Train: Epoch 186; err = 724.227826654911; time = 32.63728094100952
Train: Epoch 187; err = 723.0948595404625; time = 33.99557304382324
Train: Epoch 188; err = 724.514818072319; time = 33.306429862976074
Train: Epoch 189; err = 725.5529453754425; time = 29.523586750030518
Train: Epoch 190; err = 722.4781914353371; time = 29.864662647247314
test 9089 / 11749 0.7735977530002554
train 15748 / 17623 0.8936049480792146
Train: Epoch 191; err = 730.5085909962654; time = 32.71729803085327
Train: Epoch 192; err = 723.0350778698921; time = 34.309653997421265
Train: Epoch 193; err = 723.6086834669113; time = 34.914788246154785
Train: Epoch 194; err = 720.876796603

In [14]:
#import torch
#torch.cuda.empty_cache()
test_preds = dev_color_mod.predict(dev_cols_test, dev_seqs_test)
#dev_color_mod.predict(dev_cols_test, dev_seqs_test, probabilities=True)
train_preds = dev_color_mod.predict(dev_cols_train, dev_seqs_train)
#dev_color_mod.predict(dev_cols_test, dev_seqs_test, probabilities=True)

In [15]:
correct = sum([1 if x == 2 else 0 for x in test_preds])
print("test", correct, "/", len(test_preds), correct/len(test_preds))
correct = sum([1 if x == 2 else 0 for x in train_preds])
print("train", correct, "/", len(train_preds), correct/len(train_preds))

test 8985 / 11749 0.7647459358243255
train 14925 / 17623 0.8469046132894513
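
Accuracy on the held-out split varies from one evaluation to the next (0.765 here, 0.774 after epoch 190 in the training loop output), so it can help to record which evaluation round scored best. The block below is a sketch under stated assumptions: `track_best_round` is a hypothetical helper, and the accuracy values are illustrative rather than taken from any run:

```python
def track_best_round(accuracies):
    """Return (best_round, best_accuracy) from per-round accuracies.

    Rounds are numbered from 1; ties go to the earliest round.
    """
    if not accuracies:
        raise ValueError("accuracies must contain at least one value")
    best_round, best_accuracy = 1, accuracies[0]
    for round_number, accuracy in enumerate(accuracies[1:], start=2):
        if accuracy > best_accuracy:
            best_round, best_accuracy = round_number, accuracy
    return best_round, best_accuracy


# Illustrative values only, one per evaluation round.
example_accuracies = [0.70, 0.75, 0.74, 0.76, 0.76]
best_round, best_accuracy = track_best_round(example_accuracies)
print(best_round, best_accuracy)
```

One possible use is to append each round's held-out accuracy to a list inside the training loop and save the model only when the latest round is the new best.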


In [16]:
dev_color_mod.save_model("literal_listener_with_attention_"+agent+"_split.pt")