# Experiments: Sequence to Sequence

This notebook accompanies our reproducibility project for the Fairness, Accountability,
Confidentiality and Transparency (FACT) course at the University of Amsterdam. Specifically, we reproduce the results from
"Learning to Deceive with Attention-Based Explanations".

While our main code is contained in the folders `classification` and `sequence-to-sequence`, we enable training and
visualization via this notebook.

## Imports and Setup

In [6]:
%load_ext autoreload
%autoreload 2
%matplotlib inline
%config InlineBackend.figure_format = 'retina'

from run_experiments_util import run_synthetic_experiments, run_en_de_experiments

The autoreload extension is already loaded. To reload it, use:
  %reload_ext autoreload


## Configuration

In [7]:
attentions = ['dot-product', 'uniform', 'no-attention']
seeds = [1, 2, 3, 4, 5]
coefficients = [0.0, 1.0, 0.1]
tasks = ['copy', 'reverse-copy', 'binary-flip']
epochs = 30
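
To get a feel for the size of the experiment grid these lists describe, the sketch below enumerates every combination for the synthetic tasks. It is only an illustration: the name `grid` is hypothetical, and whether `run_synthetic_experiments` really runs every combination (and in which order) is an assumption, since that logic lives in `run_experiments_util`.

```python
from itertools import product

# Values copied from the configuration cell above.
attentions = ['dot-product', 'uniform', 'no-attention']
seeds = [1, 2, 3, 4, 5]
coefficients = [0.0, 1.0, 0.1]
tasks = ['copy', 'reverse-copy', 'binary-flip']

# Hypothetical enumeration of the full grid. The real iteration order and any
# skipped combinations are decided inside run_experiments_util.
grid = list(product(tasks, attentions, coefficients, seeds))

# 3 tasks * 3 attentions * 3 coefficients * 5 seeds
print(len(grid))
```

If every combination is trained for the full number of epochs, this amounts to 135 training runs for the synthetic tasks alone, which is why pretrained models are useful.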

## Training + Evaluation

For this part of the experiments, attention is computed as dot-product attention, and attention placed on impermissible words, as defined in our reproducibility report, is penalized.
The lambda coefficient (0.0, 0.1 or 1.0) controls whether attention on these impermissible words is penalized and, if so, how strongly: 0.0 disables the penalty, while 0.1 and 1.0 apply an increasingly strong penalty.

The authors also ran experiments with uniform and no attention (ablation studies) and no penalty on impermissible words (loss coefficient 0.0).
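
The role of the coefficient can be illustrated with a small sketch. It assumes the penalty takes the form given in the original paper as we read it, `R = -lambda * log(1 - attention mass on impermissible tokens)`, added to the task loss. The function name, the `eps` clamp and the single-step formulation are our own assumptions; the code in `sequence-to-sequence` may differ, for example in how it aggregates over decoder steps.

```python
import math

def attention_penalty(attention, impermissible_mask, coefficient, eps=1e-12):
    """Penalty for attention mass placed on impermissible tokens (one decoder step).

    attention: attention weights over source tokens, summing to 1.
    impermissible_mask: 1 for impermissible tokens, 0 otherwise.
    coefficient: the lambda coefficient; 0.0 disables the penalty.
    eps: hypothetical clamp that avoids log(0) when all mass is impermissible.
    """
    mass = sum(a * m for a, m in zip(attention, impermissible_mask))
    return -coefficient * math.log(max(1.0 - mass, eps))

# Example: 70% of the attention sits on the single impermissible token.
attention = [0.7, 0.2, 0.1]
mask = [1, 0, 0]
for coeff in (0.0, 0.1, 1.0):
    print(coeff, attention_penalty(attention, mask, coeff))
```

With coefficient 0.0 the penalty vanishes and only the task loss remains. Larger coefficients push the model harder to move attention mass away from impermissible tokens.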

In [None]:
# Sequence Copy, Sequence Reverse and Bigram Flip
run_synthetic_experiments(clear_out=True,
                          seeds=seeds,
                          tasks=tasks,
                          coefficients=coefficients,
                          attentions=attentions,
                          epochs=epochs)

# English German Machine Translation
run_en_de_experiments(clear_out=True,
                      seeds=seeds,
                      coefficients=coefficients,
                      attentions=attentions,
                      epochs=epochs)

## Evaluation

Because training our models takes quite some time and might not be feasible on your machine, the following cells provide
functionality to load and test pretrained models.

- TODO: add instructions on where to get the pretrained models from
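
Before running the test cells, it can help to check which pretrained checkpoints are actually present. The sketch below assumes that file names follow the pattern visible in the English-German test output further down (`data/models/model_<task>_attention=<attention>_seed=<seed>_coeff=<coeff>_epoch=<n>.pt`). The helper `find_checkpoints` is hypothetical and not part of `run_experiments_util`, and the pattern may not hold for every task.

```python
from pathlib import Path

def find_checkpoints(model_dir, task, attention, seed, coeff):
    """Return checkpoint files matching one configuration, for any epoch.

    The file name pattern is inferred from a single log line and is an
    assumption, not a documented convention of the repository.
    """
    pattern = (
        f"model_{task}_attention={attention}"
        f"_seed={seed}_coeff={coeff}_epoch=*.pt"
    )
    return sorted(Path(model_dir).glob(pattern))

# Example: look for the English-German dot-product model with seed 1.
matches = find_checkpoints("data/models", "en-de", "dot-product", 1, 0.0)
print(matches if matches else "No checkpoint found for this configuration.")
```

An empty result for a configuration would be consistent with the "Could not find file. Proceeding to next model." message in the test output.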

### Sequence Copy, Sequence Reverse and Bigram Flip

In [8]:
run_synthetic_experiments(clear_out=True,
                          stage='test',
                          seeds=seeds,
                          tasks=tasks,
                          coefficients=coefficients,
                          attentions=attentions,
                          epochs=epochs)

Configuration: coeff: 0.1 seed: 1 attention: dot-product device: cpu task: copy
Could not find file. Proceeding to next model.
Configuration: coeff: 0.1 seed: 2 attention: dot-product device: cpu task: copy
Could not find file. Proceeding to next model.
Configuration: coeff: 0.1 seed: 3 attention: dot-product device: cpu task: copy
Could not find file. Proceeding to next model.
Configuration: coeff: 0.1 seed: 4 attention: dot-product device: cpu task: copy
Could not find file. Proceeding to next model.
Configuration: coeff: 0.1 seed: 5 attention: dot-product device: cpu task: copy


KeyboardInterrupt: 

### English-German Translation

In [2]:
run_en_de_experiments(clear_out=True, stage='test', seeds=[1,2])

Configuration: coeff: 0.0 seed: 1 attention: dot-product device: cpu task: en-de
Loading model with path: data/models/model_en-de_attention=dot-product_seed=1_coeff=0.0_epoch=7.pt


 12%|█▎        | 1/8 [00:04<00:28,  4.14s/it]


KeyboardInterrupt: 