# Experiments: Sequence to Sequence

This notebook accompanies our reproducibility project for the Fairness, Accountability,
Confidentiality and Transparency (FACT) course at the University of Amsterdam. Specifically, we reproduce the results from
"Learning to Deceive with Attention-Based Explanations".

Our main code lives in the folders `classification` and `sequence-to-sequence`; this notebook provides an entry point
for training and visualization.

## Imports

In [20]:
import pandas as pd
from prettytable import PrettyTable

from seq2seq.train import train, evaluate_test

# try:
#     import pytorch_lightning as pl
# except ModuleNotFoundError: # In case PyTorch Lightning is not installed by default.
#     !pip install pytorch-lightning==1.0.3
#     import pytorch_lightning as pl

ModuleNotFoundError: No module named 'batch_utils'

## Sequence to Sequence

In [None]:
attentions = ['dot-product', 'uniform', 'no-attention']

# Seeds for which the original authors trained their seq2seq models
seeds = [1, 2, 3, 4, 5]
# Penalty (lambda) coefficients; 0.0 disables the penalty on impermissible words
coefficients = [0.0, 1.0, 0.1]
tasks = ['copy', 'reverse-copy', 'binary-flip', 'en-de']

epochs = 30
batch_size = 128

### Training: Attention Manipulation

In this part of the experiments, attention is computed as a dot product, and attention placed on impermissible words (as defined in our reproducibility report) is penalized.
The lambda coefficient (0.0, 0.1 or 1.0) sets the strength of this penalty: 0.0 disables it, and larger values penalize attention on impermissible words more heavily.
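
The role of the coefficient can be sketched in a few lines. This is a minimal illustration, assuming the penalty takes the form used in the original paper, lambda times -log(1 - attention mass on impermissible tokens). The function name `attention_penalty` and its arguments are hypothetical and are not taken from `seq2seq.train`.

```python
import math


def attention_penalty(attention, impermissible_mask, coeff):
    """Hypothetical helper: coeff * -log(1 - attention mass on impermissible positions).

    attention          -- attention weights over source positions (sums to 1)
    impermissible_mask -- 1 where a position is impermissible, else 0
    coeff              -- the lambda coefficient (0.0 disables the penalty)
    """
    mass = sum(a for a, m in zip(attention, impermissible_mask) if m)
    # Clamp so that log(0) cannot occur when all mass is impermissible.
    return coeff * -math.log(max(1.0 - mass, 1e-12))


# Half of the attention mass sits on the single impermissible position.
weights = [0.5, 0.3, 0.2]
mask = [1, 0, 0]
for coeff in (0.0, 0.1, 1.0):
    print(coeff, attention_penalty(weights, mask, coeff))
```

In training, a term of this kind would be added to the task loss, so a larger coefficient pushes the model harder to move attention away from impermissible positions.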

In [None]:
for seed in seeds:
    for coeff in coefficients:
        for task in tasks:
            train(task, epochs, coeff, seed, batch_size, attentions[0])
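
Before launching the loop above, it can help to see how many runs it covers. The sketch below only enumerates the configurations and does not call `train`; it restates the lists defined earlier so that it runs on its own.

```python
from itertools import product

seeds = [1, 2, 3, 4, 5]
coefficients = [0.0, 1.0, 0.1]
tasks = ['copy', 'reverse-copy', 'binary-flip', 'en-de']

# Same iteration order as the nested loops: seed, then coefficient, then task.
runs = list(product(seeds, coefficients, tasks))

print(len(runs))  # 5 seeds * 3 coefficients * 4 tasks = 60 training runs
print(runs[0])    # first configuration the loop would train
```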

### Training: Baselines (without Attention)

As ablation studies, the authors also ran experiments with uniform attention and with no attention. Neither setting penalizes impermissible words (loss coefficient 0.0).
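
To make the uniform baseline concrete, the sketch below spreads attention equally over the real source tokens. It assumes that padded positions receive zero weight, which may differ from the actual implementation; `uniform_attention` and `source_mask` are hypothetical names.

```python
def uniform_attention(source_mask):
    """Hypothetical helper: equal attention over real (non-padding) positions.

    source_mask -- 1 for a real source token, 0 for padding
    """
    n_real = sum(source_mask)
    if n_real == 0:
        raise ValueError("source_mask marks no real tokens")
    return [1.0 / n_real if keep else 0.0 for keep in source_mask]


# Four real tokens followed by one padding position.
print(uniform_attention([1, 1, 1, 1, 0]))
```

Because these weights do not depend on the decoder state, the model cannot use attention to select source tokens, which is what makes this setting useful as an ablation.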

In [None]:
for seed in seeds:
    for task in tasks:
        for attention in attentions[1:]:
            train(task, epochs, 0.0, seed, batch_size, attention)


### Results: Attention Manipulation

- Load the pretrained models that we saved and included in the GitHub repository.
- Visualize some results.

In [None]:
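evaluate_model = ['best', 'latest']

# Collect the test metrics of every run instead of overwriting them in each iteration.
results = []
for seed in seeds:
    for coeff in coefficients:
        for task in tasks:
            loss, acc, attn_mass = evaluate_test(task, coeff, seed, model=evaluate_model[0])
            results.append({'seed': seed, 'coeff': coeff, 'task': task,
                            'loss': loss, 'acc': acc, 'attn_mass': attn_mass})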
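
Once per-run metrics are available, they can be averaged over seeds for each task and coefficient. The following is a minimal sketch of that aggregation with pandas. The records and their column names are hypothetical, and the numbers are made up purely for illustration; they are not results from our experiments.

```python
import pandas as pd

# Hypothetical per-run records; the values are illustrative only, not real results.
records = [
    {'task': 'copy', 'coeff': 0.0, 'seed': 1, 'acc': 0.90, 'attn_mass': 0.80},
    {'task': 'copy', 'coeff': 0.0, 'seed': 2, 'acc': 1.00, 'attn_mass': 0.60},
    {'task': 'copy', 'coeff': 1.0, 'seed': 1, 'acc': 0.80, 'attn_mass': 0.10},
    {'task': 'copy', 'coeff': 1.0, 'seed': 2, 'acc': 0.60, 'attn_mass': 0.30},
]
runs = pd.DataFrame(records)

# Average accuracy and attention mass over seeds for each (task, coefficient) pair.
summary = (
    runs.groupby(['task', 'coeff'])[['acc', 'attn_mass']]
    .mean()
    .reset_index()
)
print(summary)
```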

In [None]:
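# Placeholder rows that only demonstrate the table layout (Acc. = accuracy, A.M. = attention mass).
data = [['Dot-Product', 2, 3], ['Uniform', 5, 6], ['None', 8, 9], ['Manipulated', 8, 9], ['Manipulated', 8, 9]]
data_frame = pd.DataFrame(data, columns=['Attention', 'Bigram Flip: Acc.', 'Bigram Flip: A.M.'])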

def generate_ascii_table(df):
    x = PrettyTable()
    x.field_names = df.columns.tolist()
    for row in df.values:
        x.add_row(row)
    print(x)
    return x

generate_ascii_table(data_frame)

### Results: Baselines (without Attention)

- Load the pretrained models that we saved and included in the GitHub repository.
- Visualize some results.

## English-German Translation