# Testing the prototype without finetuning
***
So far, we have obtained several descriptions of people actually acting out a norm, each of which we call a `norm-story`. Now we'd like to test whether natural language inference (NLI) models, or more specifically textual entailment models, can tell the moral action apart from the immoral one:
* Norm: *It's manipulative to try to force a partner into marriage.*
* Norm-story: *Jake tries to force a partner into marriage.*
* Moral action: *Jake proposes to Harry at the bar they met at.*
* Immoral action: *Jake tells Harry that he will kill himself if he doesn't marry him.*

General idea: If an action $A$ entails the norm-story $A_N$, we assume that the actor also performed $A_N$ and is therefore exposed to the value judgement of the norm. Continuing the above example:
* If we find that $A=$*Jake proposes to Harry at the bar they met at.* is a sufficient condition for the statement $A_N=$*Jake tries to force a partner into marriage.*, then we assign the value *manipulative* to $A$.
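
The decision rule above can be sketched in a few lines. This is a minimal illustration rather than the evaluation code of this notebook: the helper name `judge_action` and the probabilities are made up, and it assumes the NLI model returns class probabilities in the order (entailment, neutral, contradiction).

```python
import numpy as np

def judge_action(scores, norm_is_positive):
    """Hypothetical helper: predict whether an action is moral (1) or immoral (0).

    scores: array of shape (n, 3) with probabilities in the assumed order
            (entailment, neutral, contradiction) for the pair (action, norm-story).
    norm_is_positive: array of shape (n,), 1 if the norm praises the behavior
            described in the norm-story, 0 if it condemns it.
    """
    scores = np.asarray(scores, dtype=float)
    norm_is_positive = np.asarray(norm_is_positive, dtype=int)
    # the action counts as entailing the norm-story if entailment beats contradiction
    is_entailed = (scores[:, 0] > scores[:, 2]).astype(int)
    # entailing a praised norm-story -> moral; entailing a condemned one -> immoral
    return (is_entailed == norm_is_positive).astype(int)

# made-up probabilities for the two actions from the example above
example_scores = [
    [0.05, 0.25, 0.70],  # "Jake proposes to Harry at the bar they met at."
    [0.80, 0.15, 0.05],  # "Jake tells Harry that he will kill himself ..."
]
# the norm is negative ("manipulative"), so norm_is_positive is 0 for both rows
predictions = judge_action(example_scores, [0, 0])
print(predictions)
```

With these invented scores, the proposal does not entail the norm-story and is therefore judged moral, while the threat entails it and inherits the value *manipulative*.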


In [1]:
import pandas as pd
import numpy as np

pd.set_option('display.max_colwidth', 400)


In [2]:
dataframe = pd.read_pickle("../data/moral_stories_proto_l2s.dat")

In [3]:
# stitch together moral and immoral actions, each paired with the norm-story
story_col = "norm_storyfied"
moral_df = dataframe[["moral_action", story_col]].copy()
immoral_df = dataframe[["immoral_action", story_col]].copy()
moral_df.rename(columns={"moral_action":"action"}, inplace=True)
immoral_df.rename(columns={"immoral_action":"action"}, inplace=True)
moral_df["label"] = 1
immoral_df["label"] = 0
moral_df["sentiment"] = dataframe["norm_sentiment"].apply(lambda x: int(x=="POSITIVE"))
immoral_df["sentiment"] = dataframe["norm_sentiment"].apply(lambda x: int(x=="POSITIVE"))

data = pd.concat([moral_df, immoral_df], ignore_index=True)
#data = immoral_df

In [4]:
from sklearn.model_selection import train_test_split
#data, data_val = train_test_split(data, train_size=1000)

In [5]:
# load the NLI model and its tokenizer
from transformers import AutoTokenizer, AutoModelForSequenceClassification

name = "ynie/roberta-large-snli_mnli_fever_anli_R1_R2_R3-nli" # 79.5%
#name = "cross-encoder/nli-distilroberta-base" # 80%
#name = "boychaboy/SNLI_bert-base-uncased" # 75%
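# NOTE: the weights are loaded from a local checkpoint, not from `name`;
# pass `name` to from_pretrained below to evaluate the pretrained NLI model instead
checkpoint = "/data/kiehne/results/checkpoint-13495/"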
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForSequenceClassification.from_pretrained(checkpoint)

In [6]:
# convert the dataframe to a huggingface dataset and tokenize the sentences
from datasets import Dataset

def tok(samples):
    return tokenizer(samples["action"], samples[story_col], padding="max_length", 
                     truncation=True, return_token_type_ids=True)

dataset = Dataset.from_pandas(data)
dataset = dataset.map(tok, batched=True)


In [7]:
# run evaluation
from transformers import Trainer, TrainingArguments
import torch

training_args = TrainingArguments(
    output_dir="results/",
    num_train_epochs=0,              # no training; the Trainer is only used for prediction
    per_device_train_batch_size=1,  # batch size per device during training
    per_device_eval_batch_size=32,   # batch size for evaluation
    warmup_steps=500,                # number of warmup steps for learning rate scheduler
    weight_decay=0.01,               # strength of weight decay
    logging_dir='./logs',            # directory for storing logs
    logging_steps=50,                # how often to log
    evaluation_strategy="epoch",     # when to run evaluation
)

trainer = Trainer(
    model=model,
    args=training_args,
)

In [8]:
results = trainer.predict(dataset)
scores = torch.softmax(torch.from_numpy(results.predictions),1).numpy()

is_entailed = (scores[:,0] > scores[:,2]).astype("int32")
labels = np.array(dataset["label"])
sentiment = np.array(dataset["sentiment"])
y_pred = (is_entailed == sentiment).astype("int32")

data["y_pred"] = y_pred
data["is_entailed"] = is_entailed
data = pd.concat([data, pd.DataFrame(scores,columns=["entailment","neutral","contradiction"], index=data.index)],axis=1)
misclassed = data[y_pred != labels]

acc = (y_pred == labels).mean()
print("Accuracy:", acc)

The following columns in the test set  don't have a corresponding argument in `RobertaForSequenceClassification.forward` and have been ignored: norm_storyfied, action, sentiment.
***** Running Prediction *****
  Num examples = 23992
  Batch size = 64


Accuracy: 0.9821190396798933


In [None]:
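# show some misclassified samples
# `data` already holds the prediction columns, so joining them again would
# raise an overlap error; the misclassified rows can be used directly
temp = misclassed.copy()
# `data` stacks the moral rows on top of the immoral rows, so map each row
# back to its position in the original dataframe
temp["temp_ind"] = temp.index.to_numpy() % len(dataframe)
cols = ["norm","l2s_output","norm_action",story_col, "norm_value","norm_sentiment"]
# drop the norm-story column here because it is re-added from `dataframe`
misclassed_frame = temp.drop(columns=[story_col]).join(dataframe[cols], on="temp_ind")
misclassed_frame.drop("temp_ind", axis=1, inplace=True)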

In [None]:
misclassed_frame.sample(10)

### Running a classifier on the NLI scores
***
Are there better decision boundaries than $P(\text{entailment})>P(\text{contradiction})$?
* So far, no standard ML classifier has consistently beaten our simple rule.
* On a few occasions, an SVM improved the accuracy by 0.5-1%.
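
To make the notion of an alternative decision boundary concrete, the sketch below shifts the simple rule by thresholding the margin $P(\text{entailment})-P(\text{contradiction})$ at values other than 0. This is an illustrative assumption and not one of the classifiers tried in this notebook; the helper `best_margin_threshold` and the toy data are made up.

```python
import numpy as np

def best_margin_threshold(scores, sentiment, labels, thresholds):
    """Hypothetical helper: return the margin threshold with the highest accuracy.

    scores are assumed to be ordered (entailment, neutral, contradiction).
    """
    scores = np.asarray(scores, dtype=float)
    sentiment = np.asarray(sentiment)
    labels = np.asarray(labels)
    # margin > 0 reproduces the simple rule P(entailment) > P(contradiction)
    margin = scores[:, 0] - scores[:, 2]
    best_t, best_acc = None, -1.0
    for t in thresholds:
        is_entailed = (margin > t).astype(int)
        y_hat = (is_entailed == sentiment).astype(int)
        acc = float((y_hat == labels).mean())
        if acc > best_acc:
            best_t, best_acc = t, acc
    return best_t, best_acc

# toy data, invented for illustration only
toy_scores = [[0.6, 0.3, 0.1], [0.4, 0.35, 0.25], [0.2, 0.3, 0.5], [0.5, 0.3, 0.2]]
toy_sentiment = [1, 1, 1, 0]
toy_labels = [1, 0, 0, 0]
best_t, best_acc = best_margin_threshold(
    toy_scores, toy_sentiment, toy_labels, thresholds=[-0.2, 0.0, 0.2]
)
print(best_t, best_acc)
```

In practice the threshold would have to be chosen on a held-out split, since tuning and evaluating on the same samples overstates the gain.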

In [None]:
# test whether a classifier on top of the NLI scores improves the performance

x = np.concatenate([scores, sentiment[:,np.newaxis]], axis=1).copy()
# shuffling
index = np.arange(len(x))
np.random.shuffle(index)
x = x[index]
y = labels[index]

v = 0.1
n = int(len(x)*v)

x_train, y_train = x[n:], y[n:]
x_test, y_test = x[:n], y[:n]
print(x_train.shape, y_train.shape)
print(x_test.shape, y_test.shape)

In [None]:
from sklearn import svm, ensemble
cl = svm.SVC(C=2, kernel="rbf")
cl.fit(x_train, y_train)
y_pred = cl.predict(x_test)
print((y_pred == y_test).mean())

## Testing the prototype with finetuning
***
Goal: Finetune the NLI models on our norm-stories. The task is to learn which pairs of action and norm-story are entailing and which are contradicting.

The following combinations are possible:
* moral action + incentivizing norm: we want entailment
* moral action + prohibiting norm: we want contradiction
* immoral action + incentivizing norm: we want contradiction
* immoral action + prohibiting norm: we want entailment
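
The four cases reduce to a single comparison: the target is entailment exactly when the action label and the norm sentiment agree. The snippet below is a small sketch of that mapping; the helper `nli_target` is hypothetical, and the class indices assume the label order (entailment, neutral, contradiction).

```python
# class indices, assuming the label order (entailment, neutral, contradiction)
ENTAILMENT, CONTRADICTION = 0, 2

def nli_target(action_is_moral, norm_is_positive):
    """Hypothetical helper: NLI class index for an (action, norm-story) pair."""
    # agreement between action label and norm sentiment means entailment
    return ENTAILMENT if action_is_moral == norm_is_positive else CONTRADICTION

cases = {
    "moral action + incentivizing norm": nli_target(1, 1),
    "moral action + prohibiting norm": nli_target(1, 0),
    "immoral action + incentivizing norm": nli_target(0, 1),
    "immoral action + prohibiting norm": nli_target(0, 0),
}
for case, target in cases.items():
    print(case, "->", target)
```

The label order is not the same for every NLI checkpoint, so it is worth checking `model.config.id2label` of the loaded model before relying on the indices 0 and 2.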

Unless the performance is significantly higher than in the original paper, this experiment is rather pointless.


In [4]:
# the labels need to be adjusted for the nli task according to the above cases
data_nli = data.copy()
# maps entailment (True) or contradiction (False) to class indices of the model
class_map = {True: 0, False: 2}
data_nli["label"] = (data["sentiment"] == data["label"]).apply(class_map.get)

In [5]:
from sklearn.model_selection import train_test_split
train, test = train_test_split(data_nli, test_size=0.1)

In [6]:
# load the NLI model and its tokenizer
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# candidate models (zero-shot accuracy in the comment); only one is active
#name = "ynie/roberta-large-snli_mnli_fever_anli_R1_R2_R3-nli" # 79.5%
#name = "cross-encoder/nli-distilroberta-base" # 80%
name = "boychaboy/SNLI_bert-base-uncased" # 75%

tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForSequenceClassification.from_pretrained(name)

In [7]:
# split into val data
# convert the dataframe to a huggingface dataset and tokenize the sentences
from datasets import Dataset

def tok(samples):
    return tokenizer(samples["action"], samples[story_col], padding="max_length", 
                     truncation=True, return_token_type_ids=True)

train_data = Dataset.from_pandas(train)
train_data = train_data.map(tok, batched=True)
val_data = Dataset.from_pandas(test)
val_data = val_data.map(tok, batched=True)


In [8]:
# run evaluation
from transformers import Trainer, TrainingArguments
import torch
from ailignment.datasets.util import get_accuracy_metric

training_args = TrainingArguments(
    output_dir="/data/kiehne/results/bert-snli/",
    num_train_epochs=5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    gradient_accumulation_steps=1,
    warmup_steps=500,
    weight_decay=0.01,
    logging_dir='logs/',
    log_level="info",
    logging_steps=500,
    evaluation_strategy="epoch",
    save_steps=30000000,
    save_strategy="epoch",
    learning_rate=1e-5
    
)
acc_metric = get_accuracy_metric()

In [9]:
trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=train_data,
    eval_dataset=val_data,
    compute_metrics=acc_metric,
)
logs = trainer.train()

The following columns in the training set  don't have a corresponding argument in `BertForSequenceClassification.forward` and have been ignored: sentiment, norm_storyfied, action, __index_level_0__.
***** Running training *****
  Num examples = 21592
  Num Epochs = 5
  Instantaneous batch size per device = 8
  Total train batch size (w. parallel, distributed & accumulation) = 16
  Gradient Accumulation steps = 1
  Total optimization steps = 6750


| Epoch | Training Loss | Validation Loss | Accuracy |
|---|---|---|---|
| 1 | 0.5379 | 0.504042 | 0.75375 |
| 2 | 0.405 | 0.524654 | 0.766667 |
| 3 | 0.3078 | 0.638413 | 0.76625 |


The following columns in the evaluation set  don't have a corresponding argument in `BertForSequenceClassification.forward` and have been ignored: sentiment, norm_storyfied, action, __index_level_0__.
***** Running Evaluation *****
  Num examples = 2400
  Batch size = 16
Saving model checkpoint to /data/kiehne/results/bert-snli/checkpoint-1350
Configuration saved in /data/kiehne/results/bert-snli/checkpoint-1350/config.json
Model weights saved in /data/kiehne/results/bert-snli/checkpoint-1350/pytorch_model.bin
The following columns in the evaluation set  don't have a corresponding argument in `BertForSequenceClassification.forward` and have been ignored: sentiment, norm_storyfied, action, __index_level_0__.
***** Running Evaluation *****
  Num examples = 2400
  Batch size = 16
Saving model checkpoint to /data/kiehne/results/bert-snli/checkpoint-2700
Configuration saved in /data/kiehne/results/bert-snli/checkpoint-2700/config.json
Model weights saved in /data/kiehne/results/bert-snli/ch

KeyboardInterrupt: 

In [None]:
results = trainer.predict(train_data)
scores = torch.softmax(torch.from_numpy(results.predictions),1).numpy()
