# Action classification on the `Moral-Stories` dataset
***
This notebook trains a classifier to predict whether an action $A$ is moral or immoral given a norm $N$.

Results (validation accuracy):
* `roberta-large`: $0.916$
* `ynie/roberta-large-snli_mnli_fever_anli_R1_R2_R3-nli`: $0.926$

In [2]:
import pandas as pd
import numpy as np

pd.set_option('display.max_colwidth', 400)

In [3]:
dataframe = pd.read_pickle("../data/moral_stories_proto_l2s.dat")

In [3]:
from sklearn.model_selection import train_test_split
from ailignment.datasets.moral_stories import make_action_classification_dataframe

data = make_action_classification_dataframe(dataframe)
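data.drop("__index_level_0__", axis=1, inplace=True)
# hold out 10% of the examples for validation; no random_state is set,
# so the split (and the exact scores) will differ between runs
train, test = train_test_split(data, test_size=0.1)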

In [4]:
from transformers import AutoTokenizer, AutoModelForSequenceClassification, AutoConfig

# baseline checkpoint; overridden by the NLI-finetuned checkpoint on the next line
name = "roberta-large"
name = "ynie/roberta-large-snli_mnli_fever_anli_R1_R2_R3-nli"
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForSequenceClassification.from_pretrained(name)

Some weights of the model checkpoint at ynie/roberta-large-snli_mnli_fever_anli_R1_R2_R3-nli were not used when initializing RobertaForSequenceClassification: ['roberta.pooler.dense.bias', 'roberta.pooler.dense.weight']
- This IS expected if you are initializing RobertaForSequenceClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing RobertaForSequenceClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).


In [5]:
# convert the train and validation dataframes to Hugging Face datasets
# and tokenize each (action, norm) pair as a two-segment input
from datasets import Dataset

def tok(samples):
    return tokenizer(samples["action"], samples["norm"], padding="max_length", 
                     truncation=True, return_token_type_ids=True)

train_data = Dataset.from_pandas(train)
train_data = train_data.map(tok, batched=True)
val_data = Dataset.from_pandas(test)
val_data = val_data.map(tok, batched=True)


In [6]:
# configure training and the evaluation metric
from transformers import Trainer, TrainingArguments
import torch
from ailignment.datasets.util import get_accuracy_metric

training_args = TrainingArguments(
    output_dir="/data/kiehne/results/action_classification/",
    num_train_epochs=5,
    per_device_train_batch_size=1,
    per_device_eval_batch_size=1,
    gradient_accumulation_steps=4,
    warmup_steps=500,
    weight_decay=0.01,
    logging_dir='logs/',
    log_level="info",
    logging_steps=500,
    evaluation_strategy="epoch",
    save_steps=30000000,  # has no effect while save_strategy="epoch"
    save_strategy="epoch",
    learning_rate=1e-5,
)
acc_metric = get_accuracy_metric()
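
The helper `get_accuracy_metric` comes from the project-internal `ailignment` package and its implementation is not shown here. As a rough sketch of what a `compute_metrics` callback for accuracy typically looks like, the function below takes the `(logits, labels)` pair that `Trainer` passes in and returns a dictionary. The name `compute_accuracy` and the toy arrays are assumptions for illustration only, not the project's actual code.

```python
import numpy as np

def compute_accuracy(eval_pred):
    """Hypothetical stand-in for the callback returned by get_accuracy_metric()."""
    # Trainer passes an object that unpacks into (logits, labels)
    logits, labels = eval_pred
    # the predicted class is the index of the largest logit in each row
    predictions = np.argmax(logits, axis=-1)
    # fraction of predictions that match the gold labels
    return {"accuracy": float((predictions == labels).mean())}

# toy example: four samples, two classes (made-up values)
logits = np.array([[2.0, -1.0], [0.1, 0.3], [1.5, 0.2], [-0.4, 0.9]])
labels = np.array([0, 1, 1, 1])
result = compute_accuracy((logits, labels))
```

A function of this shape could be passed to `Trainer` through its `compute_metrics` argument in place of `acc_metric`.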

In [7]:
trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=train_data,
    eval_dataset=val_data,
    compute_metrics=acc_metric,
)
logs = trainer.train()

The following columns in the training set  don't have a corresponding argument in `RobertaForSequenceClassification.forward` and have been ignored: norm, consequence, __index_level_0__, situation, norm_action, action, intention, norm_sentiment, l2s_output, norm_value, norm_storyfied, ID, actor_name.
***** Running training *****
  Num examples = 21592
  Num Epochs = 5
  Instantaneous batch size per device = 1
  Total train batch size (w. parallel, distributed & accumulation) = 8
  Gradient Accumulation steps = 4
  Total optimization steps = 13495
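
The step counts in this log can be checked by hand. The sketch below reproduces them under one assumption that is not stated in the notebook: training ran on two devices, which is what a total train batch size of 8 implies given a per-device batch size of 1 and 4 accumulation steps.

```python
import math

num_examples = 21592   # "Num examples" from the log
per_device_batch = 1   # per_device_train_batch_size
grad_accum = 4         # gradient_accumulation_steps
num_epochs = 5         # num_train_epochs
n_devices = 2          # assumption: inferred from 8 = 1 * 4 * n_devices

# effective batch size per optimizer update
total_batch = per_device_batch * n_devices * grad_accum

# dataloader batches per epoch, then optimizer updates per epoch
batches_per_epoch = math.ceil(num_examples / (per_device_batch * n_devices))
updates_per_epoch = batches_per_epoch // grad_accum

total_steps = updates_per_epoch * num_epochs
```

With these values `updates_per_epoch` is 2699, which matches the `checkpoint-2699` directory written after the first epoch, and `total_steps` is 13495, matching the log.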


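| Epoch | Training Loss | Validation Loss | Accuracy |
|-------|---------------|-----------------|----------|
| 1 | 0.3634 | 0.270084 | 0.915417 |
| 2 | 0.2682 | 0.471422 | 0.909583 |
| 3 | 0.1462 | 0.386422 | 0.92375 |
| 4 | 0.08 | 0.538192 | 0.923333 |
| 5 | 0.0362 | 0.586744 | 0.92625 |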
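
In the per-epoch log above, validation loss is lowest after the first epoch while accuracy is highest after the last, so which checkpoint counts as "best" depends on the criterion. A small sketch for reading that off the logged values; the numbers are copied from the log and the variable names are illustrative.

```python
# (epoch, training loss, validation loss, accuracy), copied from the log above
history = [
    (1, 0.3634, 0.270084, 0.915417),
    (2, 0.2682, 0.471422, 0.909583),
    (3, 0.1462, 0.386422, 0.92375),
    (4, 0.08, 0.538192, 0.923333),
    (5, 0.0362, 0.586744, 0.92625),
]

# epoch with the lowest validation loss
best_by_loss = min(history, key=lambda row: row[2])
# epoch with the highest validation accuracy
best_by_acc = max(history, key=lambda row: row[3])

print("lowest validation loss: epoch", best_by_loss[0])
print("highest accuracy: epoch", best_by_acc[0])
```

`TrainingArguments` also provides `load_best_model_at_end` and `metric_for_best_model`, which may be a way to automate this selection; the notebook does not use them.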


The following columns in the evaluation set  don't have a corresponding argument in `RobertaForSequenceClassification.forward` and have been ignored: norm, consequence, __index_level_0__, situation, norm_action, action, intention, norm_sentiment, l2s_output, norm_value, norm_storyfied, ID, actor_name.
***** Running Evaluation *****
  Num examples = 2400
  Batch size = 2
Saving model checkpoint to /data/kiehne/results/action_classification_transfer/checkpoint-2699
Configuration saved in /data/kiehne/results/action_classification_transfer/checkpoint-2699/config.json
Model weights saved in /data/kiehne/results/action_classification_transfer/checkpoint-2699/pytorch_model.bin
The following columns in the evaluation set  don't have a corresponding argument in `RobertaForSequenceClassification.forward` and have been ignored: norm, consequence, __index_level_0__, situation, norm_action, action, intention, norm_sentiment, l2s_output, norm_value, norm_storyfied, ID, actor_name.
***** Running Eva