# Assignment 2

**Credits**: Federico Ruggeri, Eleonora Mancini, Paolo Torroni

**Keywords**: Human Value Detection, Multi-label classification, Transformers, BERT

# Imports and libraries needed

In [2]:
import pandas as pd 
import numpy as np
import matplotlib.pyplot as plt
from tqdm import tqdm
import seaborn as sns
import sklearn
import random
import os
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

# Task 1: Corpus

Check the official page of the challenge [here](https://touche.webis.de/semeval23/touche23-web/).

The challenge offers several corpora for evaluation and testing.

You are going to work with the standard training, validation, and test splits.

#### Arguments
* arguments-training.tsv
* arguments-validation.tsv
* arguments-test.tsv

#### Human values
* labels-training.tsv
* labels-validation.tsv
* labels-test.tsv

### Instructions

* **Download** the specified training, validation, and test files.
* **Load** each split file into a `pandas.DataFrame` object.
* For each split, **merge** the arguments and labels dataframes into a single dataframe.
* **Merge** level 2 annotations into level 3 categories.
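
The loading and merging steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the reference solution: it uses tiny in-memory stand-ins instead of the real `.tsv` files, and the `LEVEL3_MAPPING` dictionary is a partial, hypothetical mapping that should be replaced with the one given in the challenge documentation. The helper name `merge_to_level3` is likewise made up for this example.

```python
import io
import pandas as pd

# Tiny in-memory stand-ins for arguments-*.tsv and labels-*.tsv (illustrative data only).
arguments_tsv = (
    "Argument ID\tConclusion\tStance\tPremise\n"
    "X001\tWe should ban fast food\tin favor of\tfast food is unhealthy\n"
    "X002\tWe should ban fast food\tagainst\tpeople should choose freely\n"
)
labels_tsv = (
    "Argument ID\tSelf-direction: action\tStimulation\tAchievement\tSecurity: personal\n"
    "X001\t0\t0\t0\t1\n"
    "X002\t1\t0\t0\t0\n"
)

# read_csv strips line endings and parses the 0/1 labels as integers.
df_args = pd.read_csv(io.StringIO(arguments_tsv), sep="\t")
df_labels = pd.read_csv(io.StringIO(labels_tsv), sep="\t")

# Key-based merge: rows are matched on "Argument ID", which then appears only once.
df_all = df_args.merge(df_labels, on="Argument ID", how="inner", validate="one_to_one")

# ASSUMPTION: partial, illustrative level 2 -> level 3 mapping; use the official one instead.
LEVEL3_MAPPING = {
    "Openness to change": ["Self-direction: action", "Stimulation"],
    "Self-enhancement": ["Achievement"],
    "Conservation": ["Security: personal"],
}


def merge_to_level3(df, mapping):
    """Add one 0/1 column per level 3 category: 1 if any mapped level 2 label is 1."""
    out = df.copy()
    for category, level2_columns in mapping.items():
        out[category] = out[level2_columns].max(axis=1).astype(int)
    return out


df_level3 = merge_to_level3(df_all, LEVEL3_MAPPING)
print(df_level3[["Argument ID"] + list(LEVEL3_MAPPING)])
```

With the real files, the `io.StringIO(...)` arguments would be replaced by the file paths. Merging on the key rather than on row position guards against the two files being ordered differently.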

### Train set

In [31]:
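# Load the training arguments: split each line on tabs and use the first row as the header.
# Caveat: the trailing newline is not stripped, so the last column is named 'Premise\n' (see the output).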

with open('arguments/arguments-training.tsv', 'r', encoding='utf-8') as f:
    lines = f.readlines()
    data = [line.split('\t') for line in lines]
    df_train = pd.DataFrame(data)
    df_train.columns = df_train.iloc[0]
    df_train = df_train[1:]

print(df_train.head())
print(df_train.shape)


0 Argument ID                                   Conclusion       Stance  \
1      A01002                  We should ban human cloning  in favor of   
2      A01005                      We should ban fast food  in favor of   
3      A01006  We should end the use of economic sanctions      against   
4      A01007         We should abolish capital punishment      against   
5      A01008                We should ban factory farming      against   

0                                          Premise\n  
1  we should ban human cloning as it will only ca...  
2  fast food should be banned because it is reall...  
3  sometimes economic sanctions are the only thin...  
4  capital punishment is sometimes the only optio...  
5  factory farming allows for the production of c...  
(5393, 4)


In [30]:
# Load the training labels the same way: one 0/1 column per value category, read as strings.

with open('arguments/labels-training.tsv', 'r', encoding='utf-8') as f:
    lines = f.readlines()
    data = [line.split('\t') for line in lines]
    df_train_label = pd.DataFrame(data)
    df_train_label.columns = df_train_label.iloc[0]
    df_train_label = df_train_label[1:]

print(df_train_label.head())
print(df_train_label.shape)

0 Argument ID Self-direction: thought Self-direction: action Stimulation  \
1      A01002                       0                      0           0   
2      A01005                       0                      0           0   
3      A01006                       0                      0           0   
4      A01007                       0                      0           0   
5      A01008                       0                      0           0   

0 Hedonism Achievement Power: dominance Power: resources Face  \
1        0           0                0                0    0   
2        0           0                0                0    0   
3        0           0                1                0    0   
4        0           0                0                0    0   
5        0           0                0                0    0   

0 Security: personal  ... Tradition Conformity: rules  \
1                  0  ...         0                 0   
2                  1  ...         0   

In [34]:
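# Combine arguments and labels column-wise by row position (not by key);
# 'Argument ID' therefore appears twice in the result.
df_all_train = pd.concat([df_train, df_train_label.reindex(df_train.index)], axis=1)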
print(df_all_train.head())
print(df_all_train.shape)

0 Argument ID                                   Conclusion       Stance  \
1      A01002                  We should ban human cloning  in favor of   
2      A01005                      We should ban fast food  in favor of   
3      A01006  We should end the use of economic sanctions      against   
4      A01007         We should abolish capital punishment      against   
5      A01008                We should ban factory farming      against   

0                                          Premise\n Argument ID  \
1  we should ban human cloning as it will only ca...      A01002   
2  fast food should be banned because it is reall...      A01005   
3  sometimes economic sanctions are the only thin...      A01006   
4  capital punishment is sometimes the only optio...      A01007   
5  factory farming allows for the production of c...      A01008   

0 Self-direction: thought Self-direction: action Stimulation Hedonism  \
1                       0                      0           0       

### Validation set

In [36]:
# Load the validation arguments (same parsing as for the training split).

with open('arguments/arguments-validation.tsv', 'r', encoding='utf-8') as f:
    lines = f.readlines()
    data = [line.split('\t') for line in lines]
    df_val = pd.DataFrame(data)
    df_val.columns = df_val.iloc[0]
    df_val = df_val[1:]

print(df_val.head())
print(df_val.shape)

0 Argument ID                                       Conclusion       Stance  \
1      A01001                   Entrapment should be legalized  in favor of   
2      A01012  The use of public defenders should be mandatory  in favor of   
3      A02001                    Payday loans should be banned  in favor of   
4      A02002                       Surrogacy should be banned      against   
5      A02009                   Entrapment should be legalized      against   

0                                          Premise\n  
1  if entrapment can serve to more easily capture...  
2  the use of public defenders should be mandator...  
3  payday loans create a more impoverished societ...  
4  Surrogacy should not be banned as it is the wo...  
5  entrapment is gravely immoral and against huma...  
(1896, 4)


In [37]:
# Load the validation labels (same parsing as for the training split).

with open('arguments/labels-validation.tsv', 'r', encoding='utf-8') as f:
    lines = f.readlines()
    data = [line.split('\t') for line in lines]
    df_val_label = pd.DataFrame(data)
    df_val_label.columns = df_val_label.iloc[0]
    df_val_label = df_val_label[1:]

print(df_val_label.head())
print(df_val_label.shape)

0 Argument ID Self-direction: thought Self-direction: action Stimulation  \
1      A01001                       0                      0           0   
2      A01012                       0                      0           0   
3      A02001                       0                      0           0   
4      A02002                       0                      1           0   
5      A02009                       0                      0           0   

0 Hedonism Achievement Power: dominance Power: resources Face  \
1        0           0                0                0    0   
2        0           0                0                0    0   
3        0           0                0                0    0   
4        0           0                0                0    0   
5        0           0                0                0    0   

0 Security: personal  ... Tradition Conformity: rules  \
1                  0  ...         0                 0   
2                  0  ...         0   

In [38]:
# Combine validation arguments and labels column-wise by row position.
df_all_val = pd.concat([df_val, df_val_label.reindex(df_val.index)], axis=1)
print(df_all_val.head())
print(df_all_val.shape)

0 Argument ID                                       Conclusion       Stance  \
1      A01001                   Entrapment should be legalized  in favor of   
2      A01012  The use of public defenders should be mandatory  in favor of   
3      A02001                    Payday loans should be banned  in favor of   
4      A02002                       Surrogacy should be banned      against   
5      A02009                   Entrapment should be legalized      against   

0                                          Premise\n Argument ID  \
1  if entrapment can serve to more easily capture...      A01001   
2  the use of public defenders should be mandator...      A01012   
3  payday loans create a more impoverished societ...      A02001   
4  Surrogacy should not be banned as it is the wo...      A02002   
5  entrapment is gravely immoral and against huma...      A02009   

0 Self-direction: thought Self-direction: action Stimulation Hedonism  \
1                       0                  

### Test set

In [39]:
# Load the test arguments (same parsing as for the training split).

with open('arguments/arguments-test.tsv', 'r', encoding='utf-8') as f:
    lines = f.readlines()
    data = [line.split('\t') for line in lines]
    df_test = pd.DataFrame(data)
    df_test.columns = df_test.iloc[0]
    df_test = df_test[1:]

print(df_test.head())
print(df_test.shape)

0 Argument ID                          Conclusion       Stance  \
1      A26004    We should end affirmative action      against   
2      A26010    We should end affirmative action  in favor of   
3      A26016           We should ban naturopathy  in favor of   
4      A26024  We should prohibit women in combat  in favor of   
5      A26026           We should ban naturopathy  in favor of   

0                                          Premise\n  
1  affirmative action helps with employment equit...  
2  affirmative action can be considered discrimin...  
3  naturopathy is very dangerous for the most vul...  
4  women shouldn't be in combat because they aren...  
5  once eradicated illnesses are returning due to...  
(1576, 4)


In [40]:
# Load the test labels (same parsing as for the training split).

with open('arguments/labels-test.tsv', 'r', encoding='utf-8') as f:
    lines = f.readlines()
    data = [line.split('\t') for line in lines]
    df_test_label = pd.DataFrame(data)
    df_test_label.columns = df_test_label.iloc[0]
    df_test_label = df_test_label[1:]

print(df_test_label.head())
print(df_test_label.shape)

0 Argument ID Self-direction: thought Self-direction: action Stimulation  \
1      A26004                       0                      0           0   
2      A26010                       0                      0           0   
3      A26016                       0                      0           0   
4      A26024                       0                      0           0   
5      A26026                       0                      0           0   

0 Hedonism Achievement Power: dominance Power: resources Face  \
1        0           1                0                0    0   
2        0           1                0                0    0   
3        0           1                0                0    0   
4        0           1                0                0    0   
5        0           1                0                0    0   

0 Security: personal  ... Tradition Conformity: rules  \
1                  1  ...         0                 0   
2                  0  ...         0   

In [41]:
# Combine test arguments and labels column-wise by row position.
df_all_test = pd.concat([df_test, df_test_label.reindex(df_test.index)], axis=1)
print(df_all_test.head())
print(df_all_test.shape)

0 Argument ID                          Conclusion       Stance  \
1      A26004    We should end affirmative action      against   
2      A26010    We should end affirmative action  in favor of   
3      A26016           We should ban naturopathy  in favor of   
4      A26024  We should prohibit women in combat  in favor of   
5      A26026           We should ban naturopathy  in favor of   

0                                          Premise\n Argument ID  \
1  affirmative action helps with employment equit...      A26004   
2  affirmative action can be considered discrimin...      A26010   
3  naturopathy is very dangerous for the most vul...      A26016   
4  women shouldn't be in combat because they aren...      A26024   
5  once eradicated illnesses are returning due to...      A26026   

0 Self-direction: thought Self-direction: action Stimulation Hedonism  \
1                       0                      0           0        0   
2                       0                      0

# Task 2: Model definition

You are tasked with defining several models for multi-label classification: two simple baselines and three BERT-based classifiers.

### Instructions

* **Baseline**: implement a random uniform classifier (an individual classifier per category).
* **Baseline**: implement a majority classifier (an individual classifier per category).

<br/>

* **BERT w/ C**: define a BERT-based classifier that receives an argument **conclusion** as input.
* **BERT w/ CP**: add argument **premise** as an additional input.
* **BERT w/ CPS**: add argument premise-to-conclusion **stance** as an additional input.
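
One possible way to prepare the text input for the three BERT variants is sketched below. It assumes that the extra fields are simply appended to the conclusion as text, separated by a separator string. The function name `build_input_text` and the `" [SEP] "` default are assumptions made for this illustration. Other designs are equally valid, for example feeding the stance to the classification head as a separate binary feature.

```python
def build_input_text(conclusion, premise=None, stance=None, sep=" [SEP] "):
    """Join the available argument fields into a single input string.

    conclusion only               -> input for BERT w/ C
    conclusion + premise          -> input for BERT w/ CP
    conclusion + premise + stance -> input for BERT w/ CPS
    """
    parts = [conclusion]
    if premise is not None:
        parts.append(premise)
    if stance is not None:
        parts.append(stance)
    # strip() also removes stray newlines left over from line-based file parsing.
    return sep.join(part.strip() for part in parts)


text_c = build_input_text("We should ban fast food")
text_cp = build_input_text("We should ban fast food", premise="fast food is unhealthy\n")
text_cps = build_input_text(
    "We should ban fast food", premise="fast food is unhealthy\n", stance="in favor of"
)
print(text_cps)
```

The resulting strings would then be passed to the tokenizer of whichever BERT checkpoint is chosen.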

### Implement a random uniform classifier

In [None]:
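# Random uniform baseline: an independent fair coin flip per argument and category (nothing is trained).
# Uses the label columns currently available; swap in the level 3 columns once they have been merged.
label_columns = [c for c in df_train_label.columns if c != 'Argument ID']
rng = np.random.default_rng(42)  # fixed seed for reproducibility

# 0/1 predictions of shape (n_validation_arguments, n_categories)
y_pred_random = rng.integers(0, 2, size=(len(df_all_val), len(label_columns)))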

### Implement a majority classifier

In [None]:
# Majority baseline: for each category, always predict its most frequent training label.
label_columns = [c for c in df_train_label.columns if c != 'Argument ID']
y_train = df_all_train[label_columns].astype(int).to_numpy()

# A category's majority label is 1 only if more than half of the training arguments carry it.
majority_labels = (y_train.mean(axis=0) > 0.5).astype(int)

# Repeat the per-category majority label for every validation argument.
y_pred_majority = np.tile(majority_labels, (len(df_all_val), 1))

# Task 3: Metrics

Before training the models, you are tasked to define the evaluation metrics for comparison.

### Instructions

* Evaluate your models using per-category binary F1-score.
* Compute the average binary F1-score over all categories (macro F1-score).
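
The two metrics above can be sketched with plain NumPy as follows. This is a minimal illustration, assuming that labels and predictions are 0/1 matrices of shape `(n_arguments, n_categories)`. The function names `per_category_f1` and `macro_f1` are chosen for this example, and the F1-score of a category without any true or predicted positives is set to 0 by convention here.

```python
import numpy as np


def per_category_f1(y_true, y_pred):
    """Binary F1-score of the positive class, computed separately for each column."""
    y_true = np.asarray(y_true).astype(bool)
    y_pred = np.asarray(y_pred).astype(bool)
    tp = (y_true & y_pred).sum(axis=0)   # true positives per category
    fp = (~y_true & y_pred).sum(axis=0)  # false positives per category
    fn = (y_true & ~y_pred).sum(axis=0)  # false negatives per category
    denom = 2 * tp + fp + fn
    # F1 = 2TP / (2TP + FP + FN); left at 0 where the denominator is 0.
    return np.divide(2 * tp, denom, out=np.zeros(len(denom), dtype=float), where=denom > 0)


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-category F1-scores."""
    return float(per_category_f1(y_true, y_pred).mean())


y_true = [[1, 0], [1, 1], [0, 1], [0, 0]]
y_pred = [[1, 0], [0, 1], [0, 1], [1, 0]]
print(per_category_f1(y_true, y_pred))
print(macro_f1(y_true, y_pred))
```

`sklearn.metrics.f1_score` with `average=None` and `average="macro"` computes the same quantities on multi-label indicator matrices and can serve as a cross-check.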

# Task 4: Training and Evaluation

You are now tasked to train and evaluate **all** defined models.

### Instructions

* Train **all** models on the train set.
* Evaluate **all** models on the validation set.
* Pick **at least** three seeds for robust estimation.
* Compute metrics on the validation set.
* Report **per-category** and **macro** F1-score for comparison.
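
A multi-seed evaluation loop could look like the sketch below. The callable `train_and_evaluate` is hypothetical: it stands for whatever routine trains one model with a given seed and returns its validation macro F1-score. The seed values are arbitrary. Only Python's and NumPy's random generators are seeded here; the deep learning framework in use needs its own seeding call as well.

```python
import random
import statistics

import numpy as np


def set_seed(seed):
    """Seed the Python and NumPy random number generators."""
    random.seed(seed)
    np.random.seed(seed)


def run_over_seeds(train_and_evaluate, seeds=(42, 1337, 2023)):
    """Run one training per seed and return (mean, standard deviation) of the scores."""
    scores = []
    for seed in seeds:
        set_seed(seed)
        scores.append(train_and_evaluate(seed))
    return statistics.mean(scores), statistics.stdev(scores)


# Stand-in for a real training run, used only to make the sketch executable.
fake_scores = {42: 0.5, 1337: 0.6, 2023: 0.7}
mean_f1, std_f1 = run_over_seeds(lambda seed: fake_scores[seed])
print(f"macro F1: {mean_f1:.3f} +/- {std_f1:.3f}")
```

Reporting the mean together with the standard deviation shows whether a difference between two models exceeds the seed-to-seed variation.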

# Task 5: Error Analysis

You are tasked to discuss your results.

### Instructions

* **Compare** classification performance of BERT-based models with respect to baselines.
* Discuss the **differences in prediction** between the best performing BERT-based model and its variants.
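
As a starting point for comparing two models, the sketch below counts where their predictions disagree and which of the two is right in those cases. It assumes 0/1 matrices of shape `(n_arguments, n_categories)`. The function name `disagreement_report` and the keys of the returned dictionary are made up for this illustration.

```python
import numpy as np


def disagreement_report(y_true, pred_a, pred_b):
    """Summarize per-category disagreements between two sets of predictions."""
    y_true, pred_a, pred_b = (np.asarray(m) for m in (y_true, pred_a, pred_b))
    differs = pred_a != pred_b
    return {
        # How often the two models disagree, per category.
        "disagreements_per_category": differs.sum(axis=0),
        # Disagreements in which model A matches the gold label.
        "a_right_b_wrong": (differs & (pred_a == y_true)).sum(axis=0),
        # Disagreements in which model B matches the gold label.
        "b_right_a_wrong": (differs & (pred_b == y_true)).sum(axis=0),
        # Row positions worth inspecting by hand.
        "rows_with_disagreement": np.flatnonzero(differs.any(axis=1)),
    }


y_true = [[1, 0], [0, 1], [1, 1]]
pred_a = [[1, 0], [1, 1], [0, 1]]
pred_b = [[1, 1], [0, 1], [0, 1]]
report = disagreement_report(y_true, pred_a, pred_b)
print(report)
```

The row positions returned under `rows_with_disagreement` can be used to look up the corresponding arguments and read them alongside both predictions.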

# Task 6: Report

Wrap up your experiment in a short report (up to 2 pages).

### Instructions

* Use the NLP course report template.
* Summarize each task in the report following the provided template.