# Notebook: Train Model

This notebook trains a sentiment classification model on a dataset of tweets and saves the evaluation results to a JSON file.
<br>**Contributors:** [Nils Hellwig](https://github.com/NilsHellwig/) | [Markus Bink](https://github.com/MarkusBink/)

## Packages

In [69]:
from simpletransformers.classification import ClassificationModel
from get_germeval_2017_dataset import get_germeval_2017_dataset
from sklearn.metrics import f1_score, accuracy_score
import pandas as pd
import numpy as np
import random
import json
import os

## Parameters

In [70]:
SPLID_ID = 0
TEST_DATASET_PATH = f"../Datasets/k_fold_splits/TRAIN_TEST_{SPLID_ID}/test.csv"
N_TRAIN_EPOCHS = 4
TRAIN_BATCH_SIZE = 32
TEST_BATCH_SIZE = 32
USE_CUDA = False
SEED_VALUE = 0

MODEL_TYPE = "bert"
MODEL_NAME = "deepset/gbert-base"
MODEL_DIRECTORY_PATH = "output"

PATH_RESULT_DATA = f'../Models/Results/GermEval_Annotaded_it_{SPLID_ID}.json'
SAVE_MODEL = False

N_LABELS = 2  # every label ID in the training data must be smaller than this value

## Code

### 1. Get Reproducible Results

In [71]:
os.environ['PYTHONHASHSEED'] = str(SEED_VALUE)
random.seed(SEED_VALUE)
np.random.seed(SEED_VALUE)
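
The three calls above seed Python's hash function, the `random` module, and NumPy. The following is a minimal sketch of a single helper that also seeds PyTorch when it is installed. The name `seed_everything` is hypothetical, and whether the `manual_seed` training argument already covers PyTorch may depend on the installed `simpletransformers` version, so treat the explicit call as a safeguard rather than a requirement.

```python
import os
import random


def seed_everything(seed: int) -> None:
    """Seed every random number generator this notebook may touch."""
    # Only affects subprocesses started later: the running interpreter
    # reads PYTHONHASHSEED once at startup.
    os.environ["PYTHONHASHSEED"] = str(seed)
    random.seed(seed)

    try:
        import numpy as np
        np.random.seed(seed)
    except ImportError:
        pass  # NumPy not installed; nothing to seed

    try:
        import torch
        torch.manual_seed(seed)
        if torch.cuda.is_available():
            torch.cuda.manual_seed_all(seed)
    except ImportError:
        pass  # PyTorch not installed; nothing to seed


# Seeding twice with the same value should reproduce the same draw.
seed_everything(0)
first_draw = random.random()
seed_everything(0)
second_draw = random.random()
print(first_draw == second_draw)
```

Even with all generators seeded, some GPU kernels are non-deterministic, so small run-to-run differences can remain when `USE_CUDA` is enabled.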

### 2. Load Dataframes

#### Load Training Data
**Important:** Comment out any data frames that are not needed for the current run.

In [72]:
train_df_annotated_split = pd.read_csv(f'../Datasets/k_fold_splits/TRAIN_TEST_{SPLID_ID}/train.csv', encoding="utf-8")[["tweet","sentiment_label"]].rename(columns={"tweet":"text"})
train_df_germeval = get_germeval_2017_dataset()
train_df_annotated_total = pd.read_csv("../Datasets/annotations.csv", encoding="utf-8")[["tweet","sentiment_label"]].rename(columns={"tweet":"text"})

In [73]:
train_df = pd.concat([train_df_annotated_split, train_df_germeval], axis=0).sample(frac=1, random_state=SEED_VALUE).reset_index(drop=True)
train_df['sentiment_label'] = train_df['sentiment_label'].str.lower()

Check Labels

#### Load Test Data

In [74]:
test_df = pd.read_csv(TEST_DATASET_PATH, encoding="utf-8")[["tweet","sentiment_label"]].rename(columns={"tweet":"text"})
test_df['sentiment_label'] = test_df['sentiment_label'].str.lower()

Replace label strings with numbers

In [75]:
# Shared mapping so that training and test data use identical label IDs
label_to_id = {'negative': 1, 'positive': 0, 'neutral': 2}
train_df['sentiment_label'] = train_df['sentiment_label'].replace(label_to_id)
test_df['sentiment_label'] = test_df['sentiment_label'].replace(label_to_id)

In [76]:
train_df.sentiment_label.value_counts(), test_df.sentiment_label.value_counts()

(2    17758
 1     7707
 0     2343
 Name: sentiment_label, dtype: int64,
 1    204
 0    196
 Name: sentiment_label, dtype: int64)
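
The counts above show three label IDs (0, 1, 2) in the training data but only two (0, 1) in the test data, while `N_LABELS` is set to 2. A classifier with two outputs cannot be trained on a label ID of 2, so it may be worth checking the ID range before training. The following is a minimal sketch on toy lists; `check_label_ids` is a hypothetical helper, not part of `simpletransformers`.

```python
from collections import Counter


def check_label_ids(label_ids, num_labels):
    """Return the label IDs that fall outside the range [0, num_labels)."""
    counts = Counter(label_ids)
    return sorted(label for label in counts if not 0 <= label < num_labels)


# Toy label lists mirroring the IDs in the counts above (not the real data).
train_labels = [2, 2, 1, 0, 2, 1]
test_labels = [1, 0, 1, 0]

print(check_label_ids(train_labels, num_labels=2))  # [2]
print(check_label_ids(test_labels, num_labels=2))   # []
print(check_label_ids(train_labels, num_labels=3))  # []
```

If the check reports ID 2, one option is to raise `N_LABELS` to 3 and another is to drop the neutral rows, depending on whether the task is meant to be binary.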

### 3. Create Model

In [77]:
training_args = {
    "fp16":False,
    "num_train_epochs":N_TRAIN_EPOCHS,
    "overwrite_output_dir":True,
    "train_batch_size":TRAIN_BATCH_SIZE,
    "eval_batch_size":TEST_BATCH_SIZE,
    "manual_seed": SEED_VALUE,
    "reprocess_input_data":True,
    "no_save":True,
    "no_cache":True
}
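
To get a feel for what these settings imply, the number of batches can be worked out from the row counts printed earlier. This is a rough sketch that assumes one optimizer step per batch (no gradient accumulation) and that the full concatenated training frame is used.

```python
import math

# Row counts taken from the value_counts() output earlier in the notebook.
n_train_rows = 17758 + 7707 + 2343
n_test_rows = 204 + 196

n_train_epochs = 4
train_batch_size = 32
eval_batch_size = 32

steps_per_epoch = math.ceil(n_train_rows / train_batch_size)
total_train_steps = steps_per_epoch * n_train_epochs
eval_steps = math.ceil(n_test_rows / eval_batch_size)

print(steps_per_epoch, total_train_steps, eval_steps)  # 869 3476 13
```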

In [78]:
model = ClassificationModel(model_type=MODEL_TYPE, model_name=MODEL_NAME, num_labels=N_LABELS, args=training_args, use_cuda=USE_CUDA)

Some weights of the model checkpoint at deepset/gbert-base were not used when initializing BertForSequenceClassification: ['cls.seq_relationship.bias', 'cls.predictions.bias', 'cls.predictions.decoder.weight', 'cls.predictions.transform.LayerNorm.weight', 'cls.seq_relationship.weight', 'cls.predictions.transform.LayerNorm.bias', 'cls.predictions.transform.dense.bias', 'cls.predictions.transform.dense.weight']
- This IS expected if you are initializing BertForSequenceClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertForSequenceClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Some weights of BertForSequenceClassification were not initialized from the model checkpoint a

### 4. Train Model

In [79]:
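# Uncomment to fine-tune the model. While this line is disabled,
# the evaluation below scores a model whose classification head is untrained.
#model.train_model(train_df)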

### 5. Define Metrics

In [80]:
accuracy_metric = accuracy_score

def f1_metrics(labels, preds):
    """Return macro-, micro- and weighted-averaged F1 scores."""
    return {
        "f1_macro": f1_score(labels, preds, average='macro'),
        "f1_micro": f1_score(labels, preds, average='micro'),
        "f1_weighted": f1_score(labels, preds, average='weighted')
    }

### 6. Evaluate Model

In [81]:
result, model_outputs, wrong_predictions = model.eval_model(test_df, acc=accuracy_metric, f1=f1_metrics)




In [89]:
result

{'mcc': -0.0030018256651754634,
 'tp': 65,
 'tn': 133,
 'fp': 63,
 'fn': 139,
 'auroc': 0.4806422569027611,
 'auprc': 0.5034326263465296,
 'acc': 0.495,
 'f1': {'f1_macro': 0.47997116671815465,
  'f1_micro': 0.495,
  'f1_weighted': 0.4782030686849964},
 'eval_loss': 0.7151838265932523}
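
As a sanity check, the aggregate scores above can be recomputed from the four confusion counts alone. The sketch below treats label 1 as the positive class, which is consistent with `tp + fn` matching the 204 test rows with label 1.

```python
import math

# Confusion counts copied from the result above (label 1 is the positive class).
tp, tn, fp, fn = 65, 133, 63, 139
total = tp + tn + fp + fn

accuracy = (tp + tn) / total

# Per-class F1: for class 0 the roles of the counts are mirrored.
f1_class_1 = 2 * tp / (2 * tp + fp + fn)
f1_class_0 = 2 * tn / (2 * tn + fn + fp)

support_1 = tp + fn  # rows whose true label is 1
support_0 = tn + fp  # rows whose true label is 0

f1_macro = (f1_class_0 + f1_class_1) / 2
f1_weighted = (f1_class_0 * support_0 + f1_class_1 * support_1) / total

# Matthews correlation coefficient for the binary case.
mcc = (tp * tn - fp * fn) / math.sqrt(
    (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
)

print(accuracy, f1_macro, f1_weighted, mcc)
```

An accuracy near 0.5, an AUROC below 0.5, and an MCC close to zero on a roughly balanced binary test set are what chance-level predictions look like, which fits a model that has not been fine-tuned.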

In [91]:
with open(PATH_RESULT_DATA, 'w') as f:
    json.dump(result, f, default=str)
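
The cell above assumes that the results folder already exists; otherwise `open(..., 'w')` raises `FileNotFoundError`. A minimal sketch of a more defensive variant follows. The helper name `save_result` and the example file name are hypothetical, and the demonstration writes to a temporary folder rather than to `PATH_RESULT_DATA`.

```python
import json
import tempfile
from pathlib import Path


def save_result(result: dict, path: str) -> None:
    """Write a metrics dictionary to a JSON file, creating parent folders."""
    target = Path(path)
    target.parent.mkdir(parents=True, exist_ok=True)
    with target.open("w", encoding="utf-8") as f:
        # default=str mirrors the cell above: values that json cannot
        # encode are stored as their string representation.
        json.dump(result, f, default=str, indent=2)


# Demonstration in a temporary folder with a shortened copy of the metrics.
example_result = {"acc": 0.495, "f1": {"f1_micro": 0.495}}
with tempfile.TemporaryDirectory() as tmp_dir:
    example_path = Path(tmp_dir) / "Results" / "example.json"
    save_result(example_result, str(example_path))
    loaded = json.loads(example_path.read_text(encoding="utf-8"))

print(loaded == example_result)
```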

### 7. Save Model

In [None]:
if SAVE_MODEL:
    model.save_model(MODEL_DIRECTORY_PATH)