# Hate speech classification using the Fox News dataset

This notebook contains in-domain results on the Fox News comments and domain-adaptation results on the movies dataset.

The class labels are:

0: Normal speech

1: Hate speech

#### To run this notebook, create the following folders in the notebook's directory:

classification_reports/   : receives all classification reports generated by the model

data/         : contains the fox_news.csv annotation file

movies/       : contains the all_movies.csv file

movies/for_training/:    contains the 6 movie files used for cross-validation training and testing

training_checkpoints/in_domain/fox/cp_fox.ckpt  : path where the trained model weights are stored
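The folder layout listed above can be created in one step. The following is a minimal sketch using only the standard library; the helper name `create_project_dirs` and the `REQUIRED_DIRS` list are illustrative and not part of the original notebook.

```python
import os

# Directories the notebook reads from or writes to (taken from the list above).
# cp_fox.ckpt is a checkpoint file prefix, so only its parent folder is created.
REQUIRED_DIRS = [
    "classification_reports",
    "data",
    "movies/for_training",
    "training_checkpoints/in_domain/fox",
]


def create_project_dirs(base_dir="."):
    """Create every required directory under base_dir and return the paths."""
    created = []
    for rel in REQUIRED_DIRS:
        path = os.path.join(base_dir, rel)
        os.makedirs(path, exist_ok=True)  # no error if the folder already exists
        created.append(path)
    return created
```

Calling `create_project_dirs()` from the notebook's directory should be enough; the CSV files still have to be copied into `data/` and `movies/` by hand.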

In [7]:
! pip install transformers==2.6.0



In [8]:
import pandas as pd
from matplotlib import pyplot as plt
import numpy as np
import re
import tensorflow as tf
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report
import os
import glob

In [9]:
# initialize bert for 2 labels
from transformers import BertTokenizer, TFBertForSequenceClassification
from transformers import InputExample, InputFeatures

model = TFBertForSequenceClassification.from_pretrained("bert-base-uncased", 
                                                        trainable=True, 
                                                        num_labels=2)
tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")

All model checkpoint layers were used when initializing TFBertForSequenceClassification.

Some layers of TFBertForSequenceClassification were not initialized from the model checkpoint at bert-base-uncased and are newly initialized: ['classifier']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


In [10]:
model.summary()

Model: "tf_bert_for_sequence_classification_1"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
bert (TFBertMainLayer)       multiple                  109482240 
_________________________________________________________________
dropout_75 (Dropout)         multiple                  0         
_________________________________________________________________
classifier (Dense)           multiple                  1538      
Total params: 109,483,778
Trainable params: 109,483,778
Non-trainable params: 0
_________________________________________________________________


Initialize checkpoints

In [11]:
checkpoint_path = "training_checkpoints/in_domain/fox/cp_fox.ckpt"
checkpoint_dir = os.path.dirname(checkpoint_path)
cp_callback = tf.keras.callbacks.ModelCheckpoint(filepath=checkpoint_path,
                                                 save_weights_only=True,
                                                 verbose=1)

Read the hate speech dataset and split it into train and test sets

In [12]:
df = pd.read_csv("data/fox_news.csv")
df = df.drop(columns=['Unnamed: 0'])

In [13]:
df['label'] = df['label'].replace({2:1})
df.head()

Unnamed: 0,comment,label
0,Merkel would never say NO,1
1,"Expect more and more women to be asking .. ""wh...",1
2,Groping people in public wasn't already illega...,0
3,"Merkel, possible the only person in charge who...",1
4,"They know very well, no means NO They need to ...",1


In [14]:
def get_dataset(df, seed, test_size):
    return train_test_split(df, test_size=test_size, random_state=seed, shuffle=True)

In [15]:
train, test = get_dataset(df, 11, 0.2)

In [16]:
train.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 1210 entries, 941 to 1104
Data columns (total 2 columns):
 #   Column   Non-Null Count  Dtype 
---  ------   --------------  ----- 
 0   comment  1210 non-null   object
 1   label    1210 non-null   int64 
dtypes: int64(1), object(1)
memory usage: 28.4+ KB
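The 1210 training rows shown above and the 303 test rows reported in the classification report further down are consistent with a 0.2 test share of 1513 comments. The sketch below reproduces that arithmetic. It assumes that scikit-learn rounds a fractional test size up and gives the remainder to the training set, which is my understanding of `train_test_split` rather than something the notebook states; `split_sizes` is a hypothetical helper.

```python
import math


def split_sizes(n_samples, test_size):
    """Return (n_train, n_test) for a fractional test_size.

    Assumption: the test share is rounded up and the rest goes to training.
    """
    n_test = math.ceil(test_size * n_samples)
    n_train = n_samples - n_test
    return n_train, n_test


# 1210 + 303 = 1513 rows in total, derived from the outputs in this notebook.
n_train, n_test = split_sizes(1210 + 303, 0.2)
print(n_train, n_test)
```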


In [17]:
train.columns = ['DATA_COLUMN', 'LABEL_COLUMN']
test.columns = ['DATA_COLUMN', 'LABEL_COLUMN']

In [18]:
train.head()

Unnamed: 0,DATA_COLUMN,LABEL_COLUMN
941,It sure looks that way,0
569,Spoken like a true Wookie.,1
1496,Someone will dox the snarky Snarth. It's only ...,1
1129,"Oh my, you just blew LBJ's ""war against povert...",0
259,I did make a statement. I admit that,0


In [19]:
def convert_data_to_examples(train, test, DATA_COLUMN, LABEL_COLUMN): 
  train_InputExamples = train.apply(lambda x: InputExample(guid=None, # Globally unique ID for bookkeeping, unused in this case
                                                          text_a = x[DATA_COLUMN], 
                                                          text_b = None,
                                                          label = x[LABEL_COLUMN]), axis = 1)

  validation_InputExamples = test.apply(lambda x: InputExample(guid=None, # Globally unique ID for bookkeeping, unused in this case
                                                          text_a = x[DATA_COLUMN], 
                                                          text_b = None,
                                                          label = x[LABEL_COLUMN]), axis = 1)
  
  return train_InputExamples, validation_InputExamples

  
def convert_examples_to_tf_dataset(examples, tokenizer, max_length=128):
    features = [] # -> will hold InputFeatures to be converted later

    for e in examples:
        # See the tokenizer documentation of encode_plus for details on these arguments
        input_dict = tokenizer.encode_plus(
            e.text_a,
            add_special_tokens=True,
            max_length=max_length, # truncates if len(s) > max_length
            return_token_type_ids=True,
            return_attention_mask=True,
            pad_to_max_length=True, # pads on the right up to max_length
            truncation=True
        )

        input_ids, token_type_ids, attention_mask = (input_dict["input_ids"],
            input_dict["token_type_ids"], input_dict['attention_mask'])

        features.append(
            InputFeatures(
                input_ids=input_ids, attention_mask=attention_mask, token_type_ids=token_type_ids, label=e.label
            )
        )

    def gen():
        for f in features:
            yield (
                {
                    "input_ids": f.input_ids,
                    "attention_mask": f.attention_mask,
                    "token_type_ids": f.token_type_ids,
                },
                f.label,
            )

    return tf.data.Dataset.from_generator(
        gen,
        ({"input_ids": tf.int32, "attention_mask": tf.int32, "token_type_ids": tf.int32}, tf.int64),
        (
            {
                "input_ids": tf.TensorShape([None]),
                "attention_mask": tf.TensorShape([None]),
                "token_type_ids": tf.TensorShape([None]),
            },
            tf.TensorShape([]),
        ),
    )


DATA_COLUMN = 'DATA_COLUMN'
LABEL_COLUMN = 'LABEL_COLUMN'
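To make the effect of `max_length`, truncation and padding concrete, here is a simplified pure-Python illustration of what happens to a single sequence of token ids. It is not the tokenizer's implementation: the real tokenizer also keeps the closing special token when it truncates, and the function name, pad id and example ids below are assumptions chosen for illustration.

```python
def pad_and_truncate(token_ids, max_length, pad_id=0):
    """Cut or right-pad a list of token ids to exactly max_length.

    Returns the fixed-length ids and the matching attention mask
    (1 for real tokens, 0 for padding).
    """
    ids = list(token_ids[:max_length])   # truncate anything beyond max_length
    attention_mask = [1] * len(ids)      # every kept token is attended to
    padding = max_length - len(ids)
    ids += [pad_id] * padding            # pad on the right
    attention_mask += [0] * padding      # padding positions are masked out
    return ids, attention_mask


# Hypothetical token ids, shorter and longer than max_length.
short_ids, short_mask = pad_and_truncate([101, 2009, 102], max_length=5)
long_ids, long_mask = pad_and_truncate([101, 1, 2, 3, 4, 5, 102], max_length=5)
```

Because every example ends up with the same length, the batches built later with `.batch(32)` have a regular shape.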

In [20]:
train_InputExamples, validation_InputExamples = convert_data_to_examples(train, test, DATA_COLUMN, LABEL_COLUMN)

train_data = convert_examples_to_tf_dataset(list(train_InputExamples), tokenizer)
train_data = train_data.batch(32)

validation_data = convert_examples_to_tf_dataset(list(validation_InputExamples), tokenizer)
validation_data = validation_data.batch(32)



In [21]:
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=3e-6, epsilon=1e-08, clipnorm=1.0), 
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True), 
              metrics=[tf.keras.metrics.SparseCategoricalAccuracy('accuracy')])



In [22]:
hist = model.fit(train_data, epochs=17, validation_data=validation_data, callbacks=[cp_callback])

Epoch 1/17

Epoch 00001: saving model to training_checkpoints/in_domain/fox/cp_fox.ckpt
Epoch 2/17

Epoch 00002: saving model to training_checkpoints/in_domain/fox/cp_fox.ckpt
Epoch 3/17

Epoch 00003: saving model t

In [23]:
preds = model.predict(validation_data)
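The loss above is configured with `from_logits=True`, so the first element of `preds` is presumably a matrix of raw scores with one row per comment and one column per class. The predicted label is the column with the highest score. A small pure-Python sketch with made-up logits (the values and helper names are illustrative only):

```python
import math


def softmax(logits):
    """Turn one row of raw scores into probabilities that sum to 1."""
    m = max(logits)                              # subtract the max for stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict_labels(logit_rows):
    """Pick the index of the largest score in each row (an argmax)."""
    return [max(range(len(row)), key=row.__getitem__) for row in logit_rows]


toy_logits = [[2.0, -1.0], [0.3, 0.9], [-0.5, -0.2]]
labels = predict_labels(toy_logits)
probabilities = [softmax(row) for row in toy_logits]
```

Softmax does not change which column is largest, so taking the argmax of the logits directly, as the notebook does with `np.argmax`, gives the same labels.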



#### In-domain classification report for Fox News

In [24]:
print(classification_report(test['LABEL_COLUMN'],np.argmax(preds[0],axis=1)))

              precision    recall  f1-score   support

           0       0.84      0.89      0.86       229
           1       0.57      0.46      0.51        74

    accuracy                           0.78       303
   macro avg       0.70      0.67      0.68       303
weighted avg       0.77      0.78      0.77       303
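The two summary rows of the report differ only in how the per-class scores are combined: `macro avg` is the plain mean over classes, while `weighted avg` weights each class by its support. The sketch below recomputes both from the rounded per-class values printed above, so the results agree with the report only up to rounding. The helper names are illustrative.

```python
# Per-class rows copied from the report above (already rounded to two decimals).
rows = {
    0: {"precision": 0.84, "recall": 0.89, "f1": 0.86, "support": 229},
    1: {"precision": 0.57, "recall": 0.46, "f1": 0.51, "support": 74},
}


def macro_avg(rows, metric):
    """Unweighted mean of a metric over the classes."""
    return sum(r[metric] for r in rows.values()) / len(rows)


def weighted_avg(rows, metric):
    """Mean of a metric over the classes, weighted by class support."""
    total = sum(r["support"] for r in rows.values())
    return sum(r[metric] * r["support"] for r in rows.values()) / total


macro_f1 = macro_avg(rows, "f1")
weighted_f1 = weighted_avg(rows, "f1")
print(macro_f1, weighted_f1)
```

With three times as many normal comments as hateful ones, the weighted average is pulled toward the class-0 score, which is why the macro F1 is the more conservative summary here.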



In [25]:
cr = classification_report(test['LABEL_COLUMN'],np.argmax(preds[0],axis=1),output_dict=True)

In [26]:
pd.DataFrame(cr).transpose().to_csv('classification_reports/classification_bert_fox_indomain.csv')



---



#### Domain adaptation: predicting on movies with the Fox-trained 2-label model

In [27]:
def convert_data_to_examples_valid(data, DATA_COLUMN, LABEL_COLUMN): 
  inputExamples = data.apply(lambda x: InputExample(guid=None, # Globally unique ID for bookkeeping, unused in this case
                                                          text_a = x[DATA_COLUMN], 
                                                          text_b = None,
                                                          label = x[LABEL_COLUMN]), axis = 1)

  
  return inputExamples

In [28]:
df_movies = pd.read_csv('movies/all_movies.csv')

In [29]:
df_movies = df_movies.rename(columns={"text": "DATA_COLUMN", "majority_answer": "LABEL_COLUMN"})
df_movies.head()

Unnamed: 0.1,Unnamed: 0,movie_id,batch_id,LABEL_COLUMN,DATA_COLUMN,movie_name,Unnamed: 6,Unnamed: 7
0,0,AmericanHistoryX(1998)_1,1566624979,0,Derek.,AmerricanHistoryX,,
1,1,AmericanHistoryX(1998)_2,1566624979,1,What the fuck are you thinking?,AmerricanHistoryX,,
2,2,AmericanHistoryX(1998)_3,1566624979,0,There's a black guy outside breaking into your...,AmerricanHistoryX,,
3,3,AmericanHistoryX(1998)_4,1566624979,0,How long has he been there?,AmerricanHistoryX,,
4,4,AmericanHistoryX(1998)_5,1566624979,0,I don't know.,AmerricanHistoryX,,


In [30]:
# collapse the movie labels to 2 classes (label 2 becomes 1)
df_movies_2col = df_movies.copy()
df_movies_2col['LABEL_COLUMN'] = df_movies_2col['LABEL_COLUMN'].replace(2, 1)

In [31]:
movie2_InputExamples = convert_data_to_examples_valid(df_movies_2col, DATA_COLUMN, LABEL_COLUMN)
movie2_data = convert_examples_to_tf_dataset(list(movie2_InputExamples), tokenizer)
movie2_data = movie2_data.batch(32)



In [32]:
preds_movie = model.predict(movie2_data)

In [33]:
cr_movies = classification_report(df_movies_2col['LABEL_COLUMN'], np.argmax(preds_movie[0], axis=1), output_dict=True)

In [34]:
pd.DataFrame(cr_movies).transpose().to_csv('classification_reports/bert_fox_domain_adap_movies.csv')

#### Domain adaptation classification report: Fox News model evaluated on the movies dataset

In [35]:
pd.DataFrame(cr_movies).transpose()

Unnamed: 0,precision,recall,f1-score,support
0,0.885151,0.896051,0.890567,9014.0
1,0.400512,0.373955,0.386778,1674.0
accuracy,0.814278,0.814278,0.814278,0.814278
macro avg,0.642831,0.635003,0.638673,10688.0
weighted avg,0.809244,0.814278,0.811662,10688.0
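Accuracy on an imbalanced dataset is easier to read next to the share of the majority class, that is, the accuracy of always predicting label 0. The sketch below computes that reference value from the support counts printed in the two reports; `majority_baseline` is a hypothetical helper, not part of the notebook.

```python
def majority_baseline(support_by_class):
    """Accuracy obtained by always predicting the most frequent class."""
    total = sum(support_by_class.values())
    return max(support_by_class.values()) / total


# Support counts copied from the classification reports in this notebook.
movies_baseline = majority_baseline({0: 9014, 1: 1674})
fox_baseline = majority_baseline({0: 229, 1: 74})
print(round(movies_baseline, 4), round(fox_baseline, 4))
```

On the movies data the transferred model's accuracy of about 0.814 is below this reference value, while the in-domain accuracy of 0.78 on Fox News is above it, so the per-class scores for label 1 are the more informative numbers in the table above.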




---



### Cross validation


#### 6-fold cross-validation on movies, fine-tuning the Fox-trained model from above (each fold holds out one movie for testing)
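Each fold holds out one movie for testing and trains on the remaining ones. The list slicing used for this in the loop further down can be sketched in isolation as follows; the generator name and the placeholder movie names are illustrative.

```python
def leave_one_out_folds(items):
    """Yield (train_items, test_item) pairs, holding out each item once."""
    for i in range(len(items)):
        train = items[:i] + items[i + 1:]   # everything except position i
        test = items[i]
        yield train, test


movies = ["A", "B", "C"]                    # placeholder movie names
folds = list(leave_one_out_folds(movies))
for train, test in folds:
    print("train:", train, "test:", test)
```

With the six files in `movies/for_training/` this yields six folds of five training movies and one test movie each.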

In [36]:
from sklearn.metrics import classification_report

In [37]:
def convert_data_to_examples_cv(train, DATA_COLUMN, LABEL_COLUMN):
    train_InputExamples = train.apply(
        lambda x: InputExample(guid=None,  # Globally unique ID for bookkeeping, unused in this case
                               text_a=x[DATA_COLUMN],
                               text_b=None,
                               label=x[LABEL_COLUMN]), axis=1)

    return train_InputExamples


def convert_examples_to_tf_dataset_cv(examples, tokenizer, max_length=128):
    features = []  # -> will hold InputFeatures to be converted later

    for e in examples:
        # See the tokenizer documentation of encode_plus for details on these arguments
        input_dict = tokenizer.encode_plus(
            e.text_a,
            add_special_tokens=True,
            max_length=max_length,  # truncates if len(s) > max_length
            return_token_type_ids=True,
            return_attention_mask=True,
            pad_to_max_length=True,  # pads on the right up to max_length
            truncation=True
        )

        input_ids, token_type_ids, attention_mask = (input_dict["input_ids"],
                                                     input_dict["token_type_ids"], input_dict['attention_mask'])

        features.append(
            InputFeatures(
                input_ids=input_ids, attention_mask=attention_mask, token_type_ids=token_type_ids, label=e.label
            )
        )

    def gen():
        for f in features:
            yield (
                {
                    "input_ids": f.input_ids,
                    "attention_mask": f.attention_mask,
                    "token_type_ids": f.token_type_ids,
                },
                f.label,
            )

    return tf.data.Dataset.from_generator(
        gen,
        ({"input_ids": tf.int32, "attention_mask": tf.int32, "token_type_ids": tf.int32}, tf.int64),
        (
            {
                "input_ids": tf.TensorShape([None]),
                "attention_mask": tf.TensorShape([None]),
                "token_type_ids": tf.TensorShape([None]),
            },
            tf.TensorShape([]),
        ),
    )

def train_bert(df_train, df_test, load_weights = False):
    model = TFBertForSequenceClassification.from_pretrained("bert-base-uncased",
                                                            trainable=True,
                                                            num_labels=2)
    tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
    if load_weights:
        model.load_weights('training_checkpoints/in_domain/fox/cp_fox.ckpt')
    train = df_train[['text', 'majority_answer']]
    train.columns = ['DATA_COLUMN', 'LABEL_COLUMN']

    test = df_test[['text', 'majority_answer']]
    test.columns = ['DATA_COLUMN', 'LABEL_COLUMN']

    DATA_COLUMN = 'DATA_COLUMN'
    LABEL_COLUMN = 'LABEL_COLUMN'

    train_InputExamples = convert_data_to_examples_cv(train, DATA_COLUMN, LABEL_COLUMN)
    test_InputExamples = convert_data_to_examples_cv(test, DATA_COLUMN, LABEL_COLUMN)

    train_data = convert_examples_to_tf_dataset_cv(list(train_InputExamples), tokenizer)
    train_data = train_data.batch(32)

    # compile and fit
    model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=3e-6, epsilon=1e-08, clipnorm=1.0),
                  loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
                  metrics=[tf.keras.metrics.SparseCategoricalAccuracy('accuracy')])

    model.fit(train_data, epochs=6)

    test_data = convert_examples_to_tf_dataset_cv(list(test_InputExamples), tokenizer)
    test_data = test_data.batch(32)

    print('predicting')
    preds = model.predict(test_data)

    # classification
    return classification_report(pd.DataFrame(test['LABEL_COLUMN']), np.argmax(preds[0], axis=1), output_dict=True)

In [38]:
def load_movies_to_df(path):
    df_movies = []

    for filename in glob.glob(path + '*.csv'):
        df_movies.append(pd.read_csv(filename))

    return df_movies

In [39]:
df_movies = load_movies_to_df('movies/for_training/')
classification_reports = []
df_main = pd.DataFrame()

In [40]:
# leave-one-movie-out cross-validation: train on 5 movies, test on the held-out one
for i in range(len(df_movies)):
    df_train = pd.concat(df_movies[0:i] + df_movies[i + 1:])
    df_test = df_movies[i]

    df_train['majority_answer'] = df_train['majority_answer'].replace({2:1})
    df_test['majority_answer'] = df_test['majority_answer'].replace({2:1})

    train_movies = df_train['movie_name'].unique()
    test_movie = df_test['movie_name'].unique()
    print(','.join(train_movies))
    print(test_movie[0])
    report = train_bert(df_train, df_test, True)
    classification_reports.append(report)
    
    print('Train movies: ', str(','.join(train_movies)))
    print('Test movie: ', str(test_movie[0]))
    print('Classification report: \n', classification_reports[i])
    print('------------------------------------------------')

    df_cr = pd.DataFrame(classification_reports[i]).transpose()
    df_cr['movie_train'] =  str(','.join(train_movies))
    df_cr['movie_test'] = str(test_movie[0])
    df_cr.to_csv('classification_reports/'+'bert_fox_cv_finetune_testmovie_'+str(test_movie[0])+'.csv')
    df_main = pd.concat([df_main, df_cr])  # DataFrame.append was removed in pandas 2.0

Pulp_Fiction,AmerricanHistoryX,TheWolfofWallStreet,Django_Unchained,South_Park
BlacKkKlansman




Epoch 1/6
Epoch 2/6
Epoch 3/6
Epoch 4/6
Epoch 5/6
Epoch 6/6
predicting
Train movies:  Pulp_Fiction,AmerricanHistoryX,TheWolfofWallStreet,Django_Unchained,South_Park
Test movie:  BlacKkKlansman
Classification report: 
 {'0': {'precision': 0.969632768361582, 'recall': 0.939124487004104, 'f1-score': 0.9541348158443363, 'support': 1462}, '1': {'precision': 0.611353711790393, 'recall': 0.7650273224043715, 'f1-score': 0.6796116504854368, 'support': 183}, 'accuracy': 0.9197568389057751, 'macro avg': {'precision': 0.7904932400759874, 'recall': 0.8520759047042378, 'f1-score': 0.8168732331648866, 'support': 1645}, 'weighted avg': {'precision': 0.9297755845606535, 'recall': 0.9197568389057751, 'f1-score': 0.9235951567193037, 'support': 1645}}
------------------------------------------------
BlacKkKlansman,AmerricanHistoryX,TheWolfofWallStreet,Django_Unchained,South_Park
Pulp_Fiction




Epoch 1/6
Epoch 2/6
Epoch 3/6
Epoch 4/6
Epoch 5/6
Epoch 6/6
predicting
Train movies:  BlacKkKlansman,AmerricanHistoryX,TheWolfofWallStreet,Django_Unchained,South_Park
Test movie:  Pulp_Fiction
Classification report: 
 {'0': {'precision': 0.964951528709918, 'recall': 0.9707426856714179, 'f1-score': 0.967838444278235, 'support': 1333}, '1': {'precision': 0.8612099644128114, 'recall': 0.8373702422145328, 'f1-score': 0.8491228070175437, 'support': 289}, 'accuracy': 0.9469790382244143, 'macro avg': {'precision': 0.9130807465613646, 'recall': 0.9040564639429753, 'f1-score': 0.9084806256478893, 'support': 1622}, 'weighted avg': {'precision': 0.9464673658974249, 'recall': 0.9469790382244143, 'f1-score': 0.9466862746306766, 'support': 1622}}
------------------------------------------------
BlacKkKlansman,Pulp_Fiction,TheWolfofWallStreet,Django_Unchained,South_Park
AmerricanHistoryX




Epoch 1/6
Epoch 2/6
Epoch 3/6
Epoch 4/6
Epoch 5/6
Epoch 6/6
predicting
Train movies:  BlacKkKlansman,Pulp_Fiction,TheWolfofWallStreet,Django_Unchained,South_Park
Test movie:  AmerricanHistoryX
Classification report: 
 {'0': {'precision': 0.9573672400897532, 'recall': 0.9815950920245399, 'f1-score': 0.96932979931844, 'support': 1304}, '1': {'precision': 0.8947368421052632, 'recall': 0.7816091954022989, 'f1-score': 0.8343558282208589, 'support': 261}, 'accuracy': 0.9482428115015974, 'macro avg': {'precision': 0.9260520410975082, 'recall': 0.8816021437134194, 'f1-score': 0.9018428137696495, 'support': 1565}, 'weighted avg': {'precision': 0.9469221705217329, 'recall': 0.9482428115015974, 'f1-score': 0.9468197632440193, 'support': 1565}}
------------------------------------------------
BlacKkKlansman,Pulp_Fiction,AmerricanHistoryX,Django_Unchained,South_Park
TheWolfofWallStreet




Epoch 1/6
Epoch 2/6
Epoch 3/6
Epoch 4/6
Epoch 5/6
Epoch 6/6
predicting
Train movies:  BlacKkKlansman,Pulp_Fiction,AmerricanHistoryX,Django_Unchained,South_Park
Test movie:  TheWolfofWallStreet
Classification report: 
 {'0': {'precision': 0.9810024252223121, 'recall': 0.9810024252223121, 'f1-score': 0.9810024252223121, 'support': 2474}, '1': {'precision': 0.9202037351443124, 'recall': 0.9202037351443124, 'f1-score': 0.9202037351443124, 'support': 589}, 'accuracy': 0.9693111328762651, 'macro avg': {'precision': 0.9506030801833123, 'recall': 0.9506030801833123, 'f1-score': 0.9506030801833123, 'support': 3063}, 'weighted avg': {'precision': 0.9693111328762651, 'recall': 0.9693111328762651, 'f1-score': 0.9693111328762651, 'support': 3063}}
------------------------------------------------
BlacKkKlansman,Pulp_Fiction,AmerricanHistoryX,TheWolfofWallStreet,South_Park
Django_Unchained




Epoch 1/6
Epoch 2/6
Epoch 3/6
Epoch 4/6
Epoch 5/6
Epoch 6/6
predicting
Train movies:  BlacKkKlansman,Pulp_Fiction,AmerricanHistoryX,TheWolfofWallStreet,South_Park
Test movie:  Django_Unchained
Classification report: 
 {'0': {'precision': 0.9825806451612903, 'recall': 0.9819471308833011, 'f1-score': 0.9822637858755241, 'support': 1551}, '1': {'precision': 0.8578680203045685, 'recall': 0.8622448979591837, 'f1-score': 0.8600508905852418, 'support': 196}, 'accuracy': 0.9685174585002863, 'macro avg': {'precision': 0.9202243327329294, 'recall': 0.9220960144212424, 'f1-score': 0.9211573382303829, 'support': 1747}, 'weighted avg': {'precision': 0.9685888452346061, 'recall': 0.9685174585002863, 'f1-score': 0.9685524364325387, 'support': 1747}}
------------------------------------------------
BlacKkKlansman,Pulp_Fiction,AmerricanHistoryX,TheWolfofWallStreet,Django_Unchained
South_Park




Epoch 1/6
Epoch 2/6
Epoch 3/6
Epoch 4/6
Epoch 5/6
Epoch 6/6
predicting
Train movies:  BlacKkKlansman,Pulp_Fiction,AmerricanHistoryX,TheWolfofWallStreet,Django_Unchained
Test movie:  South_Park
Classification report: 
 {'0': {'precision': 0.9455930359085963, 'recall': 0.9764044943820225, 'f1-score': 0.9607517965726922, 'support': 890}, '1': {'precision': 0.8346456692913385, 'recall': 0.6794871794871795, 'f1-score': 0.7491166077738516, 'support': 156}, 'accuracy': 0.9321223709369025, 'macro avg': {'precision': 0.8901193525999674, 'recall': 0.827945836934601, 'f1-score': 0.8549342021732719, 'support': 1046}, 'weighted avg': {'precision': 0.9290463923213188, 'recall': 0.9321223709369025, 'f1-score': 0.9291886135395955, 'support': 1046}}
------------------------------------------------


In [41]:
df_main.to_csv('classification_reports/bert_crossvalid_finetune_fox.csv')

In [42]:
print(df_main)

              precision  ...           movie_test
0              0.969633  ...       BlacKkKlansman
1              0.611354  ...       BlacKkKlansman
accuracy       0.919757  ...       BlacKkKlansman
macro avg      0.790493  ...       BlacKkKlansman
weighted avg   0.929776  ...       BlacKkKlansman
0              0.964952  ...         Pulp_Fiction
1              0.861210  ...         Pulp_Fiction
accuracy       0.946979  ...         Pulp_Fiction
macro avg      0.913081  ...         Pulp_Fiction
weighted avg   0.946467  ...         Pulp_Fiction
0              0.957367  ...    AmerricanHistoryX
1              0.894737  ...    AmerricanHistoryX
accuracy       0.948243  ...    AmerricanHistoryX
macro avg      0.926052  ...    AmerricanHistoryX
weighted avg   0.946922  ...    AmerricanHistoryX
0              0.981002  ...  TheWolfofWallStreet
1              0.920204  ...  TheWolfofWallStreet
accuracy       0.969311  ...  TheWolfofWallStreet
macro avg      0.950603  ...  TheWolfofWallStreet


In [43]:
len(classification_reports[0])

5

In [44]:
df_main.head()

Unnamed: 0,precision,recall,f1-score,support,movie_train,movie_test
0,0.969633,0.939124,0.954135,1462.0,"Pulp_Fiction,AmerricanHistoryX,TheWolfofWallSt...",BlacKkKlansman
1,0.611354,0.765027,0.679612,183.0,"Pulp_Fiction,AmerricanHistoryX,TheWolfofWallSt...",BlacKkKlansman
accuracy,0.919757,0.919757,0.919757,0.919757,"Pulp_Fiction,AmerricanHistoryX,TheWolfofWallSt...",BlacKkKlansman
macro avg,0.790493,0.852076,0.816873,1645.0,"Pulp_Fiction,AmerricanHistoryX,TheWolfofWallSt...",BlacKkKlansman
weighted avg,0.929776,0.919757,0.923595,1645.0,"Pulp_Fiction,AmerricanHistoryX,TheWolfofWallSt...",BlacKkKlansman


In [45]:
def get_precision_recall_f1(category, result_df):
    precision = result_df[result_df.label==category].precision.mean()
    recall = result_df[result_df.label==category].recall.mean()
    f1 = result_df[result_df.label==category]['f1-score'].mean()
    
    return {'label': category, 'precision': precision, 'recall': recall, 'f1': f1}

In [46]:
df_cv= pd.read_csv('classification_reports/bert_crossvalid_finetune_fox.csv')

In [47]:
df_cv = df_cv.rename(columns={'Unnamed: 0': 'label'})
df_cv.head()

Unnamed: 0,label,precision,recall,f1-score,support,movie_train,movie_test
0,0,0.969633,0.939124,0.954135,1462.0,"Pulp_Fiction,AmerricanHistoryX,TheWolfofWallSt...",BlacKkKlansman
1,1,0.611354,0.765027,0.679612,183.0,"Pulp_Fiction,AmerricanHistoryX,TheWolfofWallSt...",BlacKkKlansman
2,accuracy,0.919757,0.919757,0.919757,0.919757,"Pulp_Fiction,AmerricanHistoryX,TheWolfofWallSt...",BlacKkKlansman
3,macro avg,0.790493,0.852076,0.816873,1645.0,"Pulp_Fiction,AmerricanHistoryX,TheWolfofWallSt...",BlacKkKlansman
4,weighted avg,0.929776,0.919757,0.923595,1645.0,"Pulp_Fiction,AmerricanHistoryX,TheWolfofWallSt...",BlacKkKlansman


In [48]:
normal_dict = get_precision_recall_f1('0', df_cv)
offensive_dict = get_precision_recall_f1('1',df_cv)

#### Results averaged over all 6 folds (unweighted mean per class)

In [49]:
df_result = pd.DataFrame([normal_dict, offensive_dict])
df_result

Unnamed: 0,label,precision,recall,f1
0,0,0.966855,0.971803,0.96922
1,1,0.830003,0.807657,0.81541
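The aggregate row for label 1 can be checked by hand: it is the unweighted mean of the six per-fold scores, so every held-out movie counts equally regardless of its size. The sketch below recomputes the F1 value from the per-fold reports printed above; the dictionary name is illustrative.

```python
from statistics import mean

# Class-1 F1 per held-out movie, copied from the fold reports above.
fold_f1_class1 = {
    "BlacKkKlansman": 0.6796116504854368,
    "Pulp_Fiction": 0.8491228070175437,
    "AmerricanHistoryX": 0.8343558282208589,
    "TheWolfofWallStreet": 0.9202037351443124,
    "Django_Unchained": 0.8600508905852418,
    "South_Park": 0.7491166077738516,
}

unweighted_f1 = mean(list(fold_f1_class1.values()))
print(round(unweighted_f1, 5))
```

The spread between folds is considerable (roughly 0.68 for BlacKkKlansman versus 0.92 for TheWolfofWallStreet), which a single averaged number hides.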


In [50]:
for cr in classification_reports:
  print(cr)

{'0': {'precision': 0.969632768361582, 'recall': 0.939124487004104, 'f1-score': 0.9541348158443363, 'support': 1462}, '1': {'precision': 0.611353711790393, 'recall': 0.7650273224043715, 'f1-score': 0.6796116504854368, 'support': 183}, 'accuracy': 0.9197568389057751, 'macro avg': {'precision': 0.7904932400759874, 'recall': 0.8520759047042378, 'f1-score': 0.8168732331648866, 'support': 1645}, 'weighted avg': {'precision': 0.9297755845606535, 'recall': 0.9197568389057751, 'f1-score': 0.9235951567193037, 'support': 1645}}
{'0': {'precision': 0.964951528709918, 'recall': 0.9707426856714179, 'f1-score': 0.967838444278235, 'support': 1333}, '1': {'precision': 0.8612099644128114, 'recall': 0.8373702422145328, 'f1-score': 0.8491228070175437, 'support': 289}, 'accuracy': 0.9469790382244143, 'macro avg': {'precision': 0.9130807465613646, 'recall': 0.9040564639429753, 'f1-score': 0.9084806256478893, 'support': 1622}, 'weighted avg': {'precision': 0.9464673658974249, 'recall': 0.9469790382244143, '