## Establishing a baseline

To judge our models, we first need a baseline to compare their results against. We compute a majority-class baseline on both the Hate speech version 2 dataset and the Toxigen dataset.

In [23]:
%load_ext autoreload
%autoreload 2

import os
import pandas as pd
from src.preprocessing.hatespeech_dataset_querying import hatespeech_v2_load_train_and_validation_set

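# Change the working directory two levels up only once, even if this cell is re-run.
try:
    print(run_only_once)
except NameError:
    # run_only_once is undefined on the first run, so the directory has not been changed yet
    print(os.getcwd())
    os.chdir("./../..")
    print(os.getcwd())
    run_only_once = "Dir has already been changed"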

The autoreload extension is already loaded. To reload it, use:
  %reload_ext autoreload
Dir has already been changed


## Hate speech version 2

In [24]:
train_df, validation_df = hatespeech_v2_load_train_and_validation_set()
print(len(train_df), len(validation_df))

43901 10976


In [25]:
def print_label_counts(df):
    """Print each label's share of the rows in df, as a percentage."""
    # value_counts() names its column "count" in pandas 2.x
    label_distribution = df["label"].value_counts().to_frame().reset_index()
    label_distribution["count"] = label_distribution["count"] / len(df.index) * 100

    for index, row in label_distribution.iterrows():
        print(f"Label in the dataset is '{row['label']}' comprising {row['count']:.2f}% of the dataset.")
        
print_label_counts(train_df)

Label in the dataset is '0.0' comprising 78.80% of the dataset.
Label in the dataset is '1.0' comprising 18.81% of the dataset.
Label in the dataset is '2.0' comprising 2.38% of the dataset.


Most of the Hate speech version 2 training set is labeled 0 (non-offensive, i.e. normal): this class makes up 78.80% of the data points. A model that always predicts this class therefore already reaches roughly that accuracy, which is the minimum any trained model has to beat.
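
The majority-class baseline described above can also be sketched without hard-coding the label: pick the most frequent label from the training labels and score it on the validation labels. This is a minimal illustration using only the standard library; the function name `majority_baseline_accuracy` and the toy label lists are hypothetical and not part of the project code.

```python
from collections import Counter


def majority_baseline_accuracy(train_labels, validation_labels):
    """Return the most frequent training label and its accuracy on the validation labels."""
    # most_common(1) yields [(label, count)] for the most frequent label
    majority_label, _ = Counter(train_labels).most_common(1)[0]
    # The baseline predicts majority_label for every validation row
    hits = sum(1 for label in validation_labels if label == majority_label)
    return majority_label, hits / len(validation_labels)


# Toy labels for illustration only (not taken from either dataset)
toy_train = [0, 0, 0, 1, 2]
toy_validation = [0, 0, 1, 0]
majority, accuracy = majority_baseline_accuracy(toy_train, toy_validation)
print(majority, accuracy)
```

Deriving the label from the training split keeps the baseline valid if the class balance changes, for example after merging in another dataset.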

In [26]:
from sklearn.metrics import classification_report, accuracy_score

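def get_baseline_predictions(validation_df_):
    """Score a baseline that always predicts label 0, the majority class."""
    y_baseline_pred = [0] * len(validation_df_.index)
    print(f"Accuracy: {accuracy_score(validation_df_['label'], y_baseline_pred)}")
    # zero_division=1 reports a precision of 1.00 for classes that are never predicted
    print(f"Classification report:\n{classification_report(validation_df_['label'], y_baseline_pred, zero_division=1)}")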

get_baseline_predictions(validation_df)

Accuracy: 0.7909985422740525
Classification report:
              precision    recall  f1-score   support

           0       0.79      1.00      0.88      8682
           1       1.00      0.00      0.00      2050
           2       1.00      0.00      0.00       244

    accuracy                           0.79     10976
   macro avg       0.93      0.33      0.29     10976
weighted avg       0.83      0.79      0.70     10976
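
The macro averages in this report are easier to read once the arithmetic is spelled out. Because the baseline never predicts classes 1 and 2, their precision is undefined, and `zero_division=1` reports it as 1.00. The sketch below recomputes the macro averages from the support counts in the report, and shows what the macro precision would be if the undefined values were counted as 0 instead. The variable names are illustrative only.

```python
# Support counts copied from the classification report above
support = {0: 8682, 1: 2050, 2: 244}
total = sum(support.values())

# Always predicting 0: every prediction is "0", so precision for class 0
# equals the share of true 0 labels, and recall for class 0 is 1.0
precision_0 = support[0] / total
recall_0 = 1.0
f1_0 = 2 * precision_0 * recall_0 / (precision_0 + recall_0)

# Undefined precisions counted as 1.0 (zero_division=1, as in the report)
macro_precision_reported = (precision_0 + 1.0 + 1.0) / 3
# Undefined precisions counted as 0.0 instead
macro_precision_strict = (precision_0 + 0.0 + 0.0) / 3
# F1 is 0 for classes 1 and 2 because their recall is 0
macro_f1 = (f1_0 + 0.0 + 0.0) / 3

print(f"{macro_precision_reported:.2f} {macro_precision_strict:.2f} {macro_f1:.2f}")
```

The macro F1 of 0.29 is therefore a more honest summary of this baseline than the macro precision of 0.93.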



The authors report reaching an accuracy of 98% on the English dataset, well above the 79% majority-class baseline.

## Toxigen dataset

In [27]:
from src.preprocessing.toxigen_querying import load_toxigen_train_and_validation

# the b'' prefix appears only because the text is stored as UTF-8 encoded bytes
train_df_toxi, validation_df_toxi = load_toxigen_train_and_validation()
train_df_toxi

Loaded 7168/8960 train samples and 1792/8960 validation samples


Unnamed: 0,label,text,target_group
0,hate,b'asians are trying to take over the world wit...,asian
1,neutral,b'i have never seen a hotter lineup in my life...,mexican
2,hate,b'muslims deseve to die',muslim
3,neutral,b'people who are deaf or hard of hearing shoul...,physical_dis
4,hate,"b""Jews were all bad people because they're in ...",jewish
...,...,...,...
7163,neutral,"b""i don't think i would be as good at math as ...",chinese
7164,hate,b'theres a HUGE difference between many (i.e. ...,latino
7165,hate,"b'he is from china, but he looks like a white ...",chinese
7166,hate,"b'no matter what they do or how hard they try,...",black


In [28]:
print_label_counts(train_df_toxi)

Label in the dataset is 'neutral' comprising 50.10% of the dataset.
Label in the dataset is 'hate' comprising 49.90% of the dataset.


In [29]:
# map hate to 1, and neutral to 0
validation_df_toxi["label"] = validation_df_toxi["label"].replace({"neutral": 0, "hate": 1})
get_baseline_predictions(validation_df_toxi)

Accuracy: 0.5061383928571429
Classification report:
              precision    recall  f1-score   support

           0       0.51      1.00      0.67       907
           1       1.00      0.00      0.00       885

    accuracy                           0.51      1792
   macro avg       0.75      0.50      0.34      1792
weighted avg       0.75      0.51      0.34      1792
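
As an alternative to `replace` for encoding the Toxigen labels, `Series.map` with an explicit dictionary turns every unmapped value into NaN, which makes unexpected labels easy to detect. The following is a minimal sketch under that assumption; `LABEL_TO_ID`, `encode_labels`, and the toy series are hypothetical names used for illustration, not part of the project code.

```python
import pandas as pd

# Same mapping as in the cell above: neutral -> 0, hate -> 1
LABEL_TO_ID = {"neutral": 0, "hate": 1}


def encode_labels(labels: pd.Series) -> pd.Series:
    """Map string labels to integer ids, failing loudly on unknown labels."""
    encoded = labels.map(LABEL_TO_ID)
    # map() yields NaN for any value that is missing from the dictionary
    unknown = labels[encoded.isna()].unique()
    if len(unknown) > 0:
        raise ValueError(f"Unmapped labels: {list(unknown)}")
    return encoded.astype(int)


# Toy labels for illustration only
toy_labels = pd.Series(["neutral", "hate", "hate"])
encoded_toy = encode_labels(toy_labels)
print(encoded_toy.tolist())
```

Assigning the result to a new column, or working on a copy of the frame, also keeps the original string labels available for inspection.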





## Merging Toxigen data into the Hate speech version 2 dataset to improve accuracy

In [30]:
tox_train_m, tox_val_m = load_toxigen_train_and_validation()
print(tox_train_m, tox_val_m )

Loaded 7168/8960 train samples and 1792/8960 validation samples
        label                                               text  target_group
0        hate  b'asians are trying to take over the world wit...         asian
1     neutral  b'i have never seen a hotter lineup in my life...       mexican
2        hate                           b'muslims deseve to die'        muslim
3     neutral  b'people who are deaf or hard of hearing shoul...  physical_dis
4        hate  b"Jews were all bad people because they're in ...        jewish
...       ...                                                ...           ...
7163  neutral  b"i don't think i would be as good at math as ...       chinese
7164     hate  b'theres a HUGE difference between many (i.e. ...        latino
7165     hate  b'he is from china, but he looks like a white ...       chinese
7166     hate  b'no matter what they do or how hard they try,...         black
7167     hate  b'if a woman wears a hijab, she must be forced...   

In [31]:
# frames = [tox_train_m, tox_val_m ]
# result = pd.concat(frames)[["text", "label"]]
# result

In [32]:
toxigen_large = pd.read_csv("./data/toxigen/toxigen.csv")
toxigen_large = toxigen_large[["generation", "prompt_label"]]
toxigen_large.columns = ["text", "label"]

# replace 1 with 2, since 2 in hate speech dataset is the label for hate
toxigen_large["label"] = toxigen_large["label"].replace({1: 2})
toxigen_large

Unnamed: 0,text,label
0,"if youre white, you should be proud to be whit...",2
1,the best thing that could have happened to the...,2
2,theres no hope for the future because theres n...,2
3,"if youre white, you should be proud of it.",2
4,"no matter how much money or power you have, if...",2
...,...,...
250946,My Chinese name is ai-lun (�媈礍) .\r\n- My fam...,0
250947,I can't believe they won,0
250948,fín had a reputation as a musicians' city,0
250949,I am thinking about getting a pet dog,0


In [33]:
len(toxigen_large)

250951

In [34]:
frames = [train_df, toxigen_large]
result = pd.concat(frames)
# keep only rows whose text is shorter than 512 characters
result = result[result["text"].str.len() < 512]
result

Unnamed: 0,text,label
0,"Father Brennan : When the Jews return to Zion,...",0
1,"On December 30, 1767, in Savannah, Georgia, sl...",0
2,Russian hackers are still going to be a huge p...,0
3,Let me be clear whatever you have to say about...,1
4,In dorana district of Mp hindutava organizatio...,0
...,...,...
250946,My Chinese name is ai-lun (�媈礍) .\r\n- My fam...,0
250947,I can't believe they won,0
250948,fín had a reputation as a musicians' city,0
250949,I am thinking about getting a pet dog,0


In [35]:
result.to_csv("./data/large_merged_training_set_toxigen_and_hate.csv", index=False)

In [36]:
print_label_counts(result)
get_baseline_predictions(result)

Label in the dataset is '0.0' comprising 54.22% of the dataset.
Label in the dataset is '2.0' comprising 42.98% of the dataset.
Label in the dataset is '1.0' comprising 2.80% of the dataset.
Accuracy: 0.542225146516171
Classification report:
              precision    recall  f1-score   support

           0       0.54      1.00      0.70    159874
           1       1.00      0.00      0.00      8259
           2       1.00      0.00      0.00    126715

    accuracy                           0.54    294848
   macro avg       0.85      0.33      0.23    294848
weighted avg       0.75      0.54      0.38    294848



In [37]:
print_label_counts(validation_df)

Label in the dataset is '0.0' comprising 79.10% of the dataset.
Label in the dataset is '1.0' comprising 18.68% of the dataset.
Label in the dataset is '2.0' comprising 2.22% of the dataset.


In [38]:
len(result)

294848