# **Dynamically Generated Hate Speech Data : Data Analysis of Performance**

**About The Dataset**

The Dynamically Generated Hate Speech Dataset is provided in two tables.

The first table is the dataset of entries, with the entry ID, label, type, annotator ID, status, round, split, round model predictions and whether the model was fooled (model_wrong).

The second table holds the targets of the hate, in a wide format. Because annotators could identify targets inductively, a large number were identified with only one or two associated entries, often when they were intersectional characteristics. All identities mentioned in fewer than 15 entries are combined into an 'Other' category. This affects less than 1% of all the hateful entries, whilst reducing the number of target identities to 41. The two tables can be merged on the 'id' variable.
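
To make the 'Other' grouping concrete, here is a minimal sketch of how rare categories can be collapsed with pandas. The toy values, the variable names and the threshold of 2 are illustrative assumptions (the dataset description uses 15), and this is not necessarily how the dataset authors implemented it.

```python
import pandas as pd

# Toy target annotations, one per entry (hypothetical values, not from the dataset)
targets = pd.Series(["wom", "wom", "bla", "bla", "bla", "rare_a", "rare_b"])

MIN_ENTRIES = 2  # illustrative threshold; the dataset description uses 15

# Count how often each identity appears, then map the rare ones to 'other'
counts = targets.value_counts()
rare = counts[counts < MIN_ENTRIES].index
collapsed = targets.where(~targets.isin(rare), "other")

print(collapsed.value_counts())
```

`Series.where` keeps a value where the condition is true and substitutes `"other"` elsewhere, so frequent identities pass through unchanged.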

**Let's dive into the notebook without further ado!**

In [None]:
# This Python 3 environment comes with many helpful analytics libraries installed
# It is defined by the kaggle/python Docker image: https://github.com/kaggle/docker-python
# For example, here's several helpful packages to load

import numpy as np # linear algebra
import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)
import seaborn as sns
import matplotlib.pyplot as plt
import warnings
warnings.filterwarnings("ignore")

# Input data files are available in the read-only "../input/" directory
# For example, running this (by clicking run or pressing Shift+Enter) will list all files under the input directory

import os
for dirname, _, filenames in os.walk('/kaggle/input'):
    for filename in filenames:
        print(os.path.join(dirname, filename))

# You can write up to 20GB to the current directory (/kaggle/working/) that gets preserved as output when you create a version using "Save & Run All" 
# You can also write temporary files to /kaggle/temp/, but they won't be saved outside of the current session

In [None]:
!pip install contractions

In [None]:
import nltk
import re
import contractions
from nltk.corpus import stopwords
from nltk.stem import PorterStemmer
from nltk.stem import WordNetLemmatizer

nltk.download('stopwords')
nltk.download('wordnet')  # the WordNet lemmatizer needs this corpus
STOPWORDS = set(stopwords.words('english'))
stemmer = PorterStemmer()
lemmatizer = WordNetLemmatizer()

# **Loading the Dataset**

In [None]:
data = pd.read_csv("/kaggle/input/dynamically-generated-hate-speech-dataset/2020-12-31-DynamicallyGeneratedHateDataset-entries-v0.1.csv")
data.head()

In [None]:
data.drop(columns = ['Unnamed: 0'], axis = 1 , inplace = True)
data.head()

In [None]:
data.info()

In [None]:
data['id'].dtype

# **Checking for NaN values**

I am basically looking for NaN values to see how they might influence the analysis I'll perform.

In [None]:
for i in data.columns:
    print(i,":",data[str(i)].isnull().sum()/data.shape[0])

The NaN values make up only 35% of the data, so I am dropping them for now.

In [None]:
data.dropna(axis = 0,inplace = True)
for i in data.columns:
    print(i,":",data[str(i)].isnull().sum()/data.shape[0])
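
A more compact way to get the same per-column shares, plus the fraction of rows that `dropna` would actually remove, is sketched below. The toy frame is made up and only stands in for the entries table.

```python
import numpy as np
import pandas as pd

# Toy frame standing in for the entries table (values are made up)
df = pd.DataFrame({
    "text": ["a", "b", "c", "d"],
    "model_wrong": [True, np.nan, False, np.nan],
})

# Fraction of missing values per column
nan_share = df.isnull().mean()

# Fraction of rows that dropna() would remove
rows_lost = 1 - len(df.dropna()) / len(df)

print(nan_share)
print(f"rows lost: {rows_lost:.0%}")
```

It can also be worth checking whether the missing values are concentrated in a particular `round` or `split`, since dropping them would then remove that subset entirely.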

# **Exploratory Data Analysis**


The analysis is organized under the following headings:

**1.  All About Type  : Exploring Type Distribution**

**2.  Model Evaluation on Labels**

**3.  Text Properties and Relations With Other Variables**
        
       3.1 Word Clouds and More
    
       3.2 Lists of Common Words
    
**4.  Annotators and Model Wrong**

**5.  Exploring Targets : Merging Tables**
        
       5.1 Target Distribution in Dataset
       
       5.2 Exploring Top Two Values
      
**6.  Visualizing Embeddings**



#  **All About Type**

Let's check out the distribution of types in the dataset.

In [None]:
data.groupby('type').count()['id']

In [None]:
data['type'].nunique()

In [None]:
fig,ax = plt.subplots(ncols = 3, figsize = (20,4) , dpi = 100)
#plt.tight_layout()

colors = ['#66c2a5', '#fc8d62' , '#8da0cb' ,'#e78ac3' , '#a6d854' , '#ffd92f','#e5c494']
data['type'].value_counts().plot(kind = 'pie',ax = ax[0], labels = data['type'].value_counts().index , colors = colors)
sns.countplot(x = 'type',data = data , ax = ax[1] , palette = 'Paired')
sns.countplot(x = 'type' , data = data , hue = 'model_wrong', ax = ax[2] , palette = 'Paired')

for i in range(3):
    ax[i].set_ylabel('')
    ax[i].tick_params(axis='x', labelsize=15 , rotation = 45)
    ax[i].tick_params(axis='y', labelsize=15)

ax[0].set_title('Type Distribution in Data', fontsize=13)
ax[1].set_title('Type Count in Data', fontsize=13)
ax[2].set_title('Model Evaluation on Type', fontsize = 13)

plt.show()


From the above charts we can conclude the following about type:

1.  None and Not Given are the most frequent types.
2.  Derogation follows them in third place.
3.  The model most often identified the label successfully for sentences of types None and Derogation.
4.  The model was fooled the maximum number of times by the None and Not Given types. Since Not Given is a mixture of other types, we cannot really point out the specific characteristic which might have resulted in this.
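
Raw counts favour the most frequent types, so it may help to look at the share of entries per type on which the model was fooled. Below is a minimal sketch on made-up data; it assumes `model_wrong` is still boolean, i.e. it runs before any cast to string.

```python
import pandas as pd

# Toy entries (made-up values)
df = pd.DataFrame({
    "type": ["none", "none", "none", "none", "derogation", "derogation"],
    "model_wrong": [True, False, False, False, True, False],
})

# The mean of a boolean column is the share of True values,
# i.e. the fraction of entries of each type that fooled the model
error_rate = (
    df.groupby("type")["model_wrong"]
      .mean()
      .sort_values(ascending=False)
)
print(error_rate)
```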

# **Model Evaluation on Labels**

Let's see how well the model performed on labels.

In [None]:
plt.figure(figsize=(10,10))
sns.countplot(x = 'label' , data = data, hue = 'model_wrong' , palette = 'Paired')
plt.ylabel("")
plt.tick_params(axis = 'x',labelsize = 15)
plt.tick_params(axis = 'y',labelsize = 15)
plt.title("Model Evaluation on Label" , fontsize = 15)
plt.show()

While the model assigns correct labels a nearly equal number of times for hate and not hate sentences, when it assigns a wrong label the sentence is more likely to be a hate comment. This could be because of the distribution of hate and not hate comments in the dataset. Let's check that out.

In [None]:
plt.figure(figsize=(10,10))
sns.countplot(x = 'label' , data = data, palette = 'Paired')
plt.ylabel("")
plt.tick_params(axis = 'x',labelsize = 15)
plt.tick_params(axis = 'y',labelsize = 15)
plt.title("Label Distribution" , fontsize = 15)
plt.show()

Okay, this is not something I was expecting. The previous countplot would suggest more not hate sentences than hate ones, but this graph tells a completely different story.

Bookmarking this for now and will come back to it later.
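
One way to come back to this is to put counts and rates side by side. The sketch below uses `pd.crosstab` on made-up data; the outcome names `fooled` and `correct` are my own labels, not columns from the dataset.

```python
import pandas as pd

# Toy entries (made-up values)
df = pd.DataFrame({
    "label": ["hate", "hate", "hate", "nothate", "nothate", "nothate", "nothate"],
    "model_wrong": [True, True, False, True, False, False, False],
})

# Readable outcome names instead of True/False
outcome = df["model_wrong"].map({True: "fooled", False: "correct"})

# Raw counts: the numbers a countplot with hue would draw
counts = pd.crosstab(df["label"], outcome)

# Row-normalised: share of each label that fooled the model
rates = pd.crosstab(df["label"], outcome, normalize="index")

print(counts)
print(rates)
```

If the two plots seem to disagree, comparing `counts` with `rates` should show whether the difference comes from class sizes or from the rows that were dropped earlier.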

# **Text Properties and Relations with Other Variables**

Let's check the text properties and relations with other variables

In [None]:
def clean_txt(txt):
    # Strip HTML tags and lowercase
    txt = re.sub(r'<[^>]+>', '', txt.lower())
    # Expand contractions first, while the apostrophes are still present
    txt = contractions.fix(txt)
    # Drop emojis and other non-ASCII characters
    txt = txt.encode("ascii", "ignore").decode()
    # Remove digits
    txt = ''.join(ch for ch in txt if not ch.isdigit())
    # Replace punctuation with spaces
    txt = re.sub(r'[^\w\s]', ' ', txt)
    # Remove stopwords and words shorter than three characters
    tokens = [t for t in txt.split() if t not in STOPWORDS and len(t) > 2]
    # Open question: should stemming (stemmer.stem) or lemmatization be used, and why?
    # lemmatize() expects a single word, so it is applied token by token
    tokens = [lemmatizer.lemmatize(t) for t in tokens]
    return ' '.join(tokens)
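
The order of the cleaning steps matters: if punctuation is stripped before contractions are expanded, the apostrophes are already gone and the expansion has nothing to match. Here is a minimal sketch of that effect. The tiny `CONTRACTIONS` map and the helper names are stand-ins I made up for illustration, not the `contractions` package itself.

```python
import re

# Tiny stand-in for the `contractions` package (illustrative only)
CONTRACTIONS = {"don't": "do not", "can't": "cannot", "i'm": "i am"}

def expand(text):
    # Replace each known contraction with its expanded form
    return " ".join(CONTRACTIONS.get(tok, tok) for tok in text.split())

def strip_punct(text):
    # Replace anything that is not a word character or whitespace with a space
    return re.sub(r"[^\w\s]", " ", text)

sentence = "i don't think so"

# Expanding first yields real words
expand_first = strip_punct(expand(sentence)).split()

# Stripping punctuation first leaves the fragments 'don' and 't'
punct_first = expand(strip_punct(sentence)).split()

print(expand_first)
print(punct_first)
```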

In [None]:
data['Clean Text'] = data['text'].apply(clean_txt)
data.head()

In [None]:
data['model_wrong'] = data['model_wrong'].astype("string")
data['model_wrong'].dtype

In [None]:
vocab = []
model_wrong = []
label = []
for _, row in data.iterrows():
    a = row['Clean Text'].split()
    # Label: 0 = hate, 1 = not hate (one flag per word)
    if row['label'] == 'hate':
        label += [0] * len(a)
    else:
        label += [1] * len(a)
    # Model Wrong: 1 = model was fooled, 0 = model was right,
    # which is the encoding the word lists and plot titles below rely on
    if row['model_wrong'] == 'True':
        model_wrong += [1] * len(a)
    else:
        model_wrong += [0] * len(a)
    vocab += a
    

In [None]:
vocab_model_relation = pd.DataFrame({'Words': vocab , 'Model Wrong': model_wrong ,'Label': label })
#vocab_model_relation.drop_duplicates(subset=['Words'],inplace=True)
vocab_model_relation.head()
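
The same one-row-per-word table can be built without an explicit loop by splitting each sentence and calling `explode`. This is a sketch on made-up data with its own explicit encoding (1 = model was wrong, 1 = not hate); the frame `per_word` is a hypothetical name.

```python
import pandas as pd

# Toy cleaned entries (made-up values)
df = pd.DataFrame({
    "Clean Text": ["some clean words", "more words"],
    "label": ["hate", "nothate"],
    "model_wrong": ["True", "False"],
})

# One row per word: split each sentence into a list, then explode the lists
per_word = (
    df.assign(Words=df["Clean Text"].str.split())
      .explode("Words")
      .reset_index(drop=True)
)

# Encode the flags as integers (1 = model was wrong, 1 = not hate)
per_word["Model Wrong"] = (per_word["model_wrong"] == "True").astype(int)
per_word["Label"] = (per_word["label"] != "hate").astype(int)

print(per_word[["Words", "Model Wrong", "Label"]])
```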

## **Word Clouds and More**

We are looking for keywords in our sentences that the model might have associated with a certain label while learning the characteristics of words through its embeddings.

In [None]:
words = vocab_model_relation[vocab_model_relation['Model Wrong'] == 1]['Words'].value_counts().index
words

In [None]:
words2 = vocab_model_relation[vocab_model_relation['Model Wrong'] == 0]['Words'].value_counts().index
words2

In [None]:
common_words = list(set(words)&set(words2))
common_words[:10]

In [None]:
words = list(set(words).difference(set(common_words)))
words[:10]

In [None]:
words2 = list(set(words2).difference(set(common_words)))
words2[:10]
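
The set operations above can be sketched on toy tokens as follows (the word lists are made up). Keeping a `Counter` alongside the sets preserves the frequencies, which the sets alone throw away.

```python
from collections import Counter

right_words = ["people", "love", "people", "kind"]  # toy tokens, model correct
wrong_words = ["people", "awful"]                   # toy tokens, model fooled

right_counts = Counter(right_words)
wrong_counts = Counter(wrong_words)

# Words seen in both groups, and words seen in only one of them
common = set(right_counts) & set(wrong_counts)
only_right = set(right_counts) - common
only_wrong = set(wrong_counts) - common

print(sorted(common), sorted(only_right), sorted(only_wrong))
```

Note that sets are unordered, so slicing a list built from a set (for example `[:10]`) gives ten arbitrary words rather than the ten most frequent ones; `Counter.most_common` is one way to get the latter.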

In [None]:
from wordcloud import WordCloud
def wc(data,bgcolor,title):
    plt.figure(figsize = (13,10))
    wc = WordCloud(background_color = bgcolor, max_words = 1000,  max_font_size = 50)
    wc.generate(' '.join(data))
    plt.title(title , fontsize = 20)
    plt.imshow(wc)
    plt.axis('off')

wc(common_words,'black','Common Words')

These words appear both in sentences the model classified correctly and in sentences it got wrong. Since they are not specific to either group, for now we are assuming that they have an equal influence on both hate and non hate sentences for the model.

In [None]:
wc(words,'black','Unique Words For Which Predictions Were Wrong')

When we remove the common words, we get the words unique to each category. The ones for which the model was wrong are shown above.


In [None]:
wc(words2,'black','Unique Words For Which Model Evaluated True')

Again, these words include both what most of us call 'good' and 'bad' words. The context in which these words are used ends up determining whether the sentence is hateful or not.

## **Lists of Common Words**

Here we are going to see the twenty most used words, their total number of occurrences and how the model performed when it encountered them.
We will also check whether our assumption that the common words have an equal influence on both hateful and not hateful sentences is correct.

In [None]:
fig , ax = plt.subplots(ncols = 2,figsize = (20,10) , dpi = 100)

sns.barplot(y = vocab_model_relation[vocab_model_relation['Model Wrong'] == 0]['Words'].value_counts().index[0:20] , x = vocab_model_relation[vocab_model_relation['Model Wrong'] == 0]['Words'].value_counts().values[:20], ax = ax[0] , color = '#97d83e')
sns.barplot(y = vocab_model_relation[vocab_model_relation['Model Wrong'] == 1]['Words'].value_counts().index[0:20] , x = vocab_model_relation[vocab_model_relation['Model Wrong'] == 1]['Words'].value_counts().values[:20], ax = ax[1] , color = '#e55063')

for i in range(2):
    ax[i].tick_params(axis = 'x' , labelsize = 13)
    ax[i].tick_params(axis = 'y' , labelsize = 13)

ax[0].set_title('Model Got Them Right')
ax[1].set_title('Model Got Them Wrong')


From the above charts we can see that the top 5 words are the same in both cases. While most words are common, if you look carefully, their counts are not. The words occur more frequently in sentences where the model correctly assigns a label than in the ones where it assigns a wrong label.

We can also clearly see that the words everyone, wrong and really do not make it into the top 20 of the Model Got Them Wrong list.
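
Since the two groups contain very different numbers of sentences, raw counts are hard to compare across the two charts. One option is to compare relative frequencies within each group. A minimal sketch on made-up data, assuming an encoding where `wrong` is 1 when the model was fooled:

```python
import pandas as pd

# Toy per-word table; 'wrong' is 1 when the model was fooled (assumed encoding)
words = pd.DataFrame({
    "word":  ["people", "people", "people", "really", "really", "like"],
    "wrong": [1, 0, 0, 0, 0, 1],
})

# Relative frequency of each word within each group (each row sums to 1)
rel_freq = (
    words.groupby("wrong")["word"]
         .value_counts(normalize=True)
         .unstack(fill_value=0)
)
print(rel_freq)
```

A word whose share is similar in both rows behaves the same in both groups, regardless of how many sentences each group holds.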

In [None]:
fig , ax = plt.subplots(ncols = 2,figsize = (20,10) , dpi = 100)

sns.barplot(y = vocab_model_relation[vocab_model_relation['Label'] == 0]['Words'].value_counts().index[0:20] , x = vocab_model_relation[vocab_model_relation['Label'] == 0]['Words'].value_counts().values[:20], ax = ax[0] , color = '#97d83e')
sns.barplot(y = vocab_model_relation[vocab_model_relation['Label'] == 1]['Words'].value_counts().index[0:20] , x = vocab_model_relation[vocab_model_relation['Label'] == 1]['Words'].value_counts().values[:20], ax = ax[1] , color = '#e55063')

for i in range(2):
    ax[i].tick_params(axis = 'x' , labelsize = 13)
    ax[i].tick_params(axis = 'y' , labelsize = 13)

ax[0].set_title('Top 20 Words Used In Hate Comments')
ax[1].set_title('Top 20 Words Used In Non Hate Comments')


# **Annotators and Model Wrong**

In [None]:
data['annotator'].unique()

In [None]:
colors_false = ['grey' for i in data[data['model_wrong'] == 'False']['annotator'].value_counts().index]
colors_false[2] = '#dd5a5b'
colors_true = ['grey' for i in data[data['model_wrong'] == 'True']['annotator'].value_counts().index]
colors_true[1] = '#dd5a5b'

In [None]:
fig , ax = plt.subplots(ncols = 2,figsize = (20,10) , dpi = 100)

sns.barplot(y = data[data['model_wrong'] == 'False']['annotator'].value_counts().index , x = data[data['model_wrong'] == 'False']['annotator'].value_counts().values, ax = ax[0] , palette = colors_false)
sns.barplot(y = data[data['model_wrong'] == 'True']['annotator'].value_counts().index, x = data[data['model_wrong'] == 'True']['annotator'].value_counts().values, ax = ax[1] , palette = colors_true)

for i in range(2):
    ax[i].tick_params(axis = 'x' , labelsize = 13)
    ax[i].tick_params(axis = 'y' , labelsize = 13)
    
ax[0].set_title('Model Got Them Right')
ax[1].set_title('Model Got Them Wrong')


We can clearly see from the above graph that the model gets more sentences right when the annotator lqlkttromx has assigned the label, and gets more of them wrong when the annotator is elgzzdd8tvb.

However, inferring a causal relationship between annotators and the model's label assignment would be wrong, as other confounders may be present. We would need a way to test for them before concluding that the relationship is indeed causal.
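
One simple confounder is volume: an annotator who wrote many entries will top both charts. Reporting the number of entries next to the fooled rate makes that visible. A sketch on made-up data, assuming `model_wrong` is boolean here (the annotator IDs are placeholders):

```python
import pandas as pd

# Toy entries (made-up annotator IDs and outcomes)
df = pd.DataFrame({
    "annotator": ["a1", "a1", "a1", "a1", "a2", "a2"],
    "model_wrong": [True, False, False, False, True, True],
})

# Entries written, entries that fooled the model, and the resulting rate
per_annotator = (
    df.groupby("annotator")["model_wrong"]
      .agg(entries="count", fooled="sum", fooled_rate="mean")
      .sort_values("fooled_rate", ascending=False)
)
print(per_annotator)
```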

# **Exploring Targets**

We will now merge the tables and explore the targets.

In [None]:
new_data = pd.read_csv("/kaggle/input/dynamically-generated-hate-speech-dataset/2020-12-31-DynamicallyGeneratedHateDataset-targets-v0.1.csv")

In [None]:
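# Keep only the target indicator columns (assumes every column other than
# 'id' and any leftover 'Unnamed' index column is a 0/1 target flag)
target_cols = [c for c in new_data.columns if c != 'id' and not str(c).startswith('Unnamed')]

tags = []
for i in range(new_data.shape[0]):
    row = new_data.iloc[i][target_cols]
    try:
        # Name of the first target column flagged with 1 for this entry
        tags.append(target_cols[list(row.values).index(1)])
    except ValueError:
        # No target flagged for this entry
        tags.append('Nothing')

print(tags[:2])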

In [None]:
m_data = data.merge(pd.DataFrame({'id':new_data['id'],'targets':tags}) , on = 'id' , how='inner')
m_data.head()
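
Taking only the first flagged column drops the extra targets of entries that mention more than one identity. An alternative is to reshape the wide table to long format with `melt` before merging, so that every (entry, target) pair becomes its own row. A sketch on made-up tables (the column values are illustrative):

```python
import pandas as pd

# Toy wide target table: one 0/1 column per identity (made-up values)
wide = pd.DataFrame({
    "id": [1, 2, 3],
    "wom": [1, 0, 0],
    "bla": [1, 1, 0],
})
entries = pd.DataFrame({"id": [1, 2, 3], "label": ["hate", "hate", "nothate"]})

# Wide -> long: one row per (entry, target) pair, keeping only flagged targets
long = wide.melt(id_vars="id", var_name="target", value_name="flag")
long = long[long["flag"] == 1].drop(columns="flag")

# A left merge keeps entries without any target (target is NaN for them)
merged = entries.merge(long, on="id", how="left")
print(merged)
```

The trade-off is that entries with several targets now appear several times, so counts per target are right but counts per entry need a `drop_duplicates` on `id`.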

In [None]:
m_data['label'].nunique()

## **Target Distribution in Dataset**

In [None]:
colors = ['grey' for i in range(len(m_data['targets'].value_counts().index))] 
colors[2] = '#dd5a5b'
colors[3] = '#97d83e'

In [None]:
plt.figure(figsize=(15,15))

sns.countplot(y=m_data['targets'],order = m_data['targets'].value_counts().index, palette = colors)
plt.tick_params(axis = 'y' , labelsize = 15)
plt.tick_params(axis = 'x' , labelsize = 15)
plt.ylabel('Targets')
plt.xlabel('')
plt.title("Target Distribution in Dataset" , fontsize = 20)
plt.xticks(rotation = 90)

We can see that the black community and women are the most targeted in the dataset we have.

## **Exploring The Top Two Values**

I am kinda scared and concerned about the type of results which might show up. 

### **Sentences Targeting Black People**

In [None]:
words_black = [ ]
labels = []
for _,row in m_data[m_data['targets'] == 'bla'].iterrows():
    a = row['Clean Text'].split()
    # 'label' still holds the strings 'hate' / 'nothate' at this point
    if(row['label'] == 'hate'):
        labels+=[0 for i in range(len(a))]
    else:
        labels+=[1 for i in range(len(a))]
    words_black+=a

words_black = pd.DataFrame({'Word':words_black , 'Label':labels})
words_black.head()

In [None]:
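plt.figure(figsize = (10,10))
# One row per word, so the bars count words from hate (0) and not hate (1) sentences
sns.countplot(x = words_black['Label'] , order = [0, 1] , palette = 'Paired')
plt.ylabel("")
plt.xticks([0, 1], ['Hate', 'Not Hate'])
plt.title('Words From Sentences Labeled Hate and Not Hate' , fontsize = 15)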

In [None]:
wc(words_black['Word'].unique(),'black','Unique Words Found in Sentences Targeting Black People')

Well, this word cloud is just ugly. But I guess it is merely a reflection of the population of people chosen for this study.

In [None]:
words_black[words_black['Label'] == 1]['Word'].value_counts()

In [None]:
plt.figure(figsize = (10,10))
sns.barplot(y = words_black['Word'].value_counts()[:20].index , x = words_black['Word'].value_counts()[:20].values , color = '#97d83e')
plt.title('Top 20 Words Appearing In Sentences Targeting The Black Community' , fontsize = 15)
plt.xticks(rotation = 90)

### **Sentences Targeting Women**

In [None]:
words_women = [ ]
labels = []
for _,row in m_data[m_data['targets'] == 'wom'].iterrows():
    a = row['Clean Text'].split()
    # 'label' still holds the strings 'hate' / 'nothate' at this point
    if(row['label'] == 'hate'):
        labels+=[0 for i in range(len(a))]
    else:
        labels+=[1 for i in range(len(a))]
    words_women+=a

words_women = pd.DataFrame({'Word':words_women , 'Label':labels})
words_women.head()

In [None]:
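plt.figure(figsize = (10,10))
# One row per word, so the bars count words from hate (0) and not hate (1) sentences
sns.countplot(x = words_women['Label'] , order = [0, 1] , palette = 'Paired')
plt.ylabel("")
plt.xticks([0, 1], ['Hate', 'Not Hate'])
plt.title('Words From Sentences Labeled Hate and Not Hate' , fontsize = 15)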

In [None]:
wc(words_women['Word'].unique() , 'black' , 'Unique Words Found in Sentences Targeting Women')

Here is a true mixture of words.

In [None]:
plt.figure(figsize = (10,10))
sns.barplot(y = words_women['Word'].value_counts()[:30].index , x = words_women['Word'].value_counts()[:30].values , color = '#97d83e')
plt.tick_params(axis = 'y', labelsize = 12)
plt.title('Common Words Found in Sentences Targeting Women' , fontsize = 15)
plt.xticks(rotation = 90)

# **Visualizing Embeddings**

This visualization might be inaccurate, considering I am using Count Vectorizer and TFIDF to make the embeddings rather than whatever was actually used in the model. Still, let's check it out!

In [None]:
label = {'hate':0 , 'nothate':1}
data['label'] = data['label'].map(label)
data.head()

In [None]:
from sklearn.feature_extraction.text import CountVectorizer,TfidfVectorizer
from sklearn.model_selection import train_test_split
from sklearn.decomposition import TruncatedSVD,PCA
def cv(data):
    count_vectorizer = CountVectorizer()

    emb = count_vectorizer.fit_transform(data)

    return emb, count_vectorizer

list_corpus = data["text"].tolist()
list_labels = data["label"].tolist()

X_train, X_test, y_train, y_test = train_test_split(list_corpus, list_labels, test_size=0.2)

X_train_counts, count_vectorizer = cv(X_train)
X_test_counts = count_vectorizer.transform(X_test)

In [None]:
import matplotlib
import matplotlib.patches as mpatches
def plot_LSA(test_data, test_labels, savepath="PCA_demo.csv", plot=True):
    # Project the sparse vectors onto two components with truncated SVD (LSA)
    lsa = TruncatedSVD(n_components=2)
    lsa_scores = lsa.fit_transform(test_data)
    # Labels are 0 (hate) and 1 (not hate), so the colour order must match
    colors = ['orange','blue']
    if plot:
        plt.scatter(lsa_scores[:,0], lsa_scores[:,1], s=8, alpha=.8, c=test_labels, cmap=matplotlib.colors.ListedColormap(colors))
        orange_patch = mpatches.Patch(color='orange', label='Hate')
        blue_patch = mpatches.Patch(color='blue', label='Not Hate')
        plt.legend(handles=[orange_patch, blue_patch], prop={'size': 30})

fig = plt.figure(figsize=(12, 12))          
plot_LSA(X_train_counts, y_train)
plt.show()

Here we cannot actually make out much of a difference. Let's use TFIDF vectorizer for the same and check.

In [None]:
def tfidf(data):
    tfidf_vectorizer = TfidfVectorizer()

    train = tfidf_vectorizer.fit_transform(data)

    return train, tfidf_vectorizer

X_train_tfidf, tfidf_vectorizer = tfidf(X_train)
X_test_tfidf = tfidf_vectorizer.transform(X_test)

In [None]:
fig = plt.figure(figsize=(12, 12))          
plot_LSA(X_train_tfidf, y_train)
plt.show()

Well, that makes quite the difference.
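
A likely reason for the difference is that TF-IDF down-weights words that appear in almost every sentence, so the projection is no longer dominated by them. The sketch below computes the weights by hand on a made-up corpus. It uses a smoothed idf, `ln((1 + n) / (1 + df)) + 1`, followed by L2 normalisation, which to my understanding mirrors the default behaviour documented for scikit-learn's `TfidfVectorizer`; treat that correspondence as an assumption rather than a guarantee.

```python
import math
from collections import Counter

# Toy tokenised corpus (made-up sentences)
docs = [
    ["people", "are", "kind"],
    ["people", "are", "awful"],
    ["people", "love", "cats"],
]
n_docs = len(docs)

# Document frequency: number of documents containing each term
doc_freq = Counter(term for doc in docs for term in set(doc))

def idf(term):
    # Smoothed inverse document frequency
    return math.log((1 + n_docs) / (1 + doc_freq[term])) + 1

def tfidf(doc):
    # Raw term count times idf, then L2-normalise the vector
    counts = Counter(doc)
    weights = {t: c * idf(t) for t, c in counts.items()}
    norm = math.sqrt(sum(w * w for w in weights.values()))
    return {t: w / norm for t, w in weights.items()}

weights = tfidf(docs[0])
print(weights)
```

In the first sentence, "people" (present in every document) ends up with the smallest weight and "kind" (present in only one) with the largest, whereas a plain count vector would weight all three equally.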

Thank you for reading! If you liked what I did, give me an upvote! Saw something which could have been better, or have a suggestion to improve it? Leave a comment and I'll get back to you ASAP.

# **References**

Below are some awesome notebooks where I discovered new ways to do EDA for NLP. Do check them out!

1. https://www.kaggle.com/gunesevitan/nlp-with-disaster-tweets-eda-cleaning-and-bert
2. https://www.kaggle.com/vbmokin/nlp-eda-bag-of-words-tf-idf-glove-bert
