### Final Project: Analyzing Hate Speech

For this project, we will suppose the author is a data scientist working on behalf of a Vietnamese government stakeholder interested in curtailing online harassment. Follow along as we walk through the process of defining and isolating hate speech.

### Business Background

The earliest use for what would eventually become the internet was the exchange of text and other messages. For nearly as long, a major problem in any virtual forum has been those who would rather harass and intimidate than communicate. As society and culture have been thrust more and more into these online public centers, the salience of this issue has only grown for stakeholders including the owners and operators of these public forums and the infrastructure behind them, from the individual user to the highest levels of government. We aim to use the tools of analysis at our disposal to further define efforts against this maladaptive social behavior.

### Data Understanding

Our data for this inquiry is generously provided by Vietnamese and international researchers who scraped the major social media platforms (everything from YouTube to TikTok to Facebook to Twitter) and meticulously curated approximately 30,000 comments, labeling each as clean, offensive, or hateful. A subset of that data is used here.

In [None]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
%matplotlib inline
from collections import Counter
import nltk
from nltk.tokenize import RegexpTokenizer
from nltk.corpus import stopwords, words, wordnet
import string
nltk.download('words')
nltk.download('stopwords', quiet=True)
from nltk import FreqDist
from nltk.stem.wordnet import WordNetLemmatizer
nltk.download('omw-1.4')
nltk.download('tagsets')
from nltk import pos_tag
from scipy import stats

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.naive_bayes import MultinomialNB
from sklearn.tree import DecisionTreeClassifier, plot_tree
# plot_confusion_matrix was removed in scikit-learn 1.2; ConfusionMatrixDisplay replaces it
from sklearn.metrics import ConfusionMatrixDisplay, precision_score, classification_report
from sklearn import svm

from sklearn.linear_model import LinearRegression, LogisticRegression
from sklearn.metrics import log_loss

# Ignore warnings
import warnings
warnings.filterwarnings('ignore')

### Data Exploration

In [None]:
df = pd.read_csv('vihsd/data/vihsd/dev.csv')
df_train = pd.read_csv('vihsd/data/vihsd/train.csv')
df_test = pd.read_csv('vihsd/data/vihsd/test.csv')

In [None]:
! pwd


In [None]:
! ls


In [None]:
! ls vihsd


In [None]:
! ls vihsd/data/vihsd


The datasets are provided by the following source:
@InProceedings{10.1007/978-3-030-79457-6_35,
author="Luu, Son T.
and Nguyen, Kiet Van
and Nguyen, Ngan Luu-Thuy",
editor="Fujita, Hamido
and Selamat, Ali
and Lin, Jerry Chun-Wei
and Ali, Moonis",
title="A Large-Scale Dataset for Hate Speech Detection on Vietnamese Social Media Texts",
booktitle="Advances and Trends in Artificial Intelligence. Artificial Intelligence Practices",
year="2021",
publisher="Springer International Publishing",
address="Cham",
pages="415--426",
abstract="In recent years, Vietnam witnesses the mass development of social network users on different social platforms such as Facebook, Youtube, Instagram, and Tiktok. On social media, hate speech has become a critical problem for social network users. To solve this problem, we introduce the ViHSD - a human-annotated dataset for automatically detecting hate speech on the social network. This dataset contains over 30,000 comments, each comment in the dataset has one of three labels: CLEAN, OFFENSIVE, or HATE. Besides, we introduce the data creation process for annotating and evaluating the quality of the dataset. Finally, we evaluate the dataset by deep learning and transformer models.",
isbn="978-3-030-79457-6"
}

In [None]:
df.head()

In [None]:
df.info()

In [None]:
df.shape

In [None]:
df['label_id'].value_counts()

In [None]:
df['label_id'].value_counts(normalize=True)

In [None]:
plt.bar(df['label_id'].value_counts().index, df['label_id'].value_counts().values)
plt.xticks(rotation=45)

This data is heavily imbalanced toward 'clean' (inoffensive) speech but relatively balanced between the offensive and hate categories.
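To put a number on this kind of imbalance, a minimal sketch like the following may help. The label counts are made up for illustration (they are not the ViHSD counts), the label meanings are assumed to be 0 = clean, 1 = offensive, 2 = hate, and `class_shares` is a hypothetical helper.

```python
from collections import Counter

def class_shares(labels):
    """Return each label's share of the total, keyed by label."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: count / total for label, count in counts.items()}

# Made-up labels for illustration only: 0 = clean, 1 = offensive, 2 = hate
toy_labels = [0] * 80 + [1] * 9 + [2] * 11
shares = class_shares(toy_labels)
print(shares)
```

The shares sum to one, so a single dominant class is easy to spot before any modeling begins.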

### Cleaning Data

First we remove any blank comments.

In [None]:
# dropna returns a new frame, so assign the result back
df = df.dropna()
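`DataFrame.dropna` returns a new frame rather than modifying the original, so the result has to be assigned to take effect. A small illustration with made-up rows (the column names mirror the ones used in this notebook):

```python
import pandas as pd

# Toy frame with one missing comment (illustrative data only)
toy = pd.DataFrame({'comment': ['xin chao', None, 'ok'],
                    'label_id': [0, 1, 2]})

dropped = toy.dropna()          # returns a new frame
print(len(toy), len(dropped))   # the original keeps all of its rows
```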

Now we're going to remove the inoffensive comments to focus solely on the offensive and hate categories.

In [None]:
# Copy the filtered rows so later column assignments do not act on a view
df = df[df['label_id'] != 0].copy()

In [None]:
df.shape

In [None]:
df.tail()

Here we rename the columns.

In [None]:
# set_axis no longer accepts inplace=True in pandas 2.0, so assign the result
df = df.set_axis(['comment', 'hateful'], axis=1)

In [None]:
df.head()

Now we subtract one from the label column so that offensive (1) becomes 0 and hate (2) becomes 1, ready to convert to boolean.

In [None]:
df['hateful'] = df['hateful'] - 1

In [None]:
df.head()

### Final dataframe

In [None]:
df['hateful'] = df['hateful'].astype(bool)
df.head()
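The subtract-then-cast steps above could also be written as one explicit mapping, which makes the assumed label meaning visible. This sketch assumes label 1 means offensive and label 2 means hate, and uses made-up rows.

```python
import pandas as pd

toy = pd.DataFrame({'comment': ['a', 'b', 'c'], 'hateful': [1, 2, 1]})

# Explicit mapping, assuming 1 = offensive and 2 = hate
toy['hateful'] = toy['hateful'].map({1: False, 2: True})
print(toy['hateful'].tolist())
```

Any label outside the mapping would become NaN, which surfaces unexpected values instead of silently turning them into `True`.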

### Preprocessing

Here we further prepare the data for modeling by finding the most common words across the offensive and hateful comments, so that we can use them to predict whether a comment is hateful.

In [None]:
words = Counter()
# Tokens are runs of two or more word characters
tokenizer = RegexpTokenizer(r"(?u)\b\w\w+\b")

for comment in df['comment']:
    tokenized = tokenizer.tokenize(comment)
    for token in tokenized:
        words[token] += 1

In [None]:
test_words = words.most_common()[10:20]
test_words_list = [i[0] for i in test_words]
test_words_list
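As a rough illustration of the counting and slicing above, the sketch below uses the standard-library `re` module in place of NLTK's `RegexpTokenizer` (with the same pattern) on made-up comments. `count_tokens` is a hypothetical helper, and the slice boundaries are shortened to fit the toy data.

```python
import re
from collections import Counter

# Same pattern as above: runs of two or more word characters
TOKEN_PATTERN = re.compile(r"(?u)\b\w\w+\b")

def count_tokens(comments):
    """Count every token across an iterable of comment strings."""
    counts = Counter()
    for comment in comments:
        counts.update(TOKEN_PATTERN.findall(comment))
    return counts

toy_comments = ["cho toi di", "cho no di di", "a b cho"]
counts = count_tokens(toy_comments)
ranked = [word for word, _ in counts.most_common()]
print(ranked[:2])   # the most frequent tokens
print(ranked[2:])   # everything ranked after them
```

Single-character tokens such as `a` and `b` never reach the counter because the pattern requires at least two characters.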

Now we add a column with the count of instances of these words in a given comment

In [None]:
# Count the tokens in each comment that appear in test_words_list
word_tokenizer = RegexpTokenizer(r"(?u)\b\w\w+\b")
test_words_set = set(test_words_list)
df['contained_words'] = df['comment'].apply(
    lambda comment: sum(token in test_words_set
                        for token in word_tokenizer.tokenize(comment))
)

In [None]:
df['contained_words'].sum()

In [None]:
df.head()
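One detail worth checking when counting listed words is whether matches are whole tokens or substrings. The sketch below counts whole tokens only, using made-up comments and a made-up two-word list; `count_listed_words` is a hypothetical helper, not part of the notebook.

```python
import re

TOKEN_PATTERN = re.compile(r"(?u)\b\w\w+\b")

def count_listed_words(comment, word_list):
    """Count tokens in `comment` that appear in `word_list`."""
    listed = set(word_list)
    return sum(1 for token in TOKEN_PATTERN.findall(comment) if token in listed)

toy_list = ['ta', 'cho']
hits_first = count_listed_words("ta cho ta xem", toy_list)
hits_second = count_listed_words("tao khong biet", toy_list)
print(hits_first, hits_second)
```

The second comment contains `tao`, which includes the substring `ta` but is a different token, so it does not count as a hit.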

Because no automated translation was available, only the 10 most common words could be translated; these were eliminated because most were consonants. The list instead uses the next 10 most common words.

### Train Test Split

Although the original researchers generously provided train and test sets, we generate our own split here to avoid data leakage.

In [None]:
X = pd.DataFrame(df['comment'])
y = pd.DataFrame(df['hateful'])

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=1)

### First Simple Model

Linear regression

In [None]:
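# Linear regression: predict the hateful flag from the common-word count
linreg = LinearRegression()
feature_cols = ['contained_words']
X = df[feature_cols]
y = df['hateful'].astype(int)
# Re-split on the numeric feature so the scores below use matching columns
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=1)
linreg.fit(X_train, y_train)
# Store predictions in a new column instead of overwriting the label
df['hateful_pred'] = linreg.predict(X)

In [None]:
fig, ax = plt.subplots()
ax.scatter(df.contained_words, df.hateful.astype(int))
ax.plot(df.contained_words, df.hateful_pred, color='red')
ax.set_xlabel('contained words')
ax.set_ylabel('hateful');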

In [None]:
linreg_test_score = linreg.score(X_test, y_test)
linreg_train_score = linreg.score(X_train, y_train)
# Show both scores; a cell only displays its last expression
linreg_test_score, linreg_train_score

We can conclude from this that the common words are about equally likely to appear in either the offensive or hateful category.
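For a scikit-learn regressor, `score` reports R², the share of variance in the target that the fitted line explains. As a rough illustration of why a value near zero suggests the feature carries little signal, the sketch below computes R² by hand with NumPy on made-up data in which the word count is unrelated to the label by construction. `r_squared` is a hypothetical helper.

```python
import numpy as np

def r_squared(x, y):
    """R^2 of a one-feature least-squares line fitted to (x, y)."""
    slope, intercept = np.polyfit(x, y, deg=1)
    residuals = y - (slope * x + intercept)
    ss_res = np.sum(residuals ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

# Made-up data: each word count occurs equally often in both classes
x = np.array([0, 1, 2, 3, 0, 1, 2, 3], dtype=float)
y = np.array([0, 0, 0, 0, 1, 1, 1, 1], dtype=float)

no_signal = r_squared(x, y)          # feature unrelated to the label
full_signal = r_squared(x, 0.3 * x)  # target is an exact linear function of x
print(no_signal, full_signal)
```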

### Baseline Model

Here we use logistic regression to refine the initial linear regression by treating the target as a binary outcome.

In [None]:
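# Logistic regression: classify hateful (1) vs. offensive (0) from the word count
logreg = LogisticRegression(random_state=42)
feature_cols = ['contained_words']
X = df[feature_cols]
y = df['hateful'].astype(int)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=1)
logreg.fit(X_train, y_train)
# Keep predictions in their own column so the feature is not overwritten
df['hateful_pred_logreg'] = logreg.predict(X)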

In [None]:
log_loss(y, logreg.predict_proba(X))

In [None]:
# Mean accuracy of the logistic regression on each split
lonreg_test_score = logreg.score(X_test, y_test)
lonreg_train_score = logreg.score(X_train, y_train)
lonreg_test_score, lonreg_train_score

As above, we can conclude that the difference between hateful and non-hateful offensive content is small enough to be negligible by this measure.
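For reference, binary log loss averages `-(y*log(p) + (1-y)*log(1-p))` over the rows. A model that learns nothing beyond a balanced class split predicts 0.5 for every row, which gives ln 2 (about 0.693); that is a useful yardstick when reading a log loss value. A minimal sketch on made-up labels, with `binary_log_loss` as a hypothetical helper:

```python
import math

def binary_log_loss(y_true, p_pred):
    """Mean negative log-likelihood of binary labels under predicted probabilities."""
    total = 0.0
    for y, p in zip(y_true, p_pred):
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)

toy_y = [0, 1, 0, 1]
uninformative = [0.5] * 4           # always predicts the base rate
confident = [0.1, 0.9, 0.2, 0.8]    # leans the right way each time

loss_uninformative = binary_log_loss(toy_y, uninformative)
loss_confident = binary_log_loss(toy_y, confident)
print(loss_uninformative, loss_confident)
```

A loss close to the uninformative value would be consistent with the conclusion that the word count barely separates the two classes.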

### Second Model

For our final model, we use a Naive Bayes approach. We first create separate lists for the offensive and hateful categories, then set up the baseline (prior) probabilities.

In [None]:
# Offensive rows have hateful == False; hateful rows have hateful == True
offense = [element for element in df['hateful'] if not element]
hateful = [element for element in df['hateful'] if element]
len(offense), len(hateful)

In [None]:
p_offense = len(offense) / (len(offense) + len(hateful))
p_hateful = len(hateful) / (len(hateful) + len(offense))
p_offense, p_hateful
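As a quick sanity check on priors of this kind, the two values should sum to one and differ whenever the classes are not the same size. A minimal sketch on made-up boolean labels, with `class_priors` as a hypothetical helper:

```python
def class_priors(flags):
    """Return (P(False), P(True)) for a list of boolean labels."""
    n_true = sum(1 for flag in flags if flag)
    n_false = len(flags) - n_true
    total = len(flags)
    return n_false / total, n_true / total

toy_flags = [True, False, False, True, False]  # made-up labels
p_offense_toy, p_hateful_toy = class_priors(toy_flags)
print(p_offense_toy, p_hateful_toy)
```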

Now we'll graph the distributions of the commonly contained words.

In [None]:
plt.style.use('fivethirtyeight')

fig, ax = plt.subplots()

sns.kdeplot(data=df[df['hateful'] == True]['contained_words'],
            ax=ax, label='hateful')
sns.kdeplot(data=df[df['hateful'] == False]['contained_words'],
            ax=ax, label='offensive')

plt.legend();

We'll do another train test split separate from the previous one, then we add our priors.

In [None]:
train, test = train_test_split(df, random_state=42)

In [None]:
train['contained_words'].value_counts()

Moving forward, we calculate likelihoods using the means and standard deviations. For our purposes, we treat the last row of the test set as a hypothetical new entry into the dataset.

In [None]:
test_pt = test.tail(1)
# .values is an array, so index it with brackets
new_entry = test_pt['contained_words'].values[0]

true_stats = train[train['hateful'] == True].describe().loc[['mean', 'std'], :]

# Normal density of the new entry under the hateful-class distribution
true_likelihood = stats.norm(loc=true_stats.loc['mean', 'contained_words'],
                             scale=true_stats.loc['std', 'contained_words']).pdf(new_entry)
true_likelihood

Because the label is a boolean indicating whether the comment is hateful, we define the likelihood only for the true value here; a full posterior would also need the likelihood for the false value.
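To turn a likelihood into a class probability, Bayes' rule combines it with the prior and normalizes over both classes. The sketch below shows that step for a single Gaussian feature. All numbers are made up, and `normal_pdf` and `posterior_true` are hypothetical helpers rather than part of the notebook.

```python
import math

def normal_pdf(x, mean, std):
    """Density of a normal distribution at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2 * math.pi))

def posterior_true(x, prior_true, stats_true, stats_false):
    """P(hateful | x) from Bayes' rule with Gaussian likelihoods.

    stats_true and stats_false are (mean, std) pairs for each class.
    """
    joint_true = prior_true * normal_pdf(x, *stats_true)
    joint_false = (1 - prior_true) * normal_pdf(x, *stats_false)
    return joint_true / (joint_true + joint_false)

# Identical class distributions leave the prior unchanged
same = posterior_true(2.0, 0.5, (1.0, 1.0), (1.0, 1.0))
# A higher mean for the hateful class pulls the posterior above the prior
shifted = posterior_true(2.0, 0.5, (2.0, 1.0), (1.0, 1.0))
print(same, shifted)
```

When the two classes share nearly the same mean and standard deviation for the feature, the posterior stays close to the prior, which matches the pattern reported by the earlier models.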

### Model Comparison

Overall, our models returned extremely similar results, showing a negligible difference between offensive speech and hate speech by the measures used here. As of this writing, the error in reading in Vietnamese characters detailed above makes exact evaluation impossible until it is resolved.

In [None]:
# Print each result; a cell only displays its last bare expression
print(linreg_test_score, linreg_train_score)
print(lonreg_test_score, lonreg_train_score)
print(true_likelihood)