# Automatic Detection of Hate Speech and Offensive Content

### References

1. https://womensmediacenter.com/speech-project/research-statistics

2. https://www.pewresearch.org/internet/2021/01/13/the-state-of-online-harassment/

3. https://www.statista.com/statistics/264810/number-of-monthly-active-facebook-users-worldwide/

4. https://www.forbes.com/sites/johnkoetsier/2020/06/09/300000-facebook-content-moderation-mistakes-daily-report-says/?sh=2e15774d54d0

5. https://www.forbes.com/sites/traversmark/2020/03/21/facebook-spreads-fake-news-faster-than-any-other-social-website-according-to-new-research/

## Introduction 

In this assignment I walk through the steps of building a classifier that detects hateful content, and then give a practical demonstration of how the model can be used on real-world data.

One reason I chose this topic is that I feel not enough is being done to curb the rampant online harassment that people face on social media. Hate speech, racial slurs, offensive content, threats of violence, and unwanted sexual comments violate the terms of service of all major social media websites, and yet for some people they are as common as cat GIFs.

People are harassed online because of their gender, race, sexual orientation, or political views, among other reasons. None of these justifies hostility towards anyone, but many people do not realize this. The majority of online harassment occurs on social media websites [$^{citation}$](), so the responsibility falls on these large tech companies to look after their users' safety and wellbeing. Unfortunately, these companies are not doing enough.

Let's look at how Facebook moderates its content. Most of Facebook's content moderators do not work directly for Facebook but are outsourced, and there are not that many of them. [$^{citation}$]() Facebook uses a hybrid system to moderate its content. Content is fed into an A.I. model and flagged for human review if the model detects a violation of the platform's terms of service. (Users can also flag content for review.) This model does prevent a lot of harmful content from being seen, but the number of users experiencing harassment on the platform suggests it could be implemented better. The same model is also used to filter spam, fake news, and adult content, yet this type of content is still very common on Facebook. [$^{citation}$]() Given the low number of human moderators, only 15,000 for over 2.9 billion monthly active users, [$^{citation}$]() [$^{citation}$]() there is a huge burden on Facebook's classifier to correctly flag objectionable content.

This type of classifier can be used by any website that hosts a significant amount of user-generated content, so it matters to any social media website. The techniques shown here can also be used to build a classifier that detects spam, fake news, or explicit sexual content, provided you have an appropriate dataset.

In [52]:
import neattext as nt
import pandas as pd
from nltk.tokenize import TweetTokenizer
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer

JIGSAW_DATA_PATH = './data/jigsaw-toxic-comment-classification-challenge/train.csv'
DYNABENCH_DATA_PATH = ''

In [5]:
jigsaw_data = pd.read_csv(JIGSAW_DATA_PATH)

In [6]:
jigsaw_data.head(10)

| | id | comment_text | toxic | severe_toxic | obscene | threat | insult | identity_hate |
|---|---|---|---|---|---|---|---|---|
| 0 | 0000997932d777bf | Explanation\nWhy the edits made under my usern... | 0 | 0 | 0 | 0 | 0 | 0 |
| 1 | 000103f0d9cfb60f | D'aww! He matches this background colour I'm s... | 0 | 0 | 0 | 0 | 0 | 0 |
| 2 | 000113f07ec002fd | Hey man, I'm really not trying to edit war. It... | 0 | 0 | 0 | 0 | 0 | 0 |
| 3 | 0001b41b1c6bb37e | "\nMore\nI can't make any real suggestions on ... | 0 | 0 | 0 | 0 | 0 | 0 |
| 4 | 0001d958c54c6e35 | You, sir, are my hero. Any chance you remember... | 0 | 0 | 0 | 0 | 0 | 0 |
| 5 | 00025465d4725e87 | "\n\nCongratulations from me as well, use the ... | 0 | 0 | 0 | 0 | 0 | 0 |
| 6 | 0002bcb3da6cb337 | COCKSUCKER BEFORE YOU PISS AROUND ON MY WORK | 1 | 1 | 1 | 0 | 1 | 0 |
| 7 | 00031b1e95af7921 | Your vandalism to the Matt Shirvington article... | 0 | 0 | 0 | 0 | 0 | 0 |
| 8 | 00037261f536c51d | Sorry if the word 'nonsense' was offensive to ... | 0 | 0 | 0 | 0 | 0 | 0 |
| 9 | 00040093b2687caa | alignment on this subject and which are contra... | 0 | 0 | 0 | 0 | 0 | 0 |


In [89]:
jigsaw_data.shape

(159571, 8)

In [7]:
sample_text = jigsaw_data['comment_text'].iloc[10]

print(sample_text[:500] + "...")

"
Fair use rationale for Image:Wonju.jpg

Thanks for uploading Image:Wonju.jpg. I notice the image page specifies that the image is being used under fair use but there is no explanation or rationale as to why its use in Wikipedia articles constitutes fair use. In addition to the boilerplate fair use template, you must also write out on the image description page a specific explanation or rationale for why using this image in each article is consistent with fair use.

Please go to the image descr...


In [23]:
text_frame = nt.TextFrame(text=sample_text)

In [31]:
nt.clean_text(sample_text,
              puncts=True,
              stopwords=True,
              urls=True,
              special_char=True,
              multiple_whitespaces=True)

'fair use rationale imagewonjujpg thanks uploading imagewonjujpg notice image page specifies image fair use explanation rationale use wikipedia articles constitutes fair use addition boilerplate fair use template write image description page specific explanation rationale image article consistent fair use image description page edit include fair use rationale uploaded fair use media consider checking specified fair use rationale pages find list image pages edited clicking contributions link it located wikipedia page logged in selecting image dropdown box note fair use images uploaded 4 2006 lacking explanation deleted week uploaded described criteria speedy deletion questions ask media copyright questions page thank talk  contribs   unspecified source imagewonjujpg thanks uploading imagewonjujpg noticed files description page currently doesnt specify created content copyright status unclear create file need specify owner copyright obtained website link website taken restatement website
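
To make the cleaning step above less of a black box, here is a minimal standard-library sketch of the same kind of normalization: lowercasing, URL removal, punctuation stripping, stopword filtering, and whitespace collapsing. It is an approximation for illustration, not neattext's implementation. The name `simple_clean`, the URL pattern, and the tiny `STOPWORDS` set are all assumptions of mine.

```python
import re
import string

# Hypothetical, deliberately tiny stopword list; real libraries ship far larger ones.
STOPWORDS = {"the", "is", "a", "an", "for", "to", "and", "of", "that", "i", "but"}

# Assumed URL pattern: anything starting with http(s):// or www.
URL_PATTERN = re.compile(r"https?://\S+|www\.\S+")
PUNCT_TABLE = str.maketrans("", "", string.punctuation)


def simple_clean(text):
    """Lowercase, strip URLs and punctuation, drop stopwords, collapse whitespace."""
    text = text.lower()
    text = URL_PATTERN.sub(" ", text)
    text = text.translate(PUNCT_TABLE)
    # split() with no arguments also collapses runs of whitespace.
    tokens = [tok for tok in text.split() if tok not in STOPWORDS]
    return " ".join(tokens)


sample = "Thanks for uploading Image:Wonju.jpg. See https://example.com for   details!"
print(simple_clean(sample))
# thanks uploading imagewonjujpg see details
```

Deleting punctuation outright fuses `Image:Wonju.jpg` into the single token `imagewonjujpg`, which is the same effect visible in the cleaned output above. Replacing punctuation with a space instead would keep the parts as separate tokens.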

In [108]:
text_column = jigsaw_data['comment_text']

In [118]:
from functools import partial


def preprocess(text_column,
               stopwords=True,
               n_grams=2,
               tfidf=False,
               twitter=False,
               min_df=10,
               max_df=0.60):

    clean_func = partial(nt.clean_text,
                         stopwords=stopwords,
                         puncts=True,
                         urls=True,
                         special_char=True,
                         multiple_whitespaces=True)

    normalized = text_column.map(clean_func)

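    vectorizer_options = {
        "ngram_range": (1, n_grams),
        "min_df": min_df,
        "max_df": max_df,
        # The vectorizer expects a callable mapping a string to a list of tokens,
        # so pass the bound tokenize method rather than the TweetTokenizer class.
        "tokenizer": TweetTokenizer().tokenize if twitter else None
    }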

    if tfidf:
        vectorizer = TfidfVectorizer(**vectorizer_options)
    else:
        vectorizer = CountVectorizer(**vectorizer_options)

    vectors = vectorizer.fit_transform(normalized)
    feature_names = vectorizer.get_feature_names_out()

    return vectors, feature_names
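
The `ngram_range`, `min_df`, and `max_df` options passed to the vectorizer decide which terms end up as feature columns. The plain-Python sketch below illustrates what those three options mean on a toy corpus. It is not scikit-learn's implementation, and the helper names `ngrams` and `build_vocabulary` are hypothetical. As in `preprocess`, `min_df` is treated as an absolute document count and `max_df` as a fraction of all documents.

```python
from collections import Counter


def ngrams(tokens, max_n):
    """Return all contiguous n-grams of length 1..max_n as space-joined strings."""
    return [
        " ".join(tokens[i:i + n])
        for n in range(1, max_n + 1)
        for i in range(len(tokens) - n + 1)
    ]


def build_vocabulary(docs, max_n=2, min_df=1, max_df=1.0):
    """Keep terms that appear in at least min_df documents and in at most
    a max_df fraction of all documents."""
    doc_freq = Counter()
    for doc in docs:
        # Use a set: document frequency counts each term once per document.
        doc_freq.update(set(ngrams(doc.split(), max_n)))
    max_count = max_df * len(docs)
    return sorted(t for t, df in doc_freq.items() if min_df <= df <= max_count)


docs = ["you are great", "you are awful", "great work", "you rock"]
print(build_vocabulary(docs, max_n=2, min_df=2, max_df=0.60))
# ['are', 'great', 'you are']
```

Here `you` is dropped because it occurs in 3 of 4 documents (75%, above the 60% cap), while one-off terms such as `awful` fall below `min_df=2`. The bigram `you are` survives, which is how word pairs become features alongside single words.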

In [119]:
values, feature_names = preprocess(text_column)

In [120]:
values.shape

(159571, 53856)

In [121]:
from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(values,
                                                    jigsaw_data['toxic'])
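
Two caveats about the split above. The vectorizer was fitted on the whole corpus before splitting, so its vocabulary and document-frequency filtering have already seen the test rows. The split is also unseeded, so the score will vary from run to run. One possible alternative, sketched on toy data, is to split the raw text first and let a pipeline fit the vectorizer on the training portion only. The toy `texts` and `labels`, the `test_size`, and `random_state=0` are my assumptions, not values from this notebook.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Toy stand-ins for jigsaw_data['comment_text'] and jigsaw_data['toxic'].
texts = [
    "thanks for the helpful edit",
    "great article, well sourced",
    "please add a citation here",
    "I agree with this change",
    "you are an idiot",
    "shut up you fool",
    "this is stupid garbage",
    "nobody likes you loser",
]
labels = [0, 0, 0, 0, 1, 1, 1, 1]

# Split the raw text; stratify keeps the class ratio equal in both parts,
# and random_state makes the split reproducible.
X_train, X_test, y_train, y_test = train_test_split(
    texts, labels, test_size=0.25, stratify=labels, random_state=0)

# The pipeline fits the vectorizer on the training text only, then the classifier.
model = make_pipeline(CountVectorizer(ngram_range=(1, 2)), MultinomialNB())
model.fit(X_train, y_train)

preds = model.predict(X_test)
print(len(X_train), len(X_test), sorted(y_test))
```

Stratifying matters more on the real data, where toxic comments are the minority class and an unlucky random split can shift the class balance between train and test.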

In [122]:
from sklearn.naive_bayes import MultinomialNB

nb = MultinomialNB()

nb.fit(X_train, y_train)

MultinomialNB()
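
For intuition about what the fitted model computes, here is a small from-scratch sketch of multinomial Naive Bayes with additive (Laplace) smoothing. `MultinomialNB` uses `alpha=1.0` by default, which the sketch mirrors. This is a teaching sketch on tokenized toy documents, not scikit-learn's code, and `train_nb` and `predict_nb` are hypothetical names.

```python
import math
from collections import Counter, defaultdict


def train_nb(docs, labels, alpha=1.0):
    """Estimate log priors and smoothed per-class token log likelihoods."""
    vocab = sorted({tok for doc in docs for tok in doc})
    class_docs = Counter(labels)
    token_counts = defaultdict(Counter)
    for doc, label in zip(docs, labels):
        token_counts[label].update(doc)

    log_prior = {c: math.log(n / len(docs)) for c, n in class_docs.items()}
    log_likelihood = {}
    for c in class_docs:
        # Smoothing adds alpha to every token count, so no probability is zero.
        total = sum(token_counts[c].values()) + alpha * len(vocab)
        log_likelihood[c] = {
            tok: math.log((token_counts[c][tok] + alpha) / total) for tok in vocab
        }
    return log_prior, log_likelihood


def predict_nb(doc, log_prior, log_likelihood):
    """Return the class with the highest log score; unseen tokens are skipped."""
    scores = {
        c: log_prior[c]
        + sum(log_likelihood[c][t] for t in doc if t in log_likelihood[c])
        for c in log_prior
    }
    return max(scores, key=scores.get)


docs = [["you", "idiot"], ["stupid", "idiot"],
        ["thanks", "you"], ["great", "edit", "thanks"]]
labels = [1, 1, 0, 0]
log_prior, log_likelihood = train_nb(docs, labels)

print(predict_nb(["idiot", "idiot"], log_prior, log_likelihood))   # 1
print(predict_nb(["thanks", "great"], log_prior, log_likelihood))  # 0
```

Working in log space turns the product of per-token probabilities into a sum, which avoids numerical underflow on long comments.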

In [123]:
from sklearn.metrics import f1_score

preds = nb.predict(X_test)

f1_score(y_test, preds)

0.7003463513674907
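
By default, `f1_score` reports the F1 of the positive class (label 1, here `toxic`), which is the harmonic mean of precision and recall. The worked example below recomputes it by hand on made-up labels to show why it is more informative than accuracy when one class is rare; in the ten rows displayed earlier, only one comment is labeled toxic. The label vectors and the helper `precision_recall_f1` are illustrative assumptions.

```python
def precision_recall_f1(y_true, y_pred, positive=1):
    """Compute precision, recall, and F1 for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Made-up example: 3 toxic comments out of 10.
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
precision, recall, f1 = precision_recall_f1(y_true, y_pred)
print(f"accuracy={accuracy:.2f} precision={precision:.2f} "
      f"recall={recall:.2f} f1={f1:.2f}")
# accuracy=0.80 precision=0.67 recall=0.67 f1=0.67
```

Accuracy looks comfortable at 0.80 because the many non-toxic rows are easy to get right, while F1 exposes that one in three toxic comments was missed and one in three flags was a false alarm.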