# Hate Speech Detection With Python
https://copyassignment.com/hate-speech-detection/

根据这个教程的方法，使用另一个更大的数据集来训练数据，看看能不能有更好的效果

数据集：https://huggingface.co/datasets/ucberkeley-dlab/measuring-hate-speech

数据来源：https://hatespeech.berkeley.edu/

This is a public release of the dataset described in Kennedy et al. (2020) and Sachdeva et al. (2022), consisting of 39,565 comments annotated by 7,912 annotators, for 135,556 combined rows. The primary outcome variable is the **"hate speech score" but the 10 constituent ordinal labels (sentiment, (dis)respect, insult, humiliation, inferior status, violence, dehumanization, genocide, attack/defense, hate speech benchmark)** can also be treated as outcomes. Includes 8 target identity groups (race/ethnicity, religion, national origin/citizenship, gender, sexual orientation, age, disability, political ideology) and 42 target identity subgroups, as well as 6 annotator demographics and 40 subgroups. The hate speech score incorporates an IRT adjustment by estimating variation in annotator interpretation of the labeling guidelines.

In [2]:
import pandas as pd
import numpy as np
from sklearn. feature_extraction. text import CountVectorizer
from sklearn. feature_extraction. text import TfidfVectorizer
from sklearn. model_selection import train_test_split
from sklearn. tree import DecisionTreeClassifier

In [3]:
import nltk
import re
#nltk. download('stopwords')
from nltk. corpus import stopwords
stopword=set(stopwords.words('english'))
stemmer = nltk. SnowballStemmer("english")

In [4]:
data = pd. read_csv('measuring_hate_speech.csv')
#To preview the data
data. head()

Unnamed: 0,comment_id,annotator_id,platform,sentiment,respect,insult,humiliate,status,dehumanize,violence,...,annotator_religion_hindu,annotator_religion_jewish,annotator_religion_mormon,annotator_religion_muslim,annotator_religion_nothing,annotator_religion_other,annotator_sexuality_bisexual,annotator_sexuality_gay,annotator_sexuality_straight,annotator_sexuality_other
0,47777,10873,3,0.0,0.0,0.0,0.0,2.0,0.0,0.0,...,False,False,False,False,False,False,False,False,True,False
1,39773,2790,2,0.0,0.0,0.0,0.0,2.0,0.0,0.0,...,False,False,False,False,False,False,False,False,True,False
2,47101,3379,3,4.0,4.0,4.0,4.0,4.0,4.0,0.0,...,False,False,False,False,True,False,False,False,True,False
3,43625,7365,3,2.0,3.0,2.0,1.0,2.0,0.0,0.0,...,False,False,False,False,False,False,False,False,True,False
4,12538,488,0,4.0,4.0,4.0,4.0,4.0,4.0,4.0,...,False,False,False,False,False,False,False,False,True,False


## Key dataset columns
hate_speech_score - continuous hate speech measure, where higher = more hateful and lower = less hateful. > 0.5 is approximately hate speech, < -1 is counter or supportive speech, and -1 to +0.5 is neutral or ambiguous.
text - lightly processed text of a social media post
- comment_id - unique ID for each comment
- annotator_id - unique ID for each annotator
- sentiment - ordinal label that is combined into the continuous score
- respect - ordinal label that is combined into the continuous score
- insult - ordinal label that is combined into the continuous score
- humiliate - ordinal label that is combined into the continuous score
- status - ordinal label that is combined into the continuous score
- dehumanize - ordinal label that is combined into the continuous score
- violence - ordinal label that is combined into the continuous score
- genocide - ordinal label that is combined into the continuous score
- attack_defend - ordinal label that is combined into the continuous score
- hatespeech - ordinal label that is combined into the continuous score
- annotator_severity - annotator's estimated survey interpretation bias

In [5]:
data.describe()

Unnamed: 0,comment_id,annotator_id,platform,sentiment,respect,insult,humiliate,status,dehumanize,violence,...,hatespeech,hate_speech_score,infitms,outfitms,annotator_severity,std_err,annotator_infitms,annotator_outfitms,hypothesis,annotator_age
count,135556.0,135556.0,135556.0,135556.0,135556.0,135556.0,135556.0,135556.0,135556.0,135556.0,...,135556.0,135556.0,135556.0,135556.0,135556.0,135556.0,135556.0,135556.0,135556.0,135451.0
mean,23530.416138,5567.097812,1.281352,2.954307,2.828875,2.56331,2.278638,2.698575,1.846211,1.052045,...,0.744733,-0.567428,1.034322,1.001052,-0.018817,0.300588,1.007158,1.011841,0.014589,37.910772
std,12387.194125,3230.508937,1.023542,1.231552,1.309548,1.38983,1.370876,0.8985,1.402372,1.345706,...,0.93226,2.380003,0.496867,0.791943,0.487261,0.23638,0.269876,0.675863,0.613006,11.641276
min,1.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,-8.34,0.1,0.07,-1.82,0.02,0.39,0.28,-1.578693,18.0
25%,18148.0,2719.0,0.0,2.0,2.0,2.0,1.0,2.0,1.0,0.0,...,0.0,-2.33,0.71,0.56,-0.38,0.03,0.81,0.67,-0.341008,29.0
50%,20052.0,5602.5,1.0,3.0,3.0,3.0,3.0,3.0,2.0,0.0,...,0.0,-0.34,0.96,0.83,-0.02,0.34,0.97,0.85,0.110405,35.0
75%,32038.25,8363.0,2.0,4.0,4.0,4.0,3.0,3.0,3.0,2.0,...,2.0,1.41,1.3,1.22,0.35,0.42,1.17,1.13,0.449555,45.0
max,50070.0,11142.0,3.0,4.0,4.0,4.0,4.0,4.0,4.0,4.0,...,2.0,6.3,5.9,9.0,1.36,1.9,2.01,9.0,0.987511,81.0


In [17]:
#filter the dataset
subdata = data[['text','hate_speech_score','hatespeech']].copy()

In [18]:
# > 0.5 is approximately hate speech, < -1 is counter or supportive speech, and -1 to +0.5 is neutral or ambiguous.
def score2hate(score):
    if score>0.5:
        return 1
    if score <-1:
        return -1
    else:
        return 0

In [19]:
subdata

Unnamed: 0,text,hate_speech_score,hatespeech
0,Yes indeed. She sort of reminds me of the elde...,-3.90,0.0
1,The trans women reading this tweet right now i...,-6.52,0.0
2,Question: These 4 broads who criticize America...,0.36,2.0
3,It is about time for all illegals to go back t...,0.26,0.0
4,For starters bend over the one in pink and kic...,1.54,2.0
...,...,...,...
135551,عاجل سماحة #السيد_عبدالملك_بدرالدين_الحوثي نص...,-4.88,0.0
135552,Millions of #Yemen-is participated in mass ral...,-4.40,0.0
135553,@AbeShinzo @realDonaldTrump @shinzoabe 独裁者は行きま...,-2.49,0.0
135554,Millions of #Yemen-is participated in mass ral...,-4.40,0.0


In [20]:
subdata['is_hate'] = subdata['hate_speech_score'].apply(score2hate)

In [21]:
subdata

Unnamed: 0,text,hate_speech_score,hatespeech,is_hate
0,Yes indeed. She sort of reminds me of the elde...,-3.90,0.0,-1
1,The trans women reading this tweet right now i...,-6.52,0.0,-1
2,Question: These 4 broads who criticize America...,0.36,2.0,0
3,It is about time for all illegals to go back t...,0.26,0.0,0
4,For starters bend over the one in pink and kic...,1.54,2.0,1
...,...,...,...,...
135551,عاجل سماحة #السيد_عبدالملك_بدرالدين_الحوثي نص...,-4.88,0.0,-1
135552,Millions of #Yemen-is participated in mass ral...,-4.40,0.0,-1
135553,@AbeShinzo @realDonaldTrump @shinzoabe 独裁者は行きま...,-2.49,0.0,-1
135554,Millions of #Yemen-is participated in mass ral...,-4.40,0.0,-1


In [22]:
subdata[subdata['is_hate']==1].text.iloc[1]

'Fuck off you insufferable retarded faggot.'

## Preprocessing the data
In Data preprocessing, we prepare the raw data and make it suitable for a machine learning model. It is the first and crucial step while creating a machine learning model. When creating a machine learning project, it is not always a case that we come across clean and formatted data. And while doing any operation with data, it is mandatory to clean it and put it in a formatted way. So for this, we use the data preprocessing task.

In [23]:
subdata["label"] = subdata["is_hate"]. map({0: "neutral", 1: "hate speech", -1: "supportive"})
subdata = subdata[["text", "label"]]
subdata. head()

Unnamed: 0,text,label
0,Yes indeed. She sort of reminds me of the elde...,supportive
1,The trans women reading this tweet right now i...,supportive
2,Question: These 4 broads who criticize America...,neutral
3,It is about time for all illegals to go back t...,neutral
4,For starters bend over the one in pink and kic...,hate speech


In [39]:
subdata[subdata.label =='hate speech']

Unnamed: 0,text,label
4,starter bend one pink kick ass pussi get tast ...,hate speech
7,fuck insuffer retard faggot,hate speech
14,white small littl invis clit disgrac 🤢 pay sma...,hate speech
17,suddenlywestan countri mental ill terrorist wh...,hate speech
18,sub full grade faggot comedi fuck ebola,hate speech
...,...,...
122428,stop suck fatass dick niggerlov kike,hate speech
122429,stupid asian bitch your idiot go back ching ch...,hate speech
122430,stupid asian bitch your idiot go back ching ch...,hate speech
122431,fuck fagot burn hell,hate speech


In [24]:
import string
def clean (text):
    text = str (text). lower()
    text = re. sub('[.?]', '', text) 
    text = re. sub('https?://\S+|www.\S+', '', text)
    text = re. sub('<.?>+', '', text)
    text = re. sub('[%s]' % re. escape(string. punctuation), '', text)
    text = re. sub('\n', '', text)
    text = re. sub('\w\d\w', '', text)
    text = [word for word in text.split(' ') if word not in stopword]
    text=" ". join(text)
    text = [stemmer. stem(word) for word in text. split(' ')]
    text=" ". join(text)
    return text

subdata["text"] = subdata["text"]. apply(clean)

A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  subdata["text"] = subdata["text"]. apply(clean)


In [25]:
subdata.label.value_counts()

supportive     53651
hate speech    49048
neutral        32857
Name: label, dtype: int64

## Splitting the data
The next important step is to explore the dataset and divide the dataset into training and testing data.

NLP三种词袋模型CountVectorizer/TFIDF/HashVectorizer
https://zhuanlan.zhihu.com/p/268886634

In [26]:
x = np. array(subdata["text"])
y = np. array(subdata["label"])

In [27]:
cv = CountVectorizer()
X = cv. fit_transform(x)
# Splitting the Data
#X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=42)

#StratifiedKFold 分层k折
from sklearn.model_selection import StratifiedKFold
skf = StratifiedKFold(n_splits=10)
skf.get_n_splits(X, y)


10

In [28]:
for train_index, test_index in skf.split(X, y):
    print("TRAIN:", train_index, "TEST:", test_index)
    X_train, X_test = X[train_index], X[test_index]
    y_train, y_test = y[train_index], y[test_index]

TRAIN: [ 11128  11129  11130 ... 135553 135554 135555] TEST: [    0     1     2 ... 18499 18501 18503]
TRAIN: [     0      1      2 ... 135553 135554 135555] TEST: [11128 11129 11130 ... 37097 37105 37106]
TRAIN: [     0      1      2 ... 135553 135554 135555] TEST: [22146 22154 22159 ... 55224 55226 55228]
TRAIN: [     0      1      2 ... 135553 135554 135555] TEST: [33103 33106 33110 ... 73695 73698 73701]
TRAIN: [     0      1      2 ... 135553 135554 135555] TEST: [44339 44342 44344 ... 92005 92008 92010]
TRAIN: [     0      1      2 ... 135553 135554 135555] TEST: [55315 55320 55326 ... 97512 97513 97514]
TRAIN: [     0      1      2 ... 135553 135554 135555] TEST: [ 66269  66274  66277 ... 102909 102910 102911]
TRAIN: [     0      1      2 ... 135553 135554 135555] TEST: [ 77389  77390  77398 ... 124346 124347 124349]
TRAIN: [     0      1      2 ... 135553 135554 135555] TEST: [ 88408  88409  88411 ... 130186 130187 130188]
TRAIN: [     0      1      2 ... 130186 130187 130188] 

In [29]:
tfv = TfidfVectorizer()
X2 = tfv. fit_transform(x)
#X2_train, X2_test, y2_train, y2_test = train_test_split(X2, y, test_size=0.33, random_state=42)

#StratifiedKFold 分层k折
skf = StratifiedKFold(n_splits=10)
skf.get_n_splits(X2, y)
for train_index, test_index in skf.split(X2, y):
    print("TRAIN:", train_index, "TEST:", test_index)
    X2_train, X2_test = X2[train_index], X2[test_index]
    y2_train, y2_test = y[train_index], y[test_index]

TRAIN: [ 11128  11129  11130 ... 135553 135554 135555] TEST: [    0     1     2 ... 18499 18501 18503]
TRAIN: [     0      1      2 ... 135553 135554 135555] TEST: [11128 11129 11130 ... 37097 37105 37106]
TRAIN: [     0      1      2 ... 135553 135554 135555] TEST: [22146 22154 22159 ... 55224 55226 55228]
TRAIN: [     0      1      2 ... 135553 135554 135555] TEST: [33103 33106 33110 ... 73695 73698 73701]
TRAIN: [     0      1      2 ... 135553 135554 135555] TEST: [44339 44342 44344 ... 92005 92008 92010]
TRAIN: [     0      1      2 ... 135553 135554 135555] TEST: [55315 55320 55326 ... 97512 97513 97514]
TRAIN: [     0      1      2 ... 135553 135554 135555] TEST: [ 66269  66274  66277 ... 102909 102910 102911]
TRAIN: [     0      1      2 ... 135553 135554 135555] TEST: [ 77389  77390  77398 ... 124346 124347 124349]
TRAIN: [     0      1      2 ... 135553 135554 135555] TEST: [ 88408  88409  88411 ... 130186 130187 130188]
TRAIN: [     0      1      2 ... 130186 130187 130188] 

## Building the model
After segregating the data, our next work is to find a good algorithm suited for our model. We can use a Decision tree classifier for building the Hate Speech detection project. Decision Trees are a type of Supervised Machine Learning used mainly for classification problems.

In [30]:
#Model building
model = DecisionTreeClassifier()
#Training the model
model. fit(X_train,y_train)

In [31]:
model2 = DecisionTreeClassifier()
#Training the model
model2. fit(X2_train,y2_train)

## Evaluating the results
The final step in machine learning model building is prediction. In this step, we can measure how well our model performs for the test input.

In [32]:
#Testing the model
y_pred = model. predict (X_test)
y_pred

array(['hate speech', 'hate speech', 'hate speech', ..., 'supportive',
       'supportive', 'supportive'], dtype=object)

In [33]:
#Testing the model
y2_pred = model2. predict (X2_test)
y2_pred

array(['hate speech', 'hate speech', 'hate speech', ..., 'supportive',
       'supportive', 'hate speech'], dtype=object)

In [34]:
#Accuracy Score of our model
from sklearn. metrics import accuracy_score, f1_score, recall_score
print (accuracy_score (y_test,y_pred))
print (f1_score (y_test,y_pred, average='micro'))
print (recall_score (y_test,y_pred, average='micro'))

0.8393950571744744
0.8393950571744744
0.8393950571744744


### skmetrics输出acc、precision、recall、f1值相同的问题
average='micro'的原理是：
把每个类别的TP、FP、FN先相加，再把这个问题当成二分类来进行计算

在某一类中被判断成FP的样本，在其他类中一定是FN的样本

解决方法的话就是换一种平均的方法average = 'macro’

这种方法是对于不同的类分别计算评估指标，然后加起来求平均
https://blog.csdn.net/fujikoo/article/details/119926390

In [35]:
#Accuracy Score of our model
from sklearn. metrics import accuracy_score, f1_score, recall_score
print (accuracy_score (y_test,y_pred))
print (f1_score (y_test,y_pred, average='macro'))
print (recall_score (y_test,y_pred, average='macro'))

0.8393950571744744
0.824345058618583
0.8287084916363843


In [36]:
#Accuracy Score of our model2
from sklearn. metrics import accuracy_score
print (accuracy_score (y2_test,y2_pred))
print (f1_score (y2_test,y2_pred, average='macro'))
print (recall_score (y2_test,y2_pred, average='macro'))

0.8906676503135375
0.8764654094066571
0.8759275550528535


In [49]:
#Predicting the outcome
inp = "fuck russians"
inp = cv.transform([inp]).toarray()
print(model.predict(inp))

['supportive']


In [51]:
#Predicting the outcome
inp = "fuck russians"
inp = tfv.transform([inp]).toarray()
print(model2.predict(inp))

['supportive']


## Conclusion
In this article, we have built a project for Hate Speech detection using Machine Learning. Hate speech is one of the serious issues we see on social media platforms like Facebook and Twitter. Hope you enjoyed this article by building a project to detect hate speech with Python.

# Use Russia-Ukraine Dataset

In [75]:
#"H:\课程\毕业论文\cleaned\tweet_ids_day_2022-2-22_clean.csv"
df = pd.read_csv("H:/课程/毕业论文/cleaned/tweet_ids_day_2022-2-22_clean.csv")

In [76]:
df = df[df.Tweet_isRT==0]

In [77]:
#model1: cv+ decision tree
def hate_detection_1(text):
    text = clean(text)
    inp = cv.transform([text]).toarray()
    result = model.predict(inp)
    return result[0]
    

In [78]:
#model1: tf-idf+ decision tree
def hate_detection_2(text):
    text = clean(text)
    inp = tfv.transform([text]).toarray()
    result = model2.predict(inp)
    return result[0]
    

In [79]:
df['is_Hate_m1'] = df['Tweet_content'].apply(hate_detection_1) 

In [80]:
df['is_Hate_m2'] = df['Tweet_content'].apply(hate_detection_2) 

In [81]:
df.is_Hate_m1.value_counts()

supportive     245
neutral        115
hate speech     26
Name: is_Hate_m1, dtype: int64

In [82]:
df.is_Hate_m2.value_counts()

supportive     269
neutral         94
hate speech     23
Name: is_Hate_m2, dtype: int64

In [83]:
df[df['is_Hate_m1']=="hate speech"].Tweet_content.iloc[15]

'why ukraina is fighting back russia, learn from our modiji simply ban russian apps and declare victory. russiaukrainecrisis russiaukraineconflict ukrainewar'

In [85]:
df[df['is_Hate_m2']=="hate speech"].Tweet_content.iloc[3]

'russian map is now bigger than the older one russian ukrainian donbass crimea moscow ukrainerussiacrisis ukraina kiev'

# 尝试换一下模型

In [86]:
from sklearn.naive_bayes import MultinomialNB
#让我们从朴素的贝叶斯分类器开始，它为该任务提供了一个很好的基准。 scikit-learn包含此分类器的多种变体； 多项式最适合单词计数：
clf = MultinomialNB().fit(X2_train,y2_train)

In [87]:
predicted = clf.predict(X2_test)

In [88]:
print (accuracy_score (y2_test,predicted))
print (f1_score (y2_test,predicted, average='macro'))
print (recall_score (y2_test,predicted, average='macro'))

0.8425673183327186
0.800798623542625
0.7944992042351556


In [91]:
x = np. array(subdata["text"])
y = np. array(subdata["label"])
# Splitting the Data
skf = StratifiedKFold(n_splits=10)
skf.get_n_splits(x, y)
for train_index, test_index in skf.split(x, y):
    print("TRAIN:", train_index, "TEST:", test_index)
    X3_train, X3_test = x[train_index], x[test_index]
    y3_train, y3_test = y[train_index], y[test_index]

TRAIN: [ 11128  11129  11130 ... 135553 135554 135555] TEST: [    0     1     2 ... 18499 18501 18503]
TRAIN: [     0      1      2 ... 135553 135554 135555] TEST: [11128 11129 11130 ... 37097 37105 37106]
TRAIN: [     0      1      2 ... 135553 135554 135555] TEST: [22146 22154 22159 ... 55224 55226 55228]
TRAIN: [     0      1      2 ... 135553 135554 135555] TEST: [33103 33106 33110 ... 73695 73698 73701]
TRAIN: [     0      1      2 ... 135553 135554 135555] TEST: [44339 44342 44344 ... 92005 92008 92010]
TRAIN: [     0      1      2 ... 135553 135554 135555] TEST: [55315 55320 55326 ... 97512 97513 97514]
TRAIN: [     0      1      2 ... 135553 135554 135555] TEST: [ 66269  66274  66277 ... 102909 102910 102911]
TRAIN: [     0      1      2 ... 135553 135554 135555] TEST: [ 77389  77390  77398 ... 124346 124347 124349]
TRAIN: [     0      1      2 ... 135553 135554 135555] TEST: [ 88408  88409  88411 ... 130186 130187 130188]
TRAIN: [     0      1      2 ... 130186 130187 130188] 

In [92]:
from sklearn.feature_extraction.text import TfidfTransformer
from sklearn.pipeline import Pipeline
text_clf = Pipeline([
    ('vect', CountVectorizer()),
    ('tfidf', TfidfTransformer()),
    ('clf', MultinomialNB()),
])

In [93]:
text_clf.fit(X3_train,y3_train)

In [94]:
#线性支持向量机（SVM）
from sklearn.linear_model import SGDClassifier
text_clf = Pipeline([
    ('vect', CountVectorizer()),
    ('tfidf', TfidfTransformer()),
    ('clf', SGDClassifier(loss='hinge', penalty='l2',
                          alpha=1e-3, random_state=42,
                          max_iter=5, tol=None)),
])

text_clf.fit(X3_train, y3_train)
y3_pred = text_clf.predict(X3_test)
print (accuracy_score (y3_test,y3_pred))
print (f1_score (y3_test,y3_pred, average='macro'))
print (recall_score (y3_test,y3_pred, average='macro'))

0.7576540022132054
0.588504571982918
0.6666666666666666


换了以后还不如之前的呢……

# 优化工作
从上面的结果可以看出，目前的分类器虽然在测试集上表现不错，但泛化性能很差。可能可以从以下方向尝试排查问题并优化分类器：
1. 考虑过拟合，学习相关代码
2. 检查训练集的数据格式与俄乌战争推特有无区别，将俄乌战争推特进一步清洗或转换成训练集一致格式
3. 优化特征工程，如加入句法标注器nltk pos tagger