# Processing Data from Annotated Datasets

The source datasets store their annotations in different formats, so we need to convert them to a unified format. This includes cleaning, transforming, and merging data from the different sources. The goal is to create a single, consistent dataset that can be used for further analysis and modeling. This notebook outlines the steps involved in this process.

In [1]:
import pandas as pd
import numpy as np
import dtale
from master_thesis.config import INTERIM_DATA_DIR, load_dataframe_from_pickle, save_dataframe_as_pickle

2025-03-14 22:14:14.927 | INFO     | master_thesis.config:<module>:12 - PROJ_ROOT path is: /home/takosaga/Projects/master_thesis


In [2]:
hatexplain_df = load_dataframe_from_pickle(INTERIM_DATA_DIR.as_posix() + '/hatexplain_df.pkl')
measuring_hate_speech_df = load_dataframe_from_pickle(INTERIM_DATA_DIR.as_posix() + '/measuring_hate_speech_df.pkl')
mlma_df = load_dataframe_from_pickle(INTERIM_DATA_DIR.as_posix() + '/mlma_df.pkl')

In [3]:
hatexplain_df.dtypes

post_id    object
text       object
label      object
target     object
dtype: object

### Understanding HateXplain Dataset


The following text is quoted from
`
{mathew2021hatexplain,
  title={HateXplain: A Benchmark Dataset for Explainable Hate Speech Detection},
  author={Mathew, Binny and Saha, Punyajoy and Yimam, Seid Muhie and Biemann, Chris and Goyal, Pawan and Mukherjee, Animesh},
  booktitle={Proceedings of the AAAI Conference on Artificial Intelligence},
  volume={35},
  number={17},
  pages={14867--14875},
  year={2021}
}
`

* Annotation Procedure
  
We use Amazon Mechanical Turk (MTurk) workers for our
annotation task. Each post in our dataset contains three types
of annotations. First, whether the text is a hate speech, offensive speech, or normal. Second, the target communities
in the text. Third, if the text is considered as hate speech, or
offensive by majority of the annotators, we further ask the
annotators to annotate parts of the text, which are words or
phrases that could be a potential reason for the given annotation. These additional span annotations allow us to further
explore how hate or offensive speech manifests itself.
* Target Group Annotation 

The primary goal of the annotation task is to determine whether a given text is hateful, offensive, or neither of the two, i.e. normal. As noted
above, we also get span annotations as reasons for the label
assigned to a post (hateful or offensive). To further enrich
the dataset, we ask the workers to decide the groups that the
hate/offensive speech is targeting. We included target groups
based on Race, Religion, Gender, Sexual Orientation etc.
* Annotation Instructions And Design Of The Interface
  
Before starting the annotation task, workers are explicitly
warned that the annotation task displays some hateful or
offensive content. We prepare instructions for workers that
clearly explain the goal of the annotation task, how to annotate spans and also include a definition for each category. We
provided multiple examples with classification, target community and span annotations to help the annotators understand the task. To further ensure high quality dataset, we use
built-in MTurk qualification requirements, namely the HIT
Approval Rate (95%) for all Requesters’ HITs and the Number of HITs Approved (5,000) requirements.
* Class labels: 
  
The class label (hateful, offensive, normal) of
a post was decided based on majority voting. We found 919
cases where all the three annotators chose a different class.
We did not consider these posts for our analysis.
To decide the target community of a post, we rely on majority voting. We consider that a target community is present in the post, if at least two out of the three annotators have
selected the target community. We also add a filter that the
community should be present in at least 100 posts. Based
on this criteria, our dataset had the following ten communities: African, Islam, Jewish, LGBTQ, Women, Refugee, Arab,
Caucasian, Hispanic, Asian. The target community information would allow researchers to delve into issues related
to bias in hate speech (Davidson, Bhattacharya, and Weber
2019). In our dataset, the top three communities that are targets of hate speech are the African, Islam, and Jewish community. In case of offensive speech, the top three targets
are Women, Africans, and LGBTQ. These observations are
in agreement with previous research (Silva et al. 2016).
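
The majority-voting rules quoted above can be illustrated with a small sketch. This is not the HateXplain authors' code: the function names and the example annotations are hypothetical, and the additional "at least 100 posts" community filter is not reproduced here.

```python
from collections import Counter


def majority_label(votes):
    """Return the label chosen by a strict majority of annotators, or None if there is none."""
    label, count = Counter(votes).most_common(1)[0]
    return label if count > len(votes) / 2 else None


def majority_targets(target_votes, min_votes=2):
    """Return the sorted communities selected by at least `min_votes` annotators."""
    # set() ensures one annotator cannot vote twice for the same community
    counts = Counter(t for annotator in target_votes for t in set(annotator))
    return sorted(t for t, c in counts.items() if c >= min_votes)


# Hypothetical annotations from three annotators for a single post
labels = ["hatespeech", "offensive", "hatespeech"]
targets = [["African", "Women"], ["African"], ["Women", "African"]]

print(majority_label(labels))
print(majority_targets(targets))
# Three different votes: no majority, so the post would be discarded
print(majority_label(["hatespeech", "offensive", "normal"]))
```

A post for which `majority_label` returns `None` corresponds to the cases the authors excluded because all three annotators disagreed.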

In [4]:
hatexplain_df.head()

Unnamed: 0,post_id,text,label,target
0,1179055004553900032_twitter,i dont think im getting my baby them white 9 h...,normal,[None]
1,1179063826874032128_twitter,we cannot continue calling ourselves feminists...,normal,[None]
2,1178793830532956161_twitter,nawt yall niggers ignoring me,normal,[African]
3,1179088797964763136_twitter,<user> i am bit confused coz chinese ppl can n...,hatespeech,[Asian]
4,1179085312976445440_twitter,this bitch in whataburger eating a burger with...,hatespeech,"[Caucasian, Women]"


In [5]:
d = dtale.show(hatexplain_df)
d.open_browser()

In [6]:
d.kill()

2025-03-14 22:14:15,110 - INFO     - Shutdown complete


In [7]:
mlma_df.dtypes

HITId         int64
tweet        object
sentiment    object
target       object
group        object
dtype: object

### Understanding MLMA Dataset

The following text is quoted from
` {ousidhoum-etal-multilingual-hate-speech-2019,
        title = "Multilingual and Multi-Aspect Hate Speech Analysis",
        author = "Ousidhoum, Nedjma
                 and Lin, Zizheng
                 and Zhang, Hongming
                and Song, Yangqiu
                and Yeung, Dit-Yan",
            booktitle = "Proceedings of EMNLP",
        year = "2019",
        publisher =	"Association for Computational Linguistics",
}	`

* Hostility type 

To identify the hostility type of
the tweet, we stick to the following conventions:
(1) if the tweet sounds dangerous, it should be labeled
as abusive; (2) according to the degree to
which it spreads hate and the tone its author uses, it
can be hateful, offensive or disrespectful; (3) if the
tweet expresses or spreads fear out of ignorance
against a group of individuals, it should be labeled
as fearful; (4) otherwise it should be annotated as
normal. We define this task to be multilabel. Table 2 shows that hostility types are relatively consistent across different languages and offensive is
the most frequent label.
* Target attribute 

After annotating the pilot
dataset, we noticed common misconceptions regarding race, ethnicity, and nationality, therefore
we merged these attributes into one label origin. Then, we asked the annotators to determine
whether the tweet insults or discriminates against
people based on their (1) origin, (2) religious affiliation, (3) gender, (4) sexual orientation, (5) special needs or (6) other. Table 2 shows there are
fewer tweets targeting disability in Arabic compared to English and French and no tweets insulting people based on their sexual orientation which
may be due to the fact that the labels of gender,
gender identity, and sexual orientation use almost
the same wording. On the other hand, French contains a small number of tweets targeting people
based on their gender in comparison to English
and Arabic. We have observed significant differences in terms of target attributes in the three languages. More data may help us examine the problems affecting targets of different linguistic backgrounds.
* Target group 

We determined 16 common target
groups tagged by the annotators after the first annotation step. The annotators had to decide on
whether the tweet is aimed at women, people of
African descent, Hispanics, gay people, Asians,
Arabs, immigrants in general, refugees; people
of different religious affiliations such as Hindu,
Christian, Jewish people, and Muslims; or from
political ideologies socialists, and others. We also
provided the annotators with a category to cover
hate directed towards one individual, which cannot be generalized. In case the tweet targets more
than one group of people, the annotators should
choose the group which would be the most affected by it according to them. Table 1 shows
the counts of the five categories out of 16 that
commonly occur in the three languages. In fact,
most of the tweets target individuals or fall into the
“other” category. In the latter case, they may target
people with different political views such as liberals or conservatives in English and French, or specific ethnic groups such as Kurdish people in Arabic. English tweets tend to have more tweets targeting people with special needs, due to common
language-specific demeaning terms used in conversations where people insult one another. Arabic tweets contain more hateful comments towards
women for the same reason. On the other hand, the
French corpus contains more tweets that are offensive towards African people, due to hateful comments generated by debates about immigrants.
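
Because the hostility type is multilabel, a single tweet can carry several types at once. The sketch below assumes the labels are stored as one underscore-joined string, which is how the `sentiment` column appears in the sample rows of this notebook; the helper name `split_hostility_labels` is hypothetical.

```python
import pandas as pd

# Hostility types named in the quoted annotation guidelines
HOSTILITY_TYPES = {"abusive", "hateful", "offensive", "disrespectful", "fearful", "normal"}


def split_hostility_labels(label):
    """Split an underscore-joined label string into a set of hostility types."""
    parts = set(label.split("_"))
    unknown = parts - HOSTILITY_TYPES
    if unknown:
        # Fail loudly instead of silently accepting an unexpected label
        raise ValueError(f"Unexpected hostility types: {sorted(unknown)}")
    return parts


example = pd.Series([
    "offensive",
    "offensive_hateful",
    "fearful_abusive_hateful_disrespectful_normal",
])
label_sets = example.apply(split_hostility_labels)
is_hateful = label_sets.apply(lambda s: "hateful" in s).tolist()
print(is_hateful)
```

Working with sets makes it possible to test for an exact hostility type rather than a substring of the joined label.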

In [8]:
mlma_df.head()

Unnamed: 0,HITId,tweet,sentiment,target,group
0,0,If America had another 8 years of Obama's ideo...,fearful_abusive_hateful_disrespectful_normal,origin,other
1,1,Most Canadians have never met seen or associat...,offensive,disability,special_needs
2,2,Hahaha grow up faggot @URL,offensive,sexual_orientation,women
3,3,@user queue is fucking retarded it makes every...,offensive_hateful,disability,special_needs
5,5,dude i\u2019m so tired of being retarded,offensive,disability,special_needs


In [9]:
d = dtale.show(mlma_df)
d.open_browser()

In [10]:
d.kill()

2025-03-14 22:14:15,195 - INFO     - Shutdown complete


### Understanding Measuring Hate Speech Dataset


The following text is quoted from
`
{kennedy2020constructing,
  title={Constructing interval variables via faceted Rasch measurement and multitask deep learning: a hate speech application},
  author={Kennedy, Chris J and Bacon, Geoff and Sahn, Alexander and von Vacano, Claudia},
  journal={arXiv preprint arXiv:2009.10277},
  year={2020}
}
`

Table 3. Example comment per theorized level of the proposed hate speech – counterspeech spectrum.

| Level            | Text                                                                                       | Hate score |
|------------------|--------------------------------------------------------------------------------------------|------------|
| 5. Genocidal     | Every g**k in #LosAngeles should be deported or killed.                                    | 2.69       |
| 4. Violence      | #whitepower...I'll hang you n****r                                                         | 2.55       |
| 3. Dehumanization| You're all a bunch of human cockroaches is what you are... You ch**ks...                   | 1.51       |
| 2. Hostility     | Learn the fucking language you fucking useless immigrant.                                  | 1.18       |
| 1. Bias          | @[NAME] They are arabs. Do you need any other explanations?                                | 0.90       |
| 0. Neutral       | Go get a job at Dick's Sporting Goods and try to work at being a better person.            | -0.50      |
| -1. Supportive   | I'm bi. And a good listener if you need a friend                                           | -2.99      |
| -2. Counterspeech| No, the chances of a muslim shooting you in America is almost nil. There are over 50K gun deaths every year christian USA... | -0.82      |


* 3.2 Labeling instrument

In order to evaluate the comments we needed one or more human reviewers (also
called “annotators”, “labelers”, “judges”, “raters”, or “moderators”) to examine the
comment text and provide data to estimate where each comment fell on each of our
theorized components of hate speech. Most labeling tasks for training data give the
labeler a detailed definition of the construct and then ask them to assign a binary
label to each data point (e.g. designating a block of text as hate speech or not,
designating whether an image contains a stop sign). This approach has two
shortcomings: labelers cannot indicate uncertainty and if the construct has multiple
components that labelers differ on, the label does not indicate which element they
disagree in. The approach described below overcomes both of these issues by
decomposing the construct of hate speech into multiple labeling items and by giving
labelers Likert-style response options to incorporate uncertainty.
The labeling instrument, similar to a survey instrument, has three sections: 1)
identity target items, which establish whether the comment targets a protected group,
2) scale items that measure the content of the comment along several distinguishing
features of hate speech, and 3) a set of demographic questions asked about the
labelers. The target items and scale items are asked for each of the comments that
labelers review, and then are followed by the demographic items. One of the scale
items, sentiment, is asked before the target items, to get one measure for items that
target a non-protected group or no group at all. If no identity groups were mentioned
they were not asked any remaining scale items and proceeded to the next comment. If
at least one identity group was mentioned they were asked to specify the sub-identity
group(s),[2] and then asked the remaining scale items. All comments were also rated on
a binary hate speech item sourced from Siegel et al. (2019) to allow comparison to the
current best practice in binary hate-speech measurement.
Differences in labeler knowledge and views make consistent annotations difficult to
obtain. We address differences in labeler knowledge by providing a dictionary tool for
niche slurs that appear in the comments we showed to labelers. Using a new
dictionary, slur words were underlined in our survey user interface. If the user moved
their mouse over the underlined slur word they would be shown a tooltip stating “This
word may be a slur against identity group [X]” (see Figure 1 for an example). This
user interface feature was intended to reduce response variation due to varying
awareness of slur terms, as well as to make noticeable any coded slur language in the
comment (for more details on the problem of covert slurs see Magu et al. 2017).
Figure 1. Highlighting slurs for human labelers. We highlighted known slurs for
our human labelers in the annotation interface.
After rating the comments reviewers were asked a series of demographic questions
about themselves, followed by an optional free response feedback item. The
demographic items included the reviewer’s gender, education, race, year of birth,
income, religion, sexual orientation, and political ideology.
* 3.3 Comment collection

We sourced our comments from three major social media platforms: YouTube,
Twitter, and Reddit. We chose these platforms for their popularity, as respectively,
they are used by 73%, 22%, and 11% of U.S. adults (Perrin et al. 2019). Prior work on
hate speech has often focused on a single platform, commonly Twitter, but our goal
was to study hate speech in a variety of settings and to ultimately build an
algorithmic model to accurately measure hate speech across multiple platforms
(Fortuna et al. 2018). We used public APIs to download recent comments posted to
each site. Comments were considered eligible for labeling if they were written
primarily in English and were not too short (< 4 characters) or too long (> 600
characters) after removing URLs, phone numbers and contiguous whitespace. Our
comment collection took place between March and August 2019.

[2] As described in the construct theorization section, groups consisted of the categories protected
under US law (e.g., religion), while the sub-identity groups are a short list of the most commonly
occurring groups within these categories (e.g., Muslims). See Appendix zz for full list of identity and
sub-identity groups.

On Reddit we collected all comments from the real-time stream of the subreddit
“/r/all”. For Twitter, we collected tweets from Twitter’s streaming API, which is a
random sample of all tweets on Twitter. YouTube required additional consideration
because one must first select videos and then download comments associated with the
selected videos. We searched for videos within proximity of the top 300 most
populated U.S. cities in order to focus on videos originating in the U.S. and most
likely to contain English comments with U.S.-based authors. From those videos we
then downloaded all comments and responses.
* 3.4 Crowdsourced labeling

Hate speech is a rare phenomenon, estimated at less than 1% of online comments
when viewed as a binary outcome, so randomly sampling from the collected comments
would not have been efficient. That is, the outcome in the labeled data would be
highly imbalanced at < 1% hate speech and 99% non-hate speech, which would make
it difficult for statistical machine learning analysis to find patterns that differentiate
between hate speech and non-hate speech and costly for our labeling process. Instead,
we used a sampling method that would increase the relevance of the labeled comments
to our theorized levels of hate speech; we targeted an even distribution of labeled
comments across our 8 levels (12.5% each). We also wanted to avoid common
shortcuts to increase rates of hate speech in labeled text, such as filtering on slur
terms or Twitter hashtags. Those approaches would artificially reduce the linguistic
variation in the comments and allow the deep learning to learn those shortcuts
(confounded associations) without capturing true patterns (i.e. causal relationships),
which is known as the “Clever Hans” effect (Heinzerling 2019; Niven et al. 2019). In an
effort to maximize the generalizability of our deep learning algorithm, we maintained a
positive probability of selection for all sampled comments (i.e. no comments would be
excluded based on their word usage).
Our sampling method relied on two dimensions for stratified sampling: 1) a
relevance estimate of how likely the comment was to contain a target identity group,
and 2) a hypothesis score for how hateful the comment was estimated to be. Both
scores were built from a pilot set of 4,000 labeled comments, using pre-trained
Universal Sentence Encoder representations (TensorFlow) plus a genetically optimized
prediction head (Olson, Urbanowicz, et al. 2016). For identity prediction the genetic
optimization algorithm selected a multilayer perceptron model while for the hypothesis
score it selected a random forest.[3] With each future iteration of the project we can
leverage the models developed in the prior iteration to improve the stratified sampling
efficiency.
We used the identity relevance and hate speech hypothesis scores to create five
stratification bins: 1) irrelevant (i.e. estimated to contain no references to identity
groups), 2) relevant and low on predicted hate speech score (potential counterspeech
or positive identity speech), 3) relevant and moderate on predicted hate speech score
(neutral), 4) relevant and high on predicted hate speech score (low or moderate
intensity hate speech), and 5) relevant and very high on predicted hate speech score
(violent hate speech). We heavily oversampled bins 2, 4, and 5, and undersampled
bins 1 and 3. Because this stratification scheme covered all comments, each comment
had a positive probability of being sampled, but we improved the likelihood of labeling
comments that were some form of hate speech or counterspeech. As in a case-control
study, this biased sample could be re-weighted back to the original population of
comments through inverse probability weighting (Horvitz et al. 1952). We
incorporated platform sample size targets such that our labeled data consisted of 40%
sourced from Reddit, 40% from Twitter, and 20% from YouTube.

[3] In the pilot set we used simpler scores to create the stratification bins. The maximum cosine
similarity to an identity-term dictionary was used for relevance estimation in the pilot. For our pilot
hypothesis score we used the Perspective API’s identity attack model, which gave a predicted probability
of the comment being hate speech (https://perspectiveapi.com).
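
The inverse probability weighting mentioned in the quoted text can be illustrated with a toy calculation. The population shares and sampling probabilities below are invented for illustration only and are not taken from the paper; the point is that weighting each sampled comment by the inverse of its selection probability recovers the population shares.

```python
import pandas as pd

# Hypothetical share of each stratification bin in the full comment population,
# and the hypothetical probability that a comment from that bin gets sampled
bins = pd.DataFrame({
    "bin": [1, 2, 3, 4, 5],
    "population_share": [0.70, 0.05, 0.20, 0.04, 0.01],
    "sampling_prob": [0.01, 0.20, 0.02, 0.30, 0.50],
})

# Inverse probability weight: one sampled comment stands in for 1/p population comments
bins["weight"] = 1.0 / bins["sampling_prob"]

# Expected share of each bin in the (biased) labeled sample
sample_mass = bins["population_share"] * bins["sampling_prob"]
bins["sample_share"] = sample_mass / sample_mass.sum()

# Re-weighting the biased sample shares gives back the population shares
reweighted = bins["sample_share"] * bins["weight"]
bins["recovered_share"] = reweighted / reweighted.sum()

print(bins[["bin", "sample_share", "recovered_share"]].round(3))
```

In this toy setup the rare bins 4 and 5 dominate the labeled sample, yet the weighted shares match the original population.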

In [11]:
hatexplain_df = load_dataframe_from_pickle(INTERIM_DATA_DIR.as_posix() + '/hatexplain_df.pkl')
measuring_hate_speech_df = load_dataframe_from_pickle(INTERIM_DATA_DIR.as_posix() + '/measuring_hate_speech_df.pkl')
mlma_df = load_dataframe_from_pickle(INTERIM_DATA_DIR.as_posix() + '/mlma_df.pkl')

In [12]:
measuring_hate_speech_df.dtypes

comment_id                                 int32
target_race_asian                           bool
target_race_black                           bool
target_race_latinx                          bool
target_race_middle_eastern                  bool
target_race_native_american                 bool
target_race_pacific_islander                bool
target_race_white                           bool
target_race_other                           bool
target_race                                 bool
target_religion_atheist                     bool
target_religion_buddhist                    bool
target_religion_christian                   bool
target_religion_hindu                       bool
target_religion_jewish                      bool
target_religion_mormon                      bool
target_religion_muslim                      bool
target_religion_other                       bool
target_religion                             bool
target_origin_immigrant                     bool
target_origin_migran

In [13]:
measuring_hate_speech_df.head()

Unnamed: 0,comment_id,target_race_asian,target_race_black,target_race_latinx,target_race_middle_eastern,target_race_native_american,target_race_pacific_islander,target_race_white,target_race_other,target_race,...,target_disability_cognitive,target_disability_neurological,target_disability_visually_impaired,target_disability_hearing_impaired,target_disability_unspecific,target_disability_other,target_disability,text,hate_speech_score,platform
0,1,False,False,False,False,False,False,True,False,True,...,False,False,False,False,False,False,False,White people are trash.,0.46,0
1,2,False,False,False,False,False,False,False,False,False,...,False,False,False,False,False,False,False,"You've caught the big gay, it's known to be ai...",0.03,0
2,3,False,True,False,False,False,False,False,False,True,...,False,False,False,False,False,False,False,"I'm not saying you said that, I'm just saying ...",-1.29,0
3,4,False,False,False,False,False,False,False,False,False,...,False,False,False,False,False,False,False,Donald Trump. Yeet myself off a building onto ...,-0.24,0
4,5,False,True,True,False,False,False,False,False,True,...,False,False,False,False,False,False,False,Fabrice Fabrice is ostensibly black or black/l...,-2.84,0
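
The wide boolean `target_*` columns shown above could later be collapsed into one list-valued column, similar to the `target` lists in HateXplain. The following is only a sketch of one possible approach on a toy frame; the helper `collect_targets` is hypothetical, and it keeps aggregate flags such as `target_race` alongside the specific ones.

```python
import pandas as pd


def collect_targets(df, prefix="target_"):
    """Return a Series of lists naming the `prefix` columns that are True in each row."""
    target_cols = [c for c in df.columns if c.startswith(prefix)]
    names = [c[len(prefix):] for c in target_cols]
    flags = df[target_cols].to_numpy()
    return pd.Series(
        [[name for name, flag in zip(names, row) if flag] for row in flags],
        index=df.index,
    )


# Toy frame with a few of the column names from the dtypes listing above
toy = pd.DataFrame({
    "comment_id": [1, 2],
    "target_race_white": [True, False],
    "target_race": [True, False],
    "target_religion_muslim": [False, False],
    "text": ["example a", "example b"],
})
toy["original_target"] = collect_targets(toy)
print(toy["original_target"].tolist())
```

Rows without any flagged target end up with an empty list.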


In [14]:
d = dtale.show(measuring_hate_speech_df)
d.open_browser()

In [15]:
d.kill()

2025-03-14 22:14:15,578 - INFO     - Shutdown complete


# Adjusting columns from datasets

* original_id
* original_dataset_title
* platform
* original_label
* original_target (list)
* label_hatespeech_binary_offensive_not_included
* label_hatespeech_binary_offensive_included
* label_normal_offensive_hatespeech
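
A small check like the following can confirm that each adjusted DataFrame ends up with the unified columns. It is a sketch, not part of the original pipeline: the helper name is hypothetical, the column names follow the code cells in this notebook (which use `original_target`), and `text` is included on the assumption that every dataset keeps its text column.

```python
import pandas as pd

# Unified schema, using the column names from the code cells in this notebook
UNIFIED_COLUMNS = [
    "text",
    "original_id",
    "original_dataset_title",
    "platform",
    "original_label",
    "original_target",
    "label_hatespeech_binary_offensive_not_included",
    "label_hatespeech_binary_offensive_included",
    "label_normal_offensive_hatespeech",
]


def missing_unified_columns(df):
    """Return the unified columns that are not yet present in `df`."""
    return [c for c in UNIFIED_COLUMNS if c not in df.columns]


# Toy frame that has only been partially adjusted
partial = pd.DataFrame({"text": ["example"], "original_id": ["1"], "platform": ["twitter"]})
print(missing_unified_columns(partial))
```

Running the check after each dataset's adjustment section makes a forgotten rename visible before the datasets are merged.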

### HateXplain Dataset Adjustments


In [16]:
hatexplain_df.head()

Unnamed: 0,post_id,text,label,target
0,1179055004553900032_twitter,i dont think im getting my baby them white 9 h...,normal,[None]
1,1179063826874032128_twitter,we cannot continue calling ourselves feminists...,normal,[None]
2,1178793830532956161_twitter,nawt yall niggers ignoring me,normal,[African]
3,1179088797964763136_twitter,<user> i am bit confused coz chinese ppl can n...,hatespeech,[Asian]
4,1179085312976445440_twitter,this bitch in whataburger eating a burger with...,hatespeech,"[Caucasian, Women]"


In [17]:
def split_twitter_id(df, id_column, new_id_column='original_id', new_platform_column='platform'):
    """
    Splits a column of Twitter-style IDs (e.g., '1179055004553900032_twitter') into original ID and platform.

    Args:
        df (pd.DataFrame): The DataFrame containing the IDs.
        id_column (str): The name of the column containing the IDs.
        new_id_column (str): The name of the new column for the original IDs.
        new_platform_column (str): The name of the new column for the platform.

    Returns:
        pd.DataFrame: The DataFrame with the new columns added.
    """

    def split_id(id_str):
        parts = str(id_str).split('_', 1)  # Split only once at the first underscore
        if len(parts) == 2:
            return parts[0], parts[1]
        else:
            return None, None #Handle cases where the split does not work.

    df[[new_id_column, new_platform_column]] = df[id_column].apply(split_id).apply(pd.Series)
    return df

In [18]:
hatexplain_df = split_twitter_id(hatexplain_df, 'post_id')

In [19]:
hatexplain_df.head()

Unnamed: 0,post_id,text,label,target,original_id,platform
0,1179055004553900032_twitter,i dont think im getting my baby them white 9 h...,normal,[None],1179055004553900032,twitter
1,1179063826874032128_twitter,we cannot continue calling ourselves feminists...,normal,[None],1179063826874032128,twitter
2,1178793830532956161_twitter,nawt yall niggers ignoring me,normal,[African],1178793830532956161,twitter
3,1179088797964763136_twitter,<user> i am bit confused coz chinese ppl can n...,hatespeech,[Asian],1179088797964763136,twitter
4,1179085312976445440_twitter,this bitch in whataburger eating a burger with...,hatespeech,"[Caucasian, Women]",1179085312976445440,twitter
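
As a possible alternative to the row-wise `apply` above, the same split can be written with pandas' vectorized string methods. This is a sketch on a toy Series; the second ID is a made-up example of a non-Twitter suffix.

```python
import pandas as pd

# Toy IDs in the '<id>_<platform>' format; the second value is hypothetical
ids = pd.Series(["1179055004553900032_twitter", "24198545_gab"])

# n=1 splits only at the first underscore; expand=True returns one column per part
parts = ids.str.split("_", n=1, expand=True)
parts.columns = ["original_id", "platform"]
print(parts)
```

One difference from `split_twitter_id`: an ID without an underscore keeps its value in the first column here, while `split_twitter_id` returns `None` for both columns.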


In [20]:
d = dtale.show(hatexplain_df)
d.open_browser()


In [21]:
d.kill()

2025-03-14 22:14:17,249 - INFO     - Shutdown complete


In [22]:
hatexplain_df.drop('post_id', axis=1, inplace=True)

In [23]:
hatexplain_df['original_dataset_title'] = 'HateXplain'
hatexplain_df.head()

Unnamed: 0,text,label,target,original_id,platform,original_dataset_title
0,i dont think im getting my baby them white 9 h...,normal,[None],1179055004553900032,twitter,HateXplain
1,we cannot continue calling ourselves feminists...,normal,[None],1179063826874032128,twitter,HateXplain
2,nawt yall niggers ignoring me,normal,[African],1178793830532956161,twitter,HateXplain
3,<user> i am bit confused coz chinese ppl can n...,hatespeech,[Asian],1179088797964763136,twitter,HateXplain
4,this bitch in whataburger eating a burger with...,hatespeech,"[Caucasian, Women]",1179085312976445440,twitter,HateXplain


In [24]:
column_remapping = {
    'label':'original_label',
    'target':'original_target',
}

hatexplain_df = hatexplain_df.rename(columns=column_remapping)
hatexplain_df.head()

Unnamed: 0,text,original_label,original_target,original_id,platform,original_dataset_title
0,i dont think im getting my baby them white 9 h...,normal,[None],1179055004553900032,twitter,HateXplain
1,we cannot continue calling ourselves feminists...,normal,[None],1179063826874032128,twitter,HateXplain
2,nawt yall niggers ignoring me,normal,[African],1178793830532956161,twitter,HateXplain
3,<user> i am bit confused coz chinese ppl can n...,hatespeech,[Asian],1179088797964763136,twitter,HateXplain
4,this bitch in whataburger eating a burger with...,hatespeech,"[Caucasian, Women]",1179085312976445440,twitter,HateXplain


In [25]:
hatexplain_df['label_hatespeech_binary_offensive_not_included'] = hatexplain_df['original_label'].apply(
    lambda x: 'hatespeech' if x == 'hatespeech' else 'not_hatespeech'
)

# Create label_hatespeech_binary (offensive included) as text
hatexplain_df['label_hatespeech_binary_offensive_included'] = hatexplain_df['original_label'].apply(
    lambda x: 'hatespeech/offensive' if x in ['hatespeech', 'offensive'] else 'normal'
)

# Create label_normal_offensive_hatespeech as text
hatexplain_df['label_normal_offensive_hatespeech'] = hatexplain_df['original_label']

In [26]:
hatexplain_df.head()

Unnamed: 0,text,original_label,original_target,original_id,platform,original_dataset_title,label_hatespeech_binary_offensive_not_included,label_hatespeech_binary_offensive_included,label_normal_offensive_hatespeech
0,i dont think im getting my baby them white 9 h...,normal,[None],1179055004553900032,twitter,HateXplain,not_hatespeech,normal,normal
1,we cannot continue calling ourselves feminists...,normal,[None],1179063826874032128,twitter,HateXplain,not_hatespeech,normal,normal
2,nawt yall niggers ignoring me,normal,[African],1178793830532956161,twitter,HateXplain,not_hatespeech,normal,normal
3,<user> i am bit confused coz chinese ppl can n...,hatespeech,[Asian],1179088797964763136,twitter,HateXplain,hatespeech,hatespeech/offensive,hatespeech
4,this bitch in whataburger eating a burger with...,hatespeech,"[Caucasian, Women]",1179085312976445440,twitter,HateXplain,hatespeech,hatespeech/offensive,hatespeech


In [27]:
hatexplain_df.dtypes

text                                              object
original_label                                    object
original_target                                   object
original_id                                       object
platform                                          object
original_dataset_title                            object
label_hatespeech_binary_offensive_not_included    object
label_hatespeech_binary_offensive_included        object
label_normal_offensive_hatespeech                 object
dtype: object

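The original labels are normal, offensive, and hatespeech. I added two binary columns: one that treats only hatespeech as the positive class (text labeled offensive counts as not_hatespeech) and one that groups offensive text together with hatespeech. A third column keeps the original three-class label.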
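
The same mapping can also be written with explicit dictionaries, which may be easier to audit than the lambdas. This is a sketch on a toy Series; the dictionary names and the `unknown` label are hypothetical.

```python
import pandas as pd

# Offensive text is NOT counted as hate speech
BINARY_OFFENSIVE_EXCLUDED = {
    "normal": "not_hatespeech",
    "offensive": "not_hatespeech",
    "hatespeech": "hatespeech",
}
# Offensive text is grouped together with hate speech
BINARY_OFFENSIVE_INCLUDED = {
    "normal": "normal",
    "offensive": "hatespeech/offensive",
    "hatespeech": "hatespeech/offensive",
}

labels = pd.Series(["normal", "offensive", "hatespeech", "unknown"])
excluded = labels.map(BINARY_OFFENSIVE_EXCLUDED)
included = labels.map(BINARY_OFFENSIVE_INCLUDED)
print(pd.DataFrame({"original": labels, "excluded": excluded, "included": included}))
```

With `Series.map`, a label that is missing from the dictionary becomes `NaN` instead of silently falling into the `else` branch, so unexpected labels are easy to spot.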

### MLMA Dataset Adjustments


In [28]:
mlma_df.head()

Unnamed: 0,HITId,tweet,sentiment,target,group
0,0,If America had another 8 years of Obama's ideo...,fearful_abusive_hateful_disrespectful_normal,origin,other
1,1,Most Canadians have never met seen or associat...,offensive,disability,special_needs
2,2,Hahaha grow up faggot @URL,offensive,sexual_orientation,women
3,3,@user queue is fucking retarded it makes every...,offensive_hateful,disability,special_needs
5,5,dude i\u2019m so tired of being retarded,offensive,disability,special_needs


In [29]:
mlma_df['original_dataset_title'] = 'MLMA'
mlma_df['platform'] = 'twitter'

In [30]:
column_remapping = {
    'sentiment':'original_label',
    'tweet':'text',
    'HITId': 'original_id'
}

mlma_df = mlma_df.rename(columns=column_remapping)
mlma_df.head()

Unnamed: 0,original_id,text,original_label,target,group,original_dataset_title,platform
0,0,If America had another 8 years of Obama's ideo...,fearful_abusive_hateful_disrespectful_normal,origin,other,MLMA,twitter
1,1,Most Canadians have never met seen or associat...,offensive,disability,special_needs,MLMA,twitter
2,2,Hahaha grow up faggot @URL,offensive,sexual_orientation,women,MLMA,twitter
3,3,@user queue is fucking retarded it makes every...,offensive_hateful,disability,special_needs,MLMA,twitter
5,5,dude i\u2019m so tired of being retarded,offensive,disability,special_needs,MLMA,twitter


In [31]:
d = dtale.show(mlma_df)
d.open_browser()

In [32]:
d.kill()

2025-03-14 22:14:17,424 - INFO     - Shutdown complete


In [33]:
import numpy as np

# label_hatespeech_binary_offensive_not_included (as text)
mlma_df['label_hatespeech_binary_offensive_not_included'] = np.where(mlma_df['original_label'].str.contains('hateful'), 'hatespeech', 'not_hatespeech')

# label_hatespeech_binary_offensive_included (as text)
mlma_df['label_hatespeech_binary_offensive_included'] = np.where(mlma_df['original_label'].str.contains('hateful') | mlma_df['original_label'].str.contains('offensive'), 'hatespeech/offensive', 'normal')

# label_normal_offensive_hatespeech (as text)
conditions = [
    mlma_df['original_label'].str.contains('hateful'),
    mlma_df['original_label'].str.contains('offensive')
]

choices = ['hatespeech', 'offensive']
mlma_df['label_normal_offensive_hatespeech'] = np.select(conditions, choices, default='normal')

In [34]:
mlma_df.head()

Unnamed: 0,original_id,text,original_label,target,group,original_dataset_title,platform,label_hatespeech_binary_offensive_not_included,label_hatespeech_binary_offensive_included,label_normal_offensive_hatespeech
0,0,If America had another 8 years of Obama's ideo...,fearful_abusive_hateful_disrespectful_normal,origin,other,MLMA,twitter,hatespeech,hatespeech/offensive,hatespeech
1,1,Most Canadians have never met seen or associat...,offensive,disability,special_needs,MLMA,twitter,not_hatespeech,hatespeech/offensive,offensive
2,2,Hahaha grow up faggot @URL,offensive,sexual_orientation,women,MLMA,twitter,not_hatespeech,hatespeech/offensive,offensive
3,3,@user queue is fucking retarded it makes every...,offensive_hateful,disability,special_needs,MLMA,twitter,hatespeech,hatespeech/offensive,hatespeech
5,5,dude i’m so tired of being retarded,offensive,disability,special_needs,MLMA,twitter,not_hatespeech,hatespeech/offensive,offensive


In [35]:
mlma_df['original_target'] = mlma_df.apply(lambda row: [row['target'], row['group']], axis=1)

In [36]:
mlma_df.drop(columns=['target', 'group'], inplace=True)
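Row-wise `apply` is easy to read but can be slow on larger frames. As a possible alternative, assuming both columns hold plain strings, the same two-element lists can be built without a Python-level lambda. The toy frame below is illustrative only:

```python
import pandas as pd

# Toy stand-in for mlma_df with the two columns to be merged
toy_df = pd.DataFrame(
    {'target': ['origin', 'disability'], 'group': ['other', 'special_needs']}
)

# .values.tolist() yields one [target, group] list per row
toy_df['original_target'] = toy_df[['target', 'group']].values.tolist()

# The source columns are no longer needed once merged
toy_df = toy_df.drop(columns=['target', 'group'])
print(toy_df)
```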

In [37]:
d = dtale.show(mlma_df)
d.open_browser()

### Measuring Hate Speech Dataset Adjustments

In [38]:
measuring_hate_speech_df.head()

Unnamed: 0,comment_id,target_race_asian,target_race_black,target_race_latinx,target_race_middle_eastern,target_race_native_american,target_race_pacific_islander,target_race_white,target_race_other,target_race,...,target_disability_cognitive,target_disability_neurological,target_disability_visually_impaired,target_disability_hearing_impaired,target_disability_unspecific,target_disability_other,target_disability,text,hate_speech_score,platform
0,1,False,False,False,False,False,False,True,False,True,...,False,False,False,False,False,False,False,White people are trash.,0.46,0
1,2,False,False,False,False,False,False,False,False,False,...,False,False,False,False,False,False,False,"You've caught the big gay, it's known to be ai...",0.03,0
2,3,False,True,False,False,False,False,False,False,True,...,False,False,False,False,False,False,False,"I'm not saying you said that, I'm just saying ...",-1.29,0
3,4,False,False,False,False,False,False,False,False,False,...,False,False,False,False,False,False,False,Donald Trump. Yeet myself off a building onto ...,-0.24,0
4,5,False,True,True,False,False,False,False,False,True,...,False,False,False,False,False,False,False,Fabrice Fabrice is ostensibly black or black/l...,-2.84,0


In [39]:
d = dtale.show(measuring_hate_speech_df)
d.open_browser()

In [40]:
d.kill()

2025-03-14 22:14:17,881 - INFO     - Shutdown complete


In [41]:
measuring_hate_speech_df['platform'].value_counts()

platform
0    15842
2    15475
3     8178
1       70
Name: count, dtype: int64

`Pratik Sachdeva, Renata Barreto, Geoff Bacon, Alexander Sahn, Claudia von Vacano, and Chris Kennedy. 2022. The Measuring Hate Speech Corpus: Leveraging Rasch Measurement Theory for Data Perspectivism. In Proceedings of the 1st Workshop on Perspectivist Approaches to NLP @LREC2022, pages 83–94, Marseille, France. European Language Resources Association.`

We sourced comments from three major platforms–YouTube, Twitter, and Reddit–performing collection between March and August 2019. We only considered comments that were written primarily in English and were between 4 and 600 characters. Additionally, we aimed to source 40% of the corpus from Reddit, 40% from Twitter, and 20% from YouTube.

Footnotes from the paper: [1] https://huggingface.co/datasets/ucberkeley-dlab/measuring-hate-speech, [2] https://github.com/dlab-projects/hate_measure_data


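The integer codes in `platform` are inferred as follows:

* 0 is Reddit, since these comments use the `u/username` Reddit syntax
* 3 is YouTube, since it accounts for roughly 20% of the rows
* 2 is Twitter, the remaining major platform
* 1 is unknown (only 70 rows) and will be removed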
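The share-based part of this inference can be checked with a short sketch. It assumes only the counts printed by `value_counts()` above; `platform_counts` and `platform_shares` are illustrative names:

```python
import pandas as pd

# Counts copied from the value_counts() output above (platform code -> rows)
platform_counts = pd.Series({0: 15842, 2: 15475, 3: 8178, 1: 70})

# Fraction of the corpus per platform code
platform_shares = (platform_counts / platform_counts.sum()).round(3)
print(platform_shares)
```

Codes 0 and 2 each hold about 40% of the rows and code 3 about 21%, which is consistent with the 40/40/20 sourcing target quoted from the paper. The shares alone cannot tell Reddit from Twitter, which is why the `u/username` syntax is used for code 0.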


In [42]:
# Remove rows whose platform code is 1 (unknown).
# .copy() makes the result an independent DataFrame, so the assignment
# below does not trigger pandas' SettingWithCopyWarning.
measuring_hate_speech_df = measuring_hate_speech_df[measuring_hate_speech_df['platform'] != 1].copy()

# Replace the integer codes with platform names
measuring_hate_speech_df['platform'] = measuring_hate_speech_df['platform'].replace({
    0: 'reddit',
    2: 'twitter',
    3: 'youtube'
})


In [43]:
measuring_hate_speech_df['platform'].value_counts()

platform
reddit     15842
twitter    15475
youtube     8178
Name: count, dtype: int64

In [44]:
# Collect every target_* indicator column
target_columns = [
    column for column in measuring_hate_speech_df.columns
    if column.startswith('target')
]

In [None]:
measuring_hate_speech_df['original_target'] = measuring_hate_speech_df.apply(lambda row: [
    col.split('_')[-1] if len(col.split('_')) == 2 else '_'.join(col.split('_')[2:])
    for col in target_columns if row[col]
], axis=1)
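The name-shortening rule inside the lambda is easier to follow in isolation. A minimal sketch, using a hypothetical helper `target_name` that applies the same expression to a few column names from the dataframe shown earlier:

```python
def target_name(column: str) -> str:
    """Strip the 'target_' prefix and, for sub-categories, the group name too."""
    parts = column.split('_')
    # Two parts means a group-level column, e.g. 'target_race' -> 'race'.
    # Otherwise drop 'target' and the group, e.g. 'target_race_asian' -> 'asian'.
    return parts[-1] if len(parts) == 2 else '_'.join(parts[2:])


examples = ['target_race', 'target_race_asian', 'target_disability_visually_impaired']
print([target_name(c) for c in examples])
```

Because the group-level column (for example `target_race`) comes after its sub-category columns in the dataframe, each resulting list names the specific targets first and the group last, as in `[black, race]`.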

In [46]:
measuring_hate_speech_df['original_target'].value_counts()

original_target
[women, gender]                                                                                                        6335
[black, race]                                                                                                          2580
[specific_country, origin]                                                                                             2118
[gay, sexuality]                                                                                                       1970
[men, gender]                                                                                                          1685
                                                                                                                       ... 
[men, transgender_men, women, gender, gay, sexuality]                                                                     1
[latinx, middle_eastern, other, race, specific_country, origin]                                                     

In [47]:
measuring_hate_speech_df.drop(columns=target_columns, inplace=True)

In [48]:
d = dtale.show(measuring_hate_speech_df)
d.open_browser()

In [49]:
d.kill()

2025-03-14 22:14:21,544 - INFO     - Executing shutdown...
2025-03-14 22:14:21,545 - INFO     - Not running with the Werkzeug Server, exiting by searching gc for BaseWSGIServer


In [50]:
measuring_hate_speech_df['original_dataset_title'] = 'Measuring Hate Speech'

In [51]:
column_remapping = {
    'hate_speech_score':'original_label',
    'comment_id': 'original_id'
}

measuring_hate_speech_df = measuring_hate_speech_df.rename(columns=column_remapping)
measuring_hate_speech_df.head()

Unnamed: 0,original_id,text,original_label,platform,original_target,original_dataset_title
0,1,White people are trash.,0.46,reddit,"[white, race]",Measuring Hate Speech
1,2,"You've caught the big gay, it's known to be ai...",0.03,reddit,"[gay, sexuality]",Measuring Hate Speech
2,3,"I'm not saying you said that, I'm just saying ...",-1.29,reddit,"[black, race]",Measuring Hate Speech
3,4,Donald Trump. Yeet myself off a building onto ...,-0.24,reddit,"[gay, sexuality]",Measuring Hate Speech
4,5,Fabrice Fabrice is ostensibly black or black/l...,-2.84,reddit,"[black, latinx, race, gay, sexuality]",Measuring Hate Speech


From `https://huggingface.co/datasets/ucberkeley-dlab/measuring-hate-speech`: `hate_speech_score` is a continuous hate speech measure, where higher = more hateful and lower = less hateful. > 0.5 is approximately hate speech, < -1 is counter or supportive speech, and -1 to +0.5 is neutral or ambiguous.

Scores from 0.5 to 0.9 will be treated as offensive and scores above 0.9 as hate speech, based on the UN.

The following variables can be adjusted to change the thresholds for hate speech, offensive, and normal.

In [52]:
offensive_min = 0.5
hatespeech_min = 0.9

In [53]:
# label_hatespeech_binary_offensive_not_included
measuring_hate_speech_df['label_hatespeech_binary_offensive_not_included'] = np.where(
    measuring_hate_speech_df['original_label'] > hatespeech_min,
    'hatespeech',
    'not_hatespeech'
)


In [54]:

# label_hatespeech_binary_offensive_included
measuring_hate_speech_df['label_hatespeech_binary_offensive_included'] = np.where(
    measuring_hate_speech_df['original_label'] >= offensive_min,
    'hatespeech/offensive',
    'normal'
)


In [55]:

# label_normal_offensive_hatespeech
conditions = [
    measuring_hate_speech_df['original_label'] > hatespeech_min,
    (measuring_hate_speech_df['original_label'] >= offensive_min) & (measuring_hate_speech_df['original_label'] <= hatespeech_min)
]
choices = ['hatespeech', 'offensive']
measuring_hate_speech_df['label_normal_offensive_hatespeech'] = np.select(
    conditions,
    choices,
    default='normal'
)
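To see how the two thresholds behave at their boundaries, here is a minimal sketch on a handful of made-up scores (the values in `scores` are illustrative, not taken from the dataset). It relies on `np.select` picking the first matching condition, so the second condition needs no explicit upper bound:

```python
import numpy as np
import pandas as pd

offensive_min = 0.5
hatespeech_min = 0.9

# Hypothetical scores chosen to sit on and around both thresholds
scores = pd.Series([-1.29, 0.46, 0.5, 0.9, 0.91])

labels = np.select(
    [scores > hatespeech_min, scores >= offensive_min],  # first match wins
    ['hatespeech', 'offensive'],
    default='normal',
)
print(list(zip(scores, labels)))
```

A score of exactly 0.5 or exactly 0.9 lands in `offensive`, matching the `>=` and `>` comparisons used in the cells above.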

In [56]:
d = dtale.show(measuring_hate_speech_df)
d.open_browser()

In [57]:
d.kill()

2025-03-14 22:14:22,266 - INFO     - Executing shutdown...
2025-03-14 22:14:22,267 - INFO     - Not running with the Werkzeug Server, exiting by searching gc for BaseWSGIServer


In [58]:
measuring_hate_speech_df['label_normal_offensive_hatespeech'].value_counts()

label_normal_offensive_hatespeech
normal        29095
hatespeech     7466
offensive      2934
Name: count, dtype: int64

# Saving Prepped Dataset DataFrames

Further work is needed once the datasets are combined, so the prepared DataFrames are saved here and the combination continues in a new notebook.

In [59]:
save_dataframe_as_pickle(mlma_df, INTERIM_DATA_DIR.as_posix() + '/prep_mlma_df.pkl')
save_dataframe_as_pickle(hatexplain_df, INTERIM_DATA_DIR.as_posix() + '/prep_hatexplain_df.pkl')
save_dataframe_as_pickle(measuring_hate_speech_df, INTERIM_DATA_DIR.as_posix() + '/prep_measuring_hate_speech_df.pkl')

DataFrame saved to /home/takosaga/Projects/master_thesis/data/interim/prep_mlma_df.pkl
DataFrame saved to /home/takosaga/Projects/master_thesis/data/interim/prep_hatexplain_df.pkl
DataFrame saved to /home/takosaga/Projects/master_thesis/data/interim/prep_measuring_hate_speech_df.pkl
