# HateXplain Dataset Processing Documentation

## Overview
The dataset was prepared for finetuning target model by performing tokenization, label aggregation, and mapping target groups to broader categories.

---

## Steps

### 1. Dataset Loading
- The dataset was loaded from a JSON file, containing `post_id`, `post_tokens`, and `annotators`.
- Key fields included:
  - **Labels**: Annotator-provided labels (`normal`, `offensive`, `hatespeech`).
  - **Rationales**: Tokens identified as important by annotators.
  - **Target Groups**: Specific groups targeted in the post (e.g., `African`, `Christian`, `Women`).

---

### 2. Preprocessing
- **Label Aggregation**: A majority voting mechanism was used to determine the final label. Posts with all different labels were marked as `undecided`.
- Text was created by joining tokens.

---

### 3. Group Mapping
Target groups (e.g., `African`, `Christian`) were mapped to broader categories:
- **Race**: African, Arabs, Asians, Caucasian, Hispanic
- **Religion**: Buddhism, Christian, Hindu, Islam, Jewish
- **Gender**: Men, Women
- **Sexual Orientation**: Heterosexual, Gay
- **Miscellaneous**: Indigenous, Refugee/Immigrant, None, Others

---

### 4. Dataset Splitting
- The dataset was divided into `train`, `val`, and `test` sets based on predefined post IDs.

---

### 5. Sampling and Filtering
- Filtered the dataset to include specific labels (e.g., `label == 0` or `label == 1`).
- Sampled posts based on label requirements for balanced representation.

---

## Final Output
The final dataset includes:
- **post_id**: Unique identifier for each post.
- **text**: Tokenized text.
- **attention**: Aggregated attention masks.
- **label**: Final aggregated label.
- **categories**: Broader categories for target groups.

---


In [8]:
import json

# Load the dataset
with open('../dataset.json', 'r') as file:
    data = json.load(file)


In [9]:
import pandas as pd
pd.set_option('display.max_colwidth', None)

# Initialize an empty list to store post data
posts_data = []
for post_id, content in data.items():
    # Combine tokens to reconstruct the original post
    post_text = ' '.join(content['post_tokens'])
    
    # Aggregate annotator labels and targets
    labels = [annotator['label'] for annotator in content['annotators']]
    targets = [target for annotator in content['annotators'] for target in annotator['target']]
    
    # Determine the majority label
    majority_label = max(set(labels), key=labels.count)
    if labels.count(majority_label) == 1:
        majority_label = 'undecided'
    # Identify unique target communities
    unique_targets = list(set(targets))
    
    # Append the extracted information to the list
    posts_data.append({
        'post_id': post_id,
        'post_text': post_text,
        'majority_label': majority_label,
        'target_communities': unique_targets
    })


In [10]:
# Create a DataFrame from the list
df = pd.DataFrame(posts_data)
# Define all possible target communities
all_communities = [
    'African', 'Arab', 'Asian', 'Caucasian', 'Hispanic', 'Indian', 'Other',
    'Women', 'Homosexual', 'Men', 'Jewish', 'Islam', 'Christian', 'Buddhist',
    'Refugee', 'Hindu', 'Disability', 'Economic', 'Heterosexual', 'Gay', 'Indigenous', 'Refugee/Immigrant', 'None', 'Others'
]

# Initialize columns for each community with default value 0
for community in all_communities:
    df[community] = 0

# Update columns based on the presence of each community in the target_communities list
for index, row in df.iterrows():
    for community in row['target_communities']:
        if community in all_communities:
            df.at[index, community] += 1

# Define a mapping for labels
label_mapping = {'normal': 0, 'offensive': 2, 'hatespeech': 1, 'undecided': 3}

# Apply the mapping to the 'majority_label' column
df['label'] = df['majority_label'].map(label_mapping)

# Select relevant columns
final_df = df[['post_id', 'post_text', 'label'] + all_communities]

In [11]:
import pandas as pd

def map_to_categories(df, feature_columns, category_mapping):
    """
    Transform DataFrame so that active features are mapped to corresponding categories.

    Parameters:
        df (pd.DataFrame): Input DataFrame with feature columns representing target groups.
        feature_columns (list): List of columns indicating active features.
        category_mapping (dict): Dictionary mapping target groups to their categories.

    Returns:
        pd.DataFrame: Transformed DataFrame with categories and their active status.
    """
    # Create a new column for each category with default value 0
    for category in category_mapping.keys():
        df[category] = 0

    # Iterate through rows and map active features to categories
    for index, row in df.iterrows():
        for feature in feature_columns:
            if row[feature] == 1:  # Active feature
                for category, groups in category_mapping.items():
                    if feature in groups:
                        df.at[index, category] = 1

    return df

all_communities = [
    'African', 'Arab', 'Asian', 'Caucasian', 'Hispanic', 'Indian', 'Other',
    'Women', 'Homosexual', 'Men', 'Jewish', 'Islam', 'Christian', 'Buddhist',
    'Refugee', 'Hindu', 'Disability', 'Economic', 'Heterosexual', 'Gay', 'Indigenous', 'Refugee/Immigrant', 'None', 'Others'
]
category_mapping = {
    'Race': ['African', 'Arabs', 'Asians', 'Caucasian', 'Hispanic'],
    'Religion': ['Buddhism', 'Christian', 'Hindu', 'Islam', 'Jewish'],
    'Gender': ['Men', 'Women'],
    'Sexual Orientation': ['Heterosexual', 'Gay'],
    'Miscellaneous': ['Indigenous', 'Refugee/Immigrant', 'None', 'Others']
}

final_df = map_to_categories(final_df, all_communities, category_mapping)
final_df.drop(columns=all_communities, inplace=True)


A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  df[category] = 0


In [12]:
final_df = final_df[(final_df.label == 0) | (final_df.label == 1)]
final_df[final_df.label==0] = final_df[final_df.label==0].sample(n=5935)
final_df.to_csv('../hatexplain_processed.csv', index=False)
final_df.head()


Unnamed: 0,post_id,post_text,label,Race,Religion,Gender,Sexual Orientation,Miscellaneous
0,1179055004553900032_twitter,i dont think im getting my baby them white 9 he has two white j and nikes not even touched,0.0,0.0,0.0,0.0,0.0,1.0
1,1179063826874032128_twitter,we cannot continue calling ourselves feminists if the rights of all womxn arent addressed yes to a sexual offences public list but will a trans lesbian bisexual and queer womxn be able to enter their information on the reporting sheet gender forum,0.0,0.0,0.0,0.0,0.0,1.0
2,1178793830532956161_twitter,nawt yall niggers ignoring me,0.0,1.0,0.0,0.0,0.0,1.0
3,1179088797964763136_twitter,<user> i am bit confused coz chinese ppl can not access twitter thn how this ching chong using it i think he pakistani 🤔 🤔 🤔,1.0,0.0,0.0,0.0,0.0,0.0
4,1179085312976445440_twitter,this bitch in whataburger eating a burger with no top bun and holding it palm up i hate white bitches,1.0,1.0,0.0,1.0,0.0,0.0


In [13]:
import ast
import os
# os.mkdir('../data/', exist_ok=True)

with open('../post_id_divisions.json') as f:
    split = f.read()
split = ast.literal_eval(split)
for k, v in split.items():
    if k == 'train':
        train_df = final_df[final_df.post_id.isin(split[k])]
    elif k == 'val':
        val_df = final_df[final_df.post_id.isin(split[k])]
    else:
        test_df = final_df[final_df.post_id.isin(split[k])] 

train_df.to_csv('../data/hatexplain_train.csv', index=False)
val_df.to_csv('../data/hatexplain_val.csv', index=False)
test_df.to_csv('../data/hatexplain_test.csv', index=False)

In [14]:
import pandas as pd
df = pd.read_csv("../data/hatexplain_train.csv")
df.columns

Index(['post_id', 'post_text', 'label', 'Race', 'Religion', 'Gender',
       'Sexual Orientation', 'Miscellaneous'],
      dtype='object')

In [18]:
df['Gender'].value_counts()

Gender
0.0    7857
1.0    1648
Name: count, dtype: int64

In [1]:
import pandas as pd
df = pd.read_csv("../data/hatexplain_train.csv")
df.columns

Index(['post_id', 'post_text', 'label', 'Race', 'Religion', 'Gender',
       'Sexual Orientation', 'Miscellaneous'],
      dtype='object')

In [3]:
df['Sexual Orientation'].value_counts()

Sexual Orientation
0.0    9445
1.0      60
Name: count, dtype: int64