The Bangla Multi-task Hate Speech Identification shared task is designed to address the complex and nuanced problem of detecting and understanding hate speech in Bangla across multiple related subtasks such as type of hate, severity, and target group. Find the Task Description below.
Table of contents:
- Contents of the Directory
- File Structure
- Task Description
- Dataset
- Baseline Script and Official Evaluation Metrics
- Baselines
- Running Transformer Models
- Organizers
- Main folder: data
  This directory contains the data files for the task.
- Main folder: scripts
  Contains scripts provided to run transformer-based models for subtask 1A and subtask 1B.
- Main folder: output-subtask-1a
  Contains the output files generated from the runs for subtask 1A.
- Main folder: output-subtask-1b
  Contains the output files generated from the runs for subtask 1B.
- baseline.ipynb
  Driver code to run the baseline scripts and generate baseline results.
- blp-subtask-1a.ipynb
  Driver code to run the scripts for transformer models for subtask 1A.
- blp-subtask-1b.ipynb
  Driver code to run the scripts for transformer models for subtask 1B.
- README.md
  This file!
Hate_Speech_Classification/
│
├── baseline.ipynb
├── blp-subtask-1a.ipynb
├── blp-subtask-1b.ipynb
│
├── data/
│   ├── sub-task-1a/
│   │   ├── dev.tsv
│   │   ├── test.tsv
│   │   ├── train.tsv
│   │   ├── original_data/
│   │   │   ├── dev.tsv
│   │   │   ├── test.tsv
│   │   │   └── train.tsv
│   │   └── tokenized/
│   │       ├── dev.csv
│   │       ├── test.csv
│   │       └── train.csv
│   ├── sub-task-1b/
│   └── sub-task-1c/
│
├── output-subtask-1a/
│   ├── output_banglabert/
│   ├── output_bert-base-multilingual-cased/
│   ├── output_distilbert-base-cased/
│   ├── output_distilbert-base-uncased/
│   └── output_xlm-roberta-base/
│
├── output-subtask-1b/
│   ├── output_banglabert/
│   ├── output_bert-base-multilingual-cased/
│   ├── output_distilbert-base-cased/
│   ├── output_distilbert-base-uncased/
│   └── xlm-roberta-base/
│
├── scripts/
│   ├── baselines/
│   │   ├── format_checker/
│   │   │   └── task.py
│   │   ├── prediction/
│   │   │   └── baseline_prediction_files
│   │   └── scorer/
│   │       └── task.py
│   │
│   ├── task.py
│   ├── run_glue_v1.py
│   └── run_glue_v2.py
│
├── README.md
└── requirements.txt
This shared task is designed to identify the type of hate, its severity, and the targeted group from social media content. The goal is to develop robust systems that advance research in this area. This shared task has three subtasks:
- Subtask 1A: Given a Bangla text collected from YouTube comments, classify it as Abusive, Sexism, Religious Hate, Political Hate, Profane, or None.
- Subtask 1B: Given a Bangla text collected from YouTube comments, classify whether the hate is directed towards Individuals, Organizations, Communities, or Society.
- Subtask 1C: This subtask is a multi-task setup. Given a Bangla text collected from YouTube comments, categorize it by type of hate, severity, and targeted group.
We only focus on subtask 1A and 1B for this project.
For a brief overview of the dataset, kindly refer to the README.md file located in the data directory.
Each file for subtask 1A uses the TSV format. A row within the TSV adheres to the following structure:
id text label
Where:
- id: an index or id of the text
- text: the Bangla text
- label: Abusive, Sexism, Religious Hate, Political Hate, Profane, or None.
490273 আওয়ামী লীগের সন্ত্রাসী কবে দরবেন এই সাহস আপনাদের নাই Political Hate
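As a minimal sketch of loading these rows with only the standard library (assuming no header row; adjust if the files include one), using an invented placeholder comment rather than real data:

```python
import csv
import io

def read_rows(fileobj):
    """Parse id/text/label TSV rows into a list of dicts."""
    reader = csv.reader(fileobj, delimiter="\t")
    return [{"id": r[0], "text": r[1], "label": r[2]} for r in reader]

# Placeholder row in the same shape as the dataset (not real data).
sample = "490273\texample comment\tPolitical Hate\n"
rows = read_rows(io.StringIO(sample))
print(rows[0]["label"])  # Political Hate
```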
Each file for subtask 1B uses the TSV format. A row within the TSV adheres to the following structure:
id text label
Where:
- id: an index or id of the text
- text: the Bangla text
- label: Individuals, Organizations, Communities, or Society.
490273 আওয়ামী লীগের সন্ত্রাসী কবে দরবেন এই সাহস আপনাদের নাই Organization
Each file for subtask 1C uses the TSV format. A row within the TSV adheres to the following structure:
id text hate_type hate_severity to_whom
Where:
- id: an index or id of the text
- text: the Bangla text
- hate_type: Abusive, Sexism, Religious Hate, Political Hate, Profane, or None.
- hate_severity: Little to None, Mild, or Severe.
- to_whom: Individuals, Organizations, Communities, or Society.
490273 আওয়ামী লীগের সন্ত্রাসী কবে দরবেন এই সাহস আপনাদের নাই "Political Hate" "Little to None" Organization
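Note that multi-word labels in the 1C rows appear in double quotes, which Python's `csv` reader strips by default. A small sketch with a placeholder comment (not real data):

```python
import csv
import io

# Placeholder 1C-style row: id, text, hate_type, hate_severity, to_whom.
line = '490273\texample comment\t"Political Hate"\t"Little to None"\tOrganization\n'
row = next(csv.reader(io.StringIO(line), delimiter="\t"))
print(row[2], "|", row[3])  # Political Hate | Little to None
```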
The scorer for the task is located in the scripts/baselines module of the project. The scorer reports the official evaluation metric and other metrics for a prediction file.
You can install all prerequisites through,
pip install -r requirements.txt
Launch the scorer for the task as follows:
python scripts/baselines/task.py \
--train-file-path=<train_file> \
--dev-file-path=<dev_file> \
--subtask=<1A, 1B, or 1C>
# For subtask 1A
!python scripts/baselines/task.py \
--train-file-path data/sub-task-1a/train.tsv \
--dev-file-path data/sub-task-1a/dev.tsv \
--subtask 1A
Alternatively, running baseline.ipynb will produce the baseline results.
The notebooks (blp-subtask-1a.ipynb and blp-subtask-1b.ipynb) provide details for running scripts/run_glue_v2.py. A sample command for running the script is provided below:
!python scripts/run_glue_v2.py \
--model_name_or_path distilbert-base-cased \
--train_file ./data/sub-task-1a/tokenized/train.csv \
--validation_file ./data/sub-task-1a/tokenized/dev.csv \
--test_file ./data/sub-task-1a/tokenized/test.csv \
--do_train \
--do_eval \
--do_predict \
--max_seq_length 128 \
--per_device_train_batch_size 16 \
--learning_rate 2e-5 \
--num_train_epochs 2 \
--output_dir ./output-subtask-1a/output_distilbert-base-cased/ \
--overwrite_output_dir
The above example shows the parameters for running DistilBERT. Besides DistilBERT, we also ran mBERT (bert-base-multilingual-cased), XLM-RoBERTa-base, and BanglaBERT for both subtasks. To run another model, change the model name and use an appropriate output directory so that output files are written safely.
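Rather than editing the command by hand for each run, the per-model commands and output directories can be generated programmatically. A sketch under assumptions: the BanglaBERT checkpoint id shown here is a guess, and the helper `build_command` is invented for illustration:

```python
# Model checkpoints used in this project (BanglaBERT id is an assumption).
models = [
    "distilbert-base-cased",
    "distilbert-base-uncased",
    "bert-base-multilingual-cased",
    "xlm-roberta-base",
    "csebuetnlp/banglabert",  # assumed Hugging Face id for BanglaBERT
]

def build_command(model, subtask_dir="sub-task-1a", out_root="output-subtask-1a"):
    """Build one run_glue_v2.py command string for a given model checkpoint."""
    out_dir = f"./{out_root}/output_{model.split('/')[-1]}/"
    return (
        f"python scripts/run_glue_v2.py "
        f"--model_name_or_path {model} "
        f"--train_file ./data/{subtask_dir}/tokenized/train.csv "
        f"--validation_file ./data/{subtask_dir}/tokenized/dev.csv "
        f"--test_file ./data/{subtask_dir}/tokenized/test.csv "
        f"--do_train --do_eval --do_predict "
        f"--max_seq_length 128 --per_device_train_batch_size 16 "
        f"--learning_rate 2e-5 --num_train_epochs 2 "
        f"--output_dir {out_dir} --overwrite_output_dir"
    )

commands = [build_command(m) for m in models]
```

Each string in `commands` can then be executed from a notebook cell or shell.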
The official evaluation metric for subtasks 1A and 1B is micro-F1.
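For single-label multi-class classification, micro-F1 reduces to plain accuracy, since every false positive for one class is simultaneously a false negative for another. A pure-Python sketch of the metric (illustrative labels, not real predictions):

```python
def micro_f1(y_true, y_pred):
    """Micro-averaged F1 over all classes (single-label, multi-class)."""
    labels = set(y_true) | set(y_pred)
    tp = fp = fn = 0
    for label in labels:
        for t, p in zip(y_true, y_pred):
            if p == label and t == label:
                tp += 1
            elif p == label:
                fp += 1
            elif t == label:
                fn += 1
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

score = micro_f1(["Abusive", "None", "Profane", "None"],
                 ["Abusive", "None", "None", "Sexism"])
print(score)  # 0.5 (same as accuracy: 2 of 4 correct)
```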
The baselines module currently contains majority, random, and simple n-gram Support Vector Machine (SVM) baselines. For this project, we downsampled the data by one-third to reduce computational cost and time. Using the downsampled data, we reproduced the baseline scores. Besides these baselines, we also incorporated Logistic Regression, Random Forest, and Decision Tree as baseline methods.
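As an illustration of the simplest of these, a majority baseline just predicts the most frequent training label for every instance. A standard-library sketch (invented toy labels, not the actual dataset distribution):

```python
from collections import Counter

def majority_baseline(train_labels, n_test):
    """Predict the most common training label for every test instance."""
    majority_label, _ = Counter(train_labels).most_common(1)[0]
    return [majority_label] * n_test

preds = majority_baseline(["None", "None", "Abusive", "Political Hate"], 3)
print(preds)  # ['None', 'None', 'None']
```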
Baseline Results for subtask 1A on the Test set (Evaluation Phase)
| Model | micro-F1 |
|---|---|
| Random Baseline | 0.1609 |
| Majority Baseline | 0.5703 |
| n-gram (SVM) Baseline | 0.6079 |
| Logistic Regression | 0.6041 |
| Random Forest | 0.5779 |
| Decision Tree | 0.4812 |
Baseline Results for subtask 1A on the Dev-Test set
| Model | micro-F1 |
|---|---|
| Random Baseline | 0.1398 |
| Majority Baseline | 0.5639 |
| n-gram (SVM) Baseline | 0.5974 |
| Logistic Regression | 0.5926 |
| Random Forest | 0.5878 |
| Decision Tree | 0.5161 |
Baseline Results for subtask 1B on the Test set (Evaluation Phase)
| Model | micro-F1 |
|---|---|
| Random Baseline | 0.2082 |
| Majority Baseline | 0.6038 |
| n-gram (SVM) Baseline | 0.6250 |
| Logistic Regression | 0.6215 |
| Random Forest | 0.6003 |
| Decision Tree | 0.4782 |
Baseline Results for subtask 1B on the Dev-Test set
| Model | micro-F1 |
|---|---|
| Random Baseline | 0.2222 |
| Majority Baseline | 0.5747 |
| n-gram (SVM) Baseline | 0.6057 |
| Logistic Regression | 0.6129 |
| Random Forest | 0.5926 |
| Decision Tree | 0.4970 |
- Krishno Dey, PhD Student, University of New Brunswick
- Yalda Keivan Jafari, MCS Student, University of New Brunswick
This project was conducted as the final project for CS6765: Natural Language Processing at the University of New Brunswick.
- Md Arid Hasan, PhD Student, The University of Toronto
- Firoj Alam, Senior Scientist, Qatar Computing Research Institute
- Md Fahad Hossain, Lecturer, Daffodil International University
- Usman Naseem, Assistant Professor, Macquarie University
- Syed Ishtiaque Ahmed, Associate Professor, The University of Toronto