Interpretable Multi Labeled Bengali Toxic Comments Classification using Deep Learning

This paper presents a deep learning pipeline for categorizing toxic Bengali comments. A binary classifier first determines whether a comment is toxic, and a multi-label classifier then determines which toxicity types apply. For this purpose, we prepared a manually labeled dataset of 16,073 instances, of which 8,488 are toxic. A toxic comment may belong to one or more of six categories at once: vulgar, hate, religious, threat, troll, and insult. For the binary task, Long Short Term Memory (LSTM) with BERT embeddings achieved 89.42% accuracy. For the multi-label task, a Convolutional Neural Network combined with Bi-directional Long Short Term Memory (CNN-BiLSTM) and an attention mechanism achieved 78.92% accuracy and a weighted F1-score of 0.86. To explain the predictions and interpret word-level feature importance in the proposed models, we used the Local Interpretable Model-Agnostic Explanations (LIME) framework.

The paper was published at the International Conference on Electrical, Computer and Communication Engineering (ECCE) in 2023.
Paper Link: https://ieeexplore.ieee.org/document/10101588
arXiv PDF: https://arxiv.org/abs/2304.04087

Repository Structure

The repository has three folders:

Codes: All the code for the proposed models.
Dataset: Contains two files: (a) a CSV file and (b) a zip file.
- The CSV file contains all 16,073 instances. It has seven columns: text, vulgar, hate, religious, threat, troll, Insult. The text column contains the Bangla comments, and each of the remaining six columns contains either 1 or 0, where 1 indicates that the comment belongs to that toxic category. If all six categories are 0, the comment is considered non-toxic.
- The zip file contains the train, test, and validation splits used for the experimental results reported in the paper.
Figs: Contains the images used in the README file.
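
The column layout described above can be read with a few lines of Python. This is a minimal sketch, not code from the Codes folder: the sample rows are made-up English placeholders standing in for the Bangla comments, and the helper names `read_rows` and `is_toxic` are hypothetical. Only the seven column names come from this README.

```python
import csv
import io

# Label columns as listed in this README (note the capitalised "Insult").
LABEL_COLUMNS = ["vulgar", "hate", "religious", "threat", "troll", "Insult"]

# Made-up stand-in rows for illustration only; the real file holds Bangla comments.
SAMPLE_CSV = """text,vulgar,hate,religious,threat,troll,Insult
placeholder comment A,0,0,0,0,0,0
placeholder comment B,1,0,0,0,0,1
placeholder comment C,0,0,0,1,0,0
"""


def read_rows(handle):
    """Parse rows and convert the six label columns to integers."""
    rows = []
    for row in csv.DictReader(handle):
        labels = {name: int(row[name]) for name in LABEL_COLUMNS}
        rows.append({"text": row["text"], "labels": labels})
    return rows


def is_toxic(labels):
    """A comment is toxic when at least one of the six labels is 1."""
    return int(any(labels[name] == 1 for name in LABEL_COLUMNS))


rows = read_rows(io.StringIO(SAMPLE_CSV))
binary_targets = [is_toxic(r["labels"]) for r in rows]
print(binary_targets)
```

To run this against the real data, replace the `io.StringIO(SAMPLE_CSV)` handle with an open file handle on the CSV in the Dataset folder, opened with `encoding="utf-8"` and `newline=""`.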

Dataset Statistics

The text samples were gathered from three sources: the multi-labeled "Bangla-Abusive-Comment-Dataset" (https://github.com/aimansnigdha/Bangla-Abusive-Comment-Dataset), the multi-class "Bengali Hate Speech Dataset" (https://github.com/rezacsedu/Bengali-Hate-Speech-Dataset), and the "Bangla Online Comments Dataset" (https://data.mendeley.com/datasets/9xjx8twk8p/1). On careful examination, the original labels in these datasets turned out to be inaccurate or inconsistent, and some texts belonged to multiple categories at once. We therefore manually relabeled the texts into six classes (vulgar, hate, religious, threat, troll, and insult), allowing each text to carry multiple labels. Relabeling texts from multiple datasets under a single scheme ensures consistency, improves data quality, gives a clearer picture of the nature and extent of the toxicity in the data, supports better machine learning model performance, and simplifies data management.

Total number of instances: 16,073
- Toxic: 8,488
- Non-Toxic: 7,585
Class-wise data statistics
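
As a rough illustration of how such statistics can be computed, the sketch below counts comments per class and checks the totals stated above. The label rows are made up and the function name `class_counts` is hypothetical; no per-class figures from the dataset are reproduced here. Because a comment can carry several labels, the per-class counts can add up to more than the number of toxic comments.

```python
from collections import Counter

LABEL_COLUMNS = ["vulgar", "hate", "religious", "threat", "troll", "Insult"]


def class_counts(label_rows):
    """Count how many comments carry each label; one comment may count in several classes."""
    counts = Counter({name: 0 for name in LABEL_COLUMNS})
    for labels in label_rows:
        for name in LABEL_COLUMNS:
            counts[name] += labels[name]
    return counts


# Made-up label rows, not taken from the dataset.
toy_rows = [
    {"vulgar": 1, "hate": 0, "religious": 0, "threat": 0, "troll": 0, "Insult": 1},
    {"vulgar": 0, "hate": 0, "religious": 0, "threat": 1, "troll": 0, "Insult": 0},
    {"vulgar": 0, "hate": 0, "religious": 0, "threat": 0, "troll": 0, "Insult": 0},
]
counts = class_counts(toy_rows)
toxic_rows = sum(1 for labels in toy_rows if any(labels.values()))

# Totals reported in this README.
TOTAL, TOXIC, NON_TOXIC = 16073, 8488, 7585
toxic_share = TOXIC / TOTAL
print(dict(counts), toxic_rows, round(100 * toxic_share, 2))
```

With the reported totals, toxic comments make up roughly 52.8% of the dataset, so the binary task is close to balanced.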

Methodology
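
The two-stage flow described in the abstract (binary classifier first, multi-label classifier only for toxic comments) can be sketched as follows. This is an illustrative outline, not the authors' implementation: the models are replaced by dummy callables, and the function name `classify_comment` and the 0.5 decision threshold are assumptions that do not come from the paper or the Codes folder.

```python
from typing import Callable, Dict, List

TOXIC_TYPES = ["vulgar", "hate", "religious", "threat", "troll", "insult"]


def classify_comment(
    text: str,
    binary_model: Callable[[str], float],
    multilabel_model: Callable[[str], List[float]],
    threshold: float = 0.5,  # assumed cut-off, not taken from the paper
) -> Dict[str, object]:
    """Stage 1 decides toxic vs. not toxic; stage 2 runs only for toxic comments."""
    p_toxic = binary_model(text)
    if p_toxic < threshold:
        return {"toxic": False, "types": []}
    scores = multilabel_model(text)
    types = [name for name, s in zip(TOXIC_TYPES, scores) if s >= threshold]
    return {"toxic": True, "types": types}


# Dummy stand-ins so the sketch runs without trained weights.
def fake_binary(text: str) -> float:
    return 0.9 if "bad" in text else 0.1


def fake_multilabel(text: str) -> List[float]:
    return [0.8, 0.1, 0.1, 0.1, 0.2, 0.7]


toxic_result = classify_comment("a bad example", fake_binary, fake_multilabel)
clean_result = classify_comment("a fine example", fake_binary, fake_multilabel)
print(toxic_result)
print(clean_result)
```

In practice the two callables would wrap the trained LSTM with BERT embeddings and the CNN-BiLSTM with attention, each returning probabilities for a single comment.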

Performance Comparison

Binary classification

Multi-label classification

Class-wise performance for CNN-BiLSTM with Attention model
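
For readers unfamiliar with the metrics, the sketch below shows one common way to compute per-class F1, support-weighted F1, and exact-match accuracy for multi-label predictions. It runs on made-up toy labels, not on the paper's outputs. This README does not state how the reported accuracy was defined, so treating it as exact-match accuracy is an assumption.

```python
def per_label_f1(y_true, y_pred, j):
    """F1 for label column j, computed from true/false positives and false negatives."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t[j] == 1 and p[j] == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t[j] == 0 and p[j] == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t[j] == 1 and p[j] == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def weighted_f1(y_true, y_pred):
    """Average the per-label F1 scores, weighting each label by its number of true instances."""
    n_labels = len(y_true[0])
    supports = [sum(t[j] for t in y_true) for j in range(n_labels)]
    total = sum(supports)
    if total == 0:
        return 0.0
    return sum(supports[j] * per_label_f1(y_true, y_pred, j) for j in range(n_labels)) / total


def exact_match_accuracy(y_true, y_pred):
    """Share of comments whose full label vector is predicted correctly (assumed definition)."""
    return sum(1 for t, p in zip(y_true, y_pred) if list(t) == list(p)) / len(y_true)


# Toy labels in the order vulgar, hate, religious, threat, troll, insult.
y_true = [[1, 0, 0, 0, 0, 1], [0, 0, 0, 1, 0, 0], [1, 0, 0, 0, 0, 0], [0, 1, 0, 0, 0, 1]]
y_pred = [[1, 0, 0, 0, 0, 1], [0, 0, 0, 1, 0, 0], [1, 0, 0, 0, 0, 1], [0, 0, 0, 0, 0, 1]]

print(round(weighted_f1(y_true, y_pred), 4), exact_match_accuracy(y_true, y_pred))
```

Exact-match accuracy is strict, since a single wrong label makes the whole comment count as an error. That is one reason a multi-label model can show a lower accuracy than its weighted F1-score.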

LIME Explanation
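
LIME explains a prediction by perturbing the input text (removing words) and observing how the model output changes. The sketch below is not LIME itself: it is a simpler leave-one-word-out approximation written in plain Python to convey the idea of word-level importance. The classifier `toy_model` is a hypothetical stand-in, and the function name `word_importance` is an assumption.

```python
from typing import Callable, List, Tuple


def word_importance(
    text: str, predict_proba: Callable[[str], float]
) -> List[Tuple[str, float]]:
    """Score each word by how much the toxic probability drops when it is removed."""
    words = text.split()
    base = predict_proba(text)
    scores = []
    for i, word in enumerate(words):
        reduced = " ".join(words[:i] + words[i + 1:])
        scores.append((word, base - predict_proba(reduced)))
    # Largest absolute change first.
    return sorted(scores, key=lambda pair: abs(pair[1]), reverse=True)


# Hypothetical stand-in classifier: returns a toxic probability for a text.
def toy_model(text: str) -> float:
    return 0.9 if "rude" in text.split() else 0.2


ranking = word_importance("that was rude today", toy_model)
print(ranking[0][0])
```

Unlike this sketch, LIME samples many random perturbations and fits a weighted linear surrogate model around the instance, which gives more stable importance weights when words interact.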

Citation

@inproceedings{belal2023interpretable,
  title={Interpretable multi labeled bengali toxic comments classification using deep learning},
  author={Belal, Tanveer Ahmed and Shahariar, GM and Kabir, Md Hasanul},
  booktitle={2023 International Conference on Electrical, Computer and Communication Engineering (ECCE)},
  pages={1--6},
  year={2023},
  organization={IEEE}
}
