Multilingual Abusive Comment Detection

The dataset for this project is taken from Kaggle: https://www.kaggle.com/competitions/multilingualabusivecomment

This dataset includes:

Massive and multilingual: 10+ low-resource Indic languages
Human-annotated
Metadata for each comment (e.g. #likes, #reports)

Dataset Stats

Training dataset:

No. of Non-abusive Comments: 1148019
No. of Abusive Comments: 352879
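
The counts above imply a noticeable class imbalance. The short sketch below only does the arithmetic on those two numbers. Using the resulting ratio as a positive-class weight in the loss is one common option and an assumption here, not something this repository is stated to do.

```python
# Counts copied from the training dataset stats above.
n_non_abusive = 1_148_019
n_abusive = 352_879

total = n_non_abusive + n_abusive
abusive_share = n_abusive / total

# Ratio of negatives to positives; could serve as a positive-class
# weight in a weighted loss (an assumption, not the repo's documented setup).
pos_weight = n_non_abusive / n_abusive

print(f"total comments: {total}")                   # total comments: 1500898
print(f"abusive share: {abusive_share:.1%}")        # abusive share: 23.5%
print(f"negatives per positive: {pos_weight:.2f}")  # negatives per positive: 3.25
```

With roughly one abusive comment for every three non-abusive ones, plain accuracy can look good for a model that rarely predicts the abusive class, so a metric such as F1 is usually more informative.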

Directory Structure

.
├── config.py                   # hyperparameters and file path locations
├── dataset.py                  # dataset class used by the dataloader
├── engine.py                   # training and evaluation epochs
├── model.py                    # model class
├── train.py                    # script to perform model training
├── eval.py                     # script to perform model evaluation
├── data_process.py             # data preprocessing: removes noise and extracts the relevant text
└── utils.py                    # helper functions
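
As a rough illustration of the kind of noise removal data_process.py is described as doing, here is a minimal sketch. The function name clean_comment and the specific rules (dropping URLs and @mentions, collapsing whitespace) are assumptions for illustration, not the repository's actual preprocessing.

```python
import re
import unicodedata

# Hypothetical cleaning rules; the real data_process.py may differ.
URL_RE = re.compile(r"https?://\S+|www\.\S+")
MENTION_RE = re.compile(r"@\w+")
SPACE_RE = re.compile(r"\s+")


def clean_comment(text: str) -> str:
    """Remove URLs and @mentions, then collapse whitespace."""
    # NFC keeps Indic scripts intact while unifying equivalent encodings.
    text = unicodedata.normalize("NFC", text)
    text = URL_RE.sub(" ", text)
    text = MENTION_RE.sub(" ", text)
    return SPACE_RE.sub(" ", text).strip()


cleaned = clean_comment("  @user  यह   देखो https://example.com/x  ")
print(cleaned)
```

The rules deliberately leave non-Latin characters untouched, since the comments are in Indic languages and aggressive ASCII-only filtering would erase the text itself.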

Example run

For training

python train.py

The best model is saved to ./models/best_model.pth. To change the hyperparameters, edit config.py.
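
For orientation, a config module of this kind might look roughly like the sketch below. Only the two output paths come from this README. The class name TrainConfig and the hyperparameter names and values (batch_size, learning_rate, epochs) are hypothetical placeholders, not the contents of the real config.py.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class TrainConfig:
    # Paths mentioned in this README.
    model_path: str = "./models/best_model.pth"
    submission_path: str = "submission.csv"
    # Hypothetical hyperparameters; names and values are placeholders.
    batch_size: int = 32
    learning_rate: float = 2e-5
    epochs: int = 3


config = TrainConfig()

# Derive a variant for an experiment without mutating the defaults.
small_run = replace(config, batch_size=8, epochs=1)
print(small_run)
```

Keeping every tunable value in one place means train.py and eval.py can read the same settings and stay in sync.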

For evaluation

python eval.py

Final predictions are stored in submission.csv.
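
A minimal sketch of how predicted probabilities could be turned into a submission file is shown below. The column names id and label, the 0.5 threshold, and the helper write_submission are all assumptions for illustration. Check the competition page for the required format.

```python
import csv
import io
from typing import Iterable, TextIO


def write_submission(
    ids: Iterable[str],
    probs: Iterable[float],
    out: TextIO,
    threshold: float = 0.5,  # hypothetical decision threshold
) -> None:
    """Write one row per comment: its id and a 0/1 abusive label."""
    writer = csv.writer(out)
    writer.writerow(["id", "label"])  # hypothetical column names
    for comment_id, prob in zip(ids, probs):
        writer.writerow([comment_id, int(prob >= threshold)])


# Demo with an in-memory buffer instead of a real file.
buffer = io.StringIO()
write_submission(["c1", "c2"], [0.91, 0.12], buffer)
rows = buffer.getvalue().splitlines()
print(rows)
```

When writing to a real file, open it with open("submission.csv", "w", newline="") so the csv module controls line endings.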

ashwani-bhat/Multilingual-Abusive-Classification