This is the code for the paper "Offensive Language Identification in Low-resourced Code-mixed Dravidian languages using Pseudo-labeling".
This is a collaborative work by Adeep Hande, Karthik Puranik, Konthala Yasaswini, Ruba Priyadharshini, Sajeetha Thavareesan, Anbukkarasi Sampath, Kogilavani Shanmugavadivel, Durairaj Thenmozhi, and Bharathi Raja Chakravarthi.
This approach could be applied to any multilingual dataset. The weights of the fine-tuned models are available on my Hugging Face account, AdWeeb.
We have provided the notebooks for reference.
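For readers new to pseudo-labeling, here is a minimal, generic sketch of the idea: a trained classifier labels unlabeled text, and only confident predictions are kept as extra training data. This is not necessarily the exact procedure used in the paper or the notebooks. The function names, the confidence threshold, and the keyword-based `toy_predict_proba` classifier are all hypothetical, used here only for illustration.

```python
from typing import Callable, Dict, List, Tuple


def pseudo_label(
    texts: List[str],
    predict_proba: Callable[[str], Dict[str, float]],
    threshold: float = 0.9,
) -> List[Tuple[str, str]]:
    """Keep only texts whose top predicted label meets the confidence threshold."""
    selected = []
    for text in texts:
        probs = predict_proba(text)
        # Pick the label with the highest predicted probability.
        label, confidence = max(probs.items(), key=lambda item: item[1])
        if confidence >= threshold:
            selected.append((text, label))
    return selected


def toy_predict_proba(text: str) -> Dict[str, float]:
    """Hypothetical stand-in for a fine-tuned classifier (not the paper's model)."""
    lowered = text.lower()
    if "idiot" in lowered:
        return {"offensive": 0.95, "not_offensive": 0.05}
    if "nice" in lowered:
        return {"offensive": 0.03, "not_offensive": 0.97}
    # Ambiguous input: low confidence, so it will be filtered out.
    return {"offensive": 0.55, "not_offensive": 0.45}


unlabeled = ["You are an idiot", "Nice video bro", "hmm ok"]
pseudo_labeled = pseudo_label(unlabeled, toy_predict_proba, threshold=0.9)
print(pseudo_labeled)
```

In practice, `toy_predict_proba` would be replaced by a real model's prediction function, and the selected pairs would be merged with the labeled data before retraining.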
If you use our dataset and/or find our code useful, please cite our paper:
@article{hande-etal-offensive,
title = "Offensive Language Identification in Low-resourced Code-mixed Dravidian languages using Pseudo-labeling",
author = "Hande, Adeep and
Puranik, Karthik and
Yasaswini, Konthala and
Priyadharshini, Ruba and
Thavareesan, Sajeetha and
Sampath, Anbukkarasi and
Shanmugavadivel, Kogilavani and
Thenmozhi, Durairaj and
Chakravarthi, Bharathi Raja",
journal={Information Processing and Management},
publisher={Elsevier}
}