This project is my entry to a Kaggle competition; my first submission placed in the top 11% of the public leaderboard.

About the project: Google Jigsaw annually releases a dataset of toxic and non-toxic texts scraped from various online platforms. The task is to build a multilingual model that identifies toxicity in text data with high accuracy.
My approach: I fine-tuned the multilingual XLM-RoBERTa model on this data. More about XLM-RoBERTa: https://towardsdatascience.com/xlm-roberta-the-multilingual-alternative-for-non-english-nlp-cf0b889ccbbf
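A minimal sketch of the setup, using the Hugging Face `transformers` library. The model checkpoint name, label count, and the `binarize` helper are illustrative assumptions, not the exact competition configuration:

```python
# Sketch: XLM-RoBERTa for binary toxicity classification via Hugging Face
# transformers. Hyperparameters here are assumptions for illustration.
from transformers import AutoTokenizer, AutoModelForSequenceClassification

MODEL_NAME = "xlm-roberta-base"  # assumption; a larger variant may score higher


def build_model():
    """Load the pretrained multilingual encoder with a 2-class head."""
    tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
    model = AutoModelForSequenceClassification.from_pretrained(
        MODEL_NAME, num_labels=2  # toxic vs. non-toxic
    )
    return tokenizer, model


def binarize(scores, threshold=0.5):
    """Map predicted toxicity probabilities to hard 0/1 labels.

    Hypothetical post-processing helper; the competition itself scores
    probabilities directly, so thresholding is optional.
    """
    return [1 if s >= threshold else 0 for s in scores]
```

Because XLM-RoBERTa was pretrained on text from 100 languages, a single fine-tuned model can score comments in many languages without per-language training.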