In this research project, a Naive Bayes classifier was utilized to identify the context of Mahabharata articles in Bengali and Hindi languages using Natural Language Processing (NLP). The classifier was trained on a labeled news article dataset, enabling it to predict categories such as sports, entertainment, economy, opinion, lifestyle, technology, and education. The analysis revealed that 63 out of 100 articles yielded consistent prediction results between the two languages. The study highlights the potential for using machine learning techniques to understand the contextual representation of ancient texts across different languages.
This project was done by me at Summer Research Internship(SURE) at IIT Hyderabad, 2023 Report Link: https://docs.google.com/document/d/17KzLr3WbzdNN67IoqBaJFleZjcAGwVlkqKkqbDcTWUw/edit?usp=sharing