Multilingual Sentiment Analysis and Intent Classification in Romanian

A translation approach to performing sentiment analysis and intent classification of social media texts in Romanian.

The method translates queries from the original language (Romanian) into English using Google's Cloud Translation API. A fine-tuned version of the RoBERTa model then generates sentiment predictions from the translated texts. This approach reached an accuracy of 90% on a test dataset of 100 fictional queries generated by the author. For intent classification, a DistilBERT model was trained on 6,500 synthetic and augmented queries from a dataset created by the author, covering 13 possible intents specific to a fictional cosmetics company. Applied to the test dataset, the trained model reached an accuracy of 73%.
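The translate-then-classify flow described above can be sketched in a few lines. This is a minimal illustration, not the code from the notebooks: the function names (`classify_romanian`, `accuracy`, `toy_translate`, `toy_sentiment`) and the two example sentences are assumptions made for this sketch. The toy translator and classifier are stand-ins; in practice `translate` would wrap a call to the Cloud Translation API and `classify` would wrap the fine-tuned RoBERTa or DistilBERT model.

```python
from typing import Callable, List, Sequence

# Hypothetical type aliases: any text -> text and text -> label functions fit.
Translator = Callable[[str], str]
Classifier = Callable[[str], str]


def classify_romanian(
    queries: Sequence[str], translate: Translator, classify: Classifier
) -> List[str]:
    """Translate each Romanian query to English, then label the English text."""
    return [classify(translate(query)) for query in queries]


def accuracy(predicted: Sequence[str], expected: Sequence[str]) -> float:
    """Fraction of predictions that match the expected labels."""
    if not expected or len(predicted) != len(expected):
        raise ValueError("predicted and expected must be non-empty and equal in length")
    correct = sum(p == e for p, e in zip(predicted, expected))
    return correct / len(expected)


# Toy stand-ins (assumptions for illustration only, not real models or APIs).
def toy_translate(text: str) -> str:
    lookup = {
        "Ador acest produs!": "I love this product!",
        "Produsul a ajuns stricat.": "The product arrived broken.",
    }
    # Unknown inputs pass through unchanged.
    return lookup.get(text, text)


def toy_sentiment(text: str) -> str:
    return "positive" if "love" in text.lower() else "negative"


if __name__ == "__main__":
    queries = ["Ador acest produs!", "Produsul a ajuns stricat."]
    gold = ["positive", "negative"]
    predictions = classify_romanian(queries, toy_translate, toy_sentiment)
    print(predictions, accuracy(predictions, gold))
```

Passing the translator and classifier as plain functions keeps the two stages independent, so the same pipeline shape can serve both the sentiment task and the 13-class intent task by swapping the `classify` argument.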

The methodology is explained in more detail in my paper: Budala, C., Multilingual Sentiment Analysis and Intent Classification in the Romanian language: an approach for enhanced Corporate Consumer Engagement on Social Media. 2024. Please cite if you use this method.

This research was part of my Bachelor's thesis at the Bucharest University of Economic Studies (ASE).

Repository files:

LICENSE
README.md
Romanian_SA_IC.ipynb
censored-classified-final-data.xlsx
intent-data.csv
intent_dataset_code.ipynb
test-data.xlsx
