Implementing text mining for sentiment analysis of Indonesian public opinion on Twitter using Naive Bayes and Support Vector Machine (SVM) text classification.

ndhartanto/twitter_sentiment_analysis_SVM

Text Mining Implementation for Sentiment Analysis of Indonesian Public Opinion on Trade Relations Between Indonesia and China With Naive Bayes and SVM Text Classification

Table of Contents

  1. About The Project
  2. Contact
  3. Acknowledgements

About The Project

The purpose of this research is to implement text mining for sentiment analysis of Indonesian public opinion on trade relations between Indonesia and China using Naïve Bayes and Support Vector Machine (SVM) text classification. The research proceeds in the following stages: crawling data from Twitter; cleansing the data; translating the text into English using GoSlate; performing sentiment analysis using VADER or TextBlob; pre-processing the data with NLTK or spaCy, with or without lemmatization; splitting the data into training and testing sets using the hold-out method with 70:30 and 80:20 ratios; classifying the text with Naïve Bayes and SVM; and calculating accuracy, precision, recall, and F-measure from the confusion matrix.
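The last two stages (hold-out splitting and confusion-matrix metrics) can be illustrated with a minimal, dependency-free sketch. It is not the project's actual code: the function names `hold_out_split`, `confusion_matrix`, and `weighted_metrics` are hypothetical, and the support-weighted averaging of per-class precision, recall, and F-measure is an assumption, since the averaging scheme is not stated here.

```python
import random


def hold_out_split(samples, train_ratio=0.8, seed=42):
    """Shuffle the samples and split them into (train, test) lists."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    cut = int(round(len(shuffled) * train_ratio))
    return shuffled[:cut], shuffled[cut:]


def confusion_matrix(y_true, y_pred, labels):
    """matrix[i][j] counts samples whose true label is i and predicted label is j."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for truth, pred in zip(y_true, y_pred):
        matrix[index[truth]][index[pred]] += 1
    return matrix


def weighted_metrics(matrix):
    """Accuracy plus support-weighted precision, recall, and F-measure.

    Assumption: per-class scores are averaged with weights proportional
    to the number of true samples in each class.
    """
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    precision = recall = f_measure = 0.0
    for i, row in enumerate(matrix):
        support = sum(row)                      # true samples of class i
        predicted = sum(r[i] for r in matrix)   # samples predicted as class i
        p = matrix[i][i] / predicted if predicted else 0.0
        r = matrix[i][i] / support if support else 0.0
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        weight = support / total
        precision += weight * p
        recall += weight * r
        f_measure += weight * f
    return {
        "accuracy": correct / total,
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
    }


if __name__ == "__main__":
    # Toy labels for illustration only; these are not the project's data.
    labels = ["negative", "neutral", "positive"]
    y_true = ["positive", "positive", "negative", "neutral", "negative", "positive"]
    y_pred = ["positive", "negative", "negative", "neutral", "positive", "positive"]
    print(weighted_metrics(confusion_matrix(y_true, y_pred, labels)))
```

Calling `hold_out_split(data, 0.7)` or `hold_out_split(data, 0.8)` gives the two ratios used in the study. Under support-weighted averaging, recall is always equal to accuracy, because each class's recall is weighted by exactly the number of samples it is computed over.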

The results showed that SVM text classification consistently achieved higher accuracy than Naïve Bayes. The combination with the highest accuracy was VADER, spaCy without lemmatization, and SVM with an 80:20 training:testing ratio, which yielded 76.40% accuracy, 74.55% precision, 76.40% recall, and 74.48% F-measure.

Poster

Built With

This project was built using:

Contact

ndhartanto1@gmail.com

Acknowledgements