naak-ktr/Sentiment-Analysis-Detecting-Food-Insecurity-Risk-Twitter

Sentiment Analysis in Detecting Food Insecurity Risk Among Twitter Users

Tackling Food Insecurity with Twitter Sentiment Analysis: A Machine Learning Approach

Food insecurity remains a critical global issue, exacerbated by the recent pandemic and inflation. Social media platforms like Twitter offer valuable insights into individual struggles, but identifying those at risk requires advanced analytics. This project applies sentiment analysis and machine learning to tweets to detect food insecurity risk, so that the resulting insights can help guide effective interventions.

Method

1) Data Collection & Preprocessing

I used Python's snscrape module and RapidMiner to retrieve tweets containing keywords associated with food insecurity in Malaysia. The tweet data was then cleaned and preprocessed.
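The exact cleaning steps are not listed here, so the following is only a minimal sketch of what tweet preprocessing commonly looks like. The function names `clean_tweet` and `deduplicate` and the specific rules (lowercasing, removing URLs, mentions, and punctuation) are assumptions for illustration, not the project's actual code.

```python
import re

# Patterns for typical tweet noise (assumed rules, not taken from the project).
URL_RE = re.compile(r"https?://\S+|www\.\S+")
MENTION_RE = re.compile(r"@\w+")
NON_ALPHA_RE = re.compile(r"[^a-z\s]")


def clean_tweet(text: str) -> str:
    """Lowercase a tweet and strip URLs, mentions, '#' signs, and punctuation."""
    text = text.lower()
    text = URL_RE.sub(" ", text)
    text = MENTION_RE.sub(" ", text)
    text = text.replace("#", " ")        # keep the hashtag word, drop the symbol
    text = NON_ALPHA_RE.sub(" ", text)   # drop digits, punctuation, emoji
    return " ".join(text.split())        # collapse repeated whitespace


def deduplicate(tweets):
    """Clean each tweet, then drop empty results and exact duplicates (order kept)."""
    seen = set()
    cleaned = []
    for tweet in tweets:
        c = clean_tweet(tweet)
        if c and c not in seen:
            seen.add(c)
            cleaned.append(c)
    return cleaned
```

Note that stripping every non-ASCII letter is a simplification; tweets in other scripts would need a broader character class.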

2) Lexicon-Based Sentiment Analysis

I applied three lexicon-based approaches to capture sentiment patterns within the tweets:

  1. TextBlob
  2. VADER
  3. Harmonized Lexicon (A novel lexicon created by combining TextBlob and VADER scores with domain-specific food insecurity terms)

Each tweet was then labelled as positive or negative based on the overall sentiment score produced by each lexicon.
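How the harmonized lexicon combines its inputs is not spelled out, so the sketch below shows one plausible scheme: average the TextBlob and VADER scores, add a domain-term adjustment, and clamp the result. The term weights in `DOMAIN_TERMS`, the averaging rule, and the choice to treat a score of exactly 0 as positive are all hypothetical. The two lexicon scores are passed in as plain numbers to keep the example dependency-free.

```python
# In practice the two inputs would come from the libraries themselves, e.g.
#   TextBlob(text).sentiment.polarity                                  (textblob)
#   SentimentIntensityAnalyzer().polarity_scores(text)["compound"]     (vaderSentiment)
# Both values lie in [-1, 1].

# Hypothetical domain terms and weights; the project's real lexicon is not shown.
DOMAIN_TERMS = {"hungry": -0.5, "starving": -0.8, "afford": -0.2}


def domain_score(text: str, terms=DOMAIN_TERMS) -> float:
    """Average weight of the domain terms found in the text (0.0 if none)."""
    hits = [terms[w] for w in text.lower().split() if w in terms]
    return sum(hits) / len(hits) if hits else 0.0


def harmonized_score(textblob_polarity: float, vader_compound: float, text: str) -> float:
    """Average the two lexicon scores, add the domain adjustment, clamp to [-1, 1]."""
    base = (textblob_polarity + vader_compound) / 2
    return max(-1.0, min(1.0, base + domain_score(text)))


def label(score: float) -> str:
    """Binary sentiment label; a score of exactly 0 counts as positive (assumption)."""
    return "negative" if score < 0 else "positive"
```

For example, `label(harmonized_score(0.1, -0.3, "so hungry today"))` gives `"negative"`, because the domain term pulls an otherwise near-neutral score below zero.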

3) Multi-Level Risk Classification

I developed a machine learning pipeline that trains traditional supervised learning algorithms such as Support Vector Machine (SVM), Naive Bayes, and Logistic Regression on the sentiment-annotated dataset.

Tweets with positive sentiment - Food Secure

Tweets with negative sentiment - Food Insecure

Negative tweets were further categorized into risk levels (mild, moderate, severe) based on sentiment score thresholds (see the figure below).

Figure: Labelling sentiment for severity
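The threshold-based labelling can be sketched as a simple mapping from sentiment score to risk level. The cut-off values below (`MILD_CUTOFF`, `MODERATE_CUTOFF`) are placeholders chosen only for illustration; the actual thresholds are the ones defined in the figure.

```python
# Hypothetical cut-offs; the project's real thresholds are defined in its figure.
MILD_CUTOFF = -0.3
MODERATE_CUTOFF = -0.6


def risk_level(score: float) -> str:
    """Map a sentiment score in [-1, 1] to a food insecurity risk label."""
    if score >= 0:
        return "food secure"   # positive sentiment
    if score > MILD_CUTOFF:
        return "mild"          # slightly negative
    if score > MODERATE_CUTOFF:
        return "moderate"
    return "severe"            # strongly negative
```

Keeping the cut-offs as named constants makes it easy to swap in the real values and to re-label the dataset if the thresholds change.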

4) Model Evaluation and Selection

I trained and tested each lexicon-algorithm combination on both 70:30 and 80:20 train-test splits, then compared the combinations by accuracy.
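A comparison like this could be set up with scikit-learn roughly as follows. This is a sketch under several assumptions: the feature extraction (TF-IDF), the use of `LinearSVC` as the SVM, the helper name `compare_models`, and the fixed random seed are not taken from the project and are there only to make the example concrete.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC


def compare_models(texts, labels, test_sizes=(0.3, 0.2), seed=42):
    """Return {(model_name, test_size): accuracy} for each model and split."""
    # Factories, so every split gets a freshly initialised model.
    models = {
        "SVM": LinearSVC,
        "Naive Bayes": MultinomialNB,
        "Logistic Regression": lambda: LogisticRegression(max_iter=1000),
    }
    results = {}
    for test_size in test_sizes:  # 0.3 -> 70:30 split, 0.2 -> 80:20 split
        X_train, X_test, y_train, y_test = train_test_split(
            texts, labels, test_size=test_size, random_state=seed, stratify=labels
        )
        for name, make_model in models.items():
            # TF-IDF features are an assumption; any vectorizer could be used here.
            clf = make_pipeline(TfidfVectorizer(), make_model())
            clf.fit(X_train, y_train)
            results[(name, test_size)] = accuracy_score(y_test, clf.predict(X_test))
    return results
```

Running `compare_models` once per lexicon's label set would produce the full lexicon-algorithm grid. Stratified splitting keeps the class proportions similar in the train and test sets, which matters when the severe class is rare.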

To make the results easier to use, I translated the key insights into an interactive dashboard that supports intuitive exploration of food insecurity risk patterns and data-driven decision making.

Figure: Tracking Food Insecurity on Twitter interactive dashboard