Arabic Tweet Hashtag Sentiment Analysis

The project report: https://drive.google.com/file/d/1rjoW2kDerpI-tel6o6bF3T0393tbe86t/view

The PowerPoint presentation: https://drive.google.com/file/d/1U7mQ8lSuizDFTkKCvLqMDwk7tiYTjxch/view

Video of the presentation: https://drive.google.com/file/d/1utpwTH0NpTsQ1GFG5rF8LZpsTjB_Sw_5/view

Link to app: https://zakaria-aabbou-arabic-twee-sentimentanalysisarabicsa-app-dduf4o.streamlitapp.com/

Modules

Data used for training

The data used is a public dataset of labeled Arabic tweets containing 60k instances. The data is in the data folder.

LSTM model

The model training and definition are in the lstm_model.py file. Running it trains the LSTM model and freezes the model and tokenizer to the models folder. The model is a simple one with an Embedding layer, an LSTM layer, and a final sigmoid layer for predicting the probabilities. The optimizer used is Adam and the loss function is binary_crossentropy. The Tokenizer from the Keras library is used.
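
A minimal sketch of that architecture is shown below; the vocabulary limit, sequence length, and layer widths are illustrative assumptions, not the values used in lstm_model.py.

```python
# Illustrative Keras model matching the description above.
# VOCAB_SIZE, MAX_LEN and layer sizes are assumed, not taken from lstm_model.py.
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, LSTM, Dense

VOCAB_SIZE = 20000   # assumed tokenizer vocabulary limit
MAX_LEN = 100        # assumed padded tweet length

model = Sequential([
    Embedding(VOCAB_SIZE, 128, input_length=MAX_LEN),  # word embeddings
    LSTM(64),                                           # sequence encoder
    Dense(1, activation="sigmoid"),                      # sentiment probability
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
```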

SVM Model

The model training and definition are in the svm_model.py file. The model takes its input from the TF-IDF vectorizer and uses the svm.SVC classifier from scikit-learn. The model is frozen to the models folder along with the vectorizer for inference.
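
The pipeline roughly follows the sketch below; the hyperparameters, output paths, and toy data are assumptions rather than the values in svm_model.py.

```python
# Illustrative TF-IDF + SVC pipeline; parameters and output paths are assumed.
import joblib
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import SVC

# Toy data standing in for the cleaned tweets and their 0/1 sentiment labels.
train_texts = ["خدمة ممتازة", "تجربة سيئة"]
train_labels = [1, 0]

vectorizer = TfidfVectorizer()
X = vectorizer.fit_transform(train_texts)   # sparse TF-IDF features
clf = SVC(probability=True)                 # probability=True is an assumption
clf.fit(X, train_labels)

# Freeze both artifacts for later inference (paths are assumed).
joblib.dump(clf, "models/svm_model.pkl")
joblib.dump(vectorizer, "models/tfidf_vectorizer.pkl")
```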

DataCleaner

This module uses the nltk package for text cleaning and preprocessing, such as stopword removal and stemming.
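
A cleaning function in this spirit might look like the following; the exact regexes, stopword list, and stemmer used by the DataCleaner module are assumptions.

```python
# Illustrative Arabic tweet cleaner using nltk; the project's actual steps may differ.
import re
import nltk
from nltk.corpus import stopwords
from nltk.stem.isri import ISRIStemmer   # Arabic stemmer bundled with nltk

nltk.download("stopwords", quiet=True)
ARABIC_STOPWORDS = set(stopwords.words("arabic"))
stemmer = ISRIStemmer()

def clean_tweet(text: str) -> str:
    text = re.sub(r"http\S+|@\w+|#", " ", text)                   # strip URLs, mentions, '#'
    tokens = [t for t in text.split() if t not in ARABIC_STOPWORDS]
    return " ".join(stemmer.stem(t) for t in tokens)              # stem remaining tokens
```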

Model Inference

This handles loading the models and tokenizer into memory and then running predictions with them. The model_inference.py file has the code for this. The get_sentiment() function in this module takes in a dataframe and the model with which to predict.
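
A rough sketch of such a function is shown below; only get_sentiment() is named in the project, so the artifact paths, column names, and model-selection argument are assumptions.

```python
# Illustrative inference helper; paths, column names and the model_name
# argument are assumptions, not the project's actual identifiers.
import joblib
import pandas as pd
from tensorflow.keras.models import load_model
from tensorflow.keras.preprocessing.sequence import pad_sequences

def get_sentiment(df: pd.DataFrame, model_name: str = "svm") -> pd.DataFrame:
    """Add a 'sentiment' column with a prediction for each tweet in df['text']."""
    if model_name == "svm":
        vectorizer = joblib.load("models/tfidf_vectorizer.pkl")   # assumed path
        clf = joblib.load("models/svm_model.pkl")                 # assumed path
        df["sentiment"] = clf.predict(vectorizer.transform(df["text"]))
    else:  # LSTM
        tokenizer = joblib.load("models/tokenizer.pkl")           # assumed path
        model = load_model("models/lstm_model.h5")                # assumed path
        seqs = pad_sequences(tokenizer.texts_to_sequences(df["text"]), maxlen=100)
        df["sentiment"] = (model.predict(seqs).ravel() > 0.5).astype(int)
    return df
```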

Tweet Manager

This module takes care of authentication to the Twitter API and calls to the Twitter search endpoint. The config.ini file can be edited to add the authentication credentials. The get_tweets() function takes in the query, result type, count, and language of the tweets to be fetched. The file tweetmanager.py has the code for this.
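
One way this can be wired up is shown below; whether the project actually uses tweepy, and the section and key names in config.ini, are assumptions.

```python
# Illustrative Twitter search wrapper; the use of tweepy and the
# config.ini section/key names are assumptions.
import configparser
import tweepy

config = configparser.ConfigParser()
config.read("config.ini")
creds = config["twitter"]                      # assumed section name

auth = tweepy.OAuthHandler(creds["api_key"], creds["api_secret"])
auth.set_access_token(creds["access_token"], creds["access_token_secret"])
api = tweepy.API(auth)

def get_tweets(query, result_type="recent", count=100, lang="ar"):
    """Call the Twitter search endpoint and return the matching tweets."""
    return api.search_tweets(q=query, result_type=result_type, count=count, lang=lang)
```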

Web App

The web app definition is in SA_app.py. It uses the streamlit library for defining the web app flow. Model loading and tweet scraping are cached so that they are not repeated on every rerun, which would freeze the UI.
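
The caching pattern looks roughly like this; the specific cache decorators and helper names are assumptions (the app may use Streamlit's older @st.cache instead).

```python
# Illustrative Streamlit caching; helper names, paths, and the decorators
# are assumptions about SA_app.py.
import streamlit as st
import joblib
from tweetmanager import get_tweets           # project module (per the README)

@st.cache_resource        # load once and reuse across reruns instead of freezing the UI
def load_models():
    # paths are assumptions
    return joblib.load("models/svm_model.pkl"), joblib.load("models/tfidf_vectorizer.pkl")

@st.cache_data            # memoize scraping so the same query is not fetched repeatedly
def fetch_tweets(query: str):
    return get_tweets(query, result_type="recent", count=100, lang="ar")
```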

Instructions to run

In the project directory run the following to set up the required libraries

```
pip install -r requirements.txt
```

To run the app

```
streamlit run SA_app.py
```

This will start the app on localhost at the port displayed in the command line.
