A repo that includes a few examples on conducting Sentiment Analysis in Arabic.
NileULex.csv
README.md
calculate_watson.py
keras_lstm.py
lexicon_based.py
run_all.sh
test.csv
tfidf_lgbm.py
tfidf_nb.py
train.csv
watson.py

Arabic Sentiment Analysis

I made this repository as part of an interview task.

This repo contains a few scripts that calculate the sentiment of a given tweet, based on the ASTD (Arabic Sentiment Tweets Dataset).

I mainly used IBM Watson to train the classifier (during the IBM Bluemix trial period). To improve on Watson, I also trained a few other models:

  1. A bidirectional LSTM model
  2. A gradient boosted machine (LightGBM)
  3. A Naive Bayes model

In the end, I take the predictions from all of these models and average their probabilities to achieve higher accuracy.
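The averaging step can be sketched roughly as follows. This is a minimal illustration, not the code from this repo: the function names, the model keys, and the three-class label set are all assumptions made for the example, and it only assumes that each model outputs one probability per class for every tweet.

```python
from typing import Dict, List, Sequence


def average_probabilities(
    model_probs: Dict[str, List[Sequence[float]]]
) -> List[List[float]]:
    """Average per-class probabilities across models, tweet by tweet.

    model_probs maps a (hypothetical) model name to a list with one
    probability row per tweet; all models must cover the same tweets
    and use the same class order.
    """
    models = list(model_probs.values())
    if not models:
        raise ValueError("at least one model is required")
    n_samples = len(models[0])
    if any(len(m) != n_samples for m in models):
        raise ValueError("all models must score the same number of tweets")

    averaged = []
    for i in range(n_samples):
        rows = [m[i] for m in models]
        # zip(*rows) groups the probabilities of each class across models
        averaged.append([sum(col) / len(rows) for col in zip(*rows)])
    return averaged


def predict_labels(
    avg_probs: List[List[float]], labels: Sequence[str]
) -> List[str]:
    """Pick the class with the highest averaged probability for each tweet."""
    return [labels[max(range(len(row)), key=row.__getitem__)] for row in avg_probs]
```

With this kind of soft voting, a model that is confidently right can outweigh two models that are weakly wrong, which is one plausible reason an ensemble can edge out its individual members.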

I have also tried a lexicon-based approach using the NileULex lexicon.
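A lexicon-based classifier can be sketched along these lines. This is only an illustration of the general idea, not the logic in lexicon_based.py: the function name, the +1/-1 polarity encoding, the whitespace tokenization, and the two-entry toy lexicon are assumptions, and the real NileULex.csv layout may differ.

```python
from typing import Dict


def lexicon_sentiment(tweet: str, lexicon: Dict[str, int]) -> str:
    """Classify a tweet by summing the polarity of the terms it contains.

    lexicon maps a term to +1 (positive) or -1 (negative); terms that are
    not in the lexicon contribute nothing to the score.
    """
    # Naive whitespace tokenization (an assumption; real Arabic text
    # usually needs normalization first)
    score = sum(lexicon.get(token, 0) for token in tweet.split())
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


# Toy lexicon for illustration only, not taken from NileULex
toy_lexicon = {"جميل": 1, "سيء": -1}
print(lexicon_sentiment("الفيلم جميل", toy_lexicon))
```

A scheme like this has no way to handle negation, sarcasm, or words missing from the lexicon, which fits with the lexicon scoring well below the trained models.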

Here's a table of the results. (The lexicon is not used in any of the ensembles.)

| Model | Score |
| --- | --- |
| Lexicon only | 41.3% |
| IBM Watson | 68.1% |
| LightGBM | 68.4% |
| Naive Bayes | 67.3% |
| Bi-LSTM | 63.4% |
| Ensemble (All) | 69.16% |
| Ensemble (Watson + LGBM + NB) | 69.33% |
| Ensemble (Watson + LGBM) | 69.5% |

You can recreate the results by running the run_all.sh script.