
ABSA-Toolkit

The ABSA (Aspect-Based Sentiment Analysis) Toolkit is developed for performing aspect-level sentiment analysis on customer reviews. The system has two main phases: a development phase and a production phase. The development phase allows the user to train models for performing aspect-level sentiment analysis tasks on a target domain. In the production phase, a web application is generated through which end users can submit reviews and analyze aspect-level sentiments.

The system is developed using Python 2.7.

Python Libraries

The following libraries are required to run the Python scripts:

  • pandas
  • numpy
  • sklearn
  • gensim
  • sys
  • warnings
  • flask
  • json
  • time
  • pickle
  • nltk
  • xgboost
  • pystruct
  • re

To install the third-party libraries, use pip install -r requirements.txt (standard-library modules such as sys, warnings, json, time, pickle, and re ship with Python and need no installation).

To run xgboost on Windows, please refer to the installation guide available here: https://github.com/dmlc/xgboost/blob/master/doc/build.md

After installing nltk, run the following commands in a Python environment to install the nltk tagger and stopwords list:

    import nltk
    nltk.download()
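
nltk.download() with no arguments opens an interactive downloader. If you prefer a non-interactive install, a minimal sketch follows; the exact NLTK resource names the toolkit needs are an assumption (a POS tagger and the stopwords list, as mentioned above):

    import nltk
    nltk.download('averaged_perceptron_tagger')  # POS tagger (assumed resource name)
    nltk.download('stopwords')                   # stopwords list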

Code Organization

1. data: contains the training and testing datasets. Place your training and testing datasets inside the data folder. The system expects input datasets to be in a specific format; please see data/restaurants/train.csv for the expected input format.

2. flaskWebApp: contains the code for generating the Flask web application. You do not need to modify this code.

3. lexicons: contains polarity lexicon files used for computing polarity scores. You can place a lexicon file of your choice inside this folder. Currently this folder contains lexicons publicly available at http://saifmohammad.com/WebPages/lexicons.html. For any domain, you can use the wnscores_inquirer.txt lexicon (source: http://compprag.christopherpotts.net/iqap-experiments.html); a file-loading sketch follows this list.

4. models: all models trained during the training phase are saved in this folder. Inside /acd, the models trained for each aspect category are saved. Inside /ote, the model trained for aspect term detection is saved. Inside /pd, the models trained for polarity detection are saved.

Note: before running the training script, remove all files inside the models/acd, models/pd, and models/ote folders.

5. wordembeddings: contains word vectors in txt format. Currently, we have amazon200.txt, suitable for the Electronic Products dataset, and vector_yelp_200.txt, suitable for the Restaurant domain. For any other domain, you can either train your own Word2Vec model and save the word embeddings in txt format, or use the GloVe pretrained models available at http://nlp.stanford.edu/projects/glove/. A sketch for loading lexicon and word-embedding files is shown below.
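
Both the lexicon files (item 3) and the word-embedding files (item 5) are plain-text files. The following is a minimal loading sketch, assuming each line starts with a token followed by whitespace-separated numeric values; the exact column layout of a given lexicon may differ, so treat this as illustrative only:

    def load_embeddings(path):
        # One token per line followed by its vector components (GloVe/word2vec text format).
        vectors = {}
        with open(path) as f:
            for line in f:
                parts = line.split()
                if len(parts) < 2:
                    continue
                try:
                    vectors[parts[0]] = [float(x) for x in parts[1:]]
                except ValueError:
                    continue  # skip a header line, if present
        return vectors

    def load_lexicon(path):
        # Term followed by a numeric polarity score; extra columns are ignored.
        scores = {}
        with open(path) as f:
            for line in f:
                parts = line.split()
                if len(parts) >= 2:
                    try:
                        scores[parts[0]] = float(parts[1])
                    except ValueError:
                        continue  # skip header or malformed lines
        return scores

    embeddings = load_embeddings('wordembeddings/vector_yelp_200.txt')
    lexicon = load_lexicon('lexicons/wnscores_inquirer.txt')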

Training Models for Aspect-Based Sentiment Analysis

Follow the instructions below:

  1. Clone/Download repository

  2. Install all the required Python libraries mentioned above

  3. Place your csv files for training and testing inside the /data folder. Currently, the data folder contains Restaurant and Laptop datasets

  4. Place your word embedding file inside the /wordembeddings folder. You can use one of the provided word embedding files if your dataset is from the Restaurant or Electronic Products domain; otherwise, you can download GloVe pretrained vectors

  5. Make sure to remove all files inside the models/acd, models/pd, and models/ote folders before training (see the cleanup sketch after this list)

  6. Run the script absa.py to train the aspect-based sentiment analysis models: python absa.py
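
A minimal sketch for clearing out previously trained models before re-training (step 5); it simply deletes every file inside the three model folders:

    import glob
    import os

    # Delete all previously saved model files before starting a new training run.
    for folder in ('models/acd', 'models/pd', 'models/ote'):
        for path in glob.glob(os.path.join(folder, '*')):
            if os.path.isfile(path):
                os.remove(path)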

[Screenshot: DataInput]

Once the models are trained, you will see a summary as shown below.

[Screenshot: Summary]

After the training phase is complete, you are ready to use the web application.

Production Phase for Aspect-Based Sentiment Analysis

python absaweb.py <wordembeddingsFile> <lexiconFile>

Use the same vector file and lexicon file as used in the training phase. For example:

python absaweb.py vectors_yelp_200.txt lexicons/Yelp-restaurant-reviews-AFFLEX-NEGLEX-unigrams.txt

This will start the Flask application, accessible at 127.0.0.1:9000 in your browser.
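
As a quick check that the application has started, you can request the root URL; this sketch assumes the requests package is installed (it is not part of the toolkit's listed dependencies):

    import requests

    # The web app listens on 127.0.0.1:9000 once absaweb.py is running.
    resp = requests.get('http://127.0.0.1:9000')
    print(resp.status_code)  # expect 200 when the application is up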

Snapshots

[Screenshots of the web application]
