Further Steps:

This repository performs sentiment analysis and topic modelling on tweets.

The workflow has four parts:

Getting Twitter data based on hashtags
Training and saving the models (Word2Vec, TF-IDF and SVM)
Using the models for sentiment classification
Running topic modelling separately on the tweets of each sentiment

The training data of sentiment-labelled tweets can be downloaded from the link below. Extract it and keep it in the train_data folder: http://thinknook.com/wp-content/uploads/2012/09/Sentiment-Analysis-Dataset.zip
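As a rough illustration of reading the extracted file, the sketch below parses a CSV with Python's standard library. The column names `Sentiment` and `SentimentText`, and the 0/1 label encoding, are assumptions about the dataset that this README does not confirm. The sample rows are made up for the example.

```python
import csv
import io


def read_labelled_tweets(handle, text_col="SentimentText", label_col="Sentiment"):
    """Yield (label, text) pairs from an open CSV file handle.

    The column names are assumptions; change them to match the real header.
    """
    reader = csv.DictReader(handle)
    for row in reader:
        yield int(row[label_col]), row[text_col].strip()


# Tiny in-memory sample standing in for the real file under train_data/.
sample = io.StringIO(
    "ItemID,Sentiment,SentimentSource,SentimentText\n"
    "1,1,example,I love this phone\n"
    "2,0,example,  worst service ever\n"
)
pairs = list(read_labelled_tweets(sample))
```

For the real data, open the extracted CSV with `open(path, newline="", encoding="utf-8")` and pass the handle to the same function. The encoding may need adjusting.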

A few steps need to be done only once, all under train_models:

Run train_word2vec.py to train the Word2Vec model and save it in pickle format
Run train_tfidf.py to train the TF-IDF model and save it in pickle format
Run train_classifier.py to train the classifier and save it in pickle format
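Saving a trained model in pickle format and loading it back can be sketched as follows. This is a minimal example using only the standard library, not the code in the training scripts. The helper names, the stand-in model object, and the file name `tfidf.pkl` are all hypothetical.

```python
import os
import pickle
import tempfile


def save_model(model, path):
    """Serialize any picklable object (e.g. a fitted model) to disk."""
    with open(path, "wb") as fh:
        pickle.dump(model, fh, protocol=pickle.HIGHEST_PROTOCOL)


def load_model(path):
    """Load an object previously written by save_model."""
    with open(path, "rb") as fh:
        return pickle.load(fh)


# Stand-in for a fitted model; a real script would pass its trained object.
dummy_model = {"name": "tfidf", "vocabulary": {"good": 0, "bad": 1}}

with tempfile.TemporaryDirectory() as tmp:
    model_path = os.path.join(tmp, "tfidf.pkl")  # hypothetical file name
    save_model(dummy_model, model_path)
    restored = load_model(model_path)
```

Pickle files can execute code when loaded, so only load model files that you created yourself or otherwise trust.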

The classifier reaches around 78% accuracy on the test dataset.

Once the above steps are completed, the models are ready to predict.

Keep running twitter_data.py to collect more tweet samples.

Once everything is done, run all_together.py to classify the tweets into positive and negative sentiments and run topic modelling on each of the two sets separately.
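The classify-then-split step can be pictured with the sketch below. It is not the code in all_together.py: `split_by_sentiment` and `toy_predict` are hypothetical names, and the keyword rule only stands in for the trained classifier.

```python
from collections import defaultdict


def split_by_sentiment(tweets, predict):
    """Group tweets by the label returned from predict(text)."""
    groups = defaultdict(list)
    for text in tweets:
        groups[predict(text)].append(text)
    return dict(groups)


def toy_predict(text):
    """Hypothetical stand-in for the trained classifier."""
    return "positive" if "good" in text.lower() else "negative"


tweets = ["Good morning!", "Traffic is awful", "Such a good game"]
by_sentiment = split_by_sentiment(tweets, toy_predict)
# Each group would then be passed to topic modelling separately.
```

Keeping the two groups apart means the topics found for positive tweets are not mixed with those found for negative ones.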

Further Steps:

Improve the classifier by handling negation
Improve the classifier using n-gram phrases
Try a Convolutional Neural Network to increase the accuracy of the model
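The first two ideas above could look roughly like the sketch below. It shows one common way to handle negation (marking tokens that follow a negation word until the next punctuation mark) and a plain n-gram generator. The word lists, the `_NEG` suffix and the function names are assumptions for illustration. None of this is in the repository yet.

```python
NEGATIONS = {"not", "no", "never", "n't", "cannot"}
CLAUSE_END = {".", ",", ";", "!", "?"}


def mark_negation(tokens):
    """Append _NEG to tokens that follow a negation word, up to clause end."""
    marked, negating = [], False
    for tok in tokens:
        if tok in CLAUSE_END:
            negating = False
            marked.append(tok)
        elif tok.lower() in NEGATIONS:
            negating = True
            marked.append(tok)
        else:
            marked.append(tok + "_NEG" if negating else tok)
    return marked


def ngrams(tokens, n):
    """Return contiguous n-token phrases as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


tokens = ["this", "is", "not", "good", ",", "but", "fine"]
marked = mark_negation(tokens)
bigrams = ngrams(["not", "very", "good"], 2)
```

With marking, "good" and "good_NEG" become separate features, so the classifier can weight them differently. Bigrams such as ("not", "very") capture short phrases that single words miss.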

nagarmayank/twitter_sentiment_analysis