A highly adaptable tool for analyzing Twitter data. It automatically scrapes tweets from the previous week based on a user-supplied keyword and applies a naive Bayes classifier to analyze sentiment. The project ships with a Dockerized PostgreSQL server managed through pgAdmin and includes graphical visualizations for analyzing and categorizing data.
Basic Usage:
- Clone the project
- Install Docker and docker-compose (not listed in requirements.txt)
- Install the Python dependencies from requirements.txt
- In a terminal, navigate to the project and cd into the Docker directory
- Run sudo docker-compose up
- Open localhost:5555 and sign in to pgAdmin using 'user' and 'password' as the email and password, respectively
- Create a server with the login 'user' and password 'password', and the host address 172.XX.0.1 - check the terminal output for the IP address to fill in the XX
- Open the .env file and set the environment variables (this assumes you have already created a Twitter app; see the sketch after this list)
- Run main.py, classify tweets, and view the projections
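The exact variable names expected in .env are not listed in this README, so the following is only a minimal sketch of loading hypothetical Twitter credentials from the environment; substitute the names the project actually uses:

```python
import os

# Hypothetical variable names - check the project's .env for the real ones.
consumer_key = os.environ["TWITTER_CONSUMER_KEY"]
consumer_secret = os.environ["TWITTER_CONSUMER_SECRET"]
access_token = os.environ.get("TWITTER_ACCESS_TOKEN")   # optional, per the TweetScraper docs below
access_key = os.environ.get("TWITTER_ACCESS_KEY")       # optional, per the TweetScraper docs below
```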
Important References:
- https://arxiv.org/pdf/1811.07522.pdf
- https://jakevdp.github.io/PythonDataScienceHandbook/05.05-naive-bayes.html
- https://towardsdatascience.com/creating-the-twitter-sentiment-analysis-program-in-python-with-naive-bayes-classification-672e5589a7ed
- https://www.sciencedirect.com/science/article/pii/S1877050919302789
- https://www.sciencedirect.com/science/article/pii/S2405844018332067
- https://arxiv.org/pdf/1509.02971.pdf
Advanced Usage: The tool consists of a series of classes for managing and classifying Twitter data, documented below:
Tweet
- Constructor: * Initializes all the tweet properties to None. Properties include: {Text, User, Retweet_Count, Date, Favorite Count, Follower Count, Nlp Score, Given Score, Tokenized Text}
- insert_custom_value(name, value): * Allows user to add a custom tweet property
- add_{insert_property}(property): * Allows users to update defined properties
- add_tweet_json(tweet): * The Twitter API returns tweets as JSON; this parses the JSON into a Tweet object
- add_tweet(text, user, favorite_count, retweet_count, follower_count, date, nlp_score, given_score, tokenized_text): * Allows the user to add multiple properties at once. None of the properties are required
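A minimal usage sketch for Tweet, assuming the constructor and setters listed above; the import path and the exact add_{property} method names are inferred, and the values are purely illustrative:

```python
from src.Database.tweet import Tweet  # assumed import path - adjust to the project layout

tweet = Tweet()                                # every property starts as None
tweet.add_text("Loving the new release!")      # add_{property} setters update defined properties
tweet.add_user("example_user")
tweet.insert_custom_value("language", "en")    # attach a custom property outside the default set

# Or set several properties at once; none of the arguments are required
tweet.add_tweet(text="Loving the new release!", user="example_user",
                favorite_count=10, retweet_count=3)
```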
TweetList
- Constructor: * Initializes the dict for storing counts and begins indexing
- insert_data(tweet) * Inserts a tweet
- remove_{insert_property}(property): * Removes any tweets in the list matching the property
- generate_random_tweet_list(size): * Returns a random subset of current tweet list (used to make test and training sets)
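A brief sketch of building a TweetList and drawing a random subset for a training set (the import path and set size are assumptions):

```python
from src.Database.tweet_list import TweetList  # assumed import path

tweets = TweetList()
tweets.insert_data(tweet)                      # insert the Tweet object built above

# Random subset, e.g. to carve out a training set
training_set = tweets.generate_random_tweet_list(100)
```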
TweetScraper
- Constructor(consumer_key, consumer_secret, access_token, access_key): * Initializes the connection with the Twitter API. Access token and key are not required
- search(keyword, user, start_date, end_date): * Searches Twitter and returns a TweetList. Keyword is the only required value. Combinations of {keyword, user}, {keyword, user, start_date, end_date}, and {keyword, end_date} are supported
- get_weekly_tweets(keyword): * Gets tweets containing the keyword from the past week
- list_members(user, slug): * Gets members of a list
- get_valuable_users(base_user): * Gets all members of all of base_user's lists
- users_lists(user): * Gets user lists
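A hedged sketch of scraping with TweetScraper, using the credentials loaded earlier (import path assumed; the keyword and user are placeholders):

```python
from src.Scraper.tweet_scraper import TweetScraper  # assumed import path

scraper = TweetScraper(consumer_key, consumer_secret)   # access token/key are optional
weekly = scraper.get_weekly_tweets("python")            # tweets containing the keyword from the past week

# Narrower search: keyword plus user (one of the supported combinations)
by_user = scraper.search("python", user="example_user")
```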
Data_To_Csv (not used in the default main.py)
- Constructor: * Does nothing
- write_data_to_csv(tweet_list, file_name): * Writes tweets to a csv file at the file_name path. File_name defaults to src/Database/output/tweetlist.csv
- read_data_from_csv(file_name): * Reads tweets from a csv file and parses into a TweetList obj
- write_userlist_to_csv(userlist, file_name): * Writes a list of users to a csv file at the file_name path. File_name defaults to src/Database/output/userlist.csv
- read_userlist_from_csv(file_name): * Reads a column of users from csv. Returns python list
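A short sketch of the CSV round trip with Data_To_Csv; the default output path comes from the descriptions above, and the import path is an assumption:

```python
from src.Database.data_to_csv import Data_To_Csv  # assumed import path

csv_io = Data_To_Csv()
csv_io.write_data_to_csv(weekly)   # defaults to src/Database/output/tweetlist.csv
restored = csv_io.read_data_from_csv("src/Database/output/tweetlist.csv")  # back into a TweetList
```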
Database
- Constructor(user, password, host, port): * Attempts to initialize the database connection
- create_table(name, column_name, column_type): * Creates a table in the db with all column names in the vector column_name and their corresponding column_type. The number of column names and column types must match.
- num_rows(table_name, column_name): * Returns the number of rows in a column
- update_column_by_id(table_name, column_name, tweet_id, new_value): * Updates the specified column name with a new value any time the id is the tweet_id
- update_column_by_text(table_name, column_name, text, new_value): * Updates the specified column name with a new value any time the text is equal to the text param
- insert_tweet(table_name, tweet_id, tweet): * Inserts a specific tweet
- insert_tweet_list(table_name, tweet_list): * Inserts all tweets in order of their storage in tweet_list
- parse_db_into_tweet_list(name): * Parses all tweets in the named table into Tweet objects and then stores them in a TweetList object
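A hedged sketch of connecting to the Dockerized Postgres instance and storing scraped tweets; the credentials match the Basic Usage defaults, while the port, table name, and columns are illustrative assumptions:

```python
from src.Database.database import Database  # assumed import path

db = Database("user", "password", "localhost", 5432)         # 5432 is the standard Postgres port
db.create_table("tweets",
                ["text", "user_name", "given_score"],         # column_name vector
                ["text", "text", "integer"])                  # matching column_type vector
db.insert_tweet_list("tweets", weekly)
stored = db.parse_db_into_tweet_list("tweets")                # back into a TweetList
```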
PreProccessing
- Constructor: * Builds stop words
- process_tweets(list_tweets): * Tokenizes a list of tweets and returns their associated tokenized array and label
- generate_token_array(token_arr): * Transforms the multi-dimensional tokenized list of tweets into a single-dimensional array for inserting into the db
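A short sketch of the preprocessing step; unpacking the return value of process_tweets into tokens and labels follows the description above but is an assumption, as is the import path:

```python
from src.Classifier.pre_processing import PreProccessing  # assumed path; class name spelled as in the project

prep = PreProccessing()                          # builds the stop-word list
tokens, labels = prep.process_tweets(weekly)     # tokenize the scraped TweetList
flat_tokens = prep.generate_token_array(tokens)  # flatten for database insertion
```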
UserInterface
- Constructor(data, db, db_name): * Initializes widget values. Set data equal to the training set to classify tweets
- Classify: * Creates a UI for easily self-classifying tweets for the training set. Automatically updates the DB
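A sketch of launching the manual classification UI; whether Classify is invoked as a method is not specified here, so the call below is an assumption, as are the import path and table name:

```python
from src.UI.user_interface import UserInterface  # assumed import path

ui = UserInterface(training_set, db, "tweets")   # data = training set, plus the db handle and table name
ui.Classify()                                    # opens the widget; scores you assign are written back to the DB
```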
Plot
- Constructor(db): * Sets initial values
- generate_projections(db_name, column_name): * Gathers all manually classified tweet scores and assigns values
- build_projections_histogram: * Displays a histogram of manually classified results
- create_classification_plot(pos_score, neg_score, neutral_score, irr_score): * Displays a histogram of given params. Used to display results from NaiveBayes Classification Tool
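Finally, a hedged sketch of the plotting step; the table and column names reuse the assumptions above, and the counts passed to create_classification_plot are placeholders:

```python
from src.Plot.plot import Plot  # assumed import path

plot = Plot(db)
plot.generate_projections("tweets", "given_score")    # gather manually classified scores
plot.build_projections_histogram()                    # histogram of the manual classifications

# Display results from the Naive Bayes classification step (placeholder counts)
plot.create_classification_plot(pos_score=120, neg_score=45, neutral_score=60, irr_score=15)
```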