# TwtrConvo

TwtrConvo is a Python package that uses tweepy, pandas, TextBlob, and plotly to gauge the overall Twitter sentiment around a company, given its ticker symbol.  It queries for tweets using tweepy and Twitter API keys, organizes the tweets in pandas DataFrames, parses the text and scores its sentiment using TextBlob and regex, and finally displays the resulting statistics graphically using plotly.

## Tweets module (tweets.py)

The tweets module acts as a wrapper layer around tweepy with the main functions:

    get_tweets
    get_replies

### Setup

In order to use this module you will first need to set up your Twitter API keys.  If you don't have Twitter API keys, get them by following this guide:

https://developer.twitter.com/en/docs/basics/authentication/guides/access-tokens.html

Once you get your Twitter API keys you will need to add them to your environment with the variable names:

    TWITTER_CONSUMER_KEY
    TWITTER_CONSUMER_SECRET
    TWITTER_ACCESS_TOKEN
    TWITTER_ACCESS_TOKEN_SECRET

This will allow the TwtrConvo "tweets" module access to the Twitter API in order to query for tweets.
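Before querying, it can help to confirm that all four variables are actually visible to Python. The following is a minimal sketch using only the standard library; the helper name `missing_twitter_keys` is hypothetical and is not part of TwtrConvo.

```python
import os

# Variable names taken from the list above.
REQUIRED_KEYS = (
    "TWITTER_CONSUMER_KEY",
    "TWITTER_CONSUMER_SECRET",
    "TWITTER_ACCESS_TOKEN",
    "TWITTER_ACCESS_TOKEN_SECRET",
)


def missing_twitter_keys(environ=None):
    """Return the required variable names that are unset or empty.

    Hypothetical helper for illustration; pass a dict to check something
    other than the real process environment.
    """
    env = os.environ if environ is None else environ
    return [name for name in REQUIRED_KEYS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_twitter_keys()
    if missing:
        print("Missing Twitter API keys:", ", ".join(missing))
    else:
        print("All Twitter API keys are set.")
```

Running this before importing the package gives a clearer message than an authentication failure part-way through a query.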

## TwtrConvo module (twtrconvo.py)

The twtrconvo module houses the main logic of the package, including the methods to build or load the dataset as well as the function for ranking the tweets that were queried. Let's step through each portion of the main function to show how the statistical analysis of a company's overall Twitter sentiment is built.

### Import package and load dataset

In [1]:

    import os
    import TwtrConvo

    ticker = 'TSLA'

    # Generally you would use the "build_dataset" method to get tweets and replies. However, with
    # the default of 500 tweets this generally maxes out your hourly queries to the Twitter
    # API if the ticker has a lot of interaction on Twitter and you call build_dataset multiple
    # times within an hour.  For this reason, you can also load previously built data and
    # use it to conduct statistical analysis.

    #tweet_df, reply_df = TwtrConvo.twtrconvo.build_dataset(ticker)

    tweet_df, reply_df = TwtrConvo.twtrconvo.load_dataset(
        os.path.join(os.getcwd(), 'datasets', ticker)
    )

    print(tweet_df.head())

                        id       username  \
    0  1122625832009097217    TeslaCharts   
    1  1122620038576578562    ElonBachman   
    2  1122628670366134272    QTRResearch   
    3  1122642374180655104  GatorInvestor   
    4  1122640088511524865         JTSEO9   

                                                   tweet  \
    0  There will never be a $TSLA robotaxi. https://...   
    1  1\ $TSLA wants Wall Street to look at its 20% ...   
    2  OH ITS JUST SO GREAT $TSLA $TSLAQ https://t.co...   
    3  I am sure it's nothing. Dutch fleet / rental b...   
    4  Did anyone else notice that $TSLA didn't break...   

                                                    text  favorites  retweets  \
    0                There will never be a TSLA robotaxi         94         8   
    1  1 TSLA wants Wall Street to look at its 20 aut...         91        14   
    2                    OH ITS JUST SO GREAT TSLA TSLAQ         16         2   
    3  I am sure it s nothing Dutch fleet rental busi...         44        11   
    4  Did anyone else notice that 
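
In the output above, the `text` column looks like a cleaned copy of `tweet` with links and punctuation removed. The sketch below shows one way such cleaning could be done with regex. It is an assumption based on the sample rows, not the package's actual implementation, and the function name `clean_tweet` and the example URL are made up for illustration.

```python
import re

# Matches http/https links up to the next whitespace.
URL_PATTERN = re.compile(r"https?://\S+")
# Matches any run of characters that are not ASCII letters or digits.
NON_ALNUM_PATTERN = re.compile(r"[^A-Za-z0-9]+")


def clean_tweet(tweet):
    """Drop links, then replace punctuation and symbol runs with one space."""
    without_urls = URL_PATTERN.sub(" ", tweet)
    return NON_ALNUM_PATTERN.sub(" ", without_urls).strip()


if __name__ == "__main__":
    # The URL here is a placeholder, not one from the dataset.
    print(clean_tweet("There will never be a $TSLA robotaxi. https://t.co/example"))
```

Note that this approach splits contractions, which matches the "it s nothing" seen in row 3.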

### Word Frequency

The first thing we'd like to look at is word frequency within the top-ranked tweets and their replies.  This can reveal patterns and point out key words that affect the current social sentiment we observe.  We'll use TextBlob and our functions "get_blob" and "get_word_count" to do this, then display the word count data using plotly pie charts.

In [2]:

    from plotly.offline import init_notebook_mode, iplot
    init_notebook_mode(connected=True)

    # Get text blobs and word count
    tweet_blob = TwtrConvo.twtrconvo.get_blob(ticker, tweet_df)
    tweet_word_count = TwtrConvo.twtrconvo.get_word_count(tweet_blob)
    reply_blob = TwtrConvo.twtrconvo.get_blob(ticker, reply_df)
    reply_word_count = TwtrConvo.twtrconvo.get_word_count(reply_blob)

    # The pie chart will default to the top ten words for each word count unless n is
    # specified to be different
    fig = TwtrConvo.plots.create_pie_chart(tweet_word_count, reply_word_count)

    iplot(fig)
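
To make the idea of a word count concrete, here is a standard-library sketch of counting words across cleaned tweet text and keeping the top `n`. It is not how `get_word_count` is implemented (that function works on a TextBlob); the name `top_words` is hypothetical, and the default of ten simply mirrors the pie chart default mentioned above.

```python
from collections import Counter


def top_words(texts, n=10):
    """Count lowercased, whitespace-separated words and return the n most common."""
    counts = Counter()
    for text in texts:
        counts.update(text.lower().split())
    return counts.most_common(n)


if __name__ == "__main__":
    sample = ["TSLA robotaxi", "tsla wants robotaxi", "TSLA"]
    for word, count in top_words(sample, n=2):
        print(word, count)
```

In practice you would likely also filter out very common words such as "the" and "a" so they do not dominate the chart.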

### Sentiment Gauge

Using TextBlob we will get a general sentiment of all the tweets and display it on a gauge using plotly.

In [3]:

    fig = TwtrConvo.plots.create_sentiment_gauge(tweet_blob)
    iplot(fig)
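
TextBlob reports sentiment polarity as a float between -1.0 (most negative) and 1.0 (most positive). The sketch below shows one way such a score could be mapped onto a gauge. The function names and the 0.1 neutral band are assumptions chosen for illustration; they are not taken from `create_sentiment_gauge`.

```python
def gauge_percent(polarity):
    """Map a polarity in [-1.0, 1.0] to a needle position in [0.0, 100.0]."""
    if not -1.0 <= polarity <= 1.0:
        raise ValueError("polarity must be between -1.0 and 1.0")
    return (polarity + 1.0) / 2.0 * 100.0


def gauge_label(polarity, neutral_band=0.1):
    """Bucket a polarity score; the neutral band width is an arbitrary choice."""
    if not -1.0 <= polarity <= 1.0:
        raise ValueError("polarity must be between -1.0 and 1.0")
    if polarity > neutral_band:
        return "positive"
    if polarity < -neutral_band:
        return "negative"
    return "neutral"


if __name__ == "__main__":
    for score in (-0.6, 0.0, 0.35):
        print(score, gauge_percent(score), gauge_label(score))
```

A score for real text could be obtained with `TextBlob(text).sentiment.polarity` and passed to these helpers.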