# Vignette
## Sentiment Analysis of Twitter Users (twitter_nlp)

### Introduction
This package produces a visualized **sentiment analysis** of the political leaning (liberal vs. conservative) of any selected Twitter user. Besides an authenticator token (bearer token), only the username (@handle) of an existing Twitter user is required. The output visualizes the account's political sentiment both overall and over time, along with its most frequently used words.

*It is recommended to run this package through the command line.*

### How to use the package
The first step is to clone the repository from GitHub and run the main.py file in your terminal (the exact execution is explained in the README file). Alternatively, users can run the following code to import the package and call the key function `get_tweets(username, bearer_token)` of the class `Tweets`:

```python
import twitter_nlp as tnp

obj = tnp.Tweets()

# Function takes two arguments: username (@handle) and an authenticator token
obj.get_tweets("username", "bearer_token")
```

### Explanation of output
The output includes a pie chart, a time series chart, a word cloud, and the classification of a randomly selected tweet to illustrate the classification process. The following example charts are based on the Twitter account of **Jordan B. Peterson** (a Canadian clinical psychologist). The sentiment analysis always considers the last 100 tweets, or all available tweets if the account has fewer.

#### Pie Chart
The pie chart shows the **ratio of tweets classified as liberal vs. conservative**. In this case, the user's tweets lean towards a more liberal sentiment, although one third of them are classified as conservative.

<center><img src="img/pic_example_pie_chart.png"/></center>
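The shares behind such a pie chart can be derived by labelling each tweet with its more probable sentiment and counting the labels. The following is a minimal sketch, assuming the classifier yields one liberal probability per tweet; the `sentiment_shares` helper, the 0.5 threshold, and the sample values are hypothetical and not taken from the package:

```python
from collections import Counter


def sentiment_shares(liberal_probs, threshold=0.5):
    """Return the share of tweets labelled liberal vs. conservative.

    liberal_probs: iterable of P(liberal) per tweet (assumed input format).
    threshold: a tweet counts as liberal when P(liberal) >= threshold.
    """
    labels = [
        "liberal" if p >= threshold else "conservative" for p in liberal_probs
    ]
    counts = Counter(labels)
    total = len(labels)
    if total == 0:
        # No tweets available: report empty shares instead of dividing by zero.
        return {"liberal": 0.0, "conservative": 0.0}
    return {
        "liberal": counts["liberal"] / total,
        "conservative": counts["conservative"] / total,
    }


# Made-up probabilities for six tweets: four lean liberal, two conservative.
example_shares = sentiment_shares([0.9, 0.7, 0.2, 0.6, 0.4, 0.8])
```

A dictionary like `example_shares` could then be handed to a plotting call such as `matplotlib.pyplot.pie`, using its values as slice sizes and its keys as labels.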

#### Time Series Chart
The time series chart displays the **sentiment probability of each individual tweet** over the period in which the tweets were posted. The classifier does not simply assign a sentiment to each tweet; it assigns a probability to both sentiments, and the two probabilities add up to **1**. The graph also shows a **running average of the last 14 tweets**, which smooths out tweet-to-tweet swings and makes it easier to place the account on the sentiment scale. In the example, tweet sentiment swings from one extreme to the other, but a more liberal overall alignment is visible.

<center><img src="img/pic_example_time_series.png" width = "800" height = "440" /></center>
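The running average described above can be illustrated with a trailing mean over the per-tweet probabilities. This is only a sketch: the `running_average` helper is hypothetical, and averaging over fewer values for the first tweets (before a full window of 14 is available) is an assumption about how the start of the series might be handled, not a description of the package's code:

```python
def running_average(values, window=14):
    """Trailing mean over the last `window` values (fewer at the start)."""
    averages = []
    running_sum = 0.0
    for i, value in enumerate(values):
        running_sum += value
        if i >= window:
            # Drop the value that just fell out of the window.
            running_sum -= values[i - window]
        count = min(i + 1, window)
        averages.append(running_sum / count)
    return averages


# Made-up P(liberal) values in chronological order.
liberal_probs = [0.9, 0.1, 0.8, 0.6]
# The two sentiment probabilities of a tweet add up to 1.
conservative_probs = [1.0 - p for p in liberal_probs]
# A window of 2 keeps the toy example short; the chart uses 14.
smoothed = running_average(liberal_probs, window=2)
```

With a larger window the smoothed line reacts more slowly to individual tweets, which is why it gives a steadier picture of the account's position than the raw per-tweet values.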

#### WordCloud
The word cloud provides a visualized **overview of frequently used words** within the (up to) 100 tweets. The more often a word is used, the larger it appears in the final graphic. In our example, the words "think", "life", and "people" make up a significant portion of the terms used.

<center><img src="img/pic_example_wordcloud.png"/></center>
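The word sizes in such a graphic follow from simple word counts. Below is a rough sketch of that counting step; the `word_frequencies` helper, the tiny stop-word list, and the sample tweets are all made up for illustration, and the package's actual preprocessing may differ:

```python
import re
from collections import Counter

# Tiny illustrative stop-word list; a real analysis would use a fuller one.
STOP_WORDS = {"the", "a", "an", "and", "or", "of", "to", "is", "in", "it", "that"}


def word_frequencies(tweets):
    """Count how often each non-stop-word appears across all tweets."""
    counts = Counter()
    for tweet in tweets:
        # Lowercase, then keep runs of letters and apostrophes as tokens.
        tokens = re.findall(r"[a-z']+", tweet.lower())
        counts.update(t for t in tokens if t not in STOP_WORDS)
    return counts


sample_tweets = [
    "People think that life is simple.",
    "Think about the people in your life.",
    "Life rewards people who think.",
]
# The three most frequent words and their counts.
top_words = word_frequencies(sample_tweets).most_common(3)
```

A word-cloud renderer would scale each word's font size by counts like these, so the most frequent words dominate the image.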

#### SingleTweet
Each execution of the package shows a **random tweet and its respective sentiment**. This mainly illustrates the classification process and lets you check whether the classification aligns with your own intuition.

<center><img src="img/pic_example_single_tweet.PNG"/></center>

### Under the hood
#### How does the classifying work? 
The package uses a pre-trained classifier that was built using the **NaiveBayesClassifier** class from the Natural Language Toolkit (nltk). It is a simple probabilistic classifier that applies **Bayes' theorem**. The classifier makes the 'naive' assumption that all features are independent and assigns each tweet a probability of belonging to each sentiment (the two probabilities add up to 1).

*More information on how this classifier works: https://www.nltk.org/_modules/nltk/classify/naivebayes.html* 
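To make the idea concrete, here is a small from-scratch illustration of a Naive Bayes text classifier. It is not NLTK's implementation and not the package's trained model: the `train` and `classify` helpers, the word-count features, the add-one smoothing, and the four training sentences are all assumptions chosen to keep the example short.

```python
import math
from collections import Counter, defaultdict


def train(labelled_tweets):
    """Collect per-label tweet counts and per-label word counts."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocabulary = set()
    for text, label in labelled_tweets:
        label_counts[label] += 1
        for word in text.lower().split():
            word_counts[label][word] += 1
            vocabulary.add(word)
    return label_counts, word_counts, vocabulary


def classify(text, model):
    """Return P(label | text) for every label; the values sum to 1."""
    label_counts, word_counts, vocabulary = model
    total_tweets = sum(label_counts.values())
    log_scores = {}
    for label in label_counts:
        # Prior P(label), kept in log space to avoid numerical underflow.
        log_score = math.log(label_counts[label] / total_tweets)
        total_words = sum(word_counts[label].values())
        for word in text.lower().split():
            if word not in vocabulary:
                continue  # ignore words never seen in training
            # Naive assumption: per-word likelihoods multiply independently.
            # Add-one smoothing keeps unseen (label, word) pairs non-zero.
            likelihood = (word_counts[label][word] + 1) / (
                total_words + len(vocabulary)
            )
            log_score += math.log(likelihood)
        log_scores[label] = log_score
    # Normalise by the evidence term so the posteriors add up to 1.
    max_log = max(log_scores.values())
    unnormalised = {k: math.exp(v - max_log) for k, v in log_scores.items()}
    evidence = sum(unnormalised.values())
    return {label: score / evidence for label, score in unnormalised.items()}


# Made-up training data, purely for illustration.
toy_model = train([
    ("expand public healthcare", "liberal"),
    ("protect voting rights", "liberal"),
    ("cut taxes now", "conservative"),
    ("secure the border", "conservative"),
])
posterior = classify("cut taxes", toy_model)
```

Because both words of the query appear only in the conservative training sentences, `posterior` puts most of its probability mass on "conservative", while the liberal share stays above zero thanks to the smoothing.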

#### Doing your own sentiment analysis
The package includes a Python script detailing how the classifier was built. Users are invited to do their own sentiment analysis using the provided script or to replicate the provided classifier. As with the package's other functionality, an authenticator token (bearer token) for Twitter's API is required.

<center><img src="img/pic_example_bayes.PNG" width = "300" height = "300"/></center>