# Introduction

In this course, we have used Twitter data for many of our homework assignments (remember @realDonaldTrump, @HillaryClinton?). Ever wondered how all that Twitter data was mined? In this tutorial, we will try to demystify Twitter data mining.

Twitter provides limited access to its data to registered developers through the [Twitter API](https://dev.twitter.com/overview/api), which is maintained by the company. Because Twitter data is so popular, developers have built libraries for accessing the Twitter API in many languages. Some of the popular ones are:

- [twitcurl](https://code.google.com/archive/p/twitcurl/) for C++
- [Tweetinvi](https://tweetinvi.codeplex.com/), which is a .NET based C# library
- [go-twitter](https://github.com/dghubble/go-twitter) for Go
- [Twitter4J](http://twitter4j.org/en/index.html) for Java
- [STTwitter](https://github.com/nst/STTwitter) for Objective-C
- [codebird-php](https://github.com/jublonet/codebird-php) for PHP
- [twython](https://github.com/ryanmcgrath/twython) for Python



## Content

In this tutorial, we will focus primarily on [twython](https://github.com/ryanmcgrath/twython), which is a Python wrapper for the Twitter API. We will use various twython calls to GET data from Twitter as well as POST data to it. The tutorial is structured in the following way:

- [Twitter API basics](#1.-Twitter-API-basics)
- [Twython](#2.-Twython)
- [Installation](#3.-Installation)
- [Developer Registration and Authentication](#4.-Developer-Registration-and-Authentication)
- [Users](#5.-Users)
- [Tweets](#6.-Tweets)
- [Entities](#7.-Entities)
- [Places](#8.-Places)
- [Streaming API](#9.-Streaming-API)
- [Application: Realtime Sentiment Analysis from Twitter Data](#10.-Application:-Realtime-Sentiment-Analysis-from-Twitter-Data)
- [Summary and Resources](#11.-Summary-and-Resources)


## 1. Twitter API basics

Twitter provides data access in two forms:

- REST API: For simple reads and writes to Twitter using the stateless [REST](https://en.wikipedia.org/wiki/Representational_state_transfer) standard. It allows operations such as searching tweets, updating status, and getting a list of followers.
- Streaming API: For streaming and monitoring data in real time. 

The latest version of the Twitter API is v1.1. All requests to the API are authenticated using [OAuth](https://en.wikipedia.org/wiki/OAuth). Authentication is essential to prevent abusive behaviour by unauthorized sources and also helps keep track of how various applications are using Twitter data.

Successful requests return data in [JSON](https://en.wikipedia.org/wiki/JSON) format.

### 1.1 Rate Limiting of Data

To prevent abuse, the APIs impose rate limits (typically a maximum of 15 calls per endpoint) within each 15-minute window. For more information, please check out Twitter's [rate limiting chart](https://dev.twitter.com/rest/public/rate-limits). When an application exceeds the rate limit for a particular API endpoint, the Twitter API responds with an HTTP 429 'Too many requests' code.

Also, rate limits on 'reads' are defined per user and per application, while rate limits on 'writes' are defined solely per user. This means that if a user writes from App X, the remaining write count for that user in that window is also reduced in any other app, say App Y, that the user is using. For reads, the counts for the two apps are independent.

The Streaming API has different rate limiting and access levels that are appropriate for long-lived connections. 

Note that these limits are enforced by the Twitter API and hence apply to all libraries, including twython.
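
To make the 15-minute window more concrete, the sketch below tracks call timestamps on the client side and reports whether another call would still fit. It is only an illustration: the class name `RateWindow`, its parameters, and the sliding-window bookkeeping are assumptions made for this example, and the real limits are enforced by Twitter on the server and vary by endpoint.

```python
from collections import deque


class RateWindow(object):
    """Hypothetical client-side tracker for a rate-limit window."""

    def __init__(self, max_calls=15, window_seconds=15 * 60):
        self.max_calls = max_calls
        self.window_seconds = window_seconds
        self.calls = deque()  # timestamps of recent calls, oldest first

    def allow(self, now):
        """Record a call at time `now` (in seconds) if it fits; return True or False."""
        # Drop timestamps that have fallen out of the window.
        while self.calls and now - self.calls[0] >= self.window_seconds:
            self.calls.popleft()
        if len(self.calls) < self.max_calls:
            self.calls.append(now)
            return True
        return False


# Small limit so the behaviour is easy to follow.
window = RateWindow(max_calls=2, window_seconds=900)
print(window.allow(0))    # first call fits
print(window.allow(10))   # second call fits
print(window.allow(20))   # third call exceeds the limit of 2
print(window.allow(905))  # the call made at t=0 has expired, so this one fits
```

Checking `allow(time.time())` before each request is a conservative way to stay under a limit instead of waiting for an HTTP 429 response.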

### 1.2 Basic Objects

Twitter APIs have four basic types of objects:
- [users](#5.-Users)
- [tweets](#6.-Tweets)
- [entities](#7.-Entities)
- [places](#8.-Places)

We will study these in detail with examples using twython below.




## 2. Twython

Twython is an actively maintained, pure Python wrapper for the Twitter API. It supports:

- Querying for specific tweets
- Following other users
- Updating status
- All other features mentioned in the [Twitter docs](https://dev.twitter.com/overview/documentation)

Other features include:

- Support for Twitter's Streaming API as well as normal REST API
- Support for Python 3
- Conversion of the returned JSON data to native Python objects



## 3. Installation



Let's begin by installing Twython. It can be done using [pip](https://pypi.python.org/pypi/pip) with the following command:

    $ pip install twython

or by using [easy_install](http://setuptools.readthedocs.io/en/latest/easy_install.html):

    $ easy_install twython
    
Once installed, you should be able to import the library.

In [1]:
import twython
from twython import Twython

## 4. Developer Registration and Authentication

In our final setup step, before we start playing with Twitter data, we need to register our app with Twitter to get our Consumer/App key and Consumer/App secret. 
- You can create and register an app [here](https://apps.twitter.com/) after logging in. 
- While creating the app, you can leave the 'Callback url' field blank; in the 'Website' field, you can enter any URL beginning with https (e.g. https://www.cmu.edu/).
- After creation, you can manage keys in the 'Keys and Access Tokens' tab. 
- Click on 'Create my Access token' at the bottom of that tab to generate your [OAuth](https://en.wikipedia.org/wiki/OAuth) Access Token and Secret.

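Alternatively, you can obtain the OAuth Access Token and Secret programmatically. Note that `get_authentication_tokens` returns only temporary request tokens together with an authorization URL. After you authorize the app at that URL, the PIN shown by Twitter (assuming the app has no callback URL configured) is exchanged for the final access tokens using `get_authorized_tokens`:

In [30]:
APP_KEY = 'Enter App key here'
APP_SECRET = 'Enter App Secret here'



In [49]:
twitter = Twython(APP_KEY, APP_SECRET)

# Step 1: get temporary request tokens and the URL where the user authorizes the app
auth = twitter.get_authentication_tokens()
print auth['auth_url']

# Step 2: after authorizing in the browser, enter the PIN that Twitter displays
oauth_verifier = raw_input('Enter PIN: ')

# Step 3: exchange the request tokens and the PIN for the final access tokens
twitter = Twython(APP_KEY, APP_SECRET, auth['oauth_token'], auth['oauth_token_secret'])
final_tokens = twitter.get_authorized_tokens(oauth_verifier)

OAUTH_TOKEN = final_tokens['oauth_token']
OAUTH_TOKEN_SECRET = final_tokens['oauth_token_secret']

print OAUTH_TOKEN
print OAUTH_TOKEN_SECRET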

Now, we have all four keys (Consumer Key, Consumer Secret, Access Token, Access Token Secret) required to query and update Twitter. 

Note that it is important to keep these keys secret! It is good practice to store them in a file on your local machine and read them from that file. Here we store them in a text file in JSON format:

```
{
"APP_KEY":"enter value",
"APP_SECRET":"enter value",
"OAUTH_TOKEN":"enter value",
"OAUTH_TOKEN_SECRET":"enter value"
}
```

In [2]:
import io
import json

config_filepath = 'twitter_credentials.txt'
with io.open(config_filepath) as cred:
    credentials = json.load(cred)

CONSUMER_KEY = credentials['APP_KEY']
CONSUMER_SECRET = credentials['APP_SECRET']
OAUTH_TOKEN = credentials['OAUTH_TOKEN']
OAUTH_TOKEN_SECRET = credentials['OAUTH_TOKEN_SECRET']
twitter = Twython(CONSUMER_KEY, CONSUMER_SECRET, OAUTH_TOKEN, OAUTH_TOKEN_SECRET)
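
A missing or empty key in the credentials file only shows up later as an authentication error, so it can help to check the file when loading it. The sketch below is one possible way to do that. The helper name `load_credentials` and the `REQUIRED_KEYS` tuple are assumptions made for this example, and the demonstration writes a throwaway file with placeholder values rather than real keys.

```python
import json
import os
import tempfile

REQUIRED_KEYS = ('APP_KEY', 'APP_SECRET', 'OAUTH_TOKEN', 'OAUTH_TOKEN_SECRET')


def load_credentials(path):
    """Load the JSON credentials file and fail early if a key is missing or empty."""
    with open(path) as cred_file:
        credentials = json.load(cred_file)
    missing = [key for key in REQUIRED_KEYS if not credentials.get(key)]
    if missing:
        raise ValueError('Missing credentials: %s' % ', '.join(missing))
    return credentials


# Demonstration with a temporary file holding placeholder values.
sample = dict((key, 'enter value') for key in REQUIRED_KEYS)
with tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False) as tmp:
    json.dump(sample, tmp)
try:
    creds = load_credentials(tmp.name)
    print(sorted(creds))
finally:
    os.remove(tmp.name)
```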


## Note

For this tutorial, I have created a dummy account on Twitter with the handle @pds_tutorial (follow me!). All subsequent fetches and updates have been performed through that account.

<img src="homescreen.jpg" width="800" height="800">

## 5. Users

Users can be people or organizations. The user object is the basic object to which every Twitter account's identity is linked. A user tweets, follows other users, and has a timeline, among other things. A user is identified by two properties: a unique 64-bit integer `id` and a string `screen_name`, which is the Twitter handle. A user also has other properties such as `friends_count`, `followers_count`, `location`, `time_zone`, and `profile_image_url`.

### 5.1 Checking the list of users we follow

Let us begin by first checking out the list of users we are following. Here is the graphical representation of the list for your reference:

<img src="friends_list.jpg" width="800" height="800">

In [23]:
friends = twitter.get_friends_list()
for friend in friends['users']:
    print (friend['screen_name'], friend['id'])

(u'iheartpgh', 7384442)
(u'billpeduto', 23541684)
(u'Dejan_Kovacevic', 51674513)
(u'MarkMaddenX', 382815761)
(u'AntonioBrown', 224221596)
(u'CityPGH', 168774634)
(u'PittsburghPG', 21780652)
(u'DowntownPitt', 32445738)
(u'CBSPittsburgh', 14085099)
(u'fsmikey', 15906474)
(u'penguins', 15020865)
(u'Pirates', 37947138)
(u'PittsburghMag', 17543584)
(u'tpolamalu', 50532092)
(u'Pittsburgh_Dad', 394826630)
(u'steelers', 19426729)
(u'malkin71_', 369506120)
(u'NeilWalker18', 295422142)
(u'ROOTSPORTSPIT', 20613068)
(u'TheCUTCH22', 66209470)
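
A single call returns only one page of users. The Twitter API pages through such lists with cursors: each response carries a `next_cursor` value, and a value of 0 marks the last page. The sketch below shows that loop under stated assumptions. `iterate_users`, `fake_fetch`, and the `FAKE_PAGES` data are hypothetical stand-ins so the example runs offline; nothing here contacts Twitter.

```python
def iterate_users(fetch_page):
    """Yield users from every page of a cursored endpoint.

    `fetch_page` is any callable that accepts a `cursor` keyword argument and
    returns a dict with 'users' and 'next_cursor' keys.
    """
    cursor = -1  # -1 requests the first page
    while cursor != 0:  # a next_cursor of 0 marks the last page
        page = fetch_page(cursor=cursor)
        for user in page['users']:
            yield user
        cursor = page['next_cursor']


# Stand-in data so the sketch runs without network access.
FAKE_PAGES = {
    -1: {'users': [{'screen_name': 'iheartpgh'}, {'screen_name': 'billpeduto'}],
         'next_cursor': 42},
    42: {'users': [{'screen_name': 'steelers'}], 'next_cursor': 0},
}


def fake_fetch(cursor):
    return FAKE_PAGES[cursor]


names = [user['screen_name'] for user in iterate_users(fake_fetch)]
print(names)
```

With a real client, passing `twitter.get_friends_list` as `fetch_page` should follow the same pattern, subject to the rate limits described earlier.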


### 5.2 Follow someone programmatically

Twython allows you to follow new people by writing code! This is done with the `create_friendship` method, passing either the `id` or the `screen_name` of the user you wish to follow. Let us try it out by following Hillary Clinton (@HillaryClinton). The `follow` parameter enables notifications from that user. On success, the method returns the user object of the person you are now following.

In [25]:
new_friend = twitter.create_friendship(screen_name = 'HillaryClinton', follow = True)
print new_friend


{u'follow_request_sent': False, u'has_extended_profile': True, u'profile_use_background_image': False, u'default_profile_image': False, u'id': 1339835893, u'profile_background_image_url_https': u'https://abs.twimg.com/images/themes/theme1/bg.png', u'verified': True, u'translator_type': u'none', u'profile_text_color': u'000000', u'muting': False, u'profile_image_url_https': u'https://pbs.twimg.com/profile_images/750300510264107008/G8-PA5KA_normal.jpg', u'profile_sidebar_fill_color': u'000000', u'entities': {u'url': {u'urls': [{u'url': u'https://t.co/xhPHAcvdoc', u'indices': [0, 23], u'expanded_url': u'http://HillaryClinton.com', u'display_url': u'HillaryClinton.com'}]}, u'description': {u'urls': []}}, u'followers_count': 10084829, u'profile_sidebar_border_color': u'000000', u'id_str': u'1339835893', u'profile_background_color': u'0057B8', u'listed_count': 33081, u'status': {u'contributors': None, u'truncated': False, u'text': u'"The thought of Donald Trump with nuclear weapons scares me

It is a success:

<img src="following_clinton.jpg" width="800" height="800">

However, the returned user object appears pretty large and messy. This is because there are multiple layers in the returned object: dictionaries within dictionaries within dictionaries! Other objects such as the `media` entity can also be present. Wading through this complexity is a challenge with the Twitter API. We access individual values by key, e.g.:

In [26]:
print new_friend['followers_count']

10084829


In [28]:
print new_friend['description']

Wife, mom, grandma, women+kids advocate, FLOTUS, Senator, SecState, hair icon, pantsuit aficionado, 2016 presidential candidate. Tweets from Hillary signed –H
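
Deeper values need a chain of keys and list indices, and any missing level raises an error. The sketch below shows one way to make such lookups safer. It is not part of Twython: `dig` is a hypothetical helper, and the `user` dictionary is a hand-trimmed copy of a few fields from the output shown earlier.

```python
def dig(obj, path, default=None):
    """Follow a sequence of keys and indices into nested dicts and lists."""
    for key in path:
        try:
            obj = obj[key]
        except (KeyError, IndexError, TypeError):
            return default
    return obj


# A trimmed copy of the user object returned above.
user = {
    'followers_count': 10084829,
    'entities': {
        'url': {'urls': [{'expanded_url': 'http://HillaryClinton.com',
                          'display_url': 'HillaryClinton.com'}]},
        'description': {'urls': []},
    },
}

# An existing path returns the stored value.
print(dig(user, ['entities', 'url', 'urls', 0, 'expanded_url']))
# A path that runs into an empty list falls back to the default.
print(dig(user, ['entities', 'description', 'urls', 0, 'expanded_url'], 'n/a'))
```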


## 6. Tweets

Tweets are the basic units of messaging. They can be favourited, retweeted, or deleted. Tweet objects have an `id` for identification and other properties such as `created_at`, `lang`, and `entities` (which holds `hashtags`, among others). One interesting property is `withheld_in_countries`, which lists the countries where the Tweet should not be shown.


### 6.1 Get Timeline

Let us understand Tweets by looking at our timeline. To understand better, here is how the timeline actually looks graphically: 

<img src="timeline.jpg" width="800" height="800">

In [15]:
twitter = Twython(CONSUMER_KEY, CONSUMER_SECRET, OAUTH_TOKEN, OAUTH_TOKEN_SECRET)

timeline = twitter.get_home_timeline()

print timeline



[{u'contributors': None, u'truncated': False, u'text': u'RT @WPXIVarnum: A local county changing policies because of heroin overdoses @WPXI at 11. https://t.co/X6WoCObX4l', u'is_quote_status': False, u'in_reply_to_status_id': None, u'id': 793277401342046209L, u'favorite_count': 0, u'source': u'<a href="http://twitter.com" rel="nofollow">Twitter Web Client</a>', u'retweeted': False, u'coordinates': None, u'entities': {u'symbols': [], u'user_mentions': [{u'id': 2398393345L, u'indices': [3, 14], u'id_str': u'2398393345', u'screen_name': u'WPXIVarnum', u'name': u'Catherine Varnum'}, {u'id': 14085146, u'indices': [77, 82], u'id_str': u'14085146', u'screen_name': u'WPXI', u'name': u'WPXI'}], u'hashtags': [], u'urls': [], u'media': [{u'source_user_id': 2398393345L, u'source_status_id_str': u'793277143685922816', u'expanded_url': u'https://twitter.com/WPXIVarnum/status/793277143685922816/video/1', u'display_url': u'pic.twitter.com/X6WoCObX4l', u'url': u'https://t.co/X6WoCObX4l', u'media_url_ht

As observed before, the returned object is very large and complex, consisting of multiple tweets with multiple media entities within them. Here are some extracted values in simple form:

In [30]:
print timeline[0]['text']

RT @WPXIVarnum: A local county changing policies because of heroin overdoses @WPXI at 11. https://t.co/X6WoCObX4l


In [31]:
print timeline[0]['retweet_count']

1


In [32]:
print timeline[0]['id']

793277401342046209


### 6.2 Fetching Tweets

We now move on to one of the most useful features of the Twitter API for data mining: search. The `search` method returns tweets that match the query passed as `q`. Note that the standard search endpoint covers a sample of recent tweets rather than the full archive. `count` restricts the number of results (the maximum is 100). Additional parameters that can be passed include `geocode`, `lang`, and `until`. The returned tweets are included under the key 'statuses'.

Let's see what people are saying about the new MacBooks.

In [37]:
search_results = twitter.search(q='macbook', count=30, lang = 'en')
for tweet in search_results['statuses']:
    print 'Tweet from @%s Date: %s' % (tweet['user']['screen_name'],
                                       tweet['created_at'])
    print tweet['text']

Tweet from @edwarddesroche1 Date: Tue Nov 01 06:33:44 +0000 2016
I liked a @YouTube video from @ncixtechtips https://t.co/1qLACQIJOQ Radeon Pro 450, 455, 460 Specs, EA blocks Origin, Macbook Pro
Tweet from @rivernelsonmua Date: Tue Nov 01 06:33:39 +0000 2016
I entered @grav3yardgirl HALLOWEEN MACBOOK GIVEAWAY! CHECK IT OUT HERE! https://t.co/SqNxT6rkbi #grav3yardgirlgiveaway hdxhexhehxej
Tweet from @Geovannie Date: Tue Nov 01 06:33:36 +0000 2016
@keaton let me have a MacBook bro
Tweet from @blkentourageclt Date: Tue Nov 01 06:33:31 +0000 2016
#forbes The New MacBook's Incompatibility With The iPhone 7 Is Absolutely Absurd https://t.co/JwEoHfes5w
Tweet from @elfsweet Date: Tue Nov 01 06:33:27 +0000 2016
I entered @grav3yardgirl HALLOWEEN MACBOOK GIVEAWAY! CHECK IT OUT HERE! https://t.co/DbjUkIHyEH #grav3yardgirlgiveaway
Tweet from @elfsweet Date: Tue Nov 01 06:33:21 +0000 2016
i entered @grav3yardgirl HALLOWEEN MACBOOK GIVEAWAY! CHECK IT OUT HERE! https://t.co/DbjUkIHyEH
Tweet from @riv
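
The `created_at` values printed above are plain strings, so sorting or bucketing tweets by time requires parsing them first. The sketch below shows one way to do that with the standard library. It assumes the offset is always the literal `+0000` (UTC), as in every timestamp shown in this tutorial; `parse_created_at` and `CREATED_AT_FORMAT` are names made up for this example.

```python
from datetime import datetime

# Matches strings such as 'Tue Nov 01 06:33:44 +0000 2016'.
CREATED_AT_FORMAT = '%a %b %d %H:%M:%S +0000 %Y'


def parse_created_at(value):
    """Convert a `created_at` string into a naive datetime in UTC."""
    return datetime.strptime(value, CREATED_AT_FORMAT)


first = parse_created_at('Tue Nov 01 06:33:44 +0000 2016')
last = parse_created_at('Tue Nov 01 06:33:21 +0000 2016')
print(first.isoformat())
# Seconds elapsed between the two tweets.
print((first - last).total_seconds())
```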

## 7. Entities

Entities are structures that hold additional information about a Tweet or a User. Common entities within a Tweet include `hashtags`, `urls`, `media`, and `user_mentions`. All URLs in a Tweet are exposed through its entities.



### 7.1 Entities on the timeline

Here are the entities present in the first Tweet on our timeline:

In [38]:
print timeline[0]['entities']

{u'symbols': [], u'user_mentions': [{u'id': 2398393345L, u'indices': [3, 14], u'id_str': u'2398393345', u'screen_name': u'WPXIVarnum', u'name': u'Catherine Varnum'}, {u'id': 14085146, u'indices': [77, 82], u'id_str': u'14085146', u'screen_name': u'WPXI', u'name': u'WPXI'}], u'hashtags': [], u'urls': [], u'media': [{u'source_user_id': 2398393345L, u'source_status_id_str': u'793277143685922816', u'expanded_url': u'https://twitter.com/WPXIVarnum/status/793277143685922816/video/1', u'display_url': u'pic.twitter.com/X6WoCObX4l', u'url': u'https://t.co/X6WoCObX4l', u'media_url_https': u'https://pbs.twimg.com/ext_tw_video_thumb/793277110207078400/pu/img/AApQqIbiCD9_Eurm.jpg', u'source_user_id_str': u'2398393345', u'source_status_id': 793277143685922816L, u'id_str': u'793277110207078400', u'sizes': {u'small': {u'h': 340, u'resize': u'fit', u'w': 340}, u'large': {u'h': 640, u'resize': u'fit', u'w': 640}, u'medium': {u'h': 600, u'resize': u'fit', u'w': 600}, u'thumb': {u'h': 150, u'resize': u'cr
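
The raw entities dictionary is hard to read, so a common first step is to pull out just the names and links. The sketch below is a minimal illustration using a hand-trimmed copy of the entities shown above. `summarize_entities` is a hypothetical helper, and it also shows that each entity's `indices` pair marks where the entity sits in the tweet text.

```python
def summarize_entities(entities):
    """Return screen names, hashtag texts, and expanded URLs from an entities dict."""
    return {
        'mentions': [m['screen_name'] for m in entities.get('user_mentions', [])],
        'hashtags': [h['text'] for h in entities.get('hashtags', [])],
        'urls': [u['expanded_url'] for u in entities.get('urls', [])],
        'media': [m['expanded_url'] for m in entities.get('media', [])],
    }


text = ('RT @WPXIVarnum: A local county changing policies because of heroin '
        'overdoses @WPXI at 11. https://t.co/X6WoCObX4l')
entities = {
    'symbols': [],
    'user_mentions': [
        {'id': 2398393345, 'indices': [3, 14], 'screen_name': 'WPXIVarnum'},
        {'id': 14085146, 'indices': [77, 82], 'screen_name': 'WPXI'},
    ],
    'hashtags': [],
    'urls': [],
    'media': [{'expanded_url':
               'https://twitter.com/WPXIVarnum/status/793277143685922816/video/1'}],
}

summary = summarize_entities(entities)
print(summary['mentions'])
print(summary['media'])

# The indices give the start and end offsets of each mention in the text.
for mention in entities['user_mentions']:
    start, end = mention['indices']
    print(text[start:end])
```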

### 7.2 Updating status with media entity

Now we will attempt something cool: updating our status with a media entity (a photo of our beloved CMU in this case). For that, we first upload the media file and then refer to it in the `update_status` method by its `media_id`. On success, the method returns the updated status as a Tweet.

In [41]:
with open('cmu.jpg', 'rb') as photo:
    media_response = twitter.upload_media(media=photo)
mid = media_response['media_id']
new_media_status = twitter.update_status(status='Uploading image using Twython', media_ids=[mid])
print new_media_status['entities']['media']

[{u'expanded_url': u'https://twitter.com/pds_tutorial/status/793348009425670144/photo/1', u'display_url': u'pic.twitter.com/eI82qIM2wt', u'url': u'https://t.co/eI82qIM2wt', u'media_url_https': u'https://pbs.twimg.com/media/CwKJzqUWcAAyCQ_.jpg', u'id_str': u'793348004996608000', u'sizes': {u'small': {u'h': 259, u'resize': u'fit', u'w': 680}, u'large': {u'h': 547, u'resize': u'fit', u'w': 1437}, u'medium': {u'h': 457, u'resize': u'fit', u'w': 1200}, u'thumb': {u'h': 150, u'resize': u'crop', u'w': 150}}, u'indices': [30, 53], u'type': u'photo', u'id': 793348004996608000L, u'media_url': u'http://pbs.twimg.com/media/CwKJzqUWcAAyCQ_.jpg'}]


And it was a success!

<img src="media_tweet.jpg" width="800" height="800">

## 8. Places

Locations on Twitter are in the form of Places. Each place has a unique `id` allotted to it. It also has the properties `country` and `bounding_box`; the latter is a list of GPS coordinates representing the polygon that bounds the region.
 

### 8.1 Neighborhood search

Here we perform a search of all neighbourhoods in Pittsburgh. The `query` parameter can also include landmarks such as 'Eiffel Tower' or 'Twitter HQ'. `granularity` denotes the level of detail of the results; it can be 'country', 'city' or 'neighborhood'. Individual `lat`, `long` and `accuracy` values can also be passed to `search_geo`.

In [47]:
neighborhoods = twitter.search_geo(query='Pittsburgh', granularity = 'neighborhood')
print len(neighborhoods['result']['places'])

3


In [48]:
for neighborhood in neighborhoods['result']['places']:
    print neighborhood['bounding_box']

{u'type': u'Polygon', u'coordinates': [[[-80.095586, 40.3615796], [-80.095586, 40.501198], [-79.8657933, 40.501198], [-79.8657933, 40.3615796], [-80.095586, 40.3615796]]]}
{u'type': u'Polygon', u'coordinates': [[[-84.4082574, 33.719334], [-84.4082574, 33.7447277], [-84.3932187, 33.7447277], [-84.3932187, 33.719334], [-84.4082574, 33.719334]]]}
{u'type': u'Polygon', u'coordinates': [[[-79.845093, 40.390824], [-79.845093, 40.405892], [-79.829844, 40.405892], [-79.829844, 40.390824], [-79.845093, 40.390824]]]}
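
A bounding box is often reduced to a single point, for example to plot the place on a map. The sketch below computes the center of the first box printed above. `bounding_box_center` is a hypothetical helper, and the code assumes each coordinate pair is ordered as [longitude, latitude], which matches the values shown for Pittsburgh.

```python
def bounding_box_center(bounding_box):
    """Return (latitude, longitude) of the center of a Polygon bounding box."""
    ring = bounding_box['coordinates'][0]  # outer ring of [longitude, latitude] pairs
    lons = [point[0] for point in ring]
    lats = [point[1] for point in ring]
    return ((min(lats) + max(lats)) / 2.0, (min(lons) + max(lons)) / 2.0)


# First bounding box from the output above.
box = {'type': 'Polygon',
       'coordinates': [[[-80.095586, 40.3615796], [-80.095586, 40.501198],
                        [-79.8657933, 40.501198], [-79.8657933, 40.3615796],
                        [-80.095586, 40.3615796]]]}
lat, lon = bounding_box_center(box)
print('%.4f, %.4f' % (lat, lon))
```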


## 9. Streaming API

The Twitter Streaming API is for getting a continuous stream of tweets in real time. The rate limits of the REST APIs do not apply to the Streaming API. As long as the connection with Twitter is open, we keep receiving tweets of interest as they are posted. This makes it a very powerful tool for data mining.

In Twython, the `TwythonStreamer` class is an abstraction of the Streaming API. It has to be instantiated using the four keys we generated before. The returned data (a Tweet object) is handled through two callbacks, `on_success` and `on_error`. In most cases, we define our own streamer class and specify in these two handlers how the data should be processed. Here is the code implementing a custom streamer class:

In [65]:
from twython import TwythonStreamer

class MyStreamer(TwythonStreamer):
    # To restrict the input size, we keep a counter of received messages
    count = 0

    def on_success(self, inp_data):
        self.count += 1
        if 'text' in inp_data:
            print self.count, inp_data['text']
        # Disconnect after receiving 200 messages
        if self.count == 200:
            self.disconnect()

    def on_error(self, status_code, inp_data):
        print status_code
        self.disconnect()

Some important points to keep in mind while streaming: at any time, one user can have only one streaming connection open. In case of network disturbances during streaming, the client can reconnect immediately once the network is back. However, if the connection is refused for reasons such as repeated attempts with invalid credentials (all of which return an HTTP `status_code` of 400 or above), there should be a cool-off period before attempting to reconnect.
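
One common way to implement such a cool-off period is exponential backoff, where the wait doubles after every failed attempt up to a ceiling. The sketch below generates such a schedule. The function name `backoff_delays` and the default values (start at 5 seconds, double each time, cap at 320 seconds) are assumptions chosen for illustration, so check Twitter's streaming documentation for the values it currently recommends.

```python
def backoff_delays(initial=5, factor=2, maximum=320):
    """Yield reconnect delays in seconds, growing by `factor` up to `maximum`."""
    delay = initial
    while True:
        yield min(delay, maximum)
        delay = min(delay * factor, maximum)


delays = backoff_delays()
print([next(delays) for _ in range(8)])
# → [5, 10, 20, 40, 80, 160, 320, 320]
```

In a reconnect loop, the client would sleep for `next(delays)` seconds after each failed attempt and create a fresh generator once a connection succeeds.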

TwythonStreamer has 3 ways to stream tweets:
- filter: return tweets with specific keywords
- sample: return a sample of all recent tweets
- firehose: return all recent tweets

Let's see what people around the world are saying about 'weather':

In [66]:
streamer = MyStreamer(CONSUMER_KEY, CONSUMER_SECRET,
                    OAUTH_TOKEN, OAUTH_TOKEN_SECRET)
streamer.statuses.filter(track='weather')

1 current weather in Bari: clear sky, 14°C
76% humidity, wind 3kmh, pressure 1020mb
2 RT @AEXEQB: $AEXE WE FOUND THE REASON. INDIA NEEDS MORE COAL DUE TO BAD WEATHER AFFECTING PRODUCTION https://t.co/pidhBoUC40… 
3 The weather is on point these days👌🏼 #pleasedontchange
4 RT @marqREXx: Everything is hard pag ganito yung weather:
Hard to get up
Hard to make food
Hard to take a shower
And hard 😂
5 Fun#driving#food#weather#yaariyan#gettogether#traffic :-p :-) — travelling to Dehradun : The City of Love from... https://t.co/95XY1cVjs8
6 Nakakatamad ang weather.
How can i #TIMYGetGoing ?
7 Of His return,Jesus spoke of weather: "There will be...dismay among nations,in perplexity at the roaring of the sea and the waves." Lk.21:25
8 Mostly Sunny tomorrow (Hi 79F | Lo 61F) -- enjoy the weather everyone!
9 RT @amber_jensen1: it's november and i'm still waiting on that no sun and cold weather https://t.co/5HQO3dl7cS
10 Fun#driving#food#weather#yaariyan#gettogether#traffic :-p :-) — travelling to D

## 10. Application: Realtime Sentiment Analysis from Twitter Data

We will now apply what we learnt in a real-world application: sentiment analysis. Say you are a high-level executive at Samsung, whose image has recently been marred by the issue of exploding phones (yikes!). A massive recall has been issued and the company's reputation has taken a hit. You want to find out what people across the world are saying about your company by analysing the sentiment of tweets. We do this in a three-step process: searching for relevant tweets, processing the tweets, and performing the analysis. The functions are defined below with explanations in inline comments:

In [13]:
import string
import re
import unicodedata
import sys
import vaderSentiment

In [3]:
def search_tweets(keyword):
    """ Mines Twitter for tweets with keyword provided
    Inputs:
        keyword: str: keyword to search for tweets
       
    Outputs:
        list(str): text of all the fetched tweets
    """
    twt_texts = []
    twitter = Twython(CONSUMER_KEY, CONSUMER_SECRET, OAUTH_TOKEN, OAUTH_TOKEN_SECRET)
    
    #We limit to 90 tweets for simplicity. Streaming API can be used for mining more data. 
    tweet_results = twitter.search(q=keyword, count = 90, lang = 'en')
     
    for twt in tweet_results['statuses']:
        twt_texts.append(twt['text'])
        
    return twt_texts

In [8]:
def clean_and_process(inp_tweets):
    """ Cleans and processes the tweets
    Inputs:
        inp_tweets: list(str): text of all the fetched tweets
       
    Outputs:
        list(str): cleaned and processed forms of the tweets
    """
    processed_twts = []
    
    for twt in inp_tweets:
        #Convert to lowercase
        twt = twt.lower()
        
        #remove punctuations
        regex = re.compile('[%s]' % re.escape(string.punctuation))
        twt1 = regex.sub('', twt)
        
        #remove weblinks (usually start with https). Also encode each word in utf-8 format. This is because certain
        #tweets contain characters which both python's default encoder and the sentiment analyser can't understand.
        #if encode is not used, it will cause error in the program
        words = twt1.split()
        sent = ''
        for wrd in words:
            if 'https' not in wrd:
                sent += wrd.encode('utf-8') + ' ' 
        
            
        processed_twts.append(sent)
    return processed_twts
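
As an alternative sketch of the same cleaning idea, the version below removes links with a regular expression before stripping punctuation, so that both `http` and `https` links are dropped as a whole. It is not the function used in the rest of this tutorial: `clean_tweet`, `URL_PATTERN`, and `PUNCTUATION_PATTERN` are names made up for this example, and it works on text strings without the `encode` step.

```python
import re
import string

URL_PATTERN = re.compile(r'https?://\S+')
PUNCTUATION_PATTERN = re.compile('[%s]' % re.escape(string.punctuation))


def clean_tweet(text):
    """Lowercase a tweet, drop links, strip punctuation, and collapse whitespace."""
    text = text.lower()
    text = URL_PATTERN.sub('', text)          # remove links before punctuation
    text = PUNCTUATION_PATTERN.sub('', text)  # remove ASCII punctuation
    return ' '.join(text.split())


print(clean_tweet('@keaton let me have a MacBook bro'))
print(clean_tweet("The New MacBook's Incompatibility Is Absurd https://t.co/JwEoHfes5w"))
```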

In [9]:
#Please make sure vaderSentiment (https://github.com/cjhutto/vaderSentiment) is installed on your system.
#Use pip install vaderSentiment
#Note: this import matches the vaderSentiment release used when this tutorial was written;
#newer releases expose SentimentIntensityAnalyzer().polarity_scores(text) instead.

from vaderSentiment.vaderSentiment import sentiment as vaderSentiment

def perform_sentiment_analysis(processed_tweet):
    """ Performs sentiment analysis
    Inputs:
        processed_tweet: str: single processed tweet

    Outputs:
        dictionary of sentiment values: {'neg': val, 'neu': val, 'pos': val, 'compound': val}
    """
    scores = vaderSentiment(processed_tweet)

    return scores
        

In [10]:
#Apply the above functions for Samsung

samsung_tweets = search_tweets('Samsung')

samsung_processed_tweets = clean_and_process(samsung_tweets)

number_tweets = len(samsung_processed_tweets)
for i in xrange(number_tweets):
    print samsung_processed_tweets[i]
    s = perform_sentiment_analysis(samsung_processed_tweets[i])
    print s
    
#Important Note:

#Though the code runs correctly, the output won't appear here in the notebook, but in the terminal.
#This appears to be an issue with Jupyter Notebook. When .encode('utf-8') is used, this output and all future outputs of cells get
#transferred to the terminal. To get back to the original state, you need to restart the software.
#Hence, here I provide screenshots of the output on my terminal:

### Output:

<img src="samsung1.jpg" width="800" height="800">



<img src="samsung2.jpg" width="800" height="800">
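
The per-tweet score dictionaries are easier to act on once they are reduced to an overall picture. The sketch below labels each tweet by its `compound` score and counts the labels. The helper `label_sentiment`, the threshold of 0.05, and the `example_scores` list are assumptions made for this example; the scores are made-up values in the same shape as the dictionaries returned by `perform_sentiment_analysis`, not real results.

```python
from collections import Counter


def label_sentiment(scores, threshold=0.05):
    """Map a score dictionary to 'positive', 'negative', or 'neutral'."""
    compound = scores['compound']
    if compound >= threshold:
        return 'positive'
    if compound <= -threshold:
        return 'negative'
    return 'neutral'


# Made-up score dictionaries for illustration only.
example_scores = [
    {'neg': 0.0, 'neu': 0.6, 'pos': 0.4, 'compound': 0.62},
    {'neg': 0.5, 'neu': 0.5, 'pos': 0.0, 'compound': -0.48},
    {'neg': 0.0, 'neu': 1.0, 'pos': 0.0, 'compound': 0.0},
]

summary = Counter(label_sentiment(scores) for scores in example_scores)
print(dict(summary))
```

Applied to the Samsung tweets, the share of 'negative' labels would give a rough indicator of how badly the company's reputation has been hit.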

## 11. Summary and Resources

This tutorial covered the basics of Twitter data mining and account access, along with an example application. If you are interested in exploring further, check out the following links:

1. Twython Developer interface: https://twython.readthedocs.io/en/latest/api.html#
2. Twitter Developer Docs: https://dev.twitter.com/docs
3. Twitter Streaming API: https://dev.twitter.com/streaming/public
4. OAuth on Twitter: https://dev.twitter.com/oauth
5. Sentiment Analysis with Vader: https://github.com/cjhutto/vaderSentiment