# LAB: Intro to the Twitter API


## Twitter API Credentials

To access Twitter programmatically, we need to log in to the Twitter API. 

Think of this as something similar to logging into a website like MyUSF. 

There, you enter your credentials: your user name (which is public information) and your password (which only you know). 

In this case, instead of entering the credentials by hand, we will be using code.

The other big difference is that, since Twitter uses a standard called [OAuth](https://en.wikipedia.org/wiki/OAuth) to manage user authentication, you will need four credentials instead of two. These are:

1. the _app key_, also called the _consumer key_, 
2. the _secret_ associated with the key, also called the _app key secret_ or _consumer key secret_, 
3. the _access token_, and 
4. the _access token secret_.

The two secrets (#2 and #4) are like passwords: you should never share them. 

Where do we find these? We need to generate them. To generate them, we first need to create an app. 

## Creating an App

### 1. Go to [developer.twitter.com](//developer.twitter.com/)
![00.png](attachment:00.png)

### 2. Click on `Developer Portal`
![01.png](attachment:01.png)

### 3. Click on `Projects & Apps`
![02.png](attachment:02.png)

### 4. Click on `Overview`
![03.png](attachment:03.png)

### 5. Scroll down
![04.png](attachment:04.png)

### 6. Click on `Create App`
![05.png](attachment:05.png)

### 7. Enter `CourseApp-<YOUR USF ID>`
![06.png](attachment:06.png)

### 8. Copy the API key
![07.png](attachment:07.png)

### 9. In Jupyter, go to New -> Text File, write `key:` and paste the API key (repeat with `key_secret:` for the API secret key)
![08.png](attachment:08.png)

### 10. Return to the App page and click on `get user tokens`
![09.png](attachment:09.png)

### 11. Click on `Keys and tokens`
![10.png](attachment:10.png)

### 12. Click on `Generate`
![11.png](attachment:11.png)

### 13. Copy the Access token and the Access token secret into the same text file (as `token:` and `token_secret:`). Done!

## Installing Tweepy

To simplify network access to the API, we will use the [Tweepy package](https://www.tweepy.org/). 

You can install it into Anaconda with
```
conda install -c conda-forge tweepy
```
or, if you use [pip](https://pypi.python.org/pypi/pip), with 
```
pip install tweepy
```
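
The code in this lab appears to target Tweepy 3.x: it uses `api.search` and `TweepError`, both of which were renamed in Tweepy 4. It can therefore help to check which version was installed. A minimal sketch using only the standard library (Python 3.8 or later); `installed_version` is a helper name made up for this example:

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of `package`, or None if it is missing."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Prints the version string, or None if Tweepy is not installed
print("tweepy version:", installed_version("tweepy"))
```

If the reported version starts with `4`, some of the method names used below may need to be adapted.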

## Getting started with Tweepy

Since we stored the credentials in a text file, we need to load them into a dictionary for easier access. Let's write a bit of code that reads the credentials file (here `guest_credentials.txt`) and stores each entry in a dictionary called `cred`.

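In [1]:
""" Read the credentials from guest_credentials.txt and place them into the `cred` dictionary """
cred = {}

# Each non-empty line of the file has the form `name:value`
with open("./guest_credentials.txt") as f:
    for line in f:
        if not line.strip():
            continue  # skip blank lines
        # Split on the first colon only, in case a value contains one
        name, value = line.split(":", 1)
        cred[name.strip()] = value.strip()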

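Now print the dictionary to check that all entries were read (remember not to share this output, since it contains your secrets):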

In [2]:
print(cred)

{'key': '4kbHg9p4pLvdT7bLN6IR6mQX5', 'key_secret': 'CLPt7Mh9QVFkCdIpGqCZ478XDQWm13p79JA0dXolYFwKlG7KrF', 'token': '4659328459-Gzobrvp1Qbde5yvYAbVSVD2Ix1oKDTAk0HDl8LR', 'token_secret': 'Lss3BMNfIc2IZLykdAo87HQU6SH66ejcWBf8ltPrmqVWK', 'Bearer_token': 'AAAAAAAAAAAAAAAAAAAAAFrpHwEAAAAAz7ElEcg3kGoVdwZswAzExQsQTB0%3DpgbAH6uy2b00HxzabzCnBwVqDJStA5e7cnTThtd9oST8QNmIJe'}
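
The loading cell assumes one `name:value` pair per line, with the four OAuth names seen in the output above. The sketch below parses such text from a string, checks that the four entries are present, and masks the values before printing so that secrets do not end up in the notebook output. The helper names `parse_credentials` and `mask`, and the placeholder values, are invented for illustration.

```python
REQUIRED = ("key", "key_secret", "token", "token_secret")


def parse_credentials(text):
    """Parse `name:value` lines into a dictionary, ignoring blank lines."""
    parsed = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        # Split on the first colon only, in case a value contains one
        name, value = line.split(":", 1)
        parsed[name.strip()] = value.strip()
    missing = [name for name in REQUIRED if name not in parsed]
    if missing:
        raise ValueError(f"missing credentials: {missing}")
    return parsed


def mask(value, visible=4):
    """Keep the first `visible` characters and replace the rest with asterisks."""
    return value[:visible] + "*" * max(len(value) - visible, 0)


# Placeholder values, not real credentials
sample = """key: AAAA1111
key_secret: BBBB2222
token: CCCC3333
token_secret: DDDD4444
"""

sample_cred = parse_credentials(sample)
for name, value in sample_cred.items():
    print(name, mask(value))
```

In the notebook, the same `mask` helper could be applied to the values of `cred` instead of printing the dictionary directly.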


To authenticate, we need an object to handle the OAuth workflow. Tweepy provides a class called `tweepy.OAuthHandler` to do so. 

The constructor of the class takes two parameters, the app key and the app key secret:
```
auth = tweepy.OAuthHandler(<KEY>, <KEY SECRET>)
```

Once we have created an OAuth handler object, we can set the access token information:
```
auth.set_access_token(<TOKEN>, <TOKEN SECRET>)
```

Finally, we can create an instance of the class `tweepy.API`. This object provides all the methods needed to call the Twitter API.

The constructor of this class takes the OAuth handler we just created as a parameter:
```
api = tweepy.API(auth)
```
Note that creating this object does not contact Twitter yet: the credentials stored in the OAuth handler are sent with each request we later make through `api`. 

In [3]:
""" Create OAuth handler, assign it to API object """

import tweepy

auth = tweepy.OAuthHandler(cred["key"], cred["key_secret"])
auth.set_access_token(cred["token"], cred["token_secret"])
api = tweepy.API(auth)

Now print the `api` object:

In [4]:
print(api)

<tweepy.api.API object at 0x7f2d314a8208>
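
Printing the object only shows that it exists; it does not tell us whether Twitter accepts the credentials. One way to check is `api.verify_credentials()`, which requests the profile of the authenticated account. The sketch below wraps that call. `check_login` is a hypothetical helper, and it catches a generic `Exception` because the name of Tweepy's exception class depends on the installed version.

```python
def check_login(api):
    """Return the screen name of the authenticated account, or None if the check fails."""
    try:
        user = api.verify_credentials()
    except Exception as err:  # Tweepy's own exception class differs between versions
        print("Authentication failed:", err)
        return None
    if not user:
        # Some Tweepy versions return False instead of raising on bad credentials
        return None
    return user.screen_name


# In the notebook, after creating `api`:
#     print(check_login(api))
```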


## Retrieving the home timeline

If the above goes well, executing the cell below will show the tweets in your Twitter feed (your home timeline).

In [5]:
""" Print the tweets in your Twitter feed """

public_tweets = api.home_timeline()

print(80 * "-")
for tweet in public_tweets:
    print(tweet.text)
    print(80 * "-")

--------------------------------------------------------------------------------
Watch: BJP workers protest demanding re-opening of Shirdi Sai Baba temple.

Kajal Iyer with details. https://t.co/QrhH1MD9Gb
--------------------------------------------------------------------------------
#TamilNadu: The six accused, which includes a 75-year-old man, were arrested under various sections of the POCSO Ac… https://t.co/XD3Ho4vyYD
--------------------------------------------------------------------------------
Amazon, Flipkart sale: 15 things not to miss while buying a TV, AC, fridge and washing machine… https://t.co/cIOIvaZZIO
--------------------------------------------------------------------------------
RT @IndiaTodayTech: LG Velvet price in India, pre-order details leaked ahead of launch
 https://t.co/nCtxkTJeJM
--------------------------------------------------------------------------------
RT @ITGDsports: "It was a special knock. We got 195 purely because of the genius of that (#ABDevi

## Inspecting a tweet

In the code above, `tweet.text` stores the text of the tweet. We can explore the other attributes of the object stored in the variable `tweet` with the `dir()` function, which returns a list with the names of all its attributes and methods.

In [7]:
dir(tweet)

['__class__',
 '__delattr__',
 '__dict__',
 '__dir__',
 '__doc__',
 '__eq__',
 '__format__',
 '__ge__',
 '__getattribute__',
 '__getstate__',
 '__gt__',
 '__hash__',
 '__init__',
 '__init_subclass__',
 '__le__',
 '__lt__',
 '__module__',
 '__ne__',
 '__new__',
 '__reduce__',
 '__reduce_ex__',
 '__repr__',
 '__setattr__',
 '__sizeof__',
 '__str__',
 '__subclasshook__',
 '__weakref__',
 '_api',
 '_json',
 'author',
 'contributors',
 'coordinates',
 'created_at',
 'destroy',
 'entities',
 'extended_entities',
 'favorite',
 'favorite_count',
 'favorited',
 'geo',
 'id',
 'id_str',
 'in_reply_to_screen_name',
 'in_reply_to_status_id',
 'in_reply_to_status_id_str',
 'in_reply_to_user_id',
 'in_reply_to_user_id_str',
 'is_quote_status',
 'lang',
 'parse',
 'parse_list',
 'place',
 'possibly_sensitive',
 'possibly_sensitive_appealable',
 'retweet',
 'retweet_count',
 'retweeted',
 'retweets',
 'source',
 'source_url',
 'text',
 'truncated',
 'user']
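
Most of the names at the top of that list are Python internals. A small sketch that keeps only the public names follows; `public_names` and the `FakeTweet` stand-in class are made up here so that the example runs without a Twitter connection.

```python
def public_names(obj):
    """Return the attribute and method names of `obj` that do not start with an underscore."""
    return [name for name in dir(obj) if not name.startswith("_")]


class FakeTweet:
    """Stand-in for a Tweepy status object, with a few of the attributes listed above."""

    def __init__(self):
        self.text = "hello"
        self.lang = "en"
        self.retweet_count = 0
        self._json = {"text": "hello"}  # private by convention, so it is filtered out


print(public_names(FakeTweet()))
```

In the notebook you would call `public_names(tweet)` on a real tweet instead.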

We can get the author of each tweet with `tweet.author`. This is another object, so we can explore its attributes and methods with the `dir()` function.

In [8]:
dir(tweet.author)

['__class__',
 '__delattr__',
 '__dict__',
 '__dir__',
 '__doc__',
 '__eq__',
 '__format__',
 '__ge__',
 '__getattribute__',
 '__getstate__',
 '__gt__',
 '__hash__',
 '__init__',
 '__init_subclass__',
 '__le__',
 '__lt__',
 '__module__',
 '__ne__',
 '__new__',
 '__reduce__',
 '__reduce_ex__',
 '__repr__',
 '__setattr__',
 '__sizeof__',
 '__str__',
 '__subclasshook__',
 '__weakref__',
 '_api',
 '_json',
 'contributors_enabled',
 'created_at',
 'default_profile',
 'default_profile_image',
 'description',
 'entities',
 'favourites_count',
 'follow',
 'follow_request_sent',
 'followers',
 'followers_count',
 'followers_ids',
 'following',
 'friends',
 'friends_count',
 'geo_enabled',
 'has_extended_profile',
 'id',
 'id_str',
 'is_translation_enabled',
 'is_translator',
 'lang',
 'listed_count',
 'lists',
 'lists_memberships',
 'lists_subscriptions',
 'location',
 'name',
 'notifications',
 'parse',
 'parse_list',
 'profile_background_color',
 'profile_background_image_url',
 'profile_back

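Now let's modify the code above. Besides the text of each tweet, let's also print the name of its author. Note that this cell calls `api.user_timeline()`, which returns the tweets posted by the authenticated account rather than its home feed, so every tweet it prints has the same author:

In [9]:
""" Print the author and text of the tweets posted by your account """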

public_tweets = api.user_timeline()

print(80 * "-")
for tweet in public_tweets:
    print(tweet.author.name)
    print(tweet.text)
    print(80 * "-")

--------------------------------------------------------------------------------
Saumya Bhadani
RT @BrendanNyhan: New: Political audience diversity &amp; news reliability in algorithmic ranking 
https://t.co/kxWwh5rZKg

-News websites w/le…
--------------------------------------------------------------------------------
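
On a feed with several authors, such as the home timeline, the same attribute can be used to count how many tweets each account contributed. A sketch with `collections.Counter` follows; the `SimpleNamespace` objects are stand-ins for Tweepy statuses so that the example runs offline, and `count_authors` is a hypothetical helper.

```python
from collections import Counter
from types import SimpleNamespace


def count_authors(tweets):
    """Count tweets per author name."""
    return Counter(tweet.author.name for tweet in tweets)


# Fake tweets with the same `author.name` layout as the real objects
fake_tweets = [
    SimpleNamespace(author=SimpleNamespace(name="Alice"), text="first"),
    SimpleNamespace(author=SimpleNamespace(name="Bob"), text="second"),
    SimpleNamespace(author=SimpleNamespace(name="Alice"), text="third"),
]

# Most active authors first
for name, n in count_authors(fake_tweets).most_common():
    print(name, n)
```

In the notebook, the call would be `count_authors(api.home_timeline())`.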


## Searching for tweets

To search for tweets, we use the `tweepy.API.search()` method. In its simplest form, this method just takes a string (the query):

```
api.search("#USF")
```

In [None]:
""" Search tweets with a given hashtag and print them """

q = "#BiVisibilityDay"
results = api.search(q)
print(80 * "-")
for tweet in results:
    print(tweet.text)
    print(80 * "-")

A single call to `search()` returns only one page of results. To get all the results of the search, we need to use a cursor.

In [None]:
max_tweets = 100
print(80 * "-")
for status in tweepy.Cursor(api.search, q=q).items(max_tweets):
    print(status.text)
    print(80 * "-")
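
Conceptually, a cursor hides pagination: it requests one page, hands out its items one at a time, and requests the next page when the current one runs out. The sketch below illustrates that idea with a fake data source. It is a simplified model under our own assumptions, not Tweepy's actual implementation; `paginate`, `fake_fetch` and `FAKE_TWEETS` are hypothetical names.

```python
from itertools import islice


def paginate(fetch_page, page_size):
    """Yield items one by one, requesting a new page whenever the current one runs out."""
    max_id = None
    while True:
        page = fetch_page(max_id=max_id, count=page_size)
        if not page:
            return  # no more results
        for item in page:
            yield item
        # Next time, ask only for items strictly older than the last one seen
        max_id = page[-1]["id"] - 1


# Stand-in data source: 250 fake tweets, newest (highest id) first
FAKE_TWEETS = [{"id": i} for i in range(250, 0, -1)]


def fake_fetch(max_id=None, count=100):
    """Return up to `count` fake tweets whose id is at most `max_id`."""
    older = [t for t in FAKE_TWEETS if max_id is None or t["id"] <= max_id]
    return older[:count]


# Fetch everything (three pages), then only the first 120 items (two pages)
print(len(list(paginate(fake_fetch, 100))))
print(len(list(islice(paginate(fake_fetch, 100), 120))))
```

The `islice` call plays the role of `.items(max_tweets)`: iteration stops as soon as enough items have been produced, so no further pages are requested.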

Using the cursor, choose a hashtag and try to fetch as many tweets as possible. Count them and report how many tweets you got on this spreadsheet: 

https://docs.google.com/spreadsheets/d/1_OGcTD5jhioz2e_g5YgJ0M2XVMKps_klDk9a4_59-8I/edit?usp=sharing

### Hint

Set `max_tweets` to a very large number. To reduce the number of requests made by the cursor, increase the number of tweets retrieved by each request with the `count` parameter. See the documentation of the search method here:

    http://docs.tweepy.org/en/latest/api.html#search-methods


To understand how to use a cursor, see the tutorial here:

    http://docs.tweepy.org/en/latest/cursor_tutorial.html

In [10]:
""" Search for a hashtag and count the total number of results fetched. """

q = "#BiVisibilityDay"
max_tweets = 1000000
#print(80 * "-")
counter = 0
for status in tweepy.Cursor(api.search, q=q, count=100).items(max_tweets):
    #print(status.text)
    #print(80 * "-")
    counter += 1

TweepError: Twitter error response: status code = 429

In [11]:
print(counter)

18000
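
Status code 429 means "Too Many Requests": the loop was stopped by Twitter's rate limit, not by running out of tweets, and `counter` still holds what was fetched before the error. A count of 18000 is consistent with 180 requests of 100 tweets each, which we believe matches the standard search limit of 180 requests per 15-minute window. The sketch below makes the "count until something goes wrong" pattern explicit, so that the error does not interrupt the cell. `count_until_error` and `flaky` are hypothetical names, and `flaky` is only a stand-in for a rate-limited cursor.

```python
def count_until_error(items):
    """Count the items produced by an iterable, stopping early if it raises."""
    count = 0
    try:
        for _ in items:
            count += 1
    except Exception as err:  # e.g. the rate-limit error raised by Tweepy
        print(f"Stopped after {count} items: {err}")
    return count


def flaky(n_ok):
    """Stand-in for a cursor: yields `n_ok` items, then fails like a rate-limited API."""
    for i in range(n_ok):
        yield i
    raise RuntimeError("status code = 429")


print(count_until_error(flaky(5)))
```

In the notebook, the call would be `count_until_error(tweepy.Cursor(api.search, q=q, count=100).items(max_tweets))`. Alternatively, creating the API object with `tweepy.API(auth, wait_on_rate_limit=True)` asks Tweepy to wait until the limit resets instead of raising an error, at the cost of a much longer running time.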
