# Requests & API calls

This notebook shows how to send requests to URLs, how to read `html` tables with Pandas, and how to use the Tweepy library to interact with Twitter's API.
* Documentation about the `Requests` library can be found [here](https://2.python-requests.org//en/master/).
* Documentation about the `Tweepy` library can be found [here](http://docs.tweepy.org/en/v3.5.0/).

### Exploring the `requests` library and the `get` method

**Import the method `get` from the `requests` library**

In [2]:
from requests import get

**Set the `url` variable and `get` the response**

In [3]:
url = 'http://digg.com'
response = get(url)

**Explore the response object**

In [4]:
print(response)

<Response [200]>


In [6]:
print(type(response))

<class 'requests.models.Response'>


In [7]:
print(response.status_code)

200


In [8]:
print(response.headers)

{'Content-Encoding': 'gzip', 'Content-Type': 'text/html; charset=UTF-8', 'Date': 'Fri, 07 Jun 2019 00:32:50 GMT', 'Etag': '"4ac1dbe024f984c8a9b6a15314428906b8a80f94"', 'Server': 'TornadoServer/2.3', 'Set-Cookie': 'frontend.auid=oAdwocIkSCCDlWBpKlnCpA; Domain=digg.com; Path=/, preferred_view=desktop; Path=/, frontend.ts=1559867571; expires=Wed, 04 Dec 2019 00:32:50 GMT; Path=/, frontend.ts.hp=1559867571; expires=Wed, 04 Dec 2019 00:32:50 GMT; Path=/', 'Vary': 'Accept-Encoding', 'X-Frame-Options': 'SAMEORIGIN', 'transfer-encoding': 'chunked', 'Connection': 'keep-alive'}


**Get the expiration date for the cookie from the headers**

In [10]:
thisHeader = response.headers

In [11]:
cookieInfo = thisHeader['Set-Cookie'].split('; ')

In [13]:
cookieInfo[4]

'expires=Wed, 04 Dec 2019 00:32:50 GMT'

Or in just one line:

In [14]:
response.headers['Set-Cookie'].split('; ')[4]

'expires=Wed, 04 Dec 2019 00:32:50 GMT'
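
Picking item `[4]` only works while the server keeps sending the cookie attributes in exactly this order. A minimal sketch of a less position-dependent approach, assuming the header has the same shape as the one printed above: search the string for every `expires=` attribute and convert it to a `datetime`. The function name `cookie_expirations` and the `sample_header` variable are hypothetical, and the sample value is copied from the output above so the cell runs without a network call.

```python
import re
from email.utils import parsedate_to_datetime


def cookie_expirations(set_cookie_header):
    """Return every `expires=` timestamp found in a Set-Cookie header as a datetime."""
    # Each expiration runs from `expires=` up to the next semicolon (or the end of the string).
    raw_dates = re.findall(r'expires=([^;]+)', set_cookie_header, flags=re.IGNORECASE)
    return [parsedate_to_datetime(raw.strip()) for raw in raw_dates]


# Header value copied from the response printed earlier in this notebook.
sample_header = (
    'frontend.auid=oAdwocIkSCCDlWBpKlnCpA; Domain=digg.com; Path=/, '
    'preferred_view=desktop; Path=/, '
    'frontend.ts=1559867571; expires=Wed, 04 Dec 2019 00:32:50 GMT; Path=/, '
    'frontend.ts.hp=1559867571; expires=Wed, 04 Dec 2019 00:32:50 GMT; Path=/'
)

expirations = cookie_expirations(sample_header)
print(len(expirations))
print(expirations[0])
```

With a live response, the same function could be called as `cookie_expirations(response.headers['Set-Cookie'])`.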

**Print the header keys and their values**

In [15]:
for key, value in response.headers.items():
    print(key, " ==> ", value)

Content-Encoding  ==>  gzip
Content-Type  ==>  text/html; charset=UTF-8
Date  ==>  Fri, 07 Jun 2019 00:32:50 GMT
Etag  ==>  "4ac1dbe024f984c8a9b6a15314428906b8a80f94"
Server  ==>  TornadoServer/2.3
Set-Cookie  ==>  frontend.auid=oAdwocIkSCCDlWBpKlnCpA; Domain=digg.com; Path=/, preferred_view=desktop; Path=/, frontend.ts=1559867571; expires=Wed, 04 Dec 2019 00:32:50 GMT; Path=/, frontend.ts.hp=1559867571; expires=Wed, 04 Dec 2019 00:32:50 GMT; Path=/
Vary  ==>  Accept-Encoding
X-Frame-Options  ==>  SAMEORIGIN
transfer-encoding  ==>  chunked
Connection  ==>  keep-alive


**Explore the text in the response**

In [16]:
print(response.text)

<!DOCTYPE html>
<!--[if lt IE 7]> <html class="no-js ie6 oldie" lang="en"> <![endif]-->
<!--[if IE 7]> <html class="no-js ie7 oldie" lang="en"> <![endif]-->
<!--[if IE 8]> <html class="no-js ie8 oldie" lang="en"> <![endif]-->
<!--[if IE 9]> <html class="no-js ie9" lang="en"> <![endif]-->
<!--[if IE 10]> <html class="no-js ie10" lang="en"> <![endif]-->
<!--[if IE 11]> <html class="no-js ie11" lang="en"> <![endif]-->
<!--[if gt IE 11]><!--> <html class="no-js" lang="en"> <!--<![endif]-->
<head>
<link rel="dns-prefetch" href="//static.digg.com">
<meta charset="utf-8">
<meta http-equiv="X-UA-Compatible" content="IE=edge,chrome=1">
<meta name="microtip" content="33NWbMsckoLcoVX6Tz9o6ksuBrJVeu3en2" data-currency="btc">



<meta name="viewport" content="initial-scale=1.0,user-scalable=no,maximum-scale=1,width=device-width" />


<script type="text/javascript">var _sf_startpt=(new Date()).getTime()</script>



<link rel="canonical" href="http://digg.com" class="js--canonical-url" />




<meta n

**Query another url (New York Times)**

In [17]:
url = 'http://nytimes.com'
response = get(url)

In [18]:
response.text



### Reading a table from a url with pandas

In [19]:
from pandas import read_html

In [20]:
wiki_sotu = read_html('https://en.wikipedia.org/wiki/State_of_the_Union')

In [21]:
print(type(wiki_sotu))

<class 'list'>


In [22]:
print(len(wiki_sotu))

8


In [23]:
print(wiki_sotu[1])

                                             0  \
0                    Franklin Delano Roosevelt   
1  Problems playing this file? See media help.   

                                                   1  
0  State of the Union (Four Freedoms) (January 6,...  
1        Problems playing this file? See media help.  


In [24]:
for item in wiki_sotu:
    print(item)

    0                                                  1
0 NaN  This article needs additional citations for ve...
                                             0  \
0                    Franklin Delano Roosevelt   
1  Problems playing this file? See media help.   

                                                   1  
0  State of the Union (Four Freedoms) (January 6,...  
1        Problems playing this file? See media help.  
    0                                                  1
0 NaN  This section does not cite any sources. Please...
         Date       President  Viewers, millions  Households, millions  \
0   2/05/2019    Donald Trump             46.789                33.616   
1   1/30/2018    Donald Trump             45.551                32.168   
2   2/28/2017    Donald Trump             47.741                33.857   
3   1/12/2016    Barack Obama             31.334                23.040   
4   1/20/2015    Barack Obama             31.710                23.137   
5   1/28/201
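
Because `read_html` returns every table on the page, the table of interest has to be picked out of the list. A minimal sketch of one way to do that, assuming the wanted table can be recognized by one of its column names (here `President`, as in the ratings table printed above). The helper `tables_with_column` is hypothetical, and the two small dataframes are hand-made stand-ins for `wiki_sotu` so the cell runs offline.

```python
import pandas as pd


def tables_with_column(tables, column_name):
    """Return only the dataframes that have a column called `column_name`."""
    return [table for table in tables if column_name in table.columns]


# Hand-made stand-ins for two of the tables returned by read_html.
sample_tables = [
    pd.DataFrame({0: [float('nan')], 1: ['This article needs additional citations']}),
    pd.DataFrame({
        'Date': ['2/05/2019', '1/30/2018'],
        'President': ['Donald Trump', 'Donald Trump'],
        'Viewers, millions': [46.789, 45.551],
    }),
]

ratings_tables = tables_with_column(sample_tables, 'President')
print(len(ratings_tables))
print(ratings_tables[0])
```

With the real data this would be `tables_with_column(wiki_sotu, 'President')`. `read_html` also accepts a `match` argument that keeps only tables containing a given text, which may be enough on its own for simple pages.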

### Using the `requests` library and the `get` method to query APIs

**Querying the Google Suggested Queries API**

In [29]:
url = 'http://suggestqueries.google.com/complete/search?client=firefox&q=donald trump is'
response = get(url)

In [30]:
response.text

'["donald trump is",["donald trump is what number president","donald trump is the president song","donald trump israel twitter","donald trump is republican","donald trump is the 46 president","donald trump is a horse in a hospital"]]'
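
The response text is a JSON array: the first element echoes the query and the second is the list of suggestions. A minimal sketch, assuming the payload keeps this shape, of how to build the URL with the query percent-encoded and how to turn the text into Python objects. The function names `build_suggest_url` and `parse_suggestions` are hypothetical, and `sample_text` is copied from the output above so no network call is needed.

```python
import json
from urllib.parse import urlencode


def build_suggest_url(query):
    """Build the suggestion URL with the query safely encoded (spaces become '+')."""
    base = 'http://suggestqueries.google.com/complete/search'
    return base + '?' + urlencode({'client': 'firefox', 'q': query})


def parse_suggestions(response_text):
    """Split the JSON payload into the echoed query and its list of suggestions."""
    query, suggestions = json.loads(response_text)[:2]
    return query, suggestions


# Response text copied from the cell above.
sample_text = (
    '["donald trump is",["donald trump is what number president",'
    '"donald trump is the president song","donald trump israel twitter",'
    '"donald trump is republican","donald trump is the 46 president",'
    '"donald trump is a horse in a hospital"]]'
)

suggest_url = build_suggest_url('donald trump is')
query, suggestions = parse_suggestions(sample_text)
print(suggest_url)
for suggestion in suggestions:
    print(suggestion)
```

With `requests`, the encoding step can also be delegated to the library by passing the query as `get(base, params={'client': 'firefox', 'q': query})`.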

### Using the `tweepy` library to query Twitter's API

In [39]:
from tweepy import OAuthHandler, API

**Set key and token variables**

In [41]:
apikey = '' # Your credentials here
apiSecret = '' # Your credentials here
accessToken = '' # Your credentials here
accessSecrete = '' # Your credentials here
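
Pasting real keys into a notebook makes it easy to share them by accident. A minimal sketch of an alternative, assuming the four values were exported as environment variables before starting the notebook. The variable names (`TWITTER_API_KEY` and so on) and the helper `load_credentials` are hypothetical; the dictionary keys reuse the variable names from the cell above.

```python
import os

# Hypothetical environment variable names; adjust to whatever names you exported.
CREDENTIAL_VARS = {
    'apikey': 'TWITTER_API_KEY',
    'apiSecret': 'TWITTER_API_SECRET',
    'accessToken': 'TWITTER_ACCESS_TOKEN',
    'accessSecrete': 'TWITTER_ACCESS_SECRET',
}


def load_credentials(environ=os.environ):
    """Read the four credentials from `environ`, failing loudly if any is missing."""
    missing = [name for name in CREDENTIAL_VARS.values() if not environ.get(name)]
    if missing:
        raise KeyError('Missing environment variables: ' + ', '.join(missing))
    return {key: environ[name] for key, name in CREDENTIAL_VARS.items()}


# Demonstration with a plain dict of placeholder values standing in for os.environ.
demo_env = {
    'TWITTER_API_KEY': 'key-placeholder',
    'TWITTER_API_SECRET': 'secret-placeholder',
    'TWITTER_ACCESS_TOKEN': 'token-placeholder',
    'TWITTER_ACCESS_SECRET': 'token-secret-placeholder',
}
credentials = load_credentials(demo_env)
print(sorted(credentials))
```

In the notebook, `load_credentials()` with no argument reads from the real environment, and the returned values can be assigned to `apikey`, `apiSecret`, `accessToken`, and `accessSecrete`.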

**Create the `api` object that will handle the authentication and communicate with Twitter**

In [43]:
auth = OAuthHandler(apikey, apiSecret)
auth.set_access_token(accessToken, accessSecrete)
api = API(auth)
print(type(api))

<class 'tweepy.api.API'>


**Get information about a user**

In [44]:
user = api.get_user('juanfrans')

In [45]:
data = user._json

In [46]:
print(data)

{'id': 40246947, 'id_str': '40246947', 'name': 'Juan Francisco Saldarriaga', 'screen_name': 'juanfrans', 'location': 'New York, NY', 'profile_location': None, 'description': 'Senior Data & Design Researcher \n@BrownInstitute', 'url': 'https://t.co/p3jzCgcI1L', 'entities': {'url': {'urls': [{'url': 'https://t.co/p3jzCgcI1L', 'expanded_url': 'http://juanfrans.com', 'display_url': 'juanfrans.com', 'indices': [0, 23]}]}, 'description': {'urls': []}}, 'protected': False, 'followers_count': 430, 'friends_count': 507, 'listed_count': 13, 'created_at': 'Fri May 15 14:11:01 +0000 2009', 'favourites_count': 142, 'utc_offset': None, 'time_zone': None, 'geo_enabled': False, 'verified': False, 'statuses_count': 1044, 'lang': None, 'status': {'created_at': 'Thu May 09 16:35:12 +0000 2019', 'id': 1126526040111710210, 'id_str': '1126526040111710210', 'text': "Technical question for #dataviz folks working with @p5xjs: how do I know if the mouse is over a bezier curve. Here'… https://t.co/WKUzdCntOi", '

In [47]:
user.needs_phone_verification

False

**Get information about another user**

In [48]:
user = api.get_user('@realDonaldTrump')

In [49]:
user._json

{'id': 25073877,
 'id_str': '25073877',
 'name': 'Donald J. Trump',
 'screen_name': 'realDonaldTrump',
 'location': 'Washington, DC',
 'profile_location': None,
 'description': '45th President of the United States of America🇺🇸',
 'url': 'https://t.co/OMxB0x7xC5',
 'entities': {'url': {'urls': [{'url': 'https://t.co/OMxB0x7xC5',
     'expanded_url': 'http://www.Instagram.com/realDonaldTrump',
     'display_url': 'Instagram.com/realDonaldTrump',
     'indices': [0, 23]}]},
  'description': {'urls': []}},
 'protected': False,
 'followers_count': 60863910,
 'friends_count': 47,
 'listed_count': 103937,
 'created_at': 'Wed Mar 18 13:46:38 +0000 2009',
 'favourites_count': 7,
 'utc_offset': None,
 'time_zone': None,
 'geo_enabled': True,
 'verified': True,
 'statuses_count': 42234,
 'lang': None,
 'status': {'created_at': 'Thu Jun 06 19:32:42 +0000 2019',
  'id': 1136717569413566465,
  'id_str': '1136717569413566465',
  'text': 'RT @IngrahamAngle: For my full interview with \u2066@realDonald

**Get Donald Trump's last 20 tweets**

In [50]:
tweets = api.user_timeline(screen_name='realDonaldTrump', count=20)

In [51]:
print(type(tweets))

<class 'tweepy.models.ResultSet'>


**Print first tweet in list**

In [53]:
tweets[0]._json['text']

'RT @IngrahamAngle: For my full interview with \u2066@realDonaldTrump from Normandy\u2069, tune in tonight #IngrahamAngle \u2066@FoxNews\u2069 10pET https://t.c…'

**Print each tweet in list**

In [55]:
for tweet in tweets:
    print(tweet._json['text'])
    print('--------')

RT @IngrahamAngle: For my full interview with ⁦@realDonaldTrump from Normandy⁩, tune in tonight #IngrahamAngle ⁦@FoxNews⁩ 10pET https://t.c…
--------
#DDay75thAnniversary https://t.co/GIsoLML4NP
--------
Just signed Disaster Aid Bill to help Americans who have been hit by recent catastrophic storms. So important for o… https://t.co/LP0G1lVKZk
--------
To the men who sit behind me, and to the boys who rest in the field before me: your example will never grow old. Yo… https://t.co/6vsql6p26N
--------
Today, we remember those who fell, and we honor all who fought, here in Normandy. They won back this ground for civ… https://t.co/Oq9h32zoGy
--------
So sorry to hear about the terrible accident involving our GREAT West Point Cadets. We mourn the loss of life and p… https://t.co/jALcSNtE9V
--------
#DDay75thAnniversary https://t.co/0fYfpvUghk
--------
Heading over to Normandy to celebrate some of the bravest that ever lived. We are eternally grateful!… https://t.co/a0CNWzUvaU
--------
A big 

### Creating a function that queries Twitter's API repeatedly and updates a dataframe with new tweets

**Import time library to control how often the function runs**

In [64]:
import time

In [65]:
import pandas as pd

**Create empty dataframe with the necessary columns**

In [76]:
tweetsData = pd.DataFrame(columns=['created_at', 'id', 'text', 'user_id', 'user_name'])

**Query the API and, for every tweet, get the relevant information. Then check whether the `tweet_id` already exists in the destination dataframe: if it does, skip the tweet; if it doesn't, add it to the end of the dataframe.**

In [86]:
# Search for recent tweets that contain the hashtag
tweets = api.search('#trump')
for tweet in tweets:
    created_at = tweet._json['created_at']
    tweet_id = tweet._json['id']
    text = tweet._json['text']
    user_id = tweet._json['user']['id']
    user_name = tweet._json['user']['name']
    # Add the tweet only if its id is not already in the dataframe
    if tweet_id not in tweetsData['id'].unique():
        thisRow = [created_at, tweet_id, text, user_id, user_name]
        tweetsData.loc[len(tweetsData) + 1] = thisRow
print(tweetsData)

                        created_at                   id  \
1   Fri Jun 07 01:19:07 +0000 2019  1136804749280468993   
2   Fri Jun 07 01:19:05 +0000 2019  1136804740120174592   
3   Fri Jun 07 01:19:03 +0000 2019  1136804732427804672   
4   Fri Jun 07 01:19:00 +0000 2019  1136804718334939136   
5   Fri Jun 07 01:18:47 +0000 2019  1136804662823325697   
6   Fri Jun 07 01:18:46 +0000 2019  1136804662223482881   
7   Fri Jun 07 01:18:43 +0000 2019  1136804647438508032   
8   Fri Jun 07 01:18:42 +0000 2019  1136804645031034881   
9   Fri Jun 07 01:19:30 +0000 2019  1136804846609346560   
10  Fri Jun 07 01:19:29 +0000 2019  1136804839499993089   
11  Fri Jun 07 01:19:21 +0000 2019  1136804805958127616   
12  Fri Jun 07 01:19:20 +0000 2019  1136804804632596482   
13  Fri Jun 07 01:19:18 +0000 2019  1136804795400896512   
14  Fri Jun 07 01:19:17 +0000 2019  1136804790267199488   
15  Fri Jun 07 01:19:15 +0000 2019  1136804783581532160   
16  Fri Jun 07 01:19:12 +0000 2019  1136804767542513664 
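
The query-and-deduplicate step above can be wrapped in a function that runs several times with a pause in between. This is a minimal sketch under stated assumptions, not the notebook's own implementation: `collect_tweets` and `tweet_to_row` are hypothetical names, `fetch_tweets` is any callable that returns a list of tweet dictionaries shaped like `tweet._json`, and the two batches below are made-up data standing in for the API so the cell runs without credentials. Seen ids are tracked in a set and rows are gathered in a list, which avoids scanning and growing the dataframe on every tweet.

```python
import time


def tweet_to_row(tweet_json):
    """Keep only the fields stored in the dataframe above."""
    return {
        'created_at': tweet_json['created_at'],
        'id': tweet_json['id'],
        'text': tweet_json['text'],
        'user_id': tweet_json['user']['id'],
        'user_name': tweet_json['user']['name'],
    }


def collect_tweets(fetch_tweets, rounds=3, pause_seconds=0):
    """Call fetch_tweets() `rounds` times and keep each tweet id only once."""
    rows = []
    seen_ids = set()
    for round_number in range(rounds):
        for tweet_json in fetch_tweets():
            if tweet_json['id'] in seen_ids:
                continue  # already collected in an earlier round
            seen_ids.add(tweet_json['id'])
            rows.append(tweet_to_row(tweet_json))
        if round_number < rounds - 1:
            time.sleep(pause_seconds)  # wait before querying again
    return rows


# Made-up tweets standing in for two consecutive API responses that overlap.
first = {'created_at': 'Fri Jun 07 01:19:07 +0000 2019', 'id': 1,
         'text': 'first example', 'user': {'id': 10, 'name': 'Example A'}}
second = {'created_at': 'Fri Jun 07 01:19:30 +0000 2019', 'id': 2,
          'text': 'second example', 'user': {'id': 11, 'name': 'Example B'}}
fake_batches = iter([[first], [first, second]])

rows = collect_tweets(lambda: next(fake_batches), rounds=2)
print(len(rows))
print([row['id'] for row in rows])
```

With Tweepy, `fetch_tweets` could be `lambda: [t._json for t in api.search('#trump')]` with a `pause_seconds` value that respects the API rate limits, and `pd.DataFrame(rows)` turns the collected rows into a dataframe in one step.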