# Curriculum Acquire code

## Making HTTP Requests

- The way we interact with web sites and web servers (including RESTful JSON APIs) is through HTTP **requests** and **responses**.

- We can use the `requests` library to make HTTP requests. This is much like visiting a URL in your browser, except that we can work with the responses programmatically in Python.

In [1]:
import requests


We will use the `get` function from `requests` and pass it a url:



In [2]:
response = requests.get('http://request-inspector.glitch.me/')
response

<Response [200]>

<p>We get back a Python object that represents an HTTP response.</p>
<p>The response object has several interesting properties:</p>
<ul>
<li><code>.ok</code>: a boolean that indicates whether the response was successful (<code>True</code> for any status code below 400, such as 200)</li>
<li><code>.status_code</code>: a number indicating the HTTP response status code</li>
<li><code>.text</code>: the raw response text</li>
</ul>
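
As a small sketch of how these properties relate, the snippet below builds `Response` objects by hand. That is an assumption made only so the example runs without a network call; normally the object comes from `requests.get`. It shows that `.ok` is `True` for any status code below 400, and that `.raise_for_status()` raises an exception for error codes.

```python
import requests


def make_response(status_code):
    """Build a bare Response by hand (illustration only; no request is sent)."""
    response = requests.Response()
    response.status_code = status_code
    return response


for code in (200, 301, 404, 500):
    response = make_response(code)
    print(code, response.ok)

# raise_for_status() turns 4xx and 5xx responses into exceptions
try:
    make_response(404).raise_for_status()
except requests.HTTPError as error:
    print('request failed:', error)
```

Checking `.ok` suits a quick branch, while `.raise_for_status()` is handy when a failed request should stop the program.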

In [3]:
response.ok


True

In [4]:
response.status_code


200

In [5]:
response.text


'{"method":"GET","query":{},"body":{}}'

<p>In this case, we see a string that contains JSON describing the request we made. Many other URLs return HTML, which is what makes up web pages that are intended for humans to read; if you go to http://example.com, you'll see what HTML looks like when rendered. JSON, by contrast, is usually intended to be worked with programmatically.</p>


<h2 id="example-json-api">Example JSON API</h2>
<p>For an example of a JSON API, we'll interact with <a href="https://swapi.dev/">SWAPI, the Star Wars API</a>.</p>


In [6]:
url = 'https://swapi.dev/api/people/5'
response = requests.get(url)
print(response.text)

{"name":"Leia Organa","height":"150","mass":"49","hair_color":"brown","skin_color":"light","eye_color":"brown","birth_year":"19BBY","gender":"female","homeworld":"https://swapi.dev/api/planets/2/","films":["https://swapi.dev/api/films/1/","https://swapi.dev/api/films/2/","https://swapi.dev/api/films/3/","https://swapi.dev/api/films/6/"],"species":[],"vehicles":["https://swapi.dev/api/vehicles/30/"],"starships":[],"created":"2014-12-10T15:20:09.791000Z","edited":"2014-12-20T21:17:50.315000Z","url":"https://swapi.dev/api/people/5/"}


<p>Here we see that the response we got back contains a JSON object (we could also verify this by visiting the URL in a web browser).</p>
<p>Since the response is JSON, we can use the <code>.json</code> method on the response object to get a data structure we can work with:</p>

In [7]:
data = response.json()
print(type(data))
data

<class 'dict'>


{'name': 'Leia Organa',
 'height': '150',
 'mass': '49',
 'hair_color': 'brown',
 'skin_color': 'light',
 'eye_color': 'brown',
 'birth_year': '19BBY',
 'gender': 'female',
 'homeworld': 'https://swapi.dev/api/planets/2/',
 'films': ['https://swapi.dev/api/films/1/',
  'https://swapi.dev/api/films/2/',
  'https://swapi.dev/api/films/3/',
  'https://swapi.dev/api/films/6/'],
 'species': [],
 'vehicles': ['https://swapi.dev/api/vehicles/30/'],
 'starships': [],
 'created': '2014-12-10T15:20:09.791000Z',
 'edited': '2014-12-20T21:17:50.315000Z',
 'url': 'https://swapi.dev/api/people/5/'}

<p>Now we have a dictionary that we can work with.</p>
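
To make the step from text to dictionary concrete, here is a small sketch that parses a shortened copy of the response above with the standard library's `json.loads`, which gives the same kind of result as `response.json()`. The shortened string is only an illustration; it keeps a few of the fields shown above.

```python
import json

# A shortened copy of the JSON text returned above (illustration only)
text = (
    '{"name":"Leia Organa","height":"150",'
    '"films":["https://swapi.dev/api/films/1/","https://swapi.dev/api/films/2/"],'
    '"species":[]}'
)

person = json.loads(text)      # str -> dict

print(person['name'])          # values are looked up by key
print(int(person['height']))   # this API sends numbers as strings
print(len(person['films']))    # JSON arrays become Python lists
```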
<p>Let's now take a look at another API. We'll start by looking at just the base URL:</p>


In [8]:
base_url = 'https://python.zgulde.net'
print(requests.get(base_url).text)

{"api":"/api/v1","help":"/documentation"}



In [9]:
{"api":"/api/v1","help":"/documentation"}


{'api': '/api/v1', 'help': '/documentation'}

<p>This API provides some documentation, so let's make a request so that we can take a look at it.</p>


In [10]:
response = requests.get(base_url + '/documentation')
print(response.json()['payload'])


The API accepts GET requests for all endpoints, where endpoints are prefixed
with

    /api/{version}

Where version is "v1"

Valid endpoints:

- /stores[/{store_id}]
- /items[/{item_id}]
- /sales[/{sale_id}]

All endpoints accept a `page` parameter that can be used to navigate through
the results.



<p>Based on this, let's take a look at the items. We'll make our request, and explore the shape of the response that we get back.</p>


In [11]:
response = requests.get('https://python.zgulde.net/api/v1/items')

data = response.json()
data.keys()

dict_keys(['payload', 'status'])

In [12]:
data['payload'].keys()


dict_keys(['items', 'max_page', 'next_page', 'page', 'previous_page'])

In [13]:
print('max_page: %s' % data['payload']['max_page'])
print('next_page: %s' % data['payload']['next_page'])

max_page: 3
next_page: /api/v1/items?page=2


<p>Here the response has some built-in properties that tell us how to get to subsequent pages.</p>
<p>Once we've drilled down into the data structure, we'll find that the entire response is a sort of wrapper around the <code>items</code> property:</p>


In [14]:
data['payload']['items'][:2]


[{'item_brand': 'Riceland',
  'item_id': 1,
  'item_name': 'Riceland American Jazmine Rice',
  'item_price': 0.84,
  'item_upc12': '35200264013',
  'item_upc14': '35200264013'},
 {'item_brand': 'Caress',
  'item_id': 2,
  'item_name': 'Caress Velvet Bliss Ultra Silkening Beauty Bar - 6 Ct',
  'item_price': 6.44,
  'item_upc12': '11111065925',
  'item_upc14': '11111065925'}]

<p>We can turn this data into a pandas dataframe:</p>


In [15]:
import pandas as pd

df = pd.DataFrame(data['payload']['items'])
df.head()

Unnamed: 0,item_brand,item_id,item_name,item_price,item_upc12,item_upc14
0,Riceland,1,Riceland American Jazmine Rice,0.84,35200264013,35200264013
1,Caress,2,Caress Velvet Bliss Ultra Silkening Beauty Bar - 6 Ct,6.44,11111065925,11111065925
2,Earths Best,3,Earths Best Organic Fruit Yogurt Smoothie Mixed Berry,2.43,23923330139,23923330139
3,Boars Head,4,Boars Head Sliced White American Cheese - 120 Ct,3.14,208528800007,208528800007
4,Back To Nature,5,Back To Nature Gluten Free White Cheddar Rice Thin Crackers,2.61,759283100036,759283100036


<p>Now that we've gotten the data from the first page, we can extract the data from the next page (as indicated by the API's response), and add it onto our dataframe:</p>


In [16]:
response = requests.get(base_url + data['payload']['next_page'])
data = response.json()

print('max_page: %s' % data['payload']['max_page'])
print('next_page: %s' % data['payload']['next_page'])

df = pd.concat([df, pd.DataFrame(data['payload']['items'])]).reset_index()

max_page: 3
next_page: /api/v1/items?page=3


<p>We'll repeat the process one more time:</p>


In [17]:
response = requests.get(base_url + data['payload']['next_page'])
data = response.json()

print('max_page: %s' % data['payload']['max_page'])
print('next_page: %s' % data['payload']['next_page'])

df = pd.concat([df, pd.DataFrame(data['payload']['items'])]).reset_index()

max_page: 3
next_page: None


<p>Now that the API says that the <code>next_page</code> is None, we'll stop making requests, and assume that we have all of the <code>items</code> data.</p>
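
The manual steps above can be folded into a loop. The sketch below is one way to do that, not the only one: `collect_pages` and `fake_get_json` are hypothetical names, and the in-memory `PAGES` dictionary stands in for the API so the example runs without a network call. It keeps following `next_page` until the API reports `None`.

```python
# Fake pages shaped like the API's responses (made-up data, illustration only)
PAGES = {
    '/api/v1/items': {
        'payload': {
            'items': [{'item_id': 1}, {'item_id': 2}],
            'next_page': '/api/v1/items?page=2',
        }
    },
    '/api/v1/items?page=2': {
        'payload': {
            'items': [{'item_id': 3}],
            'next_page': None,
        }
    },
}


def fake_get_json(path):
    """Stand-in for requests.get(base_url + path).json()."""
    return PAGES[path]


def collect_pages(get_json, first_path):
    """Follow next_page links and gather every item into one list."""
    items = []
    path = first_path
    while path is not None:
        payload = get_json(path)['payload']
        items.extend(payload['items'])
        path = payload['next_page']
    return items


items = collect_pages(fake_get_json, '/api/v1/items')
print(len(items))
```

Against the real API, the callable could be `lambda path: requests.get(base_url + path).json()`, and `pd.DataFrame(items)` would then build the dataframe in one step. Building the frame once also avoids extra `index` and `level_0` columns, which appear because `reset_index()` without `drop=True` keeps the old index as a new column each time it is called.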


In [18]:
df.shape

(50, 8)

In [19]:
df

Unnamed: 0,level_0,index,item_brand,item_id,item_name,item_price,item_upc12,item_upc14
0,0,0.0,Riceland,1,Riceland American Jazmine Rice,0.84,35200264013,35200264013
1,1,1.0,Caress,2,Caress Velvet Bliss Ultra Silkening Beauty Bar - 6 Ct,6.44,11111065925,11111065925
2,2,2.0,Earths Best,3,Earths Best Organic Fruit Yogurt Smoothie Mixed Berry,2.43,23923330139,23923330139
3,3,3.0,Boars Head,4,Boars Head Sliced White American Cheese - 120 Ct,3.14,208528800007,208528800007
4,4,4.0,Back To Nature,5,Back To Nature Gluten Free White Cheddar Rice Thin Crackers,2.61,759283100036,759283100036
...,...,...,...,...,...,...,...,...
45,5,,Mama Marys,46,Pizza Sauce,4.65,35457770664,35457770664
46,6,,Bear Naked,47,Bear Naked Fit Almond Crisp 100 Percent Natural Energy Cereal,7.38,884623708976,884623708976
47,7,,Dove,48,Dove Men + Care Antiperspirant Deodorant Cool Silver,3.72,79400271631,79400271631
48,8,,Easy-off,49,Easy-off Oven Cleaner Lemon Scent,9.54,62338879772,62338879772


<h2 id="further-reading">Further Reading</h2>
<ul>
<li><a href="https://www.dataquest.io/blog/python-api-tutorial/">Using APIs in Python</a></li>
<li><a href="https://www.smashingmagazine.com/2018/01/understanding-using-rest-api/">Understanding and Using REST APIs</a></li>
</ul>

---

## Dataquest API Tutorial

<p>In this Python API tutorial, we’ll learn how to retrieve data for data science projects. There are millions of APIs online which provide access to data. Websites like <a href="https://www.reddit.com/dev/api/" style="outline: none;">Reddit</a>, <a href="https://developer.twitter.com/en/docs.html" style="outline: none;">Twitter</a>, and <a href="https://developers.facebook.com/" style="outline: none;">Facebook</a> all offer certain data through their APIs.</p>
<p>To use an API, you make a request to a remote web server, and retrieve the data you need.</p>
<p>But why use an API instead of a static CSV dataset you can download from the web? APIs are useful in the following cases:</p>



<ul>
<li><strong>The data is changing quickly</strong>. An example of this is stock price data. It doesn’t really make sense to regenerate a dataset and download it every minute — this will take a lot of bandwidth, and be pretty slow.</li>
<li><strong>You want a small piece of a much larger set of data</strong>. Reddit comments are one example. What if you want to just pull your own comments on Reddit? It doesn’t make much sense to download the entire Reddit database, then filter just your own comments.</li>
<li><strong>There is repeated computation involved</strong>. Spotify has an API that can tell you the genre of a piece of music. You could theoretically create your own classifier, and use it to compute music categories, but you’ll never have as much data as Spotify does.</li>
</ul>
<p>In cases like the ones above, an API is the right solution. In this blog post, we’ll be querying a simple API to retrieve data about the <a href="https://en.wikipedia.org/wiki/International_Space_Station" style="outline: none;">International Space Station</a> (ISS).</p>
<h2>What is an API?</h2>
<p>An API, or Application Programming Interface, is a server that you can retrieve data from and send data to using code. APIs are most commonly used to retrieve data, and that will be the focus of this beginner tutorial.</p>
<p>When we want to receive data from an API, we need to make a <strong>request</strong>. Requests are used all over the web. For instance, when you visited this blog post, your web browser made a request to the Dataquest web server, which responded with the content of this web page.</p>

<h2>Making API Requests in Python</h2>
<p>In order to work with APIs in Python, we need tools that will make those requests. In Python, the most common library for making requests and working with APIs is the <a href="https://2.python-requests.org/en/master/" style="outline: none;"><strong>requests</strong> library</a>. The requests library isn’t part of the standard Python library, so you’ll need to install it to get started.</p>


In [1]:
import requests


<h2>Making Our First API Request</h2>
<p>There are many different types of requests. The most commonly used one, a <strong>GET</strong> request, is used to retrieve data. Because we’ll just be working with retrieving data, our focus will be on making ‘get’ requests.</p>
<p>When we make a request, the response from the API comes with a <strong>response code</strong> which tells us whether our request was successful. Response codes are important because they immediately tell us if something went wrong.</p>


In [2]:
response = requests.get('http://api.open-notify.org/this-api-doesnt-exist')

In [3]:
print(response.status_code)

404

<p><img src="https://web.archive.org/web/20200107014409im_/https://www.dataquest.io/wp-content/uploads/2019/09/api-request.svg" data-src="https://www.dataquest.io/wp-content/uploads/2019/09/api-request.svg" alt="" />To make a ‘GET’ request, we’ll use the <a href="https://2.python-requests.org/en/master/user/quickstart/#make-a-request" style="outline: none;"><code>requests.get()</code> function</a>, which requires one argument — the URL we want to make the request to. We’ll start by making a request to an API endpoint that doesn’t exist, so we can see what that response code looks like.</p>
<pre><code class="language-python">response = requests.get("http://api.open-notify.org/this-api-doesnt-exist")</code></pre>
<p>The <code>get()</code> function returns a <a href="https://2.python-requests.org/en/master/user/advanced/#request-and-response-objects" style="outline: none;"><code>response</code> object</a>. We can use the <code><a href="https://2.python-requests.org/en/master/user/quickstart/#response-status-codes" style="outline: none;">response.status_code</a></code> attribute to receive the status code for our request:</p>
<pre><code class="language-python">print(response.status_code)</code></pre>
<pre><code class="language-markup">404</code></pre>
<p>The ‘404’ status code might be familiar to you — it’s the status code that a server returns if it can’t find the file we requested. In this case, we asked for <code>this-api-doesnt-exist</code> which (surprise, surprise) didn’t exist!</p>
<p>Let’s learn a little more about common status codes.</p>
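
As a rough guide, the first digit of a status code tells you its category. The sketch below uses the standard library's `http.HTTPStatus` to look up the standard phrase for a few common codes; the `describe` helper and its category labels are illustrative, not part of any library.

```python
from http import HTTPStatus


def describe(code):
    """Return a short description of an HTTP status code."""
    categories = {
        2: 'success',
        3: 'redirection',
        4: 'client error',
        5: 'server error',
    }
    category = categories.get(code // 100, 'other')
    return f'{code} {HTTPStatus(code).phrase} ({category})'


for code in (200, 301, 400, 401, 403, 404, 503):
    print(describe(code))
```

Codes in the 400s usually point to a problem with the request itself (a wrong URL, missing credentials), while codes in the 500s point to a problem on the server.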

In [5]:
response = requests.get("http://api.open-notify.org/astros.json")
print(response.status_code)

200


In [6]:
import json


In [7]:
def jprint(obj):
    # create a formatted string of the Python JSON object
    text = json.dumps(obj, sort_keys=True, indent=4)
    print(text)


In [8]:

jprint(response.json())

{
    "message": "success",
    "number": 10,
    "people": [
        {
            "craft": "ISS",
            "name": "Mark Vande Hei"
        },
        {
            "craft": "ISS",
            "name": "Pyotr Dubrov"
        },
        {
            "craft": "ISS",
            "name": "Anton Shkaplerov"
        },
        {
            "craft": "Shenzhou 13",
            "name": "Zhai Zhigang"
        },
        {
            "craft": "Shenzhou 13",
            "name": "Wang Yaping"
        },
        {
            "craft": "Shenzhou 13",
            "name": "Ye Guangfu"
        },
        {
            "craft": "ISS",
            "name": "Raja Chari"
        },
        {
            "craft": "ISS",
            "name": "Tom Marshburn"
        },
        {
            "craft": "ISS",
            "name": "Kayla Barron"
        },
        {
            "craft": "ISS",
            "name": "Matthias Maurer"
        }
    ]
}


In [9]:
data = response.json()
data.keys()

dict_keys(['people', 'message', 'number'])

In [12]:
data

{'people': [{'craft': 'ISS', 'name': 'Mark Vande Hei'},
  {'craft': 'ISS', 'name': 'Pyotr Dubrov'},
  {'craft': 'ISS', 'name': 'Anton Shkaplerov'},
  {'craft': 'Shenzhou 13', 'name': 'Zhai Zhigang'},
  {'craft': 'Shenzhou 13', 'name': 'Wang Yaping'},
  {'craft': 'Shenzhou 13', 'name': 'Ye Guangfu'},
  {'craft': 'ISS', 'name': 'Raja Chari'},
  {'craft': 'ISS', 'name': 'Tom Marshburn'},
  {'craft': 'ISS', 'name': 'Kayla Barron'},
  {'craft': 'ISS', 'name': 'Matthias Maurer'}],
 'message': 'success',
 'number': 10}

In [13]:
import pandas as pd

astro_df = pd.DataFrame(data['people'])
astro_df.head()

Unnamed: 0,craft,name
0,ISS,Mark Vande Hei
1,ISS,Pyotr Dubrov
2,ISS,Anton Shkaplerov
3,Shenzhou 13,Zhai Zhigang
4,Shenzhou 13,Wang Yaping
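
As one small example of working with this data, the sketch below counts people per spacecraft using `collections.Counter`. The `people` list is copied from the response above so the snippet runs on its own; with pandas, `astro_df['craft'].value_counts()` gives a similar tally.

```python
from collections import Counter

# Copied from the response shown above
people = [
    {'craft': 'ISS', 'name': 'Mark Vande Hei'},
    {'craft': 'ISS', 'name': 'Pyotr Dubrov'},
    {'craft': 'ISS', 'name': 'Anton Shkaplerov'},
    {'craft': 'Shenzhou 13', 'name': 'Zhai Zhigang'},
    {'craft': 'Shenzhou 13', 'name': 'Wang Yaping'},
    {'craft': 'Shenzhou 13', 'name': 'Ye Guangfu'},
    {'craft': 'ISS', 'name': 'Raja Chari'},
    {'craft': 'ISS', 'name': 'Tom Marshburn'},
    {'craft': 'ISS', 'name': 'Kayla Barron'},
    {'craft': 'ISS', 'name': 'Matthias Maurer'},
]

# Tally how many people are aboard each craft
crew_counts = Counter(person['craft'] for person in people)
print(crew_counts)
```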


---

<h2 id="what-is-a-rest-api">What Is A REST API</h2>

<p>Let’s say you’re trying to find videos about Batman on YouTube. You open up YouTube, type &ldquo;Batman&rdquo; into a search field, hit enter, and you see a list of videos about Batman. A REST API works in a similar way. You search for something, and you get a list of results back from the service you’re requesting from.</p>

<p>An <strong>API</strong> is an application programming interface. It is a set of rules that allow programs to talk to each other. The developer creates the API on the server and allows the client to talk to it.</p>

<p><strong>REST</strong> determines what the API looks like. It stands for &ldquo;Representational State Transfer&rdquo;. It is a set of rules that developers follow when they create their API. One of these rules states that you should be able to get a piece of data (called a resource) when you link to a specific URL.</p>

<p>Each URL is called a <strong>request</strong> while the data sent back to you is called a <strong>response</strong>.</p>


<p>It’s important to know that a request is made up of four things:</p>

<ol>
<li>The endpoint</li>
<li>The method</li>
<li>The headers</li>
<li>The data (or body)</li>
</ol>
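
To see the four parts side by side, the sketch below builds a request with the `requests` library without sending it. The URL, header, and body are made-up values for illustration; `Request(...).prepare()` only assembles the request, so nothing goes over the network.

```python
import json

import requests

request = requests.Request(
    method='POST',                           # the method
    url='https://api.example.com/items',     # the endpoint (hypothetical)
    headers={'Accept': 'application/json'},  # the headers
    json={'name': 'rice'},                   # the data (or body)
)
prepared = request.prepare()

print(prepared.method)
print(prepared.url)
print(prepared.headers['Accept'])
print(json.loads(prepared.body))  # the body is stored as encoded JSON text
```

A GET request has the same four parts; its body is usually just empty.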

<p><strong>The endpoint</strong> (or route) is the URL you request. It follows this structure:</p>

<pre><code class="language-bash">root-endpoint/<path>?<query-params>
</code></pre>

<p>The <strong>root-endpoint</strong> is the starting point of the API you’re requesting from. The root-endpoint of <a href="https://developer.github.com/v3/">Github’s API</a> is <code>https://api.github.com</code> while the root-endpoint of <a href="https://dev.twitter.com/rest/reference">Twitter’s API</a> is <code>https://api.twitter.com</code>.</p>

<p>The <strong>path</strong> determines the resource you’re requesting. Think of it like an automatic answering machine that asks you to press 1 for a service, press 2 for another service, 3 for yet another service and so on.</p>


<p>You can access paths just like you can link to parts of a website. For example, to get a list of all posts tagged under &ldquo;JavaScript&rdquo; on Smashing Magazine, you navigate to <code>https://www.smashingmagazine.com/tag/javascript/</code>. <code>https://www.smashingmagazine.com/</code> is the root-endpoint and <code>/tag/javascript</code> is the path.</p>

<p>To understand what paths are available to you, you need to look through the API documentation.
For example, let’s say you want to get a list of repositories by a certain user through Github’s API. The <a href="https://developer.github.com/v3/repos/#list-user-repositories">docs</a> tell you to use the following path to do so:</p>

<pre><code class="language-bash">/users/:username/repos
</code></pre>

<p>Any colon (<code>:</code>) on a path denotes a variable. You should replace these variables with actual values when you send your request. In this case, you should replace <code>:username</code> with the actual username of the user you’re searching for. If I’m searching for my Github account, I’ll replace <code>:username</code> with <code>zellwk</code>.</p>

<p>The endpoint to get a list of my repos on Github is this:</p>

<pre><code class="language-bash">https://api.github.com/users/zellwk/repos</code></pre>


<p>The final part of an endpoint is <strong>query parameters</strong>. Technically, query parameters are not part of the REST architecture, but you’ll see lots of APIs use them. So, to help you completely understand how to read and use APIs, we’re also going to talk about them.
Query parameters give you the option to modify your request with key-value pairs. They always begin with a question mark (<code>?</code>). Each parameter pair is then separated with an ampersand (<code>&amp;</code>), like this:</p>

<div class="break-out"><pre><code class="language-bash">?query1=value1&query2=value2
</code></pre></div>
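
Putting the pieces together, the sketch below assembles an endpoint from a root-endpoint, a path with the `:username` variable filled in, and a query string, using only the standard library. The `build_endpoint` helper is a hypothetical name and `sort=pushed` is just an example pair.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_endpoint(root_endpoint, path, params=None):
    """Join a root endpoint, a path, and optional query parameters."""
    url = root_endpoint.rstrip('/') + '/' + path.lstrip('/')
    if params:
        url += '?' + urlencode(params)
    return url


username = 'zellwk'
endpoint = build_endpoint(
    'https://api.github.com',
    f'/users/{username}/repos',
    {'sort': 'pushed'},
)
print(endpoint)

# Splitting the URL again recovers the path and the key-value pairs
parts = urlsplit(endpoint)
print(parts.path)
print(parse_qs(parts.query))
```

With the `requests` library you rarely need to build the query string yourself: passing `params={'sort': 'pushed'}` to `requests.get` encodes it for you.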

<p>When you try to get a list of a user’s repositories on Github, you can add three possible parameters to your request to modify the results given to you:</p>



In [14]:
repos = requests.get('https://api.github.com/users/jared-godar/repos?sort=pushed')


In [15]:
repos

<Response [200]>

In [16]:
repos = repos.json()

In [19]:
print(type(repos))

<class 'list'>


In [17]:
repos

[{'id': 446510779,
  'node_id': 'R_kgDOGp02uw',
  'name': 'time-series-exercises',
  'full_name': 'Jared-Godar/time-series-exercises',
  'private': False,
  'owner': {'login': 'Jared-Godar',
   'id': 16855088,
   'node_id': 'MDQ6VXNlcjE2ODU1MDg4',
   'avatar_url': 'https://avatars.githubusercontent.com/u/16855088?v=4',
   'gravatar_id': '',
   'url': 'https://api.github.com/users/Jared-Godar',
   'html_url': 'https://github.com/Jared-Godar',
   'followers_url': 'https://api.github.com/users/Jared-Godar/followers',
   'following_url': 'https://api.github.com/users/Jared-Godar/following{/other_user}',
   'gists_url': 'https://api.github.com/users/Jared-Godar/gists{/gist_id}',
   'starred_url': 'https://api.github.com/users/Jared-Godar/starred{/owner}{/repo}',
   'subscriptions_url': 'https://api.github.com/users/Jared-Godar/subscriptions',
   'organizations_url': 'https://api.github.com/users/Jared-Godar/orgs',
   'repos_url': 'https://api.github.com/users/Jared-Godar/repos',
   'events_

In [20]:
repos[0].keys()

dict_keys(['id', 'node_id', 'name', 'full_name', 'private', 'owner', 'html_url', 'description', 'fork', 'url', 'forks_url', 'keys_url', 'collaborators_url', 'teams_url', 'hooks_url', 'issue_events_url', 'events_url', 'assignees_url', 'branches_url', 'tags_url', 'blobs_url', 'git_tags_url', 'git_refs_url', 'trees_url', 'statuses_url', 'languages_url', 'stargazers_url', 'contributors_url', 'subscribers_url', 'subscription_url', 'commits_url', 'git_commits_url', 'comments_url', 'issue_comment_url', 'contents_url', 'compare_url', 'merges_url', 'archive_url', 'downloads_url', 'issues_url', 'pulls_url', 'milestones_url', 'notifications_url', 'labels_url', 'releases_url', 'deployments_url', 'created_at', 'updated_at', 'pushed_at', 'git_url', 'ssh_url', 'clone_url', 'svn_url', 'homepage', 'size', 'stargazers_count', 'watchers_count', 'language', 'has_issues', 'has_projects', 'has_downloads', 'has_wiki', 'has_pages', 'forks_count', 'mirror_url', 'archived', 'disabled', 'open_issues_count', 'lic