# APIs and the Requests Library

- What's a web API?
- What does REST mean?
- Introduction to Python's Requests library
- Work with APIs to pass and pull data into python programs

# Questions you'll be able to answer after this module


 - What is a REST API?
 - What is HATEOS, and why is it important?
 - How does the HTTP request/response cycle work?
 - What are the HTTP verbs, and how do they work?
 - How can we map create-read-update-delete (CRUD) operations onto the HTTP verbs?
 - What are some common HTTP response codes and what do they mean?

# What is REST, anyway?

**Re**presentational **S**tate **T**ransfer, from Roy Fielding's Thesis: https://www.ics.uci.edu/~fielding/pubs/dissertation/rest_arch_style.htm

 - Architectural style that embraces the basic technologies of the web (HTTP and hyptertext)
 - *Not* a specification like SOAP, XMLRPC, or others


## Main characteristics of REST

 - There are some abstract _resources_ that you wish to expose to the web
 - Those _resources_ have one or more _representations_ (HTML, JSON, and XML are common)
 - Resources are unambiguously identified by a _resource identifier_ (URL)
 - _Hypermedia_ is used to link resources to one another using their URLs

# HATEOAS

**H**ypertext **a**s **t**he **e**ngine **o**f **a**pplication **s**tate

 - Applications can use the API without any prior knowledge of its URL structure
 - All resources are located using URLs that are present in other resources
 - There is an initial URL / resource from which you can discover all other URLs / resources in the API
 - Your client code and your server code can evolve independently

# REST and HTTP

(**H**yper**t**ext **t**ransfer **p**rotocol, for those who were wondering)

The web, as it's used by web browsers and random folks, is more or less a REST API.

 - You use a small number of entry points
 - All resources are fetched using URLs, usually by following a link in another resource
 - Your browser negotiates the exact representation of each resource, but generally fetches HTML and images

# HTTP Overview: The request

At a conceptual level, HTTP is just a series of requests and responses. An HTTP request consists of:

 - The request line (e.g. `GET / HTTP/1.1`)
   - Verb: `GET`
   - URL: `/`
   - HTTP version: `HTTP/1.1`
 - Headers for metadata about the request (e.g. `Host: www.cnn.com` or `Authorization: basic foobarbaz`)
 - (sometimes) a body containing a representation

# HTTP Overview: The verbs

 - GET - retrieve a resource by URL
 - PUT - replace a resource at a given URL with the content of this request
 - DELETE - delete the resource at this URL
 - POST - do something else
 - PATCH (proposed but not yet standard) - update the resource at a given URL with the content of this request

# HTTP Overview: The verbs

- Safe: does it leave the resource unmodified?
- Idempotent: if you issue the request a single time, is the resource the same as if you issue the request multiple times?
- Body: does the request include a resource representation in the body?


| Verb   | Safe? | Idempotent? | Body? |
|--------|-------|-------------|-------|
| GET    | yes   | yes         | no    |
| PUT    | no    | yes         | yes   |
| POST   | no    | no          | yes   |
| DELETE | no    | yes         | no    |
| PATCH  | no    | no          | yes   |

# JSON: Javascript Object Notation

```
{"JSON": {
    "can_contain": {
        "integers": 3, 
        "floating_point": 3.14,
        "strings": "foobar",
        "boolean": [true, false],
        "arrays": [],
        "sub-objects": {},
        "null": null
    }
}}
```

In [1]:
import json
s='''{"JSON": {
    "can_contain": {
        "integers": 3, 
        "floating_point": 3.14,
        "strings": "foobar",
        "boolean": [true, false],
        "arrays": [],
        "sub-objects": {},
        "null": null
    }
}}
'''

In [2]:
json.loads(s)

{'JSON': {'can_contain': {'integers': 3,
   'floating_point': 3.14,
   'strings': 'foobar',
   'boolean': [True, False],
   'arrays': [],
   'sub-objects': {},
   'null': None}}}

In [3]:
json.loads(json.dumps({'foo': 'bar'}))

{'foo': 'bar'}

# One more acronym: CRUD

**C**reate, **R**ead, **U**pdate, **D**elete: how we deal with data

A common pattern, and one we'll use in the REST APIs we create, is to map CRUD onto the HTTP verbs as follows:

| CRUD   | HTTP        |
|--------|-------------|
| Create | POST        |
| Read   | GET         |
| Update | PUT / PATCH |
| Delete | DELETE      |

# HTTP Overview: The response

 - The response line (e.g. `HTTP/1.1 200 OK`)
   - HTTP Version: `HTTP/1.1`
   - Response code: `200 OK`
 - Headers (e.g. `Cache-Control: max-age=60`)
 - (sometimes) a body containing a representation


# HTTP Overview: Response Codes

There are a lot of them. Below are the more common ones you'll use/see.

 - 1xx - Informational - you won't need to worry much about these
 - 2xx - Everything's OK!
   - 200 OK
   - 201 Created
   - 202 Accepted
   - 204 No Content
 - 3xx - Redirection
   - 301 Moved Permanently
   - 302 Found ('temporary redirect')
   - 303 See Other (another 'temporary redirect' -- this one is "more correct" than 302)


# HTTP Overview: Response Codes (error)

 - 4xx - There was a problem with the *client*
   - 400 Bad Request
   - 401 Unauthorized
   - 403 Forbidden
   - 404 Not Found
   - 405 Method Not Allowed
   - 409 Conflict
   - 410 Gone
 - 5xx - There was a problem with the *server*
   - 500 Internal Server Error
   - 502 Bad Gateway
   - 503 Service Unavailable
   - 504 Gateway Timeout

# HTTP/1.1 418 I'm a teapot

https://tools.ietf.org/html/rfc2324#section-2.3.2

# Introducing Requests

> HTTP for Humans

```
>>> r = requests.get('https://api.github.com/user', auth=('user', 'pass'))
>>> r.status_code
200
>>> r.headers['content-type']
'application/json; charset=utf8'
>>> r.encoding
'utf-8'
>>> r.text
u'{"type":"User"...'
>>> r.json()
{u'private_gists': 419, u'total_private_repos': 77, ...}
```

In [5]:
!pip install -U requests

Collecting requests
  Using cached https://files.pythonhosted.org/packages/7d/e3/20f3d364d6c8e5d2353c72a67778eb189176f08e873c9900e10c0287b84b/requests-2.21.0-py2.py3-none-any.whl
Collecting certifi>=2017.4.17 (from requests)
  Using cached https://files.pythonhosted.org/packages/9f/e0/accfc1b56b57e9750eba272e24c4dddeac86852c2bebd1236674d7887e8a/certifi-2018.11.29-py2.py3-none-any.whl
Collecting idna<2.9,>=2.5 (from requests)
  Using cached https://files.pythonhosted.org/packages/14/2c/cd551d81dbe15200be1cf41cd03869a46fe7226e7450af7a6545bfc474c9/idna-2.8-py2.py3-none-any.whl
Collecting chardet<3.1.0,>=3.0.2 (from requests)
  Using cached https://files.pythonhosted.org/packages/bc/a9/01ffebfb562e4274b6487b4bb1ddec7ca55ec7510b22e4c51f14098443b8/chardet-3.0.4-py2.py3-none-any.whl
Installing collected packages: certifi, idna, chardet, requests
Successfully installed certifi-2018.11.29 chardet-3.0.4 idna-2.8 requests-2.21.0
[33mYou are using pip version 18.1, however version 19.0.2 is avail

In [4]:
import requests

In [5]:
resp = requests.get('https://api.github.com')
resp

<Response [200]>

In [6]:
resp.text

'{"current_user_url":"https://api.github.com/user","current_user_authorizations_html_url":"https://github.com/settings/connections/applications{/client_id}","authorizations_url":"https://api.github.com/authorizations","code_search_url":"https://api.github.com/search/code?q={query}{&page,per_page,sort,order}","commit_search_url":"https://api.github.com/search/commits?q={query}{&page,per_page,sort,order}","emails_url":"https://api.github.com/user/emails","emojis_url":"https://api.github.com/emojis","events_url":"https://api.github.com/events","feeds_url":"https://api.github.com/feeds","followers_url":"https://api.github.com/user/followers","following_url":"https://api.github.com/user/following{/target}","gists_url":"https://api.github.com/gists{/gist_id}","hub_url":"https://api.github.com/hub","issue_search_url":"https://api.github.com/search/issues?q={query}{&page,per_page,sort,order}","issues_url":"https://api.github.com/issues","keys_url":"https://api.github.com/user/keys","notificati

In [7]:
resp.headers

{'Server': 'GitHub.com', 'Date': 'Fri, 08 Mar 2019 11:44:05 GMT', 'Content-Type': 'application/json; charset=utf-8', 'Transfer-Encoding': 'chunked', 'Status': '200 OK', 'X-RateLimit-Limit': '60', 'X-RateLimit-Remaining': '59', 'X-RateLimit-Reset': '1552049045', 'Cache-Control': 'public, max-age=60, s-maxage=60', 'Vary': 'Accept', 'ETag': 'W/"7dc470913f1fe9bb6c7355b50a0737bc"', 'X-GitHub-Media-Type': 'github.v3; format=json', 'Access-Control-Expose-Headers': 'ETag, Link, Location, Retry-After, X-GitHub-OTP, X-RateLimit-Limit, X-RateLimit-Remaining, X-RateLimit-Reset, X-OAuth-Scopes, X-Accepted-OAuth-Scopes, X-Poll-Interval, X-GitHub-Media-Type', 'Access-Control-Allow-Origin': '*', 'Strict-Transport-Security': 'max-age=31536000; includeSubdomains; preload', 'X-Frame-Options': 'deny', 'X-Content-Type-Options': 'nosniff', 'X-XSS-Protection': '1; mode=block', 'Referrer-Policy': 'origin-when-cross-origin, strict-origin-when-cross-origin', 'Content-Security-Policy': "default-src 'none'", 'Con

In [8]:
resp.json()

{'current_user_url': 'https://api.github.com/user',
 'current_user_authorizations_html_url': 'https://github.com/settings/connections/applications{/client_id}',
 'authorizations_url': 'https://api.github.com/authorizations',
 'code_search_url': 'https://api.github.com/search/code?q={query}{&page,per_page,sort,order}',
 'commit_search_url': 'https://api.github.com/search/commits?q={query}{&page,per_page,sort,order}',
 'emails_url': 'https://api.github.com/user/emails',
 'emojis_url': 'https://api.github.com/emojis',
 'events_url': 'https://api.github.com/events',
 'feeds_url': 'https://api.github.com/feeds',
 'followers_url': 'https://api.github.com/user/followers',
 'following_url': 'https://api.github.com/user/following{/target}',
 'gists_url': 'https://api.github.com/gists{/gist_id}',
 'hub_url': 'https://api.github.com/hub',
 'issue_search_url': 'https://api.github.com/search/issues?q={query}{&page,per_page,sort,order}',
 'issues_url': 'https://api.github.com/issues',
 'keys_url': '

In [9]:
resp = requests.get('https://api.github.com/orgs/{org}'.format(
    org='Arborian'))
resp

<Response [200]>

In [10]:
resp.json()

{'login': 'Arborian',
 'id': 24494919,
 'node_id': 'MDEyOk9yZ2FuaXphdGlvbjI0NDk0OTE5',
 'url': 'https://api.github.com/orgs/Arborian',
 'repos_url': 'https://api.github.com/orgs/Arborian/repos',
 'events_url': 'https://api.github.com/orgs/Arborian/events',
 'hooks_url': 'https://api.github.com/orgs/Arborian/hooks',
 'issues_url': 'https://api.github.com/orgs/Arborian/issues',
 'members_url': 'https://api.github.com/orgs/Arborian/members{/member}',
 'public_members_url': 'https://api.github.com/orgs/Arborian/public_members{/member}',
 'avatar_url': 'https://avatars1.githubusercontent.com/u/24494919?v=4',
 'description': '',
 'name': 'Arborian Consulting',
 'company': None,
 'blog': 'http://www.arborian.com/',
 'location': 'Atlanta, GA',
 'email': 'info@arborian.com',
 'is_verified': False,
 'has_organization_projects': True,
 'has_repository_projects': True,
 'public_repos': 13,
 'public_gists': 0,
 'followers': 0,
 'following': 0,
 'html_url': 'https://github.com/Arborian',
 'created_a

# Postb.in - quick, throw-away websites to test API clients

Create bin

In [11]:
resp = requests.post('https://postb.in/api/bin')
resp

<Response [201]>

In [12]:
bin_data = resp.json()
bin_data

{'inserted': 1552045502522,
 'updated': 1552045502522,
 'binId': 'dvvLKprW',
 'expires': 1552047302522}

In [13]:
resp = requests.get('https://postb.in/api/bin/{binId}'.format_map(bin_data))
resp.raise_for_status()
resp.json()

{'inserted': 1552045502522,
 'updated': 1552045502522,
 'binId': 'dvvLKprW',
 'expires': 1552047302522}

In [14]:
bin_url = 'https://postb.in/{binId}'.format_map(bin_data)
bin_url

'https://postb.in/dvvLKprW'

In [15]:
resp = requests.post(bin_url, json={'name': 'Arborian'})

In [16]:
resp

<Response [200]>

In [17]:
reqId = resp.text
reqId

'CEKrgG0G'

In [18]:
req_url = 'https://postb.in/api/bin/{binId}/req/{reqId}'.format(
    binId=bin_data['binId'],
    reqId=reqId,
)
resp = requests.get(req_url)
resp

<Response [200]>

In [19]:
resp.json()

{'method': 'POST',
 'path': '/dvvLKprW',
 'headers': {'host': 'postb-in.herokuapp.com',
  'connection': 'close',
  'user-agent': 'python-requests/2.21.0',
  'accept-encoding': 'gzip, deflate',
  'accept': '*/*',
  'content-type': 'application/json',
  'fly-request-id': 'bMRgNjqaEV0yDVMabWWdqu0rcE',
  'fly-app': 'postbin-proxy',
  'connect-time': '0',
  'total-route-time': '0',
  'content-length': '20'},
 'query': {},
 'body': {'name': 'Arborian'},
 'ip': '103.68.105.254',
 'binId': 'dvvLKprW',
 'inserted': 1552045517133,
 'reqId': 'CEKrgG0G'}

# Requests Sessions

- Keep your code DRY (don't repeat yourself)
- Keep headers, authentication, etc. around
- Hold on to your cookies

In [20]:
sess = requests.Session()
sess.headers['Content-Type'] = 'application/json'
sess.auth = ('user', 'pass')

print(bin_url)
resp = sess.get(bin_url)
resp

https://postb.in/dvvLKprW


<Response [200]>

In [21]:
reqId = resp.text
reqId

'4kSNGBuy'

In [22]:
req_url = 'https://postb.in/api/bin/{binId}/req/{reqId}'.format(
    binId=bin_data['binId'],
    reqId=reqId,
)
resp = requests.get(req_url)
resp

<Response [200]>

In [53]:
resp.json()

{'method': 'GET',
 'path': '/1Tat5IFv',
 'headers': {'host': 'postb-in.herokuapp.com',
  'connection': 'close',
  'user-agent': 'python-requests/2.21.0',
  'accept': '*/*',
  'accept-encoding': 'gzip, deflate',
  'authorization': 'Basic dXNlcjpwYXNz',
  'content-type': 'application/json',
  'fly-request-id': 'bMDWhpUKl4j07sEjSrMzdKgLYM',
  'fly-app': 'postbin-proxy',
  'connect-time': '1',
  'total-route-time': '0'},
 'query': {},
 'body': {},
 'ip': '107.139.197.233',
 'binId': '1Tat5IFv',
 'inserted': 1550694502272,
 'reqId': 'tzH0Zz9X'}

In [23]:
import base64
hdr = resp.json()['headers']['authorization']
base64.b64decode(hdr.split()[-1])

b'user:pass'

# Handling errors

In [24]:
resp = sess.get('https://postb.in/foo')
resp

<Response [404]>

In [25]:
resp.content

b'404 - Not Found\n'

In [26]:
resp.status_code

404

In [27]:
try:
    resp.raise_for_status()
except Exception as err:
    print(f'Raised exception {err}')
    saved_err = err

Raised exception 404 Client Error: Not Found for url: https://postb.in/foo


In [28]:
saved_err.request

<PreparedRequest [GET]>

In [29]:
saved_err.response

<Response [404]>

In [30]:
type(saved_err)

requests.exceptions.HTTPError

# Basic Authentication with Requests

In [31]:
sess.auth = ('user', 'pass')
r = sess.get('https://api.github.com')
r.json()

{'message': 'Maximum number of login attempts exceeded. Please try again later.',
 'documentation_url': 'https://developer.github.com/v3'}

In [32]:
r

<Response [403]>

In [33]:
sess.auth = None
sess.get('https://api.github.com').raise_for_status()

# Using uri templates a-la Github

In [91]:
repos = sess.get('https://api.github.com/orgs/Arborian/repos').json()
repos[0]['commits_url']

'https://api.github.com/repos/Arborian/f2e/commits{/sha}'

In [70]:
!pip install uritemplate

Collecting uritemplate
  Using cached https://files.pythonhosted.org/packages/e5/7d/9d5a640c4f8bf2c8b1afc015e9a9d8de32e13c9016dcc4b0ec03481fb396/uritemplate-3.0.0-py2.py3-none-any.whl
Installing collected packages: uritemplate
Successfully installed uritemplate-3.0.0
[33mYou are using pip version 18.1, however version 19.0.2 is available.
You should consider upgrading via the 'pip install --upgrade pip' command.[0m


In [71]:
import uritemplate

In [94]:
[r['name'] for r in repos]

['f2e',
 'ansible-class',
 'client-py',
 'supreload',
 'react-big-calendar',
 'fhirstorm',
 'ansible-role-arborian',
 'inventory',
 'flask-pymongo',
 'chapman2',
 'policies',
 'my-repo',
 'recommender']

In [111]:
# commits_url = repos[-1]['commits_url']
commits_url = 'https://api.github.com/repos/Arborian/recommender/commits{/sha}'

In [112]:
commits = sess.get(uritemplate.expand(commits_url)).json()

In [113]:
commits[0]['sha']

'3a0b9b6617dbad6ed79e01d04312840232943967'

In [114]:
uritemplate.expand(repos[0]['commits_url'], sha=commits[0]['sha'])

'https://api.github.com/repos/Arborian/f2e/commits/3a0b9b6617dbad6ed79e01d04312840232943967'

In [126]:
for x in range(100):
    commit = sess.get(uritemplate.expand(commits_url, sha=commits[0]['sha'])).json()

In [127]:
commits[0]['sha']

'3a0b9b6617dbad6ed79e01d04312840232943967'

# Github: create a personal access token

- https://github.com/settings/tokens
- Send in an Authorization header

In [128]:
access_token = '821479bf917a66190a9b8ef4b401b8ea68d21d85'

In [129]:
sess.headers['Authorization'] = 'token {}'.format(access_token)

In [132]:
commit = sess.get(uritemplate.expand(commits_url, sha=commits[0]['sha'])).json()

In [134]:
commit['commit']

{'author': {'name': 'Rick Copeland',
  'email': 'rick@arborian.com',
  'date': '2019-02-19T00:28:00Z'},
 'committer': {'name': 'Rick Copeland',
  'email': 'rick@arborian.com',
  'date': '2019-02-19T00:28:00Z'},
 'message': 'remove print statement',
 'tree': {'sha': 'a6607f3243618656931442d7d110c68dcf0a245f',
  'url': 'https://api.github.com/repos/Arborian/recommender/git/trees/a6607f3243618656931442d7d110c68dcf0a245f'},
 'url': 'https://api.github.com/repos/Arborian/recommender/git/commits/3a0b9b6617dbad6ed79e01d04312840232943967',
 'comment_count': 0,
 'verification': {'verified': False,
  'reason': 'unsigned',
  'signature': None,
  'payload': None}}

In [135]:
sess.get('https://api.github.com').json()

{'current_user_url': 'https://api.github.com/user',
 'current_user_authorizations_html_url': 'https://github.com/settings/connections/applications{/client_id}',
 'authorizations_url': 'https://api.github.com/authorizations',
 'code_search_url': 'https://api.github.com/search/code?q={query}{&page,per_page,sort,order}',
 'commit_search_url': 'https://api.github.com/search/commits?q={query}{&page,per_page,sort,order}',
 'emails_url': 'https://api.github.com/user/emails',
 'emojis_url': 'https://api.github.com/emojis',
 'events_url': 'https://api.github.com/events',
 'feeds_url': 'https://api.github.com/feeds',
 'followers_url': 'https://api.github.com/user/followers',
 'following_url': 'https://api.github.com/user/following{/target}',
 'gists_url': 'https://api.github.com/gists{/gist_id}',
 'hub_url': 'https://api.github.com/hub',
 'issue_search_url': 'https://api.github.com/search/issues?q={query}{&page,per_page,sort,order}',
 'issues_url': 'https://api.github.com/issues',
 'keys_url': '

In [136]:
sess.get('https://api.github.com/user').json()

{'login': 'rick446',
 'id': 196783,
 'node_id': 'MDQ6VXNlcjE5Njc4Mw==',
 'avatar_url': 'https://avatars2.githubusercontent.com/u/196783?v=4',
 'gravatar_id': '',
 'url': 'https://api.github.com/users/rick446',
 'html_url': 'https://github.com/rick446',
 'followers_url': 'https://api.github.com/users/rick446/followers',
 'following_url': 'https://api.github.com/users/rick446/following{/other_user}',
 'gists_url': 'https://api.github.com/users/rick446/gists{/gist_id}',
 'starred_url': 'https://api.github.com/users/rick446/starred{/owner}{/repo}',
 'subscriptions_url': 'https://api.github.com/users/rick446/subscriptions',
 'organizations_url': 'https://api.github.com/users/rick446/orgs',
 'repos_url': 'https://api.github.com/users/rick446/repos',
 'events_url': 'https://api.github.com/users/rick446/events{/privacy}',
 'received_events_url': 'https://api.github.com/users/rick446/received_events',
 'type': 'User',
 'site_admin': False,
 'name': 'Rick Copeland',
 'company': 'Arborian Consult

# Lab

Open the [API and Requests Lab][requests-lab]

[requests-lab]: ./requests-lab.ipynb