# Overview

In this notebook we'll look at how we can interact with APIs over HTTP using the `requests` package.

We will cover the following:

* How to make requests (including choosing the HTTP method)
* How to work with the response
* What kinds of HTTP errors we might encounter
* How to raise appropriate HTTP errors

In [15]:
import requests

## `requests`

[requests](https://requests.readthedocs.io/en/master/) is a library for making HTTP requests and handling their responses. It is a de facto standard in Python as it provides a neater interface than the built-in libraries.

We'll do a whirlwind tour of HTTP now, but I do recommend reading the [Mozilla Developer Network article on HTTP](https://developer.mozilla.org/en-US/docs/Web/HTTP/Overview).

### What is an HTTP request?

An HTTP request is a message sent to an HTTP Server (aka web server) for a given resource (expressed as a URL).

Most commonly these are the requests that your browser makes when you visit a website.

For example, consider the requests made when you visit https://news.ycombinator.com

![Requests Example](requests-hn.png)

Requests **must** also specify an HTTP Method. Typically this is either `GET` or `POST`.

Requests may optionally specify headers, which can be useful for providing authentication or specifying expected content types.
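To make the method and headers concrete, here is a small sketch that builds two requests without sending them. It relies on `requests.Request(...).prepare()`, which only assembles the message. The `https://example.com/api/items` URL and the payload are placeholders invented for illustration.

```python
import requests

# Hypothetical URL, used only for illustration; nothing is sent over the network.
EXAMPLE_URL = 'https://example.com/api/items'

# A GET request with a header stating which content type we would like back.
get_request = requests.Request(
    'GET', EXAMPLE_URL, headers={'Accept': 'application/json'}
).prepare()

# A POST request carrying a JSON body; requests sets the Content-Type header for us.
post_request = requests.Request(
    'POST', EXAMPLE_URL, json={'name': 'example item'}
).prepare()

print(get_request.method, get_request.headers['Accept'])
print(post_request.method, post_request.headers['Content-Type'])
```

In everyday use you would more likely call `requests.get(...)` or `requests.post(...)` directly, which prepare and send the request in one step.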

In the simplest (and original) configuration, almost all requests corresponded to files on a webserver. Somebody would write an HTML file and, when the URL for that file was requested, the webserver would return the file contents.

However, this has evolved and now a URL (uniform *resource* locator) frequently corresponds to some API method or logical action that the webserver will invoke and return the results of.

In either case the webserver will send back an HTTP response.

### What is an HTTP response?

A message sent in response to an HTTP request. This contains:

* An HTTP status: the result of the request (did it succeed, if so how, did it fail, if so how)
* Payload/response content/body: the actual data. In the case of a file request this returns the content of the requested file.
* Response headers: metadata pertaining to the payload and the request

![Responses Example](requests-hn-payload.png)

### HTTP Status Codes

Status codes are numbers that correspond to messages. We will use the notation 1xx to refer to 100 to 199.

**1xx** 
Informational messages (I have never received one of these)

**2xx**
Success messages. While there are different types of success, you are almost always looking for:
200 - OK

**3xx**
Redirect messages. These are common but are frequently handled by your client library.
Essentially the resource you have asked for has moved and you should make a new request with the redirect data.

**4xx**
Client (user) error messages.

These happen when the webserver is working as expected but something is wrong with the request. If you resend the same request you will receive the same error because the issue is with the request, not the webserver.

Most frequently you might see:
403 - Forbidden: you are not authorized to access the URL
404 - Not Found: nothing exists at the URL you requested

**5xx**
Server error messages.

These happen when something has gone wrong on the webserver. Typically this is due to the server encountering an error while trying to respond (a programming error in the API), a timeout, or the server being unreachable.

Depending on the status code, this family of errors can be handled by retrying. Sometimes the server timed out because it was too busy when you last made the request and it may succeed at a later point with the same request.
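A minimal sketch of that retry idea follows. It is not a feature of `requests` itself: `fetch_with_retries`, the set of retryable codes, and the doubling delay are all assumptions made for this example. The `fetch` argument is any zero-argument callable that returns an object with a `status_code`, for instance `lambda: requests.get(url)`.

```python
import time

# Assumption: these status codes are treated as transient and worth retrying.
RETRYABLE_STATUS_CODES = {500, 502, 503, 504}


def fetch_with_retries(fetch, max_attempts=3, base_delay_seconds=1.0):
    """Call fetch() until the status is not retryable or attempts run out."""
    if max_attempts < 1:
        raise ValueError('max_attempts must be at least 1')

    for attempt in range(max_attempts):
        response = fetch()
        if response.status_code not in RETRYABLE_STATUS_CODES:
            return response
        if attempt < max_attempts - 1:
            # Wait a little longer after each failure (1x, 2x, 4x, ...).
            time.sleep(base_delay_seconds * 2 ** attempt)

    # Every attempt failed; return the last response so the caller can inspect it.
    return response
```

Note that 4xx responses are returned immediately: resending the same bad request would only produce the same error.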

[More details on MDN](https://developer.mozilla.org/en-US/docs/Web/HTTP/Status)
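As a quick recap of the families above, here is a sketch that maps a status code to its family using the hundreds digit, and looks up the standard reason phrase with `http.HTTPStatus` from the standard library. The `describe_status` helper and the family labels are names chosen for this example; `HTTPStatus` raises `ValueError` for codes it does not know.

```python
from http import HTTPStatus

STATUS_FAMILIES = {
    1: 'informational',
    2: 'success',
    3: 'redirect',
    4: 'client error',
    5: 'server error',
}


def describe_status(status_code):
    """Return (family, reason phrase) for a standard HTTP status code."""
    family = STATUS_FAMILIES[status_code // 100]
    phrase = HTTPStatus(status_code).phrase
    return family, phrase


for code in (200, 301, 403, 404, 503):
    print(code, *describe_status(code))
```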

### Making some requests in Python

Let's take a look at what we get when we request https://news.ycombinator.com:

In [42]:
hacker_news_homepage_response = requests.get('https://news.ycombinator.com')
type(hacker_news_homepage_response)

requests.models.Response

We can inspect the status code:

In [8]:
hacker_news_homepage_response.status_code

200

In [43]:
hacker_news_homepage_response.reason

'OK'

We can inspect the response headers:

In [14]:
hacker_news_homepage_response.headers

{'Server': 'nginx', 'Date': 'Wed, 17 Mar 2021 09:26:59 GMT', 'Content-Type': 'text/html; charset=utf-8', 'Transfer-Encoding': 'chunked', 'Connection': 'keep-alive', 'Vary': 'Accept-Encoding', 'Cache-Control': 'private; max-age=0', 'X-Frame-Options': 'DENY', 'X-Content-Type-Options': 'nosniff', 'X-XSS-Protection': '1; mode=block', 'Referrer-Policy': 'origin', 'Strict-Transport-Security': 'max-age=31556900', 'Content-Security-Policy': "default-src 'self'; script-src 'self' 'unsafe-inline' https://www.google.com/recaptcha/ https://www.gstatic.com/recaptcha/ https://cdnjs.cloudflare.com/; frame-src 'self' https://www.google.com/recaptcha/; style-src 'self' 'unsafe-inline'", 'Content-Encoding': 'gzip'}

The headers tell us some metadata about the response: when it was sent, the server we spoke to (nginx), the type of content (text/html) and a bunch of other things that you don't always have to be aware of.
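The `Content-Type` value packs two things into one string: the MIME type and the character encoding. One way to split them, sketched here with the standard library's `email.message.Message` rather than anything from `requests`, is:

```python
from email.message import Message

# The Content-Type value from the response headers shown above.
content_type_header = 'text/html; charset=utf-8'

# email.message.Message knows how to split a header value into its parts.
message = Message()
message['Content-Type'] = content_type_header

print(message.get_content_type())     # the MIME type
print(message.get_content_charset())  # the character encoding
```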

Let's examine the actual content.

In [17]:
hacker_news_homepage_response.content

b'<html lang="en" op="news"><head><meta name="referrer" content="origin"><meta name="viewport" content="width=device-width, initial-scale=1.0"><link rel="stylesheet" type="text/css" href="news.css?QjZXyB0Ejdsp1rLQatoS">\n        <link rel="shortcut icon" href="favicon.ico">\n          <link rel="alternate" type="application/rss+xml" title="RSS" href="rss">\n        <title>Hacker News</title></head><body><center><table id="hnmain" border="0" cellpadding="0" cellspacing="0" width="85%" bgcolor="#f6f6ef">\n        <tr><td bgcolor="#ff6600"><table border="0" cellpadding="0" cellspacing="0" width="100%" style="padding:2px"><tr><td style="width:18px;padding-right:4px"><a href="https://news.ycombinator.com"><img src="y18.gif" width="18" height="18" style="border:1px white solid;"></a></td>\n                  <td style="line-height:12pt; height:10px;"><span class="pagetop"><b class="hnname"><a href="news">Hacker News</a></b>\n              <a href="newest">new</a> | <a href="front">past</a> | 

This is the raw HTML that the webserver returned. A browser would render it as the page you see.
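Note the `b'...'` prefix: `content` is a `bytes` object, not a string (`requests` also offers the decoded version as `response.text`). A minimal sketch of decoding such bytes and pulling one piece out of the HTML, using a shortened copy of the output above and only the standard library. `TitleParser` is a name invented for this example.

```python
from html.parser import HTMLParser


class TitleParser(HTMLParser):
    """Collect the text inside the <title> element."""

    def __init__(self):
        super().__init__()
        self._in_title = False
        self.title = ''

    def handle_starttag(self, tag, attrs):
        if tag == 'title':
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == 'title':
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


# Shortened copy of the bytes shown above.
raw_content = b'<html lang="en" op="news"><head><title>Hacker News</title></head><body></body></html>'

# Decode the bytes into a str, using the charset named in the Content-Type header.
html_text = raw_content.decode('utf-8')

parser = TitleParser()
parser.feed(html_text)
print(parser.title)  # Hacker News
```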

Let's try a request for a non-existent URL:

In [18]:
doomed_response = requests.get('https://news.ycombinator.com/foobar.html')

In [21]:
doomed_response.status_code

404

In [22]:
doomed_response.reason

'Not Found'

In [23]:
if doomed_response:
    print('doomed_response is truthy')
else:
    print('doomed response is not truthy')

doomed response is not truthy


A `Response` is falsy when its status code indicates an error (4xx or 5xx), which gives us an idiomatic way of raising errors:

In [24]:
def do_something(url):
    response = requests.get(url)
    
    if response:
        # use response.content or response.json()
        print('That worked!')
    else:
        response.raise_for_status()

In [44]:
do_something('https://news.ycombinator.com/foobar.html')

HTTPError: 404 Client Error: Not Found for url: https://news.ycombinator.com/foobar.html

In [27]:
do_something('https://news.ycombinator.com/')

That worked!
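The caller can also catch the error that `raise_for_status()` raises. The sketch below builds a `Response` by hand so that it runs without a network connection; in real code the object would come from `requests.get(...)`. The `status_or_error` helper is a name made up for this example.

```python
import requests


def status_or_error(response):
    """Return a short description, catching the error raise_for_status() raises."""
    try:
        response.raise_for_status()
    except requests.exceptions.HTTPError as error:
        # The failed response is attached to the exception.
        return f'failed with {error.response.status_code}'
    return f'succeeded with {response.status_code}'


# Assumption: a hand-built response standing in for a real 404 from the server.
fake_not_found = requests.Response()
fake_not_found.status_code = 404
fake_not_found.reason = 'Not Found'
fake_not_found.url = 'https://news.ycombinator.com/foobar.html'

print(status_or_error(fake_not_found))  # failed with 404
```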


### Working with JSON responses

Here we'll be using a public API for medicine prices in South Africa: https://medicineprices.org.za/

In [28]:
myprodol_search_result = requests.get('https://medicineprices.org.za/api/v2/search-lite?q=myprodol')

In [31]:
print(myprodol_search_result.status_code)
myprodol_search_result.headers

200


{'Server': 'nginx', 'Date': 'Wed, 17 Mar 2021 12:01:07 GMT', 'Content-Type': 'application/json', 'Content-Length': '809', 'Connection': 'keep-alive', 'Expires': 'Wed, 17 Mar 2021 12:11:07 GMT', 'Cache-Control': 'max-age=600', 'Access-Control-Allow-Origin': '*'}

So far, so good. 

Note that the `Content-Type` header is telling us that the data returned is `application/json` (this value is a MIME type, meaning there are well-defined constants).

If we take a look at the raw content it won't be immediately useful to us:

In [32]:
myprodol_search_result.content

b'[{"id": 81131, "nappi_code": "745561021", "name": "Myprodol", "dosage_form": "capsule", "sep": "R 68.50", "number_of_generics": 7}, {"id": 81279, "nappi_code": "793744008", "name": "Myprodol", "dosage_form": "suspension", "sep": "R 153.38", "number_of_generics": 2}, {"id": 81129, "nappi_code": "745561004", "name": "Myprodol", "dosage_form": "capsule", "sep": "R 169.13", "number_of_generics": 7}, {"id": 80875, "nappi_code": "704896001", "name": "Myprodol  Tablets", "dosage_form": "tablet", "sep": "R 169.28", "number_of_generics": 7}, {"id": 80876, "nappi_code": "704896002", "name": "Myprodol Tablets", "dosage_form": "tablet", "sep": "R 469.55", "number_of_generics": 7}, {"id": 81130, "nappi_code": "745561020", "name": "Myprodol", "dosage_form": "capsule", "sep": "R 485.04", "number_of_generics": 7}]'

This is the JSON data as a byte string. To work with it we need to parse it into Python objects.

Mercifully this is such a common task that `requests` will do this for us:

In [34]:
myprodol_json = myprodol_search_result.json()
myprodol_json

[{'id': 81131,
  'nappi_code': '745561021',
  'name': 'Myprodol',
  'dosage_form': 'capsule',
  'sep': 'R 68.50',
  'number_of_generics': 7},
 {'id': 81279,
  'nappi_code': '793744008',
  'name': 'Myprodol',
  'dosage_form': 'suspension',
  'sep': 'R 153.38',
  'number_of_generics': 2},
 {'id': 81129,
  'nappi_code': '745561004',
  'name': 'Myprodol',
  'dosage_form': 'capsule',
  'sep': 'R 169.13',
  'number_of_generics': 7},
 {'id': 80875,
  'nappi_code': '704896001',
  'name': 'Myprodol  Tablets',
  'dosage_form': 'tablet',
  'sep': 'R 169.28',
  'number_of_generics': 7},
 {'id': 80876,
  'nappi_code': '704896002',
  'name': 'Myprodol Tablets',
  'dosage_form': 'tablet',
  'sep': 'R 469.55',
  'number_of_generics': 7},
 {'id': 81130,
  'nappi_code': '745561020',
  'name': 'Myprodol',
  'dosage_form': 'capsule',
  'sep': 'R 485.04',
  'number_of_generics': 7}]

Now we have a list of dicts and we can apply our regular Python tricks.

In [37]:
myprodol_tabs = [medicine for medicine in myprodol_json if medicine['dosage_form'] == 'tablet']

In [38]:
myprodol_tabs

[{'id': 80875,
  'nappi_code': '704896001',
  'name': 'Myprodol  Tablets',
  'dosage_form': 'tablet',
  'sep': 'R 169.28',
  'number_of_generics': 7},
 {'id': 80876,
  'nappi_code': '704896002',
  'name': 'Myprodol Tablets',
  'dosage_form': 'tablet',
  'sep': 'R 469.55',
  'number_of_generics': 7}]
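Another regular Python trick is to turn the price strings into numbers so they can be compared. The sketch below works on a shortened copy of the raw bytes from earlier and parses them with the standard library's `json.loads`, which does much the same job as `.json()`. It assumes every `sep` value looks like `'R 68.50'`; `price_in_rand` is a helper invented for this example.

```python
import json
from decimal import Decimal

# A shortened copy of the raw bytes returned by the search above.
raw_content = (
    b'[{"name": "Myprodol", "dosage_form": "capsule", "sep": "R 68.50"}, '
    b'{"name": "Myprodol", "dosage_form": "suspension", "sep": "R 153.38"}, '
    b'{"name": "Myprodol Tablets", "dosage_form": "tablet", "sep": "R 469.55"}]'
)

# json.loads accepts bytes and gives back ordinary Python lists and dicts.
medicines = json.loads(raw_content)


def price_in_rand(medicine):
    """Convert a price string such as 'R 68.50' into a Decimal (assumed format)."""
    return Decimal(medicine['sep'].lstrip('R').strip())


cheapest = min(medicines, key=price_in_rand)
print(cheapest['dosage_form'], price_in_rand(cheapest))  # capsule 68.50
```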

### Query Parameters

You may have noticed that our previous example had our search string embedded in our URL in the [Query String portion](https://en.wikipedia.org/wiki/Query_string). 

This appeared as `?q=myprodol`

`?` signals the start of the query string. It is then followed by `name=value` pairs.

We could have multiple parameters, separated by `&` (this only has an effect if the server looks for those parameters).

This would look like: `?q=myprodol&max_results=5`
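To see that structure in code, here is a short sketch using the standard library's `urllib.parse` to take a query string apart and to build one. `max_results` is the illustrative parameter from the text above, not one the API is known to support.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Pull the query string out of the URL we requested earlier.
url = 'https://medicineprices.org.za/api/v2/search-lite?q=myprodol'
query_string = urlsplit(url).query
print(query_string)            # q=myprodol
print(parse_qs(query_string))  # {'q': ['myprodol']}

# Build a query string from name/value pairs; values are URL-encoded for us.
built = urlencode({'q': 'myprodol tablets', 'max_results': 5})
print(built)                   # q=myprodol+tablets&max_results=5
```

Notice that the space in `myprodol tablets` had to be encoded, which is exactly the kind of detail that is easy to get wrong by hand.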

Let's see how we can include parameters in requests *without doing any string manipulation*:

In [45]:
improved_myprodol_results = requests.get('https://medicineprices.org.za/api/v2/search-lite', 
                                         params={'q': 'myprodol tablets'})

In [41]:
improved_myprodol_results.json()

[{'id': 81131,
  'nappi_code': '745561021',
  'name': 'Myprodol',
  'dosage_form': 'capsule',
  'sep': 'R 68.50',
  'number_of_generics': 7},
 {'id': 81279,
  'nappi_code': '793744008',
  'name': 'Myprodol',
  'dosage_form': 'suspension',
  'sep': 'R 153.38',
  'number_of_generics': 2},
 {'id': 81129,
  'nappi_code': '745561004',
  'name': 'Myprodol',
  'dosage_form': 'capsule',
  'sep': 'R 169.13',
  'number_of_generics': 7},
 {'id': 80875,
  'nappi_code': '704896001',
  'name': 'Myprodol  Tablets',
  'dosage_form': 'tablet',
  'sep': 'R 169.28',
  'number_of_generics': 7},
 {'id': 80876,
  'nappi_code': '704896002',
  'name': 'Myprodol Tablets',
  'dosage_form': 'tablet',
  'sep': 'R 469.55',
  'number_of_generics': 7},
 {'id': 81130,
  'nappi_code': '745561020',
  'name': 'Myprodol',
  'dosage_form': 'capsule',
  'sep': 'R 485.04',
  'number_of_generics': 7}]

Let us examine some real life API docs and how we might use them with our new knowledge:

https://developer.apple.com/documentation/appstoreconnectapi/download_sales_and_trends_reports