In case the `requests` module is not installed on your system, install it with:

```bash
pip install requests 
```

However, it should be part of the Anaconda distribution.

This material is based on http://docs.python-requests.org/en/master/user/quickstart/#quickstart and on chapter 17 of *Python Crash Course* by Eric Matthes.


We will have a look at the following *application programming interfaces (APIs)*:

  * http://openweathermap.org/api
  * https://developer.github.com/v3/
  
  

## URLs?

The URLs we use to work with remote *Representational State Transfer (REST)* APIs usually consist of a host, a path, and a query. In this tutorial we only work with the *Hypertext Transfer Protocol (HTTP)* and its encrypted variant, HTTPS.


```
                    hierarchical part
        ┌───────────────────┴─────────────────────┐
                    authority               path
        ┌───────────────┴───────────────┐┌───┴────┐
  abc://username:password@example.com:123/path/data?key=value&key2=value2#fragid1
  └┬┘   └───────┬───────┘ └────┬────┘ └┬┘           └─────────┬─────────┘ └──┬──┘
scheme  user information     host     port                  query         fragment
```


The example above is from https://en.wikipedia.org/wiki/Uniform_Resource_Identifier#Examples.
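
To make the parts of the diagram tangible, here is a minimal sketch that splits the same example URI with Python's standard library. It only uses `urllib.parse`; the variable names `uri`, `parts`, and `query` are chosen for illustration.

```python
from urllib.parse import urlsplit, parse_qs

# The example URI from the diagram above
uri = 'abc://username:password@example.com:123/path/data?key=value&key2=value2#fragid1'

parts = urlsplit(uri)

print(parts.scheme)    # abc
print(parts.username)  # username
print(parts.password)  # password
print(parts.hostname)  # example.com
print(parts.port)      # 123
print(parts.path)      # /path/data
print(parts.query)     # key=value&key2=value2
print(parts.fragment)  # fragid1

# The query string can be decoded further into a dictionary of lists
query = parse_qs(parts.query)
print(query)           # {'key': ['value'], 'key2': ['value2']}
```

For the REST APIs in this tutorial, the host, the path, and the query are the parts that matter most.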

# Working with APIs on the CLI

On the CLI in a Unix environment, you usually have access to either `curl` or `wget`. Both are similar and allow you, among other things, to interact with HTTP-based REST APIs.

In [5]:
%%bash

# Quote the URL; otherwise the shell treats '&' as the background operator
curl "https://api.github.com/search/repositories?q=language:python&sort=stars"

Couldn't find program: 'bash'


In [6]:
%%bash

# Quote the URL; otherwise the shell treats '&' as the background operator
wget -O - "https://api.github.com/search/repositories?q=language:python&sort=stars"

Couldn't find program: 'bash'


# Working with APIs from Python
 

## Make a Request

For this tutorial we mostly collect information with HTTP `GET` requests. The `requests` module also supports HTTP `POST`, `PUT`, `DELETE`, `HEAD`, and `OPTIONS` via functions of the same name, for example `requests.post`. See http://docs.python-requests.org/en/master/user/quickstart/#make-a-request for more details.

You can access the status code of an HTTP request via the `status_code` attribute. 
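
A status code is just an integer, so it helps to know how to interpret it. The following sketch uses only the standard library; `describe_status` is a hypothetical helper written for this illustration and is not part of `requests`. With a real response you could pass it `r.status_code`.

```python
from http import HTTPStatus


def describe_status(code):
    """Return a short human-readable description of an HTTP status code."""
    status = HTTPStatus(code)
    # Codes in the 2xx range signal that the request succeeded
    kind = 'success' if 200 <= code < 300 else 'no success'
    return '{} {} ({})'.format(code, status.phrase, kind)


for code in (200, 404, 500):
    print(describe_status(code))
```

If you prefer an exception over a manual check, `requests` responses also offer `r.raise_for_status()`, which raises an error for 4xx and 5xx codes.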

In [7]:
import requests


url = 'https://api.github.com/search/repositories?q=language:python&sort=stars'

r = requests.get(url)
print(type(r))
print(r.url)
print(r.status_code)
print(r.json())

results = r.json()['items']

<class 'requests.models.Response'>
https://api.github.com/search/repositories?q=language:python&sort=stars
200
{'total_count': 2813382, 'incomplete_results': False, 'items': [{'id': 21289110, 'node_id': 'MDEwOlJlcG9zaXRvcnkyMTI4OTExMA==', 'name': 'awesome-python', 'full_name': 'vinta/awesome-python', 'owner': {'login': 'vinta', 'id': 652070, 'node_id': 'MDQ6VXNlcjY1MjA3MA==', 'avatar_url': 'https://avatars2.githubusercontent.com/u/652070?v=4', 'gravatar_id': '', 'url': 'https://api.github.com/users/vinta', 'html_url': 'https://github.com/vinta', 'followers_url': 'https://api.github.com/users/vinta/followers', 'following_url': 'https://api.github.com/users/vinta/following{/other_user}', 'gists_url': 'https://api.github.com/users/vinta/gists{/gist_id}', 'starred_url': 'https://api.github.com/users/vinta/starred{/owner}{/repo}', 'subscriptions_url': 'https://api.github.com/users/vinta/subscriptions', 'organizations_url': 'https://api.github.com/users/vinta/orgs', 'repos_url': 'https://api

In [8]:
len(results)

30

In [9]:
summary = [(el['full_name'], el['stargazers_count'], el['html_url'], 
            el['description']) for el in results[:10]]

for name, stars, url, desc in summary:
    print(name)
    print(stars)
    print(url)
    print(desc)
    print('---------------')

vinta/awesome-python
52568
https://github.com/vinta/awesome-python
A curated list of awesome Python frameworks, libraries, software and resources
---------------
rg3/youtube-dl
39711
https://github.com/rg3/youtube-dl
Command-line program to download videos from YouTube.com and other video sites
---------------
toddmotto/public-apis
39482
https://github.com/toddmotto/public-apis
A collective list of public JSON APIs for use in web development.
---------------
tensorflow/models
38327
https://github.com/tensorflow/models
Models and examples built with TensorFlow
---------------
pallets/flask
37366
https://github.com/pallets/flask
The Python micro framework for building web applications.
---------------
nvbn/thefuck
36248
https://github.com/nvbn/thefuck
Magnificent app which corrects your previous console command.
---------------
jakubroztocil/httpie
36056
https://github.com/jakubroztocil/httpie
Modern command line HTTP client – user-friendly curl alternative with intuitive UI, JSON suppor

## Passing Parameters In URLs

To get a weather forecast, we need to specify, for example, the place we want the forecast for and the format in which we want to receive the response. All those parameters are passed as a dictionary to the `params` keyword argument.

In [10]:
import json
import api_keys
import requests


url = "http://api.openweathermap.org/data/2.5/forecast"
query = {'q': 'Copenhagen,dk', 
         'mode': 'json',                       
         'units': 'metric',
         'appid': api_keys.OWM_API_KEY}
r = requests.get(url, params=query)

r.json()

ModuleNotFoundError: No module named 'api_keys'
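
To see roughly what the `params` dictionary turns into, the following sketch builds the query string by hand with the standard library. This is an approximation for illustration: `requests` does its own encoding, which may differ in detail, and `'YOUR_API_KEY'` is a placeholder rather than a real key.

```python
from urllib.parse import urlencode

url = 'http://api.openweathermap.org/data/2.5/forecast'
query = {'q': 'Copenhagen,dk',
         'mode': 'json',
         'units': 'metric',
         'appid': 'YOUR_API_KEY'}  # placeholder, not a real key

# Percent-encode the dictionary and append it to the URL after a '?'
full_url = url + '?' + urlencode(query)
print(full_url)
```

After a real request, `r.url` shows the URL that `requests` actually used.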

## Response Content

### As Text

`requests` automatically decodes content from the server. Most Unicode charsets are decoded seamlessly.

When you make a request, `requests` makes educated guesses about the encoding of the response based on the HTTP headers. The text encoding guessed by `requests` is used when you access `r.text`.

In [None]:
import requests

# A call to the GitHub public events timeline
r = requests.get('https://api.github.com/events')
# response encoding
print(r.encoding)
# response content
print(r.text)
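
Why the guessed encoding matters can be shown without any network access. In this sketch the sample string and the variable names are made up for illustration; the bytes stand in for a response body.

```python
text = 'Æbleskiver på Nørrebro'

# What a server would send over the wire: bytes in some encoding
raw = text.encode('utf-8')

# Decoding with the right encoding restores the original text
right = raw.decode('utf-8')
print(right == text)  # True

# Decoding with a wrong guess succeeds, but yields garbled characters
wrong = raw.decode('latin-1')
print(wrong == text)  # False
```

If `requests` guesses wrong, you can set `r.encoding` yourself before you access `r.text`.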

### JSON Response Content

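`requests` has a built-in JSON decoder, which returns the JSON response decoded into Python objects: a dictionary for a JSON object, or a list for a JSON array (the events endpoint above returns a list).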

In [None]:
r.json()
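
The result of `r.json()` is roughly comparable to decoding the response text with the standard `json` module. The sketch below uses two small, made-up JSON documents instead of a real response to show which Python types come out.

```python
import json

# Two small JSON documents (made up for illustration)
object_text = '{"login": "octocat", "public_repos": 8}'
array_text = '[{"type": "PushEvent"}, {"type": "WatchEvent"}]'

user = json.loads(object_text)
events = json.loads(array_text)

print(type(user).__name__, user['login'])        # dict octocat
print(type(events).__name__, events[0]['type'])  # list PushEvent
```

It is therefore worth checking whether an endpoint returns an object or an array before you index into the result.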

### Binary Response Content

You can also access the response body as bytes, for example when you request a file or an image.

In [None]:
r.content

## Writing a Response to a File

In [None]:
import requests


user_url = 'https://api.github.com/users/HelgeCPH'
r = requests.get(user_url)
img_url = r.json()['avatar_url']
print(img_url)
r = requests.get(img_url)

filename = './avatar.jpg'

with open(filename, 'wb') as fd:
    fd.write(r.content)

![](avatar.jpg)

### Download Large Files or Responses

If you want to save a large file, it is a good idea to stream the incoming data: request it with `stream=True`, split it into smaller chunks, and write them to the file one after another.

In [None]:
# stream=True defers downloading the body until it is iterated over
r = requests.get(img_url, stream=True)

with open(filename, 'wb') as fd:
    for chunk in r.iter_content(chunk_size=1024):
        fd.write(chunk)
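
The chunking idea itself is independent of `requests`. As a minimal sketch, the hypothetical helper `save_in_chunks` below copies any binary file-like object to disk piece by piece; an in-memory `BytesIO` object stands in for the downloaded data, and the file name `chunk_demo.bin` is arbitrary.

```python
import io
import os
import tempfile


def save_in_chunks(stream, path, chunk_size=1024):
    """Copy a binary file-like object to path, chunk_size bytes at a time."""
    n_chunks = 0
    with open(path, 'wb') as fd:
        # iter() with a sentinel calls stream.read(chunk_size) until it returns b''
        for chunk in iter(lambda: stream.read(chunk_size), b''):
            fd.write(chunk)
            n_chunks += 1
    return n_chunks


# 2500 bytes of dummy data stand in for a downloaded response body
data = io.BytesIO(b'x' * 2500)
path = os.path.join(tempfile.gettempdir(), 'chunk_demo.bin')

print(save_in_chunks(data, path))  # 3 chunks: 1024 + 1024 + 452 bytes
print(os.path.getsize(path))       # 2500
```

Only one chunk is held in memory at a time, which is the point of streaming large downloads.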

## Custom Headers, Authentication, Response Headers

If you want to send a request with custom headers, pass them as a dictionary to the `headers` keyword argument of the request function.

For example, one way to authenticate to the GitHub API is to send an API token in the header. This increases the number of possible requests to 5000 per hour. Note that to make the following code run, you first have to generate a GitHub API token (https://github.com/blog/1509-personal-api-tokens) and add it to our `api_keys` module. There are many other ways to authenticate, see http://docs.python-requests.org/en/master/user/authentication/#authentication.

The header of a response is accessible as a dictionary via the `headers` attribute on the response object. In the following example, we have to inspect the response header to get the links to more results, as the GitHub API returns results split across many pages.

In [None]:
%%bash

echo "GITHUB_API_KEY = 'YOUR_API_KEY'" >> ./api_keys.py

In [None]:
import api_keys
import requests
from tqdm import tqdm
from datetime import datetime
from urllib.parse import urlparse


url = 'https://api.github.com/repos/pallets/flask/contributors'
headers = {'Authorization': 'token {}'.format(api_keys.GITHUB_API_KEY)}

r = requests.get(url, headers=headers)
    
print(r.headers['X-RateLimit-Remaining'])
print(r.headers['X-RateLimit-Reset'])
print(datetime.fromtimestamp(int(r.headers['X-RateLimit-Reset'])))

contributors = [(contrib['login'], contrib['contributions'], contrib['html_url'])
                for contrib in r.json()]
 
print(r.headers)
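
The rate limit headers can also be used to work out how long to wait once the quota is used up. The sketch below assumes that `X-RateLimit-Reset` holds a Unix timestamp in seconds, as the `datetime.fromtimestamp` call above suggests. The header values and the helper `seconds_until_reset` are made up for illustration.

```python
from datetime import datetime, timezone

# Made-up header values for illustration
sample_headers = {'X-RateLimit-Remaining': '0',
                  'X-RateLimit-Reset': '1700000000'}


def seconds_until_reset(headers, now):
    """Return the number of whole seconds until the rate limit resets."""
    reset = datetime.fromtimestamp(int(headers['X-RateLimit-Reset']),
                                   tz=timezone.utc)
    # Never return a negative waiting time
    return max(0, int((reset - now).total_seconds()))


# Pretend that "now" is ten minutes before the reset time
now = datetime.fromtimestamp(1699999400, tz=timezone.utc)
print(seconds_until_reset(sample_headers, now))  # 600
```

In a real script you would pass `r.headers` and `datetime.now(timezone.utc)` instead of the sample values.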

In [None]:
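def gen_next_links(headers_link_str):
    # On the first page, the Link header is expected to look like
    # '<url?page=2>; rel="next", <url?page=N>; rel="last"'
    next_page_str, last_page_str = headers_link_str.split(',')
    # Strip the surrounding '<' and '>' from the URL of the next page
    next_page_link = next_page_str.split(';')[0][1:-1]
    # Everything before the page number (assumes a single-digit next page)
    link_base = next_page_link[:-1]
    start_idx = int(urlparse(next_page_link).query.split('=')[1])
    # The second entry starts with a space, hence the slice from index 2
    last_page_link = last_page_str.split(';')[0][2:-1]
    end_idx = int(urlparse(last_page_link).query.split('=')[1])
    return [link_base + str(idx) for idx in range(start_idx, end_idx + 1)]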


next_urls = gen_next_links(r.headers['Link'])
for next_url in tqdm(next_urls):
    r = requests.get(next_url, headers=headers)
    contributors += [(contrib['login'], contrib['contributions'], contrib['html_url'])
                     for contrib in r.json()]
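
Splitting the `Link` header by position is fragile. As an alternative sketch, the hypothetical helper `parse_link_header` below maps each `rel` name to its URL with a regular expression. The header value is made up, modelled on the kind of `Link` header that `gen_next_links` takes apart.

```python
import re


def parse_link_header(value):
    """Map each rel name in a Link header to its URL."""
    links = {}
    # Each entry has the form <URL>; rel="name"
    for url, rel in re.findall(r'<([^>]*)>\s*;\s*rel="([^"]*)"', value):
        links[rel] = url
    return links


# Made-up header value for illustration
link_header = ('<https://api.github.com/repositories/1/contributors?page=2>; rel="next", '
               '<https://api.github.com/repositories/1/contributors?page=5>; rel="last"')

links = parse_link_header(link_header)
print(links['next'])
print(links['last'])
```

`requests` also exposes the parsed header directly as `r.links`, so in practice you can follow `r.links['next']['url']` until there is no `next` entry left.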

In [None]:
import pygal


print('There are {} contributors to Flask.'.format(len(contributors)))

chart = pygal.Bar(x_label_rotation=80, show_legend=False, spacing=170, 
                  height=1000, width=4000)
chart.title = 'Contributions to Flask on GitHub'

names, no_contrib, _ = zip(*contributors)

values = []
for label, value, link in contributors:
    s_dict = {
        'value': value,
        'label': label,
        'xlink': {'href': link}}
    values.append(s_dict)


chart.x_labels = names
chart.add('', values) 
chart.render_to_file('contrib_flask.svg')

![](contrib_flask.svg)

# A Small Detour... on Counting

In [None]:
gender = ['m', 'f', 'm', 'f', 'm']

f_count = 0
m_count = 0

for g in gender:
    if g == 'f':
        f_count += 1
    else:
        m_count += 1

In [None]:
sum([1 for g in gender if g == 'f'])

In [None]:
from collections import Counter


print(Counter(gender))

females = Counter(gender)['f']
females

In [None]:
import numpy as np


np.unique(gender, return_counts=True)

In [None]:
np.median(np.array([1, 3, 4, 20, 20, 20, 709]))
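
The last cell computes a median. As a small sketch of why that can be preferable to a mean, the following compares both on the same numbers, using the standard `statistics` module instead of NumPy.

```python
import statistics

values = [1, 3, 4, 20, 20, 20, 709]

# The median is the middle value of the sorted data
print(statistics.median(values))  # 20

# The mean is pulled upwards by the single outlier 709
print(statistics.mean(values))    # 111
```

A single extreme value shifts the mean considerably, while the median stays at a typical value.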