In case the `requests` module is not installed on your VM, install it with:

```bash
pip install requests 
```


This material is based on http://docs.python-requests.org/en/master/user/quickstart/#quickstart and on chapter 17 of *Python Crash Course* by Eric Matthes.


We will have a look at the following *application programming interfaces (APIs)*:

  * http://openweathermap.org/api
  * https://developer.github.com/v3/
  
  

## URLs?

The URLs we use for working with remote *Representational State Transfer (REST)* APIs usually consist of a scheme, a host, a path, and a query. In this tutorial we will work with the *Hypertext Transfer Protocol (HTTP)* only.


```
                    hierarchical part
        ┌───────────────────┴─────────────────────┐
                    authority               path
        ┌───────────────┴───────────────┐┌───┴────┐
  abc://username:password@example.com:123/path/data?key=value&key2=value2#fragid1
  └┬┘   └───────┬───────┘ └────┬────┘ └┬┘           └─────────┬─────────┘ └──┬──┘
scheme  user information     host     port                  query         fragment
```


The example above is from https://en.wikipedia.org/wiki/Uniform_Resource_Identifier#Examples.
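
The components shown in the diagram can also be pulled apart programmatically. The following is a minimal sketch using `urlparse` and `parse_qs` from the standard library module `urllib.parse`, applied to the example URL above. The variable names are only illustrative.

```python
from urllib.parse import urlparse, parse_qs

url = 'abc://username:password@example.com:123/path/data?key=value&key2=value2#fragid1'
parts = urlparse(url)

print(parts.scheme)    # scheme
print(parts.username)  # first half of the user information
print(parts.hostname)  # host
print(parts.port)      # port, already converted to an int
print(parts.path)      # path
print(parts.query)     # raw query string
print(parts.fragment)  # fragment

# parse_qs turns the raw query string into a dictionary of lists,
# because a key may occur more than once in a query.
query = parse_qs(parts.query)
print(query)
```

For REST APIs, the host, path, and query are usually the parts you vary from request to request.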

# Working with APIs on the CLI

On the CLI in a Unix environment, you usually have access to either `curl` or `wget`. Both are similar and allow you, among other things, to interact with HTTP-based REST APIs.

In [196]:
%%bash

# Quote the URL: an unquoted & would make bash run curl in the background and drop sort=stars.
curl 'https://api.github.com/search/repositories?q=language:python&sort=stars'

{
  "total_count": 1559802,
  "incomplete_results": false,
  "items": [
    {
      "id": 21289110,
      "name": "awesome-python",
      "full_name": "vinta/awesome-python",
      "owner": {
        "login": "vinta",
        "id": 652070,
        "avatar_url": "https://avatars1.githubusercontent.com/u/652070?v=3",
        "gravatar_id": "",
        "url": "https://api.github.com/users/vinta",
        "html_url": "https://github.com/vinta",
        "followers_url": "https://api.github.com/users/vinta/followers",
        "following_url": "https://api.github.com/users/vinta/following{/other_user}",
        "gists_url": "https://api.github.com/users/vinta/gists{/gist_id}",
        "starred_url": "https://api.github.com/users/vinta/starred{/owner}{/repo}",
        "subscriptions_url": "https://api.github.com/users/vinta/subscriptions",
        "organizations_url": "https://api.github.com/users/vinta/orgs",
        "repos_url": "https://api.github.com/users/vinta/repos",
        "events_url


In [197]:
%%bash

# Quote the URL: an unquoted & would make bash run wget in the background and drop sort=stars.
wget -O - 'https://api.github.com/search/repositories?q=language:python&sort=stars'

{
  "total_count": 1559800,
  "incomplete_results": false,
  "items": [
    {
      "id": 21289110,
      "name": "awesome-python",
      "full_name": "vinta/awesome-python",
      "owner": {
        "login": "vinta",
        "id": 652070,
        "avatar_url": "https://avatars1.githubusercontent.com/u/652070?v=3",
        "gravatar_id": "",
        "url": "https://api.github.com/users/vinta",
        "html_url": "https://github.com/vinta",
        "followers_url": "https://api.github.com/users/vinta/followers",
        "following_url": "https://api.github.com/users/vinta/following{/other_user}",
        "gists_url": "https://api.github.com/users/vinta/gists{/gist_id}",
        "starred_url": "https://api.github.com/users/vinta/starred{/owner}{/repo}",
        "subscriptions_url": "https://api.github.com/users/vinta/subscriptions",
        "organizations_url": "https://api.github.com/users/vinta/orgs",
        "repos_url": "https://api.github.com/users/vinta/repos",
        "events_url

--2017-03-14 02:23:30--  https://api.github.com/search/repositories?q=language:python
Resolving api.github.com (api.github.com)... 192.30.253.116, 192.30.253.117
Connecting to api.github.com (api.github.com)|192.30.253.116|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 167872 (164K) [application/json]
Saving to: ‘STDOUT’

     0K .......... .......... .......... .......... .......... 30% 71.2K 2s
    50K .......... .......... .......... .......... .......... 60% 85.4K 1s
   100K .......... .......... .......... .......... .......... 91% 53.4K 0s
   150K .......... ...                                        100% 29.7K=2.7s

2017-03-14 02:23:34 (60.9 KB/s) - written to stdout [167872/167872]



# Working with APIs from Python
 

## Make a Request

In this tutorial we mostly collect information with HTTP `GET` requests. The `requests` module likewise supports HTTP `POST`, `PUT`, `DELETE`, `HEAD`, and `OPTIONS` via correspondingly named functions. See http://docs.python-requests.org/en/master/user/quickstart/#make-a-request for more details.

You can access the status code of an HTTP response via the `status_code` attribute of the response object.

In [198]:
import requests


url = 'https://api.github.com/search/repositories?q=language:python&sort=stars'

r = requests.get(url)

print(r.url)
print(r.status_code)
print(r.json())

results = r.json()['items']

https://api.github.com/search/repositories?q=language:python&sort=stars
200
{'total_count': 1559799, 'items': [{'open_issues': 228, 'languages_url': 'https://api.github.com/repos/vinta/awesome-python/languages', 'svn_url': 'https://github.com/vinta/awesome-python', 'downloads_url': 'https://api.github.com/repos/vinta/awesome-python/downloads', 'url': 'https://api.github.com/repos/vinta/awesome-python', 'hooks_url': 'https://api.github.com/repos/vinta/awesome-python/hooks', 'assignees_url': 'https://api.github.com/repos/vinta/awesome-python/assignees{/user}', 'stargazers_count': 30546, 'git_commits_url': 'https://api.github.com/repos/vinta/awesome-python/git/commits{/sha}', 'archive_url': 'https://api.github.com/repos/vinta/awesome-python/{archive_format}{/ref}', 'deployments_url': 'https://api.github.com/repos/vinta/awesome-python/deployments', 'milestones_url': 'https://api.github.com/repos/vinta/awesome-python/milestones{/number}', 'tags_url': 'https://api.github.com/repos/vinta/awes

In [199]:
summary = [(el['full_name'], el['stargazers_count'], el['html_url'], 
            el['description']) for el in results[:10]]

for name, stars, url, desc in summary:
    print(name)
    print(stars)
    print(url)
    print(desc)
    print('---------------')

vinta/awesome-python
30546
https://github.com/vinta/awesome-python
A curated list of awesome Python frameworks, libraries, software and resources
---------------
jakubroztocil/httpie
28658
https://github.com/jakubroztocil/httpie
Modern command line HTTP client – user-friendly curl alternative with intuitive UI, JSON support, syntax highlighting, wget-like downloads, extensions, etc.  https://httpie.org
---------------
pallets/flask
25806
https://github.com/pallets/flask
A microframework based on Werkzeug, Jinja2 and good intentions
---------------
nvbn/thefuck
24542
https://github.com/nvbn/thefuck
Magnificent app which corrects your previous console command.
---------------
rg3/youtube-dl
24442
https://github.com/rg3/youtube-dl
Command-line program to download videos from YouTube.com and other video sites
---------------
django/django
24365
https://github.com/django/django
The Web framework for perfectionists with deadlines.
---------------
kennethreitz/requests
23950
https://github.co
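
The `status_code` attribute is a plain integer, so it is often useful to translate it into something readable before deciding how to proceed. The sketch below uses a hypothetical helper, `describe_status`, built on `HTTPStatus` from the standard library. It runs without a network connection; the codes in the loop are just examples.

```python
from http import HTTPStatus


def describe_status(code):
    """Return a short, human-readable description of an HTTP status code."""
    # HTTPStatus knows the standard phrases; unknown codes raise ValueError.
    phrase = HTTPStatus(code).phrase
    if code < 200:
        kind = 'informational'
    elif code < 300:
        kind = 'success'
    elif code < 400:
        kind = 'redirect'
    elif code < 500:
        kind = 'client error'
    else:
        kind = 'server error'
    return '{} {} ({})'.format(code, phrase, kind)


for code in (200, 301, 404, 500):
    print(describe_status(code))
```

With a real response you would call `describe_status(r.status_code)`. `requests` also offers `r.raise_for_status()`, which raises an exception for 4xx and 5xx status codes.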

## Passing Parameters In URLs

To get a weather forecast, we need to specify, for example, the place we want the forecast for and the format in which we want to receive the response. All those parameters are passed as a dictionary to the `params` keyword argument.

In [201]:
import json
import api_keys
import requests


url = "http://api.openweathermap.org/data/2.5/forecast"
query = {'q': 'Copenhagen,dk', 
         'mode': 'json',                       
         'units': 'metric',
         'appid': api_keys.OWM_API_KEY}
r = requests.get(url, params=query)

# Show the final URL, including the encoded query string
r.url

'http://api.openweathermap.org/data/2.5/forecast?units=metric&appid=e8f55087975961a9b30f29707a533690&mode=json&q=Copenhagen%2Cdk'
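
To get a feeling for what happens to the `params` dictionary, the following sketch builds a comparable URL by hand with `urlencode` from the standard library. It is only an approximation of what `requests` does internally, and `'YOUR_API_KEY'` is a placeholder, not a working key.

```python
from urllib.parse import urlencode

base_url = 'http://api.openweathermap.org/data/2.5/forecast'
query = {'q': 'Copenhagen,dk',
         'mode': 'json',
         'units': 'metric',
         'appid': 'YOUR_API_KEY'}  # placeholder value

# urlencode joins the key-value pairs with '&' and percent-encodes
# characters that are not allowed in a query, such as the comma.
query_string = urlencode(query)
full_url = base_url + '?' + query_string
print(full_url)
```

Passing the dictionary via `params` saves you from doing this encoding yourself and from forgetting to escape special characters.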

## Response Content

### As Text

`requests` will automatically decode content from the server. Most Unicode charsets are seamlessly decoded.

When you make a request, `requests` makes educated guesses about the encoding of the response based on the HTTP headers. The text encoding guessed by `requests` is used when you access `r.text`.

In [202]:
import requests

# A call to the Github timeline
r = requests.get('https://api.github.com/events')
# response encoding
print(r.encoding)
# response content
print(r.text)

utf-8
[{"id":"5496780305","type":"CreateEvent","actor":{"id":26431507,"login":"lurtzzzz","display_login":"lurtzzzz","gravatar_id":"","url":"https://api.github.com/users/lurtzzzz","avatar_url":"https://avatars.githubusercontent.com/u/26431507?"},"repo":{"id":85055587,"name":"lurtzzzz/sdffd","url":"https://api.github.com/repos/lurtzzzz/sdffd"},"payload":{"ref":null,"ref_type":"repository","master_branch":"master","description":"https://pirateaccess.xyz/?load=/torrent/9988682/Brandon_Sanderson_-_Het_Laatste_Rijk._NL_Ebook._DMT_","pusher_type":"user"},"public":true,"created_at":"2017-03-15T09:48:32Z"},{"id":"5496780303","type":"PushEvent","actor":{"id":12061511,"login":"rasanjaya85","display_login":"rasanjaya85","gravatar_id":"","url":"https://api.github.com/users/rasanjaya85","avatar_url":"https://avatars.githubusercontent.com/u/12061511?"},"repo":{"id":84574364,"name":"rasanjaya85/weather-service","url":"https://api.github.com/repos/rasanjaya85/weather-service"},"payload":{"push_id":1615

### JSON Response Content

`requests` has a built-in JSON decoder, which returns the JSON response decoded into the corresponding Python objects, typically a dictionary or a list.

In [None]:
r.json()

### Binary Response Content

You can also access the response body as bytes, for example when you request a file or an image.

In [203]:
r.content

b'[{"id":"5496780305","type":"CreateEvent","actor":{"id":26431507,"login":"lurtzzzz","display_login":"lurtzzzz","gravatar_id":"","url":"https://api.github.com/users/lurtzzzz","avatar_url":"https://avatars.githubusercontent.com/u/26431507?"},"repo":{"id":85055587,"name":"lurtzzzz/sdffd","url":"https://api.github.com/repos/lurtzzzz/sdffd"},"payload":{"ref":null,"ref_type":"repository","master_branch":"master","description":"https://pirateaccess.xyz/?load=/torrent/9988682/Brandon_Sanderson_-_Het_Laatste_Rijk._NL_Ebook._DMT_","pusher_type":"user"},"public":true,"created_at":"2017-03-15T09:48:32Z"},{"id":"5496780303","type":"PushEvent","actor":{"id":12061511,"login":"rasanjaya85","display_login":"rasanjaya85","gravatar_id":"","url":"https://api.github.com/users/rasanjaya85","avatar_url":"https://avatars.githubusercontent.com/u/12061511?"},"repo":{"id":84574364,"name":"rasanjaya85/weather-service","url":"https://api.github.com/repos/rasanjaya85/weather-service"},"payload":{"push_id":16151112
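
The three views of a response body are closely related. As a rough offline sketch (the `body` bytes below are made up for illustration, and the real decoding in `requests` is more involved), `r.content` corresponds to the raw bytes, `r.text` to those bytes decoded with the response encoding, and `r.json()` to the decoded text parsed as JSON:

```python
import json

# Hypothetical raw response body, as it would arrive over the network.
body = b'[{"id": "1", "type": "PushEvent", "public": true}]'
encoding = 'utf-8'

text = body.decode(encoding)  # comparable to r.text
data = json.loads(text)       # comparable to r.json()

print(type(body), type(text), type(data))
print(data[0]['type'])
```

For images and other files that are not text, only the bytes view is meaningful.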

## Writing Response to a file

In [204]:
import requests


user_url = 'https://api.github.com/users/HelgeCPH'
r = requests.get(user_url)
img_url = r.json()['avatar_url']
print(img_url)
r = requests.get(img_url)

filename = './avatar.jpg'

with open(filename, 'wb') as fd:
    fd.write(r.content)

https://avatars3.githubusercontent.com/u/21216985?v=3



### Download Large Files or Responses

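If you want to save a large file, it is a good idea to process the incoming data as a stream: read it in smaller blocks and write them to disk sequentially, so that the whole response never has to fit into memory. For this to have an effect, the request has to be made with `stream=True`; otherwise `requests` downloads the complete body before `iter_content` is called.

In [None]:
# stream=True defers downloading the body until it is iterated over
r = requests.get(img_url, stream=True)
with open(filename, 'wb') as fd:
    for chunk in r.iter_content(chunk_size=1024):
        fd.write(chunk)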
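
The chunk-wise writing itself is independent of `requests`. The sketch below uses a hypothetical helper, `save_in_chunks`, that accepts any iterable of byte blocks, so it can be tried without a network connection. With a real response you would pass `r.iter_content(chunk_size=1024)` as the iterable.

```python
import os
import tempfile


def save_in_chunks(chunks, filename):
    """Write an iterable of byte blocks to filename and return the bytes written."""
    written = 0
    with open(filename, 'wb') as fd:
        for chunk in chunks:
            written += fd.write(chunk)
    return written


# Simulate a download that arrives in three blocks.
fake_chunks = [b'a' * 1024, b'b' * 1024, b'c' * 10]
target = os.path.join(tempfile.mkdtemp(), 'download.bin')
print(save_in_chunks(fake_chunks, target))
```

Only one block is held in memory at a time, which is the point of the approach.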

## Custom Headers, Authentication, Response Headers

If you want to send your request with a customized header, you can pass the header as a dictionary to the `headers` keyword argument of the request function call.

For example, one way to authenticate to the GitHub API is to send an API token in the header. This increases the number of possible requests to 5000 per hour. Note that to make the following code run, you first have to generate a GitHub API token (https://github.com/blog/1509-personal-api-tokens) and add it to our `api_keys` module. There are many other ways to authenticate, see http://docs.python-requests.org/en/master/user/authentication/#authentication.

The headers of a response are accessible as a dictionary via the `headers` attribute of the response object. In the following example, we have to inspect the response headers to get the links to more results, as the GitHub API returns results split across many pages.

In [None]:
%%bash

echo "GITHUB_API_KEY = 'YOUR_API_KEY'" >> ./api_keys.py

In [206]:
import api_keys
import requests
from tqdm import tqdm
from datetime import datetime
from urllib.parse import urlparse


url = 'https://api.github.com/repos/pallets/flask/contributors'
headers = {'Authorization': 'token {}'.format(api_keys.GITHUB_API_KEY)}

r = requests.get(url, headers=headers)
    
print(r.headers['X-RateLimit-Remaining'])
print(r.headers['X-RateLimit-Reset'])
print(datetime.fromtimestamp(int(r.headers['X-RateLimit-Reset'])))

contributors = [(contrib['login'], contrib['contributions'], contrib['html_url'])
                for contrib in r.json()]
 
print(r.headers)

4998
1489575278
2017-03-15 10:54:38
{'X-Content-Type-Options': 'nosniff', 'Content-Type': 'application/json; charset=utf-8', 'Transfer-Encoding': 'chunked', 'Access-Control-Expose-Headers': 'ETag, Link, X-GitHub-OTP, X-RateLimit-Limit, X-RateLimit-Remaining, X-RateLimit-Reset, X-OAuth-Scopes, X-Accepted-OAuth-Scopes, X-Poll-Interval', 'X-GitHub-Request-Id': '9DBA:1B5A9:91B7E8C:BAAB84E:58C9101F', 'X-RateLimit-Remaining': '4998', 'Strict-Transport-Security': 'max-age=31536000; includeSubdomains; preload', 'Access-Control-Allow-Origin': '*', 'X-XSS-Protection': '1; mode=block', 'Content-Security-Policy': "default-src 'none'", 'X-OAuth-Scopes': '', 'Status': '200 OK', 'Date': 'Wed, 15 Mar 2017 09:57:51 GMT', 'X-RateLimit-Limit': '5000', 'Link': '<https://api.github.com/repositories/596892/contributors?page=2>; rel="next", <https://api.github.com/repositories/596892/contributors?page=13>; rel="last"', 'X-Accepted-OAuth-Scopes': '', 'X-Served-By': '2c18a09f3ac5e4dd1e004af7c5a94769', 'Cache-C
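
The `X-RateLimit-Reset` header holds a Unix timestamp, that is, seconds since 1970-01-01 in UTC. Without further arguments, `datetime.fromtimestamp` converts it to the local time zone of the machine running the code. The following sketch makes the time zone explicit and computes how long the current rate-limit window still lasts. The helper `seconds_until_reset` is hypothetical, and both input values are copied from the output above.

```python
from datetime import datetime, timezone


def seconds_until_reset(reset_header, now):
    """Return the seconds between now and the reset time given in the header."""
    reset_time = datetime.fromtimestamp(int(reset_header), tz=timezone.utc)
    return int((reset_time - now).total_seconds())


reset_header = '1489575278'  # X-RateLimit-Reset value from the output above
reset_time = datetime.fromtimestamp(int(reset_header), tz=timezone.utc)
print(reset_time.isoformat())

# Time of the response, taken from the Date header in the output above.
response_time = datetime(2017, 3, 15, 9, 57, 51, tzinfo=timezone.utc)
print(seconds_until_reset(reset_header, response_time))
```

A script that runs out of requests could sleep for that many seconds before continuing.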

In [207]:
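def gen_next_links(headers_link_str):
    """Build the URLs of all remaining pages from a GitHub `Link` header.

    Assumes the header lists the rel="next" link first and rel="last" second.
    """
    next_page_str, last_page_str = headers_link_str.split(',')
    # Each part looks like '<URL>; rel="..."': keep the URL, drop the brackets.
    next_page_link = next_page_str.split(';')[0].strip()[1:-1]
    last_page_link = last_page_str.split(';')[0].strip()[1:-1]
    # Everything up to and including '=' is shared; only the page number varies.
    link_base = next_page_link.rsplit('=', 1)[0] + '='
    start_idx = int(urlparse(next_page_link).query.split('=')[1])
    end_idx = int(urlparse(last_page_link).query.split('=')[1])
    return [link_base + str(idx) for idx in range(start_idx, end_idx + 1)]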


next_urls = gen_next_links(r.headers['Link'])
for next_url in tqdm(next_urls):
    r = requests.get(next_url, headers=headers)
    contributors += [(contrib['login'], contrib['contributions'], contrib['html_url'])
                     for contrib in r.json()]

100%|██████████| 12/12 [00:16<00:00,  1.05s/it]
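
Splitting the `Link` header by hand depends on the exact order of its entries. As an alternative sketch, the hypothetical helper `parse_link_header` below uses a regular expression to map each `rel` value to its URL. The header string is copied from the output above. `requests` also exposes a parsed version of this header as `r.links`.

```python
import re


def parse_link_header(link_header):
    """Map each rel value (e.g. 'next', 'last') of a Link header to its URL."""
    pattern = re.compile(r'<([^>]+)>;\s*rel="([^"]+)"')
    return {rel: url for url, rel in pattern.findall(link_header)}


link_header = (
    '<https://api.github.com/repositories/596892/contributors?page=2>; rel="next", '
    '<https://api.github.com/repositories/596892/contributors?page=13>; rel="last"'
)

links = parse_link_header(link_header)
print(links['next'])
print(links['last'])
```

Following `links['next']` until no `next` entry is left is a common way to walk through all pages without computing page numbers.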


In [208]:
import pygal


print('There are {} contributors to Flask.'.format(len(contributors)))

chart = pygal.Bar(x_label_rotation=80, show_legend=False, spacing=170, 
                  height=1000, width=4000)
chart.title = 'Contributions to Flask on GitHub'

names, no_contrib, _ = zip(*contributors)

values = []
for label, value, link in contributors:
    s_dict = {'value': value,
              'label': label,
              'xlink': {'href': link}}
    values.append(s_dict)


chart.x_labels = names
chart.add('', values) 
chart.render_to_file('contrib_flask.svg')

There are 382 contributors to Flask.



#  A Small Detour... on Counting

In [175]:
gender = ['m','f','m','f','m']

f_count = 0
m_count = 0

for g in gender:
    if g == 'f':
        f_count += 1
    else:
        m_count += 1

In [182]:
%timeit sum([1 for g in gender if g == 'f'])

The slowest run took 8.11 times longer than the fastest. This could mean that an intermediate result is being cached.
1000000 loops, best of 3: 442 ns per loop


In [185]:
from collections import Counter


print(Counter(gender))

females = Counter(gender)['f']
females

Counter({'m': 3, 'f': 2})


2
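
For counting a single value, the built-in `list.count` method is another option, and `Counter.most_common` returns all values ordered by frequency. A short sketch on the same list:

```python
from collections import Counter

gender = ['m', 'f', 'm', 'f', 'm']

# list.count scans the list once for a single value.
print(gender.count('f'))

# A generator expression avoids building the intermediate list of ones.
print(sum(1 for g in gender if g == 'f'))

# most_common orders all values by how often they occur.
print(Counter(gender).most_common())
```

`Counter` is the most convenient choice as soon as you need the counts of more than one value.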

In [184]:
import numpy as np

np.unique(gender, return_counts=True)

2

In [195]:
np.median(np.array([1,3,4,20, 20, 20,709]))

# The Study Point Exercises!!!

Group *Enthusiastic Phone* presents their tasks on analyzing data on the Human Development Index and the use of satellites: https://github.com/stinaanita/python_data.



## Hand-in Guidelines
What is your hand-in expected to look like?

  * You push the source code computing your solutions to a repository on GitHub.
  * You create a README.md file that presents your solution, states the result for each question, and explains how to run your code to reproduce your results.
  * Inform Helge when you are done, at the latest by 24:00 on March 21st.
  * You prepare a short (max. 10 minutes) presentation for the next session, so that the other students know what you have done and how you tackled the problem. Furthermore, based on this presentation, group *Jealous Secretary* will choose the winner of this round.