![rmotr](https://user-images.githubusercontent.com/7065401/52071918-bda15380-2562-11e9-828c-7f95297e4a82.png)
<hr style="margin-bottom: 40px;">

<img src="https://user-images.githubusercontent.com/7065401/68501079-0695df00-023c-11ea-841f-455dac84a089.jpg"
    style="width:400px; float: right; margin: 0 40px 40px 40px;"></img>

# Fetching data from a REST API

In this lecture we'll learn how to fetch data from a REST API, parse the response and put it into a pandas `DataFrame`.

![purple-divider](https://user-images.githubusercontent.com/7065401/52071927-c1cd7100-2562-11e9-908a-dde91ba14e59.png)

## Hands on! 

In [None]:
import pandas as pd
from io import StringIO
# StringIO is needed because passing a literal JSON string to read_json is deprecated

![green-divider](https://user-images.githubusercontent.com/7065401/52071924-c003ad80-2562-11e9-8297-1c6595f8a7ff.png)

## Reading API response

We'll read a response from the [CityBikes API](http://api.citybik.es/v2/networks), parse its JSON body and load it into a pandas `DataFrame`.

To do that, we first import the `requests` module and use it to make a GET request to the URL stored in `api_url`.

In [1]:
import requests

api_url = "http://api.citybik.es/v2/networks"

In [2]:
req = requests.get(api_url)

In [3]:
req

<Response [200]>

In [4]:
req.ok

True

In [5]:
req.status_code

200
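In a script, rather than inspecting `ok` or `status_code` by hand, the check can be folded into a small helper. The following is only a sketch: `fetch_json` is a hypothetical name, and the 10-second timeout is an arbitrary assumption rather than anything the API requires.

```python
import requests


def fetch_json(url, timeout=10.0):
    """Hypothetical helper: GET `url` and return its parsed JSON body."""
    # By default requests does not time out, so we set a limit explicitly.
    response = requests.get(url, timeout=timeout)
    # Raises requests.HTTPError for 4xx/5xx responses; does nothing otherwise.
    response.raise_for_status()
    return response.json()
```

Calling `fetch_json(api_url)` would then either return the parsed payload or raise an exception if the request failed.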

In [6]:
req.text

'{"networks":[{"id":"abu-dhabi-careem-bike","name":"Abu Dhabi Careem BIKE","location":{"latitude":24.4866,"longitude":54.3728,"city":"Abu Dhabi","country":"AE"},"href":"/v2/networks/abu-dhabi-careem-bike","company":["Careem"],"gbfs_href":"https://dubai.publicbikesystem.net/customer/gbfs/v2/en/gbfs.json"},{"id":"acces-velo-saguenay","name":"Accès Vélo","location":{"latitude":48.433333,"longitude":-71.083333,"city":"Saguenay","country":"CA"},"href":"/v2/networks/acces-velo-saguenay","company":["PBSC Urban Solutions"],"gbfs_href":"https://saguenay.publicbikesystem.net/customer/gbfs/v2/gbfs.json"},{"id":"aksu","name":"Aksu","location":{"latitude":41.1664,"longitude":80.2617,"city":"阿克苏市 (Aksu City)","country":"CN"},"href":"/v2/networks/aksu","company":["阿克苏公共服务"]},{"id":"alba","name":"Alba","location":{"latitude":44.716667,"longitude":8.083333,"city":"Alba","country":"IT"},"href":"/v2/networks/alba","company":["Comunicare S.r.l."],"system":"Bicincittà","source":"https://www.bicincitta.com/

The CityBikes API returns its data in JSON format, so we can try the `read_json` function we saw in the previous lecture.

But because the API response has nested elements, calling `read_json` on the raw text is not enough: each network ends up as a single dictionary inside one `networks` column.

In [19]:
# Wrap the text in StringIO: passing a literal JSON string to read_json is deprecated
pd.read_json(StringIO(req.text)).head()

Unnamed: 0,networks
0,"{'id': 'abu-dhabi-careem-bike', 'name': 'Abu D..."
1,"{'id': 'acces-velo-saguenay', 'name': 'Accès V..."
2,"{'id': 'aksu', 'name': 'Aksu', 'location': {'l..."
3,"{'id': 'alba', 'name': 'Alba', 'location': {'l..."
4,"{'id': 'albabici', 'name': 'AlbaBici', 'locati..."
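The same behaviour can be reproduced offline with a tiny payload. The records below are made up for illustration and only mimic the shape of the API response.

```python
from io import StringIO

import pandas as pd

# A tiny made-up payload with the same shape as the API response.
sample_text = (
    '{"networks": ['
    '{"id": "net-a", "location": {"city": "City A", "country": "AA"}},'
    '{"id": "net-b", "location": {"city": "City B", "country": "BB"}}'
    ']}'
)

nested = pd.read_json(StringIO(sample_text))

print(nested.columns.tolist())           # only the top-level key becomes a column
print(type(nested.loc[0, "networks"]))   # each cell still holds a whole dict
```

Nothing below the top-level key gets its own column, which is why we need to reach into the parsed structure ourselves.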


The `Response` object has a `json()` method that parses the JSON body into Python objects (here, a dictionary).

In [20]:
json_dict = req.json()
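For intuition, `req.json()` gives roughly the same result as passing `req.text` to the standard library's `json.loads`. The sketch below uses a made-up string in place of the real response text.

```python
import json

# Made-up response text shaped like the API payload.
sample_text = '{"networks": [{"id": "net-a", "company": ["Example Co"]}]}'

parsed = json.loads(sample_text)

print(type(parsed))                  # a JSON object becomes a dict
print(type(parsed["networks"]))      # a JSON array becomes a list
print(parsed["networks"][0]["id"])   # net-a
```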

In [21]:
json_dict

{'networks': [{'id': 'abu-dhabi-careem-bike',
   'name': 'Abu Dhabi Careem BIKE',
   'location': {'latitude': 24.4866,
    'longitude': 54.3728,
    'city': 'Abu Dhabi',
    'country': 'AE'},
   'href': '/v2/networks/abu-dhabi-careem-bike',
   'company': ['Careem'],
   'gbfs_href': 'https://dubai.publicbikesystem.net/customer/gbfs/v2/en/gbfs.json'},
  {'id': 'acces-velo-saguenay',
   'name': 'Accès Vélo',
   'location': {'latitude': 48.433333,
    'longitude': -71.083333,
    'city': 'Saguenay',
    'country': 'CA'},
   'href': '/v2/networks/acces-velo-saguenay',
   'company': ['PBSC Urban Solutions'],
   'gbfs_href': 'https://saguenay.publicbikesystem.net/customer/gbfs/v2/gbfs.json'},
  {'id': 'aksu',
   'name': 'Aksu',
   'location': {'latitude': 41.1664,
    'longitude': 80.2617,
    'city': '阿克苏市 (Aksu City)',
    'country': 'CN'},
   'href': '/v2/networks/aksu',
   'company': ['阿克苏公共服务']},
  {'id': 'alba',
   'name': 'Alba',
   'location': {'latitude': 44.716667,
    'longitude': 

In [22]:
json_dict['networks']

[{'id': 'abu-dhabi-careem-bike',
  'name': 'Abu Dhabi Careem BIKE',
  'location': {'latitude': 24.4866,
   'longitude': 54.3728,
   'city': 'Abu Dhabi',
   'country': 'AE'},
  'href': '/v2/networks/abu-dhabi-careem-bike',
  'company': ['Careem'],
  'gbfs_href': 'https://dubai.publicbikesystem.net/customer/gbfs/v2/en/gbfs.json'},
 {'id': 'acces-velo-saguenay',
  'name': 'Accès Vélo',
  'location': {'latitude': 48.433333,
   'longitude': -71.083333,
   'city': 'Saguenay',
   'country': 'CA'},
  'href': '/v2/networks/acces-velo-saguenay',
  'company': ['PBSC Urban Solutions'],
  'gbfs_href': 'https://saguenay.publicbikesystem.net/customer/gbfs/v2/gbfs.json'},
 {'id': 'aksu',
  'name': 'Aksu',
  'location': {'latitude': 41.1664,
   'longitude': 80.2617,
   'city': '阿克苏市 (Aksu City)',
   'country': 'CN'},
  'href': '/v2/networks/aksu',
  'company': ['阿克苏公共服务']},
 {'id': 'alba',
  'name': 'Alba',
  'location': {'latitude': 44.716667,
   'longitude': 8.083333,
   'city': 'Alba',
   'country':

In [23]:
citybikes = pd.DataFrame.from_dict(json_dict['networks'])

In [24]:
citybikes.head()

Unnamed: 0,id,name,location,href,company,gbfs_href,system,source,ebikes,license,scooters,instances
0,abu-dhabi-careem-bike,Abu Dhabi Careem BIKE,"{'latitude': 24.4866, 'longitude': 54.3728, 'c...",/v2/networks/abu-dhabi-careem-bike,[Careem],https://dubai.publicbikesystem.net/customer/gb...,,,,,,
1,acces-velo-saguenay,Accès Vélo,"{'latitude': 48.433333, 'longitude': -71.08333...",/v2/networks/acces-velo-saguenay,[PBSC Urban Solutions],https://saguenay.publicbikesystem.net/customer...,,,,,,
2,aksu,Aksu,"{'latitude': 41.1664, 'longitude': 80.2617, 'c...",/v2/networks/aksu,[阿克苏公共服务],,,,,,,
3,alba,Alba,"{'latitude': 44.716667, 'longitude': 8.083333,...",/v2/networks/alba,[Comunicare S.r.l.],,Bicincittà,https://www.bicincitta.com/frmLeStazioni.aspx?...,,,,
4,albabici,AlbaBici,"{'latitude': 38.9943, 'longitude': -1.8602, 'c...",/v2/networks/albabici,[Instituto Tecnológico de Castilla y León (ITCL)],,bicicard,,,,,


![green-divider](https://user-images.githubusercontent.com/7065401/52071924-c003ad80-2562-11e9-8297-1c6595f8a7ff.png)

## Unpacking columns

We can unpack the nested `location` column using `json_normalize`, as we saw in the previous lecture.

In [26]:
# Importing json_normalize from pandas.io.json is deprecated; use the top-level import
from pandas import json_normalize

In [27]:
citybikes_unpacked = json_normalize(json_dict['networks'],
                                    sep='_')

In [16]:
citybikes_unpacked.head()

Unnamed: 0,company,href,id,name,location_city,location_country,location_latitude,location_longitude,source,license_name,license_url,gbfs_href
0,[ЗАО «СитиБайк»],/v2/networks/velobike-moscow,velobike-moscow,Velobike,Moscow,RU,55.75,37.616667,,,,
1,[Gobike A/S],/v2/networks/bycyklen,bycyklen,Bycyklen,Copenhagen,DK,55.673582,12.564984,,,,
2,[Gobike A/S],/v2/networks/nu-connect,nu-connect,Nu-Connect,Utrecht,NL,52.117,5.067,,,,
3,[Urban Infrastructure Partner],/v2/networks/baerum-bysykkel,baerum-bysykkel,Bysykkel,Bærum,NO,59.89455,10.546343,,,,
4,[Gobike A/S],/v2/networks/bysykkelen,bysykkelen,Bysykkelen,Stavanger,NO,58.969975,5.733107,,,,
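To see how the flattened column names are built, here is a minimal sketch on two made-up records (the ids, cities and license name are placeholders, not real API data).

```python
from pandas import json_normalize

sample_networks = [
    {"id": "net-a", "location": {"city": "City A", "latitude": 1.5}},
    {"id": "net-b", "location": {"city": "City B", "latitude": 2.5},
     "license": {"name": "Example License"}},
]

# Nested keys are joined to their parent key with `sep`.
flat = json_normalize(sample_networks, sep="_")

print(sorted(flat.columns))
print(flat)
```

Each nested key becomes its own column (`location_city`, `license_name`, and so on), and records that lack a key get a missing value in that column.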


![green-divider](https://user-images.githubusercontent.com/7065401/52071924-c003ad80-2562-11e9-8297-1c6595f8a7ff.png)

## Save to JSON file

Now that we have the REST API response in a `DataFrame`, we can save it as a JSON file.

In [17]:
citybikes_unpacked.head()

Unnamed: 0,company,href,id,name,location_city,location_country,location_latitude,location_longitude,source,license_name,license_url,gbfs_href
0,[ЗАО «СитиБайк»],/v2/networks/velobike-moscow,velobike-moscow,Velobike,Moscow,RU,55.75,37.616667,,,,
1,[Gobike A/S],/v2/networks/bycyklen,bycyklen,Bycyklen,Copenhagen,DK,55.673582,12.564984,,,,
2,[Gobike A/S],/v2/networks/nu-connect,nu-connect,Nu-Connect,Utrecht,NL,52.117,5.067,,,,
3,[Urban Infrastructure Partner],/v2/networks/baerum-bysykkel,baerum-bysykkel,Bysykkel,Bærum,NO,59.89455,10.546343,,,,
4,[Gobike A/S],/v2/networks/bysykkelen,bysykkelen,Bysykkelen,Stavanger,NO,58.969975,5.733107,,,,


In [28]:
citybikes_unpacked.to_json('out.json')

In [29]:
pd.read_json('out.json').head()

Unnamed: 0,id,name,href,company,gbfs_href,location_latitude,location_longitude,location_city,location_country,system,source,ebikes,license_name,license_url,scooters,instances
0,abu-dhabi-careem-bike,Abu Dhabi Careem BIKE,/v2/networks/abu-dhabi-careem-bike,[Careem],https://dubai.publicbikesystem.net/customer/gb...,24.4866,54.3728,Abu Dhabi,AE,,,,,,,
1,acces-velo-saguenay,Accès Vélo,/v2/networks/acces-velo-saguenay,[PBSC Urban Solutions],https://saguenay.publicbikesystem.net/customer...,48.433333,-71.083333,Saguenay,CA,,,,,,,
2,aksu,Aksu,/v2/networks/aksu,[阿克苏公共服务],,41.1664,80.2617,阿克苏市 (Aksu City),CN,,,,,,,
3,alba,Alba,/v2/networks/alba,[Comunicare S.r.l.],,44.716667,8.083333,Alba,IT,Bicincittà,https://www.bicincitta.com/frmLeStazioni.aspx?...,,,,,
4,albabici,AlbaBici,/v2/networks/albabici,[Instituto Tecnológico de Castilla y León (ITCL)],,38.9943,-1.8602,Albacete,ES,bicicard,,,,,,
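The layout of the saved JSON depends on the `orient` parameter of `to_json`. As a sketch on a small made-up frame, the two calls below compare the default layout with `orient="records"`, which is closer to the list of objects the API itself returns.

```python
from io import StringIO

import pandas as pd

sample = pd.DataFrame({
    "id": ["net-a", "net-b"],
    "location_latitude": [1.5, 2.5],
})

# With no path argument, to_json returns the JSON text instead of writing a file.
by_column = sample.to_json()                  # default for a DataFrame: orient="columns"
by_record = sample.to_json(orient="records")  # one JSON object per row

print(by_column)
print(by_record)

# Reading back needs the matching orient.
restored = pd.read_json(StringIO(by_record), orient="records")
```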


![green-divider](https://user-images.githubusercontent.com/7065401/52071924-c003ad80-2562-11e9-8297-1c6595f8a7ff.png)

## More on data fetching

Another example using <b>Cryptowatch API</b> can be found in [this post](https://notebooks.ai/santiagobasulto/crypto-analysis-using-python-and-cryptowatch-api-79e06f1f).

![green-divider](https://user-images.githubusercontent.com/7065401/52071924-c003ad80-2562-11e9-8297-1c6595f8a7ff.png)

## Reading an authenticated URL

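To demonstrate authentication, we can use [httpbin](http://httpbin.org). Its `/basic-auth/<user>/<passwd>` endpoint requires HTTP Basic authentication with the credentials given in the URL: without them the request is rejected with status `401`, and passing them through the `auth` parameter makes it succeed.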

In [30]:
r = requests.get('https://httpbin.org/basic-auth/myuser/mypasswd')

In [31]:
r.status_code

401

In [32]:
r = requests.get('https://httpbin.org/basic-auth/myuser/mypasswd',
                 auth=('myuser', 'mypasswd'))

In [33]:
r.status_code

200

In [34]:
r.json()

{'authenticated': True, 'user': 'myuser'}
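Under the hood, the `auth` tuple is turned into an `Authorization` header. The sketch below builds the request without sending it, so the header can be inspected offline, and compares it with a value computed by hand.

```python
import base64

import requests

url = "https://httpbin.org/basic-auth/myuser/mypasswd"

# Build the request without sending it, so we can inspect its headers.
prepared = requests.Request("GET", url, auth=("myuser", "mypasswd")).prepare()

# Basic auth is "Basic " followed by base64("user:password").
expected = "Basic " + base64.b64encode(b"myuser:mypasswd").decode("ascii")

print(prepared.headers["Authorization"] == expected)  # True
```

Base64 is an encoding, not encryption, so Basic authentication should only be sent over HTTPS.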

![purple-divider](https://user-images.githubusercontent.com/7065401/52071927-c1cd7100-2562-11e9-908a-dde91ba14e59.png)