## Section 2 - API Scraping

API = a set of tools and methods that let different applications interact with each other --> retrieve data dynamically

### API Request

In [2]:
import requests

### GET Request

In [15]:
# Request the latest position of the ISS from the OpenNotify API
response = requests.get("http://api.open-notify.org/iss-now.json")
# The part added after the API's address is an endpoint, which gives access to specific information
# (here iss-now.json --> latitude and longitude of the station)

### Status Codes

In [9]:
response

<Response [200]>

#### 200 - Everything is OK, the server returns the requested result

In [10]:
status_code = response.status_code
print(status_code)

200


#### 301 - The server redirects to a different endpoint

#### 400 - Bad request

In [13]:
response = requests.get("http://api.open-notify.org/iss-pass.json")
status_code = response.status_code
print(status_code)

400


#### 401 - Authentication error

#### 403 - You are not authorized to access the API

#### 404 - The server did not find the resource

In [14]:
response = requests.get("http://api.open-notify.org/iss-pass")
status_code = response.status_code
print(status_code)

404
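
The codes listed above are standard HTTP statuses, so their official reason phrases can be looked up with the standard library instead of being memorized. The snippet below is a minimal sketch; the helper name `describe_status` is hypothetical and not part of `requests`.

```python
from http import HTTPStatus


def describe_status(code):
    """Return a short label such as '404 Not Found' for an HTTP status code."""
    try:
        status = HTTPStatus(code)
    except ValueError:
        # The code is not a registered HTTP status
        return f"{code} (unknown status)"
    return f"{status.value} {status.phrase}"


# The codes discussed in this section
for code in (200, 301, 400, 401, 403, 404):
    print(describe_status(code))
```

With a real response, the same helper could be called as `describe_status(response.status_code)`.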


## Query Parameters

In [16]:
# Latitude and longitude of Paris
parameters = {"lat":48.87, "lon":2.33}

In [17]:
# Equivalent URL: http://api.open-notify.org/iss-pass.json?lat=48.87&lon=2.33

In [18]:
response = requests.get("http://api.open-notify.org/iss-pass.json",
                        params=parameters)
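
To see how a dictionary of parameters turns into the query string of a URL, the encoding can be reproduced by hand. This is only an illustration that uses `urllib.parse.urlencode` from the standard library as a stand-in; `requests` does the encoding itself when `params` is passed. The names `base_url`, `query_string` and `full_url` are chosen for this sketch.

```python
from urllib.parse import urlencode

base_url = "http://api.open-notify.org/iss-pass.json"
parameters = {"lat": 48.87, "lon": 2.33}

# Encode the dictionary as key=value pairs joined by '&'
query_string = urlencode(parameters)

# The query string is appended to the endpoint after a '?'
full_url = f"{base_url}?{query_string}"
print(full_url)
```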

In [20]:
content = response.content
print(content)

b'{\n  "message": "success", \n  "request": {\n    "altitude": 100, \n    "datetime": 1505651260, \n    "latitude": 48.87, \n    "longitude": 2.33, \n    "passes": 5\n  }, \n  "response": [\n    {\n      "duration": 461, \n      "risetime": 1505683607\n    }, \n    {\n      "duration": 630, \n      "risetime": 1505689267\n    }, \n    {\n      "duration": 644, \n      "risetime": 1505695044\n    }, \n    {\n      "duration": 642, \n      "risetime": 1505700849\n    }, \n    {\n      "duration": 640, \n      "risetime": 1505706643\n    }\n  ]\n}\n'
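
The `b'...'` prefix in the output above shows that `response.content` is a `bytes` object, not a string. A minimal sketch of the conversion, using a shortened, hand-written sample instead of a live response and assuming the body is UTF-8 encoded:

```python
# Shortened sample shaped like the raw body printed above (not a live response)
content = b'{\n  "message": "success", \n  "request": {\n    "passes": 5\n  }\n}\n'

# Decode the raw bytes into text (UTF-8 is an assumption here)
text = content.decode("utf-8")

print(type(content).__name__)
print(type(text).__name__)
print(text)
```

With `requests`, the already decoded text is also available as `response.text`.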


## JSON Format

The json library:
 - dumps -- takes a Python object and returns a string
 - loads -- takes a string and returns a Python object

### Example

In [39]:
# Consider a list of sports
sports = ["Tennis", "Foot", "Rugby"]
print(sports)

['Tennis', 'Foot', 'Rugby']


In [40]:
print(type(sports))

<class 'list'>


In [41]:
import json

In [42]:
# Use json.dumps to convert the list to a string
sports_string = json.dumps(sports)
print(sports_string)

["Tennis", "Foot", "Rugby"]


In [43]:
print(type(sports_string))

<class 'str'>


In [44]:
# Convert sports_string back to a list with json.loads
sports2 = json.loads(sports_string)
print(type(sports2))

<class 'list'>


#### Training

Given the dictionary below:

> - Convert it to a string
> - Convert it back to a dictionary
> - Check the types

In [45]:
# Dictionary holding the number of registered club members
# for a few sports in France in 2016
sports_number = {
    "Football": 1962241,
    "Tennis": 1039337,
    "Equitation": 663194,
    "Basketball": 641367
}

In [46]:
sports_number_carac = json.dumps(sports_number)
print(sports_number_carac)
print(type(sports_number_carac))

{"Football": 1962241, "Tennis": 1039337, "Equitation": 663194, "Basketball": 641367}
<class 'str'>


In [47]:
sports_number_dic = json.loads(sports_number_carac)
print(sports_number_dic)
print(type(sports_number_dic))

{'Football': 1962241, 'Tennis': 1039337, 'Equitation': 663194, 'Basketball': 641367}
<class 'dict'>
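
The round trip above gives back an identical dictionary because it only contains string keys and integers. As a hedged illustration, with a made-up dictionary named `original`, here is a case where the restored object differs: JSON object keys are always strings, and JSON has no tuple type.

```python
import json

# Illustrative data: an integer key and a tuple value
original = {1: "one", "point": (48.87, 2.33)}

restored = json.loads(json.dumps(original))

print(restored)
# The integer key came back as a string and the tuple as a list
print(restored == original)
```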


### Getting JSON from a request

In [48]:
# The .json() method

In [49]:
# Repeat the earlier request with the coordinates of Paris
parameters = {'lat': 48.87, 'lon': 2.33}
response = requests.get("http://api.open-notify.org/iss-pass.json",
                        params=parameters)

In [50]:
# Get a Python object
json_data = response.json()
print(type(json_data))

<class 'dict'>


In [51]:
print(json_data)

{'message': 'success', 'request': {'altitude': 100, 'datetime': 1505659333, 'latitude': 48.87, 'longitude': 2.33, 'passes': 5}, 'response': [{'duration': 461, 'risetime': 1505683607}, {'duration': 630, 'risetime': 1505689267}, {'duration': 644, 'risetime': 1505695044}, {'duration': 642, 'risetime': 1505700849}, {'duration': 640, 'risetime': 1505706643}]}


In [52]:
# The 'response' key holds the list of all passes
all_passes = json_data['response']
print(all_passes)

[{'duration': 461, 'risetime': 1505683607}, {'duration': 630, 'risetime': 1505689267}, {'duration': 644, 'risetime': 1505695044}, {'duration': 642, 'risetime': 1505700849}, {'duration': 640, 'risetime': 1505706643}]


In [53]:
# The first element of the list is the first pass
first_pass = json_data['response'][0]
print(first_pass)

{'duration': 461, 'risetime': 1505683607}


In [54]:
first_pass_duration = json_data['response'][0]['duration']
print(first_pass_duration)

461
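
Beyond the first pass, the whole list can be looped over. The sketch below reuses the values printed above rather than a live response, and it assumes that `risetime` is a Unix timestamp in seconds and that `duration` is expressed in seconds, which the output suggests but the notebook does not state.

```python
from datetime import datetime, timezone

# Values copied from the 'response' list printed above
passes = [
    {"duration": 461, "risetime": 1505683607},
    {"duration": 630, "risetime": 1505689267},
    {"duration": 644, "risetime": 1505695044},
    {"duration": 642, "risetime": 1505700849},
    {"duration": 640, "risetime": 1505706643},
]

for iss_pass in passes:
    # Assumption: risetime is a Unix timestamp (seconds since 1970, UTC)
    rise = datetime.fromtimestamp(iss_pass["risetime"], tz=timezone.utc)
    print(rise.isoformat(), iss_pass["duration"])

# Total visible time over the five passes (assumed to be in seconds)
total_duration = sum(iss_pass["duration"] for iss_pass in passes)
print(total_duration)
```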


### Content Type

In [55]:
# response.headers holds the metadata of the response

In [56]:
print(response.headers)

{'Server': 'nginx/1.10.3', 'Date': 'Sun, 17 Sep 2017 14:42:50 GMT', 'Content-Type': 'application/json', 'Content-Length': '518', 'Connection': 'keep-alive', 'Via': '1.1 vegur'}


In [57]:
content_type = response.headers['content-type']
print(content_type)

application/json
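
Note that the lookup above uses `'content-type'` in lowercase while the printed header is `'Content-Type'`: the headers object returned by `requests` is case-insensitive. One possible use of this metadata is to check that a body is announced as JSON before parsing it. The helper `is_json_response` below is a hypothetical sketch that works on a plain dictionary, so it normalizes the header names itself.

```python
def is_json_response(headers):
    """Return True if the Content-Type header announces JSON."""
    # Header names are case-insensitive, so normalize the keys first
    normalized = {name.lower(): value for name, value in headers.items()}
    content_type = normalized.get("content-type", "")
    # Drop optional parameters such as '; charset=utf-8'
    media_type = content_type.split(";")[0].strip().lower()
    return media_type == "application/json"


# A few headers copied from the output above
sample_headers = {
    "Server": "nginx/1.10.3",
    "Content-Type": "application/json",
    "Content-Length": "518",
}
print(is_json_response(sample_headers))
```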


## Finding the number of people in space

In [58]:
# Call the API
response = requests.get("http://api.open-notify.org/astros.json")
print(response)

<Response [200]>


In [59]:
json_data = response.json()
print(json_data)

{'number': 6, 'message': 'success', 'people': [{'name': 'Sergey Ryazanskiy', 'craft': 'ISS'}, {'name': 'Randy Bresnik', 'craft': 'ISS'}, {'name': 'Paolo Nespoli', 'craft': 'ISS'}, {'name': 'Alexander Misurkin', 'craft': 'ISS'}, {'name': 'Mark Vande Hei', 'craft': 'ISS'}, {'name': 'Joe Acaba', 'craft': 'ISS'}]}


In [60]:
humans = json_data['number']
print(humans)

6
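
The same dictionary also gives the names and the craft of each person. The sketch below works on a copy of the data printed above rather than a live response; the variable names `names` and `per_craft` are chosen for this illustration.

```python
from collections import Counter

# Data copied from the astros.json output printed above
json_data = {
    "number": 6,
    "message": "success",
    "people": [
        {"name": "Sergey Ryazanskiy", "craft": "ISS"},
        {"name": "Randy Bresnik", "craft": "ISS"},
        {"name": "Paolo Nespoli", "craft": "ISS"},
        {"name": "Alexander Misurkin", "craft": "ISS"},
        {"name": "Mark Vande Hei", "craft": "ISS"},
        {"name": "Joe Acaba", "craft": "ISS"},
    ],
}

# List of names, in the order returned by the API
names = [person["name"] for person in json_data["people"]]
print(names)

# Number of people aboard each craft
per_craft = Counter(person["craft"] for person in json_data["people"])
print(per_craft)

# Check that the 'number' field matches the length of the list
print(len(names) == json_data["number"])
```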
