# Dataquest - APIs and web scraping

In [8]:
import requests
from pprint import pprint

In [3]:
response = requests.get("http://api.open-notify.org/iss-now.json")
status_code = response.status_code
status_code

200

The server will send a __status code__ indicating the success or failure of your request. You can get the status code of the response from response.status_code.

The request we just made returned a status code of 200. Web servers return status codes every time they receive an API request. A status code provides information about what happened with a request. Here are some codes that are relevant to GET requests:

* 200 - Everything went okay, and the server returned a result (if any).
* 301 - The server is redirecting you to a different endpoint. This can happen when a company switches domain names, or an endpoint's name has changed.
* 400 - The server thinks you made a bad request. This can happen when you don't send the information the API requires to process your request, among other things.
* 401 - The server thinks you're not authenticated. This happens when you don't send the right credentials to access an API.
* 403 - The resource you're trying to access is forbidden; you don't have the right permissions to see it.
* 404 - The server didn't find the resource you tried to access.
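
The first digit of a status code identifies its class, which is often enough to decide how to handle a response. Here is a minimal sketch using only the standard library's `http.HTTPStatus`. The `describe_status` helper and its category labels are illustrative names, not part of `requests`, and the helper raises `ValueError` for codes that `HTTPStatus` does not know.

```python
from http import HTTPStatus


def describe_status(code: int) -> str:
    """Return a short description such as '404 Not Found (client error)'."""
    # Illustrative labels keyed by the first digit of the status code.
    categories = {
        2: "success",
        3: "redirect",
        4: "client error",
        5: "server error",
    }
    category = categories.get(code // 100, "other")
    # HTTPStatus supplies the standard reason phrase for known codes.
    phrase = HTTPStatus(code).phrase
    return f"{code} {phrase} ({category})"


# Describe each of the codes listed above.
for code in (200, 301, 400, 401, 403, 404):
    print(describe_status(code))
```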

In [9]:
# Set up the parameters to pass to the API (the latitude and longitude of San Francisco).
parameters = {"lat": 37.78, "lon": -122.41}

# Make a get request with the parameters.
response = requests.get("http://api.open-notify.org/iss-pass.json", params=parameters)

# Print the content of the response (the data the server returned)
pprint(response.content)

(b'{\n  "message": "success", \n  "request": {\n    "altitude": 100, \n    "dat'
 b'etime": 1502760212, \n    "latitude": 37.78, \n    "longitude": -122.41, \n'
 b'    "passes": 5\n  }, \n  "response": [\n    {\n      "duration": 607, \n'
 b'      "risetime": 1502762253\n    }, \n    {\n      "duration": 616, \n     '
 b' "risetime": 1502768039\n    }, \n    {\n      "duration": 84, \n      "rise'
 b'time": 1502774043\n    }, \n    {\n      "duration": 571, \n      "risetime"'
 b': 1502822197\n    }, \n    {\n      "duration": 632, \n      "risetime": 150'
 b'2827940\n    }\n  ]\n}\n')
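
The `params` dictionary in the request above is sent as a query string appended to the URL. The following is a rough standard-library sketch of that encoding, without making a request. `requests` performs its own encoding internally, so treat this as an approximation of the URL it builds rather than its actual implementation.

```python
from urllib.parse import urlencode

base_url = "http://api.open-notify.org/iss-pass.json"
parameters = {"lat": 37.78, "lon": -122.41}

# urlencode turns the dictionary into "key=value" pairs joined by "&".
query_string = urlencode(parameters)

# The query string follows the endpoint, separated by "?".
full_url = f"{base_url}?{query_string}"
print(full_url)
```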


You may have noticed that the content of the API response we received earlier was a string (strictly, a bytes object, which is why the output is prefixed with b). Strings are the way we pass information back and forth through APIs, but it's hard to get the information we want out of them. How do we know how to decode the string we receive and work with it in Python?

Luckily, there's a format we call JSON. We mentioned it earlier in the mission. This format encodes data structures like lists and dictionaries as strings to ensure that machines can read them easily. JSON is the primary format for sending and receiving data through APIs.

Python offers great support for JSON through its json library. We can convert lists and dictionaries to JSON, and vice versa.

The json library has two main functions:

* dumps -- Takes a Python object and converts it to a JSON string.
* loads -- Takes a JSON string and converts it to a Python object.

In [10]:
# Make a list of fast food chains.
best_food_chains = ["Taco Bell", "Shake Shack", "Chipotle"]
print(type(best_food_chains))

# Import the JSON library.
import json

# Use json.dumps to convert best_food_chains to a string.
best_food_chains_string = json.dumps(best_food_chains)
print(type(best_food_chains_string))

# Convert best_food_chains_string back to a list.
print(type(json.loads(best_food_chains_string)))

# Make a dictionary
fast_food_franchise = {
    "Subway": 24722,
    "McDonalds": 14098,
    "Starbucks": 10821,
    "Pizza Hut": 7600
}

# We can also dump a dictionary to a string and load it.
fast_food_franchise_string = json.dumps(fast_food_franchise)
print(type(fast_food_franchise_string))

fast_food_franchise_2 = json.loads(fast_food_franchise_string)
print(type(fast_food_franchise_2))

<class 'list'>
<class 'str'>
<class 'list'>
<class 'str'>
<class 'dict'>
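
The cell above checks only the types. As a small follow-up sketch, the code below confirms that the values also survive the round trip, and shows one caveat: JSON object keys are always strings, so integer keys do not come back as integers. The `store_counts_by_year` dictionary is made-up data for illustration.

```python
import json

fast_food_franchise = {"Subway": 24722, "McDonalds": 14098}

# A dictionary with string keys survives the round trip unchanged.
round_trip = json.loads(json.dumps(fast_food_franchise))
print(round_trip == fast_food_franchise)

# JSON object keys are always strings, so integer keys come back as strings.
store_counts_by_year = {2017: 24722}  # hypothetical data for illustration
decoded = json.loads(json.dumps(store_counts_by_year))
print(decoded)

# The indent argument makes the JSON string easier to read.
print(json.dumps(fast_food_franchise, indent=2))
```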


We can get the content of a response as a Python object by using the .json() method on the response.
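
Conceptually, this amounts to decoding the response bytes to text and parsing that text as JSON. Here is a sketch of doing the same by hand, using a shortened copy of the raw bytes from the earlier output and assuming they are UTF-8 encoded. The real `.json()` method may handle encodings differently.

```python
import json

# A shortened copy of the raw bytes shown in the earlier output.
raw_content = (
    b'{"message": "success", '
    b'"response": [{"duration": 607, "risetime": 1502762253}]}'
)

# Decode the bytes to text (UTF-8 is assumed here), then parse the JSON.
text = raw_content.decode("utf-8")
data = json.loads(text)

print(type(data))
print(data["response"][0]["duration"])
```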

In [14]:
# Make the same request we did two screens ago.
parameters = {"lat": 37.78, "lon": -122.41}
response = requests.get("http://api.open-notify.org/iss-pass.json", params=parameters)

# Get the response data as a Python object.  Verify that it's a dictionary.
json_data = response.json()
print(type(json_data))
pprint(json_data)

# Get the duration value of the ISS' first pass over San Francisco
# (this is the duration key of the first dictionary in the response list).

first_pass_duration = json_data["response"][0]["duration"]
print("\n The duration value of the ISS' first pass over San Francisco : ",first_pass_duration)

<class 'dict'>
{'message': 'success',
 'request': {'altitude': 100,
             'datetime': 1502760212,
             'latitude': 37.78,
             'longitude': -122.41,
             'passes': 5},
 'response': [{'duration': 607, 'risetime': 1502762253},
              {'duration': 616, 'risetime': 1502768039},
              {'duration': 84, 'risetime': 1502774043},
              {'duration': 571, 'risetime': 1502822197},
              {'duration': 632, 'risetime': 1502827940}]}

 The duration value of the ISS' first pass over San Francisco :  607
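
Once the response is a Python dictionary, ordinary list and dictionary operations apply to it. The sketch below works on a copy of the `response` list from the output above, so it runs without a network call. The variable names are illustrative.

```python
# The "response" list copied from the output above.
passes = [
    {"duration": 607, "risetime": 1502762253},
    {"duration": 616, "risetime": 1502768039},
    {"duration": 84, "risetime": 1502774043},
    {"duration": 571, "risetime": 1502822197},
    {"duration": 632, "risetime": 1502827940},
]

# Collect the duration of every pass.
durations = [iss_pass["duration"] for iss_pass in passes]
total_duration = sum(durations)

# Find the pass with the largest duration value.
longest_pass = max(passes, key=lambda iss_pass: iss_pass["duration"])

print("Durations:", durations)
print("Total duration:", total_duration)
print("Longest pass:", longest_pass)
```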


The server sends more than a status code and the data when it generates a response. It also sends __metadata__ containing information on how it generated the data and how to decode it. This information appears in the __response headers__. We can access it using the .headers property that responses have.

The headers will appear as a dictionary. For now, the Content-Type header is the most important key. It tells us the format of the response, and how to decode it. For the OpenNotify API, the format is JSON, which is why we could decode it with the json library earlier.

In [16]:
# Headers is a dictionary
pprint(response.headers)

{'Server': 'nginx/1.10.3', 'Date': 'Tue, 15 Aug 2017 01:36:07 GMT', 'Content-Type': 'application/json', 'Content-Length': '520', 'Connection': 'keep-alive', 'Via': '1.1 vegur'}
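
One way to use the Content-Type header is to check it before attempting to parse the body as JSON. This is a minimal sketch on a plain dictionary copied from the output above. The `media_type` helper is a hypothetical name, and the key normalization is needed only because a plain dict is case-sensitive (the headers object in `requests` already supports case-insensitive lookups).

```python
# A copy of two headers from the output above.
headers = {"Content-Type": "application/json", "Content-Length": "520"}


def media_type(headers: dict) -> str:
    """Return the lowercased media type from a Content-Type header, or ''."""
    # Header names are case-insensitive, so normalize the keys before the lookup.
    normalized = {name.lower(): value for name, value in headers.items()}
    content_type = normalized.get("content-type", "")
    # Drop optional parameters such as "; charset=utf-8".
    return content_type.split(";")[0].strip().lower()


if media_type(headers) == "application/json":
    print("The body can be parsed as JSON.")
```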
