# In-Class Coding Lab: Web Services and APIs

### Overview

The web has long since evolved from user consumption to device consumption. In the early days of the web, when you wanted to check the weather, you opened up your browser and visited a website. Nowadays your smart watch or smart phone retrieves the weather for you and displays it on the device. Your device can't predict the weather. It's simply consuming a weather-based service.

The key to making device consumption work is APIs (Application Programming Interfaces). Products we use every day like smartphones, Amazon's Alexa, and gaming consoles all rely on APIs. They seem "smart" and "powerful" but in actuality they're only interfacing with smart and powerful services in the cloud.

API consumption is the new reality of programming; it is why we cover it in this course. Once you understand how to consume APIs you can write a program to do almost anything and harness the power of the internet to make your own programs look "smart" and "powerful."

This lab covers how to properly consume web service APIs with Python. Here's what we will cover.

1. Understanding requests and responses
1. Parameter handling
1. Refactoring as a function
1. Proper error handling


## Pre-Requisites: Let's install what we need for the remainder of the course:

NOTE: Run this cell. It will install several Python packages you will need. The installs might take 2-3 minutes, so please be patient.

In [1]:
!conda install -y -q  pandas matplotlib beautifulsoup4
!pip install requests html5 lxml
!pip install plotly cufflinks folium





Package plan for installation in environment C:\ProgramData\Miniconda3:

The following NEW packages will be INSTALLED:

    beautifulsoup4:  4.6.0-py36hd4cc5e8_1             
    cycler:          0.10.0-py36h009560c_0            
    freetype:        2.8-h51f8f2c_1                   
    icu:             58.2-ha66f8fd_1                  
    jpeg:            9b-hb83a4c4_2                    
    kiwisolver:      1.0.1-py36h12c3424_0             
    libpng:          1.6.34-h79bbb47_0                
    matplotlib:      2.2.2-py36h153e9ff_0             
    pandas:          0.22.0-py36h6538335_0            
    pyqt:            5.6.0-py36hb5ed885_5             
    python-dateutil: 2.7.2-py36_0                     
    pytz:            2018.3-py36_0                    
    qt:              5.6.2-vc14h6f8c307_12            
    sip:             4.18.1-py36h9c25514_2            
    sqlite:          3.22.0-h9d3ae62_0                
    tornado:         5.0.1-py36_1                    


CondaIOError: Missing write permissions in: C:\ProgramData\Miniconda3
#
# You don't appear to have the necessary permissions to install packages
# into the install area 'C:\ProgramData\Miniconda3'.
# However you can clone this environment into your home directory and
# then make changes to it.
# This may be done using the command:
#
# $ conda create -n my_root --clone="C:\ProgramData\Miniconda3"






You are using pip version 9.0.1, however version 9.0.3 is available.
You should consider upgrading via the 'python -m pip install --upgrade pip' command.






## Part 1: Understanding Requests and responses

In this part we learn about the Python requests module. http://docs.python-requests.org/en/master/user/quickstart/ 

This module makes it easy to write code to send HTTP requests over the internet and handle the responses. It will be the cornerstone of our API consumption in this course. While there are other modules which accomplish the same thing, `requests` is the most straightforward and easiest to use.

We'll begin by importing the modules we will need. We do this here so we won't need to include these lines in the other code we write in this lab.

In [2]:
# start by importing the modules we will need
import requests
import json 

### The request 

As you learned in class and your assigned readings, the HTTP protocol has **verbs** which constitute the type of request you will send to the remote resource, or **url**. Based on the url and request type, you will get a **response**.

The following line of code makes a **get** request (that's the HTTP verb) to Google's Geocoding API service. This service attempts to convert the address (in this case `Syracuse University`) into a set of global coordinates (latitude and longitude), so that the location can be plotted on a map.


In [3]:
url = 'http://maps.googleapis.com/maps/api/geocode/json?address=Syracuse+University'
response = requests.get(url)

### The response 

The `get()` method returns a `Response` object variable. I called it `response` in this example but it could be called anything. 

The HTTP response consists of a *status code* and *body*. The status code lets you know if the request worked, while the body of the response contains the actual data. 


In [4]:
response.ok # did the request work?

True

In [5]:
response.text  # what's in the body of the response, as a raw string

'{\n   "results" : [\n      {\n         "address_components" : [\n            {\n               "long_name" : "Syracuse",\n               "short_name" : "Syracuse",\n               "types" : [ "locality", "political" ]\n            },\n            {\n               "long_name" : "Onondaga County",\n               "short_name" : "Onondaga County",\n               "types" : [ "administrative_area_level_2", "political" ]\n            },\n            {\n               "long_name" : "New York",\n               "short_name" : "NY",\n               "types" : [ "administrative_area_level_1", "political" ]\n            },\n            {\n               "long_name" : "United States",\n               "short_name" : "US",\n               "types" : [ "country", "political" ]\n            }\n         ],\n         "formatted_address" : "Syracuse, NY, USA",\n         "geometry" : {\n            "location" : {\n               "lat" : 43.0391534,\n               "lng" : -76.13511579999999\n            }

### Converting responses into Python object variables 

In the case of **website URLs** the response body is **HTML**. This should be rendered in a web browser. But we're dealing with web service APIs so...

In the case of **web API URLs** the response body could be in a variety of formats, from **plain text** to **XML** or **JSON**. In this course we will only focus on the JSON format because, as we've seen, it translates easily into Python object variables.

Let's convert the response to a Python object variable. In this case it will be a Python dictionary.

In [6]:
geodata = response.json()  # try to decode the response from JSON format
geodata                    # this is now a Python object variable

{'results': [{'address_components': [{'long_name': 'Syracuse',
     'short_name': 'Syracuse',
     'types': ['locality', 'political']},
    {'long_name': 'Onondaga County',
     'short_name': 'Onondaga County',
     'types': ['administrative_area_level_2', 'political']},
    {'long_name': 'New York',
     'short_name': 'NY',
     'types': ['administrative_area_level_1', 'political']},
    {'long_name': 'United States',
     'short_name': 'US',
     'types': ['country', 'political']}],
   'formatted_address': 'Syracuse, NY, USA',
   'geometry': {'location': {'lat': 43.0391534, 'lng': -76.1351158},
    'location_type': 'GEOMETRIC_CENTER',
    'viewport': {'northeast': {'lat': 43.04050238029149,
      'lng': -76.13376681970848},
     'southwest': {'lat': 43.03780441970849, 'lng': -76.13646478029149}}},
   'place_id': 'ChIJVcwsup_z2YkRTQhRUgaJYF4',
   'types': ['establishment', 'point_of_interest', 'university']}],
 'status': 'OK'}
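
Decoding does not require a live response. `response.json()` does roughly what the standard library's `json.loads()` does to the text of the body. The sketch below uses a hand-trimmed excerpt of the response shown earlier (the `sample_body` string is an illustration, not the full response), so it runs without a network connection.

```python
import json

# A trimmed, hand-written excerpt of the JSON body (illustration only).
sample_body = '{"results": [{"formatted_address": "Syracuse, NY, USA"}], "status": "OK"}'

decoded = json.loads(sample_body)          # JSON text -> Python objects

print(type(sample_body).__name__)          # the body starts out as a str
print(type(decoded).__name__)              # the decoded value is a dict
print(type(decoded['results']).__name__)   # JSON arrays become Python lists
print(decoded['status'])
```

JSON objects become dictionaries, JSON arrays become lists, and JSON strings and numbers become Python strings and numbers.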

With our Python object, we can now walk the python object to retrieve the latitude and longitude


In [7]:
coords = geodata['results'][0]['geometry']['location']
coords

{'lat': 43.0391534, 'lng': -76.1351158}

In the code above we "walked" the Python dictionary to get to the location

- `geodata['results']` is a list
- `geodata['results'][0]` is the first item in that list, a dictionary
- `geodata['results'][0]['geometry']` is a key which represents another dictionary
- `geodata['results'][0]['geometry']['location']` is a key which contains the dictionary we want!

It should be noted that this process will vary for each API you call, so it's important to get accustomed to performing this task. You'll be doing it quite often.
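
Because every API nests its data differently, it can help to walk one key or index at a time and stop gracefully when a step is missing. The helper below is a minimal sketch of that idea. `walk` and `sample_geodata` are hypothetical names, and the sample is a trimmed copy of the output above.

```python
def walk(data, path, default=None):
    """Follow a sequence of dictionary keys and list indexes, one step at a time.

    Returns `default` instead of raising when any step is missing.
    """
    current = data
    for step in path:
        try:
            current = current[step]
        except (KeyError, IndexError, TypeError):
            return default
    return current

# Trimmed copy of the geocoding output (illustration only).
sample_geodata = {
    'results': [
        {'formatted_address': 'Syracuse, NY, USA',
         'geometry': {'location': {'lat': 43.0391534, 'lng': -76.1351158}}}
    ],
    'status': 'OK'
}

coords = walk(sample_geodata, ['results', 0, 'geometry', 'location'])
print(coords)    # the location dictionary

missing = walk(sample_geodata, ['results', 1, 'geometry'])
print(missing)   # None, because there is no second result
```

Writing the path as a list makes each step of the walk explicit, which is handy when you are still exploring an unfamiliar response.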

### Now You Try It!

Walk the `geodata` object variable and retrieve the values under the keys `place_id` and `formatted_address`.

In [8]:
# todo:
# retrieve the place_id put in a variable
# retrieve the formatted_address put it in a variable
# print both of them out


place = geodata['results'][0]['place_id']
address = geodata['results'][0]['formatted_address']
print("ID: %s; Address: %s" % (place, address))

ID: ChIJVcwsup_z2YkRTQhRUgaJYF4; Address: Syracuse, NY, USA


## Part 2: Parameter Handling

In the example above we hard-coded "Syracuse University" into the request:
```
url = 'http://maps.googleapis.com/maps/api/geocode/json?address=Syracuse+University'
``` 
A better way to write this code is to allow for the input of any location and supply that to the service. To make this work we need to send parameters into the request as a dictionary. This way we can geolocate any address!

You'll notice that on the url, we are passing a **key-value pair**: the key is `address` and the value is `Syracuse+University`. Python dictionaries are also key-value pairs, so:

In [9]:
url = 'http://maps.googleapis.com/maps/api/geocode/json'  # base URL without parameters after the "?"
options = { 'address' : 'Syracuse University'}            # options['address'] == 'Syracuse University'
response = requests.get(url, params = options)            
geodata = response.json()
coords = geodata['results'][0]['geometry']['location']
print("Address", options)
print("Coordinates", coords)
print("%s is located at (%f,%f)" %(options['address'], coords['lat'], coords['lng']))

Address {'address': 'Syracuse University'}
Coordinates {'lat': 43.0391534, 'lng': -76.1351158}
Syracuse University is located at (43.039153,-76.135116)
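
To see why a dictionary maps so naturally onto the URL, the sketch below builds the query string by hand with the standard library's `urllib.parse.urlencode`. This is only an illustration of the encoding. When you pass `params` to `requests.get()`, the library takes care of this step for you.

```python
from urllib.parse import urlencode

base_url = 'http://maps.googleapis.com/maps/api/geocode/json'
options = {'address': 'Syracuse University'}

query = urlencode(options)          # spaces become '+', special characters are escaped
full_url = base_url + '?' + query

print(query)      # address=Syracuse+University
print(full_url)

# Characters such as commas are percent-encoded.
print(urlencode({'address': 'Hamburg, NY'}))   # address=Hamburg%2C+NY
```

Notice that the first result matches the hard-coded `address=Syracuse+University` from the original url.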


### Looking up any address

RECALL: For `requests.get(url, params = options)` the part that says `params = options` is called a **named argument**, which is Python's way of passing an optional function argument by name.
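
As a quick refresher on named arguments, here is a minimal sketch using a made-up function. `greet` and its `greeting` parameter are hypothetical and have nothing to do with `requests`. They only show how an optional argument is skipped or supplied by name.

```python
def greet(name, greeting="Hello"):
    # `greeting` is optional because it has a default value
    return "%s, %s!" % (greeting, name)

print(greet("Mike"))                    # uses the default greeting
print(greet("Mike", greeting="Howdy"))  # supplies the optional argument by name
```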

With our parameter now outside the url, we can easily re-write this code to work for any location! Go ahead and execute the code and input `Queens, NY`. This will retrieve the coordinates `(40.728224,-73.794852)`

In [11]:
location = input("Enter a location: ")

url = 'http://maps.googleapis.com/maps/api/geocode/json'
options = { 'address' : location }  # no longer 'Syracuse University' but whatever you type!
response = requests.get(url, params = options)            
geodata = response.json()
coords = geodata['results'][0]['geometry']['location']
print("Address", options)
print("Coordinates", coords)
print("%s is located at (%f,%f)" %(location, coords['lat'], coords['lng']))

Enter a location: Hamburg, NY
Address {'address': 'Hamburg, NY'}
Coordinates {'lat': 42.7158927, 'lng': -78.8294768}
Hamburg, NY is located at (42.715893,-78.829477)


### So useful, it should be a function

One thing you'll come to realize quickly is that your API calls should be wrapped in functions. This promotes **readability** and **code re-use**. For example:

In [13]:
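def get_coordinates_using_google(location):
    # keep the url inside the function so it does not depend on a global variable
    url = 'http://maps.googleapis.com/maps/api/geocode/json'
    options = { 'address' : location }
    response = requests.get(url, params = options)
    geodata = response.json()
    coords = geodata['results'][0]['geometry']['location']
    return coords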

# main program here:
location = input("Enter a location: ")
coords = get_coordinates_using_google(location)
print("%s is located at (%f,%f)" %(location, coords['lat'], coords['lng']))


Enter a location: Hamburg, NY
Hamburg, NY is located at (42.715893,-78.829477)


### Other request methods

Not every API we call uses the `get()` method. Some use `post()` because the amount of data you provide is too large to place on the url.

An example of this is the **Text-Processing.com** sentiment analysis service. http://text-processing.com/docs/sentiment.html This service will detect the sentiment or mood of text. You give the service some text, and it tells you whether that text is positive, negative or neutral. 

In [14]:
# 'you suck' == 'negative'
url = 'http://text-processing.com/api/sentiment/'
options = { 'text' : 'you suck'}
response = requests.post(url, data = options)
sentiment = response.json()
sentiment

{'label': 'neg',
 'probability': {'neg': 0.520097595188211,
  'neutral': 0.3886824782142297,
  'pos': 0.479902404811789}}

In [15]:
# 'I love cheese' == 'positive'
url = 'http://text-processing.com/api/sentiment/'
options = { 'text' : 'I love cheese'}
response = requests.post(url, data = options)
sentiment = response.json()
sentiment

{'label': 'pos',
 'probability': {'neg': 0.3866732207796809,
  'neutral': 0.18366003088446245,
  'pos': 0.6133267792203191}}

In the examples provided we used the `post()` method instead of the `get()` method. The `post()` method has a named argument `data` which takes a dictionary of data. The key required by **text-processing.com** is `text`, which holds the text you would like to process for sentiment.

We use a post in the event the text we wish to process is very long. Case in point:

In [16]:
tweet = "Arnold Schwarzenegger isn't voluntarily leaving the Apprentice, he was fired by his bad (pathetic) ratings, not by me. Sad end to great show"
url = 'http://text-processing.com/api/sentiment/'
options = { 'text' : tweet }
response = requests.post(url, data = options)
sentiment = response.json()
sentiment

{'label': 'neg',
 'probability': {'neg': 0.8574162710846805,
  'neutral': 0.41809934640572893,
  'pos': 0.14258372891531948}}
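
Once the response is decoded, the result is an ordinary dictionary, so a small function can turn it into a readable message. The sketch below works on a copy of the output above. `describe_sentiment` and `LABEL_NAMES` are hypothetical names, not part of the service.

```python
# Friendlier names for the labels seen in the outputs above.
LABEL_NAMES = {'pos': 'positive', 'neg': 'negative', 'neutral': 'neutral'}

def describe_sentiment(sentiment):
    label = sentiment['label']                      # e.g. 'neg'
    probability = sentiment['probability'][label]   # probability reported for that label
    return "%s (probability %.2f)" % (LABEL_NAMES.get(label, label), probability)

# Copy of the decoded response for the tweet.
sample = {'label': 'neg',
          'probability': {'neg': 0.8574162710846805,
                          'neutral': 0.41809934640572893,
                          'pos': 0.14258372891531948}}

print(describe_sentiment(sample))   # negative (probability 0.86)
```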


## Part 3: Proper Error Handling (In 3 Simple Rules)

When you write code that depends on other people's code from around the Internet, there's a lot that can go wrong. Therefore we prescribe the following advice:

```
Assume anything that CAN go wrong WILL go wrong
```


### Rule 1: Don't assume the internet 'always works'

The first rule of programming over a network is to NEVER assume the network is available. You need to assume the worst. No WiFi, user types in a bad url, the remote website is down, etc. 

We handle this in the `requests` module by catching the `requests.exceptions.RequestException`. Here's an example:

In [17]:
url = "http://this is not a website"
try:

    response = requests.get(url)  # throws an exception when it cannot connect

# internet is broken
except requests.exceptions.RequestException as e:
    print("ERROR: Cannot connect to ", url)
    print("DETAILS:", e)

ERROR: Cannot connect to  http://this is not a website
DETAILS: HTTPConnectionPool(host='this%20is%20not%20a%20website', port=80): Max retries exceeded with url: / (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x00000000078B70B8>: Failed to establish a new connection: [Errno 11004] getaddrinfo failed',))


### Rule 2: Don't assume the response you get back is valid

Assuming the internet is not broken (Rule 1), you should now check for HTTP response code 200, which means the url responded successfully. Other response codes like 404 or 501 indicate that an error occurred, and that means you should not keep processing the response.

Here's one way to do it:

In [18]:
url = 'http://www.syr.edu/mikeisawesum'  # this should 404
try:
    
    response = requests.get(url)
    
    if response.ok:  # True when response.status_code is below 400, e.g. 200
        data = response.text
    else: # a 4xx or 5xx response code
        print("There was an Error requesting:", url, " HTTP Response Code: ", response.status_code)

# internet is broken
except requests.exceptions.RequestException as e: 
    print("ERROR: Cannot connect to ", url)
    print("DETAILS:", e)


There was an Error requesting: http://www.syr.edu/mikeisawesum  HTTP Response Code:  404
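
The numeric status codes fall into families: 2xx means success, 4xx means the client sent a bad request, and 5xx means the server failed. As a rough sketch of what the check above is distinguishing, the standard library's `http.HTTPStatus` can translate a code into its standard phrase. `describe_status` is a hypothetical helper.

```python
from http import HTTPStatus

def describe_status(code):
    phrase = HTTPStatus(code).phrase   # raises ValueError for codes it does not know
    if 200 <= code < 300:
        family = 'success'
    elif 400 <= code < 500:
        family = 'client error'
    elif 500 <= code < 600:
        family = 'server error'
    else:
        family = 'other'
    return "%d %s (%s)" % (code, phrase, family)

for code in (200, 404, 501):
    print(describe_status(code))
```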


### Rule 2a: Use exceptions instead of if else in this case

Personally I don't like to use `if ... else` to handle an error. Instead, I prefer to instruct `requests` to throw an exception of `requests.exceptions.HTTPError` whenever the response is not ok. This makes the code you write a little cleaner.

Errors are rare occurrences, and so I don't like error handling cluttering up my code.


In [19]:
url = 'http://www.syr.edu/mikeisawesum'  # this should 404
try:
    
    response = requests.get(url)  # throws an exception when it cannot connect
    response.raise_for_status()   # throws an exception when not 'ok'
    data = response.text

# response not ok
except requests.exceptions.HTTPError as e:
    print("ERROR: Response from ", url, 'was not ok.')
    print("DETAILS:", e)
        
# internet is broken
except requests.exceptions.RequestException as e: 
    print("ERROR: Cannot connect to ", url)
    print("DETAILS:", e)


ERROR: Response from  http://www.syr.edu/mikeisawesum was not ok.
DETAILS: 404 Client Error: Not Found for url: https://www.syracuse.edu/mikeisawesum


###  Rule 3: Don't assume the data you get back is the data you expect.

And finally, do not assume the data arriving in the `response` is the data you expected. Specifically, when you try to decode the `JSON`, don't assume that will go smoothly. Catch the `json.decoder.JSONDecodeError`.

In [20]:
url = 'http://www.syr.edu' # this is HTML, not JSON
try:

    response = requests.get(url)  # throws an exception when it cannot connect
    response.raise_for_status()   # throws an exception when not 'ok'
    data = response.json()        # throws an exception when cannot decode json
    
# cannot decode json
except json.decoder.JSONDecodeError as e: 
    print("ERROR: Cannot decode the response into json")
    print("DETAILS", e)

# response not ok
except requests.exceptions.HTTPError as e:
    print("ERROR: Response from ", url, 'was not ok.')
    print("DETAILS:", e)
        
# internet is broken
except requests.exceptions.RequestException as e: 
    print("ERROR: Cannot connect to ", url)
    print("DETAILS:", e)

ERROR: Cannot decode the response into json
DETAILS Expecting value: line 1 column 1 (char 0)
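
The same failure can be reproduced without a network connection, which makes it easier to see what the exception means. The sketch below feeds a snippet of HTML (an invented stand-in for a web page) to `json.loads()`.

```python
import json

# Invented stand-in for an HTML page body.
html_body = '<!DOCTYPE html><html><body>Not JSON</body></html>'

try:
    data = json.loads(html_body)   # throws an exception because '<' cannot start a JSON value
except json.decoder.JSONDecodeError as e:
    data = None
    error_message = str(e)
    print("ERROR: Cannot decode the text into json")
    print("DETAILS", e)
```

The decoder gives up at the very first character, which is why the details point to line 1, column 1.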


### Now You try it!

Using the last example above, write a program to input a location, call the `get_coordinates_using_google()` function, then print the coordinates. Make sure to handle all three types of exceptions!


In [23]:
# NOTE: this gave mixed results for the same input. After waiting a while it worked again,
# so the API may limit the number of calls allowed per day/hour/etc.
import requests
import json

def get_coordinates_using_google(location):
    options = { 'address' : location }
    response = requests.get(url, params = options)
    response.raise_for_status()   # throws an exception when not 'ok', so the HTTPError handler can run
    geodata = response.json()
    coords = geodata['results'][0]['geometry']['location']
    return coords

try:
    location = input("Enter a location: ")
    url = 'http://maps.googleapis.com/maps/api/geocode/json'
    coords = get_coordinates_using_google(location)
    print("%s is located at the following coordinates: (Latitude: %f,Longitude %f)" % (location, coords['lat'], coords['lng']))

except json.decoder.JSONDecodeError as e: 
    print("ERROR: Cannot decode the response into json")
    print("DETAILS", e)

except requests.exceptions.HTTPError as e:
    print("ERROR: Response from ", url, 'was not ok.')
    print("DETAILS:", e)

except requests.exceptions.RequestException as e: 
    print("ERROR: Cannot connect to ", url)
    print("DETAILS:", e)

Enter a location: Hamburg, NY
Hamburg, NY is located at the following coordinates: (Latitude: 42.715893,Longitude -78.829477)
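
Rule 3 also covers responses that decode cleanly but do not contain what the code expects. The output earlier in this lab shows a `status` key next to `results`, so one option is to check both before walking the dictionary. The sketch below is one way to do that. `extract_coordinates` is a hypothetical helper, and the `ZERO_RESULTS` and `OVER_QUERY_LIMIT` sample values are assumptions about what the service sends when a lookup fails, so check them against the API documentation.

```python
def extract_coordinates(geodata):
    """Return the location dictionary, or raise ValueError if the data is not usable."""
    status = geodata.get('status')
    results = geodata.get('results', [])
    if status != 'OK' or len(results) == 0:
        raise ValueError("Geocoding failed with status: %s" % status)
    return results[0]['geometry']['location']

# Trimmed copy of a successful response from this lab.
good = {'results': [{'geometry': {'location': {'lat': 42.7158927, 'lng': -78.8294768}}}],
        'status': 'OK'}
# Assumed shapes of failed lookups (illustration only).
empty = {'results': [], 'status': 'ZERO_RESULTS'}
limited = {'results': [], 'status': 'OVER_QUERY_LIMIT'}

for sample in (good, empty, limited):
    try:
        print(extract_coordinates(sample))
    except ValueError as e:
        print("ERROR:", e)
```

Raising an exception keeps this consistent with Rule 2a: the main program can handle a bad payload in one more `except` block instead of an `if ... else`.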
