<a id="toc"></a>
# Table of Contents

## [Error Handling](#errors)
#### Syntax: Try/Except/Finally
## [APIs](#apis)
#### [Dealing with Proxies](#proxies)
#### [Status Codes](#status)
#### [Parsing Response Data (JSON)](#json)
#### [Using Our Census Data](#using)

<a id="errors"></a>
## Error Handling
[Back to Table of Contents](#toc)

When writing code, we often encounter what are called **exceptions** or **errors**. These can halt the execution of your code and cause terrible headaches. However, if we handle them correctly, they can be useful tools that help the logic flow of our program.

### Syntax: Try/Except/Finally

We can handle exceptions using a logical structure in Python called a Try/Except/Finally block.

For example:

`try:
    [some code happens here that generates an error]
except:
    [code which is activated if an error happens]
finally:
    [code which will run whether or not there is an error]`
    
Let's try it out!

In [None]:
def throw_an_error():
    raise ValueError("bad value")

    
throw_an_error()

In [None]:
try:
    print("program running...")
    throw_an_error()  # raises ValueError, so control jumps to the except block
except ValueError:
    print("Program encountered an error but we're still cool!")
finally:
    print("Program completed!")
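
As a minimal sketch of the same pattern, the example below catches one specific exception type and binds it with `as` so the message can be reported. The function name `safe_int` is hypothetical and not part of this lesson's files.

```python
def safe_int(text):
    """Convert text to an int, returning None if the text is not a valid integer."""
    try:
        value = int(text)
    except ValueError as err:
        # Only ValueError is handled here; any other exception still propagates
        print("Could not convert {0!r}: {1}".format(text, err))
        value = None
    finally:
        # Runs whether or not the conversion succeeded
        print("Finished processing {0!r}".format(text))
    return value


print(safe_int("42"))
print(safe_int("POP"))
```

Naming the exception type keeps unrelated problems (a typo in a variable name, for instance) from being silently hidden by the `except` block.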

<a id="apis"></a>
## APIs
[Back to Table of Contents](#toc)

An API, or _Application Programming Interface_, is a tool that lets applications talk to each other. It's a commonly used term and has multiple meanings. For our purposes, we're talking about _web-based APIs which return data_. We use the _requests_ library to send requests to web-APIs as well as websites.

<a id="proxies"></a>
### Dealing with Proxies

There is an attached file for this lesson called **settings.py** - inside of it, you'll see 4 variables:

- `CENSUS_KEY`
- `USERNAME`
- `PASS`
- `PROXIES`

The CENSUS_KEY comes from the Census website ([click this link to request one](https://api.census.gov/data/key_signup.html) - it only takes a minute).

I've included USERNAME, PASS, and PROXIES in case your organization enforces an internet proxy. You may not know if it does until you encounter a "status code" of 407 (more on status codes below). Fill in your settings file with the Active Directory information you use for authentication on your computer (USERNAME and PASS). **Keep this file in a secure place and do not publish it to a version control system (like GitHub)**.
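
The contents of **settings.py** are not shown in this lesson, so the layout below is only a sketch under assumptions. The proxy address `proxy.example.com:8080` and the helper variable `PROXY_HOST` are hypothetical, and every value is a placeholder. The shape of `PROXIES` (a dict mapping each URL scheme to a proxy URL) is what the _requests_ library expects.

```python
from urllib.parse import quote

# Placeholder values only; replace them with your own and never commit the real file
CENSUS_KEY = "YOUR_CENSUS_KEY"
USERNAME = "your.username"
PASS = "your-password"

# Hypothetical proxy address; your IT department can tell you the real host and port
PROXY_HOST = "proxy.example.com:8080"

# Percent-encode the credentials so characters like '@' or ':' do not break the URL
_credentials = "{0}:{1}".format(quote(USERNAME, safe=""), quote(PASS, safe=""))

# requests expects a dict mapping each URL scheme to a proxy URL
PROXIES = {
    "http": "http://{0}@{1}".format(_credentials, PROXY_HOST),
    "https": "http://{0}@{1}".format(_credentials, PROXY_HOST),
}
```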

<a id="status"></a>
### Status Codes

When a _request_ is sent to a web-API, a response is generated and sent back. The response comes with a bunch of metadata in what are called _headers_ but, for our purposes, we're only worried about the **status code** of the response because that can tell us if we're having issues connecting or requesting data. Most people have encountered the _404 Not Found_ code while browsing the web at some point, but there are many codes that we can encounter. Here are a few basic ones you should know:

- `200 OK` - Standard response for successful HTTP requests.
- `400 Bad Request` - The server cannot or will not process the request due to an apparent client error.
- `403 Forbidden` - The request was valid, but the server is refusing action.
- `404 Not Found` - The requested resource could not be found but may be available in the future.
- `407 Proxy Authentication Required` - The client must first authenticate itself with the proxy.

[You can find more information about various codes here on Wikipedia.](https://en.wikipedia.org/wiki/List_of_HTTP_status_codes)
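
To make the codes above concrete, here is a small sketch that turns a numeric code into its standard label using Python's built-in `http.HTTPStatus`. The helper names `describe_status` and `is_success` are hypothetical, chosen just for this illustration.

```python
from http import HTTPStatus


def describe_status(code):
    """Return a label such as '404 Not Found' for a numeric HTTP status code."""
    try:
        return "{0} {1}".format(code, HTTPStatus(code).phrase)
    except ValueError:
        # HTTPStatus raises ValueError for codes it does not know about
        return "{0} Unknown status".format(code)


def is_success(code):
    """Treat any 2xx code as success."""
    return 200 <= code < 300


for code in (200, 400, 403, 404, 407):
    print(describe_status(code), "->", "ok" if is_success(code) else "problem")
```

When working with the _requests_ library, the numeric code is available as `r.status_code`, so a check like `is_success(r.status_code)` could run before any attempt to parse the response.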

<a id="json"></a>
### Parsing Response Data (JSON)

Web-APIs can respond in many different data formats, but the most commonly used one is **JSON**, which stands for _JavaScript Object Notation_. JSON is a convenient data format because it is _human-readable_ and easy for machines to _parse_ and generate. It relies on name/value pairs, which are conveniently similar to the Python _dictionary_ datatype. You can read more about it on the [JSON.org](https://www.json.org/) website.

Here's an example:

`{
    "name": "William Brown",
    "address": {
        "streetAddress": "123 Abc Drive",
        "city": "Cool Guy City",
        "state": "FL",
        "postalCode": "33133-3100"
    },
    "phoneNumbers": [
        {
            "type": "mobile",
            "number": "305-555-1234"
        },
        {
            "type": "office",
            "number": "305-555-4567"
        }
    ]
}`
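
As a short sketch of how this maps onto Python, the standard library's `json.loads` turns JSON text into dictionaries and lists. The snippet uses a trimmed copy of the example above, and the variable names are only illustrative.

```python
import json

raw = """
{
    "name": "William Brown",
    "address": {"city": "Cool Guy City", "state": "FL"},
    "phoneNumbers": [
        {"type": "mobile", "number": "305-555-1234"},
        {"type": "office", "number": "305-555-4567"}
    ]
}
"""

person = json.loads(raw)  # JSON object -> dict, JSON array -> list

print(person["name"])
print(person["address"]["city"])

# Nested lists of objects can be filtered like any list of dicts
mobile_numbers = [p["number"] for p in person["phoneNumbers"] if p["type"] == "mobile"]
print(mobile_numbers)
```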

Let's make our first request and get some Census data on different state populations. [You can learn more about the Census API here.](https://www.census.gov/data/developers/data-sets.html)
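
Before sending the request, it can help to see how a query string is put together. This sketch uses only the standard library's `urlencode`; `YOUR_KEY` is a placeholder, and the names `BASE_URL`, `query`, and `full_url` are assumptions made for the illustration.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.census.gov/data/2018/pep/population"

query = {
    "get": "GEONAME,POP",   # variables to return
    "for": "state:*",       # every state
    "key": "YOUR_KEY",      # placeholder, not a real key
}

# safe=",:*" keeps those characters readable instead of percent-encoding them
full_url = BASE_URL + "?" + urlencode(query, safe=",:*")
print(full_url)
```

The _requests_ library builds the query string in a similar way when it is given a `params` dictionary, so the key does not have to be pasted into the URL by hand.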

In [1]:
import requests # We use the requests library as it contains the tools we need
from settings import PROXIES, CENSUS_KEY # We import the information we need from our settings file


# Here is the API URL we'll use to access data from the Census Bureau
URL = "https://api.census.gov/data/2018/pep/population?get=GEONAME,POP&for=state:*"


# We use parameters to access various objects
# First we need to make sure that the Census Bureau knows we're registered so we send our 'key'.
# If we were interested in asking other questions, we would add other parameters
PARAMS = {'key': CENSUS_KEY}

def get_census_state_pop_data():
    print('Sending Census state population request...')
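    # Pass PARAMS so the API key is actually sent; verify=False disables TLS certificate checks
    r = requests.get(url=URL, params=PARAMS, proxies=PROXIES, verify=False)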
    print('Request sent. Status code: {0}'.format(r.status_code))
    # extracting data in json format 
    return r.json()

census_data = get_census_state_pop_data()
census_data


Sending Census state population request...
Request sent. Status code: 200


[['GEONAME', 'POP', 'state'],
 ['Alabama', '4887871', '01'],
 ['Alaska', '737438', '02'],
 ['Arizona', '7171646', '04'],
 ['Arkansas', '3013825', '05'],
 ['California', '39557045', '06'],
 ['Colorado', '5695564', '08'],
 ['Connecticut', '3572665', '09'],
 ['Delaware', '967171', '10'],
 ['District of Columbia', '702455', '11'],
 ['Florida', '21299325', '12'],
 ['Georgia', '10519475', '13'],
 ['Hawaii', '1420491', '15'],
 ['Idaho', '1754208', '16'],
 ['Illinois', '12741080', '17'],
 ['Indiana', '6691878', '18'],
 ['Iowa', '3156145', '19'],
 ['Kansas', '2911505', '20'],
 ['Kentucky', '4468402', '21'],
 ['Louisiana', '4659978', '22'],
 ['Maine', '1338404', '23'],
 ['Maryland', '6042718', '24'],
 ['Massachusetts', '6902149', '25'],
 ['Michigan', '9995915', '26'],
 ['Minnesota', '5611179', '27'],
 ['Mississippi', '2986530', '28'],
 ['Missouri', '6126452', '29'],
 ['Montana', '1062305', '30'],
 ['Nebraska', '1929268', '31'],
 ['Nevada', '3034392', '32'],
 ['New Hampshire', '1356458', '33'],
 

When the above code has run, we can see that we've captured the population of each state. However, the raw response isn't readily usable for us, so we need to parse it further.

Now that we know the format of the data (a list of lists), we can write a function to break it down into a dict we can use.

In [3]:
def census_dict_from_list(census_list):
    state_dict = dict()
    for state_lst in census_list:
        try:
            state_dict[state_lst[0]] = int(state_lst[1])
        except ValueError:
            print("Error: Value {0} for key {1} is invalid.".format(state_lst[1], state_lst[0]))
    return state_dict

# let's give it a test run
states_dict = census_dict_from_list(get_census_state_pop_data())
states_dict

Sending Census state population request...
Request sent. Status code: 200
Error: Value POP for key GEONAME is invalid.


{'Alabama': 4887871,
 'Alaska': 737438,
 'Arizona': 7171646,
 'Arkansas': 3013825,
 'California': 39557045,
 'Colorado': 5695564,
 'Connecticut': 3572665,
 'Delaware': 967171,
 'District of Columbia': 702455,
 'Florida': 21299325,
 'Georgia': 10519475,
 'Hawaii': 1420491,
 'Idaho': 1754208,
 'Illinois': 12741080,
 'Indiana': 6691878,
 'Iowa': 3156145,
 'Kansas': 2911505,
 'Kentucky': 4468402,
 'Louisiana': 4659978,
 'Maine': 1338404,
 'Maryland': 6042718,
 'Massachusetts': 6902149,
 'Michigan': 9995915,
 'Minnesota': 5611179,
 'Mississippi': 2986530,
 'Missouri': 6126452,
 'Montana': 1062305,
 'Nebraska': 1929268,
 'Nevada': 3034392,
 'New Hampshire': 1356458,
 'New Jersey': 8908520,
 'New Mexico': 2095428,
 'New York': 19542209,
 'North Carolina': 10383620,
 'North Dakota': 760077,
 'Ohio': 11689442,
 'Oklahoma': 3943079,
 'Oregon': 4190713,
 'Pennsylvania': 12807060,
 'Rhode Island': 1057315,
 'South Carolina': 5084127,
 'South Dakota': 882235,
 'Tennessee': 6770010,
 'Texas': 287018
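
The `Error: Value POP for key GEONAME is invalid.` line in the output comes from the header row `['GEONAME', 'POP', 'state']`, whose second item cannot be converted to an integer. As an alternative sketch (the function name `census_dict_skip_header` is hypothetical), the header can be split off first and used to locate the columns:

```python
def census_dict_skip_header(census_list):
    """Build {state name: population} from rows shaped like [name, population, code]."""
    header, *rows = census_list  # first row holds the column names
    name_idx = header.index("GEONAME")
    pop_idx = header.index("POP")
    return {row[name_idx]: int(row[pop_idx]) for row in rows}


# First rows of the response shown earlier
sample = [
    ["GEONAME", "POP", "state"],
    ["Alabama", "4887871", "01"],
    ["Alaska", "737438", "02"],
]
print(census_dict_skip_header(sample))
```

This version no longer reports bad values, so it trades the error message for a cleaner result; which behavior is preferable depends on how much you trust the incoming data.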

In [4]:
import pandas as pd

df_emps_in_st = pd.read_excel('EmployeesPerState.xlsx', sheet_name='EmployeesPerState')

df_emps_in_st

|   | State | Employees |
|---|---|---|
| 0 | Wyoming | 200.0 |
| 1 | Vermont | 312.0 |
| 2 | District of Columbia | 326.0 |
| 3 | Alaska | 333.0 |
| 4 | North Dakota | 333.0 |
| 5 | South Dakota | 354.0 |
| 6 | Delaware | 368.0 |
| 7 | Rhode Island | 389.0 |
| 8 | Montana | 389.0 |
| 9 | Maine | 438.0 |


<a id="using"></a>
### Using our Census Data
[Back to Table of Contents](#toc)

Let's calculate two things:

- How many stores we have per state (1 store per 60k people)
- How many employees per store
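
As a worked example of the two calculations, here is the arithmetic for a single state, using Wyoming's population and employee count from the tables in this section. The constant name `PEOPLE_PER_STORE` is an assumption made for the illustration.

```python
PEOPLE_PER_STORE = 60000  # 1 store per 60k people

population = 577737   # Wyoming
employees = 200.0

# Stores are estimated from population, then employees are spread across them
num_stores = round(population / PEOPLE_PER_STORE, 2)
emps_per_store = round(employees / num_stores, 2)
print(num_stores, emps_per_store)
```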

In [5]:
# let's create a function we can apply to our new column

def lookup_state(state):
    return states_dict[state]

df_emps_in_st["population"] = df_emps_in_st["State"].apply(lookup_state)
df_emps_in_st

|   | State | Employees | population |
|---|---|---|---|
| 0 | Wyoming | 200.0 | 577737 |
| 1 | Vermont | 312.0 | 626299 |
| 2 | District of Columbia | 326.0 | 702455 |
| 3 | Alaska | 333.0 | 737438 |
| 4 | North Dakota | 333.0 | 760077 |
| 5 | South Dakota | 354.0 | 882235 |
| 6 | Delaware | 368.0 | 967171 |
| 7 | Rhode Island | 389.0 | 1057315 |
| 8 | Montana | 389.0 | 1062305 |
| 9 | Maine | 438.0 | 1338404 |


In [6]:
# now let's calculate the number of stores

df_emps_in_st = df_emps_in_st.assign(num_stores = lambda x: x['population'] / 60000)
df_emps_in_st = df_emps_in_st.round({'num_stores': 2})
df_emps_in_st

|   | State | Employees | population | num_stores |
|---|---|---|---|---|
| 0 | Wyoming | 200.0 | 577737 | 9.63 |
| 1 | Vermont | 312.0 | 626299 | 10.44 |
| 2 | District of Columbia | 326.0 | 702455 | 11.71 |
| 3 | Alaska | 333.0 | 737438 | 12.29 |
| 4 | North Dakota | 333.0 | 760077 | 12.67 |
| 5 | South Dakota | 354.0 | 882235 | 14.7 |
| 6 | Delaware | 368.0 | 967171 | 16.12 |
| 7 | Rhode Island | 389.0 | 1057315 | 17.62 |
| 8 | Montana | 389.0 | 1062305 | 17.71 |
| 9 | Maine | 438.0 | 1338404 | 22.31 |


In [7]:
# Calculate the number of employees per store
df_emps_in_st = df_emps_in_st.assign(emps_per_store = lambda x: x['Employees'] / x['num_stores'])
df_emps_in_st = df_emps_in_st.round({'emps_per_store': 2})
df_emps_in_st

|   | State | Employees | population | num_stores | emps_per_store |
|---|---|---|---|---|---|
| 0 | Wyoming | 200.0 | 577737 | 9.63 | 20.77 |
| 1 | Vermont | 312.0 | 626299 | 10.44 | 29.89 |
| 2 | District of Columbia | 326.0 | 702455 | 11.71 | 27.84 |
| 3 | Alaska | 333.0 | 737438 | 12.29 | 27.1 |
| 4 | North Dakota | 333.0 | 760077 | 12.67 | 26.28 |
| 5 | South Dakota | 354.0 | 882235 | 14.7 | 24.08 |
| 6 | Delaware | 368.0 | 967171 | 16.12 | 22.83 |
| 7 | Rhode Island | 389.0 | 1057315 | 17.62 | 22.08 |
| 8 | Montana | 389.0 | 1062305 | 17.71 | 21.96 |
| 9 | Maine | 438.0 | 1338404 | 22.31 | 19.63 |
