# Banking Deserts and Unemployment

In this activity, your broad task is to answer the question: "Is there a relationship between poverty, age, or population and the number of banks in a given area?" To help with this task, you've been given census data and a count of the financial institutions within 700 randomly selected zip codes across the country. We'll use this data to create layered maps with GMaps and see whether we can visualize a relationship.

[Atlantic article discussing topic](https://www.theatlantic.com/business/archive/2016/03/banking-desert-ny-fed/473436/)

---

### Instructions

* Using [zip_bank_data.csv](../Resources/zip_bank_data.csv) and your new knowledge of the US Census API, add a column for `Unemployment Rate` to the csv.

* Using `gmaps`, create the following three figures:

  * A map with a `heatmap_layer` of the poverty rate for each zip code.
  
    <img src="../Images/heatmap.png" width=30%></img>

  * A map with a `symbol_layer` for the number of banks located in each zip code.
  
    <img src="../Images/bank_map.png" width=30%></img>

  * A map that includes both the poverty `heatmap_layer` and the bank `symbol_layer`.
  
    <img src="../Images/final_map.png" width=30%></img>
    
### Documentation

##### [GMaps tutorials and examples](https://jupyter-gmaps.readthedocs.io/en/latest/tutorial.html#getting-started)
##### [GMaps API docs for functions and parameter descriptions](https://jupyter-gmaps.readthedocs.io/en/latest/api.html)

### Hints

* While debugging, test your code with only 5-10 zip codes at a time.

* For reference, use the docs for the [layers](http://jupyter-gmaps.readthedocs.io/en/latest/api.html#figures-and-layers), and use the [tutorial](http://jupyter-gmaps.readthedocs.io/en/latest/tutorial.html) as a refresher on setting up the maps.

* Be sure to handle zoom on the heat map.

* At this point, you should not need to perform any new requests to Google's APIs.

* To format the info boxes on your `symbol_layer`, use string formatting inside a list comprehension.
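
As a minimal sketch of the last hint, the snippet below builds one info-box string per row with an f-string inside a list comprehension. The `toy_zipcodes` and `toy_bank_counts` lists and the label wording are made up for illustration; in the activity the values would come from your merged DataFrame.

```python
# Hypothetical sample values standing in for DataFrame columns
toy_zipcodes = [77005, 10001, 94105]
toy_bank_counts = [12, 30, 18]

# One formatted string per (zip code, bank count) pair
info_boxes = [
    f"Zipcode {zipcode}: {count} banks"
    for zipcode, count in zip(toy_zipcodes, toy_bank_counts)
]

for box in info_boxes:
    print(box)
```

A list like `info_boxes` has one entry per location, which is the shape the symbol layer's info box argument is expected to take; check the layer docs linked above for the exact parameter name.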

---

In [None]:
# Dependencies
from census import Census
import gmaps
import pandas as pd

# import API keys
from config import (census_key, gkey)

# Create the Census client with the API key (2013 data)
c = Census(census_key, year=2013)

## Data Retrieval

#### Documentation
* [Census API wrapper documentation](https://github.com/CommerceDataService/census-wrapper)
* [Available Census fields and labels](https://gist.github.com/afhaque/60558290d6efd892351c4b64e5c01e9b)

In [None]:
# Run Census Search to retrieve data on all zip codes (2013 ACS5 Census)
census_data = c.acs5.get(("B01003_001E", "B23025_005E"), {
                         'for': 'zip code tabulation area:*'})

# Convert to DataFrame
unemployment_df = pd.DataFrame(census_data)

# Column Renaming
unemployment_df = unemployment_df.rename(columns={"B01003_001E": "Population",
                                      "zip code tabulation area": "Zipcode",
                                      "B23025_005E": "Unemployment Count"})

unemployment_df.head()

## Calculate the `Unemployment Rate` and add it as a new column to the DataFrame

$$ \text{Unemployment Rate} = \frac{\text{Unemployment Count}}{\text{Population}} $$
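
As a minimal sketch of this calculation on made-up numbers, the snippet below applies the formula to a small DataFrame. The `toy_df` and `toy_rate_df` names and all values are hypothetical, not real census output. Converting the columns with `pd.to_numeric` and masking zero populations are precautions assumed here, not steps the activity requires.

```python
import pandas as pd

# Hypothetical rows; values are strings to mimic data that arrives as text
toy_df = pd.DataFrame({
    "Zipcode": ["00601", "00602", "00603"],
    "Population": ["1000", "0", "500"],
    "Unemployment Count": ["50", "0", "25"],
})

# Make sure the columns used in the division are numeric
for col in ["Population", "Unemployment Count"]:
    toy_df[col] = pd.to_numeric(toy_df[col])

# Treat zero-population rows as missing so the division yields NaN, not inf
population = toy_df["Population"].where(toy_df["Population"] > 0)
toy_df["Unemployment Rate"] = toy_df["Unemployment Count"] / population

# Keep only the two columns needed later, in order
toy_rate_df = toy_df[["Zipcode", "Unemployment Rate"]]
print(toy_rate_df)
```

The rate here is a fraction between 0 and 1, matching the formula above; multiply by 100 if you prefer a percentage.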

In [None]:
# Create an Unemployment Rate column that is calculated using Unemployment Count and Population


# Create the final DataFrame with only columns for Zipcode and Unemployment Rate (in that order)


# View final DataFrame
print("Number of zip codes in data: " + str(len(unemployment_df)))
unemployment_df.head()

## Load *zip_bank_data.csv* into a DataFrame

* This file contains rows for 700 randomly selected zip codes from across the country.
* The available columns reflect the same demographic data that we pulled from the Census API in a previous exercise, but we've also included: **`Bank Count`**, **`Lat`**, and **`Lng`**.

In [None]:
# Read demographic and bank data from CSV
demo_df = pd.read_csv("../Resources/zip_bank_data.csv", encoding="utf-8")

# Visualize
demo_df.head()

## Merge the two data sets along zip code

#### HINTS
* When thinking about the type of merge/join to use, think about this: *We want to keep all of the data from the master DataFrame, but we only need the unemployment rate for the 700 randomly selected zip codes.*
* Refer to the [first exercise from Lesson 4.3](https://rice.bootcampcontent.com/Rice-Coding-Bootcamp/RICEHOU201811DATA2/blob/master/class-tth/04-Pandas/4.3%20-%20Data%20Parsing%20with%20Pandas/Activities/01-Ins_Merging/Solved/Merging.ipynb) to review how merges/joins work. 
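
To illustrate the first hint, here is a small sketch of a left merge on toy data. The `toy_demo_df` and `toy_rate_df` frames and their values are hypothetical stand-ins for the bank data and the unemployment data. It assumes the zip codes arrive as strings on one side and integers on the other, so the string side is converted first.

```python
import pandas as pd

# Hypothetical "master" data: the zip codes we care about
toy_demo_df = pd.DataFrame({
    "Zipcode": [77005, 10001],
    "Bank Count": [12, 30],
})

# Hypothetical unemployment data covering more zip codes, stored as strings
toy_rate_df = pd.DataFrame({
    "Zipcode": ["77005", "10001", "94105"],
    "Unemployment Rate": [0.03, 0.04, 0.02],
})

# Align the key dtype before merging
toy_rate_df["Zipcode"] = toy_rate_df["Zipcode"].astype(int)

# A left merge keeps every row of the left frame and drops unmatched right rows
toy_merged_df = toy_demo_df.merge(toy_rate_df, on="Zipcode", how="left")
print(toy_merged_df)
```

With `how="left"`, the result has exactly as many rows as the left frame, and the extra zip code that appears only in the unemployment data is dropped.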

In [None]:
# When merging 2 DataFrames, the merge column must be of the same datatype in both DataFrames
# The census API returned the Zipcode column as strings (dtype object), so we should convert it to an int first




print("Do the Zipcode dtypes match?: " + str(unemployment_df['Zipcode'].dtype == demo_df['Zipcode'].dtype))

In [None]:
# Merge the master and unemployment DataFrames on the Zipcode column



# Visualize
final_demo_df.head()

## Create a Heatmap of poverty rate

#### Configure gmaps with API key

In [None]:
gmaps.configure(api_key=gkey)

#### Create a new variable to contain `Lat` and `Lng` and another new variable to contain the `Poverty Rate`

In [None]:
# Store 'Lat' and 'Lng' into a variable called 'locations'



# Store 'Poverty Rate' in a variable called 'poverty_rate'




#### Create a poverty Heatmap layer

In [None]:
# Create the mapping figure



# Create the heatmap layer using locations and poverty_rate



# Add heatmap layer to figure



# Display the figure




## Create a map of the number of banks

To plot the bank count, we'll use a `symbol_layer`. A `symbol_layer` allows us to edit the look of the data point on the map. 

Using this type of layer, we'll use the number of banks for a particular zip code to increase or decrease the opacity of the data point (more banks = darker point). To do this, we need to rescale the number of banks down to a range of 0 to 1. Rescaling this way preserves the shape of the distribution of the data.

[Symbol layer documentation](https://jupyter-gmaps.readthedocs.io/en/latest/tutorial.html#markers-and-symbols)
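
As a quick worked example of the rescaling described above, the sketch below applies min-max normalization to three made-up bank counts. The `toy_counts` values are hypothetical. The smallest count maps to 0, the largest to 1, and the values in between keep their relative spacing.

```python
import pandas as pd

# Hypothetical bank counts for three zip codes
toy_counts = pd.Series([2, 4, 10])

# Min-max normalization: (x - min) / (max - min)
toy_norm = (toy_counts - toy_counts.min()) / (toy_counts.max() - toy_counts.min())

print(toy_norm.tolist())  # [0.0, 0.25, 1.0]
```

Note that this formula divides by zero if every count is identical, so it assumes the data contains at least two distinct values.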

In [None]:
# Create a variable to store the bank count
bank_count = final_demo_df["Bank Count"]

# Normalize the bank counts from 0 to 1 and store that in a new variable
norm_count = (bank_count - bank_count.min()) / (bank_count.max() - bank_count.min())

# View the distribution of normalized bank counts
norm_count.hist()

In [None]:
# Create bank symbol layer



# Create a mapping figure



# Add the layer to the figure



# Display the figure




## Create a map that combines both layers

#### Hint
* You do not need to recreate the layers. Try adding the ones that you've already created to a new figure.

In [None]:
# Create and display a map with both layers created above
