<h1>Table of Contents<span class="tocSkip"></span></h1>
<div class="toc"><ul class="toc-item"><li><span><a href="#Take-notice!" data-toc-modified-id="Take-notice!-1"><span class="toc-item-num">1&nbsp;&nbsp;</span>Take notice!</a></span></li><li><span><a href="#Geocoding-Template" data-toc-modified-id="Geocoding-Template-2"><span class="toc-item-num">2&nbsp;&nbsp;</span>Geocoding Template</a></span><ul class="toc-item"><li><span><a href="#Import-libraries" data-toc-modified-id="Import-libraries-2.1"><span class="toc-item-num">2.1&nbsp;&nbsp;</span>Import libraries</a></span></li><li><span><a href="#Data-exploration-and-cleanup" data-toc-modified-id="Data-exploration-and-cleanup-2.2"><span class="toc-item-num">2.2&nbsp;&nbsp;</span>Data exploration and cleanup</a></span></li><li><span><a href="#Trim-the-data" data-toc-modified-id="Trim-the-data-2.3"><span class="toc-item-num">2.3&nbsp;&nbsp;</span>Trim the data</a></span></li><li><span><a href="#Add-the-lat/lon-columns" data-toc-modified-id="Add-the-lat/lon-columns-2.4"><span class="toc-item-num">2.4&nbsp;&nbsp;</span>Add the lat/lon columns</a></span></li><li><span><a href="#Let's-just-use-20-random-rows" data-toc-modified-id="Let's-just-use-20-random-rows-2.5"><span class="toc-item-num">2.5&nbsp;&nbsp;</span>Let's just use 20 random rows</a></span></li><li><span><a href="#Loop-and-geocode" data-toc-modified-id="Loop-and-geocode-2.6"><span class="toc-item-num">2.6&nbsp;&nbsp;</span>Loop and geocode</a></span></li><li><span><a href="#Convert-to-geodataframe" data-toc-modified-id="Convert-to-geodataframe-2.7"><span class="toc-item-num">2.7&nbsp;&nbsp;</span>Convert to geodataframe</a></span></li><li><span><a href="#Add-base-layer-capability" data-toc-modified-id="Add-base-layer-capability-2.8"><span class="toc-item-num">2.8&nbsp;&nbsp;</span>Add base layer capability</a></span></li><li><span><a href="#Map-it!" data-toc-modified-id="Map-it!-2.9"><span class="toc-item-num">2.9&nbsp;&nbsp;</span>Map it!</a></span></li></ul></li></ul></div>

<div class="alert alert-danger">

<h1>Take notice!</h1>
<ul>
    <li>Make sure you are working with a copy and not the original notebook file</li>
    <li>This class will be recorded</li>
</ul>
    
</div>

# Geocoding Template

*Special thanks to **Jayne** for providing the data and workflow for this template!*

<img src="images/geocode.png">

Your data may have addresses but no other geographic identifiers (such as FIPS codes or latitude/longitude coordinates). In such cases, you need to **geocode** your table, that is, convert the addresses to geographic coordinates.

This template is designed to:
* take in a table that has a column with addresses
* clean the table so that only relevant columns are left
* loop through every row of the table and geocode them
* convert the geocoded table into a geodataframe
* map it!

## Import libraries

In [None]:
# to download osm data
import osmnx as ox

# to manipulate and visualize spatial data
import geopandas as gpd

# to provide basemaps
import contextily as ctx

# to plot things with plotly
import plotly.express as px

# we import this so we can process this csv file
import pandas as pd

projects = pd.read_csv('LAProjects.csv')

## Data exploration and cleanup

In [None]:
projects.head()

In [None]:
# keep only the projects located in the city of Los Angeles
la_projects = projects.loc[projects['Project City'] == 'Los Angeles'].copy()
la_projects.head()

## Trim the data

In [None]:
# keep only the columns we need
la_projects_trimmed = la_projects[['Type of tax credit funding',
 'Project Name',
 'Project Address',
 'Project City',
 'Project Zip Code',
 'Project County',
 'Census Tract',
 'Housing Type',
 'Total Units',
 'Low Income Units',
 'Annual Federal Award',
 'Total State Award',
]].copy()

In [None]:
# show a preview of the first 5 rows.
la_projects_trimmed.head()

## Add the lat/lon columns

We add empty lat/lon columns to our dataframe as placeholders for the geocoding.

In [None]:
la_projects_trimmed['lat'] = pd.Series(dtype='float')
la_projects_trimmed['lon'] = pd.Series(dtype='float')

In [None]:
la_projects_trimmed.head()
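
To see why this produces placeholders, here is a minimal sketch on a made-up toy table (the addresses are invented for illustration). Assigning an empty float `Series` aligns on the dataframe's index, so every row receives `NaN` and the column has a float dtype, ready to be filled in later.

```python
import pandas as pd

# Toy table standing in for the trimmed projects dataframe (values are made up).
toy = pd.DataFrame({'Project Address': ['100 Example St', '200 Sample Ave']})

# An empty float Series has no matching index labels, so every row becomes NaN.
toy['lat'] = pd.Series(dtype='float')
toy['lon'] = pd.Series(dtype='float')

# Inspect the column types and the placeholder values.
print(toy.dtypes)
print(toy)
```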

## Let's just use 20 random rows

The full table is large, and each address requires a separate request to the geocoding service, so let's geocode just 20 random rows.

In [None]:
la_projects_trimmed = la_projects_trimmed.sample(20)
la_projects_trimmed
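
Because `sample` draws different rows on every run, results can be hard to reproduce. The sketch below is one possible variation, not part of the original workflow: the helper `sample_rows` and the seed value are hypothetical, and the toy table is made up. It fixes `random_state` so the same rows come back each time, and caps `n` so that tables with fewer rows than requested do not raise an error.

```python
import pandas as pd

# Toy table with 10 made-up rows.
toy = pd.DataFrame({'Project Name': [f'Project {i}' for i in range(10)]})

def sample_rows(df, n, seed=42):
    """Return up to n random rows; the same seed always returns the same rows."""
    n = min(n, len(df))  # sample() raises ValueError if n exceeds the row count
    return df.sample(n, random_state=seed)

first = sample_rows(toy, 3)
second = sample_rows(toy, 3)
capped = sample_rows(toy, 20)  # only 10 rows exist, so 10 are returned

# Check that both draws picked the same rows.
print(first.index.equals(second.index))
print(len(capped))
```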

## Loop and geocode

Here, we loop over our cleaned-up and trimmed-down dataframe. Note that we wrap the geocoding call in a `try`/`except` block to [catch errors and exceptions](https://docs.python.org/3/tutorial/errors.html), so that one address that cannot be geocoded does not stop the whole loop.

In [None]:
# loop through the rows and add each geocoded lat/lon to the dataframe
for index, row in la_projects_trimmed.iterrows():

    # identify the address column
    address = row['Project Address']

    try:

        # geocode it (returns a (lat, lon) tuple)
        lat, lon = ox.geocoder.geocode(address)

        # add it to the dataframe
        la_projects_trimmed.at[index, 'lat'] = lat
        la_projects_trimmed.at[index, 'lon'] = lon

        # print the output
        print(f'{address} geocoded to {lat} {lon}')

    except Exception:
        # an f-string also works when the address is missing (NaN)
        print(f'Could not geocode {address}')


In [None]:
la_projects_trimmed
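
The loop above geocodes the street address alone, which can be ambiguous when the same street name exists in several places. One possible variation is sketched below: it joins the address, city, and zip code into a single query and records failures in a list instead of only printing them. Everything here is an assumption for illustration. `build_query`, `geocode_rows`, and `fake_geocode` are hypothetical helpers, and the coordinates are made up. `fake_geocode` stands in for `ox.geocoder.geocode` so the example runs offline.

```python
def is_missing(value):
    """True for None and for float NaN (NaN is the only value not equal to itself)."""
    return value is None or value != value

def build_query(*parts):
    """Join the non-missing, non-empty address parts into one comma-separated string."""
    cleaned = [str(p).strip() for p in parts if not is_missing(p)]
    return ', '.join(c for c in cleaned if c)

def geocode_rows(rows, geocode_fn):
    """Geocode each row; return (results, failures) rather than stopping on errors."""
    results, failures = {}, []
    for key, row in rows.items():
        query = build_query(row.get('Project Address'),
                            row.get('Project City'),
                            row.get('Project Zip Code'))
        try:
            results[key] = geocode_fn(query)
        except Exception as err:
            failures.append((key, query, str(err)))
    return results, failures

# Hypothetical offline stand-in for ox.geocoder.geocode (coordinates are made up).
KNOWN = {'100 Example St, Los Angeles, 90012': (34.05, -118.24)}

def fake_geocode(query):
    if query not in KNOWN:
        raise ValueError(f'no result for {query!r}')
    return KNOWN[query]

rows = {
    0: {'Project Address': '100 Example St', 'Project City': 'Los Angeles', 'Project Zip Code': 90012},
    1: {'Project Address': 'Nowhere Rd', 'Project City': 'Los Angeles', 'Project Zip Code': None},
}

results, failures = geocode_rows(rows, fake_geocode)
print(results)
print(failures)
```

In the notebook, `ox.geocoder.geocode` could be passed as `geocode_fn`, and the `failures` list gives a record of which addresses need manual review.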

## Convert to geodataframe

The resulting table is a pandas dataframe. Let's convert it to a geodataframe. Because the addresses were geocoded to latitude and longitude coordinates, we specify the WGS84 geographic coordinate system, `EPSG:4326`. Note that `points_from_xy` takes x (longitude) first, then y (latitude).

In [None]:
# convert pandas dataframe to geodataframe
la_projects_trimmed = gpd.GeoDataFrame(
    la_projects_trimmed,
    crs="EPSG:4326",
    geometry=gpd.points_from_xy(la_projects_trimmed.lon, la_projects_trimmed.lat))
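
Rows that could not be geocoded still hold `NaN` in `lat` and `lon`, so the conversion has no valid location to assign to them. A minimal sketch of one way to set those rows aside before converting, using a made-up toy table (the names and coordinates are invented):

```python
import pandas as pd

# Toy table: row 'B' represents an address that failed to geocode.
toy = pd.DataFrame({
    'Project Name': ['A', 'B', 'C'],
    'lat': [34.05, float('nan'), 34.10],
    'lon': [-118.24, float('nan'), -118.30],
})

# Keep only the rows that have both coordinates.
geocoded = toy.dropna(subset=['lat', 'lon'])

# Report how many rows were left out.
print(f'{len(toy) - len(geocoded)} of {len(toy)} rows had no coordinates')
```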


## Add base layer capability
Next, we reproject it to Web Mercator (`EPSG:3857`) so that contextily tiles can be used as a base layer.

In [None]:
# reproject to web mercator
la_projects_trimmed = la_projects_trimmed.to_crs(epsg=3857)
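
To get a feel for what this reprojection does to each point, the spherical Web Mercator forward formula can be written out by hand. This is an illustrative sketch only: `to_web_mercator` is a hypothetical helper, the sample coordinates are approximate, and in the notebook `to_crs` does this work for you.

```python
import math

EARTH_RADIUS_M = 6378137.0  # WGS84 semi-major axis, the sphere radius used by EPSG:3857

def to_web_mercator(lon_deg, lat_deg):
    """Convert longitude/latitude in degrees to Web Mercator x/y in meters."""
    x = EARTH_RADIUS_M * math.radians(lon_deg)
    y = EARTH_RADIUS_M * math.log(math.tan(math.pi / 4 + math.radians(lat_deg) / 2))
    return x, y

# An approximate point in Los Angeles: west of Greenwich, north of the equator.
x, y = to_web_mercator(-118.24, 34.05)
print(round(x), round(y))
```

The output is in meters rather than degrees, which is why the axis values on the final map look so different from latitude and longitude.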

## Map it!

In [None]:
ax = la_projects_trimmed.plot(figsize=(12,12),
                              column='Type of tax credit funding',
                              legend=True,
                              cmap='Set1',
                              markersize=60)

ax.axis('off')
ctx.add_basemap(ax,source=ctx.providers.CartoDB.Positron)