## First Steps with GeoPandas

In [None]:
%matplotlib inline
import geopandas as gpd

In [None]:
geo_df = gpd.read_file("data/cb_2015_us_state_20m.shp")
geo_df.plot()

In [None]:
geo_df.set_index(geo_df["STATEFP"].astype(int), inplace = True)
geo_df.head(5)

Mask out Alaska, Hawaii, and the territories.

In [None]:
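contiguous = (geo_df.index < 57) & ~geo_df.index.isin([2, 15])  # drop AK (2), HI (15), territories (> 56)
geo_df = geo_df[contiguous]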

Try plotting with different projections!

In [None]:
print(geo_df.crs)

# try 2163 (US National Atlas equal area), 3857 (web Mercator), 4269 (NAD83 lat/lon, drawn as plate carrée)
ax = geo_df.to_crs(epsg=2163).plot()
ax.set_axis_off()

## Single Mothers by State

Download the state-by-state data from the Census API and format the dataframe: set the state ID as an integer index, rename the data column, and cast it to a float.

In [None]:
import requests, pandas as pd
j = requests.get("http://api.census.gov/data/2014/acs5/profile?for=state:*&get=DP02_0037PE").json()
smom_df = pd.DataFrame(j[1:], columns = j[0])
smom_df["state"] = smom_df["state"].astype(int)
smom_df.set_index("state", inplace = True)
smom_df["DP02_0037PE"] = smom_df["DP02_0037PE"].astype(float)
smom_df.rename(columns = {"DP02_0037PE" : "Percent Mothers Unmarried"}, inplace = True)
smom_df.head()

Merge the single mothers dataset onto the states. Both dataframes are indexed by the integer state FIPS code, so an index join lines them up.

In [None]:
geo_merge = geo_df.join(smom_df, how = "inner")  # both are indexed by the integer state FIPS code
geo_merge.set_index("NAME")["Percent Mothers Unmarried"].sort_values(ascending = False).plot(kind = "bar", figsize = (15, 3))

Plot the fraction of children born to a single mother, by state.
* Use the US National Atlas Equal Area projection (EPSG:2163).
* Use `scheme = "quantiles"` and play with alpha (opacity) and the color maps.

## Pennsylvania Election Returns

* Import pandas and geopandas, and load the Democratic vote shares from the last election.
* See `Advanced.ipynb` for the (not actually very advanced) scraping from the PA elections site.

In [None]:
import pandas as pd, geopandas as gpd

demvote_df = pd.read_csv("pa_demshare.csv", index_col = "county")
demvote_df.head()

In [None]:
counties = gpd.read_file("data/cb_2015_us_county_20m.shp")

Select out JUST Pennsylvania, and set the index to the county, in the same format as `demvote_df`.
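One way this step might look, sketched on a toy table with plain pandas. It assumes that `demvote_df` is indexed by lower-case county names (as the later education data is) and that Pennsylvania is state FIPS `"42"`; the three rows here are placeholders, not the real shapefile.

```python
import pandas as pd

# Toy stand-in for the county table; the real one comes from the shapefile
toy_counties = pd.DataFrame({
    "STATEFP": ["42", "42", "17"],
    "NAME": ["Allegheny", "Philadelphia", "Cook"],
})

# Keep Pennsylvania only, then index by lower-case county name
pa = toy_counties[toy_counties["STATEFP"] == "42"].copy()
pa["county"] = pa["NAME"].str.lower()
pa = pa.set_index("county")
print(pa.index.tolist())
```

The same three lines should work unchanged on a GeoDataFrame, since the filter and index operations are inherited from pandas.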

Now merge the counties and the vote shares together, using the county name index.
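As a rough illustration of an index-based merge, here is a sketch with two toy tables that share a county-name index. All numbers are made up, and `AREA` is a hypothetical column; only the vote-share column name is taken from this notebook.

```python
import pandas as pd

# Toy stand-ins, both indexed by lower-case county name (made-up numbers)
toy_counties = pd.DataFrame(
    {"AREA": [1.0, 2.0, 3.0]},
    index=pd.Index(["adams", "berks", "erie"], name="county"),
)
toy_votes = pd.DataFrame(
    {"Democratic Two-Party Vote Share": [0.5, 0.6]},
    index=pd.Index(["berks", "adams"], name="county"),
)

# An inner join keeps only the counties present in both tables
joined = toy_counties.join(toy_votes, how="inner")
print(joined)
```

Row order in the two tables does not matter, because `join` aligns on the index rather than on position.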

Now Plot 'em!!

In [None]:
ax = counties.plot()
ax.set_axis_off()

* Let's make another choropleth map, this time with `scheme = "equal_interval"`.
* This time, an appropriate CRS is EPSG:3651, the state plane projection for southern Pennsylvania ([spatial reference](http://spatialreference.org/ref/epsg/3651/)).

Here, as above, we download data from the Census API. This time, it's the percent of adults with a bachelor's degree, at the county level.

In [None]:
import requests, pandas as pd
j = requests.get("http://api.census.gov/data/2014/acs5/profile?for=county:*&in=state:42&get=NAME,DP02_0067PE").json()
educ_df = pd.DataFrame(j[1:], columns = j[0])
educ_df["county"] = educ_df["NAME"].str.lower()
educ_df["county"] = educ_df["county"].str.replace(" county, pennsylvania", "")
educ_df.set_index("county", inplace = True)
educ_df["DP02_0067PE"] = educ_df["DP02_0067PE"].astype(float)
educ_df.rename(columns = {"DP02_0067PE" : "Bachelor's Degree"}, inplace = True)
educ_df.head()

### Merge and Plot Bachelor's Degrees vs. Vote Share
Note again, the merging key is fundamentally geographical, though we're doing it with attributes.

In [None]:
merged = demvote_df.join(educ_df, how = "inner")
merged.plot(kind = "scatter", x = "Democratic Two-Party Vote Share", y = "Bachelor's Degree")

## Spatial Joins

### Census Tracts
Import the census tracts for Chicago (Cook County, Illinois).

In [None]:
import pandas as pd, geopandas as gpd

In [None]:
tract_df = gpd.read_file("data/cb_2014_17_tract_500k.shp")
tract_df = tract_df[tract_df["COUNTYFP"] == "031"]
tract_df.rename(columns = {"NAME" : "Census Tract"}, inplace = True)

* Take a look at `first_degree_murders.csv`.
* There is no "geometry" column, but there _are_ latitudes and longitudes.
* Import it, and make the geometry.

In [None]:
from shapely.geometry import Point

crime_df = pd.read_csv("first_degree_murders.csv", usecols = [19, 20])
crime_df.dropna(inplace = True)

geometry = [Point(xy) for xy in zip(crime_df.Longitude, crime_df.Latitude)]
crime_coords = gpd.GeoDataFrame(crime_df, crs = tract_df.crs, geometry=geometry)

Now use the spatial join syntax, to associate the points to census tracts.
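To see what a spatial join does before running it on the real data, here is a small sketch on toy geometries: two square "tracts" and three points, one of which lies outside both. It assumes a recent GeoPandas, where the keyword is `predicate` (older releases call it `op`).

```python
import geopandas as gpd
from shapely.geometry import Point, box

# Two toy "tracts" side by side, and three points (the last is outside both)
tracts = gpd.GeoDataFrame(
    {"Census Tract": ["A", "B"]},
    geometry=[box(0, 0, 1, 1), box(1, 0, 2, 1)],
    crs="EPSG:4269",
)
points = gpd.GeoDataFrame(
    geometry=[Point(0.5, 0.5), Point(1.5, 0.5), Point(5, 5)],
    crs="EPSG:4269",
)

# Keep each point that falls within a tract, with that tract's columns attached
located = gpd.sjoin(points, tracts, how="inner", predicate="within")
print(located["Census Tract"].tolist())
```

Both frames must share a CRS for the result to be meaningful; the point outside every tract is dropped by the inner join.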

In [None]:
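located_crimes = gpd.sjoin(crime_coords, tract_df, how = "inner", predicate = "within")  # older GeoPandas spells the keyword op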

In [None]:
located_crimes.plot()

We now have census tracts for each point.  To make a choropleth, we want to count/group over census tracts, and then attribute merge.  Since we're counting, any old column will do...

Then merge the counts back onto the tracts dataframe so that we can plot them.
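The count-then-merge pattern can be sketched with plain pandas on toy tables. The `Murders` column name is a hypothetical label chosen for this example, and the rows are made up.

```python
import pandas as pd

# Toy stand-in for the spatial-join output: one row per located point
located = pd.DataFrame({
    "Census Tract": ["A", "A", "B"],
    "Latitude": [41.8, 41.9, 41.7],
})
tracts = pd.DataFrame({"Census Tract": ["A", "B", "C"]})

# Count rows per tract; any non-null column works as the thing to count
counts = (located.groupby("Census Tract")["Latitude"].count()
          .rename("Murders").reset_index())

# A left merge keeps tracts with no points; fill their counts with zero
tract_counts = tracts.merge(counts, on="Census Tract", how="left")
tract_counts["Murders"] = tract_counts["Murders"].fillna(0).astype(int)
print(tract_counts)
```

When the left table is a GeoDataFrame, the merged result keeps its geometry column, so it can be passed straight to `plot(column = ...)`.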

The census tracts are a little too small to make a readable map.

### Again with Community Areas
Let's do the same thing again, this time with Chicago's community areas.

Import the community areas:

In [None]:
commu_df = gpd.read_file("community_areas.geojson")

Create the dataframe of crime coordinates again, but this time matching the community area CRS.
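A hedged sketch of one way to match CRSs: declare the CRS the coordinates are actually in, then reproject with `to_crs`. It assumes the latitudes and longitudes are WGS84 (EPSG:4326). The target `EPSG:3857` is only a placeholder; in the notebook it would be `commu_df.crs`.

```python
import pandas as pd
import geopandas as gpd

# Toy coordinates standing in for the Longitude/Latitude columns
coords = pd.DataFrame({"Longitude": [-87.60, -87.65], "Latitude": [41.79, 41.88]})

# Placeholder target CRS; in the notebook this would be commu_df.crs
target_crs = "EPSG:3857"

# Build points in the CRS the numbers are really in (assumed WGS84) ...
pts = gpd.GeoDataFrame(
    coords,
    geometry=gpd.points_from_xy(coords["Longitude"], coords["Latitude"]),
    crs="EPSG:4326",
)
# ... then reproject so both layers share one CRS before the spatial join
pts = pts.to_crs(target_crs)
print(pts.crs)
```

If the community areas are already stored in the same CRS as the coordinates, `to_crs` leaves the points where they are, so the pattern is safe either way.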

As above, do the spatial join:

Then group by community area and count the points in each.

Merge back on to the community areas and plot!!

## Interactive Web Maps!?  AWESOME!!!

In [None]:
import folium
import branca.colormap as cm  # folium's old colormap module now lives in branca

m = folium.Map([39.828175, -98.5795], 
               tiles='cartodbpositron', 
               zoom_start=4, max_zoom=14, min_zoom=4)

ft = "Percent Mothers Unmarried"
cmap = cm.linear.YlOrRd_09.scale(geo_merge[ft].min(), geo_merge[ft].max())

folium.GeoJson(geo_merge,
               style_function=lambda feature: {
                'fillColor': cmap(feature['properties'][ft]),
                'fillOpacity' : 0.6,
                'weight' : 2, 'color' : 'black'
               }).add_to(m)

cmap.caption = 'Percent Children Born to Single Mothers'
cmap.add_to(m)

m.save("us_single_mothers.html")
m

## Spatial Associations and Geocoding

In [None]:
from geopy.geocoders import Nominatim
geolocator = Nominatim(user_agent = "geopandas-tutorial")  # recent geopy requires a user_agent; this name is a placeholder
location = geolocator.geocode("6021 S. Kimbark Ave, Chicago")
location

In [None]:
from shapely.geometry import Point
pt = Point(-87.5940494865461, 41.7851555)  # shapely order is (x, y), i.e. (longitude, latitude)

In [None]:
state_df = gpd.read_file("data/cb_2015_us_state_20m.shp")
state_df[state_df.contains(pt)]["NAME"]

In [None]:
tract_df = gpd.read_file("data/cb_2014_17_tract_500k.shp")
tract_df = tract_df[tract_df["COUNTYFP"] == "031"]
tract_df.head()

In [None]:
tract_df[tract_df.contains(pt)]["NAME"]