# Interactive Geospatial Guide

Purpose: Create interactive geospatial visualizations with minimal Python knowledge and only a few packages.

# 1. Install Packages

If you're using the Anaconda distribution, pandas, matplotlib, and bokeh are already included, so you'll only need to install geopandas.

In [None]:
## Anaconda installs

# Installs geopandas (-y skips the confirmation prompt, which a notebook cell cannot answer)
!conda install -y geopandas

# Updates all other packages
!conda update -y --all

If you're not working on Anaconda, you'll likely need a few more packages.

In [None]:
## Non-Anaconda installs

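# Installs geopandas and the other packages used in this guide
!pip install geopandas
!pip install bokeh
!pip install pandas
!pip install matplotlib

# Updates the packages above if older versions are already installed
!pip install --upgrade geopandas bokeh pandas matplotlib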
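
After installing, it can help to confirm that every package this guide imports is actually visible to the notebook's Python. The sketch below uses only the standard library; `REQUIRED` and `missing_packages` are names made up for this illustration.

```python
from importlib.util import find_spec

# Packages imported later in this guide; json ships with Python itself
REQUIRED = ["pandas", "matplotlib", "geopandas", "bokeh", "json"]

def missing_packages(names):
    """Return the names that cannot be found in the current environment."""
    return [name for name in names if find_spec(name) is None]

missing = missing_packages(REQUIRED)
if missing:
    print("Still need to install:", ", ".join(missing))
else:
    print("All packages found.")
```

If anything is reported as missing, rerun the matching install cell above and restart the kernel before importing.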

# 2. Loading Data & Packages

Packages: pandas, matplotlib, geopandas, json, bokeh

Data: County-level statistics for the DC, Maryland, and Virginia area.

In [94]:
## Package imports

# For handling data generally
import pandas as pd

# Base plotting package
import matplotlib.pyplot as plt
%matplotlib inline

# For importing shapefiles
import geopandas as gpd

# For converting dataframes to json files
import json

# For visualizing geospatial data
import bokeh

All data used in this guide can be obtained via public sources.

In [95]:
## Data imports

# Read in the federal data
data = pd.read_csv("data/federal_data.csv", index_col = 0)
data.head()

Unnamed: 0,GEOID,year,units,units_sf,units_2_4,units_mf,name,state,land_area,inequality_index,...,private_hospitals,non_profit_hospitals,tribal_hospitals,exp_homelessness,votes_dem_percent,votes_rep_percent,votes_green_percent,votes_lib_percent,votes_other_percent,rural_level
9315,11001,1990,368,180,180,162,Washington,DC,,,...,,,,,,,,,,1.0
9316,11001,1991,333,83,83,236,Washington,DC,,,...,,,,,,,,,,
9317,11001,1992,132,92,92,26,Washington,DC,,,...,,,,,,,,,,
9318,11001,1993,305,99,142,163,Washington,DC,,,...,,,,,,,,,,
9319,11001,1994,210,96,96,114,Washington,DC,,,...,,,,,,,,,,


These data contain a variety of features extracted from a range of US federal agencies. A codebook for the features can be found in the data zipfile. For this tutorial, we will use the Census's inequality index for 2019.

In [96]:
# Subset data to GEOID, year, state, and inequality index and save as a new dataset
df = data[["GEOID", "year", "state", "inequality_index"]]

In [97]:
# Keep only 2019 rows that have an inequality index
df = df.loc[df["inequality_index"].notna() &
            (df["year"] == 2019)]
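
To see how the two conditions combine, here is a small sketch on made-up rows (the values are hypothetical, not taken from `federal_data.csv`). Each condition produces a True/False column, and `&` keeps only the rows where both are True. When the conditions are written inline, each comparison needs its own parentheses because `&` binds more tightly than `==`.

```python
import pandas as pd

# Hypothetical rows for illustration only
toy = pd.DataFrame({
    "GEOID": [11001, 11001, 24001, 51001],
    "year": [2018, 2019, 2019, 2019],
    "inequality_index": [0.52, 0.53, None, 0.41],
})

has_value = toy["inequality_index"].notna()   # True where the index is present
is_2019 = toy["year"] == 2019                 # True for 2019 rows

kept = toy.loc[has_value & is_2019]
print(kept)
```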

Now we'll import the county shapefiles!

In [98]:
# Read in the corresponding spatial data
counties_usa = gpd.read_file('data/shapefiles/census_counties.shp')
counties_usa.head()

Unnamed: 0,STATEFP,COUNTYFP,COUNTYNS,AFFGEOID,GEOID,NAME,LSAD,ALAND,AWATER,geometry
0,21,7,516850,0500000US21007,21007,Ballard,6,639387454,69473325,"POLYGON ((-89.18137 37.04630, -89.17938 37.053..."
1,21,17,516855,0500000US21017,21017,Bourbon,6,750439351,4829777,"POLYGON ((-84.44266 38.28324, -84.44114 38.283..."
2,21,31,516862,0500000US21031,21031,Butler,6,1103571974,13943044,"POLYGON ((-86.94486 37.07341, -86.94346 37.074..."
3,21,65,516879,0500000US21065,21065,Estill,6,655509930,6516335,"POLYGON ((-84.12662 37.64540, -84.12483 37.646..."
4,21,69,516881,0500000US21069,21069,Fleming,6,902727151,7182793,"POLYGON ((-83.98428 38.44549, -83.98246 38.450..."


We're only working with DC, Maryland, and Virginia here, so we'll subset to states with FIPS codes 11 (DC), 24 (Maryland), and 51 (Virginia).

In [99]:
# Subset the shapefiles to DC, Maryland, and Virginia
counties_usa = counties_usa.loc[counties_usa["STATEFP"].isin(["11", "24", "51"])]
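
A county GEOID is the two-digit state FIPS code followed by the three-digit county code, which is why the first row of the output above pairs STATEFP 21 and COUNTYFP 7 with GEOID 21007. The sketch below shows that split in plain Python; `split_geoid` is a hypothetical helper written for this example, and it pads with zeros in case the GEOID was read as an integer.

```python
def split_geoid(geoid):
    """Split a county GEOID into (state FIPS, county FIPS) strings."""
    text = str(geoid).zfill(5)   # restore a leading zero lost by integer conversion
    return text[:2], text[2:]

for geoid in ["21007", 11001, 1001]:
    state_fips, county_fips = split_geoid(geoid)
    print(geoid, "->", state_fips, county_fips)
```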

Now, we'll subset the columns to only GEOID and geometry.

In [100]:
# Subset to GEOID and geometry
counties_usa = counties_usa[["GEOID", "geometry"]]

In [101]:
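# Convert GEOID to integers so it matches the GEOID type in the federal data
# (assign returns a new table, which avoids pandas' SettingWithCopyWarning)
counties_usa = counties_usa.assign(GEOID = counties_usa["GEOID"].astype(int))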
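
The conversion matters because a merge key should have the same type on both sides. The sketch below uses made-up tables and assumes, as the `astype(int)` step suggests, that the shapefile stores GEOID as text while the federal data stores it as an integer. `keys_compatible` is a hypothetical and deliberately crude check, not a pandas function.

```python
import pandas as pd
from pandas.api.types import is_integer_dtype

# Hypothetical stand-ins: text keys on one side, integer keys on the other
shapes = pd.DataFrame({"GEOID": ["11001", "24001", "51001"]})
stats = pd.DataFrame({"GEOID": [11001, 24001, 51001]})

def keys_compatible(left, right, key):
    """Crude check: are both key columns integers, or both non-integers?"""
    return is_integer_dtype(left[key]) == is_integer_dtype(right[key])

before = keys_compatible(shapes, stats, "GEOID")
shapes = shapes.assign(GEOID=shapes["GEOID"].astype(int))
after = keys_compatible(shapes, stats, "GEOID")
print(before, after)
```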

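Finally, we'll merge the federal data onto the county shapes and keep only the inequality index, state, and geometry columns.

***NOTE:*** The shapefile MUST be on the left in the merge. A merge returns the same type as its left-hand object, so putting the GeoDataFrame on the left keeps the result a GeoDataFrame with a usable geometry column.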

In [102]:
# Merge in shapefiles
df = counties_usa.merge(df, 
              how = "left",
              on = "GEOID")
df = df[["inequality_index", "state", "geometry"]]
df.head()

Unnamed: 0,inequality_index,state,geometry
0,0.5269,DC,"POLYGON ((-77.11976 38.93434, -77.11253 38.940..."
1,0.4133,MD,"POLYGON ((-76.84036 39.10314, -76.83678 39.104..."
2,0.41,VA,"POLYGON ((-79.53328 38.15614, -79.53273 38.157..."
3,0.4677,VA,"POLYGON ((-78.90459 37.02229, -78.90401 37.022..."
4,0.6002,VA,"POLYGON ((-82.55383 37.20284, -82.55037 37.204..."
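
Because this is a left merge, every county shape is kept even when the federal data has no matching row, and those counties end up with missing values. One way to check for them is pandas' `indicator` option, sketched here on made-up tables (plain strings stand in for the geometry column).

```python
import pandas as pd

# Hypothetical stand-ins for the county shapes and the 2019 statistics
shapes = pd.DataFrame({"GEOID": [11001, 24001, 51001],
                       "geometry": ["poly_a", "poly_b", "poly_c"]})
stats = pd.DataFrame({"GEOID": [11001, 51001],
                      "inequality_index": [0.53, 0.41]})

# indicator=True adds a _merge column saying where each row came from
checked = shapes.merge(stats, how="left", on="GEOID", indicator=True)

# "left_only" marks shapes that found no matching statistics
unmatched = checked.loc[checked["_merge"] == "left_only", "GEOID"].tolist()
print(unmatched)
```

Unmatched counties still appear on the map; they simply have no value to color by.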


# 3. Basic Visualization

Goal: Create and fine-tune a simple Bokeh visualization.

First, let's convert our GeoDataFrame to GeoJSON.

In [103]:
# Convert to GeoJSON format for plotting
from bokeh.models import GeoJSONDataSource
df_geo = GeoJSONDataSource(geojson = df.to_json())    # to_json() returns the GeoDataFrame as a GeoJSON string
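
For a sense of what that GeoJSON string contains, the sketch below builds a tiny, hypothetical one-county FeatureCollection with the standard `json` module. Each feature carries a `properties` dictionary (the non-geometry columns) and a `geometry`. Bokeh turns the properties into data columns and the polygon coordinates into `xs` and `ys`, which is why the plotting code can refer to fields like `state` by name. The coordinates here are invented.

```python
import json

# Hypothetical single-feature collection, shaped like GeoJSON output
feature_collection = {
    "type": "FeatureCollection",
    "features": [{
        "type": "Feature",
        "properties": {"state": "DC", "inequality_index": 0.5269},
        "geometry": {
            "type": "Polygon",
            "coordinates": [[[-77.12, 38.93], [-77.04, 38.99],
                             [-76.91, 38.89], [-77.12, 38.93]]],
        },
    }],
}

# Serialize to text, then parse it back to inspect the structure
geojson_text = json.dumps(feature_collection)
parsed = json.loads(geojson_text)
for feature in parsed["features"]:
    print(feature["properties"], feature["geometry"]["type"])
```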

Second, let's make sure Bokeh plots display in this notebook.

In [104]:
# Tools for displaying plots inside the notebook
from bokeh.io import output_notebook, show

In [105]:
output_notebook()    # Very important! Run a plot before this and see what happens

## A. Shapes & Colors

Here, two plots will be made:
1. Colored by state (a categorical variable)
2. Colored by inequality (a continuous variable)

In [118]:
from bokeh.plotting import figure
from bokeh.models import CategoricalColorMapper    # For coloring counties by state
from bokeh.palettes import brewer                  # For selecting county colors

# Dark2 provides a qualitative, colorblind friendly color palette
# 3 specifies the number of categories
palette = brewer['Dark2'][3] 

# Maps colors to states
mapper = CategoricalColorMapper(palette=palette, 
                                factors=["DC", "MD", "VA"])

# Create figure object.
p = figure(title = '', 
           height = 600,     # Bokeh 3.0 removed plot_height/plot_width in favor of height/width
           width = 950, 
           toolbar_location = 'below',
           tools = "pan, wheel_zoom, box_zoom, reset, save")

p.xgrid.grid_line_color = None
p.ygrid.grid_line_color = None

# Add patch renderer to figure.
states = p.patches('xs','ys', source = df_geo,
                   fill_color = {"field" : "state",
                                 "transform" : mapper},
                   line_color = "gray", 
                   line_width = 0.25, 
                   fill_alpha = 1
                  )

show(p)
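
Conceptually, a categorical color mapper pairs each factor with the palette color at the same position. The plain-Python sketch below imitates that lookup without Bokeh. The hex codes are the three-class ColorBrewer Dark2 colors, and the gray fallback for values outside the factor list is an assumption about the mapper's default behavior.

```python
# Three-class ColorBrewer Dark2 colors, in palette order
palette = ["#1b9e77", "#d95f02", "#7570b3"]
factors = ["DC", "MD", "VA"]

# Pair each factor with the color at the same position
color_lookup = dict(zip(factors, palette))

def color_for(state, fallback="gray"):
    """Return the color for a state, or the fallback for unknown values."""
    return color_lookup.get(state, fallback)

for state in ["DC", "MD", "VA", "WV"]:
    print(state, color_for(state))
```

The order of `factors` therefore decides which state gets which color.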

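In [76]:
from bokeh.models import LinearColorMapper, ColorBar, HoverTool
from bokeh.palettes import Viridis256

# Maps inequality values to colors; counties with no value are drawn in light gray
color_mapper = LinearColorMapper(palette = Viridis256,
                                 low = df["inequality_index"].min(),
                                 high = df["inequality_index"].max(),
                                 nan_color = "lightgray")

# Create figure object.
p = figure(title = 'Inequality Index by County, 2019', 
           height = 600,
           width = 950, 
           toolbar_location = 'below',
           tools = "pan, wheel_zoom, box_zoom, reset")
p.xgrid.grid_line_color = None
p.ygrid.grid_line_color = None

# Add patch renderer to figure.
counties = p.patches('xs','ys', source = df_geo,
                     fill_color = {"field" : "inequality_index",
                                   "transform" : color_mapper},
                     line_color = "gray", 
                     line_width = 0.25, 
                     fill_alpha = 1)

# Add a color bar as a legend for the continuous scale
p.add_layout(ColorBar(color_mapper = color_mapper), "right")

# Create hover tool
p.add_tools(HoverTool(renderers = [counties],
                      tooltips = [('State','@state'),
                                  ('Inequality index','@inequality_index')]))
show(p)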
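
Coloring by a continuous variable such as the inequality index means turning each number into a position along a palette. The plain-Python sketch below shows roughly how a linear mapping does this: scale the value between a low and high bound, then pick the matching palette bin. `linear_color` and the four-color palette are hypothetical, and Bokeh's own mapper may handle edge cases differently.

```python
def linear_color(value, palette, low, high, nan_color="lightgray"):
    """Pick a palette color by the value's linear position between low and high."""
    if value is None or value != value:      # None or NaN: no data for this county
        return nan_color
    if high == low:                          # avoid dividing by zero
        return palette[0]
    fraction = (value - low) / (high - low)  # 0.0 at low, 1.0 at high
    index = int(fraction * len(palette))     # which bin the value falls in
    index = max(0, min(index, len(palette) - 1))  # clamp values outside the range
    return palette[index]

# Hypothetical light-to-dark palette with four steps
palette = ["#ffffcc", "#a1dab4", "#41b6c4", "#225ea8"]

for value in [0.41, 0.5269, 0.6002, float("nan")]:
    print(value, linear_color(value, palette, low=0.40, high=0.60))
```

A longer palette gives smoother color steps, since each bin covers a narrower range of values.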

# 4. Additional Tools

Goal: Explore other interactive tools in Bokeh.

## A. Select

## B. Hover

## C. Range

## D. Draw

# 5. Saving & Sharing

Goal: Store and present interactive visualizations.

## A. Plots

## B. Documents

## C. Bokeh Server Apps