## Mapping US household income

In this demo we will explore how the Lets-Plot Geocoding API and spatial plots can be used to create choropleth maps without any additional shapefiles or GeoJSON files.

#### Data

The dataset contains mean household income (the `Mean` column) together with the name of the city, county and state for each record.

> **Note:** This dataset is incomplete: not all US counties are present in it.


#### Tasks completed in this notebook:
 - Load and clean the data.
 - Use the **state geocoder** to fetch US state boundaries.
 - Use a simple **map join** to create a choropleth map of the US states.
 - Use the **county geocoder** to fetch US county boundaries.
 - Use a complex (two-key) **map join** to create a choropleth map of the US counties.
 - Add the `geom_livemap()` layer to the choropleth map to make it interactive.
 
 
To learn more about the geocoding API in Lets-Plot, go to [lets-plot.org](https://lets-plot.org/).

In [None]:
import pandas as pd

from lets_plot import *
from lets_plot.geo_data import *

LetsPlot.setup_html()

In [None]:
import lets_plot
lets_plot.__version__

### Data

To keep the plots simple, we will remove Alaska, Hawaii and Puerto Rico from our dataset.

These three regions are located far from the [48 contiguous states](https://en.wikipedia.org/wiki/Contiguous_United_States) (aka CONUS) and would need to be shown on separate plots.

In [None]:
income_dat = pd.read_csv("https://raw.githubusercontent.com/JetBrains/lets-plot-docs/master/data/US_household_income_2017.csv", 
                         encoding='latin-1')
income_dat.head(3)

In [None]:
income_dat = income_dat[~income_dat["State_Name"].isin(["Alaska", "Hawaii", "Puerto Rico"])]

In [None]:
income_dat = income_dat[income_dat["Mean"] > 0]
mean_US = income_dat["Mean"].describe()["mean"]
mean_US

### Map of the US states

In [None]:
# Create geocoder for the 48 contiguous states.
state_gcoder = geocode_states("US-48")
state_gcoder.get_geocodes().head(3)

#### Simple blank map

In [None]:
ggplot() + geom_map(map=state_gcoder)

### Choropleth map - states

In [None]:
# Compute mean income by US state.
mean_income_state = income_dat.groupby("State_Name", as_index=False)["Mean"].mean()
mean_income_state.head(3)

In [None]:
# Define some settings to reuse in the plots below:
#
# - A gradient color palette. We borrow color codes from the ColorBrewer 'PiYG' palette:
#   https://colorbrewer2.org/#type=diverging&scheme=PiYG&n=11
#   The US mean income is used as the `midpoint` of the color scale.
map_fill_colors = scale_fill_gradient2(name="", low="#8e0152", mid="#f7f7f7", high="#276419",
                                       midpoint=mean_US,
                                       format=".2s",
                                       guide=guide_colorbar(barheight=10, barwidth=300))

# - Remove the axes and the panel grid, and place a horizontal legend at the bottom.
# - Define the plot coordinate system and size.
map_settings = (theme(axis="blank", panel_grid='blank',
                      legend_direction='horizontal', legend_position="bottom") + 
                map_fill_colors + 
                coord_map() +
                ggsize(700, 400))
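
To make the role of `midpoint` more concrete, here is a minimal, library-independent sketch of how a diverging three-color gradient can assign colors around a midpoint. It assumes plain linear interpolation in RGB, which is not necessarily what Lets-Plot does internally. The helpers `hex_to_rgb` and `diverging_color`, as well as the sample limits, are hypothetical and exist only for this illustration.

```python
def hex_to_rgb(code):
    """Convert '#rrggbb' to an (r, g, b) tuple of ints."""
    code = code.lstrip("#")
    return tuple(int(code[i:i + 2], 16) for i in (0, 2, 4))


def diverging_color(value, low_limit, midpoint, high_limit,
                    low="#8e0152", mid="#f7f7f7", high="#276419"):
    """Blend low->mid below the midpoint and mid->high above it (assumed linear)."""
    if value <= midpoint:
        start, end = hex_to_rgb(low), hex_to_rgb(mid)
        t = (value - low_limit) / (midpoint - low_limit)
    else:
        start, end = hex_to_rgb(mid), hex_to_rgb(high)
        t = (value - midpoint) / (high_limit - midpoint)
    t = min(1.0, max(0.0, t))  # clamp values that fall outside the limits
    rgb = tuple(round(s + (e - s) * t) for s, e in zip(start, end))
    return "#{:02x}{:02x}{:02x}".format(*rgb)


# Hypothetical limits and midpoint, chosen only for illustration.
for income in (40_000, 65_000, 90_000):
    print(income, diverging_color(income, 40_000, 65_000, 90_000))
```

With the midpoint set to the US mean, regions below the national mean shade toward the pink end of the palette and regions above it shade toward the green end, while regions close to the mean stay near white.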

In [None]:
# Use `geom_polygon` to create a choropleth.
# - Pass the state geocoder to the `map` parameter.
# - Specify the "State_Name" variable (from the dataset) as a single key in the `map_join` parameter.
(ggplot(mean_income_state) + 
 geom_polygon(aes(fill="Mean"), map=state_gcoder, map_join="State_Name", color="white") + 
 map_settings)

#### Adjusting geocoder resolution

At this plot size, the state boundaries look too coarse with the default resolution.

To create a better-looking choropleth, use the geocoder's `inc_res()` method.

We will also configure better-looking tooltips.

In [None]:
tooltip_state=(layer_tooltips()
          .format('Mean', '.2s')
          .title('@State_Name')
          .line('Mean income|$@Mean'))

(ggplot(mean_income_state) + 
 geom_polygon(aes(fill="Mean"), 
              map=state_gcoder.inc_res(), 
              map_join="State_Name", 
              tooltips=tooltip_state,
              color="white") + 
 map_settings)
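
The `'.2s'` pattern used in the tooltip and in the color scale asks for an SI-prefixed number rounded to two significant digits. The sketch below is a rough pure-Python approximation of that behavior, written only to show what kind of label to expect. The function `approx_si_format` is hypothetical and not part of Lets-Plot, and the library's own formatter may differ in edge cases such as trailing zeros.

```python
import math

SI_PREFIXES = {0: "", 3: "k", 6: "M", 9: "G"}


def approx_si_format(value, digits=2):
    """Round to `digits` significant digits and append an SI prefix."""
    if value == 0:
        return "0"
    magnitude = math.floor(math.log10(abs(value)))
    # Round to the requested number of significant digits first.
    rounded = round(value, digits - 1 - magnitude)
    # Pick the largest supported power of 1000 that fits the rounded value.
    exponent = min(9, max(0, 3 * (math.floor(math.log10(abs(rounded))) // 3)))
    return f"{rounded / 10 ** exponent:g}{SI_PREFIXES[exponent]}"


print(approx_si_format(66874.3))  # 67k
print(approx_si_format(1500))     # 1.5k
```

Under this reading, a mean income of about 66,874 would be shown as `$67k` in the tooltip line `'Mean income|$@Mean'`.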

### Choropleth map - counties

In [None]:
# Compute mean income by US county.
# Note: the resulting dataframe has two key variables: "County" and "State_Name".
#       Later we will use these two variables to join this dataframe with the county geocoder data.
mean_income_county = income_dat.groupby(["State_Name","County"], as_index=False)["Mean"].mean()
mean_income_county.head(3)

In [None]:
# Create a geocoder for the US counties.
# Note: in addition to county names, we also use the `states()` function here.
#       `states()` tells the geocoder to use state names as parent qualifiers for county names.
#       This is necessary because county names in the US are not unique, i.e. different states can easily have
#       counties with identical names.
county_gcoder = (geocode_counties(mean_income_county["County"])
    .states(mean_income_county["State_Name"])
    .ignore_all_errors())
county_gcoder.get_geocodes().head(3)
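
The need for a parent state can be checked on any list of (county, state) pairs. The following is a small standard-library sketch on toy data: the pairs are real US counties picked for illustration, not rows taken from the dataset, and the variable names are hypothetical.

```python
from collections import defaultdict

# Toy (county, state) pairs; these county names exist in more than one state.
counties = [
    ("Washington County", "Oregon"),
    ("Washington County", "Pennsylvania"),
    ("Jefferson County", "Alabama"),
    ("Jefferson County", "Colorado"),
    ("Multnomah County", "Oregon"),
]

# Collect the set of states in which each county name occurs.
states_by_county = defaultdict(set)
for county, state in counties:
    states_by_county[county].add(state)

# County names that cannot be resolved without a parent state.
ambiguous = {name: sorted(states)
             for name, states in states_by_county.items()
             if len(states) > 1}
print(ambiguous)
```

Any name that ends up in `ambiguous` could refer to several different regions, which is why the county names are qualified with `states()` above.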

In [None]:
# Configure tooltip.
tooltip_county=(layer_tooltips()
          .format('Mean', '.2s')
          .title('@County\n@State_Name')
          .line('Mean income|$@Mean')
          .color("black"))

In [None]:
# Again, use `geom_polygon` to create a choropleth.
# - Pass the county geocoder to the `map` parameter.
# - Specify the "County" and "State_Name" variables as a hierarchical key in the `map_join` parameter.
#   Note: the order of keys in a hierarchical key is important.
#         Lets-Plot expects the same order as in a US street address, i.e. city, state, country
#         (here: county first, then state).
(ggplot(mean_income_county) + 
 geom_polygon(aes(fill="Mean"), 
              map=county_gcoder, 
              map_join=[["County", "State_Name"]], 
              tooltips=tooltip_county, 
              color="white") + 
 map_settings)
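
Conceptually, a two-key join matches data rows to map regions on the pair (county, state) rather than on the county name alone. The sketch below illustrates the idea with plain dictionaries. It is not how Lets-Plot implements `map_join`, the income values are invented, and all variable names are hypothetical.

```python
# Toy data rows: (county, state, mean income). The income values are invented.
data_rows = [
    ("Washington County", "Oregon", 70_000),
    ("Washington County", "Pennsylvania", 60_000),
    ("Multnomah County", "Oregon", 65_000),
]

# Toy map rows, standing in for the regions returned by a geocoder.
map_rows = [
    ("Washington County", "Oregon"),
    ("Washington County", "Pennsylvania"),
    ("Clackamas County", "Oregon"),  # has no matching data row
]

# Single-key lookup: rows sharing a county name overwrite each other.
by_county = {county: mean for county, _, mean in data_rows}

# Two-key lookup: the (county, state) tuple keeps every row distinct.
by_county_state = {(county, state): mean for county, state, mean in data_rows}

# Join each map region to its data value; None marks a region without data.
joined = {key: by_county_state.get(key) for key in map_rows}
unmatched = [key for key, mean in joined.items() if mean is None]

print(len(by_county), len(by_county_state))  # 2 3
print(unmatched)
```

Regions that end up in `unmatched` have no income value to map to a fill color, which is the situation to expect for counties missing from this incomplete dataset.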

### An Interactive Map

Finally, let's add the `geom_livemap()` layer to create an interactive map that you can zoom and pan.

In [None]:
(ggplot(mean_income_county) + 
 geom_livemap() +
 geom_polygon(aes(fill="Mean"), 
              map=county_gcoder, 
              map_join=[["County", "State_Name"]], 
              tooltips=tooltip_county, color="white") + 
 map_settings + theme(legend_position=[0.5, 0.05]))