![alt text](https://global-uploads.webflow.com/5baafc2653bd67278f206724/5be267a03f7813daf821b31e_safegraph-logo-hidpi%403x-p-500.png)

## Changes In Physical Distancing and Personal Mobility in California

This notebook is a proof of concept showing how the mobility of California residents has changed in recent days, based on the [Social Distancing Metrics](https://docs.safegraph.com/docs/social-distancing-metrics) dataset from [SafeGraph](https://safegraph.com).

Tools/tech in this notebook:
* PySpark, pandas, GeoPandas (for maps)


**[Ryan Fox Squire](https://www.linkedin.com/in/ryanfoxsquire/) | Data Scientist @ [SafeGraph](https://safegraph.com/)**

ryan@safegraph.com

3/30/2020


**How to get this data**
* SafeGraph is actively donating data and resources to governments, researchers, academics and other organizations working for the public good in response to COVID-19. [Click here to get involved](https://docs.google.com/forms/d/e/1FAIpQLSc501xfAzEPADOwRmsdHmu-v8aN14jnKHBmEmdJJcTgRLddqw/viewform).

## Load libraries

In [2]:
! pip install geopandas
! pip install mapclassify
! pip install descartes

Collecting descartes
  Downloading descartes-1.1.0-py3-none-any.whl (5.8 kB)
Installing collected packages: descartes
Successfully installed descartes-1.1.0


In [2]:
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
import pyspark


#### Read in [Social Distancing Metrics](https://docs.safegraph.com/docs/social-distancing-metrics) data from SafeGraph

(contact ryan@safegraph.com for access to data)

Read in a prototype of the Social Distancing Metrics (SDM) dataset, available from SafeGraph and updated daily with a 3-day lag.

* California has 23,159 census block groups (CBGs) and 58 counties.
* A CBG is a high-resolution census area containing ~1,000 households.
* SDM has a number of interesting summary metrics at the level of each CBG.
* SDM includes information such as what fraction of people are leaving their home census block groups.
* Here we use the SDM columns `device_count` and `completely_home_device_count`.
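
As a minimal sketch of how these two columns combine, the share of devices leaving home can be computed as below. The function name `pct_leaving_home` is an illustrative choice, and the counts are taken from the first sample row of the SDM preview in this notebook (94 devices, 21 completely at home).

```python
def pct_leaving_home(device_count, completely_home_device_count):
    """Percent of observed devices that were not completely at home."""
    if device_count <= 0:
        raise ValueError("device_count must be positive")
    leaving_home = device_count - completely_home_device_count
    return leaving_home / device_count * 100

# First sample row of the SDM preview: 94 devices, 21 completely at home
example = pct_leaving_home(94, 21)
print(round(example, 2))
```

The same arithmetic is applied per county further down, after summing the counts across CBGs.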

In [11]:
from pyspark.sql import SparkSession

spark = SparkSession \
    .builder \
    .appName("Python Spark SQL basic example") \
    .config("spark.some.config.option", "some-value") \
    .getOrCreate()

In [13]:
sdm_spark_raw = spark.read.option("header", "true").csv("s3://sg-c19-response/social-distancing/v2/2020/*/*/*.csv.gz")

Py4JJavaError: An error occurred while calling o72.csv.
: java.io.IOException: No FileSystem for scheme: s3
	at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2660)
	at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2667)
	at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:94)
	at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2703)
	at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2685)
	at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:373)
	at org.apache.hadoop.fs.Path.getFileSystem(Path.java:295)
	at org.apache.spark.sql.execution.datasources.DataSource$$anonfun$org$apache$spark$sql$execution$datasources$DataSource$$checkAndGlobPathIfNecessary$1.apply(DataSource.scala:547)
	at org.apache.spark.sql.execution.datasources.DataSource$$anonfun$org$apache$spark$sql$execution$datasources$DataSource$$checkAndGlobPathIfNecessary$1.apply(DataSource.scala:545)
	at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241)
	at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241)
	at scala.collection.immutable.List.foreach(List.scala:392)
	at scala.collection.TraversableLike$class.flatMap(TraversableLike.scala:241)
	at scala.collection.immutable.List.flatMap(List.scala:355)
	at org.apache.spark.sql.execution.datasources.DataSource.org$apache$spark$sql$execution$datasources$DataSource$$checkAndGlobPathIfNecessary(DataSource.scala:545)
	at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:359)
	at org.apache.spark.sql.DataFrameReader.loadV1Source(DataFrameReader.scala:223)
	at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:211)
	at org.apache.spark.sql.DataFrameReader.csv(DataFrameReader.scala:619)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
	at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
	at py4j.Gateway.invoke(Gateway.java:282)
	at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
	at py4j.commands.CallCommand.execute(CallCommand.java:79)
	at py4j.GatewayConnection.run(GatewayConnection.java:238)
	at java.lang.Thread.run(Thread.java:748)
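
The error above says that this Hadoop build has no filesystem registered for the `s3://` scheme. One common workaround when running Spark outside a managed environment is the `s3a://` connector from the `hadoop-aws` module. Whether that applies here depends on the cluster setup, so treat it as an assumption. The helper below (`to_s3a` is a hypothetical name) only rewrites the path scheme, and the commented lines show how it might be used.

```python
def to_s3a(path):
    """Rewrite an s3:// (or s3n://) URI to the s3a:// scheme; leave other paths alone."""
    for scheme in ("s3://", "s3n://"):
        if path.startswith(scheme):
            return "s3a://" + path[len(scheme):]
    return path

sdm_path = to_s3a("s3://sg-c19-response/social-distancing/v2/2020/*/*/*.csv.gz")
print(sdm_path)

# Assumed usage, given a session started with the hadoop-aws package on the
# classpath and AWS credentials available to it:
# sdm_spark_raw = spark.read.option("header", "true").csv(sdm_path)
```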


In [9]:
display(sdm_spark_raw.limit(5))

origin_census_block_group,date_range_start,date_range_end,device_count,distance_traveled_from_home,bucketed_distance_traveled,median_dwell_at_bucketed_distance_traveled,completely_home_device_count,median_home_dwell_time,bucketed_home_dwell_time,at_home_by_each_hour,part_time_work_behavior_devices,full_time_work_behavior_devices
10150007002,2020-02-03T00:00:00-06:00,2020-02-04T00:00:00-06:00,94,8828,"{""16001-50000"":16,"">50000"":3,""<1000"":8,""2001-8000"":19,""1001-2000"":4,""8001-16000"":17}","{""16001-50000"":69,"">50000"":30,""<1000"":42,""2001-8000"":68,""1001-2000"":48,""8001-16000"":88}",21,480,"{""721-1080"":16,""361-720"":23,""61-360"":12,""<60"":24,"">1080"":14}","[49,46,46,43,44,37,38,31,24,17,19,18,16,20,18,26,31,32,29,35,41,45,46,46]",13,16
10299598001,2020-02-03T00:00:00-06:00,2020-02-04T00:00:00-06:00,172,18999,"{""16001-50000"":60,"">50000"":23,""<1000"":7,""2001-8000"":13,""1001-2000"":4,""8001-16000"":35}","{""16001-50000"":61,"">50000"":89,""<1000"":33,""2001-8000"":37,""1001-2000"":53,""8001-16000"":109}",24,715,"{""721-1080"":55,""361-720"":39,""61-360"":16,""<60"":33,"">1080"":23}","[113,113,109,111,103,95,83,60,36,34,28,31,26,30,33,53,67,77,91,101,111,115,119,120]",33,26
10299598001,2020-02-03T00:00:00-05:00,2020-02-04T00:00:00-05:00,172,18999,"{""16001-50000"":60,"">50000"":23,""<1000"":7,""2001-8000"":13,""1001-2000"":4,""8001-16000"":35}","{""16001-50000"":61,"">50000"":89,""<1000"":33,""2001-8000"":37,""1001-2000"":53,""8001-16000"":109}",24,715,"{""721-1080"":55,""361-720"":39,""61-360"":16,""<60"":33,"">1080"":23}","[113,113,109,111,103,95,83,60,36,34,28,31,26,30,33,53,67,77,91,101,111,115,119,120]",33,26
10730109006,2020-02-03T00:00:00-06:00,2020-02-04T00:00:00-06:00,39,7652,"{""16001-50000"":4,"">50000"":1,""<1000"":1,""2001-8000"":12,""1001-2000"":6,""8001-16000"":9}","{""16001-50000"":46,"">50000"":4,""<1000"":150,""2001-8000"":54,""1001-2000"":11,""8001-16000"":110}",8,750,"{""721-1080"":14,""361-720"":3,""61-360"":2,""<60"":10,"">1080"":6}","[19,23,21,22,25,19,21,19,14,13,10,8,10,12,12,10,14,13,15,14,17,20,19,19]",7,7
11250103023,2020-02-03T00:00:00-06:00,2020-02-04T00:00:00-06:00,157,10480,"{""16001-50000"":19,"">50000"":10,""<1000"":12,""2001-8000"":30,""1001-2000"":10,""8001-16000"":46}","{""16001-50000"":132,"">50000"":28,""<1000"":29,""2001-8000"":45,""1001-2000"":74,""8001-16000"":30}",18,650,"{""721-1080"":52,""361-720"":35,""61-360"":18,""<60"":36,"">1080"":16}","[98,98,94,94,93,90,83,57,26,23,19,16,19,21,26,35,46,48,71,85,89,92,101,104]",26,34
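
Several SDM columns in the preview above hold JSON-encoded strings rather than plain numbers. A minimal sketch of decoding them with the standard library, using values copied from the first preview row (the variable names are illustrative):

```python
import json

# Values copied from the first row of the preview (CSV quote-doubling removed)
bucketed_home_dwell_time = '{"721-1080":16,"361-720":23,"61-360":12,"<60":24,">1080":14}'
at_home_by_each_hour = '[49,46,46,43,44,37,38,31,24,17,19,18,16,20,18,26,31,32,29,35,41,45,46,46]'

dwell_buckets = json.loads(bucketed_home_dwell_time)   # dict: dwell-time bucket -> device count
hourly_at_home = json.loads(at_home_by_each_hour)      # list: devices at home, one entry per hour

devices_in_buckets = sum(dwell_buckets.values())
quietest_hour = hourly_at_home.index(min(hourly_at_home))  # position with the fewest devices at home
print(devices_in_buckets, len(hourly_at_home), quietest_hour)
```

In pandas, the same decoding could be applied to a whole column with `Series.apply(json.loads)`.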


In [11]:
# %md ### read in census data
# * We may want to combine SDM with census data like county population
# * Read in census data for county and census block group populations  (all census data is available in [a convenient CSV download here](https://www.safegraph.com/open-census-data)). 

In [1]:
cbg_fips_codes = spark.read.csv("s3://safegraph-perm/ryan/datasets/openCensusData/metadata/cbg_fips_codes.csv", header=True).toPandas()
cbg_fips_codes['county_fips'] = cbg_fips_codes.state_fips + cbg_fips_codes.county_fips
cbg_fips_codes.head()

NameError: name 'spark' is not defined

In [13]:
cbg_population = spark.read.csv("s3://safegraph-perm/ryan/datasets/openCensusData/data/cbg_b01.csv", header=True).toPandas()
cbg_population['county_fips'] = cbg_population['census_block_group'].str.slice(start=0, stop=5)
cbg_population['population'] = cbg_population['B01001e1'].astype('int')
columns = ['census_block_group', 'county_fips', 'population']
cbg_population = cbg_population[columns].copy()
cbg_population.head()

Unnamed: 0,census_block_group,county_fips,population
0,10010201001,1001,745
1,10010201002,1001,1265
2,10010202001,1001,960
3,10010202002,1001,1236
4,10010203001,1001,2364


In [14]:
sdm_spark = sdm_spark_raw.select("origin_census_block_group", "date_range_start", "date_range_end", 
                                 "device_count", "completely_home_device_count", "part_time_work_behavior_devices", 
                                 "full_time_work_behavior_devices")

sdm_df = sdm_spark.toPandas()

# convert numerical columns
int_columns = ['device_count', 'completely_home_device_count']
for int_col in int_columns:
  sdm_df[int_col] = sdm_df[int_col].astype('int')

#datetime columns
sdm_df['date_start'] = sdm_df.date_range_start.str.slice(start=0, stop=10)
sdm_df['dt'] = pd.to_datetime(sdm_df['date_start'])
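sdm_df['week'] = sdm_df['dt'].dt.isocalendar().week.astype(int)  # ISO week number; the older Series.dt.week accessor is deprecated in recent pandas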

# join county_fips for county names and states
sdm_df['county_fips'] = sdm_df.origin_census_block_group.str.slice(start=0, stop=5) # county is the first 5 digits of the CBG
sdm_df = sdm_df.merge(cbg_fips_codes, on='county_fips', how='left')
sdm_df.head(3)

Unnamed: 0,origin_census_block_group,date_range_start,date_range_end,device_count,completely_home_device_count,part_time_work_behavior_devices,full_time_work_behavior_devices,date_start,dt,week,county_fips,state,state_fips,county,class_code
0,10150007002,2020-02-03T00:00:00-06:00,2020-02-04T00:00:00-06:00,94,21,13,16,2020-02-03,2020-02-03,6,1015,AL,1,Calhoun County,H1
1,10299598001,2020-02-03T00:00:00-06:00,2020-02-04T00:00:00-06:00,172,24,33,26,2020-02-03,2020-02-03,6,1029,AL,1,Cleburne County,H1
2,10299598001,2020-02-03T00:00:00-05:00,2020-02-04T00:00:00-05:00,172,24,33,26,2020-02-03,2020-02-03,6,1029,AL,1,Cleburne County,H1
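
The CBG codes in the preview above appear with 11 digits, which suggests a leading zero was dropped in display (Alabama's state FIPS code is `01`). Slicing the first five characters only yields the county FIPS when the code has its full 12 digits. A defensive sketch (the function name is illustrative):

```python
def cbg_to_county_fips(cbg):
    """Return the 5-digit county FIPS (2-digit state + 3-digit county) for a CBG code."""
    cbg12 = str(cbg).zfill(12)  # a full CBG code has 12 digits; restore any dropped leading zero
    return cbg12[:5]

print(cbg_to_county_fips("10150007002"))    # Calhoun County, AL in the preview above
print(cbg_to_county_fips("060014001001"))   # a made-up CBG code in Alameda County, CA

# pandas equivalent for a string column:
# sdm_df['county_fips'] = sdm_df.origin_census_block_group.str.zfill(12).str.slice(0, 5)
```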


### Aggregate by County and Compute Metrics from SDM Columns

In [16]:
ca_df = sdm_df[sdm_df.state=='CA'].copy()
sdm_columns = ['device_count', 'completely_home_device_count']  # 'part_time_work_behavior_devices', 'full_time_work_behavior_devices'
geo_groupby= 'county_fips'
ca_by_county = ca_df.groupby([geo_groupby, 'week', 'date_start'])[sdm_columns].sum().sort_values(by=[geo_groupby, 'week', 'date_start'], ascending=True).reset_index()

# compute new metrics
ca_by_county['leaving_home'] = ca_by_county['device_count'] - ca_by_county['completely_home_device_count']
ca_by_county['pct_leaving_home'] = ca_by_county['leaving_home'] / ca_by_county['device_count'] * 100
  
ca_by_county.head()

Unnamed: 0,county_fips,week,date_start,device_count,completely_home_device_count,leaving_home,pct_leaving_home
0,6001,5,2020-02-01,61401,16141,45260,73.712155
1,6001,5,2020-02-02,61653,19925,41728,67.682027
2,6001,6,2020-02-03,63435,13872,49563,78.131946
3,6001,6,2020-02-04,60596,13589,47007,77.574427
4,6001,6,2020-02-05,59947,12902,47045,78.477655
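
Note that the cell above sums device counts within each county before dividing, so each CBG is weighted by its number of devices. A small sketch with made-up numbers shows how this differs from averaging per-CBG percentages:

```python
import pandas as pd

# Two made-up CBGs in one hypothetical county
toy = pd.DataFrame({
    'county_fips': ['06001', '06001'],
    'device_count': [100, 10],
    'completely_home_device_count': [20, 8],
})

# Pooled estimate: sum counts first, then divide (the approach used in this notebook)
sums = toy.groupby('county_fips')[['device_count', 'completely_home_device_count']].sum()
pooled_pct = ((sums['device_count'] - sums['completely_home_device_count'])
              / sums['device_count'] * 100).iloc[0]

# Unweighted alternative: average the per-CBG percentages
per_cbg_pct = (toy['device_count'] - toy['completely_home_device_count']) / toy['device_count'] * 100
unweighted_pct = per_cbg_pct.mean()

print(round(pooled_pct, 2), round(unweighted_pct, 2))
```

The pooled estimate is dominated by the larger CBG, which is usually what we want when the goal is a county-level rate.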


# Results

1. Visualize % Residents Leaving Home Day by Day (raw data)
2. Visualize % Residents Leaving Home Day by Day (smoothed, compared to February Baseline)
3. Visualize 1 and 2 on Maps

In [18]:
def make_plot(df, metric, series, x_axis='date_start', ylim=None, legend=False, ylabel=None):
  """Plot `metric` over `x_axis`, drawing one line per value of `series`."""
  plt.rcParams['figure.figsize'] = [8, 5]
  # pivot long -> wide: one row per x value, one column per series
  df2plot = df.pivot(index=x_axis, columns=series, values=metric).reset_index()
  f = plt.figure()
  df2plot.plot(x=x_axis, ylim=ylim, legend=legend, ax=f.gca())
  plt.ylabel(ylabel if ylabel else metric)  # default the y-label to the metric name
  return f

In [19]:
df = ca_by_county.copy()
df2plot = df.pivot(index='date_start', columns='county_fips', values='pct_leaving_home').reset_index()
df2plot.head()

county_fips,date_start,06001,06003,06005,06007,06009,06011,06013,06015,06017,06019,06021,06023,06025,06027,06029,06031,06033,06035,06037,06039,06041,06043,06045,06047,06049,06051,06053,06055,06057,06059,06061,06063,06065,06067,06069,06071,06073,06075,06077,06079,06081,06083,06085,06087,06089,06091,06093,06095,06097,06099,06101,06103,06105,06107,06109,06111,06113,06115
0,2020-02-01,73.712155,82.539683,71.772429,72.709671,71.199654,71.482412,74.809322,66.052842,75.467633,72.538138,73.381877,71.106758,71.027431,71.794872,71.324568,70.916454,68.212181,70.334572,74.18728,71.928166,76.760739,69.460227,68.353994,72.790737,67.875648,75.163399,72.565338,75.331192,75.631399,78.317468,77.36258,70.595691,74.353562,73.18406,76.910244,74.045665,76.181433,73.349459,73.213366,77.84752,77.628521,75.42029,74.184905,75.149303,72.805206,61.261261,64.321839,74.475607,75.0,73.46492,73.558689,72.666859,69.098712,71.389896,73.299958,76.809878,75.754201,70.546487
1,2020-02-02,67.682027,57.377049,62.996778,67.659012,64.132809,64.888337,68.472276,64.948454,67.552624,67.261355,66.827309,63.6,65.052398,64.411028,66.763047,66.750967,59.812383,64.117647,70.260953,65.382176,73.444613,64.705882,66.184074,67.917973,60.779221,72.873194,67.353927,69.109731,66.824752,72.768235,70.491236,59.771574,68.226703,67.09695,68.277241,67.927922,69.640332,68.811821,66.939836,71.439405,70.992551,68.21162,68.572756,68.68071,65.857202,55.555556,59.332113,67.794111,68.427574,66.760386,66.991701,66.197183,59.459459,66.849723,64.344942,69.863882,69.137303,64.744646
2,2020-02-03,78.131946,87.692308,76.89008,78.032522,74.989456,75.980392,78.814924,77.406523,78.911043,76.835811,78.988942,75.969597,74.512252,77.311961,75.204709,74.787234,72.254335,75.663717,76.867149,75.298554,79.142469,76.19699,73.019126,75.433867,70.437018,81.337481,75.290719,77.964727,78.038033,80.35939,80.262084,69.191919,76.869981,76.024258,79.54023,75.852202,78.456211,75.044767,75.725959,79.921438,79.28475,77.104269,78.104249,77.295584,75.984211,65.6,69.265846,77.048499,78.191959,76.246883,76.919918,77.709968,73.127753,74.166538,74.605154,78.689993,78.691626,76.234639
3,2020-02-04,77.574427,77.358491,75.258918,77.073595,75.641624,75.753604,78.779709,72.894737,79.466667,76.257172,76.709957,74.69981,72.70746,76.769025,74.51238,74.308364,70.431211,74.569319,76.925338,75.037594,77.986595,70.78125,71.853211,74.703277,64.20765,75.822368,74.226324,76.826632,77.848723,80.358795,81.57847,70.936639,77.230326,76.204223,80.682226,76.241727,79.207991,76.767436,76.662417,79.859178,80.273334,78.26087,78.830793,78.39377,75.673899,68.181818,68.631271,78.055242,78.59602,76.868878,76.213275,78.16128,75.46729,74.916532,75.265487,79.357494,79.62924,75.57764
4,2020-02-05,78.477655,76.923077,75.432526,78.78022,74.474886,77.747253,79.608916,73.153779,80.672701,77.215836,76.404494,75.817133,73.491124,79.448276,75.773948,75.516848,71.937262,77.593032,77.777807,76.125672,80.282242,76.443769,74.171939,76.020363,71.720117,76.833333,76.078215,78.531196,79.533484,81.565264,82.347881,73.18117,78.002324,76.768706,81.025825,77.010175,79.442166,77.429121,76.229508,81.068041,80.723531,78.467527,78.760529,77.914182,76.284672,71.84466,68.941642,78.543948,79.152692,77.023722,78.479894,76.457749,68.779343,75.192788,76.858777,79.858394,79.864061,75.229124
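
The wide table above comes from `DataFrame.pivot`, which turns the long county-by-day table into one column per county. A toy sketch with two counties and two days (values rounded from the output above):

```python
import pandas as pd

long_df = pd.DataFrame({
    'county_fips': ['06001', '06001', '06003', '06003'],
    'date_start': ['2020-02-01', '2020-02-02', '2020-02-01', '2020-02-02'],
    'pct_leaving_home': [73.71, 67.68, 82.54, 57.38],
})

# One row per date, one column per county
wide_df = long_df.pivot(index='date_start', columns='county_fips', values='pct_leaving_home')
print(wide_df)
```

`pivot` raises an error if any (date, county) pair appears more than once, which doubles as a check that the aggregation produced exactly one row per county per day.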


In [20]:
metric = 'pct_leaving_home'
make_plot(ca_by_county, metric, 'county_fips', ylim=(0,100), ylabel='% Residents Leaving Home')
display(plt.show())

In [21]:
geo_groupby = 'county_fips'
ca_select = ca_by_county[[geo_groupby, 'week', 'date_start', 'device_count', 'leaving_home', 'pct_leaving_home']]

metric = 'pct_leaving_home'
ca_select = ca_select.sort_values(by=[geo_groupby,'date_start'])

window_size = 7
# trailing 7-day rolling mean within each county (the first window_size-1 days per county are NaN)
ca_select[metric] = ca_select.groupby(geo_groupby)[metric].transform(lambda s: s.rolling(window=window_size, center=False).mean())
baseline_week = 7 # 7 == week of Feb 10th
# baseline = each county's mean smoothed value during the baseline week
baseline_avg = ca_select[ca_select.week==baseline_week].groupby([geo_groupby])[metric].mean().to_frame(name='baseline').reset_index()

ca_select = ca_select.merge(baseline_avg, on=geo_groupby)
ca_select[metric+'_baselined'] = (ca_select[metric] - ca_select['baseline'])
make_plot(ca_select[ca_select.date_start>'2020-02-07'], metric+'_baselined', geo_groupby, x_axis='date_start', ylim=(-30,30), ylabel='% Residents Leaving Home \n(Change Since Wk of Feb 10th)')
display(plt.show())
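
To make the smoothing and baselining concrete, here is a toy sketch for a single hypothetical county. The values are made up, and the baseline week is selected by date rather than by week number to keep the example short.

```python
import pandas as pd

# One hypothetical county: 80% leaving home for two weeks, then 60% for a week
toy = pd.DataFrame({
    'date': pd.date_range('2020-02-03', periods=21, freq='D'),
    'pct_leaving_home': [80.0] * 14 + [60.0] * 7,
})

# Trailing 7-day mean: each value averages that day and the 6 days before it
toy['smoothed'] = toy['pct_leaving_home'].rolling(window=7, center=False).mean()

# Baseline = mean smoothed value over the week of Feb 10th (Feb 10-16)
in_baseline_week = (toy['date'] >= pd.Timestamp('2020-02-10')) & (toy['date'] <= pd.Timestamp('2020-02-16'))
baseline = toy.loc[in_baseline_week, 'smoothed'].mean()

# Negative values mean fewer residents leaving home than during the baseline week
toy['baselined'] = toy['smoothed'] - baseline
print(toy[['date', 'smoothed', 'baselined']].tail(3))
```

Because the window trails, the first six days have no smoothed value, and a change in behavior takes a full week to show up completely in the smoothed series.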

## % Residents Leaving Home On a Map

In [23]:
import geopandas as gpd

In [24]:
# County geometry (boundary) data is available from the government and hosted many places on the internet
counties_raw = gpd.read_file('https://raw.githubusercontent.com/plotly/datasets/master/geojson-counties-fips.json')
counties_raw.head()

Unnamed: 0,id,GEO_ID,STATE,COUNTY,NAME,LSAD,CENSUSAREA,geometry
0,1001,0500000US01001,1,1,Autauga,County,594.436,"POLYGON ((-86.49677 32.34444, -86.71790 32.402..."
1,1009,0500000US01009,1,9,Blount,County,644.776,"POLYGON ((-86.57780 33.76532, -86.75914 33.840..."
2,1017,0500000US01017,1,17,Chambers,County,596.531,"POLYGON ((-85.18413 32.87053, -85.12342 32.772..."
3,1021,0500000US01021,1,21,Chilton,County,692.854,"POLYGON ((-86.51734 33.02057, -86.51596 32.929..."
4,1033,0500000US01033,1,33,Colbert,County,592.619,"POLYGON ((-88.13999 34.58170, -88.13925 34.587..."


Join county geometries to the dataframe.

We also pick a specific window to visualize on the map (the 7-day average ending Mar 24th, matching `date2plot` in the next cell).

In [26]:
date2plot = '2020-03-24' # This will be the end-date of a 7 day average
data2plot = ca_select[ca_select.date_start == date2plot].copy()
map_df = counties_raw.merge(data2plot, left_on='id', right_on='county_fips')
map_df.head(2)

Unnamed: 0,id,GEO_ID,STATE,COUNTY,NAME,LSAD,CENSUSAREA,geometry,county_fips,week,date_start,device_count,leaving_home,pct_leaving_home,baseline,pct_leaving_home_baselined
0,6005,0500000US06005,6,5,Amador,County,594.583,"POLYGON ((-120.99550 38.22541, -121.02708 38.3...",6005,13,2020-03-24,1547,960,62.623171,74.94573,-12.322558
1,6021,0500000US06021,6,21,Glenn,County,1313.947,"POLYGON ((-122.93765 39.79816, -122.04647 39.7...",6021,13,2020-03-24,1069,674,65.705567,76.380681,-10.675114


## Make Maps

* We make 3 maps about % Residents Leaving Home 
  * 1) Baseline in February
  * 2) Late March, 7-day average ending Mar 24th (many fewer residents are leaving home)
  * 3) Change between 1 and 2
  
Important caveat: small counties have smaller sample sizes and therefore produce higher-variance, less reliable estimates. Expect the smallest counties to have the most extreme values. One way to reduce the variance of our estimates would be hierarchical modeling (partial pooling), but this is not implemented here.
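
As a rough illustration of this caveat, the sketch below computes the binomial standard error of a proportion for two sample sizes. It assumes devices behave like independent draws, which real panels likely violate, so the numbers only indicate relative scale. The function name is illustrative.

```python
import math

def binomial_se_pct(p, n):
    """Standard error (in percentage points) of a proportion p estimated from n devices."""
    return math.sqrt(p * (1 - p) / n) * 100

# Same underlying rate (70% leaving home), very different sample sizes.
# 1547 and 61401 are device counts that appear in outputs earlier in this notebook.
for n in (1547, 61401):
    print(n, round(binomial_se_pct(0.70, n), 2))
```

The standard error shrinks with the square root of the device count, so a county with 40 times fewer devices has an estimate roughly six times noisier.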

In [28]:
def plot_chloropleth_map(gdf, column2plot, cmap, scheme='Quantiles', scheme_kwds=None, legend_title=None):
    # `gdf` is a GeoDataFrame (named gdf so it does not shadow the geopandas alias `gpd`)
    # for scheme options see: https://github.com/pysal/mapclassify
    plt.rcParams['figure.figsize'] = [8, 8]
    f = plt.figure()
    gdf.plot(column=column2plot,
             legend=True, legend_kwds={'loc': 'upper right', 'title': legend_title},
             cmap=cmap,
             scheme=scheme, classification_kwds=scheme_kwds,
             edgecolor='black',
             ax=f.gca())
    plt.show()
    return True

# These functions work around some bugginess in the geopandas plot colormap scheme functions so that we can use the same colormap across maps
def get_hack_polygon(name):
  # this is a kluge to force the floor or ceiling on the colormap during visualization
  hack_df = pd.DataFrame({'NAME' : [name], 'geometry': pd.Series(['POLYGON((-120.30831 33.99925,-120.27810 33.98331,-120.29732 33.96737, -120.34951 33.97420,-120.30831 33.99925))'])})   # This is a made-up polygon next to catalina island
  hack_df['geometry'] = hack_df['geometry'].apply(wkt.loads)
  hack_gpd = gpd.GeoDataFrame(hack_df, geometry=hack_df['geometry'])
  return(hack_gpd)

from shapely import wkt
def add_kluge_polygon_with_fixed_value(df_, metric, forced_value, name='hack_polygon'):
  df = pd.concat([df_, get_hack_polygon(name=name)], sort=True).reset_index()
  df.loc[df.NAME==name, metric] = forced_value
  return(df)

def set_floor_and_ceiling_colormap(df_, metric, bins):
  df = df_.copy()
  df = add_kluge_polygon_with_fixed_value(df, metric, bins[0], name='bottom')  # forces the colormap to extend down to the lowest bin edge
  df = add_kluge_polygon_with_fixed_value(df, metric, bins[-1], name='top')  # forces the colormap to extend up to the highest bin edge
  return(df)
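
To see why the floor and ceiling polygons help, recall that an equal-interval scheme splits the range between the smallest and largest data value into `k` equal-width classes, so the class edges shift whenever the data range shifts. The sketch below mimics that with `numpy.linspace`. It illustrates the idea rather than reproducing mapclassify's actual implementation, and the county values are made up.

```python
import numpy as np

def equal_interval_edges(values, k):
    """k equal-width classes spanning min(values)..max(values); returns the k+1 edges."""
    return np.linspace(np.min(values), np.max(values), k + 1)

my_bins = [50, 55, 60, 65, 70, 75, 80, 85]
county_values = np.array([62.6, 65.7, 71.3])   # made-up county percentages

# Without sentinels, the edges depend on whatever range the data happens to cover
free_edges = equal_interval_edges(county_values, k=len(my_bins) - 1)

# Appending the lowest and highest bin edge as sentinel values pins the range
pinned_values = np.append(county_values, [my_bins[0], my_bins[-1]])
pinned_edges = equal_interval_edges(pinned_values, k=len(my_bins) - 1)

print(free_edges.round(2))
print(pinned_edges)
```

The trick only works while every real county value lies between the first and last bin edge; a value outside that range would stretch the classes again.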


In [29]:
my_bins = [50, 55, 60, 65, 70, 75, 80, 85]
this_color_map = plt.get_cmap('Greens')
metric='baseline'
gdp_plot = set_floor_and_ceiling_colormap(map_df, metric, my_bins)
plot_chloropleth_map(gdp_plot, 
                     metric, 
                     this_color_map, 
                     scheme='EqualInterval',
                     scheme_kwds={'k':len(my_bins)-1}, 
                     legend_title='% Residents Leaving \nHome Daily (baseline)')
display(plt.show())

In [30]:
my_bins = [50.0, 55.0, 60.0, 65.0, 70.0, 75.0, 80.0, 85.0]
this_color_map = plt.get_cmap('Greens')
metric='pct_leaving_home'
gdp_plot = set_floor_and_ceiling_colormap(map_df, metric, my_bins)
plot_chloropleth_map(gdp_plot, 
                     metric, 
                     this_color_map, 
                     scheme='EqualInterval',
                     scheme_kwds={'k':len(my_bins)-1}, 
                     legend_title='% Residents Leaving \nHome Daily (Wk ending {0})'.format(date2plot))
display(plt.show())

In [31]:
inverse = True
my_bins = [-30, -25, -20, -15, -10, -5, 0]
this_color_map = plt.get_cmap('Reds')

metric = 'pct_leaving_home_baselined'
if(inverse):
  gdp_plot = set_floor_and_ceiling_colormap(map_df, metric, my_bins)
  my_bins = np.flip(np.array(my_bins)*-1).tolist()
  gdp_plot[metric+'_inv'] = gdp_plot[metric]*-1
  
plot_chloropleth_map(gdp_plot, 
                     metric+'_inv', 
                     this_color_map, 
                     scheme='EqualInterval',
                     scheme_kwds={'k':len(my_bins)-1}, 
                     legend_title='% Decrease in Residents Leaving \nHome (since Wk of Feb 10th)')
display(plt.show())

## TO DO:

* CA counties have drastically different sample sizes. Implement a hierarchical model (partial pooling across counties) to shrink the extreme rate estimates that come from small sample sizes.
* Using Weekly Patterns, build a similar view of the change in visits to points of interest (POIs). POI visits should be correlated with residents leaving home, but do they provide additional useful insights?
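
As a starting point for the first item, the sketch below shows a crude shrinkage estimator: each county's rate is pulled toward the pooled rate by adding pseudo-counts. This is a stand-in for partial pooling, not a fitted hierarchical model. `shrink_rates` and `prior_strength` are hypothetical names, the counts are made up, and in a real model the amount of shrinkage would be estimated from the data.

```python
def shrink_rates(leaving, devices, prior_strength):
    """Pull each county's rate toward the pooled rate; small counties move the most.

    prior_strength is a hypothetical tuning knob: the number of pseudo-devices
    of pooled-rate evidence added to every county.
    """
    pooled_rate = sum(leaving) / sum(devices)
    return [
        (x + prior_strength * pooled_rate) / (n + prior_strength)
        for x, n in zip(leaving, devices)
    ]

# Made-up counts: a small county with an extreme raw rate and a large county
leaving = [90, 7000]
devices = [100, 10000]
raw = [x / n for x, n in zip(leaving, devices)]
shrunk = shrink_rates(leaving, devices, prior_strength=200)
print([round(r, 3) for r in raw], [round(s, 3) for s in shrunk])
```

The small county's estimate moves substantially toward the pooled rate while the large county's estimate barely changes, which is the qualitative behavior a hierarchical model would also produce.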

## Contact SafeGraph

* SafeGraph is actively donating data and resources to governments, researchers, academics and other organizations working for the public good in response to COVID-19. [Click here to get involved](https://docs.google.com/forms/d/e/1FAIpQLSc501xfAzEPADOwRmsdHmu-v8aN14jnKHBmEmdJJcTgRLddqw/viewform).


* datastories@safegraph.com
* ryan@safegraph.com