# New Cases of COVID-19 per Day per <span style="color:blue">100,000 (100k)</span> Population

The purpose of this notebook is to show the current relative risk (by US county) of becoming infected today with COVID-19.  <span style="color:red"><b>Do not conclude that no new cases (darkest green on the map) indicates absolute safety.</b>  Even though a county may be green (no new cases) the virus may still be spreading.  The map is based on confirmed cases; the virus can spread with no symptoms that would prompt testing and confirmation.  There is also a delay between exposure to the virus and the appearance of symptoms.</span>

**<p style="color:blue">Once the interactive version of the notebook appears, wait for a map to appear at the bottom of the notebook.  JavaScript needs to be enabled in your browser for the map to work.</p>**

#### *Current* New Cases per Day
Total cases of COVID-19 are a poor measure of current risk: they keep counting past infections even where the coronavirus is no longer spreading. <span style="color:red">This notebook uses an average of **new** cases over the last seven days, with days weighted equally.</span>
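
As a rough illustration of that seven-day average, the sketch below derives daily new cases from a cumulative series and averages them with equal weights. The function name and the example counts are hypothetical and are not part of the notebook's own code.

```python
def seven_day_average_new_cases(cumulative, days=7):
    """Equal-weight average of daily new cases over the last `days` days.

    `cumulative` is a list of cumulative case counts, oldest first.
    """
    if len(cumulative) < days + 1:
        raise ValueError("need at least days + 1 cumulative counts")
    # Keep the last days + 1 totals so that `days` differences can be taken.
    window = cumulative[-(days + 1):]
    daily_new = [later - earlier for earlier, later in zip(window, window[1:])]
    return sum(daily_new) / days


# Hypothetical cumulative counts for one county over eight days.
example_cumulative = [100, 103, 103, 110, 114, 121, 121, 135]
example_average = seven_day_average_new_cases(example_cumulative)
print(example_average)
```

With equal weights, the result is the same as the change in total cases over the window divided by the number of days.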

#### *Per 100k* Population
Ten new cases per day is far more significant in a population of a thousand than in a population of 100k. The per-capita numbers were calculated using 100k to be consistent with the JHU and IHME sites. Dividing new cases by the county population, expressed in units of 100k, permits each county, whether rural or urban, to be compared on an equal basis. The name of this notebook was not changed so that previous hyperlinks will continue to work.
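
The normalization can be sketched as follows, using the two populations mentioned above. The helper name `per_100k` is an assumption made for this example only.

```python
def per_100k(cases, population):
    """Scale a case count to a rate per 100,000 residents."""
    if population <= 0:
        raise ValueError("population must be positive")
    return cases * 100_000 / population


# Ten new cases per day in a county of 1,000 versus a county of 100,000.
small_county_rate = per_100k(10, 1_000)
large_county_rate = per_100k(10, 100_000)
print(small_county_rate, large_county_rate)
```

The same ten cases give a rate one hundred times higher in the smaller county, which is the difference the per-capita scale is meant to expose.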

#### "Red" Counties
<p style="color:red">The color scale of the map ranges from green (for no new cases) to red (for 25 cases per day per 100k). Counties with more than 25 cases per day per 100k are also shown as red. As the pandemic eases, there should be less red in the map.</p>
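
To make the capping behavior concrete, here is a minimal sketch of a three-stop linear color scale that clamps values outside the 0 to 25 range. It is not the implementation used by the mapping library. The hex colors are the ones passed to the color map in this notebook, while the function name and the even spacing of the three stops are assumptions for illustration.

```python
# Green, yellow, and red stops as (R, G, B) tuples.
STOP_COLORS = [(0xCC, 0xE2, 0xCC), (0xFB, 0xFB, 0xD5), (0xFA, 0xC9, 0xCC)]


def color_for_rate(rate, vmin=0.0, vmax=25.0):
    """Map a rate to a hex color, clamping to [vmin, vmax] first."""
    clamped = min(max(rate, vmin), vmax)
    # Position along the scale, from 0 (first stop) to 2 (last stop).
    position = (clamped - vmin) / (vmax - vmin) * (len(STOP_COLORS) - 1)
    lower = min(int(position), len(STOP_COLORS) - 2)
    fraction = position - lower
    start, end = STOP_COLORS[lower], STOP_COLORS[lower + 1]
    rgb = [round(a + (b - a) * fraction) for a, b in zip(start, end)]
    return "#{:02X}{:02X}{:02X}".format(*rgb)


print(color_for_rate(0.0))   # no new cases
print(color_for_rate(25.0))  # top of the scale
print(color_for_rate(40.0))  # above the cap, same color as 25
```

Because of the clamp, a county at 40 cases per day per 100k is drawn in the same red as one at 25.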

#### Troubleshooting
* If something goes wrong, first try selecting "Cell" from the menu, and then "Run All" from the submenu.
* If all else fails, shut down this Notebook ("File", then "Close and Halt" from the menu), close the tab, and then click on the original address to start a new remote server.
* This Notebook relies on a JavaScript library called Leaflet; if JavaScript is disabled in your browser, a map cannot be produced.

In [None]:
%%javascript
IPython.OutputArea.auto_scroll_threshold = 99999

In [None]:
import os
import json
from math import sqrt, pow
import numpy as np
import geopandas as gpd
import pandas as pd
import folium
import branca.colormap as cm
import us

In [None]:
# Read in COVID-19 and population data
# days for moving average
NDAYS = 7

# read in the data from JHU github
url = 'https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/csse_covid_19_data/csse_covid_19_time_series/time_series_covid19_confirmed_US.csv'
df = pd.read_csv(url,index_col=0)

# filter out all but county-level data
df = df[((df['FIPS'] > 1000) & (df['FIPS'] < 60000))]

# select useful columns and rename them, replacing state names with abbreviations
df_meta = df.iloc[:, [3, 4, 5, 7, 8]]  # all rows, selected columns
df_date = df.iloc[:,11:]               # all rows, columns 11 to the end
df = pd.merge(df_meta, df_date, on='UID')
df.rename(columns = {'Admin2':'County', 'Province_State':'State', 'Long_':'Lon'}, inplace = True)
df.State = [us.states.lookup(name).abbr for name in df.State]  # abbreviate state name

# load the population data (2018 data from US Census Bureau)
csv_file_name = 'data/combined_fips_population.csv'
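# the squeeze= argument was removed from read_csv in pandas 2.0; squeeze the single column instead
population_dict = pd.read_csv(csv_file_name, header=None, index_col=0).squeeze('columns').to_dict()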

In [None]:
# Define functions for calculated fields
# convert floating point FIPS number to a string
def convert_fips_to_string(row):
    return f"{np.int64(row['FIPS']):05d}"

# concatenated state abbreviation and county name for display in tooltip
def prepare_state_county_label(row):
    return f"{row['State']}:{row['County']}"
    
# get the county population
def get_population(row):
    fips = row['FIPS']
    if fips in population_dict:
        return population_dict[row['FIPS']]
    else:
        population_dict[fips] = 10.0  # low populations, high cases per 100,000
        return population_dict[fips]
    
# get the total cases in a county (the last column holds the latest date)
def get_total(row):
    return row.iloc[-1]

# get the total cases per 100k in a county
def get_total_per_100k(row):
    return row.iloc[-1]/(population_dict[row['FIPS']]/100000.0)

# get the latest number of new cases; usually yesterday
def get_new_cases_yesterday(row):
    return row.iloc[-1] - row.iloc[-2]

# define weighted average over the last `days` entries of a 1-D numpy array
def weighted_average(data, days):
    assert isinstance(data, np.ndarray)
    assert len(data.shape) == 1
    assert data.shape[0] >= days

    # set up equal weights that sum to 1 (np.float was removed in NumPy 1.24)
    #weights = np.arange(days + 1, dtype=np.float64)
    weights = np.ones(days, dtype=np.float64)
    weights = weights / np.sum(weights)

    # calculate the weighted average of the most recent `days` values; return
    weighted_avg = np.sum(weights * data[-days:])
    return weighted_avg

# define a weighted average number of new cases per day
def new_per_day(row):
   
    # calculate daily new cases
    cases_arr = row.iloc[-NDAYS-1:].to_numpy(dtype='float64')
    new_cases_per_day_arr = cases_arr[-NDAYS:]-cases_arr[-NDAYS-1:-1]
      
    # calculated the weighted average; return
    avg_new_cases_per_day = weighted_average(new_cases_per_day_arr, NDAYS)
    return avg_new_cases_per_day

# define a weighted average number of new cases per day per 100k
def new_per_day_per_100k(row):
    return row['NewPD']/(row['Population']/100000.0)

In [None]:
# Add and populate calculated fields
# insert new columns
df.insert(5, 'FIPSStr', '')
df.insert(6, 'STCounty', '')
df.insert(7, 'Population', 0)
df.insert(8, 'Total', 0)
df.insert(9, 'TotalP100k', 0)
df.insert(10, 'NewYesterday', 0)
df.insert(11, 'NewPD', 0)
df.insert(12, 'NewPDP100k', 0)
  
# calculate new columns
df['FIPSStr'] = df.apply(convert_fips_to_string, axis=1)
df['STCounty'] = df.apply(prepare_state_county_label, axis=1)
df['Population'] = df.apply(get_population, axis=1)
df['Total'] = df.apply(get_total, axis=1)
df['TotalP100k'] = df.apply(get_total_per_100k, axis=1)
df['NewYesterday'] = df.apply(get_new_cases_yesterday, axis=1)
df['NewPD'] = df.apply(new_per_day, axis=1)
df['NewPDP100k'] = df.apply(new_per_day_per_100k, axis=1)

LAST_DATE = df.columns[-1]

In [None]:
# Load GeoJSON data and merge with COVID-19 and population data
county_path = os.path.join('data', 'county.json')
with open(county_path) as county_file:  # close the file once it has been read
    county_geo = json.load(county_file)
counties = gpd.GeoDataFrame.from_features(county_geo, crs='EPSG:4326')
counties = counties.merge(df, how = 'inner', left_on = 'GEOID', right_on = 'FIPSStr')

In [None]:
# create a map
m = folium.Map(location=[39.758056, -96.00000], tiles='CartoDB positron', zoom_start=4, prefer_canvas=True)

# red value is approximately the 90th percentile of counties on 5/19/20
color_map = cm.LinearColormap(vmin=0.0, vmax=25.0, colors=['#CCE2CC', '#FBFBD5', '#FAC9CC'], 
                                                    caption='New Cases per Day per 100K').add_to(m)

new_series = counties.set_index('FIPSStr')['NewPDP100k']

tooltip = folium.GeoJsonTooltip(
    fields=["STCounty", "Population", "Total", "TotalP100k", "NewYesterday", "NewPD", "NewPDP100k"],
    aliases=["State:County", "Population", "Total Cases", "Total Cases/100k", 
             f"New Cases {LAST_DATE}", f"New/Day ({NDAYS}d avg)", f"New/Day/100k ({NDAYS}d avg)"],
    localize=True, sticky=True, labels=True,
    style="""
        background-color: #F0EFEF;
        border: 1px solid black;
        border-radius: 3px;
        box-shadow: 3px;
        line-height: 80%;
    """
)

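def style_function(feature):
    geoid = feature['properties']['FIPSStr']
    try:
        newPTPD = new_series[geoid]
        # 'black' is a valid CSS color name; '#black' is not
        return  {"fillColor": color_map(newPTPD), "fillOpacity":0.4, "color":'black',
                 "opacity":0.2, "weight":1}
    except KeyError:
        # no data for this county; fall back to the default style
        return {}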

g = folium.GeoJson(
    counties, style_function=style_function, tooltip=tooltip).add_to(m)
m

#### Data Sources
This Notebook downloads the latest confirmed case data from Johns Hopkins University each time it is run.  The data is updated daily by JHU (usually late at night), so the numbers of cases in this Notebook may be up to one day out of date. 

See this map for a continuously updated version of the COVID-19 data:
https://coronavirus.jhu.edu/map.html

Population data is from the US Census Bureau for 2018.

#### How This Notebook Works
When you open the original address for this notebook in a browser (by clicking on a link or pasting the address into your browser), a free service called Binder (mybinder.org) starts a remote Jupyter Notebook server and, once the server has started, connects your browser to it using a temporary address.  The server start-up process can take a minute or so.  While waiting, a snapshot of this notebook is displayed.  

Once the Jupyter Notebook loads in your browser, the "cells" of the Notebook are run; the last cell displays a map. This process could take 10-30 seconds or so.  By default, all the cells are hidden except this one and the one containing the map. To show the hidden cells, click on the map or the whitespace above the map, and then click the **^** button on the toolbar of the Notebook.  You can tell if all the cells are displayed if they are numbered sequentially (numbers are in the upper left margin of the cell).

<p style="color:blue">The remote server will shut down after 10 minutes of inactivity.</p>

The computer language used to create this Notebook is Python. All of the libraries used are free.  You can experiment by modifying the code in the cells and rerunning all the cells.

To send suggestions for this visualization, please click here:
<A HREF="mailto:covidhotspot@elegambda.com">covidhotspot@elegambda.com</A>. 