# Covid-19 Data Exploration

Based on the COVID-19 Open Data Demo. Data from the [Johns Hopkins Coronavirus Resource Center](https://coronavirus.jhu.edu/). Thanks to MongoDB Atlas for hosting the data. See [their article](https://developer.mongodb.com/article/johns-hopkins-university-covid-19-data-atlas) if you want to work with the data, or https://github.com/Ciemaar/covid-19-explore to work on this notebook.

In [None]:
from datetime import datetime, timedelta
import json
from itertools import chain

#import dovpanda

from data_access import *
from display import *
from analysis import *

pn.extension()

# Load settings, closing the file handle once parsing is done
with open('config.json') as config_file:
    config = json.load(config_file)
SHORT_LOOKBACK = config['short_lookback']
LONG_LOOKBACK = config['long_lookback']

# Get the last date loaded:

pn.Row(pn.widgets.StaticText(name='Based on data up to',value=last_date),
       pn.widgets.StaticText(name='Last full day',value=last_full_date))

In [None]:
# Helper functions

### Sample record

This single record illustrates the oddity in Rhode Island's data.

In [None]:
json_obj = stats.find_one({'combined_name':'Unassigned, Rhode Island, US',
                'date': datetime(2020, 4, 14)})
json_obj.pop('_id')
json_obj['date']=str(json_obj['date'])
pn.pane.JSON(json_obj, name='JSON')

## Basic Numbers for our States

In [None]:
pd.set_option("display.max_rows", 200)
tabs = pn.Tabs(height=1000)

lookback_date = last_date - timedelta(days=SHORT_LOOKBACK)

tabs.append(state_summary(get_for_country_day(stat_date = { "$gt": lookback_date}).groupby(['state','date']).sum(),
                          label='National US',lookback=SHORT_LOOKBACK))
tabs.append(state_summary(get_for_country_day(country='Portugal', 
                                              stat_date = { "$gt": lookback_date}).groupby(['combined_name','date']).sum(),
                          label='Portugal',lookback=SHORT_LOOKBACK))
for state in config['state_list']:
    df = get_for_country_day(state=state,stat_date = { "$gt": lookback_date})
    df = df.groupby(['combined_name','date']).sum()
    summary = state_summary(df,label=state,lookback=SHORT_LOOKBACK)
    summary.append(pn.widgets.StaticText(name='Trend',value=''))
    tabs.append(summary)
tabs

## Death and Diagnosis Trending

McKinley is the county that contains the city of Gallup, New Mexico, which has been in the news lately.

In [None]:
lookback_date = datetime.now() - timedelta(days=LONG_LOOKBACK)
city_list = config['city_list']
our_cities = snag_data(
          combined_name={ "$in": city_list},
          date={ "$gt": lookback_date})
our_cities.groupby('combined_name').sum()
our_cities['per_capita_deaths']    = our_cities['deaths']/our_cities['population']
our_cities['per_capita_confirmed'] = our_cities['confirmed']/our_cities['population']


In [None]:
tabs = pn.Tabs()
for column in ['deaths','per_capita_deaths','confirmed','per_capita_confirmed']:
    tabs.append(make_graph(our_cities, column))

tabs

## Daily Deltas

The total number of deaths or diagnoses is probably not the most useful figure; the number of new cases is a better measure. Ideally we'd like to know when the infected population is less than 1 in a million. Given the continued difficulty with testing, I personally have been paying more attention to daily death counts.
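
As a rough illustration of what a daily delta is, the sketch below turns a cumulative series into day-over-day changes using plain Python. The function name `daily_deltas` and the sample numbers are made up for this example; the notebook's own calculation lives in the `analysis` module and may differ (with pandas, `Series.diff()` does the same job).

```python
def daily_deltas(cumulative):
    """Return day-over-day changes for a list of cumulative counts."""
    return [today - yesterday
            for yesterday, today in zip(cumulative, cumulative[1:])]


# Made-up cumulative death counts for five consecutive days
cumulative_deaths = [10, 12, 15, 15, 21]
print(daily_deltas(cumulative_deaths))  # prints [2, 3, 0, 6]
```

The result has one fewer entry than the input because the first day has no previous day to compare against.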

Predicted lines are based on a five-day rolling average and assume recent trends will continue; this is extremely simplistic. The prediction runs in parallel with the data for a few days to give a sense of its accuracy.
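
One way such a prediction could work is sketched below. This is an assumption for illustration, not the code in the `analysis` module: `rolling_mean` and `project` are hypothetical names, and extending the last slope of the smoothed series is only one possible reading of "recent trends will continue".

```python
def rolling_mean(values, window=5):
    """Trailing mean over `window` values; one output per full window."""
    return [sum(values[i - window:i]) / window
            for i in range(window, len(values) + 1)]


def project(values, days_ahead, window=5):
    """Extend the smoothed series linearly from its latest day-to-day slope."""
    smoothed = rolling_mean(values, window)
    if len(smoothed) < 2:
        raise ValueError("need at least window + 1 observations")
    slope = smoothed[-1] - smoothed[-2]
    # Daily counts cannot be negative, so clamp the projection at zero
    return [max(0.0, smoothed[-1] + slope * step)
            for step in range(1, days_ahead + 1)]


# Made-up daily counts; the smoothed series is [3.0, 4.0, 5.0]
print(project([1, 2, 3, 4, 5, 6, 7], days_ahead=3))  # prints [6.0, 7.0, 8.0]
```

Starting the projection a few days before the end of the data, rather than at the last point, would give the overlap that lets the reader compare prediction against reality.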

In [None]:
tabs = pn.Tabs(width=800,tabs_location='left')
for column in ['deaths','per_capita_deaths','confirmed','per_capita_confirmed']:
    tabs.extend(column_summary(our_cities,column))

tabs

## Cases near our key locations

This query searches the most recent day in the collection for statistics reported at locations near our key locations. Rhode Island does not report geographic locations of deaths and shows up as zero; I have not found the same error in any other location.
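
The implementation of `near_by_data` is in `data_access` and is not shown here. As a sketch of the underlying idea only, the code below filters records by great-circle distance on the client side using the haversine formula. The names `haversine_km` and `records_within`, and the record layout with `latitude` and `longitude` keys, are assumptions for this example.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points in degrees."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = (sin((lat2 - lat1) / 2) ** 2
         + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def records_within(records, latitude, longitude, distance_km):
    """Keep records whose coordinates lie within distance_km of the point."""
    return [r for r in records
            if haversine_km(latitude, longitude,
                            r['latitude'], r['longitude']) <= distance_km]


# Hypothetical records roughly 11 km and 111 km east of the origin
sample = [{'name': 'near', 'latitude': 0.0, 'longitude': 0.1},
          {'name': 'far', 'latitude': 0.0, 'longitude': 1.0}]
print([r['name'] for r in records_within(sample, 0.0, 0.0, 25)])  # prints ['near']
```

A record with no usable coordinates, as with the Rhode Island deaths mentioned above, cannot be matched by any distance filter of this kind.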

In [None]:
# One tab per (location, search radius) pair
pn.Tabs(*chain(*[
    [
        pn.widgets.DataFrame(near_by_data(distance_km=distance_km,
                                          latitude=loc_data['latitude'],
                                          longitude=loc_data['longitude']),
                             width=600, name=f'{loc_name} {distance_km}km')
        for distance_km in (25, 100, 250, 500)
    ]
    for loc_name, loc_data in config['locations'].items()]),
    tabs_location='left',
)
