# Playing with Coronavirus Timeseries

- https://www.kaggle.com/sudalairajkumar/novel-corona-virus-2019-dataset


## Notes

- This notebook uses two classes (both based on a `BaseDataset` class) to load data from a Kaggle dataset (Novel Corona Virus 2019) and from the Covid Tracking Project.

## To Do:

- [x] Add data from Covid Tracking Project's API
    - https://covidtracking.com/api
    
- [x] Move app styling to a css file in a new `assets/` folder

- Functions and classes are in functions.py

### RESOURCES FOR FUTURE
- RAFAEL STUDY GROUP FOR MAKING A MAP
    - https://www.youtube.com/watch?v=MAhK7NHXEOg&feature=emb_logo
    - https://github.com/erdosn/additional-topic-plotly

In [1]:
import plotly.express as px
import plotly.graph_objects as go
import plotly.io as pio
pio.templates.default = "plotly_dark"

import cufflinks as cf
cf.go_offline()
cf.set_config_file(sharing='public',theme='solar',offline=True)

In [2]:
import os,glob,sys
import re

!pip install -U fsds
from fsds.imports import *

fsds v0.2.21 loaded.  Read the docs: https://fs-ds.readthedocs.io/en/latest/ 


Handle,Package,Description
dp,IPython.display,Display modules with helpful display and clearing commands.
fs,fsds,Custom data science bootcamp student package
mpl,matplotlib,Matplotlib's base OOP module with formatting artists
plt,matplotlib.pyplot,Matplotlib's matlab-like plotting module
np,numpy,scientific computing with Python
pd,pandas,High performance data structures and tools
sns,seaborn,High-level data visualization library based on matplotlib


[i] Pandas .iplot() method activated.


In [3]:
import functions as fn

%load_ext autoreload
%autoreload 2

In [4]:
# help(fn)

# Main Kaggle Dataset - Get US States

# 📦class `CoronaData`

In [5]:
from functions import BaselineData
from functions import CoronaData


In [6]:
corona = CoronaData(verbose=True,run_workflow=True)

[i] DOWNLOADING DATA USING KAGGLE API
	https://www.kaggle.com/sudalairajkumar/novel-corona-virus-2019-dataset
	- Downloaded dataset .zip and extracted to:"New Data/"
	- Extraction Complete.


Unnamed: 0,Date,Province/State,Country/Region,Confirmed,Deaths,Recovered
0,2020-01-22,Anhui,Mainland China,1.0,0.0,0.0
1,2020-01-22,Beijing,Mainland China,14.0,0.0,0.0
2,2020-01-22,Chongqing,Mainland China,6.0,0.0,0.0
3,2020-01-22,Fujian,Mainland China,1.0,0.0,0.0
4,2020-01-22,Gansu,Mainland China,0.0,0.0,0.0


[i] There are 223 countries in the datatset
[i] Dates Covered:
	From 01-22-2020 to 07-08-2020


In [7]:
df_world = corona.df.copy()
countries = list(df_world.groupby('Country/Region').groups.keys())
len(countries)

223

## 07/02 - FUNCTIONS - Making these methods into standalone functions

### def `set_datetime_index`, `set_freq_resample`, `get_group_ts`

In [8]:
def set_datetime_index(df_,col='Date',drop=True):
    """Returns df copy with specified column as datetime index"""
    import pandas as pd

    ## Copy to avoid edits to orig
    df = df_.copy()

    ## Convert to datetime
    df[col] = pd.to_datetime(df[col])

    ## Set as index; drop=True removes the column, drop=False keeps it
    df = df.set_index(col,drop=drop)

    return df
    
    
    
def set_freq_resample(df,date_col='Date',freq='D', agg_func='sum'):
    """Resamples the dataframe with Freq and agg_func. If index is not
    a datetime axis, will call set_datetime_index. 
    Helper function for get_group_ts """
    ## Make index datetime if it is not already. 
    if not isinstance(df.index,pd.DatetimeIndex):
        df = set_datetime_index(df,col=date_col)
        
    ts  = df.resample(freq).agg(agg_func).copy()
    return ts
    
    
    
def get_group_ts(df,group_name,group_col='state',
                     ts_col=None, freq='D', agg_func='sum'):
        """Takes df and extracts one group's rows, then resamples them
        with the freq/aggregation provided"""
        from IPython.display import display
        try:
            ## Get state_df group
            group_df = df.groupby(group_col).get_group(group_name).copy()
            
        except Exception:
            print("[!] ERROR!")
#             display(df.head())
            return None
        
        ## Resample and aggregate state data
        group_df = set_freq_resample(group_df,freq=freq,agg_func=agg_func)
#         group_df = group_df.resample(freq).agg(agg_func)


        ## Get and Rename Sum Cols 
        orig_cols = group_df.columns

        ## Create Renamed Sum columns
        for col in orig_cols:
            # Group - Column 
            group_df[f"{group_name} - {col}"] = group_df[col]

        ## Drop original cols
        group_df.drop(orig_cols,axis=1,inplace=True)

        ## Return only columns containing ts_col
        if ts_col is not None:
            ts_cols_selected = [col for col in group_df.columns if ts_col in col]
            group_df = group_df[ts_cols_selected]

        return group_df 
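
To see what these helpers produce, here is a minimal, self-contained sketch on made-up data. It mirrors the group, resample, and rename steps above in compact form rather than calling the notebook's functions. The `toy` frame, its values, and the name `group_ts_sketch` are illustrative assumptions, and the sketch keeps numeric columns only.

```python
import pandas as pd

# Toy data (invented for illustration): US has two rows on the same day
toy = pd.DataFrame({
    "Date": ["2020-01-22", "2020-01-22", "2020-01-23", "2020-01-22"],
    "Country/Region": ["US", "US", "US", "Italy"],
    "Confirmed": [1.0, 2.0, 5.0, 3.0],
    "Deaths": [0.0, 0.0, 1.0, 0.0],
})


def group_ts_sketch(df, group_name, group_col, freq="D", agg_func="sum"):
    """Hypothetical compact version of get_group_ts for numeric columns only."""
    # Keep only the rows that belong to the requested group
    group_df = df[df[group_col] == group_name].copy()
    group_df["Date"] = pd.to_datetime(group_df["Date"])

    # Datetime index + numeric columns, then resample and aggregate
    numeric = group_df.set_index("Date").select_dtypes("number")
    ts = numeric.resample(freq).agg(agg_func)

    # Prefix every column with the group name, e.g. "US - Confirmed"
    return ts.add_prefix(f"{group_name} - ")


us_ts = group_ts_sketch(toy, "US", "Country/Region")
print(us_ts)
```

The two US rows for 2020-01-22 are summed into a single daily row, and each column carries the group name so several groups can later be concatenated side by side.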

## Making World Version of Corona Dash

In [9]:
## Get WORLD dictionary with all countries
grouping_col = 'Country/Region'
countries = list(df_world.groupby(grouping_col).groups.keys())

WORLD = {}
for country in countries:
#     print(country)
    WORLD[country] = get_group_ts(df_world,country, grouping_col)
    

In [10]:
## Unused range slider
#         pfig.update_layout(
#             xaxis=dict(
#                 rangeselector=dict(
#                     buttons=list([
#                         dict(count=7,
#                              label="1week",
#                              step="day",
#                              stepmode="backward"),
#                         dict(count=14,
#                              label="2weeks",
#                              step="day",
#                              stepmode="backward"),
#                         dict(count=1,
#                              label="1m",
#                              step="month",
#                              stepmode="backward"),
#                         dict(count=6,
#                              label="6m",
#                              step="month",
#                              stepmode="backward"),

#                         dict(step="all")
#                     ])
#                 ),
#                 rangeslider=dict(
#                     visible=True
#                 ),
#                 type="date"
#             )
#         )
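
The commented-out layout above can be packaged as a small helper that only builds the keyword arguments, so it can be inspected without a figure. This is a sketch, not part of `functions.py`: the name `range_selector_layout` and the `show_slider` parameter are assumptions, while the button settings are copied from the commented code.

```python
def range_selector_layout(show_slider=True):
    """Return layout kwargs for a date x-axis with 1w/2w/1m/6m/all buttons."""
    buttons = [
        dict(count=7, label="1week", step="day", stepmode="backward"),
        dict(count=14, label="2weeks", step="day", stepmode="backward"),
        dict(count=1, label="1m", step="month", stepmode="backward"),
        dict(count=6, label="6m", step="month", stepmode="backward"),
        dict(step="all"),
    ]
    return dict(
        xaxis=dict(
            rangeselector=dict(buttons=buttons),
            rangeslider=dict(visible=show_slider),
            type="date",
        )
    )


layout_kwargs = range_selector_layout()

# With a Plotly figure `pfig` in scope, this would be applied as:
#     pfig.update_layout(**layout_kwargs)
labels = [b.get("label", "all") for b in layout_kwargs["xaxis"]["rangeselector"]["buttons"]]
print(labels)
```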
        

### def `plot_group_ts`

In [11]:
def plot_group_ts(df, group_list,group_col, plot_cols = ['Confirmed'],
                  df_only=False,
                new_only=False,plot_scatter=True,show=False,
                 width=1000,height=700):
    """Plots all columns ending with the names in plot_cols for every group in group_list.
    Returns plotly figure
    New as of 06/21"""
    import pandas as pd 
    import numpy as np
    
    ## Get group dataframes
    concat_dfs = []  
    GROUPS = {}
    
    ## Get each group
    for group in group_list:

        # Grab each group's df and save to GROUPS
        dfs = get_group_ts(df,group,group_col)
        GROUPS[group] = dfs

        ## for each plot_cols, find all columns that contain that col name
        for plot_col in plot_cols:
            concat_dfs.append(dfs[[col for col in dfs.columns if col.endswith(plot_col)]])

    ## Concatenate final dfs
    plot_df = pd.concat(concat_dfs,axis=1)
    
    
    ## Set title and df if new_only
    if new_only:
        plot_df = plot_df.diff()
        title = f"New Coronavirus Cases by {group_col}"
    else:
        title = f'Cumulative Coronavirus Cases by {group_col}'
    
    ## Reset Index
    plot_df.reset_index(inplace=True)
  
    ## Return Df or plot
    if df_only:
         return plot_df#.reset_index()
    
    else:
        ## If any columns are per capita, change the y-axis label
        if np.any(['per capita' in x.lower() for x in plot_cols]):
            value_name = "# of Cases - Per Capita"
        else:
            value_name='# of Cases'
            
            
        ## Melt Data for plotting
        pfig_df_melt = plot_df.melt(id_vars=['Date'],var_name='Group',
                                    value_name=value_name)
        
        ## Set plotting function
        if plot_scatter:
            plot_func = px.scatter
        else:
            plot_func = px.line
    
        # Plot concatenated dfs
        pfig = plot_func(pfig_df_melt,x='Date',y=value_name,color='Group',
                      title=title,template='plotly_dark',width=width,height=height)     
        
        ## Add range slider
        pfig.update_xaxes(rangeslider_visible=True)
        
        ## Display?
        if show:
            pfig.show()
                
        return pfig
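
The two data steps inside `plot_group_ts` that matter most for the plot are `diff()` (cumulative totals to new cases) and `melt()` (wide to long). A minimal sketch on made-up numbers, with no plotting involved; the values and the frame names are illustrative assumptions.

```python
import pandas as pd

# Made-up cumulative counts for two groups (illustrative only)
dates = pd.date_range("2020-03-01", periods=4, freq="D", name="Date")
cumulative = pd.DataFrame(
    {"US - Confirmed": [10, 15, 15, 30], "Italy - Confirmed": [100, 130, 170, 180]},
    index=dates,
)

# Day-over-day change turns cumulative totals into new cases;
# the first row has no previous day, so it becomes NaN
new_cases = cumulative.diff()

# Wide -> long: one row per (Date, Group) pair, the shape px.line / px.scatter use
long_df = new_cases.reset_index().melt(
    id_vars=["Date"], var_name="Group", value_name="# of Cases"
)
print(long_df)
```

Because the first difference is undefined, the earliest date of every group is NaN when `new_only=True`, which is why the plotted lines start one day later than the data.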
           

In [12]:
pfig = plot_group_ts(df_world,group_list=['US','Italy','Canada',
                                  'Germany',
                                        'Mainland China'],group_col='Country/Region',
                     new_only=True,plot_scatter=False,width=900,height=600)
pfig

#  📕Covid Tracking Project Data

https://covidtracking.com/api

`/api/v1/states/{state}/screenshots.csv`

In [13]:
from fsds.imports import *
import datetime as dt
import requests
import json,urllib
pd.set_option('display.max_columns',0)

## 📦 class `CovidTrackingProject`

In [14]:
from functions import CovidTrackingProject

covid=CovidTrackingProject(download=True,verbose=True)
covid

[i] DOWNLOADING DATASETS FROM COVID TRACKING PROJECT
	https://covidtracking.com/data
	- File saved as: "New Data/states_metadata.csv"
ERROR
	- File saved as: "New Data/us.csv"
	- File saved as: "New Data/states.csv"
states


------------------------------------------------------------
[i] CovidTrackingProject Contents:
------------------------------------------------------------

METHODS:
	download_state_daily
	download_state_meta
	download_us_daily
	get_csv_save_load
	get_df
	get_group_ts
	help

ATTRIBUTES
	base_folder
	base_url
	columns
	columns_us
	df
	df_states
	df_us
	urls

In [15]:
df_us = covid.df_us.copy()
df_us

date,positive,negative,death,recovered,hospitalizedCurrently,hospitalizedCumulative,inIcuCurrently,inIcuCumulative,onVentilatorCurrently,onVentilatorCumulative,states,pending,dateChecked,hash
2020-07-09,3101339,34931627,125590.0,969111.0,43895.0,255253.0,5839.0,11370.0,2127.0,1138.0,56,2530.0,2020-07-09T00:00:00Z,cf10aef539ecb665c78e84ba56568f155b2a3411
2020-07-08,3042503,34353163,124723.0,953420.0,43004.0,253534.0,5867.0,11303.0,2172.0,1103.0,56,2360.0,2020-07-08T00:00:00Z,f18b5341612ca6e29534a5889bf29c553653ea02
2020-07-07,2980356,33788640,123826.0,936476.0,41700.0,251499.0,5826.0,11177.0,2098.0,1084.0,56,2136.0,2020-07-07T00:00:00Z,fc166aafb9add554f11c450b848bee5601cab6b9
2020-07-06,2928590,33207185,122904.0,924148.0,39749.0,249539.0,5680.0,11058.0,2105.0,1070.0,56,1907.0,2020-07-06T00:00:00Z,0d7619deb2079a52b3fa38d8adfee60f779d011f
2020-07-05,2881160,32734664,122662.0,902558.0,38734.0,248745.0,5652.0,11010.0,2080.0,1064.0,56,1885.0,2020-07-05T00:00:00Z,cb12b1705ba2d3589b5941da9133bdb57e4fbbf7
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
2020-01-26,2,0,,,,,,,,,1,,2020-01-26T00:00:00Z,e1cf59ab48e1cf367c4a6798a508a23d9d36bd18
2020-01-25,2,0,,,,,,,,,1,,2020-01-25T00:00:00Z,bef2a1d5f2a13491e0e0369bbd46c10cdd12973b
2020-01-24,2,0,,,,,,,,,1,,2020-01-24T00:00:00Z,bfffe76fc0b7cf11efe8aecd3cc7b22598d77d61
2020-01-23,2,0,,,,,,,,,1,,2020-01-23T00:00:00Z,cee36ebf3174bf1df0daa36e1e8088a157406fad


In [16]:
df_us.columns

Index(['positive', 'negative', 'death', 'recovered', 'hospitalizedCurrently',
       'hospitalizedCumulative', 'inIcuCurrently', 'inIcuCumulative',
       'onVentilatorCurrently', 'onVentilatorCumulative', 'states', 'pending',
       'dateChecked', 'hash'],
      dtype='object')

In [17]:
df_us[['positive','negative']].iplot()

### def `iplot_cols`

In [18]:

def iplot_cols(df_us,cols='icu'):
    """Plots every column whose lowercased name contains `cols` using .iplot()"""
    pfig = df_us[[col for col in df_us.columns if cols in col.lower()]].iplot()#kind=kind)
    return pfig
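
The column filter in `iplot_cols` lowercases the column name but not the keyword, so a keyword such as `'ICU'` would match nothing. A small sketch of the selection step on its own, with the keyword lowercased as well; `select_cols` is a hypothetical helper, the column names are copied from `df_us.columns`, and the row of values is made up.

```python
import pandas as pd

# Column names copied from df_us.columns; the single row of values is made up
df_demo = pd.DataFrame(
    [[1, 2, 3, 4, 5, 6]],
    columns=["positive", "hospitalizedCurrently", "hospitalizedCumulative",
             "inIcuCurrently", "inIcuCumulative", "onVentilatorCurrently"],
)


def select_cols(df, keyword):
    """Return the columns whose name contains `keyword`, ignoring case."""
    return [col for col in df.columns if keyword.lower() in col.lower()]


icu_cols = select_cols(df_demo, "ICU")
vent_cols = select_cols(df_demo, "vent")
print(icu_cols)
print(vent_cols)
```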

In [19]:
iplot_cols(df_us,'hospital')

In [20]:
iplot_cols(df_us,'icu')

In [21]:
iplot_cols(df_us,'vent')

In [22]:
covid.columns_us['good']

['positive',
 'negative',
 'death',
 'recovered',
 'hospitalizedCurrently',
 'hospitalizedCumulative',
 'inIcuCurrently',
 'inIcuCumulative',
 'onVentilatorCurrently',
 'onVentilatorCumulative',
 'states',
 'pending',
 'dateChecked',
 'hash']

In [23]:
df_states = covid.get_df()
df_states

date,state,fips,positive,negative,death,recovered,hospitalizedCurrently,hospitalizedCumulative,inIcuCurrently,inIcuCumulative,onVentilatorCurrently,onVentilatorCumulative,pending,dataQualityGrade,lastUpdateEt,totalTestsViral,positiveTestsViral,negativeTestsViral,positiveCasesViral,positiveIncrease,totalTestResults,totalTestResultsIncrease,deathIncrease,hospitalizedIncrease
2020-07-09,AK,2,1272.0,134472.0,17.0,571.0,28.0,,,,0.0,,,A,7/9/2020 00:00,135744.0,,,,46,135744,2343,0,0
2020-07-09,AL,1,49174.0,421330.0,1068.0,25783.0,1125.0,3039.0,,877.0,,490.0,,B,7/9/2020 11:00,,,,48588.0,2212,470504,2212,10,33
2020-07-09,AR,5,26052.0,338609.0,309.0,19992.0,394.0,1705.0,,,82.0,284.0,,A,7/9/2020 16:36,364661.0,,,26052.0,1540,364661,11389,8,50
2020-07-09,AS,60,0.0,816.0,0.0,,,,,,,,,C,7/1/2020 00:00,,,,,0,816,0,0,0
2020-07-09,AZ,4,112671.0,540390.0,2038.0,13341.0,3437.0,5526.0,861.0,,575.0,,,A+,7/9/2020 00:00,652418.0,,,112028.0,4057,653061,11991,75,139
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
2020-01-26,WA,53,2.0,0.0,,,,,,,,,,,,,,,,0,2,0,0,0
2020-01-25,WA,53,2.0,0.0,,,,,,,,,,,,,,,,0,2,0,0,0
2020-01-24,WA,53,2.0,0.0,,,,,,,,,,,,,,,,0,2,0,0,0
2020-01-23,WA,53,2.0,0.0,,,,,,,,,,,,,,,,0,2,0,0,0


# 🗺Adding Mapping - 07/08

https://plotly.com/python/mapbox-county-choropleth/

In [24]:
df_states = corona.df_us
df_states

Date,Province/State,Country/Region,Confirmed,Deaths,Recovered,state,Confirmed Per Capita,Deaths Per Capita,Recovered Per Capita
2020-01-22,Washington,US,1.0,0.0,0.0,WA,1.313216e-07,0.000000,0.0
2020-01-23,Washington,US,1.0,0.0,0.0,WA,1.313216e-07,0.000000,0.0
2020-01-24,Washington,US,1.0,0.0,0.0,WA,1.313216e-07,0.000000,0.0
2020-01-25,Washington,US,1.0,0.0,0.0,WA,1.313216e-07,0.000000,0.0
2020-01-26,Washington,US,1.0,0.0,0.0,WA,1.313216e-07,0.000000,0.0
...,...,...,...,...,...,...,...,...,...
2020-07-04,Puerto Rico,US,7787.0,155.0,0.0,PR,2.438242e-03,0.000049,0.0
2020-07-05,Puerto Rico,US,7916.0,155.0,0.0,PR,2.478634e-03,0.000049,0.0
2020-07-06,Puerto Rico,US,8585.0,155.0,0.0,PR,2.688110e-03,0.000049,0.0
2020-07-07,Puerto Rico,US,8714.0,157.0,0.0,PR,2.728502e-03,0.000049,0.0


In [25]:
## Get maximum value for cases by state
max_corona = df_states.groupby('state').max().reset_index()
max_corona.head()

Unnamed: 0,state,Province/State,Country/Region,Confirmed,Deaths,Recovered,Confirmed Per Capita,Deaths Per Capita,Recovered Per Capita
0,AK,Alaska,US,1222.0,17.0,0.0,0.00167,2.3e-05,0.0
1,AL,Alabama,US,46962.0,1058.0,0.0,0.009578,0.000216,0.0
2,AR,Arkansas,US,25246.0,305.0,0.0,0.008366,0.000101,0.0
3,AZ,"Tempe, AZ",US,108614.0,1963.0,1.0,0.014922,0.00027,1.373868e-07
4,CA,"Yolo County, CA",US,292560.0,6718.0,6.0,0.007404,0.00017,1.518517e-07
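
Note the `Province/State` values for AZ and CA in the output above: `.max()` is computed column by column, so a text column returns its alphabetically last label, not the label of the row with the most cases. A minimal sketch of that behavior and one way around it, using made-up rows; the frame and its values are illustrative assumptions.

```python
import pandas as pd

# Made-up rows that mimic the mix of state-level and city-level labels
demo = pd.DataFrame({
    "state": ["AZ", "AZ", "AZ"],
    "Province/State": ["Arizona", "Tempe, AZ", "Arizona"],
    "Confirmed": [5.0, 1.0, 9.0],
    "Deaths": [0.0, 0.0, 2.0],
})

# Column-wise max: the text column yields the alphabetically last label
mixed_max = demo.groupby("state").max()
print(mixed_max)

# Restricting the aggregation to numeric columns avoids the stray label
numeric_max = demo.groupby("state")[["Confirmed", "Deaths"]].max().reset_index()
print(numeric_max)
```

For the choropleth only the numeric columns are used, so the map itself is unaffected; the stray label matters only if `Province/State` is shown in hover text or tables.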


In [26]:
import plotly.express as px

color_column = 'Confirmed'
pfig = px.choropleth(max_corona,color=color_column,locations='state',
              hover_data=['Confirmed','Deaths','Recovered'], 
              hover_name='state',
              locationmode="USA-states", scope='usa',
              title=f"Total {color_column} Cases by State",
              color_continuous_scale=px.colors.sequential.Reds)

pfig

In [None]:
def plot_map_corona(df_states,color_column = 'Confirmed',
                   hover_data=['Confirmed','Deaths','Recovered']):
    
    ## Get maximum value for cases by state
    max_corona = df_states.groupby('state').max().reset_index()

    pfig = px.choropleth(max_corona,color=color_column,locations='state',
                  hover_data=hover_data, 
                  hover_name='state',
                  locationmode="USA-states", scope='usa',
                  title=f"Total {color_column} Cases by State",
                  color_continuous_scale=px.colors.sequential.Reds)
    pfig.update_layout(autosize=True)#,zoom=False)
    
    return pfig
pmap = plot_map_corona(df_states)

# 07/09/20 - Updating get_methods, etc to work with plotly fig

In [28]:


def get_methods(obj,private=False):
    """
    Retrieves a list of all non-private methods (default) from inside of obj.
    - If private==False: only returns methods whose names do NOT start with a '_'
    
    Args:
        obj (object): Object to retrieve methods from.
        private (bool): Whether to retrieve private methods or public.

    Returns:
        list: the names of all of the retrieved methods.
    """
    method_list = [func for func in dir(obj) if callable(getattr(obj, func))]
    if private:
        filt_methods = list(filter(lambda x: x.startswith('_'), method_list))
    else:
        filt_methods = list(filter(lambda x: not x.startswith('_'), method_list))
    return filt_methods

def get_attributes(obj,private=False):
    """
    Retrieves a list of all non-private attributes (default) from inside of obj.
    - If private==False: only returns attributes whose names do NOT start with a '_'
    
    Args:
        obj (object): Object to retrieve attributes from.
        private (bool): Whether to retrieve private attributes or public.
    
    Returns:
        list: the names of all of the retrieved attributes.
    """
    method_list = [func for func in dir(obj) if not callable(getattr(obj, func))]
    if private:
        filt_methods = list(filter(lambda x: x.startswith('_'), method_list))
    else:
        filt_methods = list(filter(lambda x: not x.startswith('_'), method_list))
    return filt_methods

def get_methods_attributes_df(obj,include_private=False):
    """
    Retrieves all attributes and methods (with docstrings)
    and returns them in a DataFrame. By default only retrieves
    non-private methods, unless include_private==True

    Args:
        obj (object): object to retrieve methods/attributes from
        include_private (bool): Whether to include private methods/attributes
    
    Returns:
        DataFrame: DataFrame with results.
    """
    import pandas as pd
    methods = get_methods(obj,private=False)
    method_types = ['Method' for item in methods]

    attrs = get_attributes(obj,private=False)
    att_types =['Attribute' for item in attrs]
    
    if include_private:
        private_methods = get_methods(obj,private=True)
        methods.extend(private_methods)
        method_types.extend(['Private Method' for item in private_methods])
        
        private_attrs = get_attributes(obj,private=True)
        attrs.extend(private_attrs)
        att_types.extend(['Private Attribute' for item in private_attrs])
    
    
    docs=[]
    for m in methods:
        att = getattr(obj,m)
        docs.append(att.__doc__)

    all_res = [*methods,*attrs]
    res_type = [*method_types,*att_types]#['Method' for item in methods]+['Attribute' for item in attrs]
    docstrings= docs + ['na' for i in attrs]

    df_obj = pd.DataFrame({'Object':all_res,'Type':res_type,'Doc':docstrings})
    return df_obj
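
The helpers above call `getattr` on every name in `dir(obj)`, so a single member that raises on access would abort the whole scan. Whether a Plotly figure has such members is not checked here; the sketch below only shows one defensive pattern, under that assumption. `public_members` and the `Demo` class are hypothetical and exist just to exercise the idea.

```python
def public_members(obj):
    """Split dir(obj) into public method names and public attribute names.

    Names starting with '_' are skipped, and any member whose lookup
    raises is ignored instead of stopping the scan.
    """
    methods, attributes = [], []
    for name in dir(obj):
        if name.startswith("_"):
            continue
        try:
            value = getattr(obj, name)
        except Exception:
            continue  # e.g. a property that raises when accessed
        (methods if callable(value) else attributes).append(name)
    return methods, attributes


class Demo:
    """Toy class used only to exercise the helper."""
    label = "demo"

    def run(self):
        return "ran"

    @property
    def broken(self):
        raise RuntimeError("not available")

    def _hidden(self):
        return None


methods, attributes = public_members(Demo())
print(methods, attributes)
```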


In [53]:
for obj in dir(pmap):
    print(obj)
    

__class__
__contains__
__delattr__
__dict__
__dir__
__doc__
__eq__
__format__
__ge__
__getattribute__
__getitem__
__gt__
__hash__
__init__
__init_subclass__
__iter__
__le__
__lt__
__module__
__ne__
__new__
__reduce__
__reduce_ex__
__repr__
__setattr__
__setitem__
__sizeof__
__str__
__subclasshook__
__weakref__
_animation_duration_validator
_animation_easing_validator
_batch_layout_edits
_batch_trace_edits
_bracket_re
_build_dispatch_plan
_build_update_params_from_batch
_config
_data
_data_defaults
_data_objs
_data_validator
_dispatch_layout_change_callbacks
_dispatch_trace_change_callbacks
_frame_objs
_frames_validator
_get_child_prop_defaults
_get_child_props
_grid_ref
_grid_str
_in_batch_mode
_index_is
_init_child_props
_initialize_layout_template
_ipython_display_
_is_dict_list
_is_key_path_compatible
_layout
_layout_defaults
_layout_obj
_layout_validator
_normalize_trace_indexes
_perform_batch_animate
_perform_plotly_relayout
_perform_plotly_restyle
_perform_plotly_update
_perform_

## Geocoding

In [None]:
df = corona.df_us
df

In [None]:
# !pip install geopandas
# !pip install geopy

In [None]:
from geopy.geocoders import Nominatim
locator = Nominatim(user_agent="myGeocoder")
res = locator.geocode('Baltimore')
res.latitude,res.longitude

## LEFTOVERS

In [None]:
# covid.df_us[['positive','negative','death','recovered',
# 'hospitalizedCurrently', 'hospitalizedCumulative',
#  'inIcuCurrently', 'inIcuCumulative', 
#  'onVentilatorCurrently','onVentilatorCumulative', 
#  'states','pending','dateChecked', 'hash',]]

In [None]:
covid.columns['good']

In [None]:
covid.df_states

In [None]:
df_us = covid.df_us.copy()
# sorted(list(df_us.columns))
df_us.columns

In [None]:
# df_us['fips']

In [None]:
good_us_cols = ['dateChecked','death', 'hash', 'hospitalizedCumulative',
 'hospitalizedCurrently','inIcuCumulative', 'inIcuCurrently',
 'negative', 'onVentilatorCumulative', 'onVentilatorCurrently',
 'pending','positive','recovered','states']

dep_us_cols = ['hospitalized', 'lastModified', 'total', 
             'totalTestResults', 'posNeg', 'deathIncrease',
            'hospitalizedIncrease', 'negativeIncrease', 'positiveIncrease', 
            'totalTestResultsIncrease']#[col for col in df_us.columns if col not in good_us_cols]
# print(dep_cols)

In [None]:
df = covid.df_us[covid.columns_us['good']].copy()
df[good_us_cols]

In [None]:
covid

In [None]:
# covid.US

# APPENDIX

In [None]:
## Load in Fips Data
fips = pd.read_csv('Reference Data/ZIP-COUNTY-FIPS_2018-03.csv')
fips.groupby('STATE').get_group("NY")['STCOUNTYFP'].value_counts()

In [None]:
fips.loc[fips['STCOUNTYFP']==36]

In [None]:

df = covid.STATES
df['fips']

In [None]:
# #     def __init__(self):
# tracking = CovidTrackingProject()
# states_daily = tracking.download_state_daily()
# us_daily=tracking.download_us_daily()
# state_meta = tracking.download_state_meta()
# display(states_daily.head(),us_daily.head(),state_meta.head())

In [None]:
covid = CovidTrackingProject(download=True)
state_meta = covid.data['states_metadata']
states_daily = covid.data['states']
state_list = state_meta['state'].unique()
states_daily

In [None]:
from pandas_profiling import ProfileReport

In [None]:
report  = ProfileReport(states_daily)


## NOTES: COLUMNS TO PLOT

- Basic Stats:
    - death: cumulative total of people who have died
    - positive: total number of people who have tested positive so far
    - negative
    - recovered
    

- Hospitalization:
    - hospitalizedCumulative: total number hospitalized so far (including those who recovered or died)
    - hospitalizedCurrently: number currently hospitalized
    - hospitalizedIncrease


- ICU:
    - inIcuCumulative: total number admitted to the ICU so far (including those who recovered or died)
    - inIcuCurrently: number currently in the ICU
    
- Ventilator 
    - onVentilatorCumulative
    - onVentilatorCurrently
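
One way to turn the notes above into something reusable is a plain dictionary of column groups, filtered against whatever columns a given frame actually has. This is only a sketch: `PLOT_GROUPS` and `columns_for` are hypothetical names, the group contents are transcribed from the list above, and the demo frame is made up.

```python
import pandas as pd

# Column groups transcribed from the notes above
PLOT_GROUPS = {
    "Basic Stats": ["death", "positive", "negative", "recovered"],
    "Hospitalization": ["hospitalizedCumulative", "hospitalizedCurrently",
                        "hospitalizedIncrease"],
    "ICU": ["inIcuCumulative", "inIcuCurrently"],
    "Ventilator": ["onVentilatorCumulative", "onVentilatorCurrently"],
}


def columns_for(df, group):
    """Return the group's columns that are actually present in df."""
    return [col for col in PLOT_GROUPS[group] if col in df.columns]


# Made-up, empty frame that lacks hospitalizedIncrease
demo = pd.DataFrame(columns=["death", "hospitalizedCumulative",
                             "hospitalizedCurrently", "inIcuCurrently"])
hospital_cols = columns_for(demo, "Hospitalization")
print(hospital_cols)
```

Filtering against `df.columns` avoids a `KeyError` when a frame is missing one of the listed columns.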


In [None]:

covid.columns

In [None]:
NY = states_daily.groupby('state').get_group('NY')[covid.columns['good']]
NY

## Folium

In [None]:
# import folium
# center = (res.latitude,res.longitude) #(resp['region']['center']['latitude'],resp['region']['center']['longitude'])

# popup = folium.Popup(f"Latitude={center[0]}, Longitude={center[1]}")
# marker = folium.Marker(center,popup)
# mymap = folium.Map(center)
# marker.add_to(mymap)
# mymap