# CDC Influenza, Pneumonia and Total Deaths Notebook

This notebook can be used to explore the CDC's data on influenza, pneumonia and total deaths for each state as well as the United States as a whole.

Raw data is from the US CDC. 
The original CDC data for this notebook can be found at: https://gis.cdc.gov/grasp/fluview/mortality.html

## Instructions:

To start processing data with this notebook, click the **Runtime** menu at the top of the notebook, then click **Run All**. Scroll to the bottom of the notebook. If everything went as planned, you should see a set of control widgets that let you select the state, the type of death (total, influenza or pneumonia) and a time period (flu season or full year). Below the controls is a graph of the data for the past 7 years, and below the graph is a table of the data used to make the graph. Because 2012-13 does not have complete data for all the states, it is not included in the graph but is provided in the table for comparison.
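The flu-season view depends on ordering the weeks so that week 40 onward comes first and the early weeks of the following calendar year come after. A minimal sketch of that ordering, mirroring the `ORDER` column that `buildTable` creates further down; the helper name `season_order` and the sample weeks are illustrative and not part of the notebook:

```python
import pandas as pd

def season_order(weeks):
    """Return weeks sorted in flu-season order: weeks 40+ first, then weeks 1-39.

    Hypothetical helper for illustration; the notebook does this inline
    with an ORDER column (1 for WEEK >= 40, otherwise 2).
    """
    frame = pd.DataFrame({"WEEK": list(weeks)})
    frame["ORDER"] = 2
    frame.loc[frame["WEEK"] >= 40, "ORDER"] = 1
    return frame.sort_values(["ORDER", "WEEK"])["WEEK"].tolist()

print(season_order([1, 2, 39, 40, 41, 52]))  # [40, 41, 52, 1, 2, 39]
```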

## Disclaimer

No claim is made as to the accuracy of this notebook. Before using results from this notebook, verify them against the data provided on the CDC website.
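One possible sanity check is to recompute `PERCENT P&I` from the counts and compare it with the reported column. The sketch below assumes that `PERCENT P&I` is influenza plus pneumonia deaths as a percentage of total deaths, rounded to one decimal place. That assumption agrees with the sample Alabama rows listed at the end of this notebook, but it is not confirmed by the source. The function name `check_percent_pi` and the tolerance value are illustrative.

```python
import pandas as pd

def check_percent_pi(df, tolerance=0.06):
    """Return rows whose reported PERCENT P&I differs from the recomputed value.

    Assumption (not confirmed by the source):
    PERCENT P&I = 100 * (influenza + pneumonia deaths) / total deaths.
    The tolerance allows for rounding to one decimal place.
    """
    recomputed = 100 * (df["NUM INFLUENZA DEATHS"] + df["NUM PNEUMONIA DEATHS"]) / df["TOTAL DEATHS"]
    mismatch = (recomputed - df["PERCENT P&I"]).abs() > tolerance
    return df[mismatch]

# Sample rows copied from the data listed at the end of this notebook (Alabama, 2019-20).
sample = pd.DataFrame({
    "WEEK": [40, 41, 42, 43, 44],
    "PERCENT P&I": [5.0, 3.8, 4.5, 4.4, 5.7],
    "NUM INFLUENZA DEATHS": [0, 0, 0, 2, 2],
    "NUM PNEUMONIA DEATHS": [48, 36, 45, 42, 53],
    "TOTAL DEATHS": [955, 940, 1011, 1000, 962],
})

# An empty result means every sample row is consistent with the assumption.
print(check_percent_pi(sample))
```

A check like this only tests the data's internal consistency; it does not replace comparing the numbers with the CDC website.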

This notebook is hosted on mybinder.org. When you click on the link below, mybinder will create a one-time Jupyter Notebook server. You may edit and experiment with the code. Because this is a one-time server, you cannot save your changes; any changes you make will be lost.

https://mybinder.org/v2/gh/tav2119/cdcinfluenza/master?filepath=CDC_FluPneumoniaState.ipynb



In [8]:
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
from IPython.display import HTML
import ipywidgets as wg
from IPython.display import display

In [9]:
file="State_Custom_Data.csv"
df=pd.read_csv(file, thousands=',')
#df1

In [10]:
def buildTable(data, state, period, cause):
    data1 = data.copy()
    data1.columns = data1.columns.str.strip()
    data1.loc[(data1['TOTAL DEATHS'] == 'Insufficient Data'),'TOTAL DEATHS']='0'
    data1.loc[(data1['NUM INFLUENZA DEATHS'] == 'Insufficient Data'),'NUM INFLUENZA DEATHS']='0'
    data1.loc[(data1['NUM PNEUMONIA DEATHS'] == 'Insufficient Data'),'NUM PNEUMONIA DEATHS']='0'
    data1['ORDER'] = 2
    data1.loc[(data1['WEEK'] >= 40),'ORDER']=1

    # Convert the count columns from strings to integers
    data1['TOTAL DEATHS'] = data1['TOTAL DEATHS'].astype('int32')
    data1['NUM INFLUENZA DEATHS'] = data1['NUM INFLUENZA DEATHS'].astype('int32')
    data1['NUM PNEUMONIA DEATHS'] = data1['NUM PNEUMONIA DEATHS'].astype('int32')

    #print("State:", state)
    if state == 'National':
        df2 = data1
    else:    
        # Filter on data1 (stripped column names), not the original frame
        df2 = data1[data1['SUB AREA'] == state]
        
    if cause == 'TOTAL DEATHS':
        allCause= df2[['SEASON', 'WEEK', 'ORDER', 'TOTAL DEATHS']]
    elif cause == 'NUM INFLUENZA DEATHS':
        allCause= df2[['SEASON', 'WEEK', 'ORDER', 'NUM INFLUENZA DEATHS']]
    else:
        allCause= df2[['SEASON', 'WEEK', 'ORDER', 'NUM PNEUMONIA DEATHS']]
#    print(allCause)        
    ptable2 = pd.pivot_table(allCause, values = cause, index=['ORDER','WEEK'], columns='SEASON', aggfunc='sum', fill_value=0)
    ptable2.reset_index(inplace = True)

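    # After reset_index the columns are ORDER, WEEK, then the seasons oldest-first.
    # Reverse them and move WEEK and ORDER to the front so the seasons run newest-first.
    cols = list(ptable2.columns.values)
    cols.reverse()
    cols = (cols[-2:] + cols[0:-2])
    
    ptable2 = ptable2[cols].copy()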
    #print(ptable2)
    
    ptable2 = pd.DataFrame(ptable2.to_records())
    if period == 'Full Year':
        #ptable2.drop(ptable2[ptable2['WEEK'] == 53].index, inplace=True)
        ptable2 = ptable2.loc[ptable2['WEEK'] != 53]
    else:
        ptable2 = ptable2.loc[ptable2['WEEK'] != 53]
        ptable2 = ptable2.loc[ptable2['2019-20'] != 0]
        #ptable2.drop(ptable2['2019-20'] == 0, inplace=True)
        w = list(ptable2['WEEK'])
        ptable2 = ptable2.loc[(ptable2['WEEK'] <= w[-1]) | (ptable2['WEEK'] >= 40)]

    ptable2 = ptable2.drop(columns="ORDER")
    
    return ptable2

def printTable(data, state = 'National', period = 'Flu Season', cause = 'TOTAL DEATHS'):
    table = buildTable(data, state, period, cause)
    table.plot(y=['2019-20', '2018-19', '2017-18', '2016-17', '2015-16', '2014-15','2013-14'],
              figsize=(12,8), title = cause)
    plt.show()
    display(cause + " " + state)
    display(HTML(table.to_html(index=False)))

#printTable(df, cause = 'NUM INFLUENZA DEATHS')
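
`buildTable` replaces the string `'Insufficient Data'` with `'0'` before converting the count columns to integers, so suppressed counts become indistinguishable from true zeros. As a possible alternative (a sketch, not the notebook's method), `pd.to_numeric` with `errors='coerce'` turns such entries into `NaN` so they can be treated as missing. The helper name `to_counts` and the sample values are hypothetical.

```python
import pandas as pd

def to_counts(series):
    """Convert a count column to numbers, mapping non-numeric text to NaN.

    Hypothetical helper: entries such as 'Insufficient Data' become NaN
    instead of 0, so sums skip them rather than counting them as zero deaths.
    """
    return pd.to_numeric(series, errors="coerce")

# Made-up values for illustration only
raw = pd.Series(["12", "Insufficient Data", "7"], name="NUM INFLUENZA DEATHS")
counts = to_counts(raw)
print(counts.isna().sum())  # number of entries that were not numeric
print(counts.sum())         # NaN entries are skipped by sum()
```

The trade-off is that the column becomes floating point, so any later integer conversion would need to handle the missing values explicitly.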



In [11]:
# Sorted list of states, with 'National' as the first option
stateList = ['National'] + sorted(df['SUB AREA'].dropna().unique().tolist())
#display(stateList)
areaSel = wg.Dropdown(
    options=stateList,
    value='National',
    description='State:',
    disabled=False,
)

periodSel = wg.RadioButtons(
    options=['Flu Season', 'Full Year'],
    value='Flu Season', # Default selection
#    layout={'width': 'max-content'}, # If the items' names are long
    description='Time Period:',
    disabled=False
)

causeSel = wg.RadioButtons(
    options=['TOTAL DEATHS', 'NUM INFLUENZA DEATHS', 'NUM PNEUMONIA DEATHS'],
    value='TOTAL DEATHS', # Default selection
#    layout={'width': 'max-content'}, # If the items' names are long
    description='Cause:',
    disabled=False
)

dummy = wg.interact(printTable, data = wg.fixed(df), state = areaSel, period = periodSel, cause = causeSel)



In [12]:
HTML(df.to_html(index=False))

AREA,SUB AREA,AGE GROUP,SEASON,WEEK,PERCENT P&I,NUM INFLUENZA DEATHS,NUM PNEUMONIA DEATHS,TOTAL DEATHS,PERCENT COMPLETE
State,Alabama,All,2019-20,40,5.0,0,48,955,98.3%
State,Alabama,All,2019-20,41,3.8,0,36,940,96.8%
State,Alabama,All,2019-20,42,4.5,0,45,1011,> 100%
State,Alabama,All,2019-20,43,4.4,2,42,1000,> 100%
State,Alabama,All,2019-20,44,5.7,2,53,962,99%
State,Alabama,All,2019-20,45,4.9,0,52,1054,> 100%
State,Alabama,All,2019-20,46,6.1,0,67,1099,> 100%
State,Alabama,All,2019-20,47,4.3,3,43,1061,> 100%
State,Alabama,All,2019-20,48,7.6,2,78,1049,> 100%
State,Alabama,All,2019-20,49,5.3,2,55,1072,> 100%
