[DIY Covid-19 Dashboard](https://github.com/fsmeraldi/diy-covid19dash) (C) Joshua Hunter, 2020 (ec20719@qmul.ac.uk). All rights reserved.

# DIY Covid-19 Dashboard

This is a DIY Covid-19 Dashboard developed by Josh Hunter in November 2020. The dashboard wrangles data provided by Public Health England on coronavirus cases in the UK since the start of the pandemic, then plots the wrangled data in a graph with interactive widgets.

In [None]:
import ipywidgets as wdg
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import json
from uk_covid19 import Cov19API

#importing modules for use throughout dashboard

In [None]:
%matplotlib inline
plt.rcParams['figure.dpi'] = 100

# make figures larger

In [None]:
with open("timeseries.json", "rt") as INFILE:
    data1=json.load(INFILE)
datalist=data1['data']

#load data from saved json file

## Wrangle the data

This is the logic that wrangles the raw data into a ```DataFrame``` used for plotting. The dashboard displays the new cases in the UK for each date provided by Public Health England, along with a running total of all UK cases to date.

In [None]:
def wrangle_data(data):
    
    datalist=data['data']
    
    dates=[dictionary['date'] for dictionary in datalist ]

    dates.sort()

    def parse_date(datestring):
        """ Convert a date string into a pandas datetime object """
        return pd.to_datetime(datestring, format="%Y-%m-%d")

    startdate=parse_date(dates[0])
    enddate=parse_date(dates[-1])

    index=pd.date_range(startdate, enddate, freq='D')
    timeseriesdf=pd.DataFrame(index=index, columns=['Newcases', 'Cumcases'])

    for entry in datalist: # each entry is a dictionary with date, Newcases and Cumcases
        date=parse_date(entry['date'])
        for column in ['Newcases', 'Cumcases']:
            if pd.isna(timeseriesdf.loc[date, column]):  
                value = int(entry[column]) if entry[column] is not None else 0
                timeseriesdf.loc[date, column]=value
            
    timeseriesdf.fillna(0, inplace=True)
    timeseriesdf.to_pickle("timeseriesdf.pkl")
    timeseriesdf=pd.read_pickle("timeseriesdf.pkl")
    
    return timeseriesdf

#logic required to wrangle data into a dataframe

In [None]:
df1 = wrangle_data(data1)

#call to wrangle the data of loaded file into a dataframe
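
The loop in `wrangle_data` can also be written with vectorised pandas operations. The following is a minimal sketch rather than the dashboard's own implementation: `wrangle_data_vectorised` and the `sample` records are hypothetical, and the only assumption is the raw format used above (a `data` list of dictionaries with `date`, `Newcases` and `Cumcases`).

```python
import pandas as pd

def wrangle_data_vectorised(data):
    """Hypothetical alternative to wrangle_data: same columns, no explicit loop."""
    frame = pd.DataFrame(data['data'], columns=['date', 'Newcases', 'Cumcases'])
    frame['date'] = pd.to_datetime(frame['date'], format="%Y-%m-%d")
    # keep the first record for each date, as the loop version does
    frame = frame.drop_duplicates(subset='date', keep='first')
    frame = frame.set_index('date').sort_index()
    # one row per calendar day; days missing from the feed become NaN
    full_index = pd.date_range(frame.index[0], frame.index[-1], freq='D')
    frame = frame.reindex(full_index)
    # missing values (None in the JSON, or absent days) are treated as 0
    return frame.fillna(0).astype(int)

# made-up sample records: unsorted, one missing day, one None value
sample = {'data': [
    {'date': '2020-11-03', 'Newcases': 30, 'Cumcases': 60},
    {'date': '2020-11-01', 'Newcases': 10, 'Cumcases': 10},
    {'date': '2020-11-04', 'Newcases': None, 'Cumcases': 60},
]}
demo_df = wrangle_data_vectorised(sample)
print(demo_df)
```

As in the original function, a missing day is filled with 0 in both columns, which for `Cumcases` shows up as a dip in the running total rather than a carried-forward value.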

## Download current data

Here is a button that allows users to refresh the data from Public Health England. Clicking this button will also refresh the graph at the bottom of the dashboard.

In [None]:
def access_api():
    """ Accesses the PHE API. Returns raw data in the same format as data loaded from the "canned" JSON file. """
    filters = [
    'areaType=overview' # note each metric-value pair is inside one string
    ]
    structure = {
    "date": "date",
    "Newcases": "newCasesByPublishDate",
    "Cumcases": "cumCasesByPublishDate"
    }
    api = Cov19API(filters=filters, structure=structure)
    timeseries=api.get_json()
    
    with open('timeseries.json', "wt") as OUTF:
        json.dump(timeseries, OUTF) # json.dump returns None, so there is nothing to keep
    
    return timeseries

#defining access to the api for current data 
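
Because `access_api` overwrites `timeseries.json`, it may be worth checking that a download has the expected shape before saving it. The helper below is a hypothetical sketch, not part of the dashboard or of the `uk_covid19` library; it assumes only the format described above.

```python
def looks_like_timeseries(raw):
    """Hypothetical check: True if raw has a non-empty 'data' list whose
    entries all carry the three keys that wrangle_data reads."""
    required = {'date', 'Newcases', 'Cumcases'}
    entries = raw.get('data') if isinstance(raw, dict) else None
    if not isinstance(entries, list) or not entries:
        return False
    return all(isinstance(e, dict) and required.issubset(e) for e in entries)

# made-up payloads for illustration
good = {'data': [{'date': '2020-11-01', 'Newcases': 10, 'Cumcases': 10}]}
bad = {'data': [{'date': '2020-11-01', 'Newcases': 10}]}  # Cumcases missing
print(looks_like_timeseries(good), looks_like_timeseries(bad))
```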

In [None]:
def api_button_callback(button):
    """ Button callback - it must take the button as its parameter.
    Accesses API, wrangles data, updates global variable df1 used for plotting. """
    apidata=access_api()
    global df1
    df1=wrangle_data(apidata)
    refresh_graph()
    apibutton.icon="check"
    apibutton.disabled=True
    return df1

    
apibutton=wdg.Button(
    description='Refresh data',
    disabled=False,
    button_style='info', 
    tooltip='Click to download current Public Health England data',
    icon='refresh'
)

apibutton.on_click(api_button_callback)

display(apibutton)

# RUN ALL CELLS BEFORE CLICKING ON THIS BUTTON

## Graphs and Analysis

Below is a graph of Covid-19 cases in the UK, based on the data published by Public Health England. It shows the new cases for each date and the running total of cases since the start of the pandemic. Use the controls to choose which series to plot and whether the vertical axis is linear or logarithmic.

In [None]:
cases=wdg.SelectMultiple(
    options=['Newcases', 'Cumcases'],
    value=['Newcases', 'Cumcases'],
    rows=2,
    description='Cases',
    disabled=False
)

scale=wdg.RadioButtons(
    options=['linear', 'log'],
    description='Scale:',
    disabled=False
)

controls=wdg.HBox([cases, scale])

def timeseries_graph(gcols, gscale):
    logscale = (gscale == 'log')
    ncols=len(gcols)
    if ncols>0:
        df1[list(gcols)].plot(logy=logscale)
        plt.xlabel('Date (Month)') 
        plt.ylabel('Number of Cases') 
        plt.title('New Cases per Day and Cumulative Cases Over Time of Covid-19 in the UK\n')
        plt.show() # show the figure explicitly so it is rendered inside the output widget
    else:
        print("Click to select data for graph")
        print("(CTRL-Click to select more than one category)")
    
graph=wdg.interactive_output(timeseries_graph, {'gcols': cases, 'gscale': scale}) 

display(controls, graph)

#plots graph with interactive widgets
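
One caveat with the log option: `wrangle_data` fills missing values with 0, and zero has no position on a logarithmic axis. A possible workaround, sketched here under the assumption that a zero means "no data" rather than a true count, is to mask zeros before plotting. `mask_zeros_for_log` is a hypothetical helper and the frame below is made-up sample data.

```python
import numpy as np
import pandas as pd

def mask_zeros_for_log(frame):
    """Hypothetical helper: return a copy with zeros replaced by NaN,
    so a log-scale plot leaves gaps instead of undefined points."""
    return frame.replace(0, np.nan)

demo = pd.DataFrame(
    {'Newcases': [0, 5, 12], 'Cumcases': [0, 5, 17]},
    index=pd.date_range('2020-03-01', periods=3, freq='D'),
)
masked = mask_zeros_for_log(demo)
print(masked)
```

Inside `timeseries_graph`, the mask could be applied only when the log scale is selected, leaving the linear plot unchanged.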


In [None]:
def refresh_graph():

    current=scale.value
    if current==scale.options[0]:
        other=scale.options[1]
    else:
        other=scale.options[0]
    scale.value=other # forces the redraw
    scale.value=current # now we can change it back
    
#defining the refresh graph function - cycles between widget options
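
The toggle is needed because widget observers are notified only when a value actually changes, so reassigning the current value does nothing. The sketch below illustrates this with `traitlets`, the library that ipywidgets builds its widgets on. `ScaleStandIn` is a hypothetical stand-in for the `scale` widget, not a real widget.

```python
import traitlets

class ScaleStandIn(traitlets.HasTraits):
    """Hypothetical stand-in for the RadioButtons widget's value trait."""
    value = traitlets.Unicode('linear')

calls = []  # records (old, new) for every notification received

scale_demo = ScaleStandIn()
scale_demo.observe(
    lambda change: calls.append((change['old'], change['new'])),
    names='value',
)

scale_demo.value = 'linear'  # same as the current value: no notification
scale_demo.value = 'log'     # value changes: observers are notified
scale_demo.value = 'linear'  # changes back: notified again
print(calls)
```

This is why `refresh_graph` switches to the other option and back: each of the two assignments is a real change, so the plot function runs again with the new contents of `df1`.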

**Author and Copyright Notice** *Created by Josh Hunter Nov 2020 - Based on UK Government [data](https://coronavirus.data.gov.uk/) published by [Public Health England](https://www.gov.uk/government/organisations/public-health-england).*