**DIY Covid-19 Dashboard: daily new cases per nation as % of population. (C) Maria Lagoumidou, 2020. All rights reserved.**

# DIY Covid-19 Dashboard

## Daily new cases for England, Scotland, Wales and Northern Ireland as a percentage of the population of each nation


This DIY Covid-19 Dashboard uses the API provided by Public Health England to access current statistics on the COVID-19 pandemic for the United Kingdom. The dashboard is displayed using [voila](https://voila.readthedocs.io/en/stable/index.html).

The **"newCasesByPublishDate" data time series** has been selected from the above API to plot a graph showing the **daily new cases for each UK nation, i.e. England, Scotland, Wales and Northern Ireland, as a percentage of the population of each nation**.
  
A detailed explanation about the data provided by Public Health England may be found [here](https://coronavirus.data.gov.uk/details/about-data). For the purpose of this Dashboard the following applies:

- The daily number of cases is the number of people with a positive COVID-19 virus test (either lab-reported or lateral flow device) on or up to the reporting date (depending on availability). The reporting date is the date the case was first included in the published totals. The availability of each of these time series varies by area.

- COVID-19 cases are identified by taking specimens from people and testing them for the presence of the COVID-19 virus. If the test is positive, this is referred to as a case. A person who has had more than one positive test is counted as only one case.

Population data for each nation is provided by the Office for National Statistics for mid-2019 [here](https://www.ons.gov.uk/peoplepopulationandcommunity/populationandmigration/populationestimates/bulletins/annualmidyearpopulationestimates/mid2019estimates#population-growth-in-england-wales-scotland-and-northern-ireland) and was used in the calculations to plot the graph.
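
As a minimal sketch of the calculation, the percentage is simply the daily case count divided by the nation's population and multiplied by 100. The population figures below are the ones used in this dashboard's wrangling code; the helper name `cases_as_percentage` and the case count of 1,000 are hypothetical and serve only as an illustration.

```python
# Mid-2019 population estimates, as used in this dashboard's wrangling code
POPULATION = {
    "England": 56287000,
    "Scotland": 5463300,
    "Wales": 3152900,
    "Northern Ireland": 1893700,
}

def cases_as_percentage(cases, nation):
    """Return daily new cases as a percentage of the nation's population."""
    return 100 * cases / POPULATION[nation]

# Hypothetical figure: 1,000 new cases reported in Wales on a single day
print(round(cases_as_percentage(1000, "Wales"), 4))  # prints 0.0317
```

Even a day with 1,000 new cases in Wales amounts to only about 0.03% of its population, which is why the values on the graph's vertical axis are small.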


In [None]:
import ipywidgets as wdg
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import json
from uk_covid19 import Cov19API

In [None]:
%matplotlib inline
# make figures larger
plt.rcParams['figure.dpi'] = 120 #made this larger

## Initial data from a file

The graph displayed when this dashboard is first accessed uses data saved in a file. This file includes daily cases from 03-Jan-2020 until 17-Nov-2020. By default, the plots for all nations are displayed in the graph.
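
For orientation, here is a rough sketch of the record layout that the wrangling code in this dashboard expects: a top-level `data` list whose entries carry `areaName`, `date` and `cases` keys. The values below are invented for illustration and are not taken from the saved file.

```python
import json

# Invented sample in the layout the dashboard code expects: a top-level "data"
# list of records with "areaName", "date" and "cases" keys. Values are made up.
sample_text = """
{"data": [
  {"areaName": "England", "date": "2020-11-17", "cases": 100},
  {"areaName": "Wales", "date": "2020-11-17", "cases": 10},
  {"areaName": "England", "date": "2020-11-16", "cases": 90}
]}
"""

sample = json.loads(sample_text)
records = sample["data"]

# Each date and each nation can appear several times, so we deduplicate them
dates = sorted({record["date"] for record in records})
nations = sorted({record["areaName"] for record in records})
print(dates)
print(nations)
```

Note that a given date may be missing for some nations, as in this sample, so any code that reshapes the records has to cope with gaps.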

A selection list, with instructions, lets the user choose which nations' daily cases to display.

This initial graph shows that between January and the beginning of March either there were no positive COVID-19 cases or testing had not yet been deployed. Between the middle of March and June, we can see the peak of the 'first wave' of the pandemic in the UK, with Wales appearing to be the worst afflicted nation as a percentage of its population.

It is interesting to notice the 'second wave' starting in September and continuing in November. The cases, as a percentage of each nation's population, are higher for all nations than in the first wave, though this may be due to the higher number of tests being conducted. These spikes appear consistent with the four national governments' decisions to enforce a second lockdown. Northern Ireland appears to be the nation with the highest number of cases as a percentage of its population in the 'second wave'.

The cases as percentages of each nation's population appear quite low in general; however, these are cases with registered positive COVID-19 tests, and not everyone with COVID-19 symptoms is tested.

In general, the spread of the virus in the population of each nation through time, as shown in this graph, appears to be in line with the updates given to the general public in the mainstream media.

In [None]:
# Load JSON files and store the raw data in a variable
with open("timeseries.json", "rt") as INFILE:
    jsondata=json.load(INFILE)


In [None]:
### WRANGLE DATA ###### putting the wrangling code into a function allows us to call it again
## after refreshing the data through the API.

def wrangle_data(rawdata):
    """ Parameters: rawdata - data from json file or API call. Returns a dataframe
    indexed by date, with one column per nation holding the daily new cases as a
    percentage of that nation's population. """
    datalist=rawdata['data']
    dates=[dictionary['date'] for dictionary in datalist]
    dates.sort()
   
    ## constructing a dates list where all dates are included once
    datesb=sorted(set(dates))
    
    ## reconstructing the dictionaries so that for each date we get the number of cases in 
    ## each nation as a percentage of the population of that nation. 
    ## Population data from ons.gov.uk, 2019 mid-year estimate
    populations={'England':56287000, 'Scotland':5463300, 'Wales':3152900, 'Northern Ireland':1893700}
    columnnames={'England':'Englandcases', 'Scotland':'Scotlandcases', 
                 'Wales':'Walescases', 'Northern Ireland':'NorthernIrelandcases'}
    ## start every date with None for each nation, so that a value missing for one date
    ## is never carried over from another date
    bydate={i:{'date':i, 'Englandcases':None, 'Scotlandcases':None, 
               'NorthernIrelandcases':None, 'Walescases':None} for i in datesb}
    for dictionary in datalist:
        area=dictionary['areaName']
        if area in populations and dictionary['cases'] is not None:
            # percentage cases/nation population
            bydate[dictionary['date']][columnnames[area]]=100*dictionary['cases']/populations[area]
    mldatalist=[bydate[i] for i in datesb]
   
    #we find the earliest and latest date and convert them to the pandas type for representing dates
    def parse_date(datestring):
        """ Convert a date string into a pandas datetime object """
        return pd.to_datetime(datestring, format="%Y-%m-%d")

    #'datesb' is the dates list created above
    startdate=parse_date(datesb[0])
    enddate=parse_date(datesb[-1])
    
    
    #we create an index as a date_range: this is the date analog of a range for integers, 
    #and it will include any dates that may be missing from our list. 
    #We then proceed to define the DataFrame by specifying its index and the title of its columns
    index=pd.date_range(startdate, enddate, freq='D')
    df=pd.DataFrame(index=index, columns=['Englandcases', 'Scotlandcases', 'NorthernIrelandcases', 'Walescases'])
    
    
    for entry in mldatalist: # each entry is a dictionary with the date and the percentage for each nation
        date=parse_date(entry['date'])
        for column in ['Englandcases', 'Scotlandcases', 'NorthernIrelandcases', 'Walescases']:
            # check that nothing is there yet - just in case some dates are duplicated,
            # maybe with data for different columns in each entry
            if pd.isna(df.loc[date, column]): 
                # replace None with 0 in our data 
                value= float(entry[column]) if entry[column] is not None else 0.0
                # this is the way you access a specific location in the dataframe - use .loc
                # and put index,column in a single set of [ ]
                df.loc[date, column]=value
            
    # make the columns numeric and fill in any remaining "holes" due to missing dates
    df=df.astype(float).fillna(0.0)
    return  df 

# We call the function directly on the JSON data when the dashboard starts, by including 
# the call in the cell as below:

df=wrangle_data(jsondata) # df is the dataframe for plotting
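
The reshaping performed by `wrangle_data` can also be illustrated without pandas. The following is only a dependency-free sketch of the same idea, not the dashboard's own code: the function name `reshape`, the two-nation population dictionary and the sample records are hypothetical. It shows the two points that matter, namely that every calendar day between the first and last date gets a row, and that missing values become 0.

```python
from datetime import date, timedelta

def reshape(records, populations):
    """Return {iso_date: {nation: percentage}} with every calendar day present."""
    days = sorted(date.fromisoformat(r["date"]) for r in records)
    table = {}
    day = days[0]
    # Create one row per calendar day, so gaps in the data show up as zeros
    while day <= days[-1]:
        table[day.isoformat()] = {nation: 0.0 for nation in populations}
        day += timedelta(days=1)
    # Fill in the values that are actually present
    for r in records:
        nation, cases = r["areaName"], r["cases"]
        if nation in populations and cases is not None:
            table[r["date"]][nation] = 100 * cases / populations[nation]
    return table

# Invented records with a missing day (2020-03-02) and a missing value (None)
populations = {"England": 56287000, "Wales": 3152900}
records = [
    {"areaName": "England", "date": "2020-03-01", "cases": 56287},
    {"areaName": "Wales", "date": "2020-03-01", "cases": None},
    {"areaName": "Wales", "date": "2020-03-03", "cases": 31529},
]
table = reshape(records, populations)
for day, row in table.items():
    print(day, row)
```

The row for 2020-03-02 exists even though no record mentions that date, which mirrors what the `pd.date_range` index does in the dataframe version.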


## Download current data

An option to refresh the dataset from the API is available by clicking the blue "Refresh Data" button below. The blue button:

* calls the code that accesses the API and downloads current data in a few seconds;
* redraws the graph when current data is downloaded successfully.

When the above process runs smoothly, the button is disabled and labelled 'Success'.

If the download fails for any reason, e.g. the API is unavailable or there is no internet connection, the blue button turns orange and is labelled 'Unavailable'. You can click the orange button again later; if the connection issues are resolved and the data is downloaded, the button turns blue again, is disabled and is labelled 'Success'. In the latter case the graph is updated accordingly.

The graph functionality and instructions remain the same as for the initial graph drawn from file data, as explained above.
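
The behaviour described above, i.e. keep the existing data when the download fails and report the outcome, can be sketched in plain Python. This is a simplified illustration rather than the dashboard's code: `refresh`, `failing_fetch` and `file_data` are hypothetical names, and the `fetch` argument stands in for the real API call.

```python
def refresh(fetch, current_data):
    """Try to download fresh data; keep the current data if the download fails.

    `fetch` is any zero-argument callable that returns the new data or raises.
    Returns (data, status), where status is 'Success' or 'Unavailable'.
    """
    try:
        new_data = fetch()
    except Exception:
        # Download failed: keep what we already have and report it
        return current_data, "Unavailable"
    return new_data, "Success"

def failing_fetch():
    """Simulates an unreachable API."""
    raise ConnectionError("no internet connection")

file_data = {"data": []}  # stands in for the data loaded from the file

print(refresh(failing_fetch, file_data)[1])          # prints Unavailable
print(refresh(lambda: {"data": [1]}, file_data)[1])  # prints Success
```

Returning the status alongside the data keeps the download logic separate from the button styling, so the same function could be tested without any widgets.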


In [None]:
# API access code is placed in this function. Do not call this function directly; it will be called by 
# the button callback. 
def access_api():
    """ Accesses the PHE API. Returns raw data in the same format as data loaded from the "canned" JSON file,
    or None if the data could not be downloaded. """
    # Polling API and getting the data in json format
    filters = ['areaType=nation']
    structure={"areaName":"areaName","date":"date", "cases":"newCasesByPublishDate"}
    api=Cov19API(filters=filters, structure=structure)
    try:
        apinewdata=api.get_json()  ### NOTE: this call polls the server.
    except Exception:
        # the download failed: show this on the button and tell the caller by returning None
        apibutton.button_style='warning'
        apibutton.icon='unlink'
        apibutton.description = "Unavailable"
        return None
    return apinewdata # return data read from the API


In [None]:
# Printout from this function will be lost in Voila unless captured in an
# output widget - therefore, we give feedback to the user by changing the 
# appearance of the button as explained in the text cells
def api_button_callback(button):
    """ Button callback - it must take the button as its parameter (unused in this case).
    Accesses API, wrangles data, updates global variable df used for plotting. """
    global df
    # Get fresh data from the API; access_api handles errors and returns None on failure
    apidata=access_api()
    if apidata is None:
        # keep the existing data; the button stays enabled and shows 'Unavailable'
        return
    
    # wrangle the data and overwrite the dataframe for plotting
    df=wrangle_data(apidata)
    
    # the graph won't refresh until the user interacts with the widget.
    # this function simulates the interaction see cells below
    refresh_graph()
    
    # after all is done, we modify the button as explained in the text cells
    apibutton.icon="check-circle"
    apibutton.button_style='info'
    apibutton.description='Success'
    apibutton.tooltip="Data has been downloaded"
    apibutton.disabled=True

    
apibutton=wdg.Button(
    description='Refresh Data', 
    disabled=False,
    button_style='info', # 'success', 'info', 'warning', 'danger' or ''
    tooltip="Click to download current Public Health England data",
    # FontAwesome names without the `fa-` prefix - try "download"
    icon='download'
)

#link button with function
apibutton.on_click(api_button_callback) # the name of your function inside these brackets

display(apibutton)


In [None]:
####widget####
    
series=wdg.SelectMultiple(
    options=['Englandcases', 'Scotlandcases', 'NorthernIrelandcases', 'Walescases'],
    value=['Englandcases', 'Scotlandcases', 'NorthernIrelandcases', 'Walescases'],
    rows=4,
    description='Nation(s):',
    disabled=False
)

####plotting the graph####

def plot_df(gcols):
    ncols=len(gcols)
    if ncols>0:
        ax=df[list(gcols)].plot(linewidth=1.2)
        # label the axis of this plot only, so that no empty figure is drawn when nothing is selected
        ax.set_ylabel("newCases as % nation population") ## added for clarity
        print("Click to select one nation")
        print("CTRL-Click to select more than one nation")
        print("CTRL-Click to deselect a nation")
        
    else:
        print("Click to select data for graph")
        print("CTRL-Click to select more than one nation")
    
# keep calling plot_df(gcols=value_of_series); capture output in variable graph   #####
graph=wdg.interactive_output(plot_df, {'gcols': series})

display(series, graph)

### force refresh the graph when API data refresh is done####
def refresh_graph():
    """ We change the value of the widget in order to force a redraw of the graph;
    this is useful when the data have been updated. This is a bit of a gimmick. """
    current=list(series.value) # series.value is a tuple, so convert before comparing with a list
    if current!=[series.options[0],series.options[1]]:
        other=[series.options[0],series.options[1]]
    else:
        other=[series.options[2],series.options[3]]
    series.value=other # forces the redraw
    # then select all nations again, as in the initial graph
    series.value=['Englandcases', 'Scotlandcases', 'NorthernIrelandcases', 'Walescases']
    


**Data sources:** *Based on UK Government published data by  [Public Health England](https://www.gov.uk/government/organisations/public-health-england) and [Office for National Statistics](https://www.ons.gov.uk)* 