[DIY Disease Tracking Dashboard Kit](https://github.com/fsmeraldi/diy-covid19dash) (C) Fabrizio Smeraldi, 2020,2024 ([f.smeraldi@qmul.ac.uk](mailto:f.smeraldi@qmul.ac.uk) - [web](http://www.eecs.qmul.ac.uk/~fabri/)). This notebook is released under the [GNU GPLv3.0 or later](https://www.gnu.org/licenses/).

# COVID Tracking Dashboard

In [57]:
from IPython.display import clear_output
import ipywidgets as wdg
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import requests
import time
import json

In [58]:
%matplotlib inline
# make figures larger
plt.rcParams['figure.dpi'] = 100

## Extracting Public Health Care Data

During the pandemic, Public Health England (PHE) launched a COVID-19 dashboard. This timely service came with an Application Programming Interface (API) that gave users programmatic access to the data for creating visualisations or carrying out data analysis. It also included a wrapper library written in Python that made access to the data seamless. At the end of 2023, the PHE dashboard was replaced by the UK Health Security Agency (UKHSA) dashboard. The new API, in the Beta stage at the time of writing, includes data on various infectious diseases, including respiratory and gastrointestinal infections, bloodstream infections, and vaccine-preventable diseases. The data are better organised and documented, and many of the quirks of the old API have been fixed. An interesting feature of the new system is that all of its code has been open-sourced.

The data were extracted from the UKHSA dashboard using an API wrapper that returns the raw data in JSON format. The metric shown is the 7-day rolling mean of new cases per 100,000 population, for the 7-day period ending on each date shown. New cases are reported by specimen date, that is, the date on which the first sample that identified the infection was taken from an individual. We explore the rolling mean of cases across different age groups for the year 2020.
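To make the definition of the metric concrete, here is a minimal sketch of a 7-day rolling mean expressed per 100,000 population. The function name `rolling_mean_rate`, the case counts and the population figure are all made up for illustration; this shows the arithmetic of the definition and is not necessarily how UKHSA computes the published values.

```python
def rolling_mean_rate(daily_cases, population, window=7):
    """Return the rolling mean of daily cases per 100,000 population.

    Entry i is the mean over the `window` days ending on day i. The first
    window-1 days have no complete window and are reported as None.
    """
    rates = []
    for i in range(len(daily_cases)):
        if i + 1 < window:
            rates.append(None)
            continue
        window_cases = daily_cases[i + 1 - window : i + 1]
        mean_cases = sum(window_cases) / window
        # Scale the mean to a rate per 100,000 people
        rates.append(mean_cases * 100_000 / population)
    return rates


# Made-up numbers for illustration only
cases = [10, 20, 30, 40, 50, 60, 70, 80]
rates = rolling_mean_rate(cases, population=1_000_000)
```

With these numbers, the first complete window ends on the seventh day, so the first six entries of `rates` are `None`.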

## Load initial data from disk

The raw data are loaded and assigned to the variable `jsondata`.

In [59]:
with open("RollingMeancases.json", "rt") as INFILE:
    jsondata = json.load(INFILE)
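Before wrangling, it can help to check that every record carries the fields the rest of the notebook relies on (`date`, `age` and `metric_value`). The following is a small sketch of such a check; the helper `find_incomplete_records` and the sample records are hypothetical and are not part of the original kit.

```python
REQUIRED_FIELDS = ("date", "age", "metric_value")


def find_incomplete_records(records):
    """Return the positions of records missing any field used later on."""
    return [
        i
        for i, record in enumerate(records)
        if not all(field in record for field in REQUIRED_FIELDS)
    ]


# Hypothetical sample in the shape the wrangling code expects
sample = [
    {"date": "2020-03-02", "age": "00-04", "metric_value": 0.1},
    {"date": "2020-03-02", "age": "all"},  # metric_value is missing
]
bad = find_incomplete_records(sample)
```

In the notebook, the same check could be run as `find_incomplete_records(jsondata)`; an empty list would mean that every record has the three fields.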

## Wrangle the data

The function `wrangle_data` takes the raw JSON data and wrangles it into a data frame that is used for plotting.

It first extracts the values of interest (date, age group and metric value) from each record and collects them in a dictionary keyed by date, where each value is an inner dictionary mapping age groups to metric values. The dates are then sorted and the first and last are parsed, so that a continuous date range can be built even if some dates are missing from the data.

The age groups are collected from the inner dictionaries and sorted to become the columns of the data frame. The index is the date range from the first to the last date with a weekly spacing (`W-MON`, that is, every Monday). The data frame is filled in by looping over the dates and copying the corresponding values; dates that do not fall on the weekly index are skipped, and missing values are set to 0.


In [60]:
from datetime import datetime

def parse_date(s):
    return datetime.strptime(s, "%Y-%m-%d")

def wrangle_data(rawdata):
    data={}
    for entry in rawdata:
        date=entry['date']
        age=entry['age']
        value=entry['metric_value']
        if date not in data:
            data[date]={}
        data[date][age]=value

    dates = list(data.keys())
    dates.sort()
    startdate = parse_date(dates[0])
    enddate = parse_date(dates[-1])

    age=[]
    for entry in data.values():
        for x in entry.keys():
            if x not in age:
                age.append(x)
    age.sort()

    index=pd.date_range(startdate, enddate, freq='W-MON')
    df=pd.DataFrame(index=index, columns=age)

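    for date, entry in data.items():
        pd_date=parse_date(date)
        # only dates that fall on the weekly (Monday) index are kept
        if pd_date in df.index:
            for column in entry.keys():
                df.loc[pd_date, column]=entry[column]

    # convert to numeric first, then treat missing values as 0.0
    df = df.apply(pd.to_numeric, errors='coerce')
    df = df.fillna(0.0)
    return df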
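As a point of comparison, the same long-to-wide reshaping can be sketched with pandas' own `pivot` method. This is an alternative under stated assumptions, not the method used in this notebook: the sample records are made up, and unlike `wrangle_data` this version keeps every date present in the data instead of reindexing to a weekly Monday range.

```python
import pandas as pd

# Hypothetical records in the same shape as the raw JSON entries
records = [
    {"date": "2020-03-02", "age": "00-04", "metric_value": 0.5},
    {"date": "2020-03-02", "age": "all", "metric_value": 1.5},
    {"date": "2020-03-09", "age": "00-04", "metric_value": 0.7},
]

raw = pd.DataFrame.from_records(records)
raw["date"] = pd.to_datetime(raw["date"])

# One row per date, one column per age group; absent combinations become NaN
wide = raw.pivot(index="date", columns="age", values="metric_value")

# Treat missing values as 0.0, as the notebook's function does
wide = wide.sort_index().fillna(0.0)
```

The explicit loop in `wrangle_data` is longer but makes each step visible, which is arguably the better choice for a teaching notebook.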


In [61]:
df=wrangle_data(jsondata)

## Download current data

The function `access_api` is intended to query the API, and a button gives users the option to refresh the dataset when needed. The button callback calls `access_api` to download fresh raw data, wrangles it into a data frame and updates the corresponding global variable.

The ipywidgets library provides a `Button` class that implements a clickable button. A callback is a function that is passed as a parameter to the `on_click` method of the `Button` object, which in turn calls it whenever the button is clicked. The callback function is passed the `Button` object itself as a parameter (which is useful if, for instance, more than one button shares the same callback).
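The callback mechanism described above can be illustrated without any widgets at all. In the sketch below, `FakeButton` is a hypothetical stand-in written for this example; it is not the ipywidgets class and only mimics the registration and calling of callbacks, so that the example runs outside a notebook.

```python
class FakeButton:
    """Hypothetical stand-in that mimics only the callback mechanism."""

    def __init__(self, description):
        self.description = description
        self._handlers = []

    def on_click(self, callback):
        # Register the callback; nothing is called yet
        self._handlers.append(callback)

    def click(self):
        # Simulate a click: each handler receives the button itself
        for handler in self._handlers:
            handler(self)


clicked = []


def shared_callback(button):
    # The parameter tells us which button triggered the call
    clicked.append(button.description)


refresh = FakeButton("Refresh data")
reset = FakeButton("Reset")
refresh.on_click(shared_callback)
reset.on_click(shared_callback)

refresh.click()
reset.click()
```

After the two simulated clicks, `clicked` holds the descriptions of both buttons in the order they were clicked, which shows why receiving the button as a parameter is useful when a callback is shared.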

In [62]:
def access_api():
    """ Placeholder: returns the data already loaded from disk rather than querying the API. """
    return jsondata
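A version of `access_api` that really downloads data would most likely have to follow a paginated response. The sketch below shows one way to do that under assumptions that should be checked against the API documentation: that each page is a JSON object with a `results` list and a `next` URL that is `None` on the last page. The helper name `fetch_all_pages` and the fake pages are hypothetical; the function that performs the request is passed in, so the example runs offline.

```python
def fetch_all_pages(get_json, first_url, max_pages=100):
    """Collect the records from every page of a paginated JSON API.

    get_json is any function mapping a URL to a parsed JSON dictionary,
    for example: lambda url: requests.get(url, timeout=10).json()
    ASSUMPTION: each page has a "results" list and a "next" URL
    (None on the last page).
    """
    records = []
    url = first_url
    pages = 0
    while url is not None and pages < max_pages:
        page = get_json(url)
        records.extend(page["results"])
        url = page.get("next")
        pages += 1
    return records


# Offline demonstration with made-up pages instead of a real server
fake_pages = {
    "page1": {
        "results": [{"date": "2020-03-02", "age": "all", "metric_value": 1.0}],
        "next": "page2",
    },
    "page2": {
        "results": [{"date": "2020-03-09", "age": "all", "metric_value": 2.0}],
        "next": None,
    },
}
records = fetch_all_pages(fake_pages.__getitem__, "page1")
```

The `max_pages` limit is a simple safeguard against following `next` links forever if the server misbehaves.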

In [63]:
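def refresh_graph():
    """ Force a redraw of the first graph by briefly switching the dropdown
    to another option and back (age_selector is defined further down). """
    current=age_selector.value
    options=list(age_selector.options)
    other=options[0] if current!=options[0] else options[-1]
    age_selector.value=other # forces the redraw
    age_selector.value=current # now we can change it back

def api_button_callback(button):
    """ Button callback - it must take the button as its parameter (unused in this case).
    Accesses API, wrangles data, updates global variable df used for plotting. """
    apidata=access_api()
    global df
    df=wrangle_data(apidata)
    refresh_graph()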

   
apibutton=wdg.Button(
    description='Refresh data',
    disabled=False,
    button_style='',
    tooltip='Click to download current UKHSA data',
    icon='download')

apibutton.on_click(api_button_callback) 

display(apibutton)



## Graphs and Analysis

The first graph is an interactive plot showing the rolling mean case rate on the Y axis against the date on the X axis for the year 2020. The user can select the age group to display.

The second graph uses the same data frame, which is saved to a pickle file and reloaded. For the selected month, it shows each age group's percentage share of the weekly mean as a stacked horizontal bar chart.

In [64]:
age_selector = wdg.Dropdown(
    options=sorted(df.columns),
    value='all',
    description='Age Group:',
    disabled=False
)

# Plot function
def plot_age_group(selected_age):
    plt.figure(figsize=(12, 6))
    df[selected_age].plot(color='blue', label=selected_age)
    plt.title(f"Rolling Mean COVID-19 Cases for Age Group: {selected_age}")
    plt.xlabel("Date")
    plt.ylabel("Rolling Mean Case Rate")
    plt.grid(True)
    plt.legend()
    plt.tight_layout()
    plt.show()

graph = wdg.interactive_output(plot_age_group, {'selected_age': age_selector})

display(age_selector, graph)


In [71]:
df.to_pickle("RollingMeandf.pkl")

In [72]:
rollingmeandf=pd.read_pickle("RollingMeandf.pkl")

In [73]:
rollingmeandf = rollingmeandf.apply(pd.to_numeric, errors='coerce',axis=0)

In [74]:
import calendar 

month_numbers = sorted(rollingmeandf.index.month.unique())
month_names = [calendar.month_name[m] for m in month_numbers]
month_map = dict(zip(month_names, month_numbers))

month=wdg.Select(
    options=month_names, 
    value=month_names[-1], 
    rows=1, 
    description='Month',
    disabled=False
)

def rollingmean_graph(graphmonth):
    graphmonth_num = month_map[graphmonth]

    monthdf=rollingmeandf[rollingmeandf.index.month==graphmonth_num]
    weekly= monthdf.groupby(pd.Grouper(freq='1W')).mean() 
    totals=weekly.sum(axis=1) 
    weekly=weekly.div(totals, axis=0)*100
    weekly = weekly[::-1]
    ax=weekly.plot(kind='barh', stacked=True,cmap='tab20')
    ax.legend(loc='center left',bbox_to_anchor=(1.0, 0.5))
    ax.set_yticklabels(weekly.index.strftime('%Y-%m-%d'))
    plt.show()
    
output=wdg.interactive_output(rollingmean_graph, {'graphmonth': month})

display(month, output)
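The normalisation step inside `rollingmean_graph` (dividing each row by its total and multiplying by 100) can be seen in isolation on a tiny data frame. The values and the two age groups below are made up for illustration.

```python
import pandas as pd

# Made-up weekly values for two age groups
toy = pd.DataFrame(
    {"00-04": [1.0, 2.0], "05-09": [3.0, 6.0]},
    index=pd.to_datetime(["2020-03-02", "2020-03-09"]),
)

# Total across age groups for each row (week)
totals = toy.sum(axis=1)

# Each group as a percentage of its row total; axis=0 aligns totals by row
shares = toy.div(totals, axis=0) * 100
```

Each row of `shares` adds up to 100, which is why every bar in the stacked chart has the same length.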


**Author and License** Remember that if you deploy your dashboard as a Binder it will be publicly accessible. Change the copyright notice and take credit for your work! Also acknowledge your sources and the conditions of the license by including this notice: "Based on UK Government [data](https://ukhsa-dashboard.data.gov.uk/) published by the [UK Health Security Agency](https://www.gov.uk/government/organisations/uk-health-security-agency) and on the [DIY Disease Tracking Dashboard Kit](https://github.com/fsmeraldi/diy-covid19dash) by Fabrizio Smeraldi. Released under the [GNU GPLv3.0 or later](https://www.gnu.org/licenses/)."