![Noteable.ac.uk Banner](https://github.com/jstix/mr-noteable/blob/master/Banner%20image/1500x500.jfif?raw=true)

# Noteable Data Visualization Example 


## Plotting Daily COVID-19 Cases per Country on a Logarithmic Scale


### The number of confirmed COVID-19 cases grew exponentially

Since the beginning of the COVID-19 outbreak, health authorities have been tracking the total number of confirmed cases on a daily basis. Health authorities noticed [exponential growth](https://en.wikipedia.org/wiki/Exponential_growth) during the first few weeks in which COVID-19 cases were confirmed across the globe. This observation prompted social distancing measures to reduce the number of new COVID-19 cases.


## Question:

### How did health authorities identify the increase in COVID-19 cases as exponential growth? 

They observed how long it took for the number of cases to double over a period of a few weeks or more. If the doubling time stays roughly constant, the growth is exponential. Logarithmic scale charts can help us determine how long it took for the number of COVID-19 cases to double. In this notebook, we will apply the base-2 logarithm to the average number of confirmed COVID-19 cases.
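
To make the link between doubling and the base-2 logarithm concrete, here is a minimal sketch using made-up numbers (not real COVID-19 data). It assumes a series that doubles every 3 days; on a base-2 log scale such a series rises by 1 per doubling, so the doubling time can be read off the slope.

```python
import numpy as np

# Hypothetical series: 100 cases on day 0, doubling every 3 days (illustrative only)
days = np.arange(0, 13)
cases = 100 * 2.0 ** (days / 3)

# On a base-2 log scale, each increase of 1 corresponds to one doubling
log2_cases = np.log2(cases)

# Slope of log2(cases) against time = doublings per day
doublings_per_day = (log2_cases[-1] - log2_cases[0]) / (days[-1] - days[0])
doubling_time = 1 / doublings_per_day

print(f"Doubling time: {doubling_time:.1f} days")
```

With real data the slope will not be perfectly constant, which is why the notebook looks at an averaged series rather than raw counts.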





## Gather 

Let's take a look at some data[1].

[1] COVID-19 Data Repository by the Center for Systems Science and Engineering (CSSE) at Johns Hopkins University https://github.com/CSSEGISandData/COVID-19
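
As a rough illustration of what this kind of data looks like, the sketch below builds a tiny table in the wide layout that the repository's global time-series files are assumed to use (columns `Province/State`, `Country/Region`, `Lat`, `Long`, then one column per date). The country names and numbers are invented, and this is not the code in `download_and_parse_data.py`.

```python
import io
import pandas as pd

# Toy data in the assumed wide layout (all names and numbers are invented)
csv_text = """Province/State,Country/Region,Lat,Long,1/22/20,1/23/20,1/24/20
,Exampleland,10.0,20.0,1,2,4
North,Samplestan,30.0,40.0,0,1,1
South,Samplestan,31.0,41.0,2,2,5
"""

raw = pd.read_csv(io.StringIO(csv_text))

# Sum provinces within each country, keeping only the date columns
date_cols = list(raw.columns[4:])
by_country = raw.groupby("Country/Region")[date_cols].sum()

# Transpose so that each row is a date and each column is a country
cumulative = by_country.T
cumulative.index = pd.to_datetime(cumulative.index, format="%m/%d/%y")

print(cumulative)
```

Reshaping to one row per date makes the later steps (daily differences, rolling averages) one-line operations on a column.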

## Organize

The file `download_and_parse_data.py` contains Python code that imports the Python programming libraries we need to gather and organize the data to answer our question. The code also manipulates, cleans and presents the data so that it is easier to work with. 

You can see the full script [here](https://github.com/callysto/data-viz-of-the-week/blob/main/covid-19-visualizations/scripts/download_and_parse_data.py).

Run the cell below to execute this code. 

In [1]:
!pip install pycountry-convert
%run /home/jovyan/Data/download_and_parse_data.py
print("Success!")

Collecting pycountry-convert
  Downloading pycountry_convert-0.7.2-py3-none-any.whl (13 kB)
Collecting pytest>=3.4.0
  Downloading pytest-6.2.2-py3-none-any.whl (280 kB)
Collecting pycountry>=16.11.27.1
  Downloading pycountry-20.7.3.tar.gz (10.1 MB)
Collecting pytest-mock>=1.6.3
  Downloading pytest_mock-3.5.1-py3-none-any.whl (12 kB)
Collecting repoze.lru>=0.7
  Downloading repoze.lru-0.7-py3-none-any.whl (10 kB)
Collecting pytest-cov>=2.5.1
  Downloading pytest_cov-2.11.1-py2.py3-none-any.whl (20 kB)
Collecting pprintpp>=0.3.0
  Downloading pprintpp-0.4.0-py2.py3-none-any.whl (16 kB)
Collecting py>=1.8.2
  Downloading py-1.10.0-py2.py3-none-any.whl (97 kB)
Collecting iniconfig
  Downloading iniconfig-1.1.1-py2.py3-none-any.whl (5.0 kB)
Collecting pluggy<1.0.0a1

ERROR:root:File `'/home/jovyan/Data/download_and_parse_data.py'` not found.


Success!


## Explore 

The code below will be used to help us find evidence to answer our question. This can involve looking at data in table format, applying math and statistics, and creating different types of visualizations.

In this case, we will start by looking at the daily number of confirmed COVID-19 cases in the United Kingdom. 

Run the cell below to generate a visualization. 

In [2]:
try:
    print("Generating plot. Please wait.")
    # Select country
    country = "United Kingdom"
    # Subset the data for the selected country and transpose it so that
    # each row is a date and each column is a province
    by_prov = final_confirmed[final_confirmed.index == country].set_index("province").T.iloc[:-4,]
    # Sum over provinces to get the running total for the whole country
    by_prov["TotalDailyCase"] = by_prov.sum(axis=1)
    # The day-to-day difference of the running totals gives the daily new cases
    non_cumulative_cases = by_prov.diff(axis=0)
    # Create figure
    trace3 = go.Scatter(x=non_cumulative_cases.index, y=non_cumulative_cases["TotalDailyCase"])
    layout = go.Layout(
        title="Daily number of COVID-19 reported cases in " + str(country),
        yaxis=dict(title=dict(text="Daily number of COVID-19 reported cases in " + str(country),
                              font=dict(color="blue")),
                   tickfont=dict(color="blue")),
        showlegend=False)
    fig = go.Figure(data=[trace3], layout=layout)
    # Display figure
    fig.show()
except NameError:
    # Sanity check: final_confirmed and go are expected to be defined by download_and_parse_data.py
    print("WARNING")
    print("Please ensure you have run code cell '%run -i ./scripts/download_and_parse_data.py' in this notebook.")

Generating plot. Please wait.
Please ensure you have run code cell '%run -i ./scripts/download_and_parse_data.py' in this notebook.


## Interpret

We can see that in the UK the daily number of reported COVID-19 cases rose between March and May 2020 and decreased between May and July. Between August and November 2020, the number of infections increased rapidly once more, dipped slightly, and then reached its highest peak in January 2021.

The graph above shows the step-wise difference in reported COVID-19 cases. That is, for each day it takes the total number of reported cases and subtracts the total from the day before. We see "spikes" in this chart, particularly around November and January.
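
The step-wise difference described above can be sketched with pandas `diff()`. The numbers here are invented for illustration and are not taken from the dataset.

```python
import pandas as pd

# Hypothetical cumulative case counts (illustrative numbers only)
cumulative = pd.Series(
    [10, 15, 25, 40, 40, 70],
    index=pd.date_range("2020-03-01", periods=6, freq="D"),
)

# Daily new cases: each day's total minus the previous day's total
daily = cumulative.diff()

print(daily)
```

The first value is `NaN` because there is no previous day to subtract from, and a day with no change in the total (here the fifth day) shows up as zero new cases.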

In the next chart we will take a rolling average, which will smooth out those spikes. We will also add the logarithmic scale (base 2) to help us see how many days it took for the number of reported COVID-19 cases to double.
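
A minimal sketch of both steps, using an invented series with a weekend reporting dip. The 7-day window is an assumption chosen for illustration; the window used by the notebook's script is not stated here.

```python
import numpy as np
import pandas as pd

# Hypothetical daily case counts with a weekend reporting dip (illustrative only)
days = pd.date_range("2020-03-01", periods=28, freq="D")
trend = 50 * 2.0 ** (np.arange(28) / 7)                    # doubles every 7 days
weekday_effect = np.where(days.dayofweek >= 5, 0.6, 1.1)   # fewer reports at weekends
daily = pd.Series(trend * weekday_effect, index=days)

# A 7-day rolling mean (window length is an assumption) smooths the spikes
smoothed = daily.rolling(window=7).mean()

# Base-2 logarithm: an increase of 1 means the smoothed count has doubled
log2_smoothed = np.log2(smoothed.dropna())

print(log2_smoothed.round(2))
```

Because every 7-day window contains each weekday exactly once, the weekend dip averages out, and the smoothed series rises by 1 on the base-2 log scale every 7 days.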



## Explore

Run the cell below to display the log curve of COVID-19 cases.

<b>Notes: </b>

a) Enter the name of a country using the text box. This will display the log curve for the raw (daily) or cumulative number of cases in that country.

b) Toggle the "Get cumulative results" checkbox to see how the log scale changes between the daily number of cases and the cumulative number of cases.

c) To see results for a different country, use the backspace or delete key to remove the country you selected first, then type the name of a new country (the drop-down menu can help you find it).


In [3]:
try:
    # tab is the widget expected to be created by download_and_parse_data.py
    display(tab)
except NameError:
    # Sanity check
    print("WARNING")
    print("Please ensure you have run code cell '%run -i ./scripts/download_and_parse_data.py' in this notebook.")

Please ensure you have run code cell '%run -i ./scripts/download_and_parse_data.py' in this notebook.


## Interpret

The plots above show, in blue, the rolling average of the daily number of confirmed cases (top plot) and deaths (bottom plot). Each plot also includes the logarithmic scale (base 2).


___

### Further questions

Use the widget above to explore how the number of COVID-19 cases in countries in different parts of the world behaves on the log scale. Can you identify countries whose growth is exponential?

Explore the cumulative number of cases for different countries. These plots can also indicate when the number of infections is growing exponentially. Identify countries whose log scale indicates exponential growth. 
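
One way to make "the log scale indicates exponential growth" more precise is to check how close the base-2 log of a series is to a straight line. The sketch below is one possible check, not part of the notebook's script; `log2_fit` is a hypothetical helper and both series are invented.

```python
import numpy as np

def log2_fit(values):
    """Fit a straight line to log2(values); return (slope, r_squared)."""
    y = np.log2(np.asarray(values, dtype=float))
    x = np.arange(len(y))
    slope, intercept = np.polyfit(x, y, 1)
    residuals = y - (slope * x + intercept)
    r_squared = 1 - residuals.var() / y.var()
    return slope, r_squared

# Hypothetical series (illustrative only)
exponential = 20 * 2.0 ** (np.arange(15) / 5)   # doubles every 5 days
linear = 20 + 10 * np.arange(15)                # grows by 10 per day

for name, series in [("exponential", exponential), ("linear", linear)]:
    slope, r2 = log2_fit(series)
    print(f"{name}: doubling time = {1 / slope:.1f} days, R^2 = {r2:.3f}")
```

An R^2 very close to 1 suggests the series is close to a straight line on the log scale, which is what exponential growth looks like. A doubling time computed from a poor fit should not be read as meaningful.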



![Noteable license](https://github.com/jstix/mr-noteable/blob/master/Banner%20image/Screenshot%202021-03-05%20115453.png?raw=true)