![Callysto.ca Banner](https://github.com/callysto/curriculum-notebooks/blob/master/callysto-notebook-banner-top.jpg?raw=true)
<a href="https://hub.callysto.ca/jupyter/hub/user-redirect/git-pull?repo=https%3A%2F%2Fgithub.com%2Fcallysto%2Fdata-viz-of-the-week&branch=main&subPath=covid-19-visualizations/covid-19-log-curve-data-viz.ipynb&depth=1" target="_parent"><img src="https://raw.githubusercontent.com/callysto/curriculum-notebooks/master/open-in-callysto-button.svg?sanitize=true" width="123" height="24" alt="Open in Callysto"/></a>

# Callysto’s Weekly Data Visualization


## Plotting Daily COVID-19 Cases per Country on a Logarithmic Scale

### Recommended grade level: 9-12

#### Instructions:

Below is a link to a video walkthrough of how to interact with this notebook. Open the video in a new tab or window, then come back to the notebook.

[![Video here](https://img.youtube.com/vi/caJF5G2YLTE/0.jpg)](https://www.youtube.com/watch?v=caJF5G2YLTE)

Press the image above to play the tutorial.


Callysto's Weekly Data Visualization is a learning resource that helps Grades 5-12 teachers and students grow and develop data literacy skills. We do this by providing a data visualization, like a graph, and asking teachers and students to interpret it. This companion resource walks learners through how the data visualization is created and interpreted using the data science process. The steps of this process are listed below and applied to each weekly topic.

* Question - What are we trying to answer?
* Gather - Find the data source(s) you will need.
* Organize - Arrange the data so that you can easily explore it.
* Explore - Examine the data to look for evidence to answer our question. This includes creating visualizations.
* Interpret - Explain how the evidence answers our question.
* Communicate - Reflect on the interpretation.


In this notebook we will explore the logarithmic scale (also known as the log scale) and how it can help us determine whether a country is experiencing exponential growth in the number of reported COVID-19 cases.


### The number of confirmed COVID-19 cases grows exponentially

Since the beginning of the COVID-19 outbreak, health authorities have been tracking the total number of confirmed cases on a daily basis. Health authorities noticed [exponential growth](https://en.wikipedia.org/wiki/Exponential_growth) during the first few weeks in which COVID-19 cases were confirmed across Canada. This observation prompted social distancing measures to reduce the number of new COVID-19 cases.


## Question:

### How did health authorities identify the increase in COVID-19 cases as exponential growth? 

They observed how long it took for the number of cases to double over a period of a few weeks or more. Logarithmic scale charts can help us determine how long it takes for the number of COVID-19 cases to double. In this notebook, we will apply the base-2 logarithm to the average number of confirmed COVID-19 cases.
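
To see why base 2 is convenient, here is a minimal sketch using made-up numbers (not real COVID-19 data). If a quantity doubles every 3 days, its base-2 logarithm goes up by exactly 1 every 3 days, so on a log scale it follows a straight line. The starting value of 100 and the 3-day doubling time are assumptions chosen only for illustration.

```python
import numpy as np

# Hypothetical example: 100 cases on day 0, doubling every 3 days
doubling_time_days = 3
days = np.arange(0, 13, 3)  # days 0, 3, 6, 9, 12
cases = 100 * 2.0 ** (days / doubling_time_days)

# On a base-2 log scale, each doubling adds exactly 1
log2_cases = np.log2(cases)
steps = np.diff(log2_cases)

for day, n, l in zip(days, cases, log2_cases):
    print(f"day {day:2d}: {n:7.0f} cases, log2 = {l:.2f}")
```

Counting how many days it takes for the base-2 logarithm to rise by 1 gives the doubling time directly.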





## Gather 

Let's take a look at some data[1].

[1] COVID-19 Data Repository by the Center for Systems Science and Engineering (CSSE) at Johns Hopkins University https://github.com/CSSEGISandData/COVID-19

## Organize

The file `download_and_parse_data.py` contains Python code that imports the Python programming libraries we need to gather and organize the data to answer our question. The code also manipulates, cleans and presents the data so that it is easier to work with. 
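
As a rough idea of what "organizing" can involve, the sketch below reshapes a small, made-up table from a wide layout (one column per date) into a long layout (one row per region per date) and then totals the values per country. It does not reproduce `download_and_parse_data.py`; the column names `Province/State` and `Country/Region`, the date format, and all values are assumptions for illustration.

```python
import pandas as pd

# Hypothetical rows in a wide layout: one row per region, one column per date.
# The values are made up for illustration.
wide = pd.DataFrame({
    "Province/State": ["Alberta", "Ontario"],
    "Country/Region": ["Canada", "Canada"],
    "3/1/20": [0, 15],
    "3/2/20": [1, 18],
})

# Reshape to one row per region per date, then parse the date strings
tidy = wide.melt(
    id_vars=["Province/State", "Country/Region"],
    var_name="date",
    value_name="confirmed",
)
tidy["date"] = pd.to_datetime(tidy["date"], format="%m/%d/%y")

# Total per country per day
totals = tidy.groupby(["Country/Region", "date"])["confirmed"].sum()
print(totals)
```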

You can see the full script [here](https://github.com/callysto/data-viz-of-the-week/blob/main/covid-19-visualizations/scripts/download_and_parse_data.py).

Run the cell below to execute this code. 

In [None]:
%pip install -q pyodide_http plotly nbformat requests pandas pycountry-convert ipywidgets
import pyodide_http
pyodide_http.patch_all()
%run -i ./scripts/download_and_parse_data.py
print("Success!")

## Explore 

The code below will be used to help us find evidence to answer our question. This can involve looking at data in table format, applying math and statistics, and creating different types of visualizations.

In this case, we will start by looking at the daily number of confirmed COVID-19 cases in Canada. 

Run the cell below to generate a visualization. 

In [None]:
try:
    print("Generating plot. Please wait.")
    # Select country
    country = "Canada"
    # Subset data to extract information for Canada, then transpose
    # so that each province becomes a column
    by_prov = final_confirmed[final_confirmed.index == country].set_index("province").T.iloc[:-4,]
    # Sum across provinces to get the national total for each date
    by_prov["TotalDailyCase"] = by_prov.sum(axis=1)
    # Day-over-day difference: this variable contains the daily COVID-19 cases
    non_cumulative_cases = by_prov.diff(axis=0)
    # Create figure
    axis_label = "Daily number of COVID-19 reported cases in " + str(country)
    trace3 = go.Scatter(x=non_cumulative_cases.index, y=non_cumulative_cases["TotalDailyCase"])
    layout = go.Layout(
        title=axis_label,
        # title=dict(text=..., font=...) replaces the deprecated `titlefont` argument
        yaxis=dict(title=dict(text=axis_label, font=dict(color="blue")),
                   tickfont=dict(color="blue")),
        showlegend=False)
    fig = go.Figure(data=[trace3], layout=layout)
    # Display figure
    fig.show()
except NameError:
    # Sanity check: the data and libraries are defined by the setup script
    print("WARNING")
    print("Please ensure you have run code cell '%run -i ./scripts/download_and_parse_data.py' in this notebook.")

##  Interpret

We can see that in Canada, COVID-19 spread rapidly from March through May 2020. Between May and July, the number of daily cases decreased. Between August and November 2020, the number of infections increased rapidly once more.

The graph above shows the day-to-day difference in reported COVID-19 cases. That is, it takes the cumulative number of reported cases on each day and subtracts the cumulative number reported the day before. We see "spikes" in this chart, particularly around November.
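
The subtraction described here is what the pandas `diff` method does. Below is a minimal sketch on made-up cumulative totals (not real data); the dates and values are assumptions for illustration.

```python
import pandas as pd

# Hypothetical cumulative totals for five consecutive days (not real data)
cumulative = pd.Series(
    [10, 15, 27, 40, 48],
    index=pd.date_range("2020-03-01", periods=5, freq="D"),
)

# diff() subtracts the previous day's total from each day's total.
# The first day has no previous day, so its result is NaN.
daily = cumulative.diff()
print(daily)
```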

In the next chart we will take a rolling average - this will allow us to smooth out those spikes. We will also add the logarithmic scale (base 2) to help us see how many days it took before the number of reported COVID-19 cases doubled.
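
Here is a minimal sketch of both steps on made-up daily counts (not real data). The 7-day window is an assumption; the window used by the notebook's script may differ.

```python
import numpy as np
import pandas as pd

# Hypothetical daily counts with a reporting "spike" on the fifth day (not real data)
daily = pd.Series([8, 8, 8, 8, 64, 8, 8, 8])

# 7-day rolling average; the window length of 7 is an assumption.
# The first six values are NaN because a full window is not yet available.
smoothed = daily.rolling(window=7).mean()

# Base-2 logarithm of the smoothed values
log2_smoothed = np.log2(smoothed)
print(pd.DataFrame({"daily": daily, "smoothed": smoothed, "log2": log2_smoothed}))
```

The spike of 64 is spread across the window, so the smoothed series rises to 16 instead of jumping to 64.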



## Explore

Run the cell below to display the log curve of COVID-19 cases.

<b>Notes: </b>

a) Enter the name of a country using the text box. This will display the log curve for raw or cumulative number of cases per country. 

b) Toggle the "Get cumulative results" checkbox to see how the log scale changes when we compute the daily number of cases, vs the cumulative number of cases. 

c) To see results for a different country, use the backspace or delete key to remove the currently selected country, then type the name of a new country (the drop-down menu can help you find it).


In [None]:
try:
    # `tab` is the interactive widget created by the setup script
    display(tab)
except NameError:
    # Sanity check
    print("WARNING")
    print("Please ensure you have run code cell '%run -i ./scripts/download_and_parse_data.py' in this notebook.")

## Interpret

The plots above show, in blue, the daily rolling average of confirmed cases (top plot) and deaths (bottom plot). Each plot also includes the logarithmic scale (base 2).

Run the cell below to display a video on interpreting the plots above. 

In [None]:
from IPython.lib.display import YouTubeVideo
YouTubeVideo('6sTZYxsqTMo', width=900, height=400)

___

### Further questions

Use the widget above to explore how the number of COVID-19 cases in countries in different parts of the world behaves on the log scale. Can you identify countries whose growth is exponential?

Explore the cumulative number of cases for different countries. These plots can also indicate when the number of infections is growing exponentially. Identify countries whose log scale indicates exponential growth. 
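
One way to check a curve numerically, sketched here on made-up data rather than real case counts, is to fit a straight line to the base-2 logarithm of the counts. If the fit is good, the slope is the number of doublings per day, and its reciprocal is the doubling time. The starting value of 50 and the 4-day doubling time are assumptions used to generate the example.

```python
import numpy as np

# Hypothetical cumulative case counts that double every 4 days (not real data)
days = np.arange(0, 21)
cases = 50 * 2.0 ** (days / 4)

# Fit a straight line to log2(cases) against time.
# The slope is measured in doublings per day.
slope, intercept = np.polyfit(days, np.log2(cases), 1)
doubling_time = 1 / slope
print(f"Estimated doubling time: {doubling_time:.1f} days")
```

With real data the points will not lie exactly on a line, so the estimate only makes sense over a period where the log curve looks roughly straight.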

<b>Notes: </b>

a) Enter the name of a country using the text box. This will display the log curve for raw or cumulative number of cases per country. 

b) Toggle the "Get cumulative results" checkbox to see how the log scale changes when we compute the daily number of cases, vs the cumulative number of cases. 

c) To see results for a different country, use the backspace or delete key to remove the currently selected country, then type the name of a new country (the drop-down menu can help you find it).


[![Callysto.ca License](https://github.com/callysto/curriculum-notebooks/blob/master/callysto-notebook-banner-bottom.jpg?raw=true)](https://github.com/callysto/curriculum-notebooks/blob/master/LICENSE.md)