# Exploring data on COVID-19

In this notebook we will explore and analyse some data on the advance of the COVID-19 pandemic. The goal is to produce a plot like this:

![plot](US_data.png)

Press Shift-Enter to evaluate a cell.

$y = x^2$

In [None]:
url = "https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/csse_covid_19_data/csse_covid_19_time_series/time_series_covid19_confirmed_global.csv"

In [None]:
url

In [None]:
typeof(url)

In [None]:
x = 3

In [None]:
typeof(x)

In [None]:
x * x

In [None]:
url * url

In [None]:
*

In [None]:
methods(*);  # remove `;` to see the output

In [None]:
(1 + 2im) * (4 + im)

In [None]:
@which (1 + 2im) * (3 + im)
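
The cells above suggest that `*` is not one operation but a generic function with many methods, and that Julia picks a method from the types of the arguments. A minimal sketch of that idea, using only Base Julia (the variable names are illustrative and not part of the notebook):

```julia
# `*` dispatches on the types of its arguments.
joined  = "data" * "frame"        # String * String concatenates
product = 3 * 4                   # Int * Int multiplies
z       = (1 + 2im) * (4 + im)    # complex multiplication

println(joined)    # dataframe
println(product)   # 12
println(z)
```

This is why `url * url` returns the URL repeated twice rather than raising an error.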

In [None]:
url

## Grab the data

In [None]:
download(url, "covid_data.csv")

In [None]:
readdir

In [None]:
readdir()

Install a package *ONCE* in our current Julia installation:

In [None]:
using Pkg   # built-in package manager in Julia: Pkg
Pkg.add("CSV")   # calls the `add` function from the module Pkg.  This installs a package

Load a package every time we run a Julia session: 

In [None]:
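using CSV, DataFrames   # CSV: Comma Separated Values; DataFrames provides the table type

In [None]:
CSV.read("covid_data.csv", DataFrame)  # run the function `read` from the package CSV; recent CSV.jl versions need the output type as second argument

In [None]:
data = CSV.read("covid_data.csv", DataFrame);   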

In [None]:
data

In [None]:
typeof(data)

In [None]:
using DataFrames

In [None]:
data_2 = rename(data, 1 => "province", 2 => "country")   

In [None]:
rename!(data, 1 => "province", 2 => "country") # ! is convention: function *modifies* its argument in place
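
The difference between `rename` and `rename!` follows a general Julia naming convention. A small sketch of the same convention with the Base functions `sort` and `sort!` (the vector `v` is a made-up example, not data from the notebook):

```julia
v = [3, 1, 2]

w = sort(v)    # returns a new sorted vector; `v` is left untouched
unchanged = (v == [3, 1, 2])

sort!(v)       # sorts `v` itself, in place

println(w)
println(v)
```

The `!` is only a convention in the function name; it is not enforced by the language.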

In [None]:
data

In [None]:
DataFrames.rename!   # fully qualified name: module name, a dot, then the function name

Ctrl-M, Y switches a cell to code.

Ctrl-M, M switches a cell to **markdown**.

In Jupyter, press Esc instead of Ctrl-M.



## Interact.jl: Simple interactive visualizations

In [None]:
using Interact

In [None]:
for i in 1:10
    @show i
end

In [None]:
typeof(1:10)

In [None]:
collect(1:10)

In [None]:
for i in 1:10
    println("i = ", i)
end

In [None]:
for i in 1:10
    i
end

In [None]:
@manipulate for i in 1:10
    HTML(i^2)
end

In [None]:
countries = data[2:5, 2]

In [None]:
countries = data[1:end, 2]

In [None]:
countries = collect(data[:, 2])

In [None]:
unique_countries = unique(countries)

In [None]:
@manipulate for i in 1:length(countries)
    countries[i]
end

Julia has 1-based indexing: indices of vectors start at 1, not 0 
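
A quick sketch of what 1-based indexing looks like in practice. The vector `sample` is a hypothetical stand-in for `countries`, so that the example runs without the downloaded file:

```julia
sample = ["Afghanistan", "Albania", "Algeria"]   # hypothetical sample values

first_item = sample[1]      # the first element lives at index 1
last_item  = sample[end]    # `end` is shorthand for the last index

println(firstindex(sample), " ", lastindex(sample))
```

Asking for `sample[0]` throws a `BoundsError`.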

In [None]:
@manipulate for i in 1:length(countries)
    data[i, 1:15]
end

## Extract data and plot

In [None]:
startswith("United", "U")

In [None]:
startswith("David", "U")

Array comprehension:

In [None]:
U_countries = [startswith(country, "U") for country in countries];
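
The comprehension above builds a vector of `true`/`false` values, one per country. A self-contained sketch of the same pattern, with a hypothetical `sample` vector in place of the real `countries`:

```julia
sample = ["US", "Uganda", "France", "Ukraine"]   # hypothetical sample values

# One Bool per element: does the name start with "U"?
flags = [startswith(name, "U") for name in sample]

# A Bool vector can be used as an index to keep the matching elements.
selected = sample[flags]

# Comprehensions work with any expression, not only Bools.
squares = [i^2 for i in 1:5]
```

Indexing with a vector of Bools is the same mechanism that `data[U_countries, :]` relies on to pick rows.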

In [None]:
data[U_countries, :]

In [None]:
countries == "US"

In [None]:
countries .== "US"  
# . is "broadcasting": apply operation to each element of a vector
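
To make the contrast between `==` and `.==` concrete, here is a small sketch on a hypothetical vector `sample` (not the real `countries` column):

```julia
sample = ["US", "France", "US"]   # hypothetical sample values

whole = (sample == "US")     # compares the whole vector with one string: false
mask  = (sample .== "US")    # compares element by element

println(whole)
println(mask)

# The dot works with any function or operator.
roots = sqrt.([1, 4, 9])
```
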

In [None]:
US_row = findfirst(countries .== "US")

In [None]:
US_data_row = data[US_row, :]

In [None]:
US_data = Vector(US_data_row[5:end])  # turn the row into a plain vector (older DataFrames versions used `convert(Vector, ...)`)

In [None]:
using Plots

In [None]:
plot(US_data)

In [None]:
col_names = names(data)

In [None]:
date_strings = String.(names(data))[5:end];  # apply String function to each element

Parse: convert a string representation into a Julia object:

In [None]:
date_strings[1]

In [None]:
using Dates

In [None]:
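format = Dates.DateFormat("d/m/Y")   # first attempt: day/month/year

In [None]:
parse(Date, date_strings[1], format)   # fails if the string is really month/day/year, e.g. a "month" of 22 is out of range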

In [None]:
format = Dates.DateFormat("m/d/Y")

In [None]:
parse(Date, date_strings[1], format) + Year(2000)   # a two-digit year like "20" is read as the year 20, so shift it by 2000 years

In [None]:
dates = parse.(Date, date_strings, format) .+ Year(2000)
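
The parsing steps can be tried without the data set. This sketch assumes the column names look like `"1/22/20"` (month/day/two-digit year); the strings in `raw` are made up for illustration:

```julia
using Dates

fmt = DateFormat("m/d/Y")
raw = ["1/22/20", "1/23/20", "2/1/20"]   # hypothetical strings in the assumed format

parsed = parse.(Date, raw, fmt)   # the two-digit year is read literally as the year 20
fixed  = parsed .+ Year(2000)     # shift into the 21st century

println(fixed)
```
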

In [None]:
plot(dates, US_data, xticks=dates[1:5:end], xrotation=45, leg=:topleft, 
    label="US data", m=:o)

xlabel!("date")
ylabel!("confirmed cases in US")
title!("US confirmed COVID-19 cases")

# annotate!(20, US_data[end], text("US", :blue, :left))

In [None]:
plot(dates, US_data, xticks=dates[1:5:end], xrotation=45, leg=:topleft, 
    label="US data", m=:o,
    yscale=:log10)

xlabel!("date")
ylabel!("confirmed cases in US")
title!("US confirmed COVID-19 cases")

# annotate!(20, US_data[end], text("US", :blue, :left))

A straight line on a semi-log scale means **exponential growth**!
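
The reason is that taking the logarithm of $c \cdot 2^t$ gives $\log_{10} c + t \log_{10} 2$, which is linear in $t$. A short numerical sketch with made-up numbers (a hypothetical count that starts at 100 and doubles every day, not the real US data):

```julia
t = 0:5
cases = 100 .* 2 .^ t        # hypothetical: 100, 200, 400, ...

logs  = log10.(cases)        # what a log10 y-axis effectively plots
steps = diff(logs)           # increase from one day to the next

# For exponential growth every step is the same size, i.e. a straight line.
constant_steps = all(isapprox.(steps, log10(2)))
println(constant_steps)
```
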

In [None]:
function f(country)
    return country * country
end

In [None]:
f("US")

`plot!` adds a new curve to the same graph.
