# Coronavirus spreads

https://towardsdatascience.com/interactive-data-visualization-for-exploring-coronavirus-spreads-f33cabc64043

The data used for these visualizations was provided by the Johns Hopkins University Center for Systems Science and Engineering (JHU CSSE), which shares its data in a public GitHub repository: https://github.com/CSSEGISandData/COVID-19

Data source: https://gist.githubusercontent.com/BindiChen/cd911af6eb6205038d3aff036d20fb20/raw/736a886be20f10720493fda40dca7365fe3fbe87/covid_19_clean_complete_06_Apr_2020.csv

In [1]:
import pandas as pd
import altair as alt

In [2]:
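# Load data from the data source URL listed above.
# The original local path 'data/covid_19_clean_complete.csv' raised FileNotFoundError.
# Assumption: the hosted CSV has the same columns as the local copy.
data_url = 'https://gist.githubusercontent.com/BindiChen/cd911af6eb6205038d3aff036d20fb20/raw/736a886be20f10720493fda40dca7365fe3fbe87/covid_19_clean_complete_06_Apr_2020.csv'
full_clean_data = pd.read_csv(data_url, parse_dates=['Date'])

# Select a list of countries
countries = ['US', 'Italy', 'China', 'Spain', 'France', 'Iran', 'United Kingdom', 'Switzerland']

# Boolean mask: True for rows whose country is in the list
in_countries = full_clean_data['Country/Region'].isin(countries)
in_countries.head()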

In [None]:
selected_data = full_clean_data[in_countries]
selected_data.head()

#### Create a selection with type='interval'
https://altair-viz.github.io/user_guide/generated/api/altair.selection_interval.html

In [None]:
interval = alt.selection_interval()
type(interval)

#### Create a circle chart - daily new cases
TimeUnit transforms are used to discretize dates and times within Altair:  
https://altair-viz.github.io/user_guide/transform/timeunit.html
  
By default, timeUnit output is a continuous quantity; if you would like it to be categorical instead, specify the ordinal (O) or nominal (N) type. This is useful when plotting a bar chart or another discrete chart type.

In [None]:
# O - ordinal / a discrete ordered quantity
# Q - quantitative / a continuous real-valued quantity

# Create a chart from selected_data
circle = alt.Chart(selected_data)

# Set the chart's mark to 'circle'
circle = circle.mark_circle()

# Map data fields to visual properties
circle = circle.encode(
    x='monthdate(Date):O',
    y='Country/Region',
    size=alt.Size('New cases:Q',
        scale=alt.Scale(range=[0, 300]),
        legend=alt.Legend(title='Daily new cases')
    ) 
)

# Create conditional color encoding:
#    map the color to the "Country/Region" column for data in the selection 
#    map the color to "lightgray" for data outside the selection
circle = circle.encode(    
    color=alt.condition(predicate=interval, if_true='Country/Region', if_false=alt.value('lightgray'))
)

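# Bind the interval to our chart by setting the selection property.
# Note: this form works in Altair 4; Altair 5 and later are expected to
# need circle.add_params(interval) instead.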
circle = circle.properties(
    width=1000,
    height=300,
    selection=interval
)

type(circle)

#### Altair Scale
https://altair-viz.github.io/user_guide/generated/core/altair.Scale.html
  
For continuous scales, the range is a two-element array giving the minimum and maximum output values; in the circle chart above, `range=[0, 300]` bounds the size of the circles.  
  
Vega Scales: https://vega.github.io/vega/docs/scales/
  
Internally, Vega uses the scales provided by the d3-scale library; for more background see Introducing d3-scale by Mike Bostock: https://medium.com/@mbostock/introducing-d3-scale-61980c51545f
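
To make the idea of a scale range concrete, here is a minimal sketch of a linear scale in plain Python. It is a simplified picture, not Vega's or d3-scale's implementation: the function `linear_scale` and the domain of 0 to 30,000 daily cases are assumptions made up for the example, and only the range `[0, 300]` comes from the circle chart.

```python
def linear_scale(value, domain, output_range):
    """Linearly map a value from the input domain onto the output range."""
    d0, d1 = domain
    r0, r1 = output_range
    if d1 == d0:
        # Degenerate domain: every value maps to the start of the range.
        return r0
    fraction = (value - d0) / (d1 - d0)
    return r0 + fraction * (r1 - r0)


# Hypothetical domain: daily new cases between 0 and 30,000.
domain = (0, 30_000)
# Same range as alt.Scale(range=[0, 300]) in the circle chart.
size_range = (0, 300)

sizes = [linear_scale(v, domain, size_range) for v in (0, 15_000, 30_000)]
print(sizes)
```

The smallest domain value lands on the first element of the range and the largest on the second, with everything else interpolated in between.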
  
#### TimeUnit Transform
https://altair-viz.github.io/user_guide/transform/timeunit.html
  
TimeUnit transforms are used to discretize dates and times within Altair.  
"date" - Day of month, i.e., 1 - 31  
"day" - Day of week, i.e., Monday - Sunday  
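
The difference between the two units can be checked outside Altair. The sketch below uses pandas accessors as rough analogues of "date" and "day"; the three dates are arbitrary examples, and the pandas calls stand in for the Altair transform rather than reproduce it.

```python
import pandas as pd

# Three arbitrary example dates.
dates = pd.Series(pd.to_datetime(['2020-04-04', '2020-04-05', '2020-04-06']))

# Analogue of the "date" unit: day of the month as a number.
day_of_month = dates.dt.day.tolist()

# Analogue of the "day" unit: day of the week as a name.
day_of_week = dates.dt.day_name().tolist()

print(day_of_month)
print(day_of_week)
```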

#### Create a bar chart - the total number of new cases for the selected area
**transform_filter()** - selects a subset of the data (filters it) based on a selection object. This chart uses the interval selection defined earlier, so the bar chart shows only the data the user has selected.

Data Transformations:  
https://altair-viz.github.io/user_guide/transform/index.html  
https://altair-viz.github.io/user_guide/transform/filter.html  
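
For intuition, the filter-then-sum behavior can be approximated in pandas. This is only a sketch of the idea, not how Altair evaluates the selection: the rows in `demo` are invented, and the `start`, `end`, and `brushed_countries` values stand in for whatever region a user might brush on the circle chart.

```python
import pandas as pd

# Hypothetical rows that reuse the dataset's column names.
demo = pd.DataFrame({
    'Date': pd.to_datetime(['2020-04-01', '2020-04-02', '2020-04-03'] * 2),
    'Country/Region': ['Italy'] * 3 + ['Spain'] * 3,
    'New cases': [100, 200, 300, 10, 20, 30],
})

# Stand-in for a brushed interval: a date span on x and a set of countries on y.
start, end = pd.Timestamp('2020-04-02'), pd.Timestamp('2020-04-03')
brushed_countries = ['Italy', 'Spain']

# Keep only the rows that fall inside the selection (both bounds inclusive).
in_selection = (
    demo['Date'].between(start, end)
    & demo['Country/Region'].isin(brushed_countries)
)

# Sum the new cases per country, like x='sum(New cases):Q' in the bar chart.
totals = demo[in_selection].groupby('Country/Region')['New cases'].sum()
print(totals.to_dict())
```

Changing `start`, `end`, or `brushed_countries` plays the role of dragging the selection to a different area.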



In [None]:
bars = alt.Chart(selected_data).mark_bar().encode(
    y='Country/Region',
    color='Country/Region',
    x='sum(New cases):Q'
).properties(
    width=1000
).transform_filter(
    interval
)
type(bars)

In [None]:
circle & bars