# Compare countries' growth rate vs confirmed cases
> How well are countries holding up, and which ones are accelerating the most?

- comments: false
- author: Pablo Zivic
- categories: [growth, compare, interactive]
- image: images/covid-growth-rate-confirmed-cases.png
- permalink: /covid-growth-rate-confirmed-cases/

In [None]:
#hide

from io import StringIO
import csv
import pandas as pd
from datetime import datetime
import requests

BASE_URL = 'https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/csse_covid_19_data/'

def parse_data(fname):
    """Download a JHU CSSE time-series CSV and return total counts per date and country."""
    csv_content = requests.get(BASE_URL + fname).content
    docs = list(csv.DictReader(StringIO(csv_content.decode('utf8'))))

    # Reshape from wide (one column per date) to long (one row per date)
    new_docs = []
    for doc in docs:
        meta = {k: doc[k] for k in ['Province/State', 'Country/Region', 'Lat', 'Long']}
        for k, v in doc.items():
            if k in meta:
                continue
            new_doc = meta.copy()
            new_doc['date'] = datetime.strptime(k, '%m/%d/%y')
            new_doc['cnt'] = int(v)
            new_docs.append(new_doc)

    return (
        pd.DataFrame(new_docs)
          .rename(
              columns={
                  'Province/State': 'province',
                  'Country/Region': 'country',
                  'Lat': 'lat', 'Long': 'long'
              }
          ).groupby(['date', 'country'])
          .cnt.sum()
          .reset_index()
    )

In [4]:
#hide
confirmed_df = parse_data('csse_covid_19_time_series/time_series_19-covid-Confirmed.csv')
confirmed_df = confirmed_df.rename(columns=dict(cnt='confirmed'))

In [11]:
#hide
import numpy as np

dfs = []
for c in confirmed_df.country.unique():
    cdf = confirmed_df[confirmed_df.country == c].set_index('date').sort_index()
    # Day-over-day growth ratio, capped at 1.6 and smoothed with a 5-day rolling mean
    cdf['rate'] = np.minimum(1.6, cdf.confirmed / cdf.confirmed.shift(1)).rolling(5).mean().fillna(1.6)
    cdf['cum'] = cdf.confirmed.cumsum()
    dfs.append(cdf)
    
confirmed_df = pd.concat(dfs).reset_index()

In [7]:
#hide
cnt_by_country = confirmed_df.groupby('country').confirmed.max().sort_values(ascending=False)

In [46]:
#hide
import altair as alt

def plot_rate_cnt_trajectories(countries):
    selection = alt.selection_multi(
        fields=['country'], bind='legend', init=[{'country': c} for c in countries]
    )
    
    source = confirmed_df[(confirmed_df.confirmed > 30) & (confirmed_df.country.isin(countries))]
    
    return (
        alt.Chart(source)
           .mark_line(point=True, radius=150)
           .encode(
               x=alt.X('confirmed', scale=alt.Scale(type='log'), axis=alt.Axis(title='Confirmed Cases')),
               y=alt.Y('rate', axis=alt.Axis(title='Daily Growth Rate (5-day mean)'), scale=alt.Scale(domain=[0.95, 1.65])),
               color='country:N',
               tooltip=list(confirmed_df.columns), 
               opacity=alt.condition(selection, alt.value(.8), alt.value(.05))
            ).add_selection(selection)
             .configure_point(size=200).properties(width=650, height=400)
    )

# How do the countries with the most cases compare with each other?


### March 13th
- We can see that China has contained the outbreak: its growth rate sits at 1 on the Y axis, meaning confirmed cases are no longer growing.

- Italy is at a similar growth rate to China's when China had 40k confirmed cases. However, it does not seem to be decelerating as fast as China did.

- Iran seems to be entering a slow-growth phase: at its current rate, it would take 20 days for cases to double.

- Germany is a couple of days behind Italy and growing at a much faster rate.
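
The link between a point on the Y axis and a doubling time can be made explicit. The sketch below is not part of the notebook's pipeline: `doubling_time` and `rate_for_doubling` are hypothetical helper names, and the calculation assumes the daily growth factor stays constant, which real outbreaks do not guarantee.

```python
import math

def doubling_time(daily_rate):
    """Days needed for cases to double at a constant daily growth factor."""
    if daily_rate <= 1:
        # A factor of 1 or less means cases are flat or shrinking
        return math.inf
    return math.log(2) / math.log(daily_rate)

def rate_for_doubling(days):
    """Constant daily growth factor that doubles cases in `days` days."""
    return 2 ** (1 / days)

# A 20-day doubling time corresponds to a factor of about 1.035 on the Y axis
print(round(rate_for_doubling(20), 3))
```

Under that assumption, a country sitting at exactly 1 on the Y axis never doubles, which is why `doubling_time(1.0)` returns infinity.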

In [48]:
#hide_input
plot_rate_cnt_trajectories(cnt_by_country.index[:5])

# What about Latin America?

- Brazil is growing fast and needs to take action quickly.

- Chile seems to be starting to decelerate.

- Argentina is growing steadily at a rate of about 20% per day.
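
To read these charts it helps to see what the Y axis does with a known input. The sketch below reimplements the notebook's rate (day-over-day ratio, capped at 1.6, averaged over 5 days, with missing values filled with 1.6) in plain Python. The function name `smoothed_growth_rate` is hypothetical, and both series are synthetic, not real case counts. It also assumes every count is nonzero.

```python
def smoothed_growth_rate(confirmed, cap=1.6, window=5):
    """Capped day-over-day ratio, averaged over a trailing window."""
    # The first day has no previous day, so its ratio is undefined (None)
    ratios = [None] + [
        min(cap, today / yesterday)
        for yesterday, today in zip(confirmed, confirmed[1:])
    ]
    rates = []
    for i in range(len(ratios)):
        chunk = ratios[max(0, i - window + 1): i + 1]
        if len(chunk) < window or None in chunk:
            # Not enough history yet: fall back to the cap
            rates.append(cap)
        else:
            rates.append(sum(chunk) / window)
    return rates

# Synthetic data: 20% daily growth for 10 days, and a flat series
steady = [100 * 1.2 ** day for day in range(10)]
flat = [500] * 10

print(round(smoothed_growth_rate(steady)[-1], 2))  # 1.2
print(smoothed_growth_rate(flat)[-1])              # 1.0
```

A steady 20% daily increase settles at 1.2 on the Y axis, while a country with no new cases sits at 1.0. The first five days of every series show 1.6 only because the window is not yet full.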

In [14]:
#hide_input
latam = [
    'Brazil', 'Chile', 'Argentina', 'Panama', 'Colombia','Mexico', 
    'Ecuador', 'Costa Rica', 'Venezuela', 'Dominican Republic', 'Bolivia',
    'Paraguay', 'Uruguay','Honduras', 'Cuba', 'Puerto Rico','Guatemala', 
]

plot_rate_cnt_trajectories(latam)