# Excess mortality in Chile
> Excess mortality in Chile

- toc: true 
- badges: true
- comments: true
- author: Alonso Silva Allende
- categories: [jupyter]
- image: images/fallecimientos.png

In [1]:
#hide
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import altair as alt
import datetime

In [2]:
#hide
from IPython.display import display, Markdown, display_html, HTML

In [3]:
#hide
update_date = pd.to_datetime('today') - pd.offsets.Hour(19)

In [4]:
#hide_input
display(Markdown(f"Last updated: {update_date.strftime('%d/%m/%Y')}."))

Last updated: 13/08/2020.

In [5]:
#hide
deaths_raw = pd.read_csv(
    "https://raw.githubusercontent.com/MinCiencia/Datos-COVID19/master/output/producto32/Defunciones.csv")

In [6]:
#hide
deaths_raw.head(3)

,Region,Codigo region,Comuna,Codigo comuna,2010-01-01,2010-01-02,2010-01-03,2010-01-04,2010-01-05,2010-01-06,...,2020-08-04,2020-08-05,2020-08-06,2020-08-07,2020-08-08,2020-08-09,2020-08-10,2020-08-11,2020-08-12,2020-08-13
0,Antofagasta,2,Antofagasta,2101,0,6,1,8,0,5,...,10,4,8,6,13,4,7,8,9,6
1,Antofagasta,2,Calama,2201,0,0,0,6,3,4,...,6,5,3,2,0,0,6,6,4,1
2,Antofagasta,2,María Elena,2302,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0


# Excess mortality in Chile

During a global pandemic, collecting quality data is a hard logistical problem. On top of that, the definitions of the indicators have changed over time. In Chile, for example, the term "recovered" has had three different definitions in two months. Likewise, the definition of a confirmed case has been modified (symptomatic vs. asymptomatic), as has, more recently, that of a confirmed death (positive PCR only vs. positive PCR or PCR result pending).
No indicator is perfect, but there is one that is hard to redefine: deaths.

Excess mortality is a simple calculation based on the deaths registered with the Registro Civil: it compares this year's deaths with the average (weighted or not) of previous years, week by week. In the charts, the average (weighted or not) is shown in blue, previous years in gray, and this year in red.
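As a minimal sketch of the calculation just described, the snippet below compares one year's weekly counts against the unweighted mean of earlier years. The table `weekly` and its numbers are made up for illustration; only the week-by-week subtraction mirrors the method.

```python
import pandas as pd

# Hypothetical weekly death counts (rows: year, columns: ISO week); not real data
weekly = pd.DataFrame(
    {1: [100, 110, 120, 150], 2: [90, 100, 110, 160]},
    index=[2017, 2018, 2019, 2020],
)

baseline_years = [2017, 2018, 2019]
expected_demo = weekly.loc[baseline_years].mean()   # per-week mean of earlier years
excess_by_week = weekly.loc[2020] - expected_demo   # observed minus expected
total_excess = excess_by_week.sum()

print(excess_by_week.to_dict())  # {1: 40.0, 2: 60.0}
print(total_excess)              # 100.0
```

A weighted variant would rescale each baseline year before taking the mean, leaving the subtraction unchanged.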

In [7]:
#hide
# Deaths registered in all of Chile
deaths = deaths_raw.drop(columns=["Region", "Codigo region", "Comuna", "Codigo comuna"]).sum()

In [8]:
#hide
deaths.head(3)

2010-01-01     88
2010-01-02    120
2010-01-03     83
dtype: int64

In [9]:
#hide
# sort rows by date
deaths.index = pd.to_datetime(deaths.index)
deaths = deaths.sort_index()

In [10]:
#hide
# last day in the data
last_day = deaths.index[-1]
last_day.strftime("%Y-%m-%d")

'2020-08-13'

In [11]:
#hide
# Step back one day before taking the ISO week, so that the Registro Civil has time to add inscriptions
_, current_week, _ = (last_day-pd.DateOffset(1)).isocalendar()
current_week

33

In [12]:
#hide
deaths = (deaths
          .reset_index()
          .rename(columns={"index": "fecha", 0: "fallecidos"})
          )

In [13]:
#hide
deaths.head(3)

,fecha,fallecidos
0,2010-01-01,88
1,2010-01-02,120
2,2010-01-03,83


In [14]:
#hide
def get_isoyear_isoweek(row):
    isoyear, isoweek, _ = row["fecha"].isocalendar()
    return pd.Series({"año": isoyear, "semana": isoweek})

In [15]:
#hide
deaths[["año", "semana"]] = deaths.apply(get_isoyear_isoweek, axis="columns")
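The row-wise `apply` above can likely be replaced by a vectorized call. This is a sketch, assuming pandas 1.1 or later (where `Series.dt.isocalendar()` is available); the small frame `demo` is illustrative only.

```python
import pandas as pd

demo = pd.DataFrame({"fecha": pd.to_datetime(["2010-01-01", "2010-01-03", "2010-01-04"])})

# dt.isocalendar() returns a DataFrame with columns year, week, day
iso = demo["fecha"].dt.isocalendar()
demo["año"] = iso["year"].astype(int)
demo["semana"] = iso["week"].astype(int)

print(demo[["año", "semana"]].values.tolist())  # [[2009, 53], [2009, 53], [2010, 1]]
```

The first days of January 2010 fall in ISO week 53 of 2009, which is why the notebook later drops the 2009 row.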

In [16]:
#hide
deaths.head()

,fecha,fallecidos,año,semana
0,2010-01-01,88,2009,53
1,2010-01-02,120,2009,53
2,2010-01-03,83,2009,53
3,2010-01-04,641,2010,1
4,2010-01-05,275,2010,1


In [17]:
#hide
deaths_year_week = (deaths
                    .drop(columns=["fecha"])
                    .groupby(["año", "semana"])
                    .sum()
                    ["fallecidos"]
                    .unstack()
                    # float16 keeps only about 3 significant digits, so large sums get rounded
                    .astype("float16")
                    )
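One caveat about the half-precision cast above: `float16` has an 11-bit significand, so not every integer above 2048 is representable and large sums are rounded. A small sketch of the effect, using magnitudes similar to the totals reported in this post:

```python
import numpy as np

# Largest finite float16 value
print(float(np.finfo(np.float16).max))        # 65504.0

# Gap between neighbouring float16 values at two magnitudes
print(float(np.spacing(np.float16(12112))))   # 8.0
print(float(np.spacing(np.float16(57056))))   # 32.0

# An odd weekly count above 2048 cannot be stored exactly
stored = float(np.float16(2655))
print(stored == 2655)                         # False
```

Casting to `float64` instead would avoid this rounding, although totals computed that way may differ slightly from the ones printed here.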

In [18]:
#hide
deaths_year_week.tail()

año / semana,1,2,3,4,5,6,7,8,9,10,...,44,45,46,47,48,49,50,51,52,53
2016,2116.0,1858.0,1932.0,1863.0,1910.0,1749.0,1724.0,1836.0,1854.0,1765.0,...,1761.0,1927.0,1800.0,1854.0,1874.0,1811.0,1891.0,1788.0,1825.0,
2017,1874.0,1874.0,1958.0,2110.0,1940.0,1815.0,1854.0,1893.0,1830.0,1731.0,...,1930.0,1973.0,1816.0,1943.0,1907.0,1737.0,2006.0,2023.0,1936.0,
2018,1861.0,1865.0,1778.0,1779.0,1860.0,1885.0,1868.0,1818.0,1780.0,1904.0,...,1761.0,2226.0,1927.0,1986.0,1906.0,1932.0,1907.0,1980.0,1892.0,
2019,1939.0,1889.0,1807.0,1943.0,1932.0,2021.0,1893.0,1936.0,1885.0,1804.0,...,1898.0,2222.0,2003.0,2072.0,1960.0,1963.0,1878.0,1893.0,1980.0,
2020,2060.0,2116.0,2060.0,1987.0,2034.0,1906.0,1959.0,1825.0,1849.0,2020.0,...,,,,,,,,,,


In [19]:
#hide
deaths_year_week = deaths_year_week.drop(2009)
deaths_year_week = deaths_year_week.drop(columns=53)

In [20]:
#hide
deaths_year_week.loc[2019,31], deaths_year_week.loc[2020,31]

(2370.0, 2654.0)

In [21]:
#hide
# Ignore data from current week
deaths_year_week.loc[2020,current_week] = np.nan

In [22]:
#hide
deaths_year_week.head(3)

año / semana,1,2,3,4,5,6,7,8,9,10,...,43,44,45,46,47,48,49,50,51,52
2010,1786.0,1672.0,1775.0,1689.0,1687.0,1645.0,1685.0,1641.0,1865.0,1852.0,...,1906.0,1783.0,1722.0,1703.0,1738.0,1617.0,1810.0,1663.0,1725.0,1789.0
2011,1814.0,1718.0,1743.0,1795.0,1703.0,1618.0,1562.0,1657.0,1679.0,1542.0,...,1837.0,1673.0,1751.0,1714.0,1743.0,1750.0,1720.0,1735.0,1654.0,1911.0
2012,1937.0,1736.0,1730.0,1717.0,1731.0,1718.0,1669.0,1654.0,1697.0,1767.0,...,1923.0,1670.0,2019.0,1871.0,1728.0,1799.0,1812.0,1657.0,1759.0,1853.0


In [23]:
#hide
# PUCA: week-by-week mean of the last five years (2015-2019)
expected = deaths_year_week.loc[2015:2019].mean()

In [24]:
#hide
# Sample standard deviation (ddof=1) of the same five years, week by week
ci = deaths_year_week.loc[2015:2019].std(ddof=1)

In [25]:
#hide_input
exceso = (deaths_year_week.loc[2020,12:current_week-1] - expected).sum()
display(Markdown(f"Excess mortality (uncorrected): {'{:,.0f}'.format(exceso)}"))

Excess mortality (uncorrected): 12,112

In [26]:
#hide
totales_covid19 = pd.read_csv(
    "https://raw.githubusercontent.com/MinCiencia/Datos-COVID19/master/output/producto5/TotalesNacionales_T.csv",
    index_col="Fecha")

In [27]:
#hide
totales_covid19.head(2)

Fecha,Casos nuevos con sintomas,Casos totales,Casos recuperados,Fallecidos,Casos activos,Casos nuevos sin sintomas,Casos nuevos totales,Casos activos por FD,Casos activos por FIS,Casos recuperados por FIS,Casos recuperados por FD,Casos confirmados recuperados,Casos activos confirmados,Casos probables acumulados,Casos activos probables,Casos nuevos sin notificar
2020-03-02,1.0,1.0,0.0,0.0,1.0,,1.0,1.0,,,0.0,,,,,
2020-03-03,0.0,1.0,0.0,0.0,1.0,,0.0,1.0,,,0.0,,,,,


In [28]:
#hide
def iso_year_start(iso_year):
    "The gregorian calendar date of the first day of the given ISO year"
    fourth_jan = datetime.date(iso_year, 1, 4)
    delta = datetime.timedelta(fourth_jan.isoweekday()-1)
    return fourth_jan - delta 

In [29]:
#hide
def iso_to_gregorian(iso_year, iso_week, iso_day):
    "Gregorian calendar date for the given ISO year, week and day"
    year_start = iso_year_start(iso_year)
    return year_start + datetime.timedelta(days=iso_day-1, weeks=iso_week-1)
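On Python 3.8 or later, the standard library offers `datetime.date.fromisocalendar`, which performs the same conversion. The sketch below checks it against a re-implementation of the two helpers above; the `_demo` names are introduced here only for the comparison.

```python
import datetime

def iso_year_start_demo(iso_year):
    # Monday of ISO week 1: the week that contains January 4th
    fourth_jan = datetime.date(iso_year, 1, 4)
    return fourth_jan - datetime.timedelta(days=fourth_jan.isoweekday() - 1)

def iso_to_gregorian_demo(iso_year, iso_week, iso_day):
    start = iso_year_start_demo(iso_year)
    return start + datetime.timedelta(days=iso_day - 1, weeks=iso_week - 1)

manual = iso_to_gregorian_demo(2020, 32, 7)
builtin = datetime.date.fromisocalendar(2020, 32, 7)
print(manual, builtin)  # 2020-08-09 2020-08-09
```

Sunday of ISO week 32 of 2020 is 9 August, the cut-off date used in the figures of this post.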

In [30]:
#hide
date = iso_to_gregorian(2020, current_week-1, 7)

In [31]:
#hide_input
display(Markdown(f"Official confirmed Covid-19 deaths as of {pd.to_datetime(date).strftime('%d/%m')}: {'{:,.0f}'.format(totales_covid19['Fallecidos'].loc[pd.to_datetime(date).strftime('%Y-%m-%d')])}"))

Official confirmed Covid-19 deaths as of 09/08: 10,077

In [32]:
#hide_input
diff = (deaths_year_week.loc[2020,12:current_week-1]-expected).sum() - totales_covid19['Fallecidos'].loc[pd.to_datetime(date).strftime('%Y-%m-%d')]
display(Markdown(f"Difference: {'{:,.0f}'.format(diff)}"))

Difference: 2,035

In [33]:
#hide
df_expected = pd.DataFrame()
df_expected["lower"] = expected - ci
df_expected["upper"] = expected + ci
df_expected["lower 2"] = expected - 2*ci
df_expected["upper 2"] = expected + 2*ci

In [34]:
#hide
df_expected.head(2)

semana,lower,upper,lower 2,upper 2
1,1795.0,2039.0,1673.0,2162.0
2,1861.0,1889.0,1848.0,1902.0


In [35]:
#hide
deaths_year_week = deaths_year_week.T
deaths_year_week["Promedio últimos 5 años"] = expected
deaths_year_week = deaths_year_week.T

In [36]:
#hide_input
label = alt.selection_single(
    encodings=['x'], # limit selection to x-axis value
    on='mouseover',  # select on mouseover events
    nearest=True,    # select data point nearest the cursor
    empty='none'     # empty selection includes no data points
)

base = alt.Chart(deaths_year_week.drop([i for i in np.arange(2010,2015)]).reset_index().melt("año", value_name="defunciones")).mark_line(point=True).encode(
    x = alt.X("semana:Q",scale=alt.Scale(domain=(1, 52))),
    y = alt.Y("defunciones", scale=alt.Scale(domain=(0,4000))),
    color=alt.Color('año:N', scale=alt.Scale(range=['lightgray', 'lightgray', 'lightgray', 'lightgray', 'lightgray', 'red',"blue"], 
                                             domain=["2015", "2016", "2017", "2018", "2019", "2020", "Promedio últimos 5 años"]))
)

alt.layer(
    base, # base line chart
    
    alt.Chart(df_expected.reset_index()).mark_area(
        opacity=0.5
    ).encode(
        x=alt.X("semana", scale=alt.Scale(domain=(1, 52))),
        y=alt.Y("lower:Q", axis=alt.Axis(title="defunciones")),
        y2="upper:Q"
    ),
    
    alt.Chart(df_expected.reset_index()).mark_area(
        opacity=0.3
    ).encode(
        x=alt.X("semana", scale=alt.Scale(domain=(1, 52))),
        y="lower 2:Q",
        y2="upper 2:Q"
    ),
    # add a rule mark to serve as a guide line
    alt.Chart().mark_rule(color='#aaa').encode(
        x = alt.X('semana:Q', scale=alt.Scale(domain=(1, 52)), axis=alt.Axis(title='semana'), sort=None)
    ).transform_filter(label),
    
    # add circle marks for selected time points, hide unselected points
    base.mark_circle().encode(
        opacity=alt.condition(label, alt.value(1), alt.value(0))
    ).add_selection(label),

    # add white stroked text to provide a legible background for labels
    base.mark_text(align='left', dx=5, dy=-5, stroke='white', strokeWidth=2).encode(
        text='defunciones'
    ).transform_filter(label),

    # add text labels showing the weekly death counts
    base.mark_text(align='left', dx=5, dy=-5).encode(
        text='defunciones'
    ).transform_filter(label),
    
    data=deaths_year_week.drop([i for i in np.arange(2010,2015)]).reset_index().melt("año", value_name="defunciones")
).properties(
    title = f'Deaths registered in Chile per week, through week {current_week-1}',
    width=600
)

Source: [Ministerio de Ciencia](https://github.com/MinCiencia/Datos-COVID19), [Registro Civil](https://www.registrocivil.cl/)

In [37]:
#hide
deaths_year_week = deaths_year_week.drop("Promedio últimos 5 años")

In [38]:
#hide
deaths = deaths_raw.drop(columns=["Region", "Codigo region", "Comuna", "Codigo comuna"]).sum()
n_defunciones = []
for year in np.arange(2010,2020):
    n_defunciones.append(deaths.loc[f"{year}-01-01":f"{year}-12-31"].sum())

In [39]:
#hide
df = pd.DataFrame()
df["Año"] = np.arange(2010,2020)
df["Número de defunciones"] = n_defunciones 
df = df.set_index("Año")

In [40]:
#hide
adjustment = 365/366
# Work in float so the leap-year scaling is not assigned into an integer column
df["ajustado a 365 días"] = df["Número de defunciones"].astype(float)
# Scale the leap years (2012, 2016) through .loc on the DataFrame to avoid chained assignment
df.loc[[2012, 2016], "ajustado a 365 días"] *= adjustment
df["ajustado a 365 días"] = df["ajustado a 365 días"].astype(int)

In [41]:
#hide
df

Año,Número de defunciones,ajustado a 365 días
2010,98178,98178
2011,95105,95105
2012,99169,98898
2013,100286,100286
2014,102252,102252
2015,103710,103710
2016,104390,104104
2017,106877,106877
2018,107286,107286
2019,109837,109837


In [42]:
#hide
growth_rate = []
for year in np.arange(2011, 2020):
    growth_rate.append(df["ajustado a 365 días"].loc[year]/df["ajustado a 365 días"].loc[year-1])
growth_rate

[0.9686997086923751,
 1.0398822354240052,
 1.014034661974964,
 1.0196039327523283,
 1.0142588898016665,
 1.0037990550573714,
 1.0266368247137478,
 1.0038268289716217,
 1.0237775665044833]

In [43]:
#hide
growth_rate_percentage = [f"{100*(r - 1):.2f}%" for r in growth_rate]

In [44]:
#hide
df["Variación c/r año anterior"] = ["-"]+growth_rate_percentage

In [45]:
#hide
# growth_rate[0] is 2010→2011; skip the first two rates and average 2012→2013 through 2018→2019
gr_mean = np.mean(growth_rate[2:])
gr_mean

1.0151339656823117
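The year-over-year ratios and their mean can also be obtained with `Series.pct_change`. A sketch using the adjusted annual totals from the table above, typed in by hand:

```python
import pandas as pd

adjusted = pd.Series(
    [98178, 95105, 98898, 100286, 102252, 103710, 104104, 106877, 107286, 109837],
    index=range(2010, 2020),
    name="ajustado a 365 días",
)

# pct_change gives (x_t / x_{t-1}) - 1; add 1 to recover the ratio
ratios = adjusted.pct_change().dropna() + 1
print(round(ratios.loc[2011], 4))   # 0.9687
print(round(ratios.loc[2019], 4))   # 1.0238

# Mean of the ratios from 2013 onward, the same seven values as growth_rate[2:]
print(round(ratios.loc[2013:].mean(), 4))  # 1.0151
```

Indexing the ratios by year makes it explicit which transitions enter the mean, which a positional slice hides.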

In [46]:
#hide
gr_std = np.std(growth_rate[2:], ddof=1)
gr_std

0.008993645651094097

In [47]:
#hide
df

Año,Número de defunciones,ajustado a 365 días,Variación c/r año anterior
2010,98178,98178,-
2011,95105,95105,-3.13%
2012,99169,98898,3.99%
2013,100286,100286,1.40%
2014,102252,102252,1.96%
2015,103710,103710,1.43%
2016,104390,104104,0.38%
2017,106877,106877,2.66%
2018,107286,107286,0.38%
2019,109837,109837,2.38%


In [48]:
#hide
print(f"Mean annual growth rate of registered deaths between 2012 and 2019: {100*(gr_mean-1):.2f}%")

Mean annual growth rate of registered deaths between 2012 and 2019: 1.51%


In [49]:
#hide
print(f"Mean minus one standard deviation: {100*(gr_mean-gr_std-1):.2f}%")
print(f"Mean plus one standard deviation: {100*(gr_mean+gr_std-1):.2f}%")

Mean minus one standard deviation: 0.61%
Mean plus one standard deviation: 2.41%


In [50]:
#hide
adjustement = []
for year in np.arange(2015, 2020):
    adjustement.append((df.loc[2019,'ajustado a 365 días']/df.loc[year,'ajustado a 365 días'])*gr_mean)

In [51]:
#hide
adjustement

[1.0751062519395245,
 1.0710373221840475,
 1.0432484948927092,
 1.0392713810622827,
 1.0151339656823117]

In [52]:
#hide
df_adjustement = pd.DataFrame()
df_adjustement["Año"] = np.arange(2015,2020)
df_adjustement["Número de defunciones"] = np.round(adjustement, decimals=3) 
df_adjustement = df_adjustement.set_index("Año")

In [53]:
#hide
df_adjustement

Año,Número de defunciones
2015,1.075
2016,1.071
2017,1.043
2018,1.039
2019,1.015


In [54]:
#hide
amended_deaths = pd.DataFrame()
for year in deaths_year_week.index[2:-1]:
    amended_deaths[year] = \
    deaths_year_week.loc[year]*(df.loc[2019,"ajustado a 365 días"]/df.loc[year,"ajustado a 365 días"])*gr_mean
amended_deaths[2020] = deaths_year_week.loc[2020,:]
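To make the correction concrete: each past year's weekly counts are scaled by the ratio of the 2019 annual total to that year's total, times the mean growth rate, which projects them to an expected 2020 level. A sketch recomputing the factor for 2015 from the figures printed above (the `_demo` name and the 1,900-death week are illustrative):

```python
# Annual totals (adjusted to 365 days) and mean growth rate, copied from the outputs above
total_2015 = 103710
total_2019 = 109837
gr_mean_demo = 1.0151339656823117

# Factor that rescales 2015's weekly counts to an expected-2020 level
factor_2015 = (total_2019 / total_2015) * gr_mean_demo
print(round(factor_2015, 3))  # 1.075

# Example: a hypothetical week with 1,900 registered deaths in 2015
print(round(1900 * factor_2015))  # 2043
```

Under this scheme the 2019 series is scaled by the mean growth rate alone, since its ratio to itself is 1.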

In [55]:
#hide
# amended_deaths = pd.DataFrame()
# i = 8
# for year in deaths_year_week.index[2:-1]:
#     amended_deaths[year] = deaths_year_week.loc[year]*(gr_mean**i)
#     i -=1
# amended_deaths[2020] = deaths_year_week.loc[2020,:22]*(gr_mean**i)

In [56]:
#hide
amended_deaths.tail()

semana,2012,2013,2014,2015,2016,2017,2018,2019,2020
48,2027.0,1952.0,2012.0,2001.0,2005.0,1988.0,1980.0,1989.0,
49,2041.0,2090.0,2082.0,2104.0,1938.0,1810.0,2006.0,1992.0,
50,1867.0,1895.0,1902.0,2056.0,2023.0,2090.0,1981.0,1906.0,
51,1982.0,2130.0,1992.0,2018.0,1914.0,2108.0,2056.0,1921.0,
52,2088.0,2062.0,1986.0,1899.0,1953.0,2018.0,1964.0,2009.0,


In [57]:
#hide
amended_deaths = amended_deaths.T
amended_deaths.tail()

semana,1,2,3,4,5,6,7,8,9,10,...,43,44,45,46,47,48,49,50,51,52
2016,2264.0,1989.0,2068.0,1994.0,2044.0,1872.0,1845.0,1964.0,1984.0,1889.0,...,2234.0,1884.0,2062.0,1926.0,1984.0,2005.0,1938.0,2023.0,1914.0,1953.0
2017,1953.0,1953.0,2041.0,2200.0,2022.0,1892.0,1933.0,1973.0,1908.0,1804.0,...,1876.0,2012.0,2056.0,1893.0,2025.0,1988.0,1810.0,2090.0,2108.0,2018.0
2018,1933.0,1937.0,1847.0,1848.0,1932.0,1957.0,1940.0,1888.0,1849.0,1978.0,...,2142.0,1828.0,2312.0,2001.0,2062.0,1980.0,2006.0,1981.0,2056.0,1964.0
2019,1967.0,1917.0,1833.0,1971.0,1960.0,2050.0,1921.0,1964.0,1913.0,1830.0,...,2047.0,1926.0,2254.0,2032.0,2102.0,1989.0,1992.0,1906.0,1921.0,2009.0
2020,2060.0,2116.0,2060.0,1987.0,2034.0,1906.0,1959.0,1825.0,1849.0,2020.0,...,,,,,,,,,,


In [58]:
#hide
# PPUCA: week-by-week mean and sample standard deviation of the growth-adjusted 2015-2019 series
expected = amended_deaths.loc[2015:2019].mean()
ci = amended_deaths.loc[2015:2019].std(ddof=1)

In [59]:
#hide
exceso = (amended_deaths.loc[2020,12:current_week-1] - expected).sum()
print(f"Excess mortality (corrected): {'{:,.0f}'.format(exceso)}")

Excess mortality (corrected): 9,984


In [60]:
#hide
df_expected = pd.DataFrame()
df_expected["lower"] = expected - ci
df_expected["upper"] = expected + ci
df_expected["lower 2"] = expected - 2*ci
df_expected["upper 2"] = expected + 2*ci

In [61]:
#hide
amended_deaths = amended_deaths.T
amended_deaths["Promedio ponderado últimos 5 años"] = expected
amended_deaths = amended_deaths.T

In [62]:
#hide_input
label = alt.selection_single(
    encodings=['x'], # limit selection to x-axis value
    on='mouseover',  # select on mouseover events
    nearest=True,    # select data point nearest the cursor
    empty='none'     # empty selection includes no data points
)

base = alt.Chart(amended_deaths.drop([i for i in np.arange(2012,2015)]).reset_index().melt("index", value_name="defunciones").rename(columns={"index":"año"})).mark_line(point=True).encode(
    x = alt.X("semana:Q",scale=alt.Scale(domain=(1, 52))),
    y = alt.Y("defunciones", scale=alt.Scale(domain=(0,3750))),
    color=alt.Color('año:N', scale=alt.Scale(range=['lightgray', 'lightgray', 'lightgray', 'lightgray', 'lightgray', 'red', 'blue'], 
                                             domain=["2015", "2016", "2017", "2018", "2019", "2020", "Promedio ponderado últimos 5 años"]))
)

alt.layer(
    base, # base line chart
    
    alt.Chart(df_expected.reset_index()).mark_area(
        opacity=0.5
    ).encode(
        x="semana",
        y=alt.Y("lower:Q", axis=alt.Axis(title="defunciones")),
        #y="lower:Q",
        y2="upper:Q"
    ),
    
    alt.Chart(df_expected.reset_index()).mark_area(
        opacity=0.3
    ).encode(
        x="semana",
        y="lower 2:Q",
        y2="upper 2:Q"
    ),
    
#     alt.Chart(expected.reset_index().rename(columns={0:"defunciones"})).mark_line(point=True).encode(
#     x = "semana",
#     y = "defunciones"
#     ),
    # add a rule mark to serve as a guide line
    alt.Chart().mark_rule(color='#aaa').encode(
        x = alt.X('semana:Q', scale=alt.Scale(domain=(1, 52)), axis=alt.Axis(title='Semanas'), sort=None)
    ).transform_filter(label),
    
    # add circle marks for selected time points, hide unselected points
    base.mark_circle().encode(
        opacity=alt.condition(label, alt.value(1), alt.value(0))
    ).add_selection(label),

    # add white stroked text to provide a legible background for labels
    base.mark_text(align='left', dx=5, dy=-5, stroke='white', strokeWidth=2).encode(
        text='defunciones'
    ).transform_filter(label),

    # add text labels showing the weekly death counts
    base.mark_text(align='left', dx=5, dy=-5).encode(
        text='defunciones'
    ).transform_filter(label),
    
    data=amended_deaths.drop([i for i in np.arange(2012,2015)]).reset_index().melt("index", value_name="defunciones").rename(columns={"index":"año"})
).properties(
    title = f'Deaths registered in Chile per week (corrected), through week {current_week-1}',
    width=600
)

Source: [Ministerio de Ciencia](https://github.com/MinCiencia/Datos-COVID19), [Registro Civil](https://www.registrocivil.cl/)

In [63]:
#hide
(amended_deaths.loc[2020,:12] - expected).sum()

491.0

In [64]:
#hide
excess_mortality = []
for i in np.arange(0,current_week-1):
    excess_mortality.append((amended_deaths.loc[2020,:i] - expected).sum())

In [65]:
#hide
excess_mortality = pd.Series(excess_mortality)
#excess_mortality
excess_mortality.diff()

0        NaN
1       52.0
2      151.0
3      103.0
4        0.0
5       53.0
6      -51.0
7       39.0
8     -124.0
9      -77.0
10     154.0
11      78.0
12     113.0
13      28.0
14      71.0
15     -66.0
16     199.0
17       6.0
18     -57.0
19     314.0
20     288.0
21     556.0
22    1058.0
23    1400.0
24    1544.0
25    1328.0
26    1224.0
27     640.0
28     528.0
29     304.0
30     296.0
31     192.0
dtype: float64

In [66]:
#hide
casos_raw = pd.read_csv(
    "https://raw.githubusercontent.com/MinCiencia/Datos-COVID19/master/output/producto3/CasosTotalesCumulativo.csv",
    index_col='Region')

In [67]:
#hide
casos_raw.loc["Metropolitana"].head(2)

2020-03-03    0.0
2020-03-04    1.0
Name: Metropolitana, dtype: float64

In [68]:
#hide
_, week_first_case, _ = pd.to_datetime("2020-03-03").isocalendar()
week_first_case

10

In [69]:
#hide
deaths_raw = pd.read_csv(
    "https://raw.githubusercontent.com/MinCiencia/Datos-COVID19/master/output/producto5/TotalesNacionales_T.csv")

In [70]:
#hide
deaths_raw[deaths_raw["Fallecidos"]>0].head(2)

,Fecha,Casos nuevos con sintomas,Casos totales,Casos recuperados,Fallecidos,Casos activos,Casos nuevos sin sintomas,Casos nuevos totales,Casos activos por FD,Casos activos por FIS,Casos recuperados por FIS,Casos recuperados por FD,Casos confirmados recuperados,Casos activos confirmados,Casos probables acumulados,Casos activos probables,Casos nuevos sin notificar
20,2020-03-22,95.0,632.0,8.0,1.0,623.0,,95.0,623.0,,,8.0,,,,,
21,2020-03-23,114.0,746.0,11.0,1.0,734.0,,114.0,734.0,,,11.0,,,,,


In [71]:
#hide
_, week_first_death, _ = pd.to_datetime("2020-03-22").isocalendar()
week_first_death

12

In [72]:
#hide_input
display(Markdown(
f"Between weeks 12 and {current_week-1}, a total of {'{:,.0f}'.format(amended_deaths.loc[2020,12:current_week-1].sum())} deaths were registered."))

Between weeks 12 and 32, a total of 57,056 deaths were registered.

In [73]:
#hide_input
display(Markdown(f"The expected number of registered deaths for those weeks is {'{:,.0f}'.format(expected.loc[12:current_week-1].sum())}, assuming a 2019/2020 growth rate of registered deaths of {100*(gr_mean-1):.2f}%."))

The expected number of registered deaths for those weeks is 47,072, assuming a 2019/2020 growth rate of registered deaths of 1.51%.

In [74]:
#hide_input
exceso = (amended_deaths.loc[2020,12:current_week-1]-expected).sum()
display(Markdown(f"Excess mortality as of {pd.to_datetime(date).strftime('%d/%m')}: {'{:,.0f}'.format(exceso)}"))

Excess mortality as of 09/08: 9,984

In [75]:
#hide
(amended_deaths.loc[2020,22:22]-expected).sum(), (amended_deaths.loc[2020,23:23]-expected).sum()

(1058.0, 1400.0)

In [76]:
#hide
totales_covid19 = pd.read_csv(
    "https://raw.githubusercontent.com/MinCiencia/Datos-COVID19/master/output/producto5/TotalesNacionales_T.csv",
    index_col="Fecha")

In [77]:
#hide
totales_covid19.head(2)

Fecha,Casos nuevos con sintomas,Casos totales,Casos recuperados,Fallecidos,Casos activos,Casos nuevos sin sintomas,Casos nuevos totales,Casos activos por FD,Casos activos por FIS,Casos recuperados por FIS,Casos recuperados por FD,Casos confirmados recuperados,Casos activos confirmados,Casos probables acumulados,Casos activos probables,Casos nuevos sin notificar
2020-03-02,1.0,1.0,0.0,0.0,1.0,,1.0,1.0,,,0.0,,,,,
2020-03-03,0.0,1.0,0.0,0.0,1.0,,0.0,1.0,,,0.0,,,,,


In [78]:
#hide_input
display(Markdown(f"Official confirmed Covid-19 deaths as of {pd.to_datetime(date).strftime('%d/%m')}: {'{:,.0f}'.format(totales_covid19['Fallecidos'].loc[pd.to_datetime(date).strftime('%Y-%m-%d')])}"))

Official confirmed Covid-19 deaths as of 09/08: 10,077

In [79]:
#hide
diff = (amended_deaths.loc[2020,12:current_week-1]-expected).sum() - totales_covid19['Fallecidos'].loc[pd.to_datetime(date).strftime('%Y-%m-%d')]

In [80]:
#hide_input
display(Markdown(f"Difference: {'{:,.0f}'.format(diff)}"))

Difference: -93

In [81]:
#hide
casos_raw = pd.read_csv(
    "https://raw.githubusercontent.com/MinCiencia/Datos-COVID19/master/output/producto3/CasosTotalesCumulativo.csv",
    index_col='Region')

In [82]:
#hide
casos_raw.loc["Metropolitana"].head(2)

2020-03-03    0.0
2020-03-04    1.0
Name: Metropolitana, dtype: float64

In [83]:
#hide
deaths_raw = pd.read_csv("https://raw.githubusercontent.com/MinCiencia/Datos-COVID19/master/output/producto5/TotalesNacionales_T.csv")

In [84]:
#hide
deaths_raw[deaths_raw["Fallecidos"]>0].head(2)

,Fecha,Casos nuevos con sintomas,Casos totales,Casos recuperados,Fallecidos,Casos activos,Casos nuevos sin sintomas,Casos nuevos totales,Casos activos por FD,Casos activos por FIS,Casos recuperados por FIS,Casos recuperados por FD,Casos confirmados recuperados,Casos activos confirmados,Casos probables acumulados,Casos activos probables,Casos nuevos sin notificar
20,2020-03-22,95.0,632.0,8.0,1.0,623.0,,95.0,623.0,,,8.0,,,,,
21,2020-03-23,114.0,746.0,11.0,1.0,734.0,,114.0,734.0,,,11.0,,,,,


In [85]:
#hide
_, week_first_death, _ = pd.to_datetime("2020-03-22").isocalendar()
week_first_death

12

In [86]:
#hide
data = pd.DataFrame()
data_raw = pd.read_csv(
    "https://raw.githubusercontent.com/MinCiencia/Datos-COVID19/master/output/producto4/2020-03-24-CasosConfirmados-totalRegional.csv",
    index_col='Region')
data['2020-03-24'] = data_raw['Fallecidos']

In [87]:
#hide
update_date = pd.to_datetime('today') - pd.offsets.Hour(19)
today = update_date.strftime('%Y-%m-%d')
today

'2020-08-13'

In [88]:
#hide
first_death_date = '2020-03-24'
total_days = (pd.to_datetime(today)-pd.to_datetime(first_death_date)).days

In [89]:
#hide
# s = "https://raw.githubusercontent.com/MinCiencia/Datos-COVID19/master/output/producto4/" + date + "-CasosConfirmados-totalRegional.csv"
# data_by_date = pd.read_csv(s)
# idx_fallecidos = [i for i, x in enumerate(data_by_date.columns.str.contains("Fallecidos", case=False)) if x]

In [90]:
#hide
for i in np.arange(total_days+1):
    date = (pd.to_datetime(first_death_date)+pd.DateOffset(i)).strftime('%Y-%m-%d')
#     print(f'{date}')
    s = "https://raw.githubusercontent.com/MinCiencia/Datos-COVID19/master/output/producto4/" + date + "-CasosConfirmados-totalRegional.csv"
    data_by_date = pd.read_csv(s)
    if 'Fallecidos' in data_by_date.columns:
        data[date] = data_by_date["Fallecidos"].values
    elif 'Casos fallecidos' in data_by_date.columns:
        data[date] = data_by_date["Casos fallecidos"].values
    elif 'Fallecidos totales' in data_by_date.columns:
        if 'Se desconoce región de origen' in data_by_date["Region"].unique():
            data_by_date = data_by_date.set_index("Region").drop("Se desconoce región de origen").reset_index()
            data[date] = data_by_date["Fallecidos totales"].values
        else:
            data[date] = data_by_date["Fallecidos totales"].values
    elif 'Fallecidos totales ' in data_by_date.columns:
        data[date] = data_by_date["Fallecidos totales "].values
    else:
        data[date] = data_by_date[" Casos fallecidos"].values
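The column holding deaths is named differently across the daily files, which is what the `if`/`elif` chain above handles. One possible way to condense that lookup is sketched below; `DEATH_COLUMNS` and `pick_death_column` are hypothetical names, and the two small frames stand in for downloaded files.

```python
import pandas as pd

# Column names checked by the chain above, in the same order of preference
DEATH_COLUMNS = ["Fallecidos", "Casos fallecidos", "Fallecidos totales",
                 "Fallecidos totales ", " Casos fallecidos"]

def pick_death_column(frame):
    """Return the first known deaths column present in `frame`."""
    for name in DEATH_COLUMNS:
        if name in frame.columns:
            return name
    raise KeyError("no known deaths column found")

# Stand-ins for two daily files with different headers (made-up numbers)
early = pd.DataFrame({"Region": ["A", "B"], "Fallecidos": [1, 0]})
later = pd.DataFrame({"Region": ["A", "B"], "Fallecidos totales": [5, 2]})

print(pick_death_column(early))  # Fallecidos
print(pick_death_column(later))  # Fallecidos totales
```

This sketch does not reproduce the special handling of the "Se desconoce región de origen" row, which would still need its own step.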

# Excess mortality in the Metropolitan Region (R.M.)

In [91]:
#hide
deaths_raw = pd.read_csv(
    "https://raw.githubusercontent.com/MinCiencia/Datos-COVID19/master/output/producto32/Defunciones.csv")

In [92]:
#hide
# Deaths registered in the R.M.
deaths = deaths_raw.query("Region == 'Metropolitana de Santiago'").drop(columns=["Region", "Codigo region", "Comuna", "Codigo comuna"]).sum()

In [93]:
#hide
# sort rows by date/index
deaths.index = pd.to_datetime(deaths.index)
deaths = deaths.sort_index()
last_day = deaths.index[-1]
last_day.strftime("%Y-%m-%d")

'2020-08-13'

In [94]:
#hide
deaths = (deaths
          .reset_index()
          .rename(columns={"index": "fecha", 0: "fallecidos"})
          )

In [95]:
#hide
def get_isoyear_isoweek(row):
    isoyear, isoweek, _ = row["fecha"].isocalendar()
    return pd.Series({"año": isoyear, "semana": isoweek})

In [96]:
#hide
deaths[["año", "semana"]] = deaths.apply(get_isoyear_isoweek, axis="columns")

In [97]:
#hide
deaths_year_week = (deaths
                    .drop(columns=["fecha"])
                    .groupby(["año", "semana"])
                    .sum()
                    ["fallecidos"]
                    .unstack()
                    # float16 keeps only about 3 significant digits, so large sums get rounded
                    .astype("float16")
                    )

In [98]:
#hide
deaths_year_week = deaths_year_week.drop(2009)
deaths_year_week = deaths_year_week.drop(columns=53)

In [99]:
#hide
deaths_year_week.loc[2020]

semana
1      825.0
2      784.0
3      751.0
4      724.0
5      767.0
6      683.0
7      706.0
8      621.0
9      639.0
10     807.0
11     717.0
12     719.0
13     709.0
14     754.0
15     787.0
16     779.0
17     790.0
18     744.0
19     952.0
20    1098.0
21    1379.0
22    1963.0
23    2246.0
24    2356.0
25    2216.0
26    1908.0
27    1531.0
28    1391.0
29    1211.0
30    1165.0
31    1054.0
32     983.0
33     422.0
34       NaN
35       NaN
36       NaN
37       NaN
38       NaN
39       NaN
40       NaN
41       NaN
42       NaN
43       NaN
44       NaN
45       NaN
46       NaN
47       NaN
48       NaN
49       NaN
50       NaN
51       NaN
52       NaN
Name: 2020, dtype: float16

In [100]:
#hide
# Ignore data from current week
deaths_year_week.loc[2020,current_week] = np.nan

In [101]:
#hide
expected = deaths_year_week.loc[2015:2019].mean()

In [102]:
#hide
ci = deaths_year_week.loc[2015:2019].std(ddof=1)

In [103]:
#hide
df_expected = pd.DataFrame()
df_expected["Expected"] = expected
df_expected["lower"] = expected - ci
df_expected["upper"] = expected + ci
df_expected["lower 2"] = expected - 2*ci
df_expected["upper 2"] = expected + 2*ci

In [104]:
#hide
df_expected.loc[20:24]

semana,Expected,lower,upper,lower 2,upper 2
20,809.5,754.0,865.0,699.0,920.0
21,811.0,755.5,866.5,700.5,921.5
22,853.0,817.5,888.5,781.5,924.5
23,874.5,795.0,954.0,715.5,1033.0
24,920.0,816.0,1024.0,712.5,1128.0


In [105]:
#hide
deaths_year_week = deaths_year_week.T
deaths_year_week["Promedio últimos 5 años"] = expected
deaths_year_week = deaths_year_week.T

In [106]:
#hide_input
label = alt.selection_single(
    encodings=['x'], # limit selection to x-axis value
    on='mouseover',  # select on mouseover events
    nearest=True,    # select data point nearest the cursor
    empty='none'     # empty selection includes no data points
)

base = alt.Chart(deaths_year_week.drop([i for i in np.arange(2010,2015)]).reset_index().melt("año", value_name="defunciones")).mark_line(point=True).encode(
    x = alt.X("semana:Q",scale=alt.Scale(domain=(1, 52))),
    y = alt.Y("defunciones", scale=alt.Scale(domain=(0,2500))),
    color=alt.Color('año:N', scale=alt.Scale(range=['lightgray', 'lightgray', 'lightgray', 'lightgray', 'lightgray', 'red',"blue"], 
                                             domain=["2015", "2016", "2017", "2018", "2019", "2020", "Promedio últimos 5 años"]))
)

alt.layer(
    base, # base line chart
    
    alt.Chart(df_expected.reset_index()).mark_area(
        opacity=0.5
    ).encode(
        x=alt.X("semana", scale=alt.Scale(domain=(1, 52))),
        y=alt.Y("lower:Q", axis=alt.Axis(title="defunciones")),
        y2="upper:Q"
    ),
    
    alt.Chart(df_expected.reset_index()).mark_area(
        opacity=0.3
    ).encode(
        x=alt.X("semana", scale=alt.Scale(domain=(1, 52))),
        y="lower 2:Q",
        y2="upper 2:Q"
    ),
    # add a rule mark to serve as a guide line
    alt.Chart().mark_rule(color='#aaa').encode(
        x = alt.X('semana', scale=alt.Scale(domain=(1, 52)), axis=alt.Axis(title='semana'), sort=None)
    ).transform_filter(label),
    
    # add circle marks for selected time points, hide unselected points
    base.mark_circle().encode(
        opacity=alt.condition(label, alt.value(1), alt.value(0))
    ).add_selection(label),

    # add white stroked text to provide a legible background for labels
    base.mark_text(align='left', dx=5, dy=-5, stroke='white', strokeWidth=2).encode(
        text='defunciones'
    ).transform_filter(label),

    # add text labels with the weekly death counts
    base.mark_text(align='left', dx=5, dy=-5).encode(
        text='defunciones'
    ).transform_filter(label),
    
    data=deaths_year_week.drop([i for i in np.arange(2010,2015)]).reset_index().melt("año", value_name="defunciones")
).properties(
    title = f'Deaths registered in the R.M. per week (uncorrected) through week {current_week-1}',
    width=600
)

Source: [Ministerio de Ciencia](https://github.com/MinCiencia/Datos-COVID19), [Registro Civil](https://www.registrocivil.cl/)

In [107]:
#hide
deaths_year_week = deaths_year_week.drop("Promedio últimos 5 años")

In [108]:
#hide
n_defunciones = []
for year in np.arange(2010,2020):
    n_defunciones.append(deaths_year_week.loc[year,:].astype(int).sum())

In [109]:
#hide
df = pd.DataFrame()
df["Año"] = np.arange(2010,2020)
df["Número de defunciones"] = n_defunciones 
df = df.set_index("Año")

In [110]:
#hide
adjustment = 365/366
# Scale leap-year totals (2012, 2016) to a 365-day year; index with .loc on the frame to avoid chained assignment
df["ajustado a 365 días"] = df["Número de defunciones"].astype(float)
df.loc[[2012, 2016], "ajustado a 365 días"] *= adjustment
df["ajustado a 365 días"] = df["ajustado a 365 días"].astype(int)


In [111]:
#hide
growth_rate = []
for year in np.arange(2011, 2020):
    growth_rate.append(df["ajustado a 365 días"].loc[year]/df["ajustado a 365 días"].loc[year-1])
growth_rate

[0.9808015807321567,
 1.028095393662202,
 1.0085795996186844,
 1.0310333963453056,
 1.0062388591800357,
 0.9991901809439453,
 1.0411569536256111,
 0.977546949498881,
 1.046161503048401]

In [112]:
#hide
print(f"Average of the annual growth rates between 2012 and 2019: {100*(np.mean(growth_rate[2:])-1):.2f}%")

Average of the annual growth rates between 2012 and 2019: 1.57%


In [113]:
#hide
deaths = deaths_raw.query("Region == 'Metropolitana de Santiago'").drop(columns=["Region", "Codigo region", "Comuna", "Codigo comuna"]).sum()
n_defunciones = []
for year in np.arange(2010,2020):
    n_defunciones.append(deaths.loc[f"{year}-01-01":f"{year}-12-31"].sum())
df = pd.DataFrame()
df["Año"] = np.arange(2010,2020)
df["Número de defunciones"] = n_defunciones 
df = df.set_index("Año")

In [114]:
#hide
adjustment = 365/366
# Scale leap-year totals (2012, 2016) to a 365-day year; index with .loc on the frame to avoid chained assignment
df["ajustado a 365 días"] = df["Número de defunciones"].astype(float)
df.loc[[2012, 2016], "ajustado a 365 días"] *= adjustment
df["ajustado a 365 días"] = df["ajustado a 365 días"].astype(int)

In [115]:
#hide
df

Año,Número de defunciones,ajustado a 365 días
2010,37540,37540
2011,36874,36874
2012,37990,37886
2013,38251,38251
2014,39336,39336
2015,39642,39642
2016,39836,39727
2017,41114,41114
2018,40319,40319
2019,42183,42183
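
The leap-year adjustment shown in the table can be written without hard-coding the years. The sketch below uses the standard library's `calendar.isleap`; the helper name `adjust_to_365_days` is an assumption, and the totals are copied from the table above.

```python
import calendar

def adjust_to_365_days(year: int, deaths: int) -> int:
    """Scale a leap year's annual total to a 365-day year; leave other years unchanged."""
    if calendar.isleap(year):
        return int(deaths * 365 / 366)
    return deaths

# Annual totals taken from the table above
totals = {2012: 37990, 2015: 39642, 2016: 39836}
adjusted = {year: adjust_to_365_days(year, n) for year, n in totals.items()}
print(adjusted)
```

Truncating with `int` mirrors the notebook's `astype(int)`; rounding to the nearest integer would be an equally defensible choice.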


In [116]:
#hide
growth_rate = []
for year in np.arange(2011, 2020):
    growth_rate.append(df["ajustado a 365 días"].loc[year]/df["ajustado a 365 días"].loc[year-1])
growth_rate

[0.9822589238145978,
 1.0274448120627,
 1.0096341656548593,
 1.02836527149617,
 1.0077791336180597,
 1.00214419050502,
 1.0349132831575503,
 0.9806635209417717,
 1.0462313053399142]

In [117]:
#hide
growth_rate_percentage = [f"{100*(r - 1):.2f}%" for r in growth_rate]

In [118]:
#hide
df["Variación c/r año anterior"] = ["-"]+growth_rate_percentage

In [119]:
#hide
gr_mean = np.mean(growth_rate[2:])
gr_mean

1.0156758386733349

In [120]:
#hide
gr_std = np.std(growth_rate[2:], ddof=1)
gr_std

0.022258201400607868
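
The year-over-year ratios and their mean can also be computed without an explicit loop. This is a sketch using `Series.shift`, with the adjusted totals copied from the table above; the variable names are illustrative and differ from the notebook's.

```python
import pandas as pd

# "ajustado a 365 días" column from the table above
adjusted_totals = pd.Series(
    [37540, 36874, 37886, 38251, 39336, 39642, 39727, 41114, 40319, 42183],
    index=range(2010, 2020),
)

# Ratio of each year to the previous one; the first year has no predecessor
yearly_growth = (adjusted_totals / adjusted_totals.shift(1)).dropna()

# Same window as growth_rate[2:] in the notebook: ratios for 2013 through 2019
recent_growth = yearly_growth.loc[2013:]
mean_growth = recent_growth.mean()
std_growth = recent_growth.std(ddof=1)
print(f"{100 * (mean_growth - 1):.2f}%")
```

Indexing by year makes the window explicit, which is easier to audit than a positional slice such as `[2:]`.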

In [121]:
#hide
df

Año,Número de defunciones,ajustado a 365 días,Variación c/r año anterior
2010,37540,37540,-
2011,36874,36874,-1.77%
2012,37990,37886,2.74%
2013,38251,38251,0.96%
2014,39336,39336,2.84%
2015,39642,39642,0.78%
2016,39836,39727,0.21%
2017,41114,41114,3.49%
2018,40319,40319,-1.93%
2019,42183,42183,4.62%


In [122]:
#hide
print(f"Average of the annual growth rates between 2012 and 2019: {100*(np.mean(growth_rate[2:])-1):.2f}%")

Average of the annual growth rates between 2012 and 2019: 1.57%


In [123]:
#hide
amended_deaths = pd.DataFrame()
for year in deaths_year_week.index[2:-1]:
    amended_deaths[year] = \
    deaths_year_week.loc[year]*(df.loc[2019,"Número de defunciones"]/df.loc[year,"Número de defunciones"])*gr_mean
amended_deaths[2020] = deaths_year_week.loc[2020,:current_week-1]
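
In symbols, the rescaling in the cell above can be read as follows. The notation is introduced here for illustration: $d_{y,w}$ is the number of deaths registered in week $w$ of year $y$, $D_y$ is the annual total for year $y$ (the unadjusted "Número de defunciones" column), and $\bar{g}$ is the mean growth ratio `gr_mean`.

```latex
\tilde{d}_{y,w} \;=\; d_{y,w} \cdot \frac{D_{2019}}{D_{y}} \cdot \bar{g},
\qquad y \in \{2012, \dots, 2019\}
```

Each past year is first brought to the 2019 level and then projected one year ahead with the average growth ratio, while the 2020 weeks are kept as observed. For $y = 2019$ the fraction equals 1, so that year is only multiplied by $\bar{g}$.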

In [124]:
#hide
# amended_deaths = pd.DataFrame()
# i = 8
# for year in deaths_year_week.index[2:-1]:
#     amended_deaths[year] = deaths_year_week.loc[year]*(gr_mean**i)
#     i -=1
# amended_deaths[2020] = deaths_year_week.loc[2020,:22]*(gr_mean**i)

In [125]:
#hide
amended_deaths = amended_deaths.T
amended_deaths

semana,1,2,3,4,5,6,7,8,9,10,...,43,44,45,46,47,48,49,50,51,52
2012,880.5,761.0,700.5,708.5,795.0,737.5,685.5,664.0,726.0,741.0,...,798.5,815.5,804.0,729.5,750.0,777.0,782.5,727.0,804.0,787.0
2013,798.5,769.5,702.5,730.0,713.5,799.5,697.5,712.0,774.0,776.0,...,775.0,753.5,776.0,782.5,880.0,725.5,764.0,682.0,818.5,780.5
2014,763.0,832.0,787.0,743.0,680.5,693.5,685.0,695.5,680.5,658.5,...,807.0,726.0,854.5,829.0,797.5,803.0,745.0,721.0,754.5,776.5
2015,736.5,810.0,773.0,672.5,727.5,755.5,763.0,740.5,668.5,697.0,...,803.5,707.0,779.5,889.5,824.5,761.0,805.5,785.0,780.5,758.5
2016,847.0,785.0,774.0,745.0,712.0,654.5,713.0,768.0,783.5,704.5,...,868.0,760.0,773.0,726.5,672.0,721.5,728.5,795.5,663.0,703.5
2017,786.0,706.0,792.0,846.5,792.0,689.0,695.0,751.5,714.0,670.5,...,750.5,705.0,758.5,707.0,782.0,706.0,697.0,717.0,826.5,816.0
2018,710.5,734.0,652.0,629.5,727.5,744.5,739.5,729.5,658.5,771.5,...,845.5,744.5,832.0,774.5,823.0,716.0,786.0,739.5,813.5,756.0
2019,744.5,688.5,678.5,753.5,791.0,769.0,716.0,684.5,731.0,666.0,...,772.0,830.0,805.5,755.5,801.5,751.5,775.0,693.5,718.0,745.5
2020,825.0,784.0,751.0,724.0,767.0,683.0,706.0,621.0,639.0,807.0,...,,,,,,,,,,


In [126]:
#hide
expected = amended_deaths.loc[2015:2019].mean()
ci = amended_deaths.loc[2015:2019].std(ddof=1)

In [127]:
#hide
df_expected = pd.DataFrame()
df_expected["Expected"] = expected
df_expected["lower"] = expected - ci
df_expected["upper"] = expected + ci
df_expected["lower 2"] = expected - 2*ci
df_expected["upper 2"] = expected + 2*ci

In [128]:
#hide
amended_deaths = amended_deaths.T
amended_deaths["Promedio ponderado últimos 5 años"] = expected
amended_deaths = amended_deaths.T

In [129]:
#hide_input
label = alt.selection_single(
    encodings=['x'], # limit selection to x-axis value
    on='mouseover',  # select on mouseover events
    nearest=True,    # select data point nearest the cursor
    empty='none'     # empty selection includes no data points
)

base = alt.Chart(amended_deaths.drop([i for i in np.arange(2012,2015)]).reset_index().melt("index", value_name="defunciones").rename(columns={"index":"año"})).mark_line(point=True).encode(
    x = alt.X("semana:Q",scale=alt.Scale(domain=(1, 52))),
    y = alt.Y("defunciones", scale=alt.Scale(domain=(0,2400))),
    color=alt.Color('año:N', scale=alt.Scale(range=['lightgray', 'lightgray', 'lightgray', 'lightgray', 'lightgray', 'red', 'blue'], 
                                             domain=["2015", "2016", "2017", "2018", "2019", "2020", "Promedio ponderado últimos 5 años"]))
)

alt.layer(
    base, # base line chart
    
    alt.Chart(df_expected.reset_index()).mark_area(
        opacity=0.5
    ).encode(
        x="semana",
        y=alt.Y("lower:Q", axis=alt.Axis(title="defunciones")),
        #y="lower:Q",
        y2="upper:Q"
    ),
    
    alt.Chart(df_expected.reset_index()).mark_area(
        opacity=0.3
    ).encode(
        x="semana",
        y="lower 2:Q",
        y2="upper 2:Q"
    ),
    
#     alt.Chart(expected.reset_index().rename(columns={0:"defunciones"})).mark_line(point=True).encode(
#     x = "semana",
#     y = "defunciones"
#     ),
    # add a rule mark to serve as a guide line
    alt.Chart().mark_rule(color='#aaa').encode(
        x = alt.X('semana:Q', scale=alt.Scale(domain=(1, 52)), axis=alt.Axis(title='Semanas'), sort=None)
    ).transform_filter(label),
    
    # add circle marks for selected time points, hide unselected points
    base.mark_circle().encode(
        opacity=alt.condition(label, alt.value(1), alt.value(0))
    ).add_selection(label),

    # add white stroked text to provide a legible background for labels
    base.mark_text(align='left', dx=5, dy=-5, stroke='white', strokeWidth=2).encode(
        text='defunciones'
    ).transform_filter(label),

    # add text labels with the weekly death counts
    base.mark_text(align='left', dx=5, dy=-5).encode(
        text='defunciones'
    ).transform_filter(label),
    
    data=amended_deaths.drop([i for i in np.arange(2012,2015)]).reset_index().melt("index", value_name="defunciones").rename(columns={"index":"año"})
).properties(
    title = f'Deaths registered in the R.M. per week (corrected) through week {current_week-1}',
    width=600
)

Source: [Ministerio de Ciencia](https://github.com/MinCiencia/Datos-COVID19), [Registro Civil](https://www.registrocivil.cl/)

In [130]:
#hide
(amended_deaths.loc[2020,:12] - expected).sum()

-17.5

In [131]:
#hide
excess_mortality = []
for i in np.arange(0,current_week-1):
    excess_mortality.append((amended_deaths.loc[2020,:i] - expected).sum())

In [132]:
#hide
excess_mortality = pd.Series(excess_mortality)
excess_mortality.diff()

0        NaN
1       60.0
2       39.5
3       17.0
4       -5.5
5       16.5
6      -39.5
7      -19.5
8     -114.0
9      -72.0
10     105.0
11       4.0
12      -9.0
13     -10.5
14       9.0
15      28.0
16       1.5
17       9.0
18     -92.0
19     168.0
20     244.5
21     524.5
22    1063.5
23    1324.0
24    1384.0
25    1220.0
26     932.0
27     560.0
28     400.0
29     240.0
30     204.0
31      80.0
dtype: float64
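
The running total built by the loop above is a cumulative sum of the weekly differences between observed and expected deaths. A minimal sketch with `cumsum` follows; the three-week series are toy values chosen for illustration and the variable names are assumptions.

```python
import pandas as pd

# Toy observed and expected weekly deaths, indexed by week number
observed = pd.Series([825.0, 784.0, 751.0], index=[1, 2, 3])
expected_toy = pd.Series([765.0, 744.5, 734.0], index=[1, 2, 3])

# Weekly excess, then its running total
weekly_excess = observed - expected_toy
cumulative_excess = weekly_excess.cumsum()
print(cumulative_excess)
```

Taking `diff()` of the cumulative series recovers the weekly excess, which is what the output above displays.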

In [133]:
#hide
casos_raw = pd.read_csv(
    "https://raw.githubusercontent.com/MinCiencia/Datos-COVID19/master/output/producto3/CasosTotalesCumulativo.csv",
    index_col='Region')

In [134]:
#hide
casos_raw.loc["Metropolitana"].head(2)

2020-03-03    0.0
2020-03-04    1.0
Name: Metropolitana, dtype: float64

In [135]:
#hide
_, week_first_case, _ = pd.to_datetime("2020-03-04").isocalendar()
week_first_case

10

In [136]:
#hide
data = pd.DataFrame()
data_raw = pd.read_csv(
    "https://raw.githubusercontent.com/MinCiencia/Datos-COVID19/master/output/producto4/2020-03-24-CasosConfirmados-totalRegional.csv",
    index_col='Region')
data['2020-03-24'] = data_raw['Fallecidos']

In [137]:
#hide
update_date = pd.to_datetime('today') - pd.offsets.Hour(19)
today = update_date.strftime('%Y-%m-%d')
today

'2020-08-13'

In [138]:
#hide
first_death_date = '2020-03-24'
total_days = (pd.to_datetime(today)-pd.to_datetime(first_death_date)).days

In [139]:
#hide
for i in np.arange(total_days+1):
    date = (pd.to_datetime(first_death_date)+pd.DateOffset(i)).strftime('%Y-%m-%d')
    s = "https://raw.githubusercontent.com/MinCiencia/Datos-COVID19/master/output/producto4/" + date + "-CasosConfirmados-totalRegional.csv"
    data_by_date = pd.read_csv(s)
    if 'Fallecidos' in data_by_date.columns:
        data[date] = data_by_date["Fallecidos"].values
    elif 'Casos fallecidos' in data_by_date.columns:
        data[date] = data_by_date["Casos fallecidos"].values
    elif 'Fallecidos totales' in data_by_date.columns:
        # Some daily files include an extra "unknown region" row; drop it so the row count matches the other dates
        if 'Se desconoce región de origen' in data_by_date["Region"].unique():
            data_by_date = data_by_date.set_index("Region").drop("Se desconoce región de origen").reset_index()
        data[date] = data_by_date["Fallecidos totales"].values
    elif 'Fallecidos totales ' in data_by_date.columns:
        data[date] = data_by_date["Fallecidos totales "].values
    else:
        data[date] = data_by_date[" Casos fallecidos"].values
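
The branches above exist because the name of the deaths column changes between daily files, sometimes only by stray whitespace. One possible way to centralize that lookup is sketched below; `find_deaths_column` is a hypothetical helper, not part of the notebook, and its candidate list simply repeats the column names handled by the loop.

```python
def find_deaths_column(columns):
    """Return the original name of the deaths column, ignoring surrounding whitespace."""
    candidates = ("Fallecidos", "Casos fallecidos", "Fallecidos totales")
    # Map each stripped name back to the name as it appears in the file
    normalized = {str(col).strip(): col for col in columns}
    for name in candidates:
        if name in normalized:
            return normalized[name]
    raise KeyError(f"No deaths column found among: {list(columns)}")

print(repr(find_deaths_column(["Region", "Casos totales", " Casos fallecidos"])))
print(repr(find_deaths_column(["Region", "Fallecidos totales "])))
```

Raising an error when no candidate matches makes a future rename in the source files fail loudly instead of silently selecting the wrong column.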

In [140]:
#hide
data.loc["Metropolitana"]

2020-03-24       2.0
2020-03-25       2.0
2020-03-26       3.0
2020-03-27       3.0
2020-03-28       3.0
               ...  
2020-08-09    7909.0
2020-08-10    7946.0
2020-08-11    7972.0
2020-08-12    7992.0
2020-08-13    8023.0
Name: Metropolitana, Length: 143, dtype: float64

In [141]:
#hide
_, week_first_death, _ = pd.to_datetime("2020-03-24").isocalendar()
week_first_death

13
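
The week numbers used here are ISO 8601 weeks, which start on Monday. The same lookup can be sketched with the standard library alone; the dates are the ones that appear in this notebook (first case in the R.M. on 2020-03-04, first reported death on 2020-03-24).

```python
import datetime

first_case = datetime.date(2020, 3, 4)
first_death = datetime.date(2020, 3, 24)

# isocalendar() returns (ISO year, ISO week number, ISO weekday with Monday=1)
_, first_case_week, _ = first_case.isocalendar()
iso_year, first_death_week, iso_weekday = first_death.isocalendar()
print(first_case_week, first_death_week, iso_weekday)
```

ISO weeks do not always line up with calendar years: the first days of January can belong to the last ISO week of the previous year, which matters when weekly counts are grouped by year.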

In [142]:
#hide_input
display(Markdown(f"In the R.M., a total of {amended_deaths.loc[2020,12:current_week-1].sum():,.0f} deaths were registered between weeks 12 and {current_week-1}."))

In the R.M., a total of 26,736 deaths were registered between weeks 12 and 32.

In [143]:
#hide_input
display(Markdown(f"The expected number of registered deaths for those weeks is {expected.loc[12:current_week-1].sum():,.0f}, assuming a 2019-to-2020 growth rate in registered deaths of {100*(gr_mean-1):.2f}% for the R.M."))

The expected number of registered deaths for those weeks is 18,416, assuming a 2019-to-2020 growth rate in registered deaths of 1.57% for the R.M.

In [144]:
#hide
date = iso_to_gregorian(2020, current_week-1, 7)
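
The helper `iso_to_gregorian` is not defined in this part of the notebook. On Python 3.8 or later, the same conversion from an ISO week to a calendar date can be sketched with `datetime.date.fromisocalendar`; the function name `iso_week_to_date` below is an assumption and may not behave identically to the notebook's helper.

```python
import datetime

def iso_week_to_date(iso_year: int, iso_week: int, iso_weekday: int) -> datetime.date:
    """Convert an ISO year, week and weekday (Monday=1, Sunday=7) to a calendar date."""
    return datetime.date.fromisocalendar(iso_year, iso_week, iso_weekday)

# Last day (Sunday) of ISO week 32 of 2020
last_day = iso_week_to_date(2020, 32, 7)
print(last_day.strftime("%d/%m"))
```

Passing 7 as the weekday selects the Sunday that closes the week, so the reported date marks the end of the last complete week.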

In [145]:
#hide_input
display(Markdown(f"Excess mortality as of {pd.to_datetime(date).strftime('%d/%m')}: {(amended_deaths.loc[2020,12:current_week-1]-expected).sum():,.0f}"))

Excess mortality as of 09/08: 8,320

In [146]:
#hide_input
display(Markdown(f"Official confirmed Covid-19 deaths as of {pd.to_datetime(date).strftime('%d/%m')}: {data.loc['Metropolitana', pd.to_datetime(date).strftime('%Y-%m-%d')]:,.0f}"))

Official confirmed Covid-19 deaths as of 09/08: 7,909

In [147]:
#hide
diff = (amended_deaths.loc[2020,12:current_week-1]-expected).sum() - data.loc['Metropolitana', pd.to_datetime(date).strftime('%Y-%m-%d')]

In [148]:
#hide_input
display(Markdown(f"Difference: {diff:,.0f}"))

Difference: 411

In [149]:
#hide
# First case in the R.M.
casos_raw = pd.read_csv(
    "https://raw.githubusercontent.com/MinCiencia/Datos-COVID19/master/output/producto3/CasosTotalesCumulativo.csv",
    index_col='Region')

In [150]:
#hide
casos_raw.loc["Metropolitana"].head(2)

2020-03-03    0.0
2020-03-04    1.0
Name: Metropolitana, dtype: float64

In [151]:
#hide
_, week_first_case, _ = pd.to_datetime("2020-03-04").isocalendar()
week_first_case

10

Thanks to [Patricio Reyes](https://pareyesv.github.io/) for his help with this work.