# Excess mortality in Chile
> Excess mortality in Chile

- toc: true 
- badges: true
- comments: true
- author: Alonso Silva Allende
- categories: [jupyter]
- image: images/fallecimientos.png

In [1]:
#hide
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import altair as alt
import datetime

In [2]:
#hide
from IPython.display import display, Markdown, display_html, HTML

In [122]:
#hide
deaths_raw = pd.read_csv(
    "https://raw.githubusercontent.com/MinCiencia/Datos-COVID19/master/output/producto32/Defunciones.csv")

# Excess mortality in Chile

During a global pandemic, collecting high-quality data is a hard logistical problem. On top of that, indicator definitions have changed over time. In Chile, for example, the term "recovered" has had three different definitions in two months. The definition of a confirmed case has also changed (symptomatic vs. asymptomatic), and more recently so has that of a confirmed death (positive PCR only vs. positive PCR plus those awaiting a PCR result), among others.
No indicator is perfect, but there is one whose definition is hard to change: registered deaths.

Excess mortality is a simple calculation based on the deaths registered at the Civil Registry (Registro Civil): it compares this year's deaths with the average (weighted or unweighted) of previous years, week by week. In the charts, the average is shown in blue, previous years in gray, and this year in red.
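
As a minimal sketch of that comparison, the example below builds a baseline from the per-week mean of earlier years and subtracts it from the current year. The table `weekly`, its values, and the function name `excess_by_week` are made up for illustration; they are not the Registro Civil data.

```python
import pandas as pd

def excess_by_week(weekly: pd.DataFrame, current_year: int) -> pd.Series:
    """Deaths in `current_year` minus the per-week mean of all other years."""
    baseline = weekly.drop(index=current_year).mean()  # mean over the other years, per week
    return weekly.loc[current_year] - baseline

# Hypothetical counts: rows are years, columns are ISO weeks.
weekly = pd.DataFrame(
    {1: [100, 110, 120, 130], 2: [90, 100, 110, 150]},
    index=[2017, 2018, 2019, 2020],
)
excess = excess_by_week(weekly, 2020)
print(excess.to_dict())
```

A positive value means more deaths than the baseline for that week; a negative value means fewer.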

In [123]:
#hide
# Registered deaths for all of Chile
deaths = deaths_raw.drop(columns=["Region", "Codigo region", "Comuna", "Codigo comuna"]).sum()

In [5]:
#hide
# sort rows by date/index
deaths.index = pd.to_datetime(deaths.index)
deaths = deaths.sort_index()
last_day = deaths.index[-1]
last_day.strftime("%Y-%m-%d")

'2020-06-05'

In [6]:
#hide
#Allow three days so that late registrations have time to be added
_, current_week, _ = (last_day-pd.DateOffset(3)).isocalendar()
current_week

23

In [7]:
#hide
deaths = (deaths
          .reset_index()
          .rename(columns={"index": "fecha", 0: "fallecidos"})
          )

In [8]:
#hide
def get_isoyear_isoweek(row):
    isoyear, isoweek, _ = row["fecha"].isocalendar()
    return pd.Series({"año": isoyear, "semana": isoweek})

In [9]:
#hide
deaths[["año", "semana"]] = deaths.apply(get_isoyear_isoweek, axis="columns")
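
The row-wise `apply` above can be slow on long date ranges. A possible vectorized alternative, assuming pandas 1.1 or later (where `Series.dt.isocalendar()` is available), is sketched below on a few made-up dates; the names `fechas` and `tabla` are hypothetical.

```python
import pandas as pd

# Hypothetical dates; 2010-01-01 falls in ISO year 2009, week 53.
fechas = pd.Series(pd.to_datetime(["2010-01-01", "2010-01-04", "2020-03-22"]))

iso = fechas.dt.isocalendar()  # DataFrame with columns: year, week, day
tabla = pd.DataFrame({
    "fecha": fechas,
    "año": iso["year"].astype(int),
    "semana": iso["week"].astype(int),
})
print(tabla)
```

Note that the ISO year can differ from the calendar year in the first and last days of a year, which is why the first example date lands in 2009.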

In [10]:
#hide
deaths_year_week = (deaths
                    .drop(columns=["fecha"])
                    .groupby(["año", "semana"])
                    .sum()
                    ["fallecidos"]
                    .unstack()
                    .astype("Float16")  # caution: 16-bit floats hold integers exactly only up to 2048
                    )

In [11]:
#hide
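# Drop the first row (a partial ISO year preceding 2010) and the last column (week 53, which only some years have)
deaths_year_week = deaths_year_week.iloc[1:,:-1]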

In [12]:
#hide
deaths_year_week

año \ semana,1,2,3,4,5,6,7,8,9,10,...,43,44,45,46,47,48,49,50,51,52
2010,1786.0,1672.0,1775.0,1689.0,1687.0,1645.0,1685.0,1641.0,1865.0,1852.0,...,1906.0,1783.0,1722.0,1703.0,1738.0,1617.0,1810.0,1663.0,1725.0,1789.0
2011,1814.0,1718.0,1743.0,1795.0,1703.0,1618.0,1562.0,1657.0,1679.0,1542.0,...,1837.0,1673.0,1751.0,1714.0,1743.0,1750.0,1720.0,1735.0,1654.0,1911.0
2012,1937.0,1736.0,1730.0,1717.0,1731.0,1718.0,1669.0,1654.0,1697.0,1767.0,...,1923.0,1670.0,2019.0,1871.0,1728.0,1799.0,1812.0,1657.0,1759.0,1853.0
2013,1866.0,1855.0,1756.0,1817.0,1729.0,1867.0,1698.0,1643.0,1822.0,1708.0,...,1848.0,1619.0,2038.0,1904.0,1890.0,1756.0,1880.0,1705.0,1917.0,1855.0
2014,1841.0,1949.0,1852.0,1829.0,1744.0,1683.0,1756.0,1756.0,1701.0,1691.0,...,1934.0,1716.0,2052.0,1967.0,1855.0,1846.0,1910.0,1745.0,1827.0,1822.0
2015,1796.0,1888.0,1852.0,1792.0,1812.0,1875.0,1829.0,1824.0,1844.0,1703.0,...,1992.0,1619.0,2028.0,2236.0,1949.0,1863.0,1960.0,1915.0,1879.0,1768.0
2016,2116.0,1858.0,1932.0,1863.0,1910.0,1749.0,1724.0,1836.0,1854.0,1765.0,...,2088.0,1761.0,1927.0,1800.0,1854.0,1874.0,1811.0,1891.0,1788.0,1825.0
2017,1874.0,1874.0,1958.0,2110.0,1940.0,1815.0,1854.0,1893.0,1830.0,1731.0,...,1800.0,1930.0,1973.0,1816.0,1943.0,1907.0,1737.0,2006.0,2023.0,1936.0
2018,1861.0,1865.0,1778.0,1779.0,1860.0,1885.0,1868.0,1818.0,1780.0,1904.0,...,2064.0,1761.0,2226.0,1927.0,1986.0,1906.0,1932.0,1907.0,1980.0,1892.0
2019,1939.0,1889.0,1807.0,1943.0,1932.0,2021.0,1893.0,1936.0,1885.0,1804.0,...,2017.0,1898.0,2222.0,2003.0,2072.0,1960.0,1963.0,1878.0,1893.0,1980.0


In [13]:
#hide
# Mask the current week, which is still incomplete
deaths_year_week.loc[2020,current_week] = np.nan

In [14]:
#hide
expected = deaths_year_week.loc[2015:2019].mean()

In [15]:
#hide
ci = deaths_year_week.loc[2015:2019].std(ddof=1)

In [16]:
#hide
df_expected = pd.DataFrame()
df_expected["Expected"] = expected
df_expected["lower"] = expected - ci
df_expected["upper"] = expected + ci
df_expected["lower 2"] = expected - 2*ci
df_expected["upper 2"] = expected + 2*ci
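
The variable `ci` above holds a sample standard deviation rather than a confidence interval, so the bands are the mean plus or minus one and two standard deviations across the five baseline years. A minimal sketch of how such a band could be used to flag an unusual week, on made-up numbers (`baseline`, `observed`, and `above_band` are illustrative names):

```python
import pandas as pd

# Hypothetical baseline: five years (rows) by two weeks (columns).
baseline = pd.DataFrame(
    {1: [10.0, 12.0, 14.0, 16.0, 18.0], 2: [20.0, 20.0, 20.0, 20.0, 20.0]},
    index=[2015, 2016, 2017, 2018, 2019],
)
mean = baseline.mean()
std = baseline.std(ddof=1)  # sample standard deviation, as in the cell above

bands = pd.DataFrame({"lower 2": mean - 2 * std, "upper 2": mean + 2 * std})
observed = pd.Series({1: 21.0, 2: 20.0})  # made-up current-year counts
above_band = observed > bands["upper 2"]  # True where the week exceeds mean + 2 std
print(bands)
print(above_band.to_dict())
```

With only five baseline years the standard deviation is a rough estimate, so a band like this is better read as a visual guide than as a formal test.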

In [17]:
#hide
df_expected.head(2)

semana,Expected,lower,upper,lower 2,upper 2
1,1917.0,1795.0,2039.0,1673.0,2162.0
2,1875.0,1861.0,1889.0,1848.0,1902.0


In [147]:
#hide
deaths_year_week = deaths_year_week.T
deaths_year_week["Promedio últimos 5 años"] = expected
deaths_year_week = deaths_year_week.T

In [19]:
#hide_input
label = alt.selection_single(
    encodings=['x'], # limit selection to x-axis value
    on='mouseover',  # select on mouseover events
    nearest=True,    # select data point nearest the cursor
    empty='none'     # empty selection includes no data points
)

base = alt.Chart(deaths_year_week.drop([i for i in np.arange(2010,2015)]).reset_index().melt("año", value_name="defunciones")).mark_line(point=True).encode(
    x = alt.X("semana:Q",scale=alt.Scale(domain=(1, 52))),
    y = alt.Y("defunciones", scale=alt.Scale(domain=(0,3500))),
    color=alt.Color('año:N', scale=alt.Scale(range=['lightgray', 'lightgray', 'lightgray', 'lightgray', 'lightgray', 'red',"blue"], 
                                             domain=["2015", "2016", "2017", "2018", "2019", "2020", "Promedio últimos 5 años"]))
)

alt.layer(
    base, # base line chart
    
    alt.Chart(df_expected.reset_index()).mark_area(
        opacity=0.5
    ).encode(
        x=alt.X("semana", scale=alt.Scale(domain=(1, 52))),
        y=alt.Y("lower:Q", axis=alt.Axis(title="defunciones")),
        y2="upper:Q"
    ),
    
    alt.Chart(df_expected.reset_index()).mark_area(
        opacity=0.3
    ).encode(
        x=alt.X("semana", scale=alt.Scale(domain=(1, 52))),
        y="lower 2:Q",
        y2="upper 2:Q"
    ),
    # add a rule mark to serve as a guide line
    alt.Chart().mark_rule(color='#aaa').encode(
        x = alt.X('semana:Q', scale=alt.Scale(domain=(1, 52)), axis=alt.Axis(title='semana'), sort=None)
    ).transform_filter(label),
    
    # add circle marks for selected time points, hide unselected points
    base.mark_circle().encode(
        opacity=alt.condition(label, alt.value(1), alt.value(0))
    ).add_selection(label),

    # add white stroked text to provide a legible background for labels
    base.mark_text(align='left', dx=5, dy=-5, stroke='white', strokeWidth=2).encode(
        text='defunciones'
    ).transform_filter(label),

    # add text labels showing the weekly death counts
    base.mark_text(align='left', dx=5, dy=-5).encode(
        text='defunciones'
    ).transform_filter(label),
    
    data=deaths_year_week.drop([i for i in np.arange(2010,2015)]).reset_index().melt("año", value_name="defunciones")
).properties(
    title = f'Registered deaths in Chile by week (uncorrected) through week {current_week-1}',
    width=600
)

Source: [Ministerio de Ciencia](https://github.com/MinCiencia/Datos-COVID19), [Registro Civil](https://www.registrocivil.cl/)

In [20]:
#hide
deaths_year_week = deaths_year_week.drop("Promedio últimos 5 años")

In [21]:
#hide
n_defunciones = []
for year in np.arange(2010,2020):
    n_defunciones.append(deaths_year_week.loc[year,:].astype(int).sum())

df = pd.DataFrame()
df["Año"] = np.arange(2010,2020)
df["Número de defunciones"] = n_defunciones 
df = df.set_index("Año")

In [22]:
#hide
adjustment = 365/366
# Scale leap years (2012, 2016) to 365 days; a single .loc call avoids chained assignment
df["ajustado a 365 días"] = df["Número de defunciones"].astype(float)
df.loc[[2012, 2016], "ajustado a 365 días"] *= adjustment
df["ajustado a 365 días"] = df["ajustado a 365 días"].astype(int)


In [23]:
#hide
growth_rate = []
for year in np.arange(2011, 2020):
    growth_rate.append(df["ajustado a 365 días"].loc[year]/df["ajustado a 365 días"].loc[year-1])
growth_rate

[0.9681922943564013,
 1.036307523778426,
 1.0153985323115078,
 1.0210208106024965,
 1.0118332173213989,
 1.0060557321137897,
 1.0290659039033834,
 0.9991015526293624,
 1.025300922673411]
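
The loop above divides each year's total by the previous year's. An equivalent formulation with `Series.shift` is sketched below; the totals are invented and the names `totales` and `ratios` are only illustrative.

```python
import pandas as pd

# Hypothetical annual totals indexed by year.
totales = pd.Series([1000, 1020, 1071], index=[2017, 2018, 2019])

# Year t divided by year t-1; the first year has no predecessor and is dropped.
ratios = (totales / totales.shift(1)).dropna()
print(ratios.round(4).to_dict())
```

A ratio above 1 means deaths grew relative to the previous year; subtracting 1 and multiplying by 100 gives the percentage change.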

In [24]:
#hide
gr_mean_iso = np.mean(growth_rate[2:])
gr_mean_iso

1.0153966673650499

In [25]:
#hide
gr_std_iso = np.std(growth_rate[2:], ddof=1)
gr_std_iso

0.010661889319897802

In [26]:
#hide
print(f"Mean annual growth rate between 2012 and 2019: {100*(np.mean(growth_rate[2:])-1):.2f}%")

Mean annual growth rate between 2012 and 2019: 1.54%


In [27]:
#hide
deaths = deaths_raw.drop(columns=["Region", "Codigo region", "Comuna", "Codigo comuna"]).sum()
n_defunciones = []
for year in np.arange(2010,2020):
    n_defunciones.append(deaths.loc[f"{year}-01-01":f"{year}-12-31"].sum())
df = pd.DataFrame()
df["Año"] = np.arange(2010,2020)
df["Número de defunciones"] = n_defunciones 
df = df.set_index("Año")

In [28]:
#hide
adjustment = 365/366
# Scale leap years (2012, 2016) to 365 days; a single .loc call avoids chained assignment
df["ajustado a 365 días"] = df["Número de defunciones"].astype(float)
df.loc[[2012, 2016], "ajustado a 365 días"] *= adjustment
df["ajustado a 365 días"] = df["ajustado a 365 días"].astype(int)

In [29]:
#hide
df

Año,Número de defunciones,ajustado a 365 días
2010,98178,98178
2011,95105,95105
2012,99169,98898
2013,100286,100286
2014,102252,102252
2015,103710,103710
2016,104390,104104
2017,106877,106877
2018,107286,107286
2019,109837,109837


In [30]:
#hide
growth_rate = []
for year in np.arange(2011, 2020):
    growth_rate.append(df["ajustado a 365 días"].loc[year]/df["ajustado a 365 días"].loc[year-1])
growth_rate

[0.9686997086923751,
 1.0398822354240052,
 1.014034661974964,
 1.0196039327523283,
 1.0142588898016665,
 1.0037990550573714,
 1.0266368247137478,
 1.0038268289716217,
 1.0237775665044833]

In [31]:
#hide
growth_rate_percentage = [f"{100*(r - 1):.2f}%" for r in growth_rate]

In [32]:
#hide
df["Variación c/r año anterior"] = ["-"]+growth_rate_percentage

In [33]:
#hide
gr_mean = np.mean(growth_rate[2:])
gr_mean

1.0151339656823117

In [34]:
#hide
gr_std = np.std(growth_rate[2:], ddof=1)
gr_std

0.008993645651094097

In [35]:
#hide
df

Año,Número de defunciones,ajustado a 365 días,Variación c/r año anterior
2010,98178,98178,-
2011,95105,95105,-3.13%
2012,99169,98898,3.99%
2013,100286,100286,1.40%
2014,102252,102252,1.96%
2015,103710,103710,1.43%
2016,104390,104104,0.38%
2017,106877,106877,2.66%
2018,107286,107286,0.38%
2019,109837,109837,2.38%


In [36]:
#hide
print(f"Mean annual growth rate between 2012 and 2019: {100*(np.mean(growth_rate[2:])-1):.2f}%")

Mean annual growth rate between 2012 and 2019: 1.51%


In [45]:
#hide
amended_deaths = pd.DataFrame()
for year in deaths_year_week.index[2:-1]:
    amended_deaths[year] = \
    deaths_year_week.loc[year]*(df.loc[2019,"Número de defunciones"]/df.loc[year,"Número de defunciones"])*gr_mean
amended_deaths[2020] = deaths_year_week.loc[2020,:22]
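
The correction above rescales each past year's weekly counts to the 2019 annual total and then applies one more year of mean growth, so that past years are comparable with 2020. A minimal sketch of that scaling on invented numbers (the function name `rescale_to_target` and all values are hypothetical):

```python
import pandas as pd

def rescale_to_target(weekly_counts: pd.Series, year_total: float,
                      ref_total: float, growth: float) -> pd.Series:
    """Scale one year's weekly counts to the reference year's level, then grow one more year."""
    return weekly_counts * (ref_total / year_total) * growth

# Hypothetical numbers, not the Registro Civil data.
semanas_2015 = pd.Series({1: 100.0, 2: 200.0})
ajustado = rescale_to_target(semanas_2015, year_total=1000.0, ref_total=1100.0, growth=1.02)
print(ajustado.round(1).to_dict())
```

The scaling keeps the seasonal shape of each past year and only changes its overall level.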

In [46]:
#hide
# amended_deaths = pd.DataFrame()
# i = 8
# for year in deaths_year_week.index[2:-1]:
#     amended_deaths[year] = deaths_year_week.loc[year]*(gr_mean**i)
#     i -=1
# amended_deaths[2020] = deaths_year_week.loc[2020,:22]*(gr_mean**i)

In [47]:
#hide
amended_deaths = amended_deaths.T
amended_deaths

semana,1,2,3,4,5,6,7,8,9,10,...,43,44,45,46,47,48,49,50,51,52
2012,2177.838688,1951.847167,1945.101151,1930.484784,1946.225487,1931.60912,1876.51666,1859.651621,1907.998065,1986.70158,...,2162.097985,1877.640995,2270.034233,2103.632516,1942.852479,2022.68033,2037.296697,1863.024628,1977.706893,2083.39447
2013,2074.642888,2062.412946,1952.343468,2020.164056,1922.32452,2075.754701,1887.858319,1826.708609,2025.72312,1898.976449,...,2054.630256,1800.0251,2265.874708,2116.891779,2101.326398,1952.343468,2090.208269,1895.64101,2131.345346,2062.412946
2014,2007.492811,2125.259907,2019.487608,1994.407578,1901.720512,1835.203912,1914.805745,1914.805745,1854.831761,1843.9274,...,2108.903366,1871.188302,2237.574823,2144.887757,2022.758916,2012.944992,2082.7329,1902.810948,1992.226706,1986.774526
2015,1930.890828,2029.800604,1991.096779,1926.590403,1948.092529,2015.824222,1966.369335,1960.993804,1982.495929,1830.905947,...,2141.611654,1740.597022,2180.315479,2403.937579,2095.382085,2002.922947,2107.208254,2058.828472,2020.124647,1900.787853
2016,2260.105892,1984.535324,2063.574945,1989.875839,2040.076679,1868.1121,1841.409526,1961.037059,1980.262913,1885.201748,...,2230.199008,1880.929336,2058.23443,1922.585352,1980.262913,2001.624972,1934.334485,2019.782723,1909.768116,1949.287926
2017,1955.047679,1955.047679,2042.680553,2201.254324,2023.90208,1893.496018,1934.18271,1974.869401,1909.144746,1805.863145,...,1877.847291,2013.469595,2058.32928,1894.539267,2027.031826,1989.47488,1812.122636,2092.756481,2110.491705,2019.729086
2018,1934.08404,1938.241126,1847.824516,1848.863787,1933.044769,1959.026553,1941.35894,1889.395371,1849.903058,1978.77271,...,2145.056131,1830.156902,2313.418094,2002.675951,2063.992963,1980.851252,2007.872308,1981.890524,2057.757335,1966.301453
2019,1968.344759,1917.588061,1834.347076,1972.405295,1961.238822,2051.585745,1921.648597,1965.299358,1913.527525,1831.301674,...,2047.525209,1926.724267,2255.627672,2033.313333,2103.357577,1989.662573,1992.707975,1906.421588,1921.648597,2009.965252
2020,2060.0,2116.0,2060.0,1987.0,2033.0,1906.0,1959.0,1818.0,1855.0,2017.0,...,,,,,,,,,,


In [48]:
#hide
expected = amended_deaths.loc[2015:2019].mean()
ci = amended_deaths.loc[2015:2019].std(ddof=1)

In [49]:
#hide
df_expected = pd.DataFrame()
df_expected["Expected"] = expected
df_expected["lower"] = expected - ci
df_expected["upper"] = expected + ci
df_expected["lower 2"] = expected - 2*ci
df_expected["upper 2"] = expected + 2*ci

In [52]:
#hide
amended_deaths = amended_deaths.T
amended_deaths["Promedio últimos 5 años"] = expected
amended_deaths = amended_deaths.T

In [68]:
#hide_input
label = alt.selection_single(
    encodings=['x'], # limit selection to x-axis value
    on='mouseover',  # select on mouseover events
    nearest=True,    # select data point nearest the cursor
    empty='none'     # empty selection includes no data points
)

base = alt.Chart(amended_deaths.drop([i for i in np.arange(2012,2015)]).reset_index().melt("index", value_name="defunciones").rename(columns={"index":"año"})).mark_line(point=True).encode(
    x = alt.X("semana:Q",scale=alt.Scale(domain=(1, 52))),
    y = alt.Y("defunciones", scale=alt.Scale(domain=(0,3500))),
    color=alt.Color('año:N', scale=alt.Scale(range=['lightgray', 'lightgray', 'lightgray', 'lightgray', 'lightgray', 'red', 'blue'], 
                                             domain=["2015", "2016", "2017", "2018", "2019", "2020", "Promedio últimos 5 años"]))
)

alt.layer(
    base, # base line chart
    
    alt.Chart(df_expected.reset_index()).mark_area(
        opacity=0.5
    ).encode(
        x="semana",
        y=alt.Y("lower:Q", axis=alt.Axis(title="defunciones")),
        #y="lower:Q",
        y2="upper:Q"
    ),
    
    alt.Chart(df_expected.reset_index()).mark_area(
        opacity=0.3
    ).encode(
        x="semana",
        y="lower 2:Q",
        y2="upper 2:Q"
    ),
    
#     alt.Chart(expected.reset_index().rename(columns={0:"defunciones"})).mark_line(point=True).encode(
#     x = "semana",
#     y = "defunciones"
#     ),
    # add a rule mark to serve as a guide line
    alt.Chart().mark_rule(color='#aaa').encode(
        x = alt.X('semana:Q', scale=alt.Scale(domain=(1, 52)), axis=alt.Axis(title='Semanas'), sort=None)
    ).transform_filter(label),
    
    # add circle marks for selected time points, hide unselected points
    base.mark_circle().encode(
        opacity=alt.condition(label, alt.value(1), alt.value(0))
    ).add_selection(label),

    # add white stroked text to provide a legible background for labels
    base.mark_text(align='left', dx=5, dy=-5, stroke='white', strokeWidth=2).encode(
        text='defunciones'
    ).transform_filter(label),

    # add text labels showing the weekly death counts
    base.mark_text(align='left', dx=5, dy=-5).encode(
        text='defunciones'
    ).transform_filter(label),
    
    data=amended_deaths.drop([i for i in np.arange(2012,2015)]).reset_index().melt("index", value_name="defunciones").rename(columns={"index":"año"})
).properties(
    title = f'Registered deaths in Chile by week (corrected) through week {current_week-1}',
    width=600
)

Source: [Ministerio de Ciencia](https://github.com/MinCiencia/Datos-COVID19), [Registro Civil](https://www.registrocivil.cl/)

In [69]:
#hide
amended_deaths.index

Index([2012, 2013, 2014, 2015, 2016, 2017, 2018, 2019, 2020,
       'Promedio últimos 5 años'],
      dtype='object')

In [70]:
#hide
(amended_deaths.loc[2020,:12] - expected).sum()

475.7512888324077

In [71]:
#hide
excess_mortality = []
for i in np.arange(1,23):
    excess_mortality.append((amended_deaths.loc[2020,:i] - expected).sum())

In [72]:
#hide
excess_mortality = pd.Series(excess_mortality)
excess_mortality

0       50.305360
1      201.262801
2      305.358028
3      304.560098
4      356.289122
5      304.680194
6      342.686373
7      210.367375
8      138.300541
9      288.891496
10     365.962194
11     475.751289
12     494.394502
13     568.993776
14     503.765366
15     702.240034
16     702.219987
17     634.355557
18     948.383970
19    1221.012227
20    1732.327014
21    2549.835608
dtype: float64
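
The series above is the running total of the weekly differences between observed and expected deaths. The same quantity can be obtained without a loop using `cumsum`, as in this sketch on made-up values (`observado`, `esperado`, and `exceso_acumulado` are illustrative names):

```python
import pandas as pd

# Hypothetical weekly values indexed by ISO week.
observado = pd.Series({1: 105.0, 2: 98.0, 3: 120.0})
esperado = pd.Series({1: 100.0, 2: 100.0, 3: 100.0})

# Running sum of (observed - expected), one entry per week.
exceso_acumulado = (observado - esperado).cumsum()
print(exceso_acumulado.to_dict())
```

Unlike the loop output, the result stays indexed by week number instead of starting at position 0.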

In [73]:
#hide
casos_raw = pd.read_csv(
    "https://raw.githubusercontent.com/MinCiencia/Datos-COVID19/master/output/producto3/CasosTotalesCumulativo.csv",
    index_col='Region')

In [74]:
#hide
casos_raw.loc["Metropolitana"].head(2)

2020-03-03    0
2020-03-04    1
Name: Metropolitana, dtype: int64

In [75]:
#hide
_, week_first_case, _ = pd.to_datetime("2020-03-03").isocalendar()
week_first_case

10

In [76]:
#hide
# Use a separate name so that deaths_raw (the Registro Civil data) is not overwritten
totales_raw = pd.read_csv("https://raw.githubusercontent.com/MinCiencia/Datos-COVID19/master/output/producto5/TotalesNacionales_T.csv")

In [77]:
#hide
totales_raw[totales_raw["Fallecidos"]>0].head(2)

,Fecha,Casos nuevos con sintomas,Casos totales,Casos recuperados,Fallecidos,Casos activos,Casos nuevos sin sintomas,Casos nuevos totales
19,2020-03-22,95.0,632.0,8.0,1.0,623.0,,95.0
20,2020-03-23,114.0,746.0,11.0,1.0,734.0,,114.0


In [78]:
#hide
_, week_first_death, _ = pd.to_datetime("2020-03-22").isocalendar()
week_first_death

12

In [152]:
#hide_input
display(Markdown(f"Between weeks 12 and {current_week-1}, a total of {amended_deaths.loc[2020,12:22].sum():,.0f} deaths were registered."))
display(Markdown(f"The expected number of registered deaths for those weeks is {expected.loc[12:22].sum():,.0f}, assuming a 2019-to-2020 growth rate in registered deaths of {100*(gr_mean-1):.2f}%."))
display(Markdown(f"Excess mortality as of May 31: {amended_deaths.loc[2020,12:22].sum()-expected.loc[12:22].sum():,.0f}."))

Between weeks 12 and 22, a total of 24,712 deaths were registered.

The expected number of registered deaths for those weeks is 22,528, assuming a 2019-to-2020 growth rate in registered deaths of 1.51%.

Excess mortality as of May 31: 2,184.

In [80]:
#hide
totales_covid19 = pd.read_csv(
    "https://raw.githubusercontent.com/MinCiencia/Datos-COVID19/master/output/producto5/TotalesNacionales_T.csv",
    index_col="Fecha")

In [81]:
#hide
totales_covid19.head(2)

Fecha,Casos nuevos con sintomas,Casos totales,Casos recuperados,Fallecidos,Casos activos,Casos nuevos sin sintomas,Casos nuevos totales
2020-03-03,0.0,1.0,0.0,0.0,1.0,,0.0
2020-03-04,2.0,3.0,0.0,0.0,3.0,,2.0


In [82]:
#hide
_, week_first_case, _ = pd.to_datetime("2020-03-03").isocalendar()
week_first_case

10

In [149]:
#hide_input
display(Markdown(f"Official confirmed Covid-19 deaths as of May 31: {totales_covid19['Fallecidos'].loc['2020-05-31']:,.0f}."))

Official confirmed Covid-19 deaths as of May 31: 1,054.

In [162]:
#hide
diff = amended_deaths.loc[2020,12:22].sum()-expected.loc[12:22].sum() - totales_covid19['Fallecidos'].loc['2020-05-31']

In [163]:
#hide_input
display(Markdown(f"Difference: {diff:,.0f}."))

Difference: 1,130.
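
The figures reported above fit together with simple arithmetic. The check below only restates the numbers printed in this post (weeks 12 to 22 of 2020); the variable names are illustrative.

```python
observed = 24_712        # registered deaths, weeks 12 to 22 of 2020
expected = 22_528        # expected deaths for the same weeks
covid_confirmed = 1_054  # official confirmed Covid-19 deaths as of May 31

excess = observed - expected             # excess mortality
unexplained = excess - covid_confirmed   # excess not covered by confirmed Covid-19 deaths
share_confirmed = covid_confirmed / excess
print(excess, unexplained, f"{share_confirmed:.0%}")
```

In other words, officially confirmed Covid-19 deaths account for a little under half of the estimated excess in that period.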

In [88]:
#hide
data = pd.DataFrame()
data_raw = pd.read_csv(
    "https://raw.githubusercontent.com/MinCiencia/Datos-COVID19/master/output/producto4/2020-03-24-CasosConfirmados-totalRegional.csv",
    index_col='Region')
data['2020-03-24'] = data_raw['Fallecidos']

In [141]:
#hide
update_date = pd.to_datetime('today') - pd.offsets.Hour(19)
today = update_date.strftime('%Y-%m-%d')
today

'2020-06-06'

In [142]:
#hide
first_death_date = '2020-03-24'
total_days = (pd.to_datetime(today)-pd.to_datetime(first_death_date)).days

In [144]:
#hide
# s = "https://raw.githubusercontent.com/MinCiencia/Datos-COVID19/master/output/producto4/" + date + "-CasosConfirmados-totalRegional.csv"
# data_by_date = pd.read_csv(s)
# idx_fallecidos = [i for i, x in enumerate(data_by_date.columns.str.contains("Fallecidos", case=False)) if x]

In [145]:
#hide
for i in np.arange(total_days+1):
    date = (pd.to_datetime(first_death_date)+pd.DateOffset(i)).strftime('%Y-%m-%d')
    s = "https://raw.githubusercontent.com/MinCiencia/Datos-COVID19/master/output/producto4/" + date + "-CasosConfirmados-totalRegional.csv"
    data_by_date = pd.read_csv(s)
    if 'Fallecidos' in data_by_date.columns:
        data[date] = data_by_date["Fallecidos"].values
    elif 'Casos fallecidos' in data_by_date.columns:
        data[date] = data_by_date["Casos fallecidos"].values
    elif 'Fallecidos totales' in data_by_date.columns:
        data[date] = data_by_date["Fallecidos totales"].values
    elif 'Fallecidos totales ' in data_by_date.columns:
        data[date] = data_by_date["Fallecidos totales "].values
    else:
        data[date] = data_by_date[" Casos fallecidos"].values
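
The chain of `if`/`elif` branches above exists because the header of the deaths column varies between daily files. One possible way to handle that, sketched here under the assumption that the variants differ only in case and surrounding whitespace, is to normalize the header before matching. The helper `columna_fallecidos` and the example frame are hypothetical.

```python
import pandas as pd

# Normalized forms of the header variants handled by the if/elif chain above.
VARIANTES = {"fallecidos", "casos fallecidos", "fallecidos totales"}

def columna_fallecidos(df: pd.DataFrame) -> pd.Series:
    """Return the deaths column, ignoring case and surrounding whitespace in its name."""
    for nombre in df.columns:
        if nombre.strip().lower() in VARIANTES:
            return df[nombre]
    raise KeyError("no deaths column found")

# Hypothetical frame with a leading-space header, as in some daily files.
ejemplo = pd.DataFrame({"Region": ["A", "B"], " Casos fallecidos": [1, 2]})
print(columna_fallecidos(ejemplo).tolist())
```

Raising an error when no variant matches makes a new, unexpected header visible immediately instead of silently falling through to a default branch.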

In [146]:
#hide
data

Region,2020-03-24,2020-03-25,2020-03-26,2020-03-27,2020-03-28,2020-03-29,2020-03-30,2020-03-31,2020-04-01,2020-04-02,...,2020-05-28,2020-05-29,2020-05-30,2020-05-31,2020-06-01,2020-06-02,2020-06-03,2020-06-04,2020-06-05,2020-06-06
Arica y Parinacota,0,0,0,0,0,0,0,0,0,0,...,7,7,7,7,8,8,8,8,9,9
Tarapacá,0,0,0,0,0,0,0,0,0,0,...,12,14,18,21,22,23,28,28,28,30
Antofagasta,0,0,0,0,0,0,0,0,0,0,...,28,28,29,31,32,34,36,38,42,47
Atacama,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
Coquimbo,0,0,0,0,0,0,0,0,0,0,...,2,2,2,2,2,2,4,4,5,5
Valparaíso,0,0,0,0,0,0,0,1,1,1,...,60,61,63,66,70,72,77,78,88,91
Metropolitana,2,2,3,3,3,3,3,4,5,6,...,638,685,728,775,824,894,961,1034,1105,1184
O’Higgins,0,0,0,0,0,0,0,0,0,0,...,14,14,16,18,18,18,20,21,22,25
Maule,0,0,0,0,0,0,1,1,1,1,...,15,15,15,15,17,17,19,21,24,24
Ñuble,0,0,0,0,0,0,0,0,0,0,...,22,22,23,23,23,23,24,24,24,24


# Excess mortality in the Metropolitan Region (R.M.)

In [192]:
#hide
# Registered deaths in the R.M.
deaths = deaths_raw.query("Region == 'Metropolitana de Santiago'").drop(columns=["Region", "Codigo region", "Comuna", "Codigo comuna"]).sum()

In [193]:
#hide
# sort rows by date/index
deaths.index = pd.to_datetime(deaths.index)
deaths = deaths.sort_index()
last_day = deaths.index[-1]
last_day.strftime("%Y-%m-%d")

'2020-06-05'

In [194]:
#hide
deaths = (deaths
          .reset_index()
          .rename(columns={"index": "fecha", 0: "fallecidos"})
          )

In [195]:
#hide
def get_isoyear_isoweek(row):
    isoyear, isoweek, _ = row["fecha"].isocalendar()
    return pd.Series({"año": isoyear, "semana": isoweek})

In [196]:
#hide
deaths[["año", "semana"]] = deaths.apply(get_isoyear_isoweek, axis="columns")

In [197]:
#hide
deaths_year_week = (deaths
                    .drop(columns=["fecha"])
                    .groupby(["año", "semana"])
                    .sum()
                    ["fallecidos"]
                    .unstack()
                    .astype("Float16")  # caution: 16-bit floats hold integers exactly only up to 2048
                    )

In [198]:
#hide
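# Drop the first row (a partial ISO year preceding 2010) and the last column (week 53, which only some years have)
deaths_year_week = deaths_year_week.iloc[1:,:-1]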

In [199]:
#hide
deaths_year_week

año \ semana,1,2,3,4,5,6,7,8,9,10,...,43,44,45,46,47,48,49,50,51,52
2010,606.0,614.0,685.0,644.0,616.0,599.0,664.0,567.0,677.0,626.0,...,705.0,651.0,625.0,613.0,671.0,601.0,704.0,599.0,622.0,675.0
2011,682.0,645.0,672.0,722.0,683.0,614.0,561.0,634.0,624.0,551.0,...,651.0,615.0,658.0,627.0,692.0,644.0,638.0,648.0,618.0,714.0
2012,781.0,675.0,621.0,628.0,705.0,654.0,608.0,589.0,644.0,657.0,...,708.0,723.0,713.0,647.0,665.0,689.0,694.0,645.0,713.0,698.0
2013,713.0,687.0,627.0,652.0,637.0,714.0,623.0,636.0,691.0,693.0,...,692.0,673.0,693.0,699.0,786.0,648.0,682.0,609.0,731.0,697.0
2014,701.0,764.0,723.0,682.0,625.0,637.0,629.0,639.0,625.0,605.0,...,741.0,667.0,785.0,761.0,732.0,737.0,684.0,662.0,693.0,713.0
2015,681.0,749.0,715.0,622.0,673.0,699.0,706.0,685.0,618.0,645.0,...,743.0,654.0,721.0,823.0,763.0,704.0,745.0,726.0,722.0,702.0
2016,788.0,730.0,720.0,693.0,662.0,609.0,663.0,714.0,729.0,655.0,...,807.0,707.0,719.0,676.0,625.0,671.0,678.0,740.0,617.0,654.0
2017,754.0,677.0,760.0,812.0,760.0,661.0,667.0,721.0,685.0,643.0,...,720.0,676.0,728.0,678.0,750.0,677.0,669.0,688.0,793.0,783.0
2018,669.0,691.0,614.0,593.0,685.0,701.0,696.0,687.0,620.0,726.0,...,796.0,701.0,783.0,729.0,775.0,674.0,740.0,696.0,766.0,712.0
2019,733.0,678.0,668.0,742.0,779.0,757.0,705.0,674.0,720.0,656.0,...,760.0,817.0,793.0,744.0,789.0,740.0,763.0,683.0,707.0,734.0


In [200]:
#hide
# Mask the current week, which is still incomplete
deaths_year_week.loc[2020,current_week] = np.nan

In [201]:
#hide
expected = deaths_year_week.loc[2015:2019].mean()

In [202]:
#hide
ci = deaths_year_week.loc[2015:2019].std(ddof=1)

In [203]:
#hide
df_expected = pd.DataFrame()
df_expected["Expected"] = expected
df_expected["lower"] = expected - ci
df_expected["upper"] = expected + ci
df_expected["lower 2"] = expected - 2*ci
df_expected["upper 2"] = expected + 2*ci

In [204]:
#hide
df_expected.head(2)

semana,Expected,lower,upper,lower 2,upper 2
1,725.0,675.0,775.0,625.0,825.0
2,705.0,672.5,737.5,639.5,770.5


In [205]:
#hide
deaths_year_week = deaths_year_week.T
deaths_year_week["Promedio últimos 5 años"] = expected
deaths_year_week = deaths_year_week.T

In [206]:
#hide_input
label = alt.selection_single(
    encodings=['x'], # limit selection to x-axis value
    on='mouseover',  # select on mouseover events
    nearest=True,    # select data point nearest the cursor
    empty='none'     # empty selection includes no data points
)

base = alt.Chart(deaths_year_week.drop([i for i in np.arange(2010,2015)]).reset_index().melt("año", value_name="defunciones")).mark_line(point=True).encode(
    x = alt.X("semana:Q",scale=alt.Scale(domain=(1, 52))),
    y = alt.Y("defunciones", scale=alt.Scale(domain=(0,2000))),
    color=alt.Color('año:N', scale=alt.Scale(range=['lightgray', 'lightgray', 'lightgray', 'lightgray', 'lightgray', 'red',"blue"], 
                                             domain=["2015", "2016", "2017", "2018", "2019", "2020", "Promedio últimos 5 años"]))
)

alt.layer(
    base, # base line chart
    
    alt.Chart(df_expected.reset_index()).mark_area(
        opacity=0.5
    ).encode(
        x=alt.X("semana", scale=alt.Scale(domain=(1, 52))),
        y=alt.Y("lower:Q", axis=alt.Axis(title="defunciones")),
        y2="upper:Q"
    ),
    
    alt.Chart(df_expected.reset_index()).mark_area(
        opacity=0.3
    ).encode(
        x=alt.X("semana", scale=alt.Scale(domain=(1, 52))),
        y="lower 2:Q",
        y2="upper 2:Q"
    ),
    # add a rule mark to serve as a guide line
    alt.Chart().mark_rule(color='#aaa').encode(
        x = alt.X('semana', scale=alt.Scale(domain=(1, 52)), axis=alt.Axis(title='semana'), sort=None)
    ).transform_filter(label),
    
    # add circle marks for selected time points, hide unselected points
    base.mark_circle().encode(
        opacity=alt.condition(label, alt.value(1), alt.value(0))
    ).add_selection(label),

    # add white stroked text to provide a legible background for labels
    base.mark_text(align='left', dx=5, dy=-5, stroke='white', strokeWidth=2).encode(
        text='defunciones'
    ).transform_filter(label),

    # add text labels showing the weekly death counts
    base.mark_text(align='left', dx=5, dy=-5).encode(
        text='defunciones'
    ).transform_filter(label),
    
    data=deaths_year_week.drop([i for i in np.arange(2010,2015)]).reset_index().melt("año", value_name="defunciones")
).properties(
    title = f'Registered deaths in the R.M. by week (uncorrected) through week {current_week-1}',
    width=600
)

Source: [Ministerio de Ciencia](https://github.com/MinCiencia/Datos-COVID19), [Registro Civil](https://www.registrocivil.cl/)

In [207]:
#hide
deaths_year_week = deaths_year_week.drop("Promedio últimos 5 años")

In [208]:
#hide
n_defunciones = []
for year in np.arange(2010,2020):
    n_defunciones.append(deaths_year_week.loc[year,:].astype(int).sum())

df = pd.DataFrame()
df["Año"] = np.arange(2010,2020)
df["Número de defunciones"] = n_defunciones 
df = df.set_index("Año")

In [209]:
#hide
adjustment = 365/366  # rescale leap years (2012, 2016) to a 365-day year
df["ajustado a 365 días"] = df["Número de defunciones"].astype(float)
# a single .loc call avoids chained assignment (and the SettingWithCopyWarning)
df.loc[[2012, 2016], "ajustado a 365 días"] *= adjustment
df["ajustado a 365 días"] = df["ajustado a 365 días"].astype(int)


In [210]:
#hide
growth_rate = []
for year in np.arange(2011, 2020):
    growth_rate.append(df["ajustado a 365 días"].loc[year]/df["ajustado a 365 días"].loc[year-1])
growth_rate

[0.9808015807321567,
 1.028095393662202,
 1.0085795996186844,
 1.0310333963453056,
 1.0062388591800357,
 0.9991901809439453,
 1.0411569536256111,
 0.977546949498881,
 1.046161503048401]

In [211]:
#hide
print(f"Mean annual growth rate between 2012 and 2019: {100*(np.mean(growth_rate[2:])-1):.2f}%")

Mean annual growth rate between 2012 and 2019: 1.57%


In [215]:
#hide
deaths = deaths_raw.query("Region == 'Metropolitana de Santiago'").drop(columns=["Region", "Codigo region", "Comuna", "Codigo comuna"]).sum()
n_defunciones = []
for year in np.arange(2010,2020):
    n_defunciones.append(deaths.loc[f"{year}-01-01":f"{year}-12-31"].sum())
df = pd.DataFrame()
df["Año"] = np.arange(2010,2020)
df["Número de defunciones"] = n_defunciones 
df = df.set_index("Año")

In [216]:
#hide
adjustment = 365/366  # rescale leap years (2012, 2016) to a 365-day year
df["ajustado a 365 días"] = df["Número de defunciones"].astype(float)
# a single .loc call avoids chained assignment (and the SettingWithCopyWarning)
df.loc[[2012, 2016], "ajustado a 365 días"] *= adjustment
df["ajustado a 365 días"] = df["ajustado a 365 días"].astype(int)


In [217]:
#hide
df

Año,Número de defunciones,ajustado a 365 días
2010,37540,37540
2011,36874,36874
2012,37990,37886
2013,38251,38251
2014,39336,39336
2015,39642,39642
2016,39836,39727
2017,41114,41114
2018,40319,40319
2019,42183,42183


In [218]:
#hide
growth_rate = []
for year in np.arange(2011, 2020):
    growth_rate.append(df["ajustado a 365 días"].loc[year]/df["ajustado a 365 días"].loc[year-1])
growth_rate

[0.9822589238145978,
 1.0274448120627,
 1.0096341656548593,
 1.02836527149617,
 1.0077791336180597,
 1.00214419050502,
 1.0349132831575503,
 0.9806635209417717,
 1.0462313053399142]

In [219]:
#hide
growth_rate_percentage = [f"{100*(r - 1):.2f}%" for r in growth_rate]

In [220]:
#hide
df["Variación c/r año anterior"] = ["-"]+growth_rate_percentage

In [221]:
#hide
gr_mean = np.mean(growth_rate[2:])
gr_mean

1.0156758386733349

In [222]:
#hide
gr_std = np.std(growth_rate[2:], ddof=1)
gr_std

0.022258201400607868

In [223]:
#hide
df

Año,Número de defunciones,ajustado a 365 días,Variación c/r año anterior
2010,37540,37540,-
2011,36874,36874,-1.77%
2012,37990,37886,2.74%
2013,38251,38251,0.96%
2014,39336,39336,2.84%
2015,39642,39642,0.78%
2016,39836,39727,0.21%
2017,41114,41114,3.49%
2018,40319,40319,-1.93%
2019,42183,42183,4.62%


In [224]:
#hide
print(f"Mean annual growth rate between 2012 and 2019: {100*(np.mean(growth_rate[2:])-1):.2f}%")

Mean annual growth rate between 2012 and 2019: 1.57%
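
The leap-year adjustment and the growth-rate steps above can also be written without explicit loops. The following is a minimal sketch, assuming the yearly R.M. totals shown in the table above; the names `raw`, `adjusted` and `growth` are illustrative and are not part of the original notebook.

```python
import pandas as pd

# Yearly registered deaths in the R.M., copied from the table above
raw = pd.Series(
    [37540, 36874, 37990, 38251, 39336, 39642, 39836, 41114, 40319, 42183],
    index=pd.RangeIndex(2010, 2020, name="Año"),
)

# Rescale leap years to a 365-day year, then truncate to integers
adjusted = raw.astype(float)
adjusted.loc[[2012, 2016]] *= 365 / 366
adjusted = adjusted.astype(int)

# Year-over-year growth ratio: each year divided by the previous one
growth = adjusted / adjusted.shift(1)

# Mean and sample standard deviation over 2013-2019,
# the same window as growth_rate[2:] in the cells above
gr_mean = growth.loc[2013:].mean()
gr_std = growth.loc[2013:].std(ddof=1)
print(f"{100 * (gr_mean - 1):.2f}%")
```

Dividing by `shift(1)` keeps the year labels attached to each ratio, which makes it harder to pick the wrong window than when slicing a plain list by position.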


In [225]:
#hide
amended_deaths = pd.DataFrame()
for year in deaths_year_week.index[2:-1]:
    amended_deaths[year] = \
    deaths_year_week.loc[year]*(df.loc[2019,"Número de defunciones"]/df.loc[year,"Número de defunciones"])*gr_mean
amended_deaths[2020] = deaths_year_week.loc[2020,:22]
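
Written out, the rescaling in the loop above amounts to the formula below. The symbols are introduced here only for illustration: $d_{y,w}$ is the number of registered deaths in week $w$ of year $y$, $D_y$ is the yearly total (`"Número de defunciones"`), and $\bar{g}$ is the mean growth ratio `gr_mean`.

$$
\tilde{d}_{y,w} = d_{y,w} \cdot \frac{D_{2019}}{D_{y}} \cdot \bar{g}, \qquad y = 2012, \dots, 2019
$$

Each past year is first brought to the 2019 level and then projected one year forward, so that it is comparable with the 2020 counts, which are left unscaled.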

In [226]:
#hide
# amended_deaths = pd.DataFrame()
# i = 8
# for year in deaths_year_week.index[2:-1]:
#     amended_deaths[year] = deaths_year_week.loc[year]*(gr_mean**i)
#     i -=1
# amended_deaths[2020] = deaths_year_week.loc[2020,:22]*(gr_mean**i)

In [227]:
#hide
amended_deaths = amended_deaths.T
amended_deaths

semana,1,2,3,4,5,6,7,8,9,10,...,43,44,45,46,47,48,49,50,51,52
2012,880.5,761.0,700.5,708.5,795.0,737.5,685.5,664.0,726.0,741.0,...,798.5,815.5,804.0,729.5,750.0,777.0,782.5,727.0,804.0,787.0
2013,798.5,769.5,702.5,730.0,713.5,799.5,697.5,712.0,774.0,776.0,...,775.0,753.5,776.0,782.5,880.0,725.5,764.0,682.0,818.5,780.5
2014,763.0,832.0,787.0,743.0,680.5,693.5,685.0,695.5,680.5,658.5,...,807.0,726.0,854.5,829.0,797.5,803.0,745.0,721.0,754.5,776.5
2015,736.5,810.0,773.0,672.5,727.5,755.5,763.0,740.5,668.5,697.0,...,803.5,707.0,779.5,889.5,824.5,761.0,805.5,785.0,780.5,758.5
2016,847.0,785.0,774.0,745.0,712.0,654.5,713.0,768.0,783.5,704.5,...,868.0,760.0,773.0,726.5,672.0,721.5,728.5,795.5,663.0,703.5
2017,786.0,706.0,792.0,846.5,792.0,689.0,695.0,751.5,714.0,670.5,...,750.5,705.0,758.5,707.0,782.0,706.0,697.0,717.0,826.5,816.0
2018,710.5,734.0,652.0,629.5,727.5,744.5,739.5,729.5,658.5,771.5,...,845.5,744.5,832.0,774.5,823.0,716.0,786.0,739.5,813.5,756.0
2019,744.5,688.5,678.5,753.5,791.0,769.0,716.0,684.5,731.0,666.0,...,772.0,830.0,805.5,755.5,801.5,751.5,775.0,693.5,718.0,745.5
2020,825.0,784.0,751.0,724.0,767.0,683.0,706.0,616.0,643.0,808.0,...,,,,,,,,,,


In [228]:
#hide
expected = amended_deaths.loc[2015:2019].mean()
ci = amended_deaths.loc[2015:2019].std(ddof=1)

In [229]:
#hide
df_expected = pd.DataFrame()
df_expected["Expected"] = expected
df_expected["lower"] = expected - ci
df_expected["upper"] = expected + ci
df_expected["lower 2"] = expected - 2*ci
df_expected["upper 2"] = expected + 2*ci
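
The band construction above can be sketched on a small sample. This example assumes the first three weeks of 2015-2019 as printed in the `amended_deaths` table; `reference`, `spread` and `band` are illustrative names. Note that the quantity called `ci` in the notebook is a sample standard deviation, so with only five reference years the shaded areas are best read as a rough dispersion band rather than a formal confidence interval.

```python
import pandas as pd

# Weeks 1-3 for 2015-2019, copied from the amended_deaths table above
reference = pd.DataFrame(
    {
        1: [736.5, 847.0, 786.0, 710.5, 744.5],
        2: [810.0, 785.0, 706.0, 734.0, 688.5],
        3: [773.0, 774.0, 792.0, 652.0, 678.5],
    },
    index=[2015, 2016, 2017, 2018, 2019],
)

expected = reference.mean()      # per-week mean over the five years
spread = reference.std(ddof=1)   # per-week sample standard deviation

band = pd.DataFrame({
    "Expected": expected,
    "lower": expected - spread,
    "upper": expected + spread,
    "lower 2": expected - 2 * spread,
    "upper 2": expected + 2 * spread,
})
print(band.round(1))
```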

In [230]:
#hide
amended_deaths = amended_deaths.T
amended_deaths["Promedio últimos 5 años"] = expected
amended_deaths = amended_deaths.T

In [232]:
#hide_input
label = alt.selection_single(
    encodings=['x'], # limit selection to x-axis value
    on='mouseover',  # select on mouseover events
    nearest=True,    # select data point nearest the cursor
    empty='none'     # empty selection includes no data points
)

base = alt.Chart(amended_deaths.drop([i for i in np.arange(2012,2015)]).reset_index().melt("index", value_name="defunciones").rename(columns={"index":"año"})).mark_line(point=True).encode(
    x = alt.X("semana:Q",scale=alt.Scale(domain=(1, 52))),
    y = alt.Y("defunciones", scale=alt.Scale(domain=(0,2000))),
    color=alt.Color('año:N', scale=alt.Scale(range=['lightgray', 'lightgray', 'lightgray', 'lightgray', 'lightgray', 'red', 'blue'], 
                                             domain=["2015", "2016", "2017", "2018", "2019", "2020", "Promedio últimos 5 años"]))
)

alt.layer(
    base, # base line chart
    
    alt.Chart(df_expected.reset_index()).mark_area(
        opacity=0.5
    ).encode(
        x="semana",
        y=alt.Y("lower:Q", axis=alt.Axis(title="defunciones")),
        #y="lower:Q",
        y2="upper:Q"
    ),
    
    alt.Chart(df_expected.reset_index()).mark_area(
        opacity=0.3
    ).encode(
        x="semana",
        y="lower 2:Q",
        y2="upper 2:Q"
    ),
    
#     alt.Chart(expected.reset_index().rename(columns={0:"defunciones"})).mark_line(point=True).encode(
#     x = "semana",
#     y = "defunciones"
#     ),
    # add a rule mark to serve as a guide line
    alt.Chart().mark_rule(color='#aaa').encode(
        x = alt.X('semana:Q', scale=alt.Scale(domain=(1, 52)), axis=alt.Axis(title='Semanas'), sort=None)
    ).transform_filter(label),
    
    # add circle marks for selected time points, hide unselected points
    base.mark_circle().encode(
        opacity=alt.condition(label, alt.value(1), alt.value(0))
    ).add_selection(label),

    # add white stroked text to provide a legible background for labels
    base.mark_text(align='left', dx=5, dy=-5, stroke='white', strokeWidth=2).encode(
        text='defunciones'
    ).transform_filter(label),

    # add text labels showing the weekly death counts
    base.mark_text(align='left', dx=5, dy=-5).encode(
        text='defunciones'
    ).transform_filter(label),
    
    data=amended_deaths.drop([i for i in np.arange(2012,2015)]).reset_index().melt("index", value_name="defunciones").rename(columns={"index":"año"})
).properties(
    title = f'Registered deaths in the R.M. per week (corrected) through week {current_week-1}',
    width=600
)

Source: [Ministerio de Ciencia](https://github.com/MinCiencia/Datos-COVID19), [Registro Civil](https://www.registrocivil.cl/)

In [234]:
#hide
(amended_deaths.loc[2020,:12] - expected).sum()

-17.5

In [235]:
#hide
excess_mortality = []
for i in np.arange(1,23):
    excess_mortality.append((amended_deaths.loc[2020,:i] - expected).sum())

In [274]:
#hide
excess_mortality = pd.Series(excess_mortality)
excess_mortality.diff()

0       NaN
1      39.5
2      17.0
3      -5.5
4      16.5
5     -39.5
6     -19.5
7    -119.0
8     -68.0
9     106.0
10      4.0
11     -9.0
12    -12.5
13      9.0
14     28.0
15      0.5
16      9.0
17    -95.0
18    166.0
19    238.5
20    495.5
21    886.5
dtype: float64
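
The cumulative excess computed in the loop above is equivalent to a running sum of the weekly differences, so `diff()` recovers the weekly excess from the second week on. A minimal sketch with made-up weekly counts (all numbers and names below are hypothetical and only serve to show the equivalence):

```python
import pandas as pd

weeks = pd.RangeIndex(1, 5, name="semana")
# Hypothetical weekly counts, for illustration only
observed = pd.Series([825.0, 784.0, 751.0], index=weeks[:3])  # current year so far
expected = pd.Series([764.9, 744.7, 733.9, 740.0], index=weeks)

# Loop form, as in the notebook: weeks without observations become NaN
# after index alignment and are skipped by sum()
loop_excess = [(observed.loc[:i] - expected).sum() for i in observed.index]

# Equivalent vectorized form
cumulative_excess = (observed - expected.loc[observed.index]).cumsum()
weekly_excess = cumulative_excess.diff()  # first entry is NaN
```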

In [237]:
#hide
casos_raw = pd.read_csv(
    "https://raw.githubusercontent.com/MinCiencia/Datos-COVID19/master/output/producto3/CasosTotalesCumulativo.csv",
    index_col='Region')

In [238]:
#hide
casos_raw.loc["Metropolitana"].head(2)

2020-03-03    0
2020-03-04    1
Name: Metropolitana, dtype: int64

In [239]:
#hide
_, week_first_case, _ = pd.to_datetime("2020-03-04").isocalendar()
week_first_case

10

In [240]:
#hide
data = pd.DataFrame()
data_raw = pd.read_csv(
    "https://raw.githubusercontent.com/MinCiencia/Datos-COVID19/master/output/producto4/2020-03-24-CasosConfirmados-totalRegional.csv",
    index_col='Region')
data['2020-03-24'] = data_raw['Fallecidos']

In [242]:
#hide
update_date = pd.to_datetime('today') - pd.offsets.Hour(19)
today = update_date.strftime('%Y-%m-%d')
today

'2020-06-06'

In [243]:
#hide
first_death_date = '2020-03-24'
total_days = (pd.to_datetime(today)-pd.to_datetime(first_death_date)).days

In [244]:
#hide
for i in np.arange(total_days+1):
    date = (pd.to_datetime(first_death_date)+pd.DateOffset(i)).strftime('%Y-%m-%d')
    s = "https://raw.githubusercontent.com/MinCiencia/Datos-COVID19/master/output/producto4/" + date + "-CasosConfirmados-totalRegional.csv"
    data_by_date = pd.read_csv(s)
    if 'Fallecidos' in data_by_date.columns:
        data[date] = data_by_date["Fallecidos"].values
    elif 'Casos fallecidos' in data_by_date.columns:
        data[date] = data_by_date["Casos fallecidos"].values
    elif 'Fallecidos totales' in data_by_date.columns:
        data[date] = data_by_date["Fallecidos totales"].values
    elif 'Fallecidos totales ' in data_by_date.columns:
        data[date] = data_by_date["Fallecidos totales "].values
    else:
        data[date] = data_by_date[" Casos fallecidos"].values
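
The chain of `elif` branches above exists because the daily files spell the deaths column differently over time. One possible way to express the same idea more compactly is to normalize the header before matching. This is only a sketch: `pick_deaths_column` is a hypothetical helper, and the two toy frames stand in for the downloaded CSV files.

```python
import pandas as pd

def pick_deaths_column(frame: pd.DataFrame) -> pd.Series:
    """Return the column reporting deaths, whatever spelling the daily file uses."""
    candidates = {"Fallecidos", "Casos fallecidos", "Fallecidos totales"}
    for column in frame.columns:
        # strip() absorbs the stray leading/trailing spaces seen in some headers
        if column.strip() in candidates:
            return frame[column]
    raise KeyError(f"no deaths column found in {list(frame.columns)}")

# Two toy frames mimicking different header spellings
early = pd.DataFrame({"Region": ["Metropolitana"], "Fallecidos": [2]})
late = pd.DataFrame({"Region": ["Metropolitana"], " Casos fallecidos": [775]})
print(pick_deaths_column(early).iloc[0], pick_deaths_column(late).iloc[0])
```

Unlike the original branches, this version returns the first matching column in file order, so it would behave differently if a file ever contained more than one of the candidate names.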

In [246]:
#hide
data.loc["Metropolitana"]

2020-03-24       2
2020-03-25       2
2020-03-26       3
2020-03-27       3
2020-03-28       3
              ... 
2020-06-02     894
2020-06-03     961
2020-06-04    1034
2020-06-05    1105
2020-06-06    1184
Name: Metropolitana, Length: 75, dtype: int64

In [247]:
#hide
_, week_first_death, _ = pd.to_datetime("2020-03-24").isocalendar()
week_first_death

13
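
The week numbers used throughout are ISO 8601 week numbers, the second element returned by `isocalendar()`. A short standard-library sketch, assuming the two dates that appear in the cells above:

```python
import datetime

first_case = datetime.date(2020, 3, 4)    # first confirmed case in the R.M. series above
first_death = datetime.date(2020, 3, 24)  # first date in the deaths table above

# isocalendar() returns (ISO year, ISO week, ISO weekday)
week_first_case = first_case.isocalendar()[1]
week_first_death = first_death.isocalendar()[1]
print(week_first_case, week_first_death)
```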

In [267]:
#hide_input
display(Markdown(f"In the R.M., between weeks 12 and {current_week-1} there were a total of {amended_deaths.loc[2020,12:22].sum():,.0f} registered deaths."))

In the R.M., between weeks 12 and 22 there were a total of 10,456 registered deaths.

In [268]:
#hide_input
display(Markdown(f"The expected number of registered deaths for those weeks is {expected.loc[12:22].sum():,.0f}, assuming a 2019/2020 growth rate in registered deaths of {100*(gr_mean-1):.2f}% for the R.M."))

The expected number of registered deaths for those weeks is 8,736, assuming a 2019/2020 growth rate in registered deaths of 1.57% for the R.M.

In [269]:
#hide_input
display(Markdown(f"Excess mortality as of May 31: {(amended_deaths.loc[2020,12:22].sum()-expected.loc[12:22].sum()):,.0f}."))

Excess mortality as of May 31: 1,720.

In [270]:
#hide_input
display(Markdown(f"Official confirmed Covid-19 deaths as of May 31: {data.loc['Metropolitana', '2020-05-31']:,.0f}."))

Official confirmed Covid-19 deaths as of May 31: 775.

In [272]:
#hide
diff = amended_deaths.loc[2020,12:22].sum()-expected.loc[12:22].sum() - data.loc['Metropolitana', '2020-05-31']

In [273]:
#hide_input
display(Markdown(f"Difference: {diff:,.0f}."))

Difference: 945.
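
The arithmetic behind the figures reported above can be checked in a few lines. This is a sketch that hard-codes the rounded values printed by the notebook; the variable names are illustrative.

```python
observed_deaths = 10456  # registered deaths in the R.M., weeks 12-22
expected_deaths = 8736   # expected deaths for the same weeks
official_covid = 775     # official confirmed Covid-19 deaths as of May 31

excess = observed_deaths - expected_deaths   # excess mortality
unexplained = excess - official_covid        # excess not covered by the official count
share = official_covid / excess              # fraction of the excess officially attributed
print(excess, unexplained, f"{share:.0%}")
```

Under these numbers, the official count accounts for a bit less than half of the estimated excess.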

Thanks to [Patricio Reyes](https://pareyesv.github.io/) for his help with this work.