#**COVID-19 forecasting model (based on the SEIR model)**


##The SEIR model divides the population into four categories
1.   Category "S": individuals **S**usceptible to infection
2.   Category "E": individuals **E**xposed to the infection (infected but not yet infectious)
3.   Category "I": **I**nfected individuals able to transmit the infection to **S**usceptible individuals
4.   Category "R": individuals **R**emoved from the model: immunized, recovered, or deceased after infection
###Adapted from:
####*Campillo-Funollet E, Van Yperen J, Allman P, et al. Predicting and forecasting the impact of local outbreaks of COVID-19: use of SEIR-D quantitative epidemiological modelling for healthcare demand and capacity. International Journal of Epidemiology. 2021;50(4):1103-1113. doi:10.1093/ije/dyab106*

http://web.pdx.edu/~gjay/teaching/mth271_2020/html/09_SEIR_model.html
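
The four compartments listed above can be tied together with a small system of ordinary differential equations. The block below is a minimal sketch of a textbook SEIR model, not the SEIR-D formulation of the cited paper. The helper name `seir_rhs`, the rates `beta`, `sigma`, `gamma`, the population size and the time span are all illustrative assumptions, not fitted values.

```python
import numpy as np
from scipy.integrate import solve_ivp

def seir_rhs(t, y, beta, sigma, gamma):
    """Right-hand side of a textbook SEIR model (hypothetical helper)."""
    S, E, I, R = y
    N = S + E + I + R
    new_exposed = beta * S * I / N      # S -> E: contact with infectious individuals
    new_infectious = sigma * E          # E -> I: end of the latent period
    new_removed = gamma * I             # I -> R: recovery, immunity or death
    return [-new_exposed,
            new_exposed - new_infectious,
            new_infectious - new_removed,
            new_removed]

# Illustrative values only (assumptions, not estimates for any country)
N0 = 1_000_000
y0 = [N0 - 1, 0, 1, 0]                  # one infectious individual at t = 0
params = (0.5, 1 / 5.0, 1 / 7.0)        # beta, sigma, gamma in 1/day

sol = solve_ivp(seir_rhs, (0, 200), y0, args=params,
                t_eval=np.arange(0, 201), rtol=1e-8, atol=1e-8)
S, E, I, R = sol.y
print(f"Peak infectious: {I.max():.0f} on day {sol.t[I.argmax()]:.0f}")
```

Because every outflow from one compartment is an inflow to the next, the total `S + E + I + R` should stay constant over the whole run, which is a quick sanity check for any variant of the model.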


#Define the country of interest for the forecast



In [18]:
##Enter the country name with the first letter capitalized (e.g., Colombia)
pais = input()


Colombia
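
The country name is later compared against the dataset's `location` values with an exact `==`, so a lowercase entry or a stray space would silently produce an empty table. A minimal sketch of a more forgiving lookup follows. The helper `match_location` and the sample list `known` are hypothetical and only illustrate the idea.

```python
def match_location(name, locations):
    """Return the entry of `locations` equal to `name`, ignoring case and outer spaces."""
    wanted = name.strip().casefold()
    for loc in locations:
        if loc.casefold() == wanted:
            return loc
    raise ValueError(f"Unknown location: {name!r}")

# Sample names only; in the notebook they could come from the dataset's 'location' column
known = ["Colombia", "Costa Rica", "United States"]
print(match_location("  colombia ", known))
```

Raising an error on an unknown name makes the failure visible at input time rather than several cells later.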


#**Step 0: Prepare the working environment (packages and libraries for the analysis)**

In [19]:
##Import libraries and packages
import numpy as np
import pandas as pd
pd.options.mode.chained_assignment = None  # default='warn'
import seaborn as sns
from matplotlib import pyplot as plt
import matplotlib.dates as mdates
%matplotlib inline

!pip install mpld3
import mpld3

!pip install lmfit
import lmfit
from lmfit.lineshapes import gaussian, lorentzian

from scipy.integrate import odeint
from scipy.integrate import solve_ivp
from sklearn.metrics import mean_absolute_error
from tqdm.auto import tqdm
import pickle
import joblib

import datetime

mpld3.enable_notebook()
from functools import reduce
import warnings
warnings.filterwarnings('ignore')
#Plot style settings
sns.set()
sns.set_context("talk")
sns.set_style("ticks")




#**Positive cases and deaths by date**
##Taken from the up-to-date information available at the URL:
https://ourworldindata.org/covid-cases

In [20]:
#Read the data source from the Our World in Data repository
data_all = pd.read_csv('https://covid.ourworldindata.org/data/owid-covid-data.csv')
##Select the variables of interest, replace negative values with 0, and drop the columns that will NOT be analyzed, storing the result in a new dataset "df1"
df1 = data_all.drop(columns=['iso_code',
 'continent',
 'new_cases_smoothed',
 'new_deaths_smoothed',
 'total_cases_per_million',
 'new_cases_per_million',
 'new_cases_smoothed_per_million',
 'total_deaths_per_million',
 'new_deaths_per_million',
 'new_deaths_smoothed_per_million',
 'reproduction_rate',
 'icu_patients',
 'icu_patients_per_million',
 'hosp_patients',
 'hosp_patients_per_million',
 'weekly_icu_admissions',
 'weekly_icu_admissions_per_million',
 'weekly_hosp_admissions',
 'weekly_hosp_admissions_per_million',
 'new_tests',
 'total_tests',
 'total_tests_per_thousand',
 'new_tests_per_thousand',
 'new_tests_smoothed',
 'new_tests_smoothed_per_thousand',
 'positive_rate',
 'tests_per_case',
 'tests_units',
 'people_fully_vaccinated',
 'total_boosters',
 'new_vaccinations_smoothed',
 'total_vaccinations_per_hundred',
 'people_vaccinated_per_hundred',
 'people_fully_vaccinated_per_hundred',
 'total_boosters_per_hundred',
 'new_vaccinations_smoothed_per_million',
 'stringency_index',
 'population_density',
 'median_age',
 'aged_65_older',
 'aged_70_older',
 'gdp_per_capita',
 'extreme_poverty',
 'cardiovasc_death_rate',
 'diabetes_prevalence',
 'female_smokers',
 'male_smokers',
 'handwashing_facilities',
 'hospital_beds_per_thousand',
 'life_expectancy',
 'human_development_index',
 'excess_mortality_cumulative',
 'excess_mortality_cumulative_absolute',
 'excess_mortality'])
#Replace negative daily counts with 0 (missing values also become 0, because NaN > 0 is False)
df1['new_cases'] = df1['new_cases'].apply(lambda x : x if x > 0 else 0)
df1['new_deaths'] = df1['new_deaths'].apply(lambda x : x if x > 0 else 0)

#rename columns to match the original code
df1.rename(columns={
'new_cases':'infected_per_day',
'total_cases':'total_infected',
'total_deaths':'total_dead',
'new_deaths':'deadths_per_day'}, inplace=True)

#convert 'date' to datetime
df1['date'] = pd.to_datetime(df1['date'])
df1 = df1.sort_values(by='date')

#Restrict the dataset to the selected country
df1 = df1[df1.location == pais]
df1 = df1.drop(columns=['location'])
display(df1)
##df1 is the first dataset; columns for recovered cases will be added from conditions of

Unnamed: 0,date,total_infected,infected_per_day,total_dead,deadths_per_day,total_vaccinations,people_vaccinated,new_vaccinations,population,excess_mortality_cumulative_per_million
23176,2020-03-06,1.0,1.0,,0.0,,,,51265841.0,
23177,2020-03-07,1.0,0.0,,0.0,,,,51265841.0,
23178,2020-03-08,1.0,0.0,,0.0,,,,51265841.0,-6.234171
23179,2020-03-09,1.0,0.0,,0.0,,,,51265841.0,
23180,2020-03-10,3.0,2.0,,0.0,,,,51265841.0,
...,...,...,...,...,...,...,...,...,...,...
23742,2021-09-23,4946811.0,1608.0,126032.0,26.0,39270198.0,25662320.0,212285.0,51265841.0,
23743,2021-09-24,4948513.0,1702.0,126068.0,36.0,,,,51265841.0,
23744,2021-09-25,4950253.0,1740.0,126102.0,34.0,39610550.0,25888829.0,,51265841.0,
23745,2021-09-26,4951675.0,1422.0,126145.0,43.0,,,,51265841.0,
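
The cell above describes its cleanup as replacing negative values with 0, but the lambda it uses also turns missing values into 0, since `NaN > 0` evaluates to `False`. The sketch below compares that rule with two vectorized alternatives on a made-up series. The values are invented for illustration and are not taken from the dataset.

```python
import numpy as np
import pandas as pd

daily = pd.Series([5.0, -3.0, np.nan, 0.0, 12.0])   # made-up daily counts

# Same rule as the lambda used above: keep x when x > 0, otherwise 0
via_apply = daily.apply(lambda x: x if x > 0 else 0)

# Vectorized equivalent: where the condition is False (including NaN), write 0
via_where = daily.where(daily > 0, 0)

# clip() only touches negatives, so the missing value stays missing
via_clip = daily.clip(lower=0)

print(pd.DataFrame({"apply": via_apply, "where": via_where, "clip": via_clip}))
```

Which variant is appropriate depends on whether a missing report should count as zero cases or stay unknown. `Series.where` reproduces the notebook's behavior, while `clip` keeps the gap visible.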


#**Recovered cases by date**
##Available at the URL:
https://data.humdata.org/dataset/novel-coronavirus-2019-ncov-cases?force_layout=desktop

In [27]:
who_data=pd.read_csv('https://data.humdata.org/hxlproxy/data/download/time_series_covid19_recovered_global_iso3_regions.csv?dest=data_edit&filter01=merge&merge-url01=https%3A%2F%2Fdocs.google.com%2Fspreadsheets%2Fd%2Fe%2F2PACX-1vTglKQRXpkKSErDiWG6ycqEth32MY0reMuVGhaslImLjfuLU0EUgyyu2e-3vKDArjqGX7dXEBV8FJ4f%2Fpub%3Fgid%3D1326629740%26single%3Dtrue%26output%3Dcsv&merge-keys01=%23country%2Bname&merge-tags01=%23country%2Bcode%2C%23region%2Bmain%2Bcode%2C%23region%2Bmain%2Bname%2C%23region%2Bsub%2Bcode%2C%23region%2Bsub%2Bname%2C%23region%2Bintermediate%2Bcode%2C%23region%2Bintermediate%2Bname&filter02=merge&merge-url02=https%3A%2F%2Fdocs.google.com%2Fspreadsheets%2Fd%2Fe%2F2PACX-1vTglKQRXpkKSErDiWG6ycqEth32MY0reMuVGhaslImLjfuLU0EUgyyu2e-3vKDArjqGX7dXEBV8FJ4f%2Fpub%3Fgid%3D398158223%26single%3Dtrue%26output%3Dcsv&merge-keys02=%23adm1%2Bname&merge-tags02=%23country%2Bcode%2C%23region%2Bmain%2Bcode%2C%23region%2Bmain%2Bname%2C%23region%2Bsub%2Bcode%2C%23region%2Bsub%2Bname%2C%23region%2Bintermediate%2Bcode%2C%23region%2Bintermediate%2Bname&merge-replace02=on&merge-overwrite02=on&tagger-match-all=on&tagger-01-header=province%2Fstate&tagger-01-tag=%23adm1%2Bname&tagger-02-header=country%2Fregion&tagger-02-tag=%23country%2Bname&tagger-03-header=lat&tagger-03-tag=%23geo%2Blat&tagger-04-header=long&tagger-04-tag=%23geo%2Blon&header-row=1&url=https%3A%2F%2Fraw.githubusercontent.com%2FCSSEGISandData%2FCOVID-19%2Fmaster%2Fcsse_covid_19_data%2Fcsse_covid_19_time_series%2Ftime_series_covid19_recovered_global.csv',
                     index_col=False)#load the recovered-cases dataset (JHU CSSE time series served through HDX)
recovery=who_data.drop(columns=['Province/State','Lat','Long','ISO 3166-1 Alpha 3-Codes','Region Code',
                                'Region Name','Sub-region Code','Sub-region Name','Intermediate Region Code','Intermediate Region Name'])#drop unused columns
recovery.rename(columns={
'Country/Region':'country'}, inplace=True)#rename column for easier handling
recovery = recovery[recovery.country == pais]#restrict the selection to the country of interest
recovery = recovery.T #transpose the records
recovery.reset_index(level=0, inplace=True)
recovery.columns = ['date', 'total_recovered']#name columns by position (assumes a single row for the country)
recovery = recovery.iloc[1: , :]#drop the first row (the country name)
#Derive daily recovered cases from cumulative recovered cases
recovery['date'] = pd.to_datetime(recovery['date'])
recovery = recovery.sort_values(by='date')
recovery['recovered_per_day'] = recovery['total_recovered'].diff().fillna(recovery['total_recovered'])#builds the (otherwise unavailable) daily recovered values from the cumulative values

recovery.head()

Unnamed: 0,date,total_recovered,recovered_per_day
1,2020-01-22,0,0.0
2,2020-01-23,0,0.0
3,2020-01-24,0,0.0
4,2020-01-25,0,0.0
5,2020-01-26,0,0.0
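
The daily recovered values above are derived from cumulative totals with `diff()`. A minimal sketch of that step on made-up numbers shows two details: how the first row is filled, and how a downward revision of the cumulative series becomes a negative daily value. The series `cumulative` is invented for illustration.

```python
import pandas as pd

cumulative = pd.Series([0, 0, 3, 10, 8, 15],
                       index=pd.date_range("2020-03-01", periods=6, freq="D"))  # made-up totals

# diff() leaves NaN in the first row; filling it with the first cumulative value
# treats everything reported on day one as that day's count
per_day = cumulative.diff().fillna(cumulative)

# A downward revision (10 -> 8) shows up as a negative daily value
print(per_day.tolist())

# Sanity check: unclamped daily values add back up to the last cumulative total
assert per_day.sum() == cumulative.iloc[-1]
```

Once negative daily values are clamped to 0, as the notebook does after merging, the daily series no longer sums exactly to the cumulative total.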


#**Integrate the data**

In [28]:
#Join the two datasets (df1 and the recovered cases by date) on the date column
df2=recovery
df3 = pd.merge(left=df1, right=df2, left_on='date', right_on='date')
df3['recovered_per_day'] = df3['recovered_per_day'].apply(lambda x : x if x > 0 else 0)

#Compute smoothed data (7-day moving average)
df_smoothed = df3.rolling(7).mean().round(5)
df_smoothed.columns = [col + '_ma7' for col in df_smoothed.columns]

full_df = pd.concat([df3, df_smoothed], axis=1)
for column in full_df.columns:
    if column.endswith('_ma7'):
        original_column = column[:-len('_ma7')]  # drop the suffix; str.strip removes a set of characters, not a suffix
        full_df[column] = full_df[column].fillna(full_df[original_column])
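
The smoothing step computes a 7-day moving average and, for the first six rows where the window is incomplete, falls back to the raw value. The sketch below reproduces that pattern on a made-up column named `gamma`, chosen on purpose to show a detail of mapping a smoothed column back to its original: `str.strip` treats its argument as a set of characters, so it is not a reliable way to remove a suffix. The frame `toy` and its values are assumptions for illustration.

```python
import pandas as pd

toy = pd.DataFrame({"gamma": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]})  # made-up values

smoothed = toy.rolling(7).mean().add_suffix("_ma7")   # first 6 rows are NaN
full = pd.concat([toy, smoothed], axis=1)

suffix = "_ma7"
for column in smoothed.columns:
    original = column[:-len(suffix)]                  # 'gamma_ma7' -> 'gamma'
    full[column] = full[column].fillna(full[original])

# str.strip removes any of the characters '_', 'm', 'a', '7' from both ends
print("gamma_ma7".strip("_ma7"))
print(full)
```

With the column names used in this notebook both approaches happen to give the same result, because none of the original names ends in one of those four characters. Slicing off the suffix stays correct if new columns are added.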


In [29]:
#Index by date
full_df.index = pd.date_range(start=full_df.date.iloc[0], end=full_df.date.iloc[-1], freq='D')
full_df.index


DatetimeIndex(['2020-03-06', '2020-03-07', '2020-03-08', '2020-03-09',
               '2020-03-10', '2020-03-11', '2020-03-12', '2020-03-13',
               '2020-03-14', '2020-03-15',
               ...
               '2021-09-18', '2021-09-19', '2021-09-20', '2021-09-21',
               '2021-09-22', '2021-09-23', '2021-09-24', '2021-09-25',
               '2021-09-26', '2021-09-27'],
              dtype='datetime64[ns]', length=571, freq='D')
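
Assigning a `pd.date_range` as the index assumes there is exactly one row for every calendar day between the first and last date. If a day is missing from the merged data, the assignment fails with a length mismatch. A minimal sketch of a check for gaps, using a made-up frame `toy` with one missing day, could look like this:

```python
import pandas as pd

dates = pd.to_datetime(["2020-03-06", "2020-03-07", "2020-03-09"])  # 2020-03-08 is missing
toy = pd.DataFrame({"date": dates, "infected_per_day": [1.0, 0.0, 2.0]})

expected = pd.date_range(start=toy["date"].iloc[0], end=toy["date"].iloc[-1], freq="D")
missing = expected.difference(pd.DatetimeIndex(toy["date"]))
print("Missing dates:", list(missing.strftime("%Y-%m-%d")))

# Reindexing on the full calendar inserts the missing days as NaN rows
# instead of failing on a length mismatch
daily = toy.set_index("date").reindex(expected)
print(daily)
```

Reindexing keeps each value attached to its own date, whereas overwriting the index by position would shift values if the rows and the calendar ever disagreed.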

In [30]:
full_df.head()

Unnamed: 0,date,total_infected,infected_per_day,total_dead,deadths_per_day,total_vaccinations,people_vaccinated,new_vaccinations,population,excess_mortality_cumulative_per_million,total_recovered,recovered_per_day,total_infected_ma7,infected_per_day_ma7,total_dead_ma7,deadths_per_day_ma7,total_vaccinations_ma7,people_vaccinated_ma7,new_vaccinations_ma7,population_ma7,excess_mortality_cumulative_per_million_ma7,total_recovered_ma7,recovered_per_day_ma7
2020-03-06,2020-03-06,1.0,1.0,,0.0,,,,51265841.0,,0,0.0,1.0,1.0,,0.0,,,,51265841.0,,0.0,0.0
2020-03-07,2020-03-07,1.0,0.0,,0.0,,,,51265841.0,,0,0.0,1.0,0.0,,0.0,,,,51265841.0,,0.0,0.0
2020-03-08,2020-03-08,1.0,0.0,,0.0,,,,51265841.0,-6.234171,0,0.0,1.0,0.0,,0.0,,,,51265841.0,-6.234171,0.0,0.0
2020-03-09,2020-03-09,1.0,0.0,,0.0,,,,51265841.0,,0,0.0,1.0,0.0,,0.0,,,,51265841.0,,0.0,0.0
2020-03-10,2020-03-10,3.0,2.0,,0.0,,,,51265841.0,,0,0.0,3.0,2.0,,0.0,,,,51265841.0,,0.0,0.0
