# The Problem

We will use the data from the DengAI dengue prediction competition. More information at this [link](https://www.drivendata.org/competitions/44/dengai-predicting-disease-spread/).

The exercise has two goals: to experiment with a neural network on a regression problem, and to take part in a data competition.


# Preliminary Setup

Before starting any analysis, make sure the basic general-purpose libraries (numpy, pandas, etc.) are correctly imported.

In [None]:
# data loading
import pandas as pd
import io
from google.colab import files
# manipulation and visualization
import matplotlib.pyplot as plt
import numpy as np
import itertools
import seaborn as sns



# Loading the Data

First we load the data into the environment using the file-upload functionality that Google Colab provides.

In [None]:
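def upload_files():
  # Open the Colab upload dialog; the result maps each file name to its bytes
  uploaded = files.upload()
  for fn in uploaded.keys():
    print('User uploaded file "{name}" with length {length} bytes'.format(
        name=fn, length=len(uploaded[fn])))
    # Parse the first uploaded file as CSV and return it as a DataFrame
    df = pd.read_csv(io.StringIO(uploaded[fn].decode('utf-8')))
    return df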
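
Outside Colab, `files.upload()` is not available. The following is a minimal sketch of an equivalent loader, assuming the CSV is already on disk. The function name `load_csv` and the demo file are hypothetical and only serve as illustration; they are not part of the original notebook.

```python
import os
import tempfile

import pandas as pd


def load_csv(path):
    """Read a CSV from disk and report its size, mirroring upload_files()."""
    size = os.path.getsize(path)
    print('Loaded file "{name}" with length {length} bytes'.format(
        name=os.path.basename(path), length=size))
    return pd.read_csv(path)


# Demo with a tiny throwaway file (hypothetical data, not the competition CSV)
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo.csv")
    with open(demo_path, "w") as f:
        f.write("city,year,weekofyear\nsj,1990,18\niq,2000,26\n")
    demo_df = load_csv(demo_path)

print(demo_df.shape)
```

With the real files, a call such as `load_csv("dengue_features_train.csv")` would play the role of `upload_files()`, assuming the file sits in the working directory.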

First we upload the training data without the target field (dengue_features_train).


In [None]:
train_feat = upload_files()
train_feat.head()

Saving dengue_features_train.csv to dengue_features_train.csv
User uploaded file "dengue_features_train.csv" with length 266487 bytes


Unnamed: 0,city,year,weekofyear,week_start_date,ndvi_ne,ndvi_nw,ndvi_se,ndvi_sw,precipitation_amt_mm,reanalysis_air_temp_k,...,reanalysis_precip_amt_kg_per_m2,reanalysis_relative_humidity_percent,reanalysis_sat_precip_amt_mm,reanalysis_specific_humidity_g_per_kg,reanalysis_tdtr_k,station_avg_temp_c,station_diur_temp_rng_c,station_max_temp_c,station_min_temp_c,station_precip_mm
0,sj,1990,18,30/04/1990,0.1226,0.103725,0.198483,0.177617,12.42,297.572857,...,32.0,73.365714,12.42,14.012857,2.628571,25.442857,6.9,29.4,20.0,16.0
1,sj,1990,19,7/05/1990,0.1699,0.142175,0.162357,0.155486,22.82,298.211429,...,17.94,77.368571,22.82,15.372857,2.371429,26.714286,6.371429,31.7,22.2,8.6
2,sj,1990,20,14/05/1990,0.03225,0.172967,0.1572,0.170843,34.54,298.781429,...,26.1,82.052857,34.54,16.848571,2.3,26.714286,6.485714,32.2,22.8,41.4
3,sj,1990,21,21/05/1990,0.128633,0.245067,0.227557,0.235886,15.36,298.987143,...,13.9,80.337143,15.36,16.672857,2.428571,27.471429,6.771429,33.3,23.3,4.0
4,sj,1990,22,28/05/1990,0.1962,0.2622,0.2512,0.24734,7.52,299.518571,...,12.2,80.46,7.52,17.21,3.014286,28.942857,9.371429,35.0,23.9,5.8


In [None]:
train_feat.shape

(1456, 24)

Next we upload the file that contains only the number of dengue cases for the training weeks (dengue_labels_train).

In [None]:
train_labels = upload_files()
train_labels.head()

Saving dengue_labels_train.csv to dengue_labels_train.csv
User uploaded file "dengue_labels_train.csv" with length 19582 bytes


Unnamed: 0,city,year,weekofyear,total_cases
0,sj,1990,18,4
1,sj,1990,19,5
2,sj,1990,20,4
3,sj,1990,21,3
4,sj,1990,22,6


We merge the two datasets into a single DataFrame and then split it by city.

In [None]:
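# Join features and labels on the shared keys
train = pd.merge(train_feat, train_labels, on=['city', 'year', 'weekofyear'])
# One DataFrame per city: sj (San Juan) and iq (Iquitos)
train1 = train[train['city'] == 'sj']
train2 = train[train['city'] == 'iq']
train1.head()
train2.head()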


Unnamed: 0,city,year,weekofyear,week_start_date,ndvi_ne,ndvi_nw,ndvi_se,ndvi_sw,precipitation_amt_mm,reanalysis_air_temp_k,...,reanalysis_relative_humidity_percent,reanalysis_sat_precip_amt_mm,reanalysis_specific_humidity_g_per_kg,reanalysis_tdtr_k,station_avg_temp_c,station_diur_temp_rng_c,station_max_temp_c,station_min_temp_c,station_precip_mm,total_cases
936,iq,2000,26,1/07/2000,0.192886,0.132257,0.340886,0.2472,25.41,296.74,...,92.418571,25.41,16.651429,8.928571,26.4,10.775,32.5,20.7,3.0,0
937,iq,2000,27,8/07/2000,0.216833,0.2761,0.289457,0.241657,60.61,296.634286,...,93.581429,60.61,16.862857,10.314286,26.9,11.566667,34.0,20.8,55.6,0
938,iq,2000,28,15/07/2000,0.176757,0.173129,0.204114,0.128014,55.52,296.415714,...,95.848571,55.52,17.12,7.385714,26.8,11.466667,33.0,20.7,38.1,0
939,iq,2000,29,22/07/2000,0.227729,0.145429,0.2542,0.200314,5.6,295.357143,...,87.234286,5.6,14.431429,9.114286,25.766667,10.533333,31.5,14.7,30.0,0
940,iq,2000,30,29/07/2000,0.328643,0.322129,0.254371,0.361043,62.76,296.432857,...,88.161429,62.76,15.444286,9.5,26.6,11.48,33.3,19.1,4.0,0
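
A merge on several keys can silently drop or duplicate rows when the keys do not line up. The sketch below, on hypothetical toy frames standing in for `train_feat` and `train_labels`, shows one way to check this with the `validate` and `indicator` parameters of `pd.merge`. It is an optional sanity check, not a step from the original notebook.

```python
import pandas as pd

# Toy stand-ins for train_feat and train_labels (hypothetical values)
feat = pd.DataFrame({
    "city": ["sj", "sj", "iq"],
    "year": [1990, 1990, 2000],
    "weekofyear": [18, 19, 26],
    "ndvi_ne": [0.12, 0.17, 0.19],
})
labels = pd.DataFrame({
    "city": ["sj", "sj", "iq"],
    "year": [1990, 1990, 2000],
    "weekofyear": [18, 19, 26],
    "total_cases": [4, 5, 0],
})

# validate raises pandas.errors.MergeError if a key is duplicated on either side;
# indicator adds a _merge column saying which table each row came from
merged = pd.merge(feat, labels, on=["city", "year", "weekofyear"],
                  how="outer", validate="one_to_one", indicator=True)

# Rows that exist in only one of the two tables
unmatched = merged[merged["_merge"] != "both"]
print(len(merged), len(unmatched))
```

An empty `unmatched` frame means every feature row found exactly one label row.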


Finally we upload the data used to generate the final evaluation (dengue_features_test).

In [None]:
test = upload_files()
test1= test[test['city']=='sj']
test2= test[test['city']=='iq']
test1.head()
test2.head()

Saving dengue_features_test.csv to dengue_features_test.csv
User uploaded file "dengue_features_test.csv" with length 82465 bytes


Unnamed: 0,city,year,weekofyear,week_start_date,ndvi_ne,ndvi_nw,ndvi_se,ndvi_sw,precipitation_amt_mm,reanalysis_air_temp_k,...,reanalysis_precip_amt_kg_per_m2,reanalysis_relative_humidity_percent,reanalysis_sat_precip_amt_mm,reanalysis_specific_humidity_g_per_kg,reanalysis_tdtr_k,station_avg_temp_c,station_diur_temp_rng_c,station_max_temp_c,station_min_temp_c,station_precip_mm
260,iq,2010,26,2010-07-02,0.183783,0.1425,0.225129,0.150214,82.29,297.648571,...,34.11,92.581429,82.29,17.654286,9.428571,27.44,10.76,33.8,21.5,11.2
261,iq,2010,27,2010-07-09,0.291657,0.272267,0.3307,0.320914,25.3,298.224286,...,9.1,83.885714,25.3,16.32,10.157143,27.025,9.625,33.0,21.2,8.9
262,iq,2010,28,2010-07-16,0.208543,0.366457,0.212629,0.255514,62.14,297.955714,...,61.09,92.057143,62.14,18.03,9.557143,26.95,10.35,33.4,21.6,22.6
263,iq,2010,29,2010-07-23,0.089286,0.063214,0.122057,0.081957,47.8,295.715714,...,19.6,88.97,47.8,15.394286,7.828571,26.9,9.7,33.3,14.2,4.8
264,iq,2010,30,2010-07-30,0.3061,0.327683,0.250086,0.267914,56.3,298.502857,...,18.93,78.61,56.3,15.468571,11.771429,27.05,11.85,33.5,16.9,3.0


In [None]:
test1.shape

(260, 24)

In [None]:
test2.shape

(156, 24)

# Preprocessing

This section holds the functions and transformations that make the variables usable by the models that follow, for example converting categorical variables to numeric ones.

In [None]:
# drop() without inplace returns a new DataFrame, which avoids the
# SettingWithCopyWarning raised when modifying the per-city slices in place
train1 = train1.drop(columns="week_start_date")
train2 = train2.drop(columns="week_start_date")
test1 = test1.drop(columns="week_start_date")
test2 = test2.drop(columns="week_start_date")
train1.head()


Unnamed: 0,city,year,weekofyear,ndvi_ne,ndvi_nw,ndvi_se,ndvi_sw,precipitation_amt_mm,reanalysis_air_temp_k,reanalysis_avg_temp_k,...,reanalysis_relative_humidity_percent,reanalysis_sat_precip_amt_mm,reanalysis_specific_humidity_g_per_kg,reanalysis_tdtr_k,station_avg_temp_c,station_diur_temp_rng_c,station_max_temp_c,station_min_temp_c,station_precip_mm,total_cases
0,sj,1990,18,0.1226,0.103725,0.198483,0.177617,12.42,297.572857,297.742857,...,73.365714,12.42,14.012857,2.628571,25.442857,6.9,29.4,20.0,16.0,4
1,sj,1990,19,0.1699,0.142175,0.162357,0.155486,22.82,298.211429,298.442857,...,77.368571,22.82,15.372857,2.371429,26.714286,6.371429,31.7,22.2,8.6,5
2,sj,1990,20,0.03225,0.172967,0.1572,0.170843,34.54,298.781429,298.878571,...,82.052857,34.54,16.848571,2.3,26.714286,6.485714,32.2,22.8,41.4,4
3,sj,1990,21,0.128633,0.245067,0.227557,0.235886,15.36,298.987143,299.228571,...,80.337143,15.36,16.672857,2.428571,27.471429,6.771429,33.3,23.3,4.0,3
4,sj,1990,22,0.1962,0.2622,0.2512,0.24734,7.52,299.518571,299.664286,...,80.46,7.52,17.21,3.014286,28.942857,9.371429,35.0,23.9,5.8,6


The data contains missing values. We handle them here because they would otherwise prevent us from fitting a model.

In [None]:
pd.isnull(train1).sum()
pd.isnull(train2).sum()

city                                      0
year                                      0
weekofyear                                0
ndvi_ne                                   3
ndvi_nw                                   3
ndvi_se                                   3
ndvi_sw                                   3
precipitation_amt_mm                      4
reanalysis_air_temp_k                     4
reanalysis_avg_temp_k                     4
reanalysis_dew_point_temp_k               4
reanalysis_max_air_temp_k                 4
reanalysis_min_air_temp_k                 4
reanalysis_precip_amt_kg_per_m2           4
reanalysis_relative_humidity_percent      4
reanalysis_sat_precip_amt_mm              4
reanalysis_specific_humidity_g_per_kg     4
reanalysis_tdtr_k                         4
station_avg_temp_c                       37
station_diur_temp_rng_c                  37
station_max_temp_c                       14
station_min_temp_c                        8
station_precip_mm               

We fill them automatically with the mean of each column. Forward fill (`ffill`, which repeats the previous valid value) is an alternative.

In [None]:
# Fill each missing value with the mean of its (numeric) column.
# Alternatives: forward fill (DataFrame.ffill) or interpolation:
# https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.interpolate.html#pandas.DataFrame.interpolate
train1 = train1.fillna(train1.mean(numeric_only=True))
pd.isnull(train1).any()
train2 = train2.fillna(train2.mean(numeric_only=True))
pd.isnull(train2).any()


city                                     False
year                                     False
weekofyear                               False
ndvi_ne                                  False
ndvi_nw                                  False
ndvi_se                                  False
ndvi_sw                                  False
precipitation_amt_mm                     False
reanalysis_air_temp_k                    False
reanalysis_avg_temp_k                    False
reanalysis_dew_point_temp_k              False
reanalysis_max_air_temp_k                False
reanalysis_min_air_temp_k                False
reanalysis_precip_amt_kg_per_m2          False
reanalysis_relative_humidity_percent     False
reanalysis_sat_precip_amt_mm             False
reanalysis_specific_humidity_g_per_kg    False
reanalysis_tdtr_k                        False
station_avg_temp_c                       False
station_diur_temp_rng_c                  False
station_max_temp_c                       False
station_min_t
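
The cell above fills gaps with the column mean; forward fill and interpolation are common alternatives for weekly series. The sketch below compares the three on a hypothetical toy series so the differences are easy to see. The values are invented for illustration.

```python
import numpy as np
import pandas as pd

# Toy weekly temperature series with two gaps (hypothetical values)
s = pd.Series([25.0, np.nan, 27.0, 28.0, np.nan, 30.0])

filled_mean = s.fillna(s.mean())   # every gap gets the overall mean
filled_ffill = s.ffill()           # every gap repeats the previous valid value
filled_interp = s.interpolate()    # every gap is interpolated linearly

comparison = pd.DataFrame({
    "original": s,
    "mean": filled_mean,
    "ffill": filled_ffill,
    "interpolate": filled_interp,
})
print(comparison)
```

Mean filling ignores the position of the gap in time, while forward fill and interpolation use the neighbouring weeks. Which one works better for this dataset is an empirical question.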

We apply the same procedure to the test sets.

In [None]:
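# Fill missing values in each test set with its own column means
test1 = test1.fillna(test1.mean(numeric_only=True))
pd.isnull(test1).any()
test2 = test2.fillna(test2.mean(numeric_only=True))
pd.isnull(test2).any()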

  """Entry point for launching an IPython kernel.
  This is separate from the ipykernel package so we can avoid doing imports until


city                                     False
year                                     False
weekofyear                               False
ndvi_ne                                  False
ndvi_nw                                  False
ndvi_se                                  False
ndvi_sw                                  False
precipitation_amt_mm                     False
reanalysis_air_temp_k                    False
reanalysis_avg_temp_k                    False
reanalysis_dew_point_temp_k              False
reanalysis_max_air_temp_k                False
reanalysis_min_air_temp_k                False
reanalysis_precip_amt_kg_per_m2          False
reanalysis_relative_humidity_percent     False
reanalysis_sat_precip_amt_mm             False
reanalysis_specific_humidity_g_per_kg    False
reanalysis_tdtr_k                        False
station_avg_temp_c                       False
station_diur_temp_rng_c                  False
station_max_temp_c                       False
station_min_t
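
The cells above fill each test set with its own column means. A possible variation, sketched here on hypothetical toy frames, is to reuse the means computed on the training data so that train and test are imputed with the same values. This is offered as an alternative to consider, not as what the notebook does.

```python
import numpy as np
import pandas as pd

# Toy frames standing in for one city's train and test data (hypothetical values)
train_demo = pd.DataFrame({"station_avg_temp_c": [25.0, 27.0, np.nan, 29.0]})
test_demo = pd.DataFrame({"station_avg_temp_c": [np.nan, 31.0]})

# Statistics are computed on the training data only
train_means = train_demo.mean(numeric_only=True)

# The same means fill the gaps in both sets
train_filled = train_demo.fillna(train_means)
test_filled = test_demo.fillna(train_means)

print(train_means["station_avg_temp_c"])
print(test_filled)
```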

We also need to transform the categorical variables. Here there is only one, the city, and we encode it with a *binarizer* for both train and test. Because each DataFrame now holds a single city, the resulting `city_bin` column is constant.

In [None]:
from sklearn import preprocessing
lb = preprocessing.LabelBinarizer()
# Fit the encoder on the training data and reuse it on the matching test set
train1['city_bin'] = lb.fit_transform(train1['city'])
test1['city_bin'] = lb.transform(test1['city'])
train2['city_bin'] = lb.fit_transform(train2['city'])
test2['city_bin'] = lb.transform(test2['city'])

In [None]:
train1.tail()

Unnamed: 0,city,year,weekofyear,ndvi_ne,ndvi_nw,ndvi_se,ndvi_sw,precipitation_amt_mm,reanalysis_air_temp_k,reanalysis_avg_temp_k,...,reanalysis_sat_precip_amt_mm,reanalysis_specific_humidity_g_per_kg,reanalysis_tdtr_k,station_avg_temp_c,station_diur_temp_rng_c,station_max_temp_c,station_min_temp_c,station_precip_mm,total_cases,city_bin
931,sj,2008,13,0.07785,-0.0399,0.310471,0.296243,27.19,296.958571,296.957143,...,27.19,13.644286,2.885714,25.042857,5.785714,30.0,21.1,1.8,4,0
932,sj,2008,14,-0.038,-0.016833,0.119371,0.066386,3.82,298.081429,298.228571,...,3.82,14.662857,2.714286,26.242857,6.814286,30.6,22.2,0.5,3,0
933,sj,2008,15,-0.1552,-0.05275,0.137757,0.141214,16.96,297.46,297.564286,...,16.96,14.184286,2.185714,25.0,5.714286,29.4,21.7,30.7,1,0
934,sj,2008,16,0.0018,0.067469,0.2039,0.209843,0.0,297.63,297.778571,...,0.0,13.858571,2.785714,25.314286,6.242857,29.4,21.7,11.2,3,0
935,sj,2008,17,-0.037,-0.010367,0.077314,0.090586,0.0,298.672857,298.692857,...,0.0,15.671429,3.957143,27.042857,7.514286,31.7,23.3,0.3,5,0
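
The `city_bin` column shown above is 0 for every row. The sketch below, on a hypothetical toy series, illustrates why: `LabelBinarizer` only produces distinct codes when it sees more than one class, and each per-city DataFrame contains a single city.

```python
import pandas as pd
from sklearn.preprocessing import LabelBinarizer

# Toy city column containing both cities (hypothetical rows)
cities = pd.Series(["sj", "sj", "iq", "iq"])

# Fitted on both cities: classes are sorted, so iq -> 0 and sj -> 1
both = LabelBinarizer().fit_transform(cities).ravel()

# Fitted on a single city: there is nothing to distinguish, so every row is 0
only_sj = LabelBinarizer().fit_transform(cities[cities == "sj"]).ravel()

print(both)
print(only_sj)
```

Since the models here are trained per city, a constant column carries no information, which is consistent with `city_bin` not being used as a feature.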


We need to select features. Using all of them is possible but usually not a good idea, so here we keep a small selected subset.

In [None]:
selected_features = ['reanalysis_specific_humidity_g_per_kg', 'reanalysis_dew_point_temp_k', 
                 'station_avg_temp_c', 'station_min_temp_c']
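
One simple way to shortlist features is to rank them by absolute correlation with the target. The sketch below does this on synthetic data; the column names `humidity` and `noise_feature` and all values are hypothetical. It is not necessarily how the list above was chosen, and correlation only captures linear relationships.

```python
import numpy as np
import pandas as pd

# Synthetic data: one informative feature and one pure-noise feature
rng = np.random.default_rng(0)
n = 200
humidity = rng.normal(16.0, 1.5, n)
noise_feature = rng.normal(0.0, 1.0, n)
demo = pd.DataFrame({
    "humidity": humidity,
    "noise_feature": noise_feature,
    "total_cases": 3.0 * humidity + rng.normal(0.0, 1.0, n),
})

# Correlation of every feature with the target, strongest first
corr = demo.corr()["total_cases"].drop("total_cases")
ranking = corr.abs().sort_values(ascending=False)
print(ranking)
```

On the real data the same two lines could be applied to `train1` and `train2` separately, since the two cities may favour different features.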

We scale the data using the minimum and maximum of each feature in the training set.

**(This step is optional and not always effective.)**

In [None]:
from sklearn.preprocessing import MinMaxScaler
# perform min-max scaling of each continuous feature column to the range [0, 1]
scaler1 = MinMaxScaler()
scaler2 = MinMaxScaler()
X_train1 = scaler1.fit_transform(train1[selected_features])
X_test1 = scaler1.transform(test1[selected_features])
y_train1 = train1['total_cases']

X_train2 = scaler2.fit_transform(train2[selected_features])
X_test2 = scaler2.transform(test2[selected_features])
y_train2 = train2['total_cases']
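
Min-max scaling maps each value x to (x - min) / (max - min), where min and max come from the training column. The sketch below checks this by hand on hypothetical toy values. It also shows that test values outside the training range fall outside [0, 1].

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Toy single-feature columns (hypothetical temperatures)
train_col = np.array([[20.0], [22.0], [24.0]])
test_col = np.array([[21.0], [26.0]])

# Fit on train only, then apply the same transformation to test
scaler = MinMaxScaler().fit(train_col)
scaled_test = scaler.transform(test_col)

# The same computation written out explicitly
manual = (test_col - train_col.min()) / (train_col.max() - train_col.min())

print(scaled_test.ravel())
print(manual.ravel())
```

The second test value is larger than the training maximum, so its scaled value exceeds 1. This is expected when the scaler is fitted on the training data alone.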


# Model Construction and Preliminary Evaluation

In [None]:
from sklearn.model_selection import KFold
from sklearn.metrics import mean_absolute_error
from sklearn.neighbors import KNeighborsRegressor

# Instantiate the estimator for sj
KNN_reg1 = KNeighborsRegressor(n_neighbors=7)
# Train the model
regressor1 = KNN_reg1.fit(X_train1, y_train1)

# Instantiate the estimator for iq
KNN_reg2 = KNeighborsRegressor(n_neighbors=7)
# Train the model
regressor2 = KNN_reg2.fit(X_train2, y_train2)
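
The cell above fits the models but does not score them. The sketch below shows one way to get a preliminary estimate with k-fold cross-validation and mean absolute error (MAE), the metric used by the DengAI leaderboard. It runs on synthetic data; `X_demo`, `y_demo`, and the candidate values of k are assumptions chosen for illustration.

```python
import numpy as np
from sklearn.model_selection import KFold, cross_val_score
from sklearn.neighbors import KNeighborsRegressor

# Synthetic stand-in for one city's scaled features and case counts
rng = np.random.default_rng(42)
X_demo = rng.uniform(0.0, 1.0, size=(200, 4))
y_demo = np.rint(20 * X_demo[:, 0] + rng.normal(0.0, 1.0, 200)).clip(min=0)

# shuffle=False keeps each fold as a contiguous block of weeks
cv = KFold(n_splits=5, shuffle=False)

# Cross-validated MAE for a few candidate neighbourhood sizes
mae_by_k = {}
for k in (3, 7, 15):
    scores = cross_val_score(KNeighborsRegressor(n_neighbors=k), X_demo, y_demo,
                             cv=cv, scoring="neg_mean_absolute_error")
    mae_by_k[k] = -scores.mean()  # scikit-learn reports the negated error

for k, mae in mae_by_k.items():
    print(k, round(mae, 3))
```

On the real data, `X_train1, y_train1` (or `X_train2, y_train2`) would replace the synthetic arrays. For time-ordered data, `TimeSeriesSplit` from `sklearn.model_selection` is another splitter worth considering.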

# Generating the Final Result

We generate the output file according to the submission format.

* We run the prediction on the test sets

In [None]:
y_pred_knn1 = regressor1.predict(X_test1)
y_pred_knn2 = regressor2.predict(X_test2)

In [None]:
y_pred_knn1
y_pred_knn2

array([ 4.57142857,  3.        ,  4.42857143,  2.28571429,  1.85714286,
        9.42857143,  2.85714286,  2.14285714,  5.85714286,  9.14285714,
        2.42857143, 11.71428571,  5.71428571,  8.14285714,  5.14285714,
        7.        ,  6.85714286,  7.71428571, 13.57142857,  8.14285714,
        8.71428571,  9.57142857,  6.        , 10.71428571,  6.85714286,
        9.28571429,  4.42857143,  5.85714286,  8.85714286,  5.42857143,
        5.42857143,  5.85714286,  5.71428571, 25.14285714,  4.14285714,
        9.85714286,  4.85714286,  6.71428571,  7.85714286, 12.71428571,
        6.71428571, 16.28571429, 11.        ,  4.71428571, 11.85714286,
        2.85714286,  7.        ,  5.14285714,  2.71428571,  1.71428571,
        1.28571429,  1.42857143,  1.71428571,  2.28571429,  4.85714286,
        2.71428571,  2.57142857,  2.71428571,  4.57142857,  6.57142857,
        5.57142857,  3.        ,  2.71428571,  8.14285714,  4.57142857,
        7.57142857,  3.42857143, 13.28571429,  7.85714286,  8.  

* We round the results for sj and iq

In [None]:
# round the result and cast to int
import numpy as np
yknn1 = np.rint(y_pred_knn1) # round
yknn1 = yknn1.astype(int) # cast to int
resknn1 = np.hstack(yknn1)
resknn1

yknn2 = np.rint(y_pred_knn2) # round
yknn2 = yknn2.astype(int) # cast to int
resknn2 = np.hstack(yknn2)
resknn2

array([ 5,  3,  4,  2,  2,  9,  3,  2,  6,  9,  2, 12,  6,  8,  5,  7,  7,
        8, 14,  8,  9, 10,  6, 11,  7,  9,  4,  6,  9,  5,  5,  6,  6, 25,
        4, 10,  5,  7,  8, 13,  7, 16, 11,  5, 12,  3,  7,  5,  3,  2,  1,
        1,  2,  2,  5,  3,  3,  3,  5,  7,  6,  3,  3,  8,  5,  8,  3, 13,
        8,  8,  9,  4, 30, 10,  4,  4, 11, 24,  7,  6, 14,  4,  5, 10, 12,
        5,  6,  5,  4,  6,  2, 11,  8, 13, 15,  9, 12,  9,  7, 12,  5,  3,
        4,  2,  9,  1,  1,  3,  2,  1,  2,  3,  3,  2,  2,  5,  2, 10,  1,
       17,  8, 12,  7,  9, 14, 11,  6,  6,  6,  8, 17,  3, 13,  6,  7,  9,
       24, 18, 12, 25,  7,  4,  7,  4, 12,  7,  7,  2,  8,  7,  4, 10,  4,
        8,  3,  5])

In [None]:
y_pred_no_negknn1 = resknn1.copy()
y_pred_no_negknn2 = resknn2.copy()

In [None]:
y_pred_no_negknn1[y_pred_knn1 < 0] = 0
y_pred_no_negknn1

y_pred_no_negknn2[y_pred_knn2 < 0] = 0
y_pred_no_negknn2

array([ 5,  3,  4,  2,  2,  9,  3,  2,  6,  9,  2, 12,  6,  8,  5,  7,  7,
        8, 14,  8,  9, 10,  6, 11,  7,  9,  4,  6,  9,  5,  5,  6,  6, 25,
        4, 10,  5,  7,  8, 13,  7, 16, 11,  5, 12,  3,  7,  5,  3,  2,  1,
        1,  2,  2,  5,  3,  3,  3,  5,  7,  6,  3,  3,  8,  5,  8,  3, 13,
        8,  8,  9,  4, 30, 10,  4,  4, 11, 24,  7,  6, 14,  4,  5, 10, 12,
        5,  6,  5,  4,  6,  2, 11,  8, 13, 15,  9, 12,  9,  7, 12,  5,  3,
        4,  2,  9,  1,  1,  3,  2,  1,  2,  3,  3,  2,  2,  5,  2, 10,  1,
       17,  8, 12,  7,  9, 14, 11,  6,  6,  6,  8, 17,  3, 13,  6,  7,  9,
       24, 18, 12, 25,  7,  4,  7,  4, 12,  7,  7,  2,  8,  7,  4, 10,  4,
        8,  3,  5])
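
The round, cast, and clip steps above can also be written as a single expression. The sketch below uses hypothetical prediction values, including a negative one, to show the effect. Note that `np.rint` rounds halves to the nearest even integer.

```python
import numpy as np

# Hypothetical raw predictions, including a negative value and two halves
raw_pred = np.array([4.57, -0.4, 2.5, 3.5, 11.71])

# Round to the nearest integer, floor at zero, then cast to int
clean_pred = np.clip(np.rint(raw_pred), 0, None).astype(int)

print(clean_pred)
```

A k-nearest-neighbors regressor averages observed case counts, so it should not produce negative predictions here. The clip matters mainly if the model is later swapped for one that can.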

* We round the result for iq

In [None]:
# round the result and cast to int
import numpy as np
y2 = np.rint(y_pred_knn2) # round the iq predictions
y2 = y2.astype(int) # cast to int
res2 = np.hstack(y2)
res2

array([ 5,  3,  4,  2,  2,  9,  3,  2,  6,  9,  2, 12,  6,  8,  5,  7,  7,
        8, 14,  8,  9, 10,  6, 11,  7,  9,  4,  6,  9,  5,  5,  6,  6, 25,
        4, 10,  5,  7,  8, 13,  7, 16, 11,  5, 12,  3,  7,  5,  3,  2,  1,
        1,  2,  2,  5,  3,  3,  3,  5,  7,  6,  3,  3,  8,  5,  8,  3, 13,
        8,  8,  9,  4, 30, 10,  4,  4, 11, 24,  7,  6, 14,  4,  5, 10, 12,
        5,  6,  5,  4,  6,  2, 11,  8, 13, 15,  9, 12,  9,  7, 12,  5,  3,
        4,  2,  9,  1,  1,  3,  2,  1,  2,  3,  3,  2,  2,  5,  2, 10,  1,
       17,  8, 12,  7,  9, 14, 11,  6,  6,  6,  8, 17,  3, 13,  6,  7,  9,
       24, 18, 12, 25,  7,  4,  7,  4, 12,  7,  7,  2,  8,  7,  4, 10,  4,
        8,  3,  5])

In [None]:
y_pred_no_neg2 = res2.copy()

In [None]:
y_pred_no_neg2[y_pred_knn2 < 0] = 0
y_pred_no_neg2

array([ 5,  3,  4,  2,  2,  9,  3,  2,  6,  9,  2, 12,  6,  8,  5,  7,  7,
        8, 14,  8,  9, 10,  6, 11,  7,  9,  4,  6,  9,  5,  5,  6,  6, 25,
        4, 10,  5,  7,  8, 13,  7, 16, 11,  5, 12,  3,  7,  5,  3,  2,  1,
        1,  2,  2,  5,  3,  3,  3,  5,  7,  6,  3,  3,  8,  5,  8,  3, 13,
        8,  8,  9,  4, 30, 10,  4,  4, 11, 24,  7,  6, 14,  4,  5, 10, 12,
        5,  6,  5,  4,  6,  2, 11,  8, 13, 15,  9, 12,  9,  7, 12,  5,  3,
        4,  2,  9,  1,  1,  3,  2,  1,  2,  3,  3,  2,  2,  5,  2, 10,  1,
       17,  8, 12,  7,  9, 14, 11,  6,  6,  6,  8, 17,  3, 13,  6,  7,  9,
       24, 18, 12, 25,  7,  4,  7,  4, 12,  7,  7,  2,  8,  7,  4, 10,  4,
        8,  3,  5])

* We generate the output file

In [None]:
# generate output
outputknn1 = pd.DataFrame({ 'city': test1['city'], 'year': test1['year'], 'weekofyear': test1['weekofyear'], 
                       'total_cases': y_pred_no_negknn1})

outputknn2 = pd.DataFrame({ 'city': test2['city'], 'year': test2['year'], 'weekofyear': test2['weekofyear'], 
                       'total_cases': y_pred_no_negknn2})

outputknn= pd.concat([outputknn1,outputknn2], ignore_index=True)
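
Before downloading, it can help to check that the DataFrame has the shape the submission expects. The sketch below is a small validator based on the columns built in the cell above. The function name `check_submission` and the toy frames are hypothetical.

```python
import pandas as pd

# Column order used when building the output DataFrame
EXPECTED_COLUMNS = ["city", "year", "weekofyear", "total_cases"]


def check_submission(df, expected_rows):
    """Return a list of problems found in a submission DataFrame (empty if none)."""
    problems = []
    if list(df.columns) != EXPECTED_COLUMNS:
        problems.append("unexpected columns: {}".format(list(df.columns)))
    if len(df) != expected_rows:
        problems.append("expected {} rows, got {}".format(expected_rows, len(df)))
    if "total_cases" in df.columns:
        if not pd.api.types.is_integer_dtype(df["total_cases"]):
            problems.append("total_cases is not an integer column")
        elif (df["total_cases"] < 0).any():
            problems.append("total_cases contains negative values")
    return problems


# Toy submissions (hypothetical values)
good = pd.DataFrame({"city": ["sj", "sj", "iq"], "year": [2008, 2008, 2010],
                     "weekofyear": [18, 19, 26], "total_cases": [5, 3, 0]})
bad = good.assign(total_cases=[5.5, -1.0, 0.0])

print(check_submission(good, expected_rows=3))
print(check_submission(bad, expected_rows=3))
```

In this notebook the call would be `check_submission(outputknn, len(test1) + len(test2))`, which corresponds to 260 + 156 = 416 rows according to the shapes printed earlier.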




In [None]:
from google.colab import files

with open('resultadoknn.csv', 'w') as f:
  outputknn.to_csv(f,  index = False)
  

files.download('resultadoknn.csv')

outputknn.head()

Unnamed: 0,city,year,weekofyear,total_cases
0,sj,2008,18,19
1,sj,2008,19,16
2,sj,2008,20,17
3,sj,2008,21,29
4,sj,2008,22,7
