## INTER-BEAT INTERVAL (IBI) DATA ANALYSIS

This notebook analyzes the inter-beat interval (IBI) data from the smartwatch, which processes data at 1.25 Hz, i.e. 1 record every 0.80 seconds.
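The arithmetic behind the stated rate can be checked with a short sketch. It takes the nominal 1.25 Hz at face value and derives how many records a full 5-minute or 1-hour window would hold at that rate; the variable names are illustrative and not part of the notebook.

```python
# Nominal rate stated in the text (assumed constant for this sketch)
rate_hz = 1.25

# Seconds between consecutive records at that rate
period_s = 1.0 / rate_hz

# Records expected in a full window if the rate held throughout
per_5min = rate_hz * 5 * 60
per_hour = rate_hz * 60 * 60

print(period_s, per_5min, per_hour)
```

Real recordings may hold fewer records per window than these nominal figures, for example when the device drops samples.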

In [1]:
# Import pandas and other libraries
import pandas as pd
import numpy as np

In [2]:
# Read the CSV
ibi_values = pd.read_csv('IBI_016.csv', engine='python', na_values="not available")

In [4]:
ibi_values.head()

,datetime,ibi
0,2020-07-16 09:30:51.629972,0.734409
1,2020-07-16 09:30:52.411258,0.781286
2,2020-07-16 09:30:53.176918,0.76566
3,2020-07-16 09:30:53.880075,0.703157
4,2020-07-16 09:30:54.676987,0.796911


In [5]:
ibi_values.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 317138 entries, 0 to 317137
Data columns (total 2 columns):
 #   Column    Non-Null Count   Dtype  
---  ------    --------------   -----  
 0   datetime  317138 non-null  object 
 1    ibi      317138 non-null  float64
dtypes: float64(1), object(1)
memory usage: 4.8+ MB


In [6]:
ibi_values.count()

datetime    317138
 ibi        317138
dtype: int64

In [7]:
ibi_values["datetime"].head()

0    2020-07-16 09:30:51.629972
1    2020-07-16 09:30:52.411258
2    2020-07-16 09:30:53.176918
3    2020-07-16 09:30:53.880075
4    2020-07-16 09:30:54.676987
Name: datetime, dtype: object

### Trabajando con Datetime
First we convert the datetime column to the correct type, since pandas currently detects it as object. Next we set the dates as the index, and finally we group the data into 5-minute bins to obtain the mean and median.
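As an alternative sketch, the conversion and the indexing can also be done while reading the file. The example below inlines the five rows shown by `ibi_values.head()` as a stand-in for the CSV; the header without a leading space and the names `csv_text`, `sample`, and `five_min_mean` are assumptions made for illustration.

```python
import io

import pandas as pd

# First five rows from ibi_values.head(), inlined as a stand-in for the file.
# The header is written without the leading space (an assumption).
csv_text = """datetime,ibi
2020-07-16 09:30:51.629972,0.734409
2020-07-16 09:30:52.411258,0.781286
2020-07-16 09:30:53.176918,0.76566
2020-07-16 09:30:53.880075,0.703157
2020-07-16 09:30:54.676987,0.796911
"""

# Parse the timestamps and use them as the index in a single step
sample = pd.read_csv(
    io.StringIO(csv_text),
    parse_dates=["datetime"],
    index_col="datetime",
)

# Group into 5-minute bins and take the mean of each bin
five_min_mean = sample["ibi"].resample("5min").mean()

print(sample.index.dtype)
print(five_min_mean)
```

All five rows fall inside the 09:30 bin, so the result holds a single value.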


In [8]:
# Convert the datetime strings to datetime values
ibi_values['datetime'] = pd.to_datetime(ibi_values['datetime'])
print(ibi_values.columns)

Index(['datetime', ' ibi'], dtype='object')
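The printed index shows that the second column is named `' ibi'`, with a leading space, which is why the following cells select `ibi_values[' ibi']`. As an optional sketch (the notebook itself keeps the original name), the space can be removed either after loading or at parse time. The inline CSV below is a stand-in, not the real file.

```python
import io

import pandas as pd

# Stand-in CSV whose second header carries a leading space, as in ' ibi'
csv_text = (
    "datetime, ibi\n"
    "2020-07-16 09:30:51.629972,0.734409\n"
    "2020-07-16 09:30:52.411258,0.781286\n"
)

raw = pd.read_csv(io.StringIO(csv_text))
print(list(raw.columns))  # the second name keeps its leading space

# Option 1: strip whitespace from the column labels after loading
stripped = raw.rename(columns=str.strip)

# Option 2: let the parser skip spaces that follow a delimiter
clean = pd.read_csv(io.StringIO(csv_text), skipinitialspace=True)

print(list(stripped.columns), list(clean.columns))
```

With either option the column would be selected as `'ibi'`, so later cells would need the same change.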


In [9]:


# Set the datetime column as the index
ibi_values = ibi_values.set_index('datetime')
print(ibi_values.columns)


Index([' ibi'], dtype='object')


In [10]:
df_procesado_5min = ibi_values[' ibi'].resample('5min') 

### Computing the mean, the median and other statistics

Here we need the mean, median, max, min, standard deviation and quartiles.
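One possible way to obtain all of these statistics in a single table is to pass small named functions for the quartiles to `agg` together with the built-in names. This is a sketch on synthetic data, not the notebook's method; `q1`, `q3`, `demo`, and `summary` are hypothetical names.

```python
import numpy as np
import pandas as pd


def q1(x):
    """First quartile (25th percentile) of one resampling bin."""
    return x.quantile(0.25)


def q3(x):
    """Third quartile (75th percentile) of one resampling bin."""
    return x.quantile(0.75)


# Synthetic series: one value per minute for ten minutes (two 5-minute bins)
idx = pd.date_range("2020-07-16 09:30:00", periods=10, freq="1min")
demo = pd.Series(np.arange(1.0, 11.0), index=idx)

# Built-in statistics by name, quartiles through the helper functions;
# the function names become the column labels
summary = demo.resample("5min").agg(
    ["mean", "median", "max", "min", "std", q1, q3]
)

print(summary)
```

The callables run once per bin in Python, so on a large dataset they are likely slower than the built-in reductions.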

In [11]:
# Function to compute quartiles 1 and 3, as described in the paper
def quartiles(x):
    return pd.Series([x.quantile(0.25), x.quantile(0.75)], index=['q1', 'q3'])


In [12]:
# Apply the quartile helper to the 5-minute resampler
series5min = quartiles(df_procesado_5min)
series5min.head()

q1    datetime
2020-07-16 09:30:00    0.750034
2020-...
q3    datetime
2020-07-16 09:30:00    0.796911
2020-...
dtype: object

In [13]:
# Define the statistics to compute for each 5-minute bin
df_5min = df_procesado_5min.agg(['mean', 'median', 'max', 'min', 'std'])
print(df_5min.columns)
# Remove the columns we do not need for now
# Assume the data is in a DataFrame called 'df'
# columns_to_remove = [' temp']
# df_5min = df_5min.drop(columns=columns_to_remove)
df_5min.head(20)

Index(['mean', 'median', 'max', 'min', 'std'], dtype='object')


datetime,mean,median,max,min,std
2020-07-16 09:30:00,0.776749,0.781286,0.906291,0.703157,0.032791
2020-07-16 09:35:00,0.785514,0.781286,0.953169,0.65628,0.050713
2020-07-16 09:40:00,0.773977,0.781286,1.000046,0.578151,0.073439
2020-07-16 09:45:00,0.798779,0.796911,0.968794,0.640654,0.048668
2020-07-16 09:50:00,0.793604,0.796911,0.98442,0.531274,0.066225
2020-07-16 09:55:00,0.812537,0.812537,1.046923,0.65628,0.046587
2020-07-16 10:00:00,0.623808,0.578151,0.906291,0.43752,0.132627
2020-07-16 10:05:00,0.547551,0.539087,0.687531,0.43752,0.062527
2020-07-16 10:10:00,0.592858,0.617216,0.734409,0.453146,0.075865
2020-07-16 10:15:00,0.694702,0.687531,0.890666,0.578151,0.04814


In [14]:
# Apply the same procedure with 1-hour bins
df_procesado_1hora = ibi_values[' ibi'].resample('1h')
# Compute the summary statistics
df_1hora = df_procesado_1hora.agg(['mean', 'median', 'max', 'min', 'std'])

# Remove the columns we do not need for now
# df_1hora = df_1hora.drop(columns=columns_to_remove)
df_1hora.head(20)

datetime,mean,median,max,min,std
2020-07-16 09:00:00,0.794229,0.796911,1.046923,0.531274,0.054648
2020-07-16 10:00:00,0.715725,0.718783,1.0938,0.43752,0.077991
2020-07-16 11:00:00,0.711695,0.718783,1.20318,0.468771,0.083445
2020-07-16 12:00:00,0.767768,0.76566,1.046923,0.500023,0.069575
2020-07-16 13:00:00,0.71466,0.718783,1.20318,0.500023,0.075159
2020-07-16 14:00:00,0.72475,0.734409,1.31256,0.468771,0.102345
2020-07-16 15:00:00,0.623986,0.625029,0.906291,0.406269,0.070118
2020-07-16 16:00:00,0.598112,0.593777,1.062549,0.43752,0.076431
2020-07-16 17:00:00,0.649608,0.65628,0.968794,0.390643,0.066375
2020-07-16 18:00:00,0.709847,0.718783,1.000046,0.43752,0.069386


In [15]:
# Compute the first and third quartiles and store them in separate columns
df_5min_quantil1 = df_procesado_5min.quantile(0.25)
df_5min_quantil3 = df_procesado_5min.quantile(0.75)
df_1hora_quantil1 = df_procesado_1hora.quantile(0.25)
df_1hora_quantil3 = df_procesado_1hora.quantile(0.75)
df_5min['q1'] = df_5min_quantil1
df_5min['q3'] = df_5min_quantil3
df_5min.head(10)
# df_1hora[['q1', 'q3']] = [df_1hora_quantil1,df_1hora_quantil3]


datetime,mean,median,max,min,std,q1,q3
2020-07-16 09:30:00,0.776749,0.781286,0.906291,0.703157,0.032791,0.750034,0.796911
2020-07-16 09:35:00,0.785514,0.781286,0.953169,0.65628,0.050713,0.750034,0.812537
2020-07-16 09:40:00,0.773977,0.781286,1.000046,0.578151,0.073439,0.734409,0.796911
2020-07-16 09:45:00,0.798779,0.796911,0.968794,0.640654,0.048668,0.76566,0.828163
2020-07-16 09:50:00,0.793604,0.796911,0.98442,0.531274,0.066225,0.76566,0.828163
2020-07-16 09:55:00,0.812537,0.812537,1.046923,0.65628,0.046587,0.781286,0.843789
2020-07-16 10:00:00,0.623808,0.578151,0.906291,0.43752,0.132627,0.531274,0.734409
2020-07-16 10:05:00,0.547551,0.539087,0.687531,0.43752,0.062527,0.500023,0.582058
2020-07-16 10:10:00,0.592858,0.617216,0.734409,0.453146,0.075865,0.519555,0.652373
2020-07-16 10:15:00,0.694702,0.687531,0.890666,0.578151,0.04814,0.65628,0.718783


In [16]:
df_5min.count()

mean      1824
median    1824
max       1824
min       1824
std       1817
q1        1824
q3        1824
dtype: int64
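The `std` column has 1817 non-null values while the other columns have 1824. A plausible explanation, which is an assumption and has not been checked against the file, is that seven bins contain a single reading: pandas computes the sample standard deviation (`ddof=1`), which is undefined for one observation. The sketch below reproduces the effect with hypothetical values.

```python
import pandas as pd

# Hypothetical timestamps: three readings in the first 5-minute bin,
# a single reading in the second
idx = pd.to_datetime([
    "2020-07-16 09:30:10",
    "2020-07-16 09:31:10",
    "2020-07-16 09:32:10",
    "2020-07-16 09:36:00",
])
demo = pd.Series([0.70, 0.80, 0.90, 0.75], index=idx)

stats = demo.resample("5min").agg(["mean", "std"])

print(stats)
print(stats.count())  # std has fewer non-null entries than mean
```

Counting the readings per bin on the real data, for example with `df_procesado_5min.count()`, would confirm or rule out this explanation.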

In [17]:
# Same for the 1-hour dataset
df_1hora['q1'] = df_1hora_quantil1
df_1hora['q3'] = df_1hora_quantil3
df_1hora.head(10)

datetime,mean,median,max,min,std,q1,q3
2020-07-16 09:00:00,0.794229,0.796911,1.046923,0.531274,0.054648,0.76566,0.828163
2020-07-16 10:00:00,0.715725,0.718783,1.0938,0.43752,0.077991,0.687531,0.76566
2020-07-16 11:00:00,0.711695,0.718783,1.20318,0.468771,0.083445,0.65628,0.76566
2020-07-16 12:00:00,0.767768,0.76566,1.046923,0.500023,0.069575,0.734409,0.812537
2020-07-16 13:00:00,0.71466,0.718783,1.20318,0.500023,0.075159,0.671906,0.76566
2020-07-16 14:00:00,0.72475,0.734409,1.31256,0.468771,0.102345,0.65628,0.781286
2020-07-16 15:00:00,0.623986,0.625029,0.906291,0.406269,0.070118,0.578151,0.671906
2020-07-16 16:00:00,0.598112,0.593777,1.062549,0.43752,0.076431,0.531274,0.65628
2020-07-16 17:00:00,0.649608,0.65628,0.968794,0.390643,0.066375,0.625029,0.687531
2020-07-16 18:00:00,0.709847,0.718783,1.000046,0.43752,0.069386,0.671906,0.750034


In [18]:
df_1hora.count()

mean      160
median    160
max       160
min       160
std       160
q1        160
q3        160
dtype: int64

In [19]:
# Export the results to CSV
df_5min.to_csv("IBI_5min.csv")
df_1hora.to_csv("IBI_1hora.csv")
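When the exported files are loaded again, the datetime index comes back as plain text unless it is parsed explicitly. The following sketch shows a round trip through an in-memory buffer that stands in for a file such as `IBI_5min.csv`; the small frame and its values are hypothetical.

```python
import io

import pandas as pd

# Small hypothetical frame with a named datetime index, like df_5min
idx = pd.date_range("2020-07-16 09:30:00", periods=3, freq="5min", name="datetime")
demo = pd.DataFrame(
    {"mean": [0.5, 0.25, 0.75], "q1": [0.125, 0.25, 0.5]},
    index=idx,
)

# In-memory stand-in for the CSV file on disk
buffer = io.StringIO()
demo.to_csv(buffer)
buffer.seek(0)

# Restore the index and parse it back into datetime values
restored = pd.read_csv(buffer, index_col="datetime", parse_dates=True)

print(restored.index.dtype)
print(restored)
```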

### CSV FILES GENERATED SUCCESSFULLY FOR 5 MIN AND 1 HOUR