## INTER-BEAT INTERVAL (IBI) DATA ANALYSIS

This notebook analyzes the inter-beat interval (IBI) data recorded by the smartwatch. The data are described as being processed at 1.25 Hz, which corresponds to 1 record every 0.80 seconds.
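As a quick check of the quoted rate, the period is the reciprocal of the frequency, and the actual spacing can be estimated from consecutive timestamps. The sketch below uses the five timestamps displayed in the first rows of the file; the variable names are illustrative only.

```python
import pandas as pd

# Period implied by a 1.25 Hz rate: T = 1 / f
freq_hz = 1.25
period_s = 1.0 / freq_hz  # 0.8 seconds per record

# Empirical spacing between the first timestamps shown in this notebook
stamps = pd.to_datetime([
    "2020-07-16 09:30:51.629972",
    "2020-07-16 09:30:52.411258",
    "2020-07-16 09:30:53.176918",
    "2020-07-16 09:30:53.880075",
    "2020-07-16 09:30:54.676987",
])
gaps_s = pd.Series(stamps).diff().dt.total_seconds().dropna()
mean_gap_s = gaps_s.mean()

print(f"period at {freq_hz} Hz: {period_s:.2f} s")
print(f"mean observed gap: {mean_gap_s:.3f} s")
```

For these rows, the gap between consecutive timestamps matches the `ibi` value of the later row, which suggests that the record spacing follows the heartbeat rather than a fixed clock.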

In [45]:
# Import pandas and other libraries
import pandas as pd
import numpy as np

In [46]:
# Read the CSV
ibi_values = pd.read_csv('IBI_016.csv', engine='python', na_values="not available")

In [47]:
ibi_values.head()

,datetime,ibi
0,2020-07-16 09:30:51.629972,0.734409
1,2020-07-16 09:30:52.411258,0.781286
2,2020-07-16 09:30:53.176918,0.76566
3,2020-07-16 09:30:53.880075,0.703157
4,2020-07-16 09:30:54.676987,0.796911


In [48]:
ibi_values.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 317138 entries, 0 to 317137
Data columns (total 2 columns):
 #   Column    Non-Null Count   Dtype  
---  ------    --------------   -----  
 0   datetime  317138 non-null  object 
 1    ibi      317138 non-null  float64
dtypes: float64(1), object(1)
memory usage: 4.8+ MB


In [49]:
ibi_values.count()

datetime    317138
 ibi        317138
dtype: int64

In [50]:
ibi_values["datetime"].head()

0    2020-07-16 09:30:51.629972
1    2020-07-16 09:30:52.411258
2    2020-07-16 09:30:53.176918
3    2020-07-16 09:30:53.880075
4    2020-07-16 09:30:54.676987
Name: datetime, dtype: object

### Working with Datetime
First we convert the datetime column to the correct format, since it is being detected as object. Next we set the dates as the index, and finally we group the data into 5-minute windows to obtain the mean and median.


In [51]:
# Convert the datetime strings to datetime objects
ibi_values['datetime'] = pd.to_datetime(ibi_values['datetime'])
print(ibi_values.columns)

Index(['datetime', ' ibi'], dtype='object')
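
The printed index shows that the IBI column is named `' ibi'`, with a leading space carried over from the CSV header. The rest of this notebook keeps that name. As an alternative, a minimal sketch of normalizing the column names is shown below on a small synthetic frame; `demo` is a hypothetical stand-in for `ibi_values`.

```python
import pandas as pd

# Synthetic stand-in with the same header quirk (leading space in ' ibi')
demo = pd.DataFrame({
    "datetime": ["2020-07-16 09:30:51.629972", "2020-07-16 09:30:52.411258"],
    " ibi": [0.734409, 0.781286],
})

# Strip surrounding whitespace from every column name
demo.columns = demo.columns.str.strip()

print(list(demo.columns))
```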


In [52]:


# Set datetime as the index
ibi_values = ibi_values.set_index('datetime')
print(ibi_values.columns)


Index([' ibi'], dtype='object')


In [53]:
df_procesado_5min = ibi_values.resample('5min') 

### Computing the mean, median, and other statistics

In this case we need the mean, median, maximum, minimum, standard deviation, and quartiles of each window.
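
These statistics can also be gathered in a single `agg` call by passing small helper functions for the quartiles. The sketch below is one possible way to do this, run on synthetic data; `q1`, `q3`, and `demo` are illustrative names that are not part of this notebook.

```python
import numpy as np
import pandas as pd

# Synthetic IBI-like series: one value per second for 10 minutes
start = pd.Timestamp("2020-07-16 09:30:00")
idx = start + pd.to_timedelta(np.arange(600), unit="s")
rng = np.random.default_rng(0)
demo = pd.Series(rng.normal(0.78, 0.05, size=len(idx)), index=idx, name="ibi")

def q1(x):
    """First quartile (25th percentile) of a window."""
    return x.quantile(0.25)

def q3(x):
    """Third quartile (75th percentile) of a window."""
    return x.quantile(0.75)

# One row per 5-minute window, one column per statistic
stats = demo.resample("5min").agg(["mean", "median", "max", "min", "std", q1, q3])
print(stats)
```

The helper functions are named after the quartiles because `agg` uses each function's name as the resulting column label.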

In [54]:
# Function to compute quartiles 1 and 3, as indicated in the paper
def quartiles(x):
    return pd.Series([x.quantile(0.25), x.quantile(0.75)], index=['q1', 'q3'])


In [55]:
# Build a Series that holds the q1 and q3 results for the 5-minute windows
series5min = quartiles(df_procesado_5min)
series5min.head()

q1                              ibi
datetime        ...
q3                              ibi
datetime        ...
dtype: object

In [56]:
# Define the aggregations to compute on the resampled data
df_5min = df_procesado_5min.agg(['mean', 'median', 'max', 'min', 'std'])
print(df_5min.columns)
df_5min.head(20)

MultiIndex([(' ibi',   'mean'),
            (' ibi', 'median'),
            (' ibi',    'max'),
            (' ibi',    'min'),
            (' ibi',    'std')],
           )


,ibi,ibi,ibi,ibi,ibi
datetime,mean,median,max,min,std
2020-07-16 09:30:00,0.776749,0.781286,0.906291,0.703157,0.032791
2020-07-16 09:35:00,0.785514,0.781286,0.953169,0.65628,0.050713
2020-07-16 09:40:00,0.773977,0.781286,1.000046,0.578151,0.073439
2020-07-16 09:45:00,0.798779,0.796911,0.968794,0.640654,0.048668
2020-07-16 09:50:00,0.793604,0.796911,0.98442,0.531274,0.066225
2020-07-16 09:55:00,0.812537,0.812537,1.046923,0.65628,0.046587
2020-07-16 10:00:00,0.623808,0.578151,0.906291,0.43752,0.132627
2020-07-16 10:05:00,0.547551,0.539087,0.687531,0.43752,0.062527
2020-07-16 10:10:00,0.592858,0.617216,0.734409,0.453146,0.075865
2020-07-16 10:15:00,0.694702,0.687531,0.890666,0.578151,0.04814


In [57]:
# Apply the same steps to 1-hour windows
df_procesado_1hora = ibi_values[' ibi'].resample('1h') 
# Compute the summary statistics
df_1hora = df_procesado_1hora.agg(['mean', 'median', 'max', 'min', 'std'])

# Remove the columns we do not need for now
# df_1hora = df_1hora.drop(columns=columns_to_remove)
df_1hora.head(20)

datetime,mean,median,max,min,std
2020-07-16 09:00:00,0.794229,0.796911,1.046923,0.531274,0.054648
2020-07-16 10:00:00,0.715725,0.718783,1.0938,0.43752,0.077991
2020-07-16 11:00:00,0.711695,0.718783,1.20318,0.468771,0.083445
2020-07-16 12:00:00,0.767768,0.76566,1.046923,0.500023,0.069575
2020-07-16 13:00:00,0.71466,0.718783,1.20318,0.500023,0.075159
2020-07-16 14:00:00,0.72475,0.734409,1.31256,0.468771,0.102345
2020-07-16 15:00:00,0.623986,0.625029,0.906291,0.406269,0.070118
2020-07-16 16:00:00,0.598112,0.593777,1.062549,0.43752,0.076431
2020-07-16 17:00:00,0.649608,0.65628,0.968794,0.390643,0.066375
2020-07-16 18:00:00,0.709847,0.718783,1.000046,0.43752,0.069386


In [58]:
# Compute the quartiles and store them in separate columns
df_5min_quantil1 = df_procesado_5min.quantile(0.25)
df_5min_quantil3 = df_procesado_5min.quantile(0.75)
df_1hora_quantil1 = df_procesado_1hora.quantile(0.25)
df_1hora_quantil3 = df_procesado_1hora.quantile(0.75)
df_5min['q1'] = df_5min_quantil1
df_5min['q3'] = df_5min_quantil3
df_5min.head(10)
# df_1hora[['q1', 'q3']] = [df_1hora_quantil1,df_1hora_quantil3]


,ibi,ibi,ibi,ibi,ibi,q1,q3
datetime,mean,median,max,min,std,,
2020-07-16 09:30:00,0.776749,0.781286,0.906291,0.703157,0.032791,0.750034,0.796911
2020-07-16 09:35:00,0.785514,0.781286,0.953169,0.65628,0.050713,0.750034,0.812537
2020-07-16 09:40:00,0.773977,0.781286,1.000046,0.578151,0.073439,0.734409,0.796911
2020-07-16 09:45:00,0.798779,0.796911,0.968794,0.640654,0.048668,0.76566,0.828163
2020-07-16 09:50:00,0.793604,0.796911,0.98442,0.531274,0.066225,0.76566,0.828163
2020-07-16 09:55:00,0.812537,0.812537,1.046923,0.65628,0.046587,0.781286,0.843789
2020-07-16 10:00:00,0.623808,0.578151,0.906291,0.43752,0.132627,0.531274,0.734409
2020-07-16 10:05:00,0.547551,0.539087,0.687531,0.43752,0.062527,0.500023,0.582058
2020-07-16 10:10:00,0.592858,0.617216,0.734409,0.453146,0.075865,0.519555,0.652373
2020-07-16 10:15:00,0.694702,0.687531,0.890666,0.578151,0.04814,0.65628,0.718783


In [59]:
df_5min.count()

 ibi  mean      1824
      median    1824
      max       1824
      min       1824
      std       1817
q1              1824
q3              1824
dtype: int64

In [60]:
# Same for the 1-hour dataset
df_1hora['q1'] = df_1hora_quantil1
df_1hora['q3'] = df_1hora_quantil3
df_1hora.head(10)

datetime,mean,median,max,min,std,q1,q3
2020-07-16 09:00:00,0.794229,0.796911,1.046923,0.531274,0.054648,0.76566,0.828163
2020-07-16 10:00:00,0.715725,0.718783,1.0938,0.43752,0.077991,0.687531,0.76566
2020-07-16 11:00:00,0.711695,0.718783,1.20318,0.468771,0.083445,0.65628,0.76566
2020-07-16 12:00:00,0.767768,0.76566,1.046923,0.500023,0.069575,0.734409,0.812537
2020-07-16 13:00:00,0.71466,0.718783,1.20318,0.500023,0.075159,0.671906,0.76566
2020-07-16 14:00:00,0.72475,0.734409,1.31256,0.468771,0.102345,0.65628,0.781286
2020-07-16 15:00:00,0.623986,0.625029,0.906291,0.406269,0.070118,0.578151,0.671906
2020-07-16 16:00:00,0.598112,0.593777,1.062549,0.43752,0.076431,0.531274,0.65628
2020-07-16 17:00:00,0.649608,0.65628,0.968794,0.390643,0.066375,0.625029,0.687531
2020-07-16 18:00:00,0.709847,0.718783,1.000046,0.43752,0.069386,0.671906,0.750034


In [61]:
df_1hora.count()

mean      160
median    160
max       160
min       160
std       160
q1        160
q3        160
dtype: int64

In [62]:
# Export the results to CSV files
df_5min.to_csv("IBI_5min.csv")
df_1hora.to_csv("IBI_1hora.csv")

### CSV FILES GENERATED SUCCESSFULLY FOR 5 MIN AND 1 HOUR

In this part we compute the heart rate variability (HRV, "VFC" in Spanish) metrics. For this we use a library created by Digital Biomarkers Discovery, which processes the data in 5-minute windows.
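
For orientation, several of the time-domain metrics that appear in the results (MeanRR, MeanHR, SDNN, RMSSD, NNx, pNNx) have standard textbook definitions that can be sketched directly with NumPy. The block below is only an illustration under those standard definitions, with a 50 ms threshold assumed for NNx. It is not the implementation used by the library, and its outputs may differ from the library's. `time_domain_hrv` is a hypothetical helper name.

```python
import numpy as np

def time_domain_hrv(ibi_seconds, nn_threshold_ms=50.0):
    """Standard time-domain HRV metrics from a sequence of IBIs in seconds."""
    rr_ms = np.asarray(ibi_seconds, dtype=float) * 1000.0  # seconds -> ms
    diffs = np.diff(rr_ms)                                  # successive differences
    nnx = int(np.sum(np.abs(diffs) > nn_threshold_ms))
    return {
        "MeanRR": rr_ms.mean(),                 # mean interval, ms
        "MeanHR": (60000.0 / rr_ms).mean(),     # mean instantaneous heart rate, bpm
        "SDNN": rr_ms.std(ddof=1),              # sample standard deviation, ms
        "RMSSD": np.sqrt(np.mean(diffs ** 2)),  # root mean square of differences, ms
        "NNx": nnx,                             # count of |diff| above the threshold
        "pNNx": 100.0 * nnx / len(diffs),       # same count, as a percentage
    }

# First five IBI values shown in this notebook
metrics = time_domain_hrv([0.734409, 0.781286, 0.76566, 0.703157, 0.796911])
print(metrics)
```

The frequency-domain metrics (PowerVLF, PowerLF, PowerHF and related ratios) require spectral estimation and are left out of this sketch.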

In [78]:
# Now compute the HRV metrics
# First import the library provided by Digital Biomarkers Discovery
import BIL_HRV as bh
import os
import time

In [80]:
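# Function that computes the HRV metrics (MeanRR, MeanHR, etc.) for one window
TEMPORAL_NAME = 'test.csv'
def calculate_hr(df):
    time.sleep(0.2)
    # Work on a copy so the resampled group is not modified in place
    df = df.copy()
    # Coerce non-numeric values to NaN and drop those rows
    df[' ibi'] = pd.to_numeric(df[' ibi'], errors='coerce')
    df = df.dropna(subset=[' ibi'])
    df[' ibi'] = df[' ibi'].astype(float)
    # bh.hrv is called with a file name, so write the window to a temporary CSV
    df.to_csv(TEMPORAL_NAME)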
    try:
        results = bh.hrv(TEMPORAL_NAME)
    except Exception:
        print("Exception found, Default value response")
        # Fall back to a dictionary of zeroed metrics
        results = {
            'MeanRR': 0.0,
            'MeanHR': 0.0,
            'MinHR': 0.0,
            'MaxHR': 0.0,
            'SDNN': 0.0,
            'RMSSD': 0.0,
            'NNx': 0.0,
            'pNNx': 0.0,
            'PowerVLF': 0.0,
            'PowerLF': 0.0,
            'PowerHF': 0.0,
            'PowerTotal': 0.0,
            'LF/HF': 0.0,
            'PeakVLF': 0.0,
            'PeakLF': 0.0,
            'PeakHF': 0.0,
            'FractionLF': 0.0,
            'FractionHF': 0.0
        }
    # Delete the temporary file
    os.remove(TEMPORAL_NAME)
    return results
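
The function above writes each window to a fixed file name and removes it afterwards. One possible variation, sketched below, uses a uniquely named temporary file and a `finally` clause so the file is removed even if the analysis raises. `run_on_temp_csv` and `fake_analysis` are hypothetical names; `fake_analysis` merely stands in for a file-based routine such as the one called above and is not part of any library.

```python
import os
import tempfile

import pandas as pd

def run_on_temp_csv(df, analyze, default):
    """Write df to a temporary CSV, run analyze(path), and always clean up."""
    fd, path = tempfile.mkstemp(suffix=".csv")
    os.close(fd)  # only the path is needed; pandas reopens the file
    try:
        df.to_csv(path)
        return analyze(path)
    except Exception as exc:  # narrow this to the errors you expect
        print(f"Analysis failed ({exc!r}); returning default values")
        return dict(default)
    finally:
        os.remove(path)

def fake_analysis(path):
    """Hypothetical stand-in: mean of the ' ibi' column, in milliseconds."""
    data = pd.read_csv(path)
    if data.empty:
        raise ValueError("empty window")
    return {"MeanRR": float(data[" ibi"].mean() * 1000.0)}

window = pd.DataFrame({" ibi": [0.75, 0.80, 0.85]})
ok = run_on_temp_csv(window, fake_analysis, {"MeanRR": 0.0})
empty = run_on_temp_csv(window.iloc[0:0], fake_analysis, {"MeanRR": 0.0})
print(ok, empty)
```

Printing the exception makes it easier to tell an empty window apart from a genuine failure inside the analysis.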

In [81]:
import warnings
# Apply the function to each 5-minute window of the resampled DataFrame
# The action= argument of catch_warnings requires Python 3.11 or later
with warnings.catch_warnings(action="ignore"):
    resampled = df_procesado_5min.apply(calculate_hr).apply(pd.Series)

Exception found, Default value response
... (the same message appears 25 times in the captured output)
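
The `action=` argument used in the cell above was added to `warnings.catch_warnings` in Python 3.11. On earlier versions, the same effect can be obtained with `warnings.simplefilter` inside the context manager. A minimal sketch, with `noisy` as a hypothetical function that emits a warning:

```python
import warnings

def noisy():
    """Hypothetical function that emits a warning and returns a value."""
    warnings.warn("this would normally be shown", UserWarning)
    return 42

# Works on Python versions before and after 3.11.
# record=True is only used here to confirm that nothing was captured.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("ignore")
    value = noisy()

print(value, len(caught))
```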


In [82]:
# Join the HRV results to the 5-minute mean DataFrame
df_resampled = df_procesado_5min.mean()

df_resampled['MeanRR'] = resampled['MeanRR']
df_resampled['MeanHR'] = resampled['MeanHR']
df_resampled['MinHR'] = resampled['MinHR']
df_resampled['MaxHR'] = resampled['MaxHR']
df_resampled['SDNN'] = resampled['SDNN']
df_resampled['RMSSD'] = resampled['RMSSD']
df_resampled['NNx'] = resampled['NNx']
df_resampled['pNNx'] = resampled['pNNx']
df_resampled['PowerVLF'] = resampled['PowerVLF']
df_resampled['PowerLF'] = resampled['PowerLF']
df_resampled['PowerHF'] = resampled['PowerHF']
df_resampled['PowerTotal'] = resampled['PowerTotal']
df_resampled['LF/HF'] = resampled['LF/HF']
df_resampled['PeakVLF'] = resampled['PeakVLF']
df_resampled['PeakLF'] = resampled['PeakLF']
df_resampled['PeakHF'] = resampled['PeakHF']
df_resampled['FractionLF'] = resampled['FractionLF']
df_resampled['FractionHF'] = resampled['FractionHF']
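
Since both frames share the same 5-minute index, the column-by-column assignment above could also be written as a single `join`. A minimal sketch on synthetic frames; `means` and `metrics` are hypothetical stand-ins for `df_resampled` and `resampled`:

```python
import pandas as pd

idx = pd.to_datetime(["2020-07-16 09:30:00", "2020-07-16 09:35:00"])
idx.name = "datetime"

# Stand-ins for the per-window mean IBI and the per-window HRV metrics
means = pd.DataFrame({" ibi": [0.776749, 0.785514]}, index=idx)
metrics = pd.DataFrame({"MeanRR": [778.1, 787.8], "MeanHR": [77.1, 76.2]}, index=idx)

# Align on the shared index and append all metric columns at once
combined = means.join(metrics)
print(combined)
```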

In [83]:
df_resampled.head()

datetime,ibi,MeanRR,MeanHR,MinHR,MaxHR,SDNN,RMSSD,NNx,pNNx,PowerVLF,PowerLF,PowerHF,PowerTotal,LF/HF,PeakVLF,PeakLF,PeakHF,FractionLF,FractionHF
2020-07-16 09:30:00,0.776749,778.1,77.1,73.6,78.8,11.7,49.8,24.0,26.1,39.84,57.64,334.6,432.08,0.17,0.02,0.14,0.34,14.7,85.3
2020-07-16 09:35:00,0.785514,787.8,76.2,73.1,80.3,15.3,73.7,27.0,32.1,95.94,288.62,673.65,1058.2,0.43,0.03,0.09,0.39,29.99,70.01
2020-07-16 09:40:00,0.773977,779.3,77.1,69.6,81.7,33.8,86.7,25.0,41.0,240.52,762.92,653.33,1656.78,1.17,0.02,0.09,0.3,53.87,46.13
2020-07-16 09:45:00,0.798779,797.9,75.2,71.6,79.5,18.3,66.9,59.0,37.3,101.57,249.34,788.83,1139.74,0.32,0.03,0.05,0.36,24.02,75.98
2020-07-16 09:50:00,0.793604,801.0,75.0,71.9,86.7,24.2,83.4,50.0,36.8,131.81,318.4,1115.08,1565.28,0.29,0.02,0.11,0.34,22.21,77.79


In [84]:
df_resampled.info()

<class 'pandas.core.frame.DataFrame'>
DatetimeIndex: 2172 entries, 2020-07-16 09:30:00 to 2020-07-23 22:25:00
Freq: 5T
Data columns (total 19 columns):
 #   Column      Non-Null Count  Dtype  
---  ------      --------------  -----  
 0    ibi        1824 non-null   float64
 1   MeanRR      2172 non-null   float64
 2   MeanHR      2172 non-null   float64
 3   MinHR       2172 non-null   float64
 4   MaxHR       2172 non-null   float64
 5   SDNN        2172 non-null   float64
 6   RMSSD       2172 non-null   float64
 7   NNx         2172 non-null   float64
 8   pNNx        2172 non-null   float64
 9   PowerVLF    2172 non-null   float64
 10  PowerLF     2172 non-null   float64
 11  PowerHF     2172 non-null   float64
 12  PowerTotal  2172 non-null   float64
 13  LF/HF       2161 non-null   float64
 14  PeakVLF     2172 non-null   float64
 15  PeakLF      2172 non-null   float64
 16  PeakHF      2172 non-null   float64
 17  FractionLF  2161 non-null   float64
 18  FractionHF  2161 non-nu

In [85]:
df_resampled = df_resampled.dropna(subset=[' ibi'])
df_resampled.info()

<class 'pandas.core.frame.DataFrame'>
DatetimeIndex: 1824 entries, 2020-07-16 09:30:00 to 2020-07-23 22:25:00
Data columns (total 19 columns):
 #   Column      Non-Null Count  Dtype  
---  ------      --------------  -----  
 0    ibi        1824 non-null   float64
 1   MeanRR      1824 non-null   float64
 2   MeanHR      1824 non-null   float64
 3   MinHR       1824 non-null   float64
 4   MaxHR       1824 non-null   float64
 5   SDNN        1824 non-null   float64
 6   RMSSD       1824 non-null   float64
 7   NNx         1824 non-null   float64
 8   pNNx        1824 non-null   float64
 9   PowerVLF    1824 non-null   float64
 10  PowerLF     1824 non-null   float64
 11  PowerHF     1824 non-null   float64
 12  PowerTotal  1824 non-null   float64
 13  LF/HF       1813 non-null   float64
 14  PeakVLF     1824 non-null   float64
 15  PeakLF      1824 non-null   float64
 16  PeakHF      1824 non-null   float64
 17  FractionLF  1813 non-null   float64
 18  FractionHF  1813 non-null   floa

In [86]:
df_resampled.to_csv("IBI_5min_hr_data.csv")