
# Predicting heart rate to monitor stress level

### Business Context

Anxiety and stress make your heart work harder. When you’re under stress, your
body’s “fight or flight” response is triggered: your body tenses, your blood
pressure rises and your heart beats faster. Stress hormones may damage the lining of
the arteries. In the current post-COVID scenario, with most of us indoors, stress
levels are at an all-time high because of increasing anxiety, which is leading to higher
heart rates. Your body's response to stress may also be a headache, back strain, or
stomach pains. Stress can also zap your energy, wreak havoc on your sleep and
make you feel cranky, forgetful and out of control.

A higher heart rate is not always better, since pathological conditions can lead to an
increased heart rate. Tachycardia refers to a fast resting heart rate, usually over 100
beats per minute. Tachycardia can be dangerous, depending on its underlying cause
and on how hard the heart has to work.
An optimal level of heart rate is associated with health and self-regulatory capacity,
and adaptability or resilience. Higher levels of resting vagally-mediated heart rate are
linked to performance of executive functions like attention and emotional processing
by the prefrontal cortex.

Higher heart rates are usually connected with higher stress levels. When stress is
excessive, it can contribute to everything from high blood pressure, also called
hypertension, to asthma, ulcers and irritable bowel syndrome.
Stress may also affect behaviors and factors that increase heart disease risk: high blood
pressure and cholesterol levels, smoking, physical inactivity and overeating. Some
people may choose to drink too much alcohol or smoke cigarettes to “manage” their
chronic stress; however, these habits can increase blood pressure and may damage
artery walls.

Thus, heart rate can be used to monitor your stress levels and keep them in check, as
it is a useful indicator of good health.

A recent study discusses the effects of stress on the increase in heart attacks among
30-40 year olds:
https://economictimes.indiatimes.com/magazines/panache/heart-attacks-on-the-riseamong-30-40-year-olds-diabetes-hypertension-are-contributing-factors/articleshow/66997025.cms

### About the Data

The data comprises various attributes taken from signals measured using **ECG**
recorded for different individuals having different heart rates at the time the
measurement was taken. These various features contribute to the heart rate at the
given instant of time for the individual.

You have been provided with a total of **7 CSV files** with the names as follows:

**time_domain_features_train.csv** - This file contains all time domain features of heart
rate for training data

**frequency_domain_features_train.csv** - This file contains all frequency domain
features of heart rate for training data

**heart_rate_non_linear_features_train.csv** - This file contains all non linear features of
heart rate for training data

**time_domain_features_test.csv** - This file contains all time domain features of heart
rate for testing data

**frequency_domain_features_test.csv** - This file contains all frequency domain
features of heart rate for testing data

**heart_rate_non_linear_features_test.csv** - This file contains all non linear features of
heart rate for testing data

**sample_submission.csv** - This file contains the format in which you need to make
submissions to the portal


### Following is the data dictionary for the features you will come across in the files mentioned:


**MEAN_RR** - Mean of RR intervals

**MEDIAN_RR** - Median of RR intervals

**SDRR** - Standard deviation of RR intervals

**RMSSD** - Root mean square of successive RR interval differences

**SDSD** - Standard deviation of successive RR interval differences

**SDRR_RMSSD** - Ratio of SDRR / RMSSD

**pNN25** - Percentage of successive RR intervals that differ by more than 25 ms

**pNN50** - Percentage of successive RR intervals that differ by more than 50 ms

**KURT** - Kurtosis of distribution of successive RR intervals

**SKEW** - Skew of distribution of successive RR intervals

**MEAN_REL_RR** - Mean of relative RR intervals

**MEDIAN_REL_RR** - Median of relative RR intervals

**SDRR_REL_RR** - Standard deviation of relative RR intervals

**RMSSD_REL_RR** - Root mean square of successive relative RR interval differences

**SDSD_REL_RR** - Standard deviation of successive relative RR interval differences

**SDRR_RMSSD_REL_RR** - Ratio of SDRR/RMSSD for relative RR interval differences

**KURT_REL_RR** - Kurtosis of distribution of relative RR intervals

**SKEW_REL_RR** - Skewness of distribution of relative RR intervals

**uuid** - Unique ID for each patient

**VLF** - Absolute power of the very low frequency band (0.0033 - 0.04 Hz)

**VLF_PCT** - VLF power as a percentage of total power (VLF_PCT, LF_PCT and HF_PCT sum to 100 in the sample rows)

**LF** - Absolute power of the low frequency band (0.04 - 0.15 Hz)

**LF_PCT** - LF power as a percentage of total power

**LF_NU** - Power of the low frequency band in normalized units

**HF** - Absolute power of the high frequency band (0.15 - 0.4 Hz)

**HF_PCT** - HF power as a percentage of total power

**HF_NU** - Power of the high frequency band in normalized units

**TP** - Total power of RR intervals

**LF_HF** - Ratio of LF to HF

**HF_LF** - Ratio of HF to LF

**SD1** - Poincaré plot standard deviation perpendicular to the line of identity

**SD2** - Poincaré plot standard deviation along the line of identity

**sampen** - Sample entropy, which measures the regularity and complexity of a time series

**higuci** - Higuchi fractal dimension of heart rate

**datasetId** - ID of the whole dataset

**condition** - condition of the patient at the time the data was recorded

**HR** - Heart rate of the patient at the time the data was recorded
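
To make a few of these definitions concrete, the sketch below computes some time-domain features from a short, made-up list of RR intervals, and checks some frequency-domain relations against the band powers in the first row of the training data shown later in this notebook. The function names are hypothetical, and the use of the sample standard deviation (`ddof=1`) is an assumption; the dataset's exact conventions are not documented here.

```python
import numpy as np


def time_domain_features(rr_ms):
    """Compute a few time-domain features from RR intervals given in milliseconds."""
    rr = np.asarray(rr_ms, dtype=float)
    diffs = np.diff(rr)  # successive RR interval differences
    return {
        "MEAN_RR": float(rr.mean()),
        "MEDIAN_RR": float(np.median(rr)),
        "SDRR": float(rr.std(ddof=1)),  # assumption: sample standard deviation
        "RMSSD": float(np.sqrt(np.mean(diffs ** 2))),
        "SDSD": float(diffs.std(ddof=1)),
        "pNN25": 100.0 * float(np.mean(np.abs(diffs) > 25)),
        "pNN50": 100.0 * float(np.mean(np.abs(diffs) > 50)),
    }


def band_power_ratios(vlf, lf, hf):
    """Derive total power, percentages, normalized units and LF/HF from band powers."""
    total = vlf + lf + hf
    return {
        "TP": total,
        "VLF_PCT": 100.0 * vlf / total,
        "LF_NU": 100.0 * lf / (lf + hf),
        "HF_NU": 100.0 * hf / (lf + hf),
        "LF_HF": lf / hf,
    }


# Made-up RR intervals (ms), for illustration only
features = time_domain_features([800, 850, 790, 900])

# VLF, LF and HF taken from the first row of the training data
ratios = band_power_ratios(2661.894136, 1009.249419, 15.522603)
print(features)
print(ratios)
```

With the first-row band powers, the derived TP, VLF_PCT, LF_NU and LF_HF agree with the corresponding columns of that row to about three decimal places, which suggests these relations hold in the dataset.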


### Objective

The objective is to build a regression model that predicts the heart rate of an individual. This prediction can help monitor the individual's stress levels.

### Evaluation Metric

#### Mean Absolute Error :

$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}|\hat{y}_i - y_i|$

n - total number of predicted samples

$\hat{y}_i$ - predicted output for sample $i$

$y_i$ - actual output for sample $i$
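
As a quick illustration, mean absolute error is the average absolute difference between predicted and actual values. The sketch below computes it by hand with NumPy on three made-up heart rate values; the function name and the numbers are hypothetical.

```python
import numpy as np


def mean_absolute_error_manual(actual, predicted):
    """Average of the absolute differences between predictions and actual values."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.mean(np.abs(predicted - actual)))


# Made-up values: absolute errors are 2, 1 and 3, so the mean is 2.0
actual_hr = [70.0, 65.0, 80.0]
predicted_hr = [72.0, 64.0, 77.0]
mae = mean_absolute_error_manual(actual_hr, predicted_hr)
print(mae)
```

scikit-learn's `mean_absolute_error`, imported later in this notebook, computes the same quantity.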

#### Submission Process

- You are required to submit a CSV file which contains the uuid and its predicted
label (HR).
- Please note that the file should be in CSV format, as shown in
sample_submission.csv.
- Please ensure that the submission file contains all the test instances.
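
A submission file along these lines could be assembled as in the sketch below. It assumes the expected columns are named `uuid` and `HR`; sample_submission.csv is not shown here, so check the column names and order against that file. The helper name and the demo values are hypothetical.

```python
import pandas as pd


def build_submission(uuids, predictions):
    """Pair each test uuid with its predicted heart rate."""
    uuids = list(uuids)
    predictions = list(predictions)
    if len(uuids) != len(predictions):
        raise ValueError("Every test instance needs exactly one prediction")
    # Assumption: the portal expects columns named 'uuid' and 'HR'
    return pd.DataFrame({"uuid": uuids, "HR": predictions})


# Made-up ids and predictions, for illustration only
demo_submission = build_submission(["id-1", "id-2"], [69.5, 74.2])

# index=False keeps the row index out of the CSV;
# pass a file path to to_csv to write the file instead of returning text
csv_text = demo_submission.to_csv(index=False)
print(csv_text)
```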

Importing Libraries


In [1]:
# For numerical calculations
import numpy as np

# For working with datasets
import pandas as pd


# For plotting images and Graphs
%matplotlib inline
import matplotlib.pyplot as plt

# For Logging
import logging

# For encoding categorical variables
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Importing seaborn for statistical plots
import seaborn as sns

from sklearn.model_selection import train_test_split

from sklearn import linear_model
from sklearn.linear_model import LogisticRegression, LassoLars

from sklearn.metrics import mean_absolute_error

Importing Datasets


In [2]:
# Folder on Google Drive that holds the CSV files
DATA_DIR = "/content/drive/MyDrive/AIML/HeartRate"

heart_rate_non_linear_features_train = pd.read_csv(f"{DATA_DIR}/heart_rate_non_linear_features_train.csv")
frequency_domain_features_train = pd.read_csv(f"{DATA_DIR}/frequency_domain_features_train.csv")
time_domain_features_train = pd.read_csv(f"{DATA_DIR}/time_domain_features_train.csv")

heart_rate_non_linear_features_test = pd.read_csv(f"{DATA_DIR}/heart_rate_non_linear_features_test.csv")
frequency_domain_features_test = pd.read_csv(f"{DATA_DIR}/frequency_domain_features_test.csv")
time_domain_features_test = pd.read_csv(f"{DATA_DIR}/time_domain_features_test.csv")


Enable Logging

In [5]:
# Enabling logging; without a filename, basicConfig logs to stderr, so filemode is not needed
logging.basicConfig(format='%(asctime)s - %(message)s')
logger = logging.getLogger('Heart Rate Prediction')
logger.setLevel(logging.DEBUG)

In [6]:
logger.info("Training Dataset Shape")
logger.info(heart_rate_non_linear_features_train.shape)
logger.info(frequency_domain_features_train.shape)
logger.info(time_domain_features_train.shape)


logger.info("Test Dataset Shape")

logger.info(heart_rate_non_linear_features_test.shape)
logger.info(frequency_domain_features_test.shape)
logger.info(time_domain_features_test.shape)

2020-12-14 07:37:53,519 - Training Dataset Shape
2020-12-14 07:37:53,524 - (369289, 7)
2020-12-14 07:37:53,526 - (369289, 12)
2020-12-14 07:37:53,527 - (369289, 20)
2020-12-14 07:37:53,529 - Test Dataset Shape
2020-12-14 07:37:53,530 - (41033, 7)
2020-12-14 07:37:53,531 - (41033, 12)
2020-12-14 07:37:53,532 - (41033, 19)


In [7]:
# Note: DataFrame.info() prints its report directly and returns None,
# so each logger.info(...info()) call below records 'None' in the log.
logger.info("Training Dataset Info")
logger.info(heart_rate_non_linear_features_train.info())
logger.info(frequency_domain_features_train.info())
logger.info(time_domain_features_train.info())

logger.info("Testing Dataset Info")
logger.info(heart_rate_non_linear_features_test.info())
logger.info(frequency_domain_features_test.info())
logger.info(time_domain_features_test.info())

2020-12-14 07:52:27,586 - Training Dataset Info
2020-12-14 07:52:27,659 - None
2020-12-14 07:52:27,698 - None
2020-12-14 07:52:27,746 - None
2020-12-14 07:52:27,747 - Testing Dataset Info
2020-12-14 07:52:27,765 - None
2020-12-14 07:52:27,779 - None
2020-12-14 07:52:27,793 - None


<class 'pandas.core.frame.DataFrame'>
RangeIndex: 369289 entries, 0 to 369288
Data columns (total 7 columns):
 #   Column     Non-Null Count   Dtype  
---  ------     --------------   -----  
 0   uuid       369289 non-null  object 
 1   SD1        369289 non-null  float64
 2   SD2        369289 non-null  float64
 3   sampen     369289 non-null  float64
 4   higuci     369289 non-null  float64
 5   datasetId  369289 non-null  int64  
 6   condition  369289 non-null  object 
dtypes: float64(4), int64(1), object(2)
memory usage: 19.7+ MB
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 369289 entries, 0 to 369288
Data columns (total 12 columns):
 #   Column   Non-Null Count   Dtype  
---  ------   --------------   -----  
 0   uuid     369289 non-null  object 
 1   VLF      369289 non-null  float64
 2   VLF_PCT  369289 non-null  float64
 3   LF       369289 non-null  float64
 4   LF_PCT   369289 non-null  float64
 5   LF_NU    369289 non-null  float64
 6   HF       369289 non-null  floa
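
The `None` entries in the log above arise because `DataFrame.info()` prints its report and returns `None`. If the report should go into the log itself, one option is to capture it in a text buffer through the `buf` parameter, as in this sketch. The helper name and the small demo frame are hypothetical.

```python
import io
import logging

import pandas as pd


def info_as_text(df):
    """Return the DataFrame.info() report as a string instead of printing it."""
    buffer = io.StringIO()
    df.info(buf=buffer)
    return buffer.getvalue()


# Small made-up frame standing in for one of the feature tables
demo_df = pd.DataFrame({"uuid": ["a", "b"], "SD1": [11.0, 9.2]})
info_text = info_as_text(demo_df)

# Reuses the logger name from this notebook
logging.getLogger("Heart Rate Prediction").info("Dataset info:\n%s", info_text)
```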

In [12]:
logger.info("Making Training Dataset")
final_dataset = pd.merge(heart_rate_non_linear_features_train, frequency_domain_features_train, on='uuid', how='outer')
final_dataset = pd.merge(final_dataset, time_domain_features_train, on='uuid', how='outer')

logger.info("Making Test Dataset")
final_test_dataset = pd.merge(heart_rate_non_linear_features_test, frequency_domain_features_test, on='uuid', how='outer')
final_test_dataset = pd.merge(final_test_dataset, time_domain_features_test, on='uuid', how='outer')


2020-12-14 07:54:51,585 - Making Training Dataset
2020-12-14 07:54:52,391 - Making Test Dataset
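
Since the three feature tables are joined on `uuid`, it can be worth checking that each `uuid` appears exactly once per table. The sketch below shows one way to do that with the `validate` argument of `pd.merge` and a row-count check, using small made-up frames. The helper name is hypothetical, and this is an optional safeguard rather than part of the original workflow.

```python
import pandas as pd


def merge_feature_tables(tables, key="uuid"):
    """Outer-merge feature tables on a key, failing if the key is duplicated in any table."""
    merged = tables[0]
    for table in tables[1:]:
        # validate="one_to_one" raises pandas.errors.MergeError on duplicate keys
        merged = pd.merge(merged, table, on=key, how="outer", validate="one_to_one")
    # With an outer join, extra rows would mean some keys are missing from a table
    if any(len(table) != len(merged) for table in tables):
        raise ValueError("Tables do not share the same set of keys")
    return merged


# Made-up stand-ins for the non-linear, frequency-domain and time-domain tables
non_linear = pd.DataFrame({"uuid": ["a", "b"], "SD1": [11.0, 9.2]})
frequency = pd.DataFrame({"uuid": ["a", "b"], "LF": [1009.2, 690.1]})
time_domain = pd.DataFrame({"uuid": ["b", "a"], "MEAN_RR": [939.4, 885.2]})

merged_demo = merge_feature_tables([non_linear, frequency, time_domain])
print(merged_demo)
```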


In [11]:
final_dataset.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 369289 entries, 0 to 369288
Data columns (total 37 columns):
 #   Column             Non-Null Count   Dtype  
---  ------             --------------   -----  
 0   uuid               369289 non-null  object 
 1   SD1                369289 non-null  float64
 2   SD2                369289 non-null  float64
 3   sampen             369289 non-null  float64
 4   higuci             369289 non-null  float64
 5   datasetId          369289 non-null  int64  
 6   condition          369289 non-null  object 
 7   VLF                369289 non-null  float64
 8   VLF_PCT            369289 non-null  float64
 9   LF                 369289 non-null  float64
 10  LF_PCT             369289 non-null  float64
 11  LF_NU              369289 non-null  float64
 12  HF                 369289 non-null  float64
 13  HF_PCT             369289 non-null  float64
 14  HF_NU              369289 non-null  float64
 15  TP                 369289 non-null  float64
 16  LF

In [30]:
final_dataset.head(10)

Unnamed: 0,uuid,SD1,SD2,sampen,higuci,datasetId,condition,VLF,VLF_PCT,LF,LF_PCT,LF_NU,HF,HF_PCT,HF_NU,TP,LF_HF,HF_LF,MEAN_RR,MEDIAN_RR,SDRR,RMSSD,SDSD,SDRR_RMSSD,HR,pNN25,pNN50,KURT,SKEW,MEAN_REL_RR,MEDIAN_REL_RR,SDRR_REL_RR,RMSSD_REL_RR,SDSD_REL_RR,SDRR_RMSSD_REL_RR,KURT_REL_RR,SKEW_REL_RR
0,89df2855-56eb-4706-a23b-b39363dd605a,11.001565,199.061782,2.139754,1.163485,2,no stress,2661.894136,72.203287,1009.249419,27.375666,98.485263,15.522603,0.421047,1.514737,3686.666157,65.018055,0.01538,885.157845,853.76373,140.972741,15.554505,15.553371,9.063146,69.499952,11.133333,0.533333,-0.856554,0.335218,-0.000203,-0.000179,0.01708,0.007969,0.007969,2.143342,-0.856554,0.335218
1,80c795e4-aa56-4cc0-939c-19634b89cbb2,9.170129,114.634458,2.174499,1.084711,2,interruption,2314.26545,76.975728,690.113275,22.954139,99.695397,2.108525,0.070133,0.304603,3006.487251,327.296635,0.003055,939.425371,948.357865,81.317742,12.964439,12.964195,6.272369,64.36315,5.6,0.0,-0.40819,-0.155286,-5.9e-05,0.000611,0.013978,0.004769,0.004769,2.930855,-0.40819,-0.155286
2,c2d5d102-967c-487d-88f2-8b005a449f3e,11.533417,118.939253,2.13535,1.176315,2,interruption,1373.887112,51.152225,1298.222619,48.335104,98.950472,13.769729,0.512671,1.049528,2685.879461,94.28091,0.010607,898.186047,907.00686,84.497236,16.305279,16.305274,5.182201,67.450066,13.066667,0.2,0.351789,-0.656813,-1.1e-05,-0.000263,0.018539,0.008716,0.008716,2.127053,0.351789,-0.656813
3,37eabc44-1349-4040-8896-0d113ad4811f,11.119476,127.318597,2.178341,1.179688,2,no stress,2410.357408,70.180308,1005.981659,29.290305,98.224706,18.181913,0.529387,1.775294,3434.52098,55.328701,0.018074,881.757865,893.46003,90.370537,15.720468,15.720068,5.748591,68.809562,11.8,0.133333,-0.504947,-0.386138,0.000112,0.000494,0.017761,0.00866,0.00866,2.050988,-0.504947,-0.386138
4,aa777a6a-7aa3-4f6e-aced-70f8691dd2b7,13.590641,87.718281,2.221121,1.249612,2,no stress,1151.17733,43.918366,1421.782051,54.24216,96.720007,48.215822,1.839473,3.279993,2621.175204,29.487873,0.033912,809.625331,811.184865,62.766242,19.213819,19.213657,3.266724,74.565728,20.2,0.2,-0.548408,-0.154252,-0.0001,-0.002736,0.023715,0.013055,0.013055,1.816544,-0.548408,-0.154252
5,fe7b4ab0-42d3-48d0-8479-7b022d6af0bc,7.026695,731.873468,0.582616,1.128483,2,no stress,3300.245844,95.316204,151.145149,4.365306,93.200171,11.02746,0.31849,6.799829,3462.418453,13.706252,0.072959,923.283866,617.79416,517.536544,9.965976,9.933933,51.930344,81.342254,1.2,0.6,-0.893858,1.026302,0.00075,0.00021,0.011061,0.005987,0.005987,1.847605,-0.893858,1.026302
6,d324b1ee-aaa1-4edb-9b46-c597cb0bbd8c,7.5287,116.295081,2.161461,1.158004,2,no stress,758.674608,61.022078,483.114475,38.858094,99.692575,1.489796,0.119828,0.307425,1243.278879,324.282351,0.003084,973.252908,964.65002,82.405179,10.644196,10.643638,7.741794,62.095066,2.0,0.0,-0.44267,0.102908,-0.000124,-0.000583,0.010997,0.004772,0.004772,2.30464,-0.44267,0.102908
7,cf272c21-98d8-45c1-9e2b-5ed1fe1864bd,6.703994,185.815874,1.110739,1.146555,2,no stress,1458.810124,75.758666,437.878087,22.739806,93.805918,28.913453,1.501528,6.194082,1925.601664,15.144441,0.066031,715.914682,679.499395,131.477151,9.477727,9.477717,13.872224,85.857703,2.533333,0.2,5.224736,2.452996,3.1e-05,3.8e-05,0.013206,0.006843,0.006843,1.929994,5.224736,2.452996
8,c65dcfb0-1774-4a2a-aa45-271083faeeaa,10.349326,122.621056,2.174233,1.122471,2,interruption,2124.9184,67.47932,1003.315816,31.861491,97.973018,20.757787,0.659188,2.026982,3148.992003,48.33443,0.020689,814.257021,827.52283,87.014459,14.632232,14.631275,5.946766,74.588857,7.733333,0.8,-0.455008,-0.371959,-0.000187,0.000714,0.018204,0.008242,0.008242,2.208637,-0.455008,-0.371959
9,79977df1-0e09-4873-bb31-15581002200b,8.498966,77.180192,2.1716,1.176054,2,no stress,1180.987047,69.230785,522.310281,30.618414,99.509898,2.572459,0.1508,0.490102,1705.869787,203.039304,0.004925,959.694591,957.8956,54.904529,12.0154,12.015343,4.569513,62.726998,3.266667,0.2,0.413338,-0.018134,5.1e-05,2e-05,0.012648,0.005719,0.005719,2.211739,0.413338,-0.018134


There are two 'object' datatype columns: 'uuid', which we can ignore, and 'condition', which we should convert using either label encoding or one-hot encoding.
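
The difference between the two encoding options can be seen on a handful of made-up 'condition' values. This sketch uses plain pandas (`cat.codes` and `get_dummies`) purely for illustration and is not the encoder this notebook applies to the full data.

```python
import pandas as pd

# Made-up sample of the three condition values
conditions = pd.Series(
    ["no stress", "interruption", "time pressure", "no stress"], name="condition"
)

# Label encoding: one integer per category (categories are sorted alphabetically),
# which implies an ordering that may not be meaningful for a linear model
label_codes = conditions.astype("category").cat.codes

# One-hot encoding: one indicator column per category; drop_first=True removes
# the first category ('interruption') to avoid redundant columns
one_hot = pd.get_dummies(conditions, prefix="condition", drop_first=True)

print(label_codes.tolist())
print(one_hot)
```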



In [10]:
final_test_dataset.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 41033 entries, 0 to 41032
Data columns (total 36 columns):
 #   Column             Non-Null Count  Dtype  
---  ------             --------------  -----  
 0   uuid               41033 non-null  object 
 1   SD1                41033 non-null  float64
 2   SD2                41033 non-null  float64
 3   sampen             41033 non-null  float64
 4   higuci             41033 non-null  float64
 5   datasetId          41033 non-null  int64  
 6   condition          41033 non-null  object 
 7   VLF                41033 non-null  float64
 8   VLF_PCT            41033 non-null  float64
 9   LF                 41033 non-null  float64
 10  LF_PCT             41033 non-null  float64
 11  LF_NU              41033 non-null  float64
 12  HF                 41033 non-null  float64
 13  HF_PCT             41033 non-null  float64
 14  HF_NU              41033 non-null  float64
 15  TP                 41033 non-null  float64
 16  LF_HF              410

The test dataset does not contain the 'HR' column (the target variable), so it has 36 columns instead of 37.

In [13]:
final_dataset.describe()

Unnamed: 0,SD1,SD2,sampen,higuci,datasetId,VLF,VLF_PCT,LF,LF_PCT,LF_NU,HF,HF_PCT,HF_NU,TP,LF_HF,HF_LF,MEAN_RR,MEDIAN_RR,SDRR,RMSSD,SDSD,SDRR_RMSSD,HR,pNN25,pNN50,KURT,SKEW,MEAN_REL_RR,MEDIAN_REL_RR,SDRR_REL_RR,RMSSD_REL_RR,SDSD_REL_RR,SDRR_RMSSD_REL_RR,KURT_REL_RR,SKEW_REL_RR
count,369289.0,369289.0,369289.0,369289.0,369289.0,369289.0,369289.0,369289.0,369289.0,369289.0,369289.0,369289.0,369289.0,369289.0,369289.0,369289.0,369289.0,369289.0,369289.0,369289.0,369289.0,369289.0,369289.0,369289.0,369289.0,369289.0,369289.0,369289.0,369289.0,369289.0,369289.0,369289.0,369289.0,369289.0,369289.0
mean,10.593708,154.178997,2.062471,1.182292,2.0,2199.58017,64.289242,946.530252,34.095182,95.566718,39.245603,1.615576,4.433282,3185.356025,115.9772,0.048506,846.650104,841.96589,109.352531,14.977498,14.976767,7.396597,73.941824,9.841143,0.866001,0.523235,0.041628,-1.756587e-06,-0.000465,0.018571,0.009701,0.009701,2.006817,0.523235,0.041628
std,2.914795,109.170222,0.206999,0.062192,0.0,1815.773422,16.774844,574.17178,16.04029,4.123365,45.398869,1.761073,4.123365,1923.227187,360.855129,0.049238,124.603984,132.321005,77.117025,4.120766,4.120768,5.143834,10.337453,8.195574,0.990189,1.790348,0.699522,0.0001630256,0.000868,0.005455,0.003897,0.003897,0.375845,1.790348,0.699522
min,3.911344,38.307745,0.434576,1.033984,2.0,159.480176,19.031219,90.048557,2.165119,69.879083,0.061783,0.00215,0.012825,377.692795,2.319952,0.000128,547.492221,517.293295,27.233947,5.529742,5.52963,2.660381,48.737243,0.0,0.0,-1.89482,-2.136278,-0.001233914,-0.004425,0.008987,0.00322,0.00322,1.169342,-1.89482,-2.136278
25%,8.36834,90.326864,2.032977,1.139929,2.0,1001.18928,52.909877,545.449386,22.305936,93.645734,10.720312,0.346803,1.228054,1828.147788,14.737458,0.012433,760.228533,755.750735,64.205641,11.830959,11.830671,4.541896,66.715776,3.666667,0.0,-0.352783,-0.359291,-7.281695e-05,-0.000917,0.014261,0.006984,0.006984,1.749801,-0.352783,-0.359291
50%,10.196621,116.221063,2.134214,1.174293,2.0,1667.903111,66.350237,782.716291,32.047025,96.64314,24.841938,1.039513,3.35686,2796.856587,28.789747,0.034735,822.951438,819.689595,82.608243,14.415918,14.415388,5.952112,74.217809,7.6,0.466667,0.040736,-0.060966,-9.330777e-07,-0.000312,0.017318,0.008691,0.008691,1.934416,0.040736,-0.060966
75%,12.679005,166.76485,2.181929,1.223621,2.0,2654.121052,76.825032,1201.432256,44.647115,98.771946,45.272368,2.245115,6.354266,4052.260157,80.429614,0.067854,924.117422,916.82157,118.237002,17.927144,17.924839,7.919841,80.334937,13.333333,1.466667,0.722833,0.282417,6.911667e-05,0.000131,0.021827,0.01146,0.01146,2.221232,0.722833,0.282417
max,18.836107,796.852945,2.234841,1.361219,2.0,12617.977191,97.738848,3291.548112,77.928847,99.987175,364.486936,13.095664,30.120917,13390.684098,7796.443096,0.431043,1322.016957,1653.12225,563.486949,26.629477,26.629392,54.52395,113.752309,39.4,5.466667,64.088107,6.7778,0.001244098,0.002095,0.036571,0.026955,0.026955,3.724134,64.088107,6.7778


In [18]:
# Fit the encoder on the training data only; drop="first" drops the first category ('interruption')
onehotencoder = OneHotEncoder(drop="first")

encoded = onehotencoder.fit_transform(final_dataset.condition.values.reshape(-1, 1)).toarray()
# get_feature_names_out replaces get_feature_names, which was removed in scikit-learn 1.2
encoded_columns = [name.replace(" ", "_") for name in onehotencoder.get_feature_names_out(["condition"])]
df_encoded = pd.DataFrame(encoded, columns=encoded_columns, index=final_dataset.index)
# Append the encoded columns, then drop the columns that are not used as features
final_dataset_encoded = pd.concat([final_dataset, df_encoded], axis=1).drop(["condition", "uuid", "datasetId"], axis=1)

# Reuse the encoder fitted on the training data (transform, not fit_transform)
testEncoded = onehotencoder.transform(final_test_dataset.condition.values.reshape(-1, 1)).toarray()
test_df_encoded = pd.DataFrame(testEncoded, columns=encoded_columns, index=final_test_dataset.index)
final_test_dataset_encoded = pd.concat([final_test_dataset, test_df_encoded], axis=1).drop(["condition", "uuid", "datasetId"], axis=1)

final_dataset_encoded.describe()

Unnamed: 0,SD1,SD2,sampen,higuci,VLF,VLF_PCT,LF,LF_PCT,LF_NU,HF,HF_PCT,HF_NU,TP,LF_HF,HF_LF,MEAN_RR,MEDIAN_RR,SDRR,RMSSD,SDSD,SDRR_RMSSD,HR,pNN25,pNN50,KURT,SKEW,MEAN_REL_RR,MEDIAN_REL_RR,SDRR_REL_RR,RMSSD_REL_RR,SDSD_REL_RR,SDRR_RMSSD_REL_RR,KURT_REL_RR,SKEW_REL_RR,condition_no_stress,condition_time_pressure
count,369289.0,369289.0,369289.0,369289.0,369289.0,369289.0,369289.0,369289.0,369289.0,369289.0,369289.0,369289.0,369289.0,369289.0,369289.0,369289.0,369289.0,369289.0,369289.0,369289.0,369289.0,369289.0,369289.0,369289.0,369289.0,369289.0,369289.0,369289.0,369289.0,369289.0,369289.0,369289.0,369289.0,369289.0,369289.0,369289.0
mean,10.593708,154.178997,2.062471,1.182292,2199.58017,64.289242,946.530252,34.095182,95.566718,39.245603,1.615576,4.433282,3185.356025,115.9772,0.048506,846.650104,841.96589,109.352531,14.977498,14.976767,7.396597,73.941824,9.841143,0.866001,0.523235,0.041628,-1.756587e-06,-0.000465,0.018571,0.009701,0.009701,2.006817,0.523235,0.041628,0.541803,0.17346
std,2.914795,109.170222,0.206999,0.062192,1815.773422,16.774844,574.17178,16.04029,4.123365,45.398869,1.761073,4.123365,1923.227187,360.855129,0.049238,124.603984,132.321005,77.117025,4.120766,4.120768,5.143834,10.337453,8.195574,0.990189,1.790348,0.699522,0.0001630256,0.000868,0.005455,0.003897,0.003897,0.375845,1.790348,0.699522,0.49825,0.378645
min,3.911344,38.307745,0.434576,1.033984,159.480176,19.031219,90.048557,2.165119,69.879083,0.061783,0.00215,0.012825,377.692795,2.319952,0.000128,547.492221,517.293295,27.233947,5.529742,5.52963,2.660381,48.737243,0.0,0.0,-1.89482,-2.136278,-0.001233914,-0.004425,0.008987,0.00322,0.00322,1.169342,-1.89482,-2.136278,0.0,0.0
25%,8.36834,90.326864,2.032977,1.139929,1001.18928,52.909877,545.449386,22.305936,93.645734,10.720312,0.346803,1.228054,1828.147788,14.737458,0.012433,760.228533,755.750735,64.205641,11.830959,11.830671,4.541896,66.715776,3.666667,0.0,-0.352783,-0.359291,-7.281695e-05,-0.000917,0.014261,0.006984,0.006984,1.749801,-0.352783,-0.359291,0.0,0.0
50%,10.196621,116.221063,2.134214,1.174293,1667.903111,66.350237,782.716291,32.047025,96.64314,24.841938,1.039513,3.35686,2796.856587,28.789747,0.034735,822.951438,819.689595,82.608243,14.415918,14.415388,5.952112,74.217809,7.6,0.466667,0.040736,-0.060966,-9.330777e-07,-0.000312,0.017318,0.008691,0.008691,1.934416,0.040736,-0.060966,1.0,0.0
75%,12.679005,166.76485,2.181929,1.223621,2654.121052,76.825032,1201.432256,44.647115,98.771946,45.272368,2.245115,6.354266,4052.260157,80.429614,0.067854,924.117422,916.82157,118.237002,17.927144,17.924839,7.919841,80.334937,13.333333,1.466667,0.722833,0.282417,6.911667e-05,0.000131,0.021827,0.01146,0.01146,2.221232,0.722833,0.282417,1.0,0.0
max,18.836107,796.852945,2.234841,1.361219,12617.977191,97.738848,3291.548112,77.928847,99.987175,364.486936,13.095664,30.120917,13390.684098,7796.443096,0.431043,1322.016957,1653.12225,563.486949,26.629477,26.629392,54.52395,113.752309,39.4,5.466667,64.088107,6.7778,0.001244098,0.002095,0.036571,0.026955,0.026955,3.724134,64.088107,6.7778,1.0,1.0


In [19]:
final_test_dataset_encoded.describe()

Unnamed: 0,SD1,SD2,sampen,higuci,VLF,VLF_PCT,LF,LF_PCT,LF_NU,HF,HF_PCT,HF_NU,TP,LF_HF,HF_LF,MEAN_RR,MEDIAN_RR,SDRR,RMSSD,SDSD,SDRR_RMSSD,pNN25,pNN50,KURT,SKEW,MEAN_REL_RR,MEDIAN_REL_RR,SDRR_REL_RR,RMSSD_REL_RR,SDSD_REL_RR,SDRR_RMSSD_REL_RR,KURT_REL_RR,SKEW_REL_RR,condition_no_stress,condition_time_pressure
count,41033.0,41033.0,41033.0,41033.0,41033.0,41033.0,41033.0,41033.0,41033.0,41033.0,41033.0,41033.0,41033.0,41033.0,41033.0,41033.0,41033.0,41033.0,41033.0,41033.0,41033.0,41033.0,41033.0,41033.0,41033.0,41033.0,41033.0,41033.0,41033.0,41033.0,41033.0,41033.0,41033.0,41033.0,41033.0
mean,10.60226,153.475051,2.063608,1.182641,2196.457487,64.221039,948.851848,34.168102,95.586613,39.012113,1.61086,4.413387,3184.321449,117.181622,0.048261,846.856304,842.112932,108.856642,14.989588,14.988858,7.369326,9.888347,0.862545,0.520719,0.044934,-2.872766e-08,-0.000468,0.018561,0.00969,0.00969,2.00809,0.520719,0.044934,0.540004,0.172861
std,2.927793,108.442308,0.207427,0.062489,1815.262687,16.838147,577.413541,16.105461,4.102133,44.994281,1.750709,4.102133,1923.544936,368.039153,0.048859,124.422504,131.976225,76.602112,4.139121,4.139143,5.14646,8.262497,0.986774,1.791907,0.69437,0.0001620372,0.000865,0.005451,0.003886,0.003886,0.377263,1.791907,0.69437,0.498403,0.378132
min,3.912661,38.454442,0.435664,1.03442,160.86973,19.301549,92.66398,2.163917,70.682584,0.062809,0.002211,0.012971,378.053805,2.410942,0.00013,547.483802,517.293295,27.338606,5.531588,5.531493,2.664721,0.0,0.0,-1.894831,-2.136234,-0.00127524,-0.004425,0.008987,0.00322,0.00322,1.186548,-1.894831,-2.136234,0.0,0.0
25%,8.370315,90.23688,2.034956,1.140058,994.970318,52.721914,544.688588,22.299698,93.659224,10.551975,0.338246,1.205268,1820.982721,14.77094,0.0122,760.387193,755.686825,64.132798,11.833717,11.833463,4.51762,3.6,0.0,-0.352526,-0.351531,-7.011447e-05,-0.000919,0.014237,0.006972,0.006972,1.751317,-0.352526,-0.351531,0.0,0.0
50%,10.207964,116.339025,2.135147,1.174465,1665.049572,66.240263,783.452494,32.171726,96.658088,24.654848,1.034102,3.341912,2799.392704,28.922987,0.034575,822.611612,819.624875,82.710729,14.431598,14.431425,5.934226,7.6,0.466667,0.039643,-0.053997,2.305116e-07,-0.000312,0.017305,0.008698,0.008698,1.935984,0.039643,-0.053997,1.0,0.0
75%,12.710279,166.211038,2.182363,1.224328,2657.910801,76.845957,1210.276983,44.754617,98.794732,45.004948,2.23706,6.340776,4055.490025,81.969112,0.0677,925.05352,917.98488,117.822094,17.969971,17.969053,7.906338,13.333333,1.4,0.711312,0.287011,7.000961e-05,0.000125,0.021839,0.011451,0.011451,2.220445,0.711312,0.287011,1.0,0.0
max,18.796168,796.293973,2.234642,1.360587,12427.177612,97.739588,3290.093757,77.522361,99.987029,360.877726,13.087775,29.317416,13249.507794,7708.369846,0.414776,1321.597359,1653.12225,563.092252,26.572991,26.572927,54.438959,39.4,5.4,54.650655,6.656766,0.001211792,0.002079,0.036575,0.026861,0.026861,3.705506,54.650655,6.656766,1.0,1.0


In [20]:
targetVariable = final_dataset_encoded["HR"]
sourceDataset = final_dataset_encoded.drop(["HR"], axis = 1)

sourceTestDataset = final_test_dataset_encoded
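
With features and target separated, a hold-out evaluation could look like the sketch below. Synthetic data stands in for `sourceDataset` and `targetVariable`, and `LinearRegression` is an assumption chosen for illustration, not necessarily the model this notebook uses. Comparing against a constant mean prediction gives a reference point for the MAE.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split

# Synthetic stand-in data: 500 samples, 3 features, noisy linear target
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))
y = 70 + 5 * X[:, 0] - 3 * X[:, 1] + rng.normal(scale=1.0, size=500)

# Hold out 20% of the labelled data for validation
X_train, X_valid, y_train, y_valid = train_test_split(
    X, y, test_size=0.2, random_state=42
)

model = LinearRegression().fit(X_train, y_train)
valid_mae = mean_absolute_error(y_valid, model.predict(X_valid))

# Reference point: always predict the mean of the training target
baseline_mae = mean_absolute_error(y_valid, np.full_like(y_valid, y_train.mean()))

print(f"validation MAE: {valid_mae:.3f}, baseline MAE: {baseline_mae:.3f}")
```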

Stress conditions available in the dataset


In [22]:
final_dataset["condition"].unique()

array(['no stress', 'interruption', 'time pressure'], dtype=object)

Heart rate details by condition

In [59]:
# Select 'HR' before aggregating so that only the target column is summarized
final_dataset.groupby(by=['condition'])[['HR']].agg(['min', 'max', 'count'])

condition,HR min,HR max,HR count
interruption,51.363126,113.752309,105150
no stress,50.975238,109.609794,200082
time pressure,48.737243,107.322036,64057
