<a href="https://colab.research.google.com/github/janprince/ml_stress_detection/blob/main/stress_detection.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## Human Stress Detection
Using physiological data to detect human stress levels.

#### Features
- Humidity
- Temperature
- Step count
- Stress level (the target label)

This notebook detects and analyzes a person's stress level from physiological and physical-activity data. The dataset contains 2001 samples of body humidity, body temperature, and the number of steps taken by the user. Each sample is labeled with one of three stress classes: low, normal, or high. More information on how this data is analyzed can be found in L. Rachakonda, S. P. Mohanty, E. Kougianos, and P. Sundaravadivel, “Stress-Lysis: A DNN-Integrated Edge Device for Stress Level Detection in the IoMT,” IEEE Trans. Consum. Electron., vol. 65, no. 4, pp. 474–483, 2019.

### Exploratory data analysis

In [2]:
import pandas as pd

data = pd.read_csv("https://raw.githubusercontent.com/janprince/ml_stress_detection/main/stress_lysis.csv")

data.head()

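|   | Humidity | Temperature | Step count | Stress Level |
|---|----------|-------------|------------|--------------|
| 0 | 21.33 | 90.33 | 123 | 1 |
| 1 | 21.41 | 90.41 | 93 | 1 |
| 2 | 27.12 | 96.12 | 196 | 2 |
| 3 | 27.64 | 96.64 | 177 | 2 |
| 4 | 10.87 | 79.87 | 87 | 0 |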


In [4]:
data.tail()

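|      | Humidity | Temperature | Step count | Stress Level |
|------|----------|-------------|------------|--------------|
| 1996 | 21.82 | 90.82 | 96 | 1 |
| 1997 | 10.45 | 79.45 | 45 | 0 |
| 1998 | 27.22 | 96.22 | 135 | 2 |
| 1999 | 12.46 | 81.46 | 64 | 0 |
| 2000 | 16.87 | 85.87 | 50 | 1 |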


In [7]:
data.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 2001 entries, 0 to 2000
Data columns (total 4 columns):
 #   Column        Non-Null Count  Dtype  
---  ------        --------------  -----  
 0   Humidity      2001 non-null   float64
 1   Temperature   2001 non-null   float64
 2   Step count    2001 non-null   int64  
 3   Stress Level  2001 non-null   int64  
dtypes: float64(2), int64(2)
memory usage: 62.7 KB


In [6]:
data["Stress Level"].value_counts()

1    790
2    710
0    501
Name: Stress Level, dtype: int64
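
The counts above show that the three classes are not equally represented. As a quick illustration, the sketch below turns those counts into proportions. The counts are copied from the output above rather than re-read from the CSV, so the `counts` Series is a stand-in; on the real DataFrame, `data["Stress Level"].value_counts(normalize=True)` should give the same shares directly.

```python
import pandas as pd

# Class counts copied from the value_counts() output above
# (a stand-in for data["Stress Level"].value_counts()).
counts = pd.Series({1: 790, 2: 710, 0: 501}, name="Stress Level")

# Share of each class in the full dataset, rounded to three decimals
proportions = (counts / counts.sum()).round(3)
print(proportions)
```

The smallest class still makes up about a quarter of the samples, so the imbalance is mild.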

### Splitting datasets

In [13]:
from sklearn.model_selection import train_test_split

# separate the features (X) from the labels (y)
X = data.drop(["Stress Level"], axis=1)
y = data["Stress Level"].values

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.4, random_state=42)
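
The split above is random, so the class shares in the training and test sets can drift slightly from the overall shares. One possible variation, not used in this notebook, is to pass `stratify=y` to `train_test_split`. The sketch below rebuilds a label vector from the class counts reported earlier and uses a placeholder feature matrix (an assumption, since only the row count matters for this illustration).

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Labels rebuilt from the reported class counts (0: 501, 1: 790, 2: 710).
y_demo = np.repeat([0, 1, 2], [501, 790, 710])
# Placeholder features (hypothetical): one column of row indices.
X_demo = np.arange(len(y_demo)).reshape(-1, 1)

# Same split settings as above, plus stratification on the labels
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.4, random_state=42, stratify=y_demo
)

# Class shares overall and in each split
overall_share = np.bincount(y_demo) / len(y_demo)
train_share = np.bincount(y_tr) / len(y_tr)
test_share = np.bincount(y_te) / len(y_te)

print("train size:", len(y_tr), "test size:", len(y_te))
print("overall:", overall_share.round(3))
print("train:  ", train_share.round(3))
print("test:   ", test_share.round(3))
```

With stratification, the per-class shares in both splits stay close to the overall shares.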

### Training a model on training data

In [20]:
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import cross_val_score

model = KNeighborsClassifier(n_neighbors=5)

# fit the k-nearest neighbors classifier to the training data
model.fit(X_train, y_train)

KNeighborsClassifier()
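
k-nearest neighbors compares samples by distance, so a feature with a larger numeric range (in the rows shown, step count runs into the hundreds while humidity stays in the tens) can dominate the result. A common option is to standardize the features inside a pipeline. The sketch below shows the pattern on synthetic stand-in data from `make_classification`, which is an assumption for illustration and not the stress dataset. Whether scaling changes the results on the real data is not reported in this notebook.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (NOT the stress dataset): 3 features, 3 classes.
X_syn, y_syn = make_classification(
    n_samples=2001, n_features=3, n_informative=3,
    n_redundant=0, n_classes=3, random_state=42,
)
# Inflate one feature so the columns sit on very different scales.
X_syn[:, 2] *= 100

# The scaler is fitted on each training fold only, then applied to the
# corresponding validation fold, which avoids leaking validation statistics.
scaled_knn = make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5))
scaled_scores = cross_val_score(scaled_knn, X_syn, y_syn, cv=5)
print(scaled_scores)
```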

In [17]:
# accuracy of predictions on the training set
model.score(X_train, y_train)

0.9941666666666666
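
Training accuracy tends to be optimistic, because the model is scored on samples it has already seen. The split earlier holds out 40% of the data, and one way to use it is to score the fitted model on that held-out portion. The sketch below illustrates the steps on synthetic stand-in data (an assumption, so the numbers it prints say nothing about the stress dataset). In the notebook itself, the equivalent call would be `model.score(X_test, y_test)`.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Synthetic stand-in data (NOT the stress dataset): 3 features, 3 classes.
X_syn, y_syn = make_classification(
    n_samples=2001, n_features=3, n_informative=3,
    n_redundant=0, n_classes=3, random_state=42,
)

# Same split settings as in the notebook
Xs_train, Xs_test, ys_train, ys_test = train_test_split(
    X_syn, y_syn, test_size=0.4, random_state=42
)

knn = KNeighborsClassifier(n_neighbors=5)
knn.fit(Xs_train, ys_train)

# Score on samples the model has never seen
ys_pred = knn.predict(Xs_test)
test_accuracy = accuracy_score(ys_test, ys_pred)
# Rows are true classes, columns are predicted classes
cm = confusion_matrix(ys_test, ys_pred)

print("held-out accuracy:", round(test_accuracy, 3))
print(cm)
```

The confusion matrix shows which classes are mistaken for one another, which a single accuracy figure hides.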

In [19]:
# accuracy using cross-validation
scores = cross_val_score(model, X_train, y_train, cv=5)
scores

array([0.97916667, 0.95833333, 0.9875    , 0.9875    , 0.99166667])
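
The five fold scores are easier to compare with the training accuracy once they are reduced to a mean and a spread. The sketch below does that with the scores copied from the output above. The variable name `cv_scores` is chosen here for illustration.

```python
import numpy as np

# Fold accuracies copied from the cross_val_score output above
cv_scores = np.array([0.97916667, 0.95833333, 0.9875, 0.9875, 0.99166667])

mean_score = cv_scores.mean()
std_score = cv_scores.std()  # population standard deviation across folds

print(f"CV accuracy: {mean_score:.3f} +/- {std_score:.3f}")
```

The cross-validated mean sits a little below the training accuracy of 0.994, which is the expected direction since each fold is scored on samples the model did not train on.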