# Challenge: Workshop and Challenge on Detection of Stress and Mental Health Using Wearable Sensors

<ul>
    <li><a href="#1">1. Data retrieval and cleaning</a></li>
</ul>
   
<ul>
   <li>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<a href="#1.1">1.1. Import libraries</a></li>
</ul>

<ul>
   <li>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<a href="#1.2">1.2. Retrieve dataset</a></li>
</ul>
<ul>
   <li>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<a href="#1.4">1.3. Selecting a sample</a></li>
</ul>


<ul>
   <li><a href="#4">4. Data statistics</a></li>
</ul>


<a id="1"></a>

## 1. Data retrieval and cleaning

<a id="1.1"></a>
### 1.1. Import libraries

In [1]:
import json
import os
import pandas as pd
import requests
import sys
import numpy as np

In [3]:
%pip install --upgrade pip

Defaulting to user installation because normal site-packages is not writeable
Note: you may need to restart the kernel to use updated packages.


<a id="1.2"></a>
### 1.2. Retrieve SMILE

The SMILE dataset was collected from 45 healthy adult participants (39 females and 6 males) in Belgium. The average age of participants was 24.5 years, with a standard deviation of 3.0 years. Each participant contributed an average of 8.7 days of data. Two types of wearable sensors were used for data collection. The first was a wrist-worn device (Chillband, IMEC, Belgium) that measures skin conductance (SC), skin temperature (ST), and acceleration (ACC). The second was a chest patch (Health Patch, IMEC, Belgium) that measures ECG and ACC; it contains a sensor node that monitored ECG at 256 Hz and ACC at 32 Hz continuously throughout the study period. Participants could remove the sensors while showering or before intense exercise. Participants also received daily notifications on their mobile phones asking them to report their momentary stress levels.

https://compwell.rice.edu/workshops/embc2022/dataset

In [4]:
dataset = np.load('dataset/dataset_smile_challenge.npy', allow_pickle=True).item()

    dict
        dictionary with dataset, with keys:
         * `train`
          * `deep_features`
           * `ECG_features_C`
           * `ECG_features_T`
           * `masking`
          * `hand_crafted_features`
           * `ECG_features`
           * `ECG_masking`
           * `GSR_features`
           * `GSR_masking`
          * `labels`
         * `test`
          * Same structure as `train`
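
The nested layout listed above can also be inspected programmatically. The following is a minimal sketch that walks a nested dictionary and reports the shape of every leaf. The helper `describe` and the stand-in dictionary `toy_train` are illustrative names that are not part of the challenge files, and the array sizes are made up; with the real data one would pass `dataset['train']` instead.

```python
import numpy as np

def describe(node, indent=0):
    """Return one text line per key of a nested dict, with leaf shapes."""
    lines = []
    pad = " " * indent
    for key, value in node.items():
        if isinstance(value, dict):
            lines.append(f"{pad}{key}")
            lines.extend(describe(value, indent + 2))
        else:
            lines.append(f"{pad}{key}: shape {np.shape(value)}")
    return lines

# Stand-in with the same keys as the 'train' split; all sizes are hypothetical.
toy_train = {
    "deep_features": {
        "ECG_features_C": np.zeros((4, 60, 16)),
        "ECG_features_T": np.zeros((4, 60, 16)),
        "masking": np.zeros((4, 60)),
    },
    "hand_crafted_features": {
        "ECG_features": np.zeros((4, 60, 8)),
        "ECG_masking": np.zeros((4, 60)),
        "GSR_features": np.zeros((4, 60, 12)),
        "GSR_masking": np.zeros((4, 60)),
    },
    "labels": np.zeros(4),
}

print("\n".join(describe(toy_train)))
```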

Let's explore the contents of the dataset dictionary.

In [5]:


# Training and test splits
dataset_train = dataset['train']
dataset_test = dataset['test']

# Deep features
deep_features = dataset_train['deep_features']
deep_features['ECG_features_C']  # ECG features from a conv1d backbone
deep_features['ECG_features_T']  # ECG features from a transformer backbone

# Hand-crafted features
handcrafted_features = dataset_train['hand_crafted_features']
handcrafted_features['ECG_features']  # hand-crafted features for the ECG signal
handcrafted_features['GSR_features']  # hand-crafted features for the GSR signal

# Labels
labels = dataset_train['labels']

In [6]:
len(dataset['train']['labels'])

2070

In [7]:
dataset['train'].keys()

dict_keys(['deep_features', 'hand_crafted_features', 'labels'])

The training split is a dictionary with three entries: the deep features, the hand-crafted features, and the labels.

<a id="4"></a>
## 4. Data statistics 

In [8]:
handcrafted_features['ECG_features'].shape

(2070, 60, 8)

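The SMILE dataset is loaded from the npy file as a dictionary.
Each feature matrix has 3 dimensions, here `(2070, 60, 8)` for the hand-crafted ECG features:
* sequence (2070 sequences of 60 minutes each)
* window (60 per sequence; each window covers 5 minutes and overlaps the next by 4 minutes)
* feature (8 values per window)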
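
To make the three axes concrete, here is a small sketch on a synthetic array with the same layout (the values are random and the number of sequences is reduced to 3; nothing here is read from the SMILE files). It also spells out the window stride implied by a 5-minute window with a 4-minute overlap.

```python
import numpy as np

# Synthetic stand-in laid out as (sequences, windows per sequence, features).
rng = np.random.default_rng(0)
features = rng.random((3, 60, 8))

one_sequence = features[0]       # all 60 windows of the first sequence
one_window = features[0, 0]      # the 8 features of its first window
one_feature = features[:, :, 2]  # the third feature across every window

window_min, overlap_min = 5, 4
stride_min = window_min - overlap_min  # consecutive windows start 1 minute apart

print(one_sequence.shape, one_window.shape, one_feature.shape, stride_min)
# (60, 8) (8,) (3, 60) 1
```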

## ECG Features
* feature and label vector construction
* creation of classifier

In [9]:
len(handcrafted_features['ECG_features'])

2070

In [10]:
handcrafted_features['ECG_features'][0].shape  # is this one 60-minute sequence?

(60, 8)

In [11]:
# features extracted from one window of the sequence (covering 5 minutes)
handcrafted_features['ECG_features'][0][0] 

array([0.14565595, 0.15295387, 0.02935256, 0.01325846, 0.4879581 ,
       0.27220871, 0.14978604, 0.05602099])

In [12]:
ecg = handcrafted_features['ECG_features']
nfeatures = len(ecg[0][0])
n = len(ecg) * len(ecg[0])
# allocate the outputs, initialised to zero
handcrafted_features_vector = np.zeros((n, nfeatures))  # X
labels_vectors = np.zeros(n)  # y

count = 0
for i in range(len(ecg)):
    label = dataset_train['labels'][i]
    for j in range(len(ecg[i])):
        # skip windows that contain NaN
        if not np.isnan(ecg[i][j]).any():
            handcrafted_features_vector[count, :] = ecg[i][j]
            labels_vectors[count] = label
            count += 1
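
The same flattening can be written without explicit loops. The sketch below is an alternative formulation, not the notebook's own code: it reshapes the 3-D matrix to one row per window, repeats each sequence label once per window, and drops the rows that contain NaN. `flatten_windows` and the toy arrays are illustrative names and data; it assumes the labels form a 1-D array with one entry per sequence.

```python
import numpy as np

def flatten_windows(features, labels):
    """Flatten (sequence, window, feature) to (window, feature), dropping NaN windows."""
    n_seq, n_win, n_feat = features.shape
    X = features.reshape(n_seq * n_win, n_feat)  # row i * n_win + j is window j of sequence i
    y = np.repeat(labels, n_win)                 # each window inherits its sequence label
    keep = ~np.isnan(X).any(axis=1)              # True where the window has no NaN
    return X[keep], y[keep]

# Synthetic example: 2 sequences, 3 windows each, 2 features per window.
toy_features = np.arange(12, dtype=float).reshape(2, 3, 2)
toy_features[0, 1, 0] = np.nan                   # one incomplete window
toy_labels = np.array([0, 1])

X, y = flatten_windows(toy_features, toy_labels)
print(X.shape, y)
# (5, 2) [0 0 1 1 1]
```

Because the NaN rows are removed by a mask, no zero-filled tail is left over and no separate `count` variable is needed.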

In [13]:
data = np.array([[1, 2, 3], [0, 0, 0], [4, 5, 6], [0, 0, 0], [7, 8, 9], [0, 0, 0]])
data

array([[1, 2, 3],
       [0, 0, 0],
       [4, 5, 6],
       [0, 0, 0],
       [7, 8, 9],
       [0, 0, 0]])

In [14]:
data1= np.delete(data,range(4,6),0)
data1

array([[1, 2, 3],
       [0, 0, 0],
       [4, 5, 6],
       [0, 0, 0]])

In [15]:
data[~np.all(data ==0, axis=1)]

array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])

In [16]:
handcrafted_features_vector[count:n,:]

array([[0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0.],
       ...,
       [0., 0., 0., 0., 0., 0., 0., 0.]])

In [17]:
handcrafted_features_vector_new1= np.delete(handcrafted_features_vector, range(count,n), 0)

In [18]:
handcrafted_features_vector_new1.shape

(124129, 8)

In [19]:
handcrafted_features_vector_new = handcrafted_features_vector[~np.all(handcrafted_features_vector ==0, axis=1)]

In [20]:
handcrafted_features_vector_new.shape

(109404, 8)

In [21]:
print(count)
print(n)
n-count

124129
124200


71
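
The counts above show that the two filters are not equivalent: trimming the unused tail removes only `n - count = 71` rows, whereas the all-zero mask leaves 109404 rows, so it also discards filled rows whose features are all zero. Whether those rows are valid measurements is not established here. The sketch below reproduces the difference on toy data (all values are made up) and shows that, whichever filter is used, the labels have to be filtered the same way.

```python
import numpy as np

# Toy buffer: 5 preallocated rows, of which the first 4 were filled (count = 4).
# Row 1 is a filled row whose features happen to be all zero.
buffer = np.array([[1., 2.], [0., 0.], [3., 4.], [5., 6.], [0., 0.]])
labels = np.array([1., 0., 1., 0., 0.])
count = 4

# Option A: trim the unused tail, keeping every filled row.
X_trim, y_trim = buffer[:count], labels[:count]

# Option B: drop every all-zero row, filled or not, and mask the labels identically.
nonzero = ~np.all(buffer == 0, axis=1)
X_mask, y_mask = buffer[nonzero], labels[nonzero]

print(X_trim.shape, X_mask.shape)
# (4, 2) (3, 2)
```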

In [22]:
handcrafted_features_vector

array([[0.14565595, 0.15295387, 0.02935256, ..., 0.27220871, 0.14978604,
        0.05602099],
       [0.16164244, 0.03791351, 0.00815169, ..., 0.27300616, 0.15005658,
        0.06164434],
       [0.10225159, 0.00794716, 0.00300362, ..., 0.22226743, 0.10549292,
        0.10110349],
       ...,
       [0.        , 0.        , 0.        , ..., 0.        , 0.        ,
        0.        ],
       [0.        , 0.        , 0.        , ..., 0.        , 0.        ,
        0.        ],
       [0.        , 0.        , 0.        , ..., 0.        , 0.        ,
        0.        ]])

In [24]:
np.sum(np.isnan(handcrafted_features_vector[0]))

0

In [25]:
len(labels_vectors)

124200

### Classifier

In [None]:
from sklearn import svm
clf = svm.SVC()
clf.fit(handcrafted_features_vector_new1, labels_vectors[0:count])

In [None]:
handcrafted_features['GSR_features'].shape

In [None]:
2070/24/45

In [None]:
dataset['test']['hand_crafted_features']['GSR_features'].shape

In [None]:
986/24/45

In [None]:
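testDataset = dataset_test['hand_crafted_features']['ECG_features']
# predict() needs a 2-D array: flatten to one row per window and skip NaN windows, as in training
test_windows = testDataset.reshape(-1, testDataset.shape[-1])
test_windows = test_windows[~np.isnan(test_windows).any(axis=1)]
predicted_dataset = clf.predict(test_windows)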
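
The classifier above is trained on individual windows, while the dataset stores one label per sequence. One possible way to obtain a single prediction per sequence is a majority vote over the windows of that sequence that contain no NaN. This is an assumption for illustration, not something the notebook or the challenge prescribes. In the sketch, `predict_sequences` and `toy_predict` are hypothetical names; `toy_predict` stands in for `clf.predict` so the example runs without a trained model. The vote assumes non-negative integer class labels, and ties go to the smallest label.

```python
import numpy as np

def predict_sequences(features, predict_windows, default_label=0):
    """Majority vote of per-window predictions for each sequence."""
    n_seq, n_win, n_feat = features.shape
    flat = features.reshape(-1, n_feat)
    seq_index = np.repeat(np.arange(n_seq), n_win)  # sequence id of every window
    valid = ~np.isnan(flat).any(axis=1)             # skip windows with NaN

    window_pred = np.asarray(predict_windows(flat[valid])).astype(int)
    valid_seq = seq_index[valid]
    result = np.full(n_seq, default_label)
    for s in range(n_seq):
        votes = window_pred[valid_seq == s]
        if votes.size:                              # keep the default if no window is valid
            result[s] = np.bincount(votes).argmax()
    return result

def toy_predict(X):
    """Hypothetical stand-in for clf.predict: thresholds the first feature."""
    return (X[:, 0] > 0.5).astype(int)

toy = np.array([
    [[0.9, 0.0], [0.8, 0.0], [0.1, 0.0]],      # votes 1, 1, 0 -> 1
    [[0.2, 0.0], [np.nan, 0.0], [0.3, 0.0]],   # votes 0, 0 (NaN window skipped) -> 0
])
print(predict_sequences(toy, toy_predict))
# [1 0]
```

With a fitted model, `clf.predict` could be passed in place of `toy_predict`.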

In [None]:
print(predicted_dataset)
