Link to Data: https://archive.ics.uci.edu/ml/datasets/WESAD+%28Wearable+Stress+and+Affect+Detection%29

## Getting Data Program
This program retrieves the data from the pickle file and creates the rows (windows) needed to analyze it with machine learning. This program only requires EDA and BVP data.

In [3]:
import numpy as np
from scipy.signal import butter, lfilter, freqz, filtfilt
import matplotlib.pyplot as plt
import pandas as pd
import pickle
import json
import time
#import heartrate

### Butterworth Low-Pass Filter
Here we apply a Butterworth low-pass filter, which attenuates the high-frequency components of the signal so that models run better on the data and key values such as heart rate and breathing rate can be extracted.

In [4]:
def butter_lowpass(cutoff, fs, order=5):
    nyq = 0.5 * fs
    normal_cutoff = cutoff / nyq
    b, a = butter(order, normal_cutoff, btype='low', analog=False)
    return b, a

In [5]:
def butter_lowpass_filter(data, cutoff, fs, order=5):
    b, a = butter_lowpass(cutoff, fs, order=order)
    y = filtfilt(b, a, data,axis=0)
    return y
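
To see what the filter does, here is a minimal sketch on a synthetic signal. The 1 Hz and 50 Hz sine components and the helper name `lowpass` are assumptions chosen for illustration; the cutoff (5 Hz), sampling rate (700 Hz) and order (6) are the values this notebook passes to the filter later on.

```python
import numpy as np
from scipy.signal import butter, filtfilt

def lowpass(data, cutoff, fs, order=5):
    # Same design as butter_lowpass_filter: normalise the cutoff by the
    # Nyquist frequency, then apply the filter forwards and backwards
    b, a = butter(order, cutoff / (0.5 * fs), btype='low', analog=False)
    return filtfilt(b, a, data, axis=0)

fs = 700                                    # samples per second
t = np.arange(0, 10, 1 / fs)                # 10 seconds of samples
slow = np.sin(2 * np.pi * 1 * t)            # 1 Hz component, below the cutoff
noise = 0.5 * np.sin(2 * np.pi * 50 * t)    # 50 Hz component, above the cutoff

filtered = lowpass(slow + noise, cutoff=5, fs=fs, order=6)

# Compare against the slow component, ignoring one second at each edge
core = slice(fs, -fs)
err = np.max(np.abs(filtered[core] - slow[core]))
print(err)
```

Because `filtfilt` runs the filter in both directions, the output is not shifted in time relative to the input, which is why the filtered signal can be compared sample by sample with the slow component.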

### Get Info
This function is called in shifingWindow. It returns a one-row dataframe describing one window of time. It was used for the EDA data.

In [6]:
def get_info(val,start,end):
    mean = val.mean()
    sd = np.std(val)
    maxy = val.max()
    miny = val.min()
    rangey = maxy - miny
    slope = (val[-1] - val[0]) / (end-start)
    return pd.DataFrame.from_dict({"mean":[mean],"std":[sd],'range':[rangey],'max':[maxy],'min':[miny],'slope':[slope],"start":[start],"end":[end]})
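
As a small worked example of these features, the sketch below computes them for a four-sample window. The function name `window_features` and the sample values are made up for illustration; the formulas mirror get_info.

```python
import numpy as np
import pandas as pd

def window_features(val, start, end):
    # Mirrors get_info: summary statistics for one window of samples
    return pd.DataFrame({
        "mean": [val.mean()],
        "std": [np.std(val)],                # population standard deviation (ddof=0)
        "range": [val.max() - val.min()],
        "max": [val.max()],
        "min": [val.min()],
        "slope": [(val[-1] - val[0]) / (end - start)],
        "start": [start],
        "end": [end],
    })

window = np.array([1.0, 2.0, 4.0, 3.0])
row = window_features(window, start=0, end=4)
print(row.iloc[0].to_dict())
```

Here the mean is 2.5, the range is 3.0 and the slope is (3.0 - 1.0) / 4 = 0.5. Note that the slope is the difference between the last and first sample divided by the window length, not a least-squares fit.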

### Shifting Window
shifingWindow receives a dataframe of raw EDA data (df), the window length in rows as an int (shiftWin), and the number of rows to shift by as an int (shiftStep).

Returns a dataframe where each row is a window of time with columns mean, std, range, max, min, and slope for that window. It also has a column named label, the target column, which states whether the person was in a stressful situation or not.

In [8]:
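def shifingWindow(df,shiftWin, shiftStep):
    columns = ['mean','std','range','max','min','slope','label']
    rows = []
    for i in range(0,len(df),shiftStep):
        maxRange = i + shiftWin
        # one-row dataframe of features for rows i .. maxRange-1
        rows.append(get_info(df[i:maxRange]['y'].values,i,maxRange))
    if not rows:
        return pd.DataFrame(columns=columns)
    # DataFrame.append was removed in pandas 2.0, so collect the rows and concatenate once
    returnDf = pd.concat(rows,ignore_index=True)
    # every row of df carries the same label, so take it from the first row
    returnDf['label'] = df['lables'].iloc[0]
    return returnDf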
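
To make the windowing concrete, the sketch below lists the start and end rows that this loop produces. The helper name `window_bounds` and the 1000-row input length are assumptions for illustration; 420 and 175 are the window and step sizes used later in main, and 700 Hz is the sampling rate passed to the low-pass filter in this notebook.

```python
def window_bounds(n_rows, shift_win, shift_step):
    # Same loop bounds as shifingWindow: a window starts every shift_step
    # rows and nominally ends shift_win rows later
    return [(i, i + shift_win) for i in range(0, n_rows, shift_step)]

fs = 700  # Hz
bounds = window_bounds(n_rows=1000, shift_win=420, shift_step=175)
print(bounds)
# [(0, 420), (175, 595), (350, 770), (525, 945), (700, 1120), (875, 1295)]

# Window length and step expressed in seconds
print(420 / fs, 175 / fs)  # 0.6 0.25
```

Consecutive windows overlap by 245 rows. The last two windows in this example end past row 1000, so the slice they receive is shorter than 420 rows while the recorded `end` value (and the slope denominator) still uses the nominal window length.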


### Pickles
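A quick function that opens a pickle file and returns its contents. For a subject file this is a dictionary, which the functions below index with keys such as 'signal' and 'label'.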

In [9]:
def openPickle(thePickle):
    with open(thePickle, 'rb') as f:
        u = pickle._Unpickler(f)
        u.encoding = 'latin1'
        return u.load()
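
For comparison, the public `pickle.load` function accepts the same `encoding` argument, which is typically needed when a pickle was written under Python 2. The sketch below is a minimal round trip; the helper name `load_pickle`, the file name `S0.pkl` and the tiny dictionary are stand-ins for illustration, not real subject data.

```python
import os
import pickle
import tempfile

def load_pickle(path):
    # Public-API equivalent of openPickle: decode Python 2 byte strings as latin1
    with open(path, 'rb') as f:
        return pickle.load(f, encoding='latin1')

# Stand-in with the same nesting that getOne expects from a subject file
fake = {'signal': {'chest': {'EDA': [0.1, 0.2]}}, 'label': [2, 3]}

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'S0.pkl')
    with open(path, 'wb') as f:
        pickle.dump(fake, f)
    loaded = load_pickle(path)

print(sorted(loaded.keys()))  # ['label', 'signal']
```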

### Getting One Set of Raw Data
This function returns one set of data from the pickle file. For example, p would be the unpickled data of one subject, and one would be the string 'EDA', which selects the EDA data for that subject. It only returns the stress and amusement data.

In [10]:
def getOne(p,one):
    oneDict = p['signal']['chest'][one]
    labels = p['label']
    onedf = pd.DataFrame(data=oneDict).reset_index()
    onedf['lables'] = pd.Series(labels, index=onedf.index)
    y = butter_lowpass_filter(oneDict, 5, 700, 6)
    onedf['y'] = pd.DataFrame(y)
    #onedf.rename(index=str, columns={"index": "index", 0: "y",'lables':'lables'},inplace=True)
    stressDf = onedf[(onedf['lables'] == 2)]
    amuseDf = onedf[(onedf['lables'] == 3)]
    return (stressDf,amuseDf)
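
The label filtering can be illustrated on a toy dictionary with the same nesting. The six sample values and labels below are invented, and the sketch leaves out the low-pass filter step so that it runs on such a short signal; labels 2 and 3 are the stress and amusement codes used by getOne.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for one subject: 6 samples of a single-column signal
p = {
    'signal': {'chest': {'EDA': np.array([[0.1], [0.2], [0.3], [0.4], [0.5], [0.6]])}},
    'label': np.array([1, 2, 2, 3, 3, 0]),
}

df = pd.DataFrame(p['signal']['chest']['EDA']).reset_index()
df['lables'] = pd.Series(p['label'], index=df.index)  # column name spelled as in getOne

stress = df[df['lables'] == 2]   # rows recorded during the stress condition
amuse = df[df['lables'] == 3]    # rows recorded during the amusement condition

print(stress['index'].tolist(), amuse['index'].tolist())  # [1, 2] [3, 4]
```

Rows with any other label are dropped from both returned dataframes.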

### Graph Data
Another quick function that graphs part of the data, where df is the pre-processed data and i is the subject number used in the plot title and the output folder.

In [11]:
def graphData(df,i):
    #plt.plot(t, df, 'b-', label='data')

    for test in list(df.columns):
        if "Unnamed" in test or "start" in test or "end" in test or "label" in test or "subject" in test:
            df = df.drop(test,axis=1)
    columns = list(df.columns)    
    for by in columns:
        fig = plt.figure()
        fig.add_axes()        
        temp = df.reset_index()
        num = 101 + ((i%10)*10) + (i//10)*100
        plt.plot(range(len(temp[temp['stress'] == 1])), temp[by][temp['stress'] == 1], 'g-', linewidth=2,label='stress')
        plt.plot(range(len(temp[temp['amuse']==1])), temp[by][temp['amuse'] == 1], 'r-', linewidth=2,label='amuse')
        plt.title('Subject '+str(i)+" "+by)
        plt.xlabel('row')
        plt.ylabel(by)
        plt.savefig('S'+str(i)+'\\'+by+'.png')
        #plt.show()
    plt.xlabel('Time [sec]')
    #plt.xlim([1010 ,1000])


### Saving Data
This function creates new .csv files of the processed data with windows. It creates a separate file for the amuse data and for the stress data.

In [13]:
def writeFile(name,stress,amuse,subject,final = False):
    if final:
        end = 'final'
    else:
        end = 'raw'
    with open(''+subject+'\\'+subject+''+name+'stress'+end+'.csv','w') as f:
        f.write(stress.to_csv())
    with open(''+subject+'\\'+subject+''+name+'amuse'+end+'.csv','w') as f:
        f.write(amuse.to_csv())
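
The paths above are joined with a backslash, which only acts as a folder separator on Windows. As a hedged alternative, the sketch below builds the same file names with `os.path.join`; the helper name `csv_path` is hypothetical and not part of the original program.

```python
import os

def csv_path(subject, name, condition, final=False):
    # Builds <subject>/<subject><name><condition><raw|final>.csv,
    # following the naming scheme used in writeFile
    end = 'final' if final else 'raw'
    return os.path.join(subject, subject + name + condition + end + '.csv')

print(csv_path('S2', 'EDA', 'stress'))
print(csv_path('S2', 'EDA', 'amuse', final=True))
```

A dataframe could then be written with `stress.to_csv(csv_path('S2', 'EDA', 'stress'))`, provided the subject folder already exists.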

### Putting it All Together
Here we gather and organize the data by going through subjects 2 - 17 (skipping 12 because that data wasn't included): opening the pickle, then writing the EDA file and the BVP file for each subject and saving them in that subject's folder. The BVP data needed a separate Python file called heartrate.

In [14]:
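def main():
    # heartrate is the separate helper file mentioned above; its import is commented out in the first cell
    import heartrate

    for i in [2,3,4,5,6,7,8,9,10,11,13,14,15,16,17]:

        # '\\S' is an escaped backslash; a bare '\S' is an invalid escape sequence
        p = openPickle('S'+str(i)+'\\S'+str(i)+'.pkl')

        stressEDA , amuseEDA = getOne(p,'EDA')
        writeFile('EDA',stressEDA,amuseEDA,'S'+str(i))
        #writeBaseFile('EDA',baseEDA,'S'+str(i))  # writeBaseFile and baseEDA are not defined in this notebook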
        stressInfo = shifingWindow(stressEDA,420,175)
        amuseInfo = shifingWindow(amuseEDA,420,175)
        writeFile('EDA',stressInfo,amuseInfo,'S'+str(i),True)

        stressECG, amuseECG = getOne(p,'ECG')
        writeFile('BVP',stressECG,amuseECG,'S'+str(i))
        #writeBaseFile('BVP',baseECG,'S'+str(i))
        #df = pd.read_csv('S'+str(i)+'\S'+str(i)+'combined.csv')
        #graphData(df,i)
        rristress = heartrate.findrrv(stressECG)
        rriamuse = heartrate.findrrv(amuseECG)
        writeFile('BVP',rristress,rriamuse,'S'+str(i),True)
        print('done with '+str(i))
                  
#main()