# Introduction

## Data Source

* Download your heart rate data from https://stila.pms.ifi.lmu.de
* Pick a day from the overview calendar; make sure you select a day and not an event
* Click the "Download HR data" button to download the heart rate data

## Packages
Use pip3 to install the following packages:
```shell
pip3 install numpy
pip3 install scipy
pip3 install matplotlib
```
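
To confirm that the installation worked, a quick check like the following can help. It is only a sketch: the helper name `check_packages` is made up for this illustration, and the check only tests whether each package can be found, not which version is installed.

```python
import importlib.util

# Packages the notebook relies on (import names)
REQUIRED_PACKAGES = ["numpy", "scipy", "matplotlib"]

def check_packages(names):
    """Return a dict mapping each package name to True if it can be found."""
    return {name: importlib.util.find_spec(name) is not None for name in names}

package_status = check_packages(REQUIRED_PACKAGES)
for name, available in package_status.items():
    print(name, "OK" if available else "MISSING")
```

Any package reported as MISSING needs to be installed with pip3 before the cells below will run.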

## Location of the heart rate raw data

Save the raw heart rate data in the **"dataSource"** directory inside the notebook's working directory, so that the cells below can find it.

Please run the following cell to see your heart rate data

## Assignment 1 (Input Data)
* Download your heart rate data from https://stila.pms.ifi.lmu.de
* Put the heartrate_2016-XX-XX.csv file into the dataSource folder
* Change the `filename` variable in the cell below to the name of your downloaded CSV file

In [27]:
# select the matplotlib backend: nbagg gives interactive plots, inline gives static plots
#%matplotlib auto
#%matplotlib notebook
%matplotlib nbagg
#%matplotlib inline

In [28]:
import matplotlib.pyplot as plt
import numpy as np
import os
import csv
import datetime as dt




### Assignment 1
### Please only replace the filename with the name of your csv file
#filename = 'heartrate_2016-10-05.csv'
filename = 'heartrate_2016-10-27.csv'

### Don't change the following code in this cell.

# Get the current working directory
currentDir=os.getcwd()

heartRateFile = os.path.abspath(os.path.join(currentDir,'dataSource',filename))
if os.path.exists(heartRateFile):
    print('Observing HeartRateFile Path is:\n%s' %(heartRateFile))
else:
    raise Exception(
      'File %s \n does not exist!' %(heartRateFile)
    )

Observing HeartRateFile Path is:
C:\Users\Chrisi\Desktop\Studium\Seminar\dataSource\heartrate_2016-10-27.csv


In [4]:
def parseCSV(file, coding='utf-8'):
    # a possible encoding can be 'iso-8859-15'
    with open(file, 'r', encoding=coding) as csvfile:
        reader = csv.reader(csvfile)
        content = list(reader)
        header = content.pop(0) # remove the header row
    return header, content

hrHeader, hrContent = parseCSV(heartRateFile, 'utf-8') # the downloaded file is read as UTF-8

print(hrHeader)
print('The Format of single HR Data: ', hrContent[0])
print("Loaded No. of Heart Rate Raw Data is: %d" %(len(hrContent)))

# The timestamps in the data set are UTC timestamps in milliseconds;
# fromtimestamp() converts them to local time
hrRaw = [int(heartrate) for timestamp, heartrate in hrContent]
timestampRaw = [dt.datetime.fromtimestamp(int(timestamp)/1000) for timestamp, heartrate in hrContent]

#print(len(hrRaw))
#print(len(timestampRaw))
print('HR Data begins at: ',timestampRaw[0])
print('HR Data stops at: ',timestampRaw[-1])
print(type(timestampRaw[0]))

['\ufeff"timestamp"', 'heartrate']
The Format of single HR Data:  ['1477519200000', '69']
Loaded No. of Heart Rate Raw Data is: 7927
HR Data begins at:  2016-10-27 00:00:00
HR Data stops at:  2016-10-28 00:00:00
<class 'datetime.datetime'>
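
The first header field printed above starts with `\ufeff`, a byte order mark (BOM) at the beginning of the file. A minimal sketch, assuming the BOM is unwanted: Python's `utf-8-sig` codec strips it while reading. The temporary sample file and the helper name `read_header` are made up for this illustration and are not part of the notebook.

```python
import csv
import os
import tempfile

def read_header(path, coding):
    """Return the first CSV row of the file, decoded with the given codec."""
    with open(path, 'r', encoding=coding, newline='') as csvfile:
        return next(csv.reader(csvfile))

# Write a tiny sample file that starts with a BOM, like the downloaded data
sample = '\ufeff"timestamp",heartrate\n1477519200000,69\n'
with tempfile.NamedTemporaryFile('w', suffix='.csv', encoding='utf-8',
                                 newline='', delete=False) as tmp:
    tmp.write(sample)
    sample_path = tmp.name

header_plain = read_header(sample_path, 'utf-8')     # BOM stays in the first field
header_sig = read_header(sample_path, 'utf-8-sig')   # BOM is stripped
os.remove(sample_path)

print(header_plain)
print(header_sig)
```

Since the notebook only uses the data rows and discards the header, the BOM does not affect the analysis; it only matters if the header names are used later.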


## Examining Heart Rate Raw Data
The following section uses plots to examine the raw heart rate data.

In [29]:
## plots are shown interactively; for inline plots, switch the %matplotlib backend in the first cell


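def ploting(xvals, yvals, xlabel_str, ylabel_str, title, style):
    """
    Creates a plot of yvals over xvals with the given axis labels,
    title, and matplotlib format string (e.g. 'bo' or 'b-').
    """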
    # close all old plots
    plt.close("all")
    # plotting
    plt.plot(xvals, yvals, style)
    plt.xlabel(xlabel_str)
    plt.ylabel(ylabel_str)    
    plt.title(title)
    # show the grid line/ help line in plot
    plt.grid(True)
    #plt.legend(loc='upper right')
    #plt.axis([0, 210, 0, 0.04 ])
    #plt.figure(figsize=[9,6])
    plt.show()
   
## Main activity and time

xlabel_str = "time line"
ylabel_str = "heart rate (BPM)"
title = "heart rate raw data"

# shows a point plot
ploting(timestampRaw,hrRaw, xlabel_str, ylabel_str, title, 'bo')



In [30]:
# shows a line plot
ploting(timestampRaw,hrRaw, xlabel_str, ylabel_str, title, 'b-')


In [31]:
plt.close('all')

## Segmenting the heart rate raw data
* The raw heart rate data is split into 10-minute segments
* For each segment, the heart rate feature is calculated
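
The segment a sample belongs to follows from integer division of its offset from the start of the day. The following worked example is a small sketch of that arithmetic; the helper name `segment_start` and the chosen offsets are for illustration only, and the day-start value is the one from this notebook's sample data.

```python
SEGMENT_SECONDS = 10 * 60  # length of one segment in seconds

def segment_start(timestamp, day_start):
    """Return the start timestamp of the 10-minute segment containing timestamp."""
    segment_id = (timestamp - day_start) // SEGMENT_SECONDS
    return day_start + segment_id * SEGMENT_SECONDS

day_start = 1477519200  # day-start timestamp of the sample data, in seconds
for offset in (0, 599, 600, 3725):
    start = segment_start(day_start + offset, day_start)
    print(offset, "->", (start - day_start) // 60, "minutes after day start")
```

Samples at offsets 0 and 599 seconds land in the same segment, while the sample at 600 seconds opens the next one.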

In [32]:
def getBeginTimestamp(datetime):
    """
    Returns the timestamp of the start of the day of the given datetime.
    If the given datetime is 2016-10-05 01:29:20,
    this function returns the timestamp of 2016-10-05 00:00:00.
    """
    # extract the date string from the given datetime object
    str_current_date = dt.datetime.strftime(datetime, '%Y-%m-%d')
    # print("Current Observed Date: ",str_current_date)
    # convert the extracted date string back to a datetime object at midnight
    currentDate = dt.datetime.strptime(str_current_date, '%Y-%m-%d')
    # get the timestamp (in seconds) of midnight, local time
    int_ts_currentDate = int(dt.datetime.timestamp(currentDate))
    return int_ts_currentDate

#current date timestamp
curDateTS = getBeginTimestamp(timestampRaw[0]) 
print("timestamp of current Date: " , curDateTS)

# list of timestamps in seconds, saved as int
intTimestampRaw = [int(timestamp) // 1000 for timestamp, heartRate in hrContent]

def segmentingDate(intTimestampRaw, hrRaw, curDateTS):
    """
    Groups the raw heart rate data into 10-minute segments.
    Returns a dictionary that maps each segment's start timestamp
    to the list of heart rate values in that segment.
    """
    segments = {}
    segment_length = 10 * 60  # segment length in seconds
    for timestamp, heartrate in zip(intTimestampRaw, hrRaw):
        segment_id = (timestamp - curDateTS) // segment_length
        segment_label = segment_id * segment_length + curDateTS
        segments.setdefault(segment_label, []).append(int(heartrate))
    return segments
# segments is a dictionary with the segment start timestamp as key
# and the list of all heart rate values of that segment as value
segments = segmentingDate(intTimestampRaw, hrRaw, curDateTS)
# sorted list of all segment labels
segments_labels = sorted(int(label) for label in segments)
    
print("The number of segments is: ", len(segments))




timestamp of current Date:  1477519200
The number of segments is:  130
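
The day-start timestamp printed above, 1477519200, is midnight in the local time zone of the machine that ran the notebook, because `fromtimestamp` without a `tz` argument converts to local time. The following sketch shows the difference to UTC. The fixed UTC+2 offset is an assumption chosen to match the output shown in this notebook; the data file itself does not state a time zone.

```python
import datetime as dt

day_start = 1477519200  # seconds since the epoch, taken from the output above

# Interpret the timestamp explicitly in UTC
as_utc = dt.datetime.fromtimestamp(day_start, tz=dt.timezone.utc)

# Interpret it with a fixed UTC+2 offset (assumed, see the note above)
assumed_local = dt.timezone(dt.timedelta(hours=2))
as_local = dt.datetime.fromtimestamp(day_start, tz=assumed_local)

print(as_utc.isoformat())
print(as_local.isoformat())
```

Running the notebook on a machine in a different time zone therefore shifts the day boundaries and the segment labels accordingly.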


## Examining a sample segment
The following cell plots a sample segment.

In [33]:
# Display segments

def displaySegment(idx):
    segment_label = segments_labels[idx]
    print("plotting segment with label: ", dt.datetime.fromtimestamp(segment_label))
    segment = segments[segment_label]
    # print(segment)
    ploting(range(len(segment)),segment, xlabel_str, ylabel_str, "segment "+ str(segment_label), 'b-')

displaySegment(70)

plotting segment with label:  2016-10-27 14:10:00



In [34]:
## One segment too many!? The last segment starts at midnight of the next day
dt.datetime.fromtimestamp(segments_labels[-1])

datetime.datetime(2016, 10, 28, 0, 0)
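
The last label is midnight of the following day, which matches the raw data ending at 2016-10-28 00:00:00: a sample recorded exactly at the day boundary opens a segment of its own. One possible way to handle this, sketched under the assumption that such samples should simply be dropped, is to keep only the timestamps in the half-open interval from the day start up to, but not including, the day start plus 24 hours. The helper name `withinDay` is hypothetical.

```python
DAY_SECONDS = 24 * 60 * 60

def withinDay(timestamps, values, day_start):
    """Keep only the (timestamp, value) pairs that fall inside the given day."""
    day_end = day_start + DAY_SECONDS
    return [(ts, value) for ts, value in zip(timestamps, values)
            if day_start <= ts < day_end]

day_start = 1477519200
timestamps = [day_start, day_start + 600, day_start + DAY_SECONDS]
heart_rates = [69, 72, 70]  # made-up values

kept = withinDay(timestamps, heart_rates, day_start)
print(len(kept))  # the sample at the day boundary is dropped
```

A day with a daylight saving time change is not exactly 24 hours long, so this sketch does not cover that case.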

## Example of featureCalculation function
the following featureCalculation function calculates the meanHR feature

* meanHR: the mean of the heart rate within a segment

Note: 

The following meanHR plot shall be very similar to the stila portal's heart rate plot, since the stila portal's plot does the mean aggregation of heart rates automatically while zooming. 

In [54]:

# this function calculates the mean heart rate within a segment
# the heart rate values within the segment are integers
def featureCalculation(segment):
    return sum(segment)/len(segment)

In [55]:
# Plotting the segments
featureName = "meanHR"
title = "meanHR to datetime"
featureValues = [featureCalculation(segments[segment_label]) for segment_label in segments_labels]
labelValues = [dt.datetime.fromtimestamp(segment_label) for segment_label in segments_labels]
ploting(labelValues, featureValues, 'datetime', featureName, title, 'g-')

## This meanHR plot should look very similar
## to the stila portal's heart rate plot


## Assignment 2 (featureCalculation function)
Modify the following featureCalculation function so that it calculates your assigned heart rate feature.


In [37]:
# template function, adapt it to your heart rate feature
# this function calculates meanRR
# RR = 60/HR (RR interval in seconds, HR in beats per minute)
# the meanRR line plot should look like an inverted meanHR line plot

def featureCalculation(segment):
    return sum([60/value for value in segment])/len(segment)
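
Note that the template averages the individual RR values, which is not the same as converting the mean heart rate into a single RR value. A short worked example with made-up heart rate values illustrates the difference:

```python
segment = [60, 120]  # made-up heart rate values in beats per minute

# mean of the individual RR intervals, as in the template function
mean_rr = sum(60 / value for value in segment) / len(segment)

# RR interval that corresponds to the mean heart rate
rr_of_mean_hr = 60 / (sum(segment) / len(segment))

print(mean_rr)        # (1.0 + 0.5) / 2
print(rr_of_mean_hr)  # 60 / 90
```

The two values only coincide when all heart rate values in the segment are equal.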

## Assignment 3 (plotting features)
Modify only:
* the name of your feature
* the title of the plot

In [38]:
# Plotting the segments
featureName = "meanRR"
title = "meanRR to datetime"

In [51]:
## You don't need to change this cell to complete the assignment
featureValues = [featureCalculation(segments[segment_label]) for segment_label in segments_labels]
labelValues = [dt.datetime.fromtimestamp(segment_label) for segment_label in segments_labels]
ploting(labelValues, featureValues, 'datetime', featureName, title, 'y-')


# SDRR
By Christian Lemke

## Recap

### Normal distribution (also called Gaussian distribution, after Carl Friedrich Gauß)



The probability density of the normal distribution is:
<img src="https://wikimedia.org/api/rest_v1/media/math/render/svg/6404a4c69c536278a5933085f0d5f4a9ca9f2b2a">

$\mu$ (mu) is the mean or expectation of the distribution (and also its median and mode).

$\sigma$ (sigma) is the standard deviation.

$\sigma ^{2}$ is the variance.
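
Written out with these symbols, and consistent with the `normal_pdf` function in the next cell, the density is:

$$
f(x \mid \mu, \sigma^{2}) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left(-\frac{(x-\mu)^{2}}{2\sigma^{2}}\right)
$$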

In [40]:
# in Python (example from book)
import math
def normal_pdf(x, mu=0, sigma=1):
    """Probability density of the normal distribution at x."""
    sqrt_two_pi = math.sqrt(2 * math.pi)
    return (math.exp(-(x-mu) ** 2 / 2 / sigma ** 2) / (sqrt_two_pi * sigma))

In [41]:
# plot
xs = [x/10.0 for x in range(-50,50)]
plt.plot(xs, [normal_pdf(x) for x in xs], '-', label='mu=0,sigma=1')
plt.plot(xs, [normal_pdf(x,sigma=2) for x in xs], '-', label='mu=0,sigma=2')
plt.plot(xs, [normal_pdf(x,sigma=0.5) for x in xs], '-', label='mu=0,sigma=0.5')
plt.plot(xs, [normal_pdf(x,mu=-1) for x in xs], '-', label='mu=-1,sigma=1')
plt.legend()
plt.title('Various Normal pdfs')
plt.show()
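
As a sanity check on the density function, the area under the curve should be 1, and roughly 68% of it should lie within one standard deviation of the mean. The following sketch verifies this numerically with a simple midpoint rule; the helper name `integrate_pdf`, the step size, and the integration bounds are choices made for this illustration.

```python
import math

def normal_pdf(x, mu=0, sigma=1):
    """Probability density of the normal distribution at x."""
    return math.exp(-(x - mu) ** 2 / (2 * sigma ** 2)) / (math.sqrt(2 * math.pi) * sigma)

def integrate_pdf(mu, sigma, lower, upper, step=0.001):
    """Approximate the integral of normal_pdf over [lower, upper] with a midpoint rule."""
    count = int(round((upper - lower) / step))
    return sum(normal_pdf(lower + (i + 0.5) * step, mu, sigma) for i in range(count)) * step

total = integrate_pdf(0, 1, -8, 8)            # whole curve, up to negligible tails
within_one_sigma = integrate_pdf(0, 1, -1, 1)

print(round(total, 4))
print(round(within_one_sigma, 4))
```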

### Recap: standard deviation

#### Wiki: standard deviation
In statistics, the standard deviation (SD, also represented by the Greek letter sigma σ or the Latin letter s) is a measure that is used to quantify the amount of variation or dispersion of a set of data values.[1] A low standard deviation indicates that the data points tend to be close to the mean (also called the expected value) of the set, while a high standard deviation indicates that the data points are spread out over a wider range of values.

#### Simple Example

In [18]:
# import numpy as np (first cell)

# https://docs.scipy.org/doc/numpy/reference/generated/numpy.std.html

In [50]:
# SD example:

bspValues = [0,0,0,0,0,0,0,0,-1,-0.5,1,0.5]

print('mean: ', sum(bspValues)/len(bspValues))
print('std:', np.std(bspValues))
ploting(bspValues, bspValues, 'value', 'value', 'SD example values', 'b-')


mean:  0.0
std: 0.456435464588
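
The value above can be reproduced by hand. `np.std` computes the population standard deviation by default (`ddof=0`), which divides the sum of squared deviations by n. The following sketch uses only the standard library, so `statistics.pstdev` stands in for `np.std` here:

```python
import math
import statistics

values = [0, 0, 0, 0, 0, 0, 0, 0, -1, -0.5, 1, 0.5]

mean = sum(values) / len(values)
# population variance: mean of the squared deviations from the mean
variance = sum((value - mean) ** 2 for value in values) / len(values)
std_by_hand = math.sqrt(variance)

print(std_by_hand)
print(statistics.pstdev(values))  # population SD, same definition as np.std's default
print(statistics.stdev(values))   # sample SD, divides by n - 1 instead of n
```

The sample standard deviation is slightly larger because it divides by n - 1.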



### Own Data

In [20]:
# Plotting the segments
featureName = "SDRR"
title = "SDRR to datetime"

In [21]:
def featureCalculation_RR(segment):
    """Converts heart rate values (beats per minute) to RR intervals (seconds)."""
    return [60/hrValue for hrValue in segment]

def featureCalculation_SDRR(segment):
    """
    Takes a segment of hrValues,
    transforms them to rrValues,
    and calculates their standard deviation.
    """
    return np.std(featureCalculation_RR(segment))
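
`60/hrValue` raises a `ZeroDivisionError` if a segment contains a heart rate of 0. Whether the downloaded data can contain such values is not stated here, so the following is only a defensive sketch. The name `sdrr_safe` and the choice to skip non-positive values are assumptions; it uses `statistics.pstdev`, which computes the population standard deviation like `np.std` does by default.

```python
import statistics

def sdrr_safe(segment):
    """
    Return the population standard deviation of the RR intervals (in seconds),
    ignoring non-positive heart rate values. Returns None if nothing is left.
    """
    rr_values = [60 / hr for hr in segment if hr > 0]
    if not rr_values:
        return None
    return statistics.pstdev(rr_values)

print(sdrr_safe([60, 60, 60]))  # constant heart rate, no variation
print(sdrr_safe([60, 0, 120]))  # the 0 is skipped
print(sdrr_safe([0, 0]))        # no usable values
```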

In [22]:
# compute

featureValues = [featureCalculation_SDRR(segments[segment_label]) for segment_label in segments_labels]
labelValues = [dt.datetime.fromtimestamp(segment_label) for segment_label in segments_labels]

In [23]:
print( len(featureValues), len(labelValues))

130 130


In [24]:
# plot all
ploting(labelValues, featureValues, 'datetime', featureName, title, 'b-')


In [25]:
type(labelValues[68])

datetime.datetime

In [26]:
# time span: exam + lecture 
timeSpan1Start = 66
timeSpan1End = 94
print ('from', labelValues[timeSpan1Start] , 'to',  labelValues[timeSpan1End])

from 2016-10-27 13:30:00 to 2016-10-27 18:10:00
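
The indices 66 and 94 only fit this particular data set, because missing segments shift the positions. A sketch of an alternative, assuming the labels are sorted in ascending order, is to look up the indices from the desired start and end times with `bisect`. The helper name `timeSpanIndices` and the evenly spaced labels are made up for this illustration.

```python
import bisect
import datetime as dt

def timeSpanIndices(labels, start, end):
    """
    Return (first, last) slice indices such that labels[first:last]
    contains exactly the labels with start <= label < end.
    labels must be sorted in ascending order.
    """
    first = bisect.bisect_left(labels, start)
    last = bisect.bisect_left(labels, end)
    return first, last

# Made-up labels: one every 10 minutes, starting at midnight
day = dt.datetime(2016, 10, 27)
labels = [day + dt.timedelta(minutes=10 * i) for i in range(144)]

first, last = timeSpanIndices(labels,
                              dt.datetime(2016, 10, 27, 13, 30),
                              dt.datetime(2016, 10, 27, 18, 10))
print(first, last, labels[first], labels[last - 1])
```

With the notebook's data, `labelValues` would take the place of `labels`, and the resulting indices could be used for slicing both `labelValues` and `featureValues`.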


In [1]:
# plot exam time span
# note: this cell was run in a fresh kernel, hence the NameError below; run the cells above first

ploting(labelValues[timeSpan1Start:timeSpan1End], featureValues[timeSpan1Start:timeSpan1End], 'datetime', featureName, title, 'b-')

NameError: name 'ploting' is not defined