# Introduction

## Data Source

* Download your heart rate data from https://stila.pms.ifi.lmu.de
* Pick a day from the overview calendar; make sure you pick a day and not an event
* Click the "Download HR data" button to download the heart rate data

## Packages
Use pip3 to install the following packages:
```shell
pip3 install numpy
pip3 install scipy
pip3 install matplotlib
```

## Location of the heart rate raw data

For further analysis, please save the raw heart rate data in the **"dataSource"** directory.
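
As a quick check that the download landed in the right place, the CSV files in that directory can be listed. This is only a minimal sketch: the helper name `listHeartRateFiles` is hypothetical, and it assumes the notebook is started from the directory that contains `dataSource`.

```python
import os

def listHeartRateFiles(directory):
    """Return the sorted names of all CSV files in the given directory."""
    if not os.path.isdir(directory):
        return []  # directory missing: nothing to list
    return sorted(name for name in os.listdir(directory)
                  if name.lower().endswith('.csv'))

# assumption: 'dataSource' sits next to this notebook
print(listHeartRateFiles(os.path.join(os.getcwd(), 'dataSource')))
```

An empty list means that either the directory or the downloaded file is missing.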

Please run the following cell to locate your heart rate data file.

In [85]:
import matplotlib.pyplot as plt
import numpy as np
import os
import csv
import datetime as dt

# render matplotlib plots inline in the notebook
%matplotlib inline

filename = 'heartrate_2016-10-05.csv'
# Get the current working directory
currentDir=os.getcwd()

heartRateFile = os.path.abspath(os.path.join(currentDir,'dataSource',filename))
if os.path.exists(heartRateFile):
    print('Observing HeartRateFile Path is:\n%s' %(heartRateFile))
else:
    raise FileNotFoundError(
      'File %s \n does not exist!' %(heartRateFile)
    )

Observing HeartRateFile Path is:
/Users/yingdingwang/VCS/github/pda16ws17/dataSource/heartrate_2016-10-05.csv


In [87]:
def parseCSV(file, coding='utf-8'):
    # a possible encoding can be 'iso-8859-15'
    with open(file, 'r', encoding=coding) as csvfile:
        reader = csv.reader(csvfile)
        content = list(reader)
        header = content.pop(0) # remove first header
    return header, content

hrHeader, hrContent = parseCSV(heartRateFile, 'utf-8') # the downloaded file is UTF-8 encoded

print(hrHeader)
print('The Format of single HR Data: ', hrContent[0])
print("Loaded No. of Heart Rate Raw Data is: %d" %(len(hrContent)))

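# The timestamps in the data set are UTC timestamps in milliseconds;
# fromtimestamp() converts them to datetimes in local time
# heart rates are converted to int so that matplotlib plots them as numbers
hrRaw = [int(heartrate) for timestamp, heartrate in hrContent]
timestampRaw = [dt.datetime.fromtimestamp(int(timestamp)/1000) for timestamp, heartrate in hrContent]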

#print(len(hrRaw))
#print(len(timestampRaw))
print('HR Data begins at: ',timestampRaw[0])
print('HR Data stops at: ',timestampRaw[-1])
print(type(timestampRaw[0]))

['\ufeff"timestamp"', 'heartrate']
The Format of single HR Data:  ['1475623760000', '70']
Loaded No. of Heart Rate Raw Data is: 4285
HR Data begins at:  2016-10-05 01:29:20
HR Data stops at:  2016-10-05 23:16:20
<class 'datetime.datetime'>
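
The first header field in the output above starts with `\ufeff`, a byte order mark (BOM) at the beginning of the file, and it still carries its quotes because the mark precedes them. A minimal sketch of one way to avoid this, assuming the file is UTF-8 with a BOM: Python's `utf-8-sig` codec strips the mark while decoding. The sample bytes below are made up to mirror the format shown above, and `readRows` is a hypothetical helper.

```python
import csv
import io

# made-up sample that mirrors the file format: BOM, quoted header, one data row
sampleBytes = '\ufeff"timestamp",heartrate\n1475623760000,70\n'.encode('utf-8')

def readRows(rawBytes, coding):
    """Decode the bytes with the given codec and parse them as CSV."""
    text = rawBytes.decode(coding)
    return list(csv.reader(io.StringIO(text, newline='')))

print(readRows(sampleBytes, 'utf-8')[0])      # header still carries the BOM
print(readRows(sampleBytes, 'utf-8-sig')[0])  # BOM removed, quotes parsed
```

With `parseCSV`, this would correspond to passing `'utf-8-sig'` as the `coding` argument.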


## Examining Heart Rate Raw Data
The following section uses plots to examine the raw heart rate data.

In [88]:
## show plots interactively; comment out this line if you prefer inline plots.
%matplotlib auto

def ploting(xvals, yvals, xlabel_str, ylabel_str, title, style):
    """
    creates a plot
    """
    # close all old plots
    plt.close("all")
    # plotting
    plt.plot(xvals, yvals, style)
    plt.xlabel(xlabel_str)
    plt.ylabel(ylabel_str)    
    plt.title(title)
    # show grid lines in the plot
    plt.grid(True)
    #plt.legend(loc='upper right')
    #plt.axis([0, 210, 0, 0.04 ])
    #plt.figure(figsize=[9,6])
    plt.show()
   
## Main activity and time

xlabel_str = "time line"
ylabel_str = "heart rate (BPM)"
title = "heart rate raw data"

# shows a point plot
ploting(timestampRaw,hrRaw, xlabel_str, ylabel_str, title, 'bo')


Using matplotlib backend: MacOSX


In [89]:
# shows a line plot
ploting(timestampRaw,hrRaw, xlabel_str, ylabel_str, title, 'b-')

In [90]:
plt.close('all')

## Segmenting the heart rate raw data
* The raw heart rate data is split into 10-minute segments.
* For each segment, the heart rate feature is calculated.
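
The segment a sample belongs to follows from integer division of its offset from the start of the day by the segment length. A small worked sketch of that arithmetic: the function name `segmentLabel` is hypothetical, and the two timestamps are taken from the outputs recorded in this notebook.

```python
SEGMENT_LENGTH = 10 * 60  # 10 minutes in seconds

def segmentLabel(timestamp, dayStart, length=SEGMENT_LENGTH):
    """Return the start timestamp of the segment that contains timestamp."""
    segmentId = (timestamp - dayStart) // length
    return dayStart + segmentId * length

dayStart = 1475618400   # 2016-10-05 00:00:00 local time in the sample data
sample = 1475623760     # first sample, 01:29:20 local time
print((sample - dayStart) // SEGMENT_LENGTH)   # segment index: 8
print(segmentLabel(sample, dayStart))          # 1475623200
```

All samples between 01:20:00 and 01:29:59 share the label 1475623200, so they end up in the same segment.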

In [91]:
def getBeginTimestamp(datetime):
    """
    Returns the timestamp of the start of the day of the given datetime.
    If the given datetime is 2016-10-05 01:29:20,
    this method returns the timestamp of 2016-10-05 00:00:00.
    """
    # extract the date string from the given datetime object
    str_current_date = dt.datetime.strftime(datetime, '%Y-%m-%d')
    # print("Current Observed Date: ",str_current_date)
    # convert the extracted date string to a datetime object (midnight, local time)
    currentDate = dt.datetime.strptime(str_current_date, '%Y-%m-%d')
    # get the Unix timestamp of that local midnight
    int_ts_currentDate = int(dt.datetime.timestamp(currentDate))
    return int_ts_currentDate

#current date timestamp
curDateTS = getBeginTimestamp(timestampRaw[0]) 
print("timestamp of current Date: " , curDateTS)

# list of timestamps as integers, converted from milliseconds to seconds
intTimestampRaw = [int(int(timestamp)/1000) for timestamp, heartRate in hrContent]

def segmentingDate(intTimestampRaw, hrRaw, curDateTS): 
    """
    Groups the raw heart rate data into 10-minute segments.
    Returns a dictionary that maps the start timestamp of each segment
    to the list of heart rate values recorded in that segment.
    """
    segments = {}
    deliminator = 10 * 60  # segment length in seconds
    for timestamp, heartrate in zip(intTimestampRaw, hrRaw):
        # index of the 10-minute segment, counted from the start of the day
        segment_id = (timestamp - curDateTS) // deliminator
        # the label is the start timestamp of that segment
        segment_label = segment_id * deliminator + curDateTS
        segments.setdefault(segment_label,[]).append(int(heartrate))
    return segments
# segments is a dictionary that maps each segment's start timestamp (key)
# to the list of all heart rate values of that segment (value)
segments = segmentingDate(intTimestampRaw, hrRaw, curDateTS)
# sorted list of the segment labels
segments_labels = sorted(int(label) for label in segments)
    
print("The number of segments is: ", len(segments))




timestamp of current Date:  1475618400
The number of segments is:  71
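
The value 1475618400 is midnight in the local time zone of the machine that ran the notebook, because `strptime` returns a naive datetime that `timestamp()` interprets as local time. In UTC the same instant is 2016-10-04 22:00:00, so the recorded run used an offset of two hours. A hedged sketch of the same computation with an explicit time zone, so that the result does not depend on the machine's settings; the fixed `+02:00` offset is an assumption for this sample day, and `dayStartTimestamp` is a hypothetical helper.

```python
import datetime as dt

def dayStartTimestamp(timestamp, tz):
    """Return the Unix timestamp of 00:00:00 on the day of timestamp in tz."""
    moment = dt.datetime.fromtimestamp(timestamp, tz=tz)
    midnight = moment.replace(hour=0, minute=0, second=0, microsecond=0)
    return int(midnight.timestamp())

# assumption: the sample was recorded in a zone two hours ahead of UTC
tzSample = dt.timezone(dt.timedelta(hours=2))
print(dayStartTimestamp(1475623760, tzSample))  # 1475618400
```

Passing `dt.timezone.utc` instead yields the start of the UTC day, which shifts every segment boundary accordingly.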


## Examining a sample segment
The following cell plots a sample segment.

In [92]:
# Display segments
def displaySegment(idx):
    segment_label = segments_labels[idx]
    print("plotting segment with label: ", dt.datetime.fromtimestamp(segment_label))
    segment = segments[segment_label]
    # print(segment)
    ploting(range(len(segment)),segment, xlabel_str, ylabel_str, "segment "+ str(segment_label), 'b-')

displaySegment(3)

plotting segment with label:  2016-10-05 09:40:00


## Example of featureCalculation function
The following featureCalculation function calculates the meanHR feature:

* meanHR: the mean of the heart rate within a segment

Note:

The following meanHR plot should look very similar to the stila portal's heart rate plot, since the portal's plot automatically aggregates the heart rates by their mean while zooming.

In [99]:
# this function calculates the mean heart rate within a segment
# the heart rate values within the segment are integers
def featureCalculation(segment):
    return sum(segment)/len(segment)

In [100]:
# Plotting the feature value of each segment
featureName = "meanHR"
title = "meanHR over time"
featureValues = [featureCalculation(segments[segment_label]) for segment_label in segments_labels]
labelValues = [dt.datetime.fromtimestamp(segment_label) for segment_label in segments_labels]
ploting(labelValues, featureValues, 'datetime', featureName, title, 'g-')

## This meanHR plot should look very similar
## to the stila portal's heart rate plot

## Assignment 1 (featureCalculation function)
Modify the following featureCalculation function so that it calculates your assigned heart rate feature.


In [101]:
# template function, adapt it to your heart rate feature
# this function calculates meanRR, the mean RR interval in seconds
# RR = 60/HR, with HR in beats per minute
def featureCalculation(segment):
    return sum(60/value for value in segment)/len(segment)
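
Averaging the per-sample intervals is not the same as converting the mean heart rate. A short sketch with made-up values illustrates the difference. `meanRR` mirrors the template above; the guard against non-positive values is an added assumption about how invalid samples might be handled, not part of the assignment.

```python
def meanRR(segment):
    """Mean RR interval in seconds, skipping non-positive heart rates."""
    intervals = [60 / value for value in segment if value > 0]
    if not intervals:
        return float('nan')  # no valid sample in this segment
    return sum(intervals) / len(intervals)

segment = [60, 120]                  # made-up heart rates in beats per minute
print(meanRR(segment))               # 0.75
print(60 / (sum(segment) / 2))       # 0.666..., interval of the mean heart rate
```

The two results differ because the reciprocal is not linear, so the order of averaging and converting matters for this feature.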

## Assignment 2 (plotting features)
You only need to modify:
* the name of your feature
* the title of the plot

In [102]:
# Plotting the feature value of each segment
featureName = "meanRR"
title = "meanRR over time"

In [103]:
## You don't need to change this section to complete the assignment
featureValues = [featureCalculation(segments[segment_label]) for segment_label in segments_labels]
labelValues = [dt.datetime.fromtimestamp(segment_label) for segment_label in segments_labels]
ploting(labelValues, featureValues, 'datetime', featureName, title, 'y-')