# Exercises on Dictionaries, Dataframes and Visualisation

* Author: Johannes Maucher
* Last Update: 07.10.2018
* Skills required:
    - Import data from .json
    - Pandas Dataframes
    - Bokeh visualisation


## To be submitted:
This notebook, enhanced with the solutions to the questions. Your solution should contain 
   * the implemented code in code-cells, 
   * the output of this code
   * Your remarks, discussion, comments on the solution in markdown-cells.
   * .ipynb and .html version of the notebook


## Task:
In this task, data recorded by a Polar sports watch during a long-distance run shall be visualized. The data is available in a JSON file. The Python package `json` and its function `load()` can be used to read data from a .json file into a Python dictionary.
1. Load the contents of file `polarV800.json` into a dictionary
2. Determine the keys of this dictionary and display the values of some keys
3. The value of the key `samples` is a dictionary itself. This dictionary has the following key-value pairs:

    1. key "0": value is a comma-separated string of heart-rate values, sampled every second during a training measurement
    2. key "1": value is a comma-separated string of speed values, sampled every second during a training measurement
    3. key "2": value is a comma-separated string of cadence values, sampled every second during a training measurement
    4. key "3": value is a comma-separated string of altitude values, sampled every second during a training measurement
    5. key "10": value is a comma-separated string of distance values, sampled every second during a training measurement
    
  Write these 5 sequences into the columns of a dataframe.

4. Visualize these 5 sequences with Bokeh. The plots shall be linked and shall contain interactive elements. Moreover, the background of the heart-rate time series shall be shaded in a different color for each heart-rate zone. The lower and upper limits of the 5 heart-rate zones can be obtained from the key `heart-rate-zones`.

### 1.

In [51]:
import json

In [52]:
with open("polarV800.json") as f:
    file=json.load(f)

In [53]:
type(file)

dict
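
As a minimal sketch of what `json.load` returns, the snippet below reads a small in-memory JSON document instead of `polarV800.json`. The sample content is a short excerpt of values shown in this notebook, and `io.StringIO` is only an assumed stand-in for the opened file.

```python
import io
import json

# Stand-in for the opened file; a short excerpt, not the full recording.
raw = io.StringIO(
    '{"device": "Polar V800", "calories": 1620, "samples": {"0": "71,71,73"}}'
)

# JSON objects become Python dicts, JSON strings become str, numbers become int/float.
data = json.load(raw)

print(type(data))             # <class 'dict'>
print(sorted(data.keys()))    # ['calories', 'device', 'samples']
print(type(data["samples"]))  # <class 'dict'>
```

Nested JSON objects such as `samples` arrive as nested dictionaries, so they can be indexed with a second pair of brackets.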

### 2.

In [54]:
print(file.keys())

dict_keys(['activity-zones', 'calories', 'detailed-sport-info', 'device', 'distance', 'duration', 'has-route', 'heart-rate', 'heart-rate-zones', 'id', 'member-id', 'organisationApplicationUserId', 'polar-user', 'recording-rate', 'samples', 'samples-with-missing-segments', 'sport', 'start-time', 'transaction-id', 'upload-time'])


In [55]:
print(file.items())

dict_items([('activity-zones', []), ('calories', 1620), ('detailed-sport-info', 'ROAD_RUNNING'), ('device', 'Polar V800'), ('distance', 22375.0), ('duration', 'PT1H40M51.937S'), ('has-route', True), ('heart-rate', {'average': 152, 'maximum': 161}), ('heart-rate-zones', [{'index': 0, 'inzone': 'PT7S', 'lower': 93, 'upper': 112}, {'index': 1, 'inzone': 'PT18S', 'lower': 112, 'upper': 130}, {'index': 2, 'inzone': 'PT13M38S', 'lower': 130, 'upper': 149}, {'index': 3, 'inzone': 'PT1H26M37S', 'lower': 149, 'upper': 167}, {'index': 4, 'inzone': 'PT0S', 'lower': 167, 'upper': 186}]), ('id', 18491923), ('member-id', '13'), ('organisationApplicationUserId', 1575025), ('polar-user', '/users/3909976'), ('recording-rate', 1), ('samples', {'0': '71,71,73,76,78,82,84,87,90,92,95,98,101,103,106,108,110,112,114,116,118,119,119,120,121,122,123,124,124,125,126,127,128,128,129,130,131,131,132,132,133,133,133,134,134,134,135,135,135,135,136,136,136,137,137,137,137,137,138,138,139,139,139,140,140,141,141,14

In [56]:
file["calories"]

1620

In [57]:
file["heart-rate-zones"]

[{'index': 0, 'inzone': 'PT7S', 'lower': 93, 'upper': 112},
 {'index': 1, 'inzone': 'PT18S', 'lower': 112, 'upper': 130},
 {'index': 2, 'inzone': 'PT13M38S', 'lower': 130, 'upper': 149},
 {'index': 3, 'inzone': 'PT1H26M37S', 'lower': 149, 'upper': 167},
 {'index': 4, 'inzone': 'PT0S', 'lower': 167, 'upper': 186}]
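
The `inzone` values above look like ISO 8601 durations. The following sketch converts them to seconds so that the time spent in each zone can be compared numerically. The helper `duration_to_seconds` is hypothetical and only handles the hour/minute/second form that appears in this file.

```python
import re

# Matches only durations of the form seen in this file, e.g. 'PT1H26M37S' or 'PT7S'.
_DURATION = re.compile(r"^PT(?:(\d+)H)?(?:(\d+)M)?(?:(\d+(?:\.\d+)?)S)?$")


def duration_to_seconds(text):
    """Convert a duration such as 'PT13M38S' to seconds (hypothetical helper)."""
    match = _DURATION.match(text)
    if match is None:
        raise ValueError(f"unsupported duration: {text!r}")
    # Missing components (e.g. no hours) default to zero.
    hours, minutes, seconds = match.groups(default="0")
    return int(hours) * 3600 + int(minutes) * 60 + float(seconds)


# Zone data copied from the output above.
zones = [
    {"index": 0, "inzone": "PT7S", "lower": 93, "upper": 112},
    {"index": 1, "inzone": "PT18S", "lower": 112, "upper": 130},
    {"index": 2, "inzone": "PT13M38S", "lower": 130, "upper": 149},
    {"index": 3, "inzone": "PT1H26M37S", "lower": 149, "upper": 167},
    {"index": 4, "inzone": "PT0S", "lower": 167, "upper": 186},
]

seconds_in_zone = {z["index"]: duration_to_seconds(z["inzone"]) for z in zones}
print(seconds_in_zone)  # {0: 7.0, 1: 18.0, 2: 818.0, 3: 5197.0, 4: 0.0}
```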

In [58]:
file["samples"]

{'0': '71,71,73,76,78,82,84,87,90,92,95,98,101,103,106,108,110,112,114,116,118,119,119,120,121,122,123,124,124,125,126,127,128,128,129,130,131,131,132,132,133,133,133,134,134,134,135,135,135,135,136,136,136,137,137,137,137,137,138,138,139,139,139,140,140,141,141,142,142,142,142,142,142,142,142,142,143,143,143,143,143,143,143,143,143,143,143,143,143,143,143,143,143,143,143,143,143,143,142,142,142,142,142,142,142,143,143,143,143,142,143,142,142,142,142,142,142,142,142,142,142,141,142,142,142,142,142,142,142,141,141,141,141,140,140,140,139,139,139,139,139,139,139,139,139,139,139,139,139,140,140,140,140,140,139,139,138,138,138,138,138,138,137,137,137,137,137,137,137,136,136,136,136,136,135,135,135,135,135,135,135,134,134,133,133,133,132,132,132,132,132,133,133,133,134,134,133,134,134,134,134,134,134,134,134,134,134,134,134,134,134,133,133,134,134,133,133,133,133,133,133,133,133,133,133,133,133,133,133,133,133,132,132,133,133,133,133,133,133,133,133,133,133,133,133,133,133,133,133,134,134,1

In [59]:
file["samples"].keys()

dict_keys(['0', '1', '10', '2', '3', '9'])
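
The outputs above show that each value in `file["samples"]` is a single comma-separated string and that the keys are strings, not integers. A minimal sketch of the conversion to numbers, using a short excerpt of the values from this notebook (`parse_series` is a hypothetical helper name):

```python
# Short excerpt of file["samples"], copied from the outputs in this notebook.
samples = {"0": "71,71,73,76,78", "1": "2.3,3.6,5.1,6.9,8.8"}


def parse_series(text):
    """Split a comma-separated sample string into a list of floats."""
    return [float(value) for value in text.split(",")]


heartrate = parse_series(samples["0"])
speed = parse_series(samples["1"])

print(heartrate)                     # [71.0, 71.0, 73.0, 76.0, 78.0]
print(len(heartrate) == len(speed))  # True
```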

### 3.

In [60]:
import pandas as pd
print(pd.__version__)

0.20.3


In [61]:
import numpy as np

DF=pd.DataFrame()

# Each value in file["samples"] is one comma-separated string;
# split it and store it as a float column named after its key.
for key, text in file["samples"].items():
    DF[key]=np.array(text.split(","), dtype=np.float32)
display(DF.head(20))


|   | 0 | 1 | 10 | 2 | 3 | 9 |
|---|---|---|---|---|---|---|
| 0 | 71.0 | 2.3 | 1.3 | 0.0 | 509.631989 | 30.299999 |
| 1 | 71.0 | 3.6 | 3.2 | 0.0 | 509.631989 | 30.299999 |
| 2 | 73.0 | 5.1 | 8.0 | 0.0 | 508.71701 | 30.200001 |
| 3 | 76.0 | 6.9 | 10.4 | 0.0 | 508.71701 | 30.299999 |
| 4 | 78.0 | 8.8 | 13.4 | 0.0 | 508.71701 | 30.299999 |
| 5 | 82.0 | 10.3 | 16.799999 | 37.0 | 508.71701 | 30.299999 |
| 6 | 84.0 | 11.6 | 20.799999 | 37.0 | 508.71701 | 30.4 |
| 7 | 87.0 | 12.4 | 25.299999 | 68.0 | 508.71701 | 30.299999 |
| 8 | 90.0 | 12.9 | 30.4 | 73.0 | 507.651001 | 30.299999 |
| 9 | 92.0 | 13.4 | 35.700001 | 76.0 | 507.651001 | 30.299999 |
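
The column names in the table above are the raw sample keys. As a sketch of an alternative, the dataframe can be built in one step with descriptive column names taken from the task description. The dictionary `names` is an assumption introduced here, and key "9" is left out because the task does not describe it.

```python
import pandas as pd

# Key-to-name mapping from the task description; key "9" is not described there.
names = {"0": "heartrate", "1": "speed", "2": "cadence", "3": "altitude", "10": "distance"}

# Excerpt of file["samples"] (first three seconds), values copied from the table above.
samples = {
    "0": "71,71,73",
    "1": "2.3,3.6,5.1",
    "2": "0,0,0",
    "3": "509.631989,509.631989,508.71701",
    "10": "1.3,3.2,8.0",
}

# Parse every comma-separated string into floats and use the readable name as column label.
df = pd.DataFrame(
    {name: [float(v) for v in samples[key].split(",")] for key, name in names.items()}
)

print(df.columns.tolist())  # ['heartrate', 'speed', 'cadence', 'altitude', 'distance']
print(df.shape)             # (3, 5)
```

Python floats are stored as 64-bit values here, which avoids display artifacts such as `16.799999` that come from converting to `np.float32`.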


In [88]:
print(DF["10"].max())

22374.7


In [85]:
print(DF.min())

index      0.000000
0         71.000000
1          2.300000
10         1.300000
2          0.000000
3        380.700012
9         19.500000
dtype: float64


### 4. 

In [62]:
from bokeh.plotting import figure 
from bokeh.io import output_notebook, show
from bokeh.models import ColumnDataSource,HoverTool, BoxAnnotation, Range1d
from bokeh.layouts import gridplot
output_notebook()

**The template from the lecture "08BokehTCXvisualisation" was used as the basis for this visualisation.**

In [63]:
DF.reset_index(inplace = True)


**Extracting the heart-rate zones**

In [238]:
# Lower and upper heart-rate limit of every zone, in zone order
zones=file["heart-rate-zones"]
frqLo=[zone["lower"] for zone in zones]
frqHi=[zone["upper"] for zone in zones]
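
To check which zone a given heart-rate value falls into, the limits can be compared directly. This is a minimal sketch with the zone limits copied from the output of `file["heart-rate-zones"]`. The helper `zone_index` is hypothetical, and treating the lower limit as inclusive and the upper limit as exclusive is an assumption, since adjacent zones share their boundary value.

```python
# Zone limits copied from the output of file["heart-rate-zones"].
zones = [
    {"index": 0, "lower": 93, "upper": 112},
    {"index": 1, "lower": 112, "upper": 130},
    {"index": 2, "lower": 130, "upper": 149},
    {"index": 3, "lower": 149, "upper": 167},
    {"index": 4, "lower": 167, "upper": 186},
]


def zone_index(heart_rate, zones):
    """Return the index of the zone containing heart_rate, or None if it is outside all zones."""
    for zone in zones:
        # Assumption: lower limit inclusive, upper limit exclusive.
        if zone["lower"] <= heart_rate < zone["upper"]:
            return zone["index"]
    return None


print([zone_index(hr, zones) for hr in (71, 112, 152, 161)])  # [None, 1, 3, 3]
```

The average (152) and maximum (161) heart rate of this run both lie in zone 3, which matches the long `inzone` duration reported for that zone.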

In [239]:
# One shared data source for all plots
source = ColumnDataSource(DF)

# Background colors for the five heart-rate zones, from lowest to highest
HRC=["grey","blue","green","yellow","red"]

options = dict(plot_width=800, plot_height=300,
               tools="pan,wheel_zoom,box_zoom,box_select,lasso_select,reset")

# Heart-rate plot with one shaded box per heart-rate zone
p1 = figure(title="heartrate over time",y_range=Range1d(90,190), **options)
r1=p1.line("index","0", color="blue", source=source)
for i,col in enumerate(HRC):
    p1.add_layout(BoxAnnotation(bottom=frqLo[i],top=frqHi[i], fill_alpha=0.3, fill_color=col))

p2 = figure(title="speed over time",x_range=p1.x_range,y_range=Range1d(DF["1"].min(),DF["1"].max()), **options)
r2=p2.line("index","1", color="green", source=source)

p3 = figure(title="cadence over time",x_range=p1.x_range,y_range=Range1d(DF["2"].min(),DF["2"].max()), **options)
r3=p3.line("index","2", color="red", source=source)

p4 = figure(title="altitude over time",x_range=p1.x_range,y_range=Range1d(DF["3"].min(),DF["3"].max()), **options)
r4=p4.line("index","3", color="red", source=source)

p5 = figure(title="distance over time",x_range=p1.x_range,y_range=Range1d(DF["10"].min(),DF["10"].max()), **options)
r5=p5.line("index","10", color="red", source=source)


pall = gridplot([[p1],[p2],[p3],[p4],[p5]], toolbar_location="right")
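
The linking in the cell above comes from passing `p1.x_range` to the other figures, so panning or zooming one plot moves all of them. The reduced sketch below shows the same idea with two plots and additionally attaches a `HoverTool`, which is imported in this notebook but not used. The toy values and the column names `heartrate` and `speed` are made up for illustration.

```python
from bokeh.layouts import gridplot
from bokeh.models import BoxAnnotation, ColumnDataSource, HoverTool
from bokeh.plotting import figure

# Made-up toy values, only to keep the example self-contained.
source = ColumnDataSource(data=dict(
    index=[0, 1, 2, 3, 4],
    heartrate=[95, 110, 128, 150, 155],
    speed=[2.3, 6.9, 10.3, 12.4, 13.4],
))

tools = "pan,wheel_zoom,box_zoom,reset"

p1 = figure(title="heartrate over time", tools=tools)
p1.line("index", "heartrate", source=source)
# Shade one heart-rate zone (limits of zone 0 from this notebook).
p1.add_layout(BoxAnnotation(bottom=93, top=112, fill_alpha=0.3, fill_color="grey"))
# Show the sample values under the mouse pointer.
p1.add_tools(HoverTool(tooltips=[("second", "@index"), ("heartrate", "@heartrate")]))

# Reusing the x_range object of p1 links both plots along the x-axis.
p2 = figure(title="speed over time", x_range=p1.x_range, tools=tools)
p2.line("index", "speed", source=source)

grid = gridplot([[p1], [p2]])
```

Passing `grid` to `show()` after `output_notebook()` renders the linked plots in the notebook.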

In [240]:
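# NOTE: carried over unchanged from the lecture template. It relies on names that are
# not defined in this notebook (path, parsetcx, createDataframe, columnIndex, rmap, SS,
# push_notebook, handle2) and is never called here.
def updateGraphs(filename):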
    istream = open(path+filename,'r')
    xml = istream.read()
    points = parsetcx(xml)
    trainingDF=createDataframe(points,columnIndex)
    #print(trainingDF.head())
    source = ColumnDataSource(trainingDF)
    r1.data_source.data=source.data
    r2.data_source.data=source.data
    r3.data_source.data=source.data
    r4.data_source.data=source.data
    r5.data_source.data=source.data
    
    rmap.data_source.data=ColumnDataSource(
        data = dict(lat=trainingDF["lat"].values[::SS].tolist(),
                    lon=trainingDF["long"].values[::SS].tolist())).data
    push_notebook(handle=handle1)
    push_notebook(handle=handle2)

In [241]:
handle1=show(pall,notebook_handle=True)

