## Load Libraries

In [1]:
import pandas as pd
import numpy as np
from pandas import Timestamp
import os
from datetime import datetime, timedelta

## Download raw data

* Log in to the **Garmin Connect** web app with your own credentials.
* The web app can export day-by-day data in 7-day intervals. In our case, 6 files were downloaded: 1 for the week before the trip, 4 covering the trip itself, and 1 for the week after the trip.
* [This](https://connect.garmin.com/modern/report/60/wellness/last_seven_days) is the download link for the **resting heart rate** data and [this](https://connect.garmin.com/modern/report/26/wellness/last_seven_days) for the **sleep** data.

## Load data files for both sleep and RHR


In [2]:
# Read the directories with the data and save the file names in two lists
path_to_sleep = 'python_data/Garmin_Sleep/'
path_to_rhr = 'python_data/Garmin_HeartRate/'

# os.listdir returns names in arbitrary order, so sort them to keep the weeks in sequence
csv_files_sleep = sorted(single_csv for single_csv in os.listdir(path_to_sleep) if single_csv.endswith('.csv'))
csv_files_rhr = sorted(single_csv for single_csv in os.listdir(path_to_rhr) if single_csv.endswith('.csv'))

In [3]:
# Check that the file names were collected correctly
print(csv_files_sleep)
print(csv_files_rhr)

['1_SLEEP_DURATION_3006_0607.csv', '2_SLEEP_DURATION_0707_1307.csv', '3_SLEEP_DURATION_1407_2007.csv', '4_SLEEP_DURATION_2107_2707.csv', '5_SLEEP_DURATION_2807_0308.csv', '6_SLEEP_DURATION_0408_1008.csv']
['1_RESTING_HEART_RATE_3006_0607.csv', '2_RESTING_HEART_RATE_0707_1307.csv', '3_RESTING_HEART_RATE_1407_2007.csv', '4_RESTING_HEART_RATE_2107_2707.csv', '5_RESTING_HEART_RATE_2807_0308.csv', '6_RESTING_HEART_RATE_0408_1008.csv']
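
The order of these lists decides the row order after concatenation, and `os.listdir` makes no promise about ordering. A plain alphabetical sort would also place a hypothetical `10_...` file before `2_...`. Below is a minimal sketch of sorting by the numeric prefix instead. The helper name `week_index` is an assumption made for illustration and is not part of the original notebook.

```python
def week_index(file_name):
    """Return the leading integer of names such as '3_SLEEP_DURATION_1407_2007.csv'."""
    return int(file_name.split('_', 1)[0])

# Deliberately shuffled copy of three of the file names printed above
unordered = ['3_SLEEP_DURATION_1407_2007.csv',
             '1_SLEEP_DURATION_3006_0607.csv',
             '2_SLEEP_DURATION_0707_1307.csv']

# Sort by the week number encoded at the start of each file name
ordered = sorted(unordered, key=week_index)
print(ordered)
```

The same `key` function could be passed to `sorted` when building `csv_files_sleep` and `csv_files_rhr`.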


# Build the sleep dataframe

In [9]:
# Sleep df: read every weekly file and stack them into one dataframe
df_sleep = pd.concat([pd.read_csv(path_to_sleep + file_name) for file_name in csv_files_sleep])

In [10]:
df_sleep

Unnamed: 0,Sleep Time,Hrs,Hrs.1
0,Sat,6.0,6:02 hrs
1,Sun,5.3,5:15 hrs
2,Mon,6.8,6:45 hrs
3,Tue,9.8,9:47 hrs
4,Thu,10.6,10:35 hrs
0,Sat,5.2,5:13 hrs
1,Sun,7.6,7:38 hrs
2,Mon,3.7,3:39 hrs
3,Tue,5.9,5:52 hrs
4,Thu,9.1,9:07 hrs


As we can see, the Garmin sleep data is missing some days (for example, Wednesday and Friday are absent from the rows above), and at first glance the recorded sleeping hours do not look very reliable. We should probably use the data from the **MiFit** I was also wearing during the trip instead. The problem is that **Xiaomi** doesn't provide a web app for downloading data, so the only option is to manually create the data files in a format similar to the one provided by **Garmin** and then parse them the same way.

To avoid keeping useless duplicate files, I overwrote the original files with the manually entered data, so the same file names are loaded again below, this time with different contents.

In [4]:
# Sleep df (Manually Modified Version)
df_sleep = pd.concat([pd.read_csv(path_to_sleep + file_name) for file_name in csv_files_sleep])

In [5]:
df_sleep

Unnamed: 0,day,sleep_duration,deep,light,awake
0,Jun 30,8:51 hrs,3:35 hrs,5:16 hrs,0:00 hrs
1,Jul 1,8:43 hrs,4:08 hrs,4:35 hrs,0:00 hrs
2,Jul 2,5:52 hrs,2:40 hrs,3:12 hrs,0:00 hrs
3,Jul 3,5:52 hrs,2:09 hrs,3:43 hrs,0:00 hrs
4,Jul 4,10:25 hrs,4:11 hrs,6:14 hrs,0:00 hrs
5,Jul 5,1:01 hrs,0:00 hrs,1:01 hrs,0:00 hrs
6,Jul 6,10:31 hrs,3:46 hrs,6:45 hrs,0:00 hrs
0,Jul 7,6:47 hrs,1:29 hrs,5:18 hrs,0:00 hrs
1,Jul 8,8:20 hrs,2:21 hrs,5:59 hrs,0:00 hrs
2,Jul 9,7:07 hrs,2:12 hrs,4:55 hrs,0:00 hrs


To create a more sophisticated graph based on Eric Boam's [seven months of sleep](http://www.ericboam.com/Seven-Months-of-Sleep-1), it was necessary to manually create a CSV dataset from the **MiFit** app. Here is what it looks like.

In [7]:
# Source manually created data to a dataframe
df_sleep_2 = pd.read_csv(path_to_sleep+'SLEEP_TIME.csv')
df_sleep_2

Unnamed: 0,day_no,asleep,awake,sleep_duration
0,1,2017-07-05T00:54:00.000,2017-07-05T01:55:00.000,1:01 hrs
1,2,2017-07-06T01:06:00.000,2017-07-06T11:37:00.000,10:31 hrs
2,3,2017-07-07T02:39:00.000,2017-07-07T09:26:00.000,6:47 hrs
3,4,2017-07-08T02:01:00.000,2017-07-08T10:21:00.000,8:20 hrs
4,5,2017-07-09T02:21:00.000,2017-07-09T09:28:00.000,7:07 hrs
5,6,2017-07-10T02:04:00.000,2017-07-10T10:42:00.000,8:38 hrs
6,7,2017-07-11T03:10:00.000,2017-07-11T08:21:00.000,5:11 hrs
7,8,2017-07-11T23:11:00.000,2017-07-12T09:00:00.000,9:49 hrs
8,9,2017-07-13T02:14:00.000,2017-07-13T09:13:00.000,6:59 hrs
9,10,2017-07-14T00:50:00.000,2017-07-14T08:58:00.000,8:08 hrs
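
Because this file was typed in by hand, it may be worth checking that each `sleep_duration` agrees with the difference between `awake` and `asleep`. The following is a minimal sketch of such a check on two rows copied from the table above. The variable names (`elapsed_min`, `stated_min`, `mismatches`) are illustrative and not part of the original notebook.

```python
import pandas as pd

# Two rows copied from the table above
df = pd.DataFrame({
    'asleep': ['2017-07-05T00:54:00.000', '2017-07-11T23:11:00.000'],
    'awake': ['2017-07-05T01:55:00.000', '2017-07-12T09:00:00.000'],
    'sleep_duration': ['1:01 hrs', '9:49 hrs'],
})

# Minutes elapsed between falling asleep and waking up
elapsed = pd.to_datetime(df['awake']) - pd.to_datetime(df['asleep'])
elapsed_min = (elapsed.dt.total_seconds() // 60).astype(int)

# Minutes stated in the hand-typed 'h:mm hrs' column
parts = df['sleep_duration'].str.extract(r'(\d+):(\d+)').astype(int)
stated_min = parts[0] * 60 + parts[1]

# Rows where the two disagree would point to a typing error
mismatches = df[elapsed_min != stated_min]
print(len(mismatches))
```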


## Useful functions

In [8]:
# Modify the date to look like the rest, e.g. 'Jun 30' -> '2017-06-30'
MONTH_NUMBERS = {'Jan': '01', 'Feb': '02', 'Mar': '03', 'Apr': '04',
                 'May': '05', 'Jun': '06', 'Jul': '07', 'Aug': '08',
                 'Sep': '09', 'Oct': '10', 'Nov': '11', 'Dec': '12'}

def dayTransformer(s):
    month = s.split(' ')[0]
    day = s.split(' ')[1]
    year = '2017'  # the export carries no year, so it is hard-coded

    # Unknown month names are passed through unchanged
    month = MONTH_NUMBERS.get(month, month)

    # Zero-pad single-digit days
    day = day.zfill(2)

    return year+'-'+month+'-'+day

In [9]:
# Remove 'hrs' from sleep_duration
def removeItem(s):
    hrs = s.split(' ')[0]
    hr = hrs.split(':')[0]
    mm = hrs.split(':')[1]
    if len(hr)<2:
        hr = '0'+hr
    return hr+':'+mm

In [10]:
# Transform hh:mm to minutes
def hoursToMins(s):
    hr = int(s.split(':')[0])
    mm = int(s.split(':')[1])
    
    return str(hr*60 + mm)
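
As a cross-check on the helpers above, the same conversions can be sketched with pandas' own parsing. This is an alternative under stated assumptions and not the method the notebook uses: the year is assumed to be 2017, month abbreviations are assumed to be English (`%b` is locale-dependent), and the sample values are copied from the modified sleep table.

```python
import pandas as pd

days = pd.Series(['Jun 30', 'Jul 1'])
durations = pd.Series(['8:51 hrs', '10:25 hrs'])

# 'Jun 30' -> '2017-06-30' (assumes year 2017 and English month abbreviations)
iso_days = pd.to_datetime(days + ' 2017', format='%b %d %Y').dt.strftime('%Y-%m-%d')

# '8:51 hrs' -> 531, by extracting hours and minutes as integers
parts = durations.str.extract(r'(\d+):(\d+)').astype(int)
minutes = parts[0] * 60 + parts[1]

print(iso_days.tolist())
print(minutes.tolist())
```

Unlike `hoursToMins`, which returns a string, this variant yields integer minutes.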

### Modify sleep data

In [12]:
# Remove the 'hrs' suffix and zero-pad the hours
df_sleep_2['sleep_duration'] = df_sleep_2['sleep_duration'].apply(removeItem)

# Create a column with the total sleep duration in minutes (stored as strings)
df_sleep_2['sleep_min'] = df_sleep_2['sleep_duration'].apply(hoursToMins)

In [13]:
df_sleep_2

Unnamed: 0,day_no,asleep,awake,sleep_duration,sleep_min
0,1,2017-07-05T00:54:00.000,2017-07-05T01:55:00.000,01:01,61
1,2,2017-07-06T01:06:00.000,2017-07-06T11:37:00.000,10:31,631
2,3,2017-07-07T02:39:00.000,2017-07-07T09:26:00.000,06:47,407
3,4,2017-07-08T02:01:00.000,2017-07-08T10:21:00.000,08:20,500
4,5,2017-07-09T02:21:00.000,2017-07-09T09:28:00.000,07:07,427
5,6,2017-07-10T02:04:00.000,2017-07-10T10:42:00.000,08:38,518
6,7,2017-07-11T03:10:00.000,2017-07-11T08:21:00.000,05:11,311
7,8,2017-07-11T23:11:00.000,2017-07-12T09:00:00.000,09:49,589
8,9,2017-07-13T02:14:00.000,2017-07-13T09:13:00.000,06:59,419
9,10,2017-07-14T00:50:00.000,2017-07-14T08:58:00.000,08:08,488


The above dataset is ready to be parsed by the **D3.js** library as is.

In [14]:
# Save it to a csv for D3
df_sleep_2.to_csv('../d3_visualizations/sleep_barchart/sleep.csv', index=False)

# Build the RHR (resting HR) dataframe

In [33]:
# Resting HR df: read every weekly file and stack them into one dataframe
df_rhr = pd.concat([pd.read_csv(path_to_rhr + file_name) for file_name in csv_files_rhr])

In [34]:
df_rhr[:5]

Unnamed: 0,day,bpm
0,Jun 30,72
1,Jul 1,79
2,Jul 2,72
3,Jul 3,66
4,Jul 4,73


In [35]:
# Reset the index and discard the old per-file index
df_rhr = df_rhr.reset_index(drop=True)

### Modify HR data

In [36]:
# Rename the bpm column
df_rhr.rename(columns={'bpm': 'rest_hr'}, inplace=True)

# Transform day to (YYYY-MM-DD) format
df_rhr['day'] = df_rhr['day'].apply(dayTransformer)

In [40]:
# Create day_no column
df_rhr['day_no']= 1 + df_rhr.index

In [41]:
df_rhr[:5]

Unnamed: 0,day,rest_hr,day_no
0,2017-06-30,72,1
1,2017-07-01,79,2
2,2017-07-02,72,3
3,2017-07-03,66,4
4,2017-07-04,73,5
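
Numbering the days from the index assumes that no day is missing from the export. A minimal sketch of an alternative that derives `day_no` from the dates themselves follows. The three-row frame is an illustrative sample in which Jul 2 is left out on purpose.

```python
import pandas as pd

# Illustrative sample: Jul 2 is omitted to simulate a missing day
df = pd.DataFrame({'day': ['2017-06-30', '2017-07-01', '2017-07-03']})

# Day number = whole days elapsed since the first recorded day, plus one
dates = pd.to_datetime(df['day'])
df['day_no'] = (dates - dates.min()).dt.days + 1

print(df['day_no'].tolist())
```

With this approach a gap in the data shows up as a gap in `day_no` instead of shifting every later day.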


In [42]:
# Save it to a csv for D3
df_rhr.to_csv('../d3_visualizations/restingHR_linegraph/restingHR.csv', index=False)

# Discussion

The idea from now on is to:
* Rename each column (except `day`) to `<column>_angelos`
* Merge the HR and sleep dataframes
* Then merge the Andreas and Angelos dataframes into a single CSV with all data included
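
The steps above could be sketched roughly as follows. This is only an outline under assumptions: both dataframes are assumed to share a `day` column in YYYY-MM-DD format, the helper `suffix_columns` is hypothetical, and the numbers for Andreas are made-up placeholders, since his data does not appear in this notebook.

```python
import pandas as pd

def suffix_columns(df, person, key='day'):
    """Append '_<person>' to every column except the join key."""
    return df.rename(columns={c: c + '_' + person for c in df.columns if c != key})

# Values for Angelos are taken from the tables above
rhr = pd.DataFrame({'day': ['2017-06-30', '2017-07-01'], 'rest_hr': [72, 79]})
sleep = pd.DataFrame({'day': ['2017-06-30', '2017-07-01'], 'sleep_min': [531, 523]})

# Merge HR with sleep, then tag the columns with the person's name
angelos = suffix_columns(pd.merge(rhr, sleep, on='day', how='outer'), 'angelos')

# Placeholder values only: Andreas' real data is not part of this notebook
andreas = suffix_columns(
    pd.DataFrame({'day': ['2017-06-30', '2017-07-01'],
                  'rest_hr': [60, 62],
                  'sleep_min': [480, 450]}),
    'andreas')

# One row per day with both people's columns side by side
combined = pd.merge(angelos, andreas, on='day', how='outer')
print(combined.columns.tolist())
```

An outer merge keeps days that only one person recorded. The result could then be written out with `to_csv(..., index=False)` as in the earlier cells.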