# Deconstructing the Fitbit Sleep Score

In this project I use several machine learning models to get a better understanding of the Fitbit Sleep Score. If you own a Fitbit, you have probably wondered how exactly Fitbit arrives at your sleep score. Sometimes you sleep for a shorter time with similar amounts of REM and deep sleep and still get a better score. Other times a night with rather little REM and deep sleep scores higher than a night with more of both. What's the secret behind this?
That's precisely the question I will try to answer throughout this project.

In [None]:
# Import all relevant libraries
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_absolute_error
from pprint import pprint
from xgboost import XGBRegressor
from sklearn.ensemble import RandomForestRegressor

In [None]:
# Read the data
url = 'https://raw.githubusercontent.com/srijp/Fitbit-Sleep-Score/master/Fitbit_Sleep_JB_041219_010720.csv'
sleep_data = pd.read_csv(url)

In [None]:
sleep_data.head()

| | Start Time | End Time | Minutes Asleep | Minutes Awake | Number of Awakenings | Time in Bed | Minutes REM Sleep | Minutes Light Sleep | Minutes Deep Sleep | overall_score |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 30/6/20 21:57 | 1/7/20 5:59 | 402 | 79 | 40 | 481 | 32 | 282 | 88 | 71.0 |
| 1 | 29/6/20 21:35 | 30/6/20 6:02 | 444 | 63 | 36 | 507 | 51 | 332 | 61 | 78.0 |
| 2 | 28/6/20 22:01 | 29/6/20 6:01 | 420 | 60 | 36 | 480 | 37 | 335 | 48 | 78.0 |
| 3 | 27/6/20 22:05 | 28/6/20 9:27 | 567 | 115 | 51 | 682 | 83 | 390 | 94 | 75.0 |
| 4 | 26/6/20 21:40 | 27/6/20 7:35 | 495 | 100 | 35 | 595 | 75 | 335 | 85 | 78.0 |
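
The five rows shown here suggest two accounting identities in the export: Minutes Asleep plus Minutes Awake equals Time in Bed, and the REM, light, and deep minutes add up to Minutes Asleep. Below is a minimal sketch that checks this on just those five rows, retyped by hand into `sample` (a name introduced here for illustration). Whether the identities hold for every night in the file is an assumption that would need to be verified on `sleep_data` itself.

```python
import pandas as pd

# The five displayed rows, retyped by hand (illustrative sample only)
sample = pd.DataFrame({
    'Minutes Asleep':      [402, 444, 420, 567, 495],
    'Minutes Awake':       [79, 63, 60, 115, 100],
    'Time in Bed':         [481, 507, 480, 682, 595],
    'Minutes REM Sleep':   [32, 51, 37, 83, 75],
    'Minutes Light Sleep': [282, 332, 335, 390, 335],
    'Minutes Deep Sleep':  [88, 61, 48, 94, 85],
})

# Identity 1: time asleep plus time awake equals time in bed
bed_ok = (sample['Minutes Asleep'] + sample['Minutes Awake']) == sample['Time in Bed']

# Identity 2: the three sleep stages add up to the minutes asleep
stage_cols = ['Minutes REM Sleep', 'Minutes Light Sleep', 'Minutes Deep Sleep']
stages_ok = sample[stage_cols].sum(axis=1) == sample['Minutes Asleep']

print(bed_ok.all(), stages_ok.all())
```

If the identities hold across the whole file, the six minute columns are linearly dependent, which is worth keeping in mind when reading feature importances later on.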


In [None]:
# Drop rows without a sleep score (here only the last row, which has no score data)
sleep_data.dropna(subset=['overall_score'], inplace=True)

For now I will use the columns from Minutes Asleep to Minutes Deep Sleep as the features and overall_score as the label, since that most closely resembles the data the Fitbit app provides to its users. The Number of Awakenings column seems interesting, but the app doesn't show it, so I'll drop that one for now as well.

In [None]:
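# Obtain column names for features by label, from Minutes Asleep through Minutes Deep Sleep
feats = sleep_data.loc[:, 'Minutes Asleep':'Minutes Deep Sleep'].columns

# Cast features to float and drop the column the app doesn't show
X = sleep_data[feats].astype(float)
X = X.drop('Number of Awakenings', axis=1)
y = sleep_data['overall_score']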

In [None]:
# Split data into training and validation set
X_train, X_valid, y_train, y_valid = train_test_split(X, y, test_size=0.2, random_state=42)
# Note: a Random Forest Regressor is tree-based, so feature scaling is not needed

In [None]:
# Create the model using early stopping and a relatively "slow" learning rate


In [None]:
# Define a function that scores the model and returns its accuracy (100 - MAPE)
def evaluate(model, test_features, test_labels):
    predictions = model.predict(test_features)
    # Absolute error per night, in sleep score points
    errors = np.abs(predictions - test_labels)
    # Mean absolute percentage error; assumes no label is zero
    mape = 100 * np.mean(errors / test_labels)
    accuracy = 100 - mape
    # R^2 on the given data (the default score of scikit-learn regressors)
    score = model.score(test_features, test_labels)
    print('Model Performance')
    print('Average Error: {:0.4f}.'.format(np.mean(errors)))
    print('Accuracy = {:0.2f}%.'.format(accuracy))
    print('Score = {:0.4f}.'.format(score))
    return accuracy
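
To make the accuracy figure concrete: `evaluate` reports 100 minus the mean absolute percentage error (MAPE). Here is a small sketch of that arithmetic on made-up numbers (the scores below are hypothetical, not taken from the dataset):

```python
import numpy as np

# Hypothetical predicted and actual sleep scores for three nights
predictions = np.array([80.0, 70.0, 78.0])
actual = np.array([75.0, 80.0, 78.0])

# Absolute error per night, in score points
errors = np.abs(predictions - actual)

# Each error relative to the actual score, averaged and expressed in percent
mape = 100 * np.mean(errors / actual)
accuracy = 100 - mape

print('Average Error: {:0.4f}'.format(errors.mean()))
print('Accuracy = {:0.2f}%'.format(accuracy))
```

Note that an error of 5 points on a score of 75 still yields an accuracy above 93%, so the average error in score points is often the more informative number.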

In [None]:
# Fit a baseline random forest with default hyperparameters
rf_base = RandomForestRegressor(random_state=42)
rf_base.fit(X_train, y_train)

RandomForestRegressor(random_state=42)
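
A natural next step is to check the baseline on the held-out validation set. The sketch below shows the pattern using `mean_absolute_error` and the regressor's `score` method (R²). To stay self-contained it fits on synthetic stand-in data, so the names `X_demo`, `y_demo`, and `rf_demo`, the made-up score formula, and the printed numbers are assumptions for illustration only. In the notebook you would pass `X_train`, `X_valid`, `y_train`, and `y_valid` instead.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (NOT the Fitbit export): random minutes per night
rng = np.random.default_rng(42)
n_nights = 200
X_demo = pd.DataFrame({
    'Minutes Asleep': rng.integers(300, 560, n_nights),
    'Minutes Awake': rng.integers(30, 120, n_nights),
    'Minutes REM Sleep': rng.integers(30, 120, n_nights),
}).astype(float)
# Made-up score formula plus noise, used only so there is something to learn
y_demo = (60 + 0.05 * X_demo['Minutes Asleep'] - 0.1 * X_demo['Minutes Awake']
          + rng.normal(0, 1, n_nights))

X_tr, X_va, y_tr, y_va = train_test_split(X_demo, y_demo, test_size=0.2, random_state=42)

# Same baseline configuration as in the notebook: default hyperparameters, fixed seed
rf_demo = RandomForestRegressor(random_state=42)
rf_demo.fit(X_tr, y_tr)

# Mean absolute error in score points and R^2 on the validation split
mae = mean_absolute_error(y_va, rf_demo.predict(X_va))
r2 = rf_demo.score(X_va, y_va)
print('Validation MAE: {:0.4f}'.format(mae))
print('Validation R^2: {:0.4f}'.format(r2))
```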

In [None]:
# Look at feature importances
feature_list = list(X.columns)
importances = list(rf_base.feature_importances_)

# List of (variable, importance) tuples, sorted by importance in descending order
feature_importances = [(feature, round(importance, 2)) for feature, importance in zip(feature_list, importances)]
feature_importances = sorted(feature_importances, key=lambda x: x[1], reverse=True)

# Print out features and corresponding importances
for pair in feature_importances:
    print('Variable: {:20} Importance: {}'.format(*pair))

Variable: Minutes Asleep       Importance: 0.61
Variable: Minutes REM Sleep    Importance: 0.19
Variable: Minutes Awake        Importance: 0.12
Variable: Time in Bed          Importance: 0.03
Variable: Minutes Deep Sleep   Importance: 0.03
Variable: Minutes Light Sleep  Importance: 0.02

In [None]:
# Define function for converting an (hours, minutes) tuple into total minutes
def hours_to_mins(time):
    hours, mins = time
    return hours * 60 + mins

In [None]:
X_train.columns

Index(['Minutes Asleep', 'Minutes Awake', 'Time in Bed', 'Minutes REM Sleep',
       'Minutes Light Sleep', 'Minutes Deep Sleep'],
      dtype='object')

In [None]:
# Last night's values as (hours, minutes), in the same order as X_train.columns
yesterday = [(7, 12), (1, 20), (8, 32), (1, 3), (4, 45), (1, 24)]

In [None]:
# Define function to transform (hours, minutes) inputs into a single-row 2D array
def get_input(times):
    transformed = np.array([hours_to_mins(time) for time in times])
    # reshape(1, -1) gives one sample with len(times) features, the shape predict() expects
    return transformed.reshape(1, -1)

In [None]:
# Convert last night's sleep data into model input
last_night = get_input(yesterday)
last_night

array([[432,  80, 512,  63, 285,  84]])
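
Since the model was fitted on a DataFrame, recent scikit-learn versions emit a warning about missing feature names when `predict` receives a plain NumPy array. One way around this, sketched below, is to build a one-row DataFrame with the training column names. `get_input_frame` and `FEATURE_COLUMNS` are hypothetical names introduced here, and the column list is copied from the `X_train.columns` output shown earlier.

```python
import pandas as pd

# Column order copied from the X_train.columns output (assumed unchanged)
FEATURE_COLUMNS = [
    'Minutes Asleep', 'Minutes Awake', 'Time in Bed',
    'Minutes REM Sleep', 'Minutes Light Sleep', 'Minutes Deep Sleep',
]

def hours_to_mins(time):
    # Convert an (hours, minutes) tuple into total minutes
    hours, mins = time
    return hours * 60 + mins

def get_input_frame(times, columns=FEATURE_COLUMNS):
    # Build a one-row DataFrame whose columns match the training features
    minutes = [hours_to_mins(time) for time in times]
    return pd.DataFrame([minutes], columns=columns)

yesterday = [(7, 12), (1, 20), (8, 32), (1, 3), (4, 45), (1, 24)]
last_night_df = get_input_frame(yesterday)
print(last_night_df)
```

Passing `last_night_df` to `rf_base.predict` would then return the predicted sleep score for that night.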