# Fitbit Data Analysis

## Introduction
This notebook aims to explore and analyze the Fitbit dataset. The dataset contains various CSV files with information on daily activities, calories burned, heart rate, sleep patterns, and more. We will start by listing the directory structure of the dataset, then proceed to load and inspect some key CSV files. Basic statistics and visualizations will be generated to understand the data better.

## Table of Contents
1. [Import Libraries](#Import-Libraries)
2. [List Directory Structure](#List-Directory-Structure)
3. [Read CSV File](#Read-CSV-File)
4. [Basic Statistics](#Basic-Statistics)
5. [Check Missing Values](#Check-Missing-Values)
6. [Visualize Data](#Visualize-Data)
7. [Main Execution](#Main-Execution)


## Accessing the Kaggle Dataset

This repository includes a GitHub Actions workflow that continuously tests the connection to the Kaggle API. The workflow passes the Kaggle username and key securely as environment variables through GitHub secrets, then checks that the files are correctly downloaded, unzipped, and imported into Python using pandas.

So that the work done here can be replicated in the workflow, this notebook performs the analogous operation step by step.

First, we have to install the Kaggle API if it is not already installed:

In [None]:
!pip install kaggle

Then we will import the following modules and names:

* `os`
* `KaggleApi`
* `load_dotenv`
* `zipfile`
* `pandas`

In [34]:
import os
from kaggle.api.kaggle_api_extended import KaggleApi
from dotenv import load_dotenv
import zipfile
import pandas as pd

Since I don't want you, the reader of this notebook, to know my Kaggle credentials, we will use `os.getenv` to retrieve two environment variables: `KAGGLE_USERNAME` and `KAGGLE_KEY`.

These environment variables are set by a local `.env` file that is listed in `.gitignore`, so it is not committed and you can't see it :-)

In [37]:

load_dotenv()

kaggle_username = os.getenv('KAGGLE_USERNAME')
kaggle_key = os.getenv('KAGGLE_KEY')

Now we can quickly check that both variables were actually found (`os.getenv` returns `None` for a variable that is not set):

In [38]:
if kaggle_username is None or kaggle_key is None:
    raise ValueError("Kaggle credentials are not set in the environment variables.")
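
The check above only catches variables that are completely unset. A variable that is present but blank (for example `KAGGLE_KEY=` in the `.env` file) would pass it. As a minimal sketch, assuming we want blank values treated as missing too, a small helper could report exactly which names are the problem. The name `require_env` is hypothetical and not part of any library:

```python
import os


def require_env(*names):
    """Return {name: value} for the given environment variables.

    Hypothetical helper: raises ValueError naming every variable
    that is unset or contains only whitespace.
    """
    values = {name: os.getenv(name) for name in names}
    missing = [
        name for name, value in values.items()
        if value is None or not value.strip()
    ]
    if missing:
        raise ValueError("Missing environment variables: " + ", ".join(missing))
    return values
```

In this notebook it would be called as `require_env('KAGGLE_USERNAME', 'KAGGLE_KEY')` right after `load_dotenv()`.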

Since the credentials are now available in the environment, we can authenticate against the Kaggle API:

In [39]:
# Authenticate with the Kaggle API (uses KAGGLE_USERNAME and KAGGLE_KEY from the environment)
api = KaggleApi()
api.authenticate()

Now we can download the files listed on the dataset page as follows:


In [40]:
# Defining the dataset
dataset = 'arashnic/fitbit'

# Downloading the dataset
api.dataset_download_files(dataset, path='./data', unzip=False)

Dataset URL: https://www.kaggle.com/datasets/arashnic/fitbit


ApiException: (502)
Reason: Bad Gateway
HTTP response headers: HTTPHeaderDict({'Content-Length': '136', 'Content-Type': 'text/html; charset=UTF-8', 'Date': 'Tue, 17 Sep 2024 21:15:13 GMT', 'Alt-Svc': 'h3=":443"; ma=2592000,h3-29=":443"; ma=2592000'})
HTTP response body: b'<!doctype html><meta charset="utf-8"><meta name=viewport content="width=device-width, initial-scale=1"><title>502</title>502 Bad Gateway'
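
The `502 Bad Gateway` above is a server-side error, and such errors are often transient, so one option is to retry the download a few times with an increasing pause between attempts. The following is a minimal sketch of that idea, not part of the Kaggle API: the helper name `retry` and its parameters are assumptions made for illustration.

```python
import time


def retry(operation, attempts=4, base_delay=2.0, retry_on=(Exception,), sleep=time.sleep):
    """Call operation() until it succeeds or the attempts run out.

    Hypothetical helper: waits base_delay * 2**i seconds after the
    i-th failed attempt (exponential backoff), then tries again.
    """
    for attempt in range(attempts):
        try:
            return operation()
        except retry_on:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(base_delay * 2 ** attempt)
```

With the names already defined in this notebook, the download could then be wrapped as `retry(lambda: api.dataset_download_files(dataset, path='./data', unzip=False))`. Catching every `Exception` is deliberately broad for the sketch. In practice `retry_on` would probably be narrowed to the `ApiException` class shown in the traceback, so that genuine mistakes such as a wrong dataset name fail immediately.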
