<a href="https://www.kaggle.com/code/amirmotefaker/smartwatch-data-analysis?scriptVersionId=125616141" target="_blank"><img align="left" alt="Kaggle" title="Open in Kaggle" src="https://kaggle.com/static/images/open-in-kaggle.svg"></a>

# Introduction

- Competition among smartwatch brands is intense, and smartwatches are especially popular with people who track their fitness. Analyzing the fitness data these devices collect is one of the use cases of data science in healthcare. In this notebook, I will take you through the task of smartwatch data analysis using Python.

## Dataset

- This dataset was generated by respondents to a survey distributed via Amazon Mechanical Turk between 03.12.2016-05.12.2016. Thirty eligible Fitbit users consented to the submission of personal tracker data, including minute-level output for physical activity, heart rate, and sleep monitoring. Individual reports can be parsed by export session ID (column A) or timestamp (column B). Variation between outputs reflects the use of different types of Fitbit trackers and individual tracking behaviors and preferences.

# Import libraries

In [None]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import plotly.express as px
import plotly.graph_objects as go

# Read Data

In [None]:
data = pd.read_csv("/kaggle/input/fitbit/Fitabase Data 4.12.16-5.12.16/dailyActivity_merged.csv")

In [None]:
print(data.head())

# Null Values

- Does the dataset contain null values or not?

In [None]:
print(data.isnull().sum())

- So the dataset does not have any null values. 
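
- To show what this check reports when values are missing, here is a minimal sketch on a made-up frame. The `sample` frame and its values are hypothetical and are not taken from the Fitbit data:

```python
import numpy as np
import pandas as pd

# Hypothetical toy frame, not taken from the Fitbit dataset
sample = pd.DataFrame({
    "TotalSteps": [1200, np.nan, 8500],
    "Calories": [1800, 2100, 2400],
})

# isnull() marks missing cells; sum() counts them per column
null_counts = sample.isnull().sum()
print(null_counts)

# A single boolean answers "is anything missing at all?"
has_nulls = bool(sample.isnull().values.any())
print(has_nulls)
```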

- Let’s have a look at the information about columns in the dataset:

In [None]:
# info() prints its summary directly and returns None, so no print() is needed
data.info()

- The column containing the date of the record is an object. We may need to use dates in our analysis, so let’s convert this column into a datetime column:

In [None]:
# Changing datatype of ActivityDate

data["ActivityDate"] = pd.to_datetime(data["ActivityDate"], 
                                      format="%m/%d/%Y")
# info() prints its summary directly and returns None, so no print() is needed
data.info()
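
- If some date strings might not match the expected layout, one option is to coerce them to `NaT` and count them. This is a sketch on hypothetical strings (the `raw` series is made up), not a step the dataset is known to need:

```python
import pandas as pd

# Hypothetical date strings in the same month/day/year layout
raw = pd.Series(["4/12/2016", "4/13/2016", "not a date"])

# errors="coerce" turns unparseable entries into NaT instead of raising
parsed = pd.to_datetime(raw, format="%m/%d/%Y", errors="coerce")
print(parsed)
print("Unparseable entries:", parsed.isna().sum())
```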

- Look carefully at all the columns. You will see information about very active, fairly active, lightly active, and sedentary minutes in the dataset.

- Let's combine all these columns as total minutes before moving forward:

In [None]:
data["TotalMinutes"] = data["VeryActiveMinutes"] + data["FairlyActiveMinutes"] + data["LightlyActiveMinutes"] + data["SedentaryMinutes"]
print(data["TotalMinutes"].sample(5))
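
- A quick sanity check on the combined column is that no day should exceed 1440 minutes (24 × 60). The sketch below uses hypothetical rows with the same column names; `minute_cols`, `MINUTES_PER_DAY`, and `over_limit` are illustrative names:

```python
import pandas as pd

# Hypothetical rows using the same column names as the dataset
sample = pd.DataFrame({
    "VeryActiveMinutes": [25, 0],
    "FairlyActiveMinutes": [13, 0],
    "LightlyActiveMinutes": [328, 0],
    "SedentaryMinutes": [728, 1440],
})

minute_cols = ["VeryActiveMinutes", "FairlyActiveMinutes",
               "LightlyActiveMinutes", "SedentaryMinutes"]

# Row-wise sum over the four minute columns
sample["TotalMinutes"] = sample[minute_cols].sum(axis=1)

# A day has 1440 minutes, so larger totals would indicate a data problem
MINUTES_PER_DAY = 24 * 60
over_limit = sample[sample["TotalMinutes"] > MINUTES_PER_DAY]
print(sample["TotalMinutes"].tolist())
print(len(over_limit))
```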

- Let's take a look at the descriptive statistics of the data set:

In [None]:
print(data.describe())

# Analyze the Calories

- The dataset has a “Calories” column, which holds the number of calories burned in a day.

- Let’s have a look at the relationship between calories burned and the total steps walked in a day:

In [None]:
figure = px.scatter(data_frame = data, x="Calories",
                    y="TotalSteps", size="VeryActiveMinutes", 
                    trendline="ols", 
                    title="Relationship between Calories & Total Steps")
figure.show()

- The scatter plot and its OLS trendline suggest a roughly linear, positive relationship between the total number of steps and the number of calories burned in a day.
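
- One way to put a number on such a relationship is the Pearson correlation coefficient together with a least-squares line. The sketch below runs on synthetic data generated only for illustration (the coefficients `1700`, `0.08`, and the noise level are assumptions, not estimates from the Fitbit data):

```python
import numpy as np
import pandas as pd

# Synthetic data for illustration only: calories rise with steps plus noise
rng = np.random.default_rng(0)
steps = rng.integers(0, 20000, size=200)
calories = 1700 + 0.08 * steps + rng.normal(0, 150, size=200)
sample = pd.DataFrame({"TotalSteps": steps, "Calories": calories})

# Pearson correlation: values near +1 indicate a strong positive linear trend
r = sample["TotalSteps"].corr(sample["Calories"])

# Least-squares line with Calories on x and TotalSteps on y, as in the plot
slope, intercept = np.polyfit(sample["Calories"], sample["TotalSteps"], deg=1)
print(f"r = {r:.2f}, slope = {slope:.2f} steps per calorie")
```

- On the real data, the same two calls can be applied to `data["TotalSteps"]` and `data["Calories"]`.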

- Let’s look at the average number of minutes per day spent in each activity category:

In [None]:
label = ["Very Active Minutes", "Fairly Active Minutes", 
         "Lightly Active Minutes", "Inactive Minutes"]
counts = data[["VeryActiveMinutes", "FairlyActiveMinutes", 
               "LightlyActiveMinutes", "SedentaryMinutes"]].mean()
colors = ['gold','lightgreen', "pink", "blue"]

fig = go.Figure(data=[go.Pie(labels=label, values=counts)])
fig.update_layout(title_text='Total Active Minutes')
fig.update_traces(hoverinfo='label+percent', textinfo='value', textfont_size=30,
                  marker=dict(colors=colors, line=dict(color='black', width=3)))
fig.show()

### Chart observations:
- Inactive (sedentary) minutes make up 81.3% of the average day's tracked minutes
- Lightly active minutes make up 15.8%
- On average, only 21 minutes (1.74%) were very active
- Fairly active minutes account for 1.11% (13 minutes) per day
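
- Percentages like these come from dividing each category's average by the sum of all four averages. Here is a minimal sketch with hypothetical averages chosen so the total is 1000 minutes (the `means` values are made up):

```python
import pandas as pd

# Hypothetical average minutes per day, chosen so the total is 1000
means = pd.Series({
    "VeryActiveMinutes": 20.0,
    "FairlyActiveMinutes": 10.0,
    "LightlyActiveMinutes": 170.0,
    "SedentaryMinutes": 800.0,
})

# Each category's share of the combined average, in percent
shares = (means / means.sum() * 100).round(2)
print(shares)
```

- With the real data, the `counts` series from the pie chart cell could take the place of `means`.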

- We converted the ActivityDate column to a datetime column earlier.

- Let’s use it to find the weekdays of the records and add a new column to this dataset as “Day”:

In [None]:
data["Day"] = data["ActivityDate"].dt.day_name()
print(data["Day"].head())
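
- Once a weekday column exists, it can be used as a grouping key. The sketch below uses hypothetical records (the `sample` frame and its step counts are made up) to show `dt.day_name()` followed by a per-weekday average:

```python
import pandas as pd

# Hypothetical records; 2016-04-12 fell on a Tuesday
sample = pd.DataFrame({
    "ActivityDate": pd.to_datetime(["2016-04-12", "2016-04-13", "2016-04-19"]),
    "TotalSteps": [10000, 8000, 6000],
})
sample["Day"] = sample["ActivityDate"].dt.day_name()

# Average steps per weekday
steps_by_day = sample.groupby("Day")["TotalSteps"].mean()
print(steps_by_day)
```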