
Fitness watch data analysis examines the data collected by fitness wearables and smartwatches to gain insight into users' health and activity patterns. These devices track metrics such as step count, distance covered, energy burned, flights climbed, and walking speed.

In [1]:
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


In [4]:
import pandas as pd
import plotly.io as pio
import plotly.graph_objects as go
import plotly.express as px

In [5]:
pio.templates.default = "plotly_white"

In [7]:
data = pd.read_csv("drive/My Drive/DATA/Apple-Fitness-Data.csv")
print(data.head())

         Date       Time  Step Count  Distance  Energy Burned  \
0  2023-03-21  16:01:23           46   0.02543         14.620   
1  2023-03-21  16:18:37          645   0.40041         14.722   
2  2023-03-21  16:31:38           14   0.00996         14.603   
3  2023-03-21  16:45:37           13   0.00901         14.811   
4  2023-03-21  17:10:30           17   0.00904         15.153   

   Flights Climbed  Walking Double Support Percentage  Walking Speed  
0                3                              0.304          3.060  
1                3                              0.309          3.852  
2                4                              0.278          3.996  
3                3                              0.278          5.040  
4                3                              0.281          5.184  


In [8]:
data.isnull().sum()

Date                                 0
Time                                 0
Step Count                           0
Distance                             0
Energy Burned                        0
Flights Climbed                      0
Walking Double Support Percentage    0
Walking Speed                        0
dtype: int64

In [9]:
# Step Count Over Time
fig1 = px.line(data, x="Time",
               y="Step Count",
               title="Step Count Over Time")
fig1.show()
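The line charts in this notebook use only the Time column on the x axis, so readings taken on different dates are not distinguished from one another. A minimal sketch of one way to build a single chronological axis, assuming the Date and Time columns are strings in the formats shown by `data.head()`. The three-row `sample` frame and the `Timestamp` column name are hypothetical and only stand in for the real data.

```python
import pandas as pd

# Hypothetical three-row sample shaped like the notebook's `data` frame.
sample = pd.DataFrame({
    "Date": ["2023-03-21", "2023-03-22", "2023-03-21"],
    "Time": ["16:18:37", "09:05:00", "16:01:23"],
    "Step Count": [645, 120, 46],
})

# Join the two string columns and parse them into one datetime column.
# "Timestamp" is an assumed column name, not part of the original dataset.
sample["Timestamp"] = pd.to_datetime(
    sample["Date"] + " " + sample["Time"], format="%Y-%m-%d %H:%M:%S"
)

# Sort chronologically so a line plot would connect points in time order.
sample = sample.sort_values("Timestamp").reset_index(drop=True)
print(sample[["Timestamp", "Step Count"]])
```

Passing `x="Timestamp"` instead of `x="Time"` to `px.line` would then plot the readings along one continuous time axis.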

In [10]:
# Distance Covered Over Time
fig2 = px.line(data, x="Time",
               y="Distance",
               title="Distance Covered Over Time")
fig2.show()

In [11]:
# Energy Burned Over Time
fig3 = px.line(data, x="Time",
               y="Energy Burned",
               title="Energy Burned Over Time")
fig3.show()

In [12]:
# Walking Speed Over Time
fig4 = px.line(data, x="Time",
               y="Walking Speed",
               title="Walking Speed Over Time")
fig4.show()

In [13]:
# Calculate the average step count per recorded entry, grouped by day
average_step_count_per_day = data.groupby("Date")["Step Count"].mean().reset_index()


In [14]:
average_step_count_per_day

         Date  Step Count
0  2023-03-21  137.636364
1  2023-03-22  354.233333
2  2023-03-23  109.125000
3  2023-03-24   64.666667
4  2023-03-25  117.000000
5  2023-03-26  101.000000
6  2023-03-27   48.875000
7  2023-03-28  163.750000
8  2023-03-29  169.578947
9  2023-03-30  384.181818
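Note that `mean()` here averages the individual recorded entries within each day, so the result is not the total number of steps per day. A small sketch of how the two differ, using a hypothetical three-row `sample` frame; the output column names `mean_per_record`, `total_steps`, and `records` are illustrative choices.

```python
import pandas as pd

# Hypothetical sample: two entries on the first day, one on the second.
sample = pd.DataFrame({
    "Date": ["2023-03-21", "2023-03-21", "2023-03-22"],
    "Step Count": [46, 645, 14],
})

# Named aggregation: compute the per-entry mean, the daily total,
# and the number of entries side by side for each date.
per_day = (
    sample.groupby("Date")["Step Count"]
    .agg(mean_per_record="mean", total_steps="sum", records="count")
    .reset_index()
)
print(per_day)
```

A day with many short recordings can have a low mean but a high total, so which aggregate to plot depends on the question being asked.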


In [15]:
fig5 = px.bar(average_step_count_per_day, x="Date",
              y="Step Count",
              title="Average Step Count per Day")
fig5.update_xaxes(type='category')
fig5.show()

In [16]:
# Calculate Walking Efficiency
data["Walking Efficiency"] = data["Distance"] / data["Step Count"]

fig6 = px.line(data, x="Time",
               y="Walking Efficiency",
               title="Walking Efficiency Over Time")
fig6.show()
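The ratio above is the distance covered per step. If any entry had a Step Count of zero, the division would produce `inf` or `NaN` and could distort the chart. A hedged sketch of one way to guard against that, using a hypothetical `sample` frame whose last row has zero steps; nothing in the printed data indicates that the real dataset contains such rows.

```python
import numpy as np
import pandas as pd

# Hypothetical sample; the last row has zero steps to show the guard.
sample = pd.DataFrame({
    "Distance": [0.02543, 0.40041, 0.0],
    "Step Count": [46, 645, 0],
})

# Replace zero step counts with NaN so the division yields NaN
# (a missing value) instead of inf or a division warning.
steps = sample["Step Count"].replace(0, np.nan)
sample["Walking Efficiency"] = sample["Distance"] / steps
print(sample)
```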

In [17]:
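# Create time intervals from the hour of each reading:
# Morning [0, 12), Afternoon [12, 18), Evening [18, 24)
time_intervals = pd.cut(pd.to_datetime(data["Time"], format="%H:%M:%S").dt.hour,
                        bins=[0, 12, 18, 24],
                        labels=["Morning", "Afternoon", "Evening"],
                        right=False)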

In [18]:
time_intervals

0      Afternoon
1      Afternoon
2      Afternoon
3      Afternoon
4      Afternoon
         ...    
144    Afternoon
145    Afternoon
146    Afternoon
147    Afternoon
148    Afternoon
Name: Time, Length: 149, dtype: category
Categories (3, object): ['Morning' < 'Afternoon' < 'Evening']
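With `right=False`, each bin includes its left edge and excludes its right edge, so hour 12 falls into Afternoon and hour 18 into Evening. A short illustration on a hypothetical series of boundary hours, using the same bins and labels as the cell above:

```python
import pandas as pd

# Hypothetical hours chosen to sit on and around the bin edges.
hours = pd.Series([0, 11, 12, 17, 18, 23])

# Same bins and labels as the notebook: [0, 12), [12, 18), [18, 24).
intervals = pd.cut(hours,
                   bins=[0, 12, 18, 24],
                   labels=["Morning", "Afternoon", "Evening"],
                   right=False)
print(intervals.tolist())
# ['Morning', 'Morning', 'Afternoon', 'Afternoon', 'Evening', 'Evening']
```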

In [19]:
data["Time Interval"] = time_intervals

In [20]:
# Variations in Step Count and Walking Speed by Time Interval
# (trendline="ols" requires the statsmodels package to be installed)
fig7 = px.scatter(data, x="Step Count",
                  y="Walking Speed",
                  color="Time Interval",
                  title="Step Count and Walking Speed Variations by Time Interval",
                  trendline="ols")
fig7.show()

In [21]:
# Compare the daily average of all the health and fitness metrics

# Average every numeric metric per day; numeric_only=True skips the
# non-numeric Time and Time Interval columns
daily_avg_metrics = data.groupby("Date").mean(numeric_only=True).reset_index()
daily_avg_metrics



         Date  Step Count  Distance  Energy Burned  Flights Climbed  \
0  2023-03-21  137.636364  0.086225      14.721273         2.909091   
1  2023-03-22  354.233333  0.230261      15.158233         2.466667   
2  2023-03-23  109.125000  0.075796      14.303000         2.375000   
3  2023-03-24   64.666667  0.042067      15.268667         2.666667   
4  2023-03-25  117.000000  0.080747      15.060222         2.555556   
5  2023-03-26  101.000000  0.068760      18.504091         2.000000   
6  2023-03-27   48.875000  0.032664      23.656625         2.000000   
7  2023-03-28  163.750000  0.102727      14.853917         4.000000   
8  2023-03-29  169.578947  0.115884      13.363737         1.684211   
9  2023-03-30  384.181818  0.252494      13.236909         2.545455   

   Walking Double Support Percentage  Walking Speed  Walking Efficiency  
0                           0.294273       4.352727            0.000626  
1                           0.310467       3.502800            0.000707  
2                           0.312375       3.762000            0.000676  
3                           0.307333       3.936000            0.000648  
4                           0.297778       3.520000            0.000684  
5                           0.291000       3.135273            0.000669  
6                           0.284625       4.450500            0.000664  
7                           0.300417       4.902000            0.000651  
8                           0.298842       4.234737            0.000675  
9                           0.293182       4.434545            0.000676  


In [22]:
daily_avg_metrics_melted = daily_avg_metrics.melt(id_vars=["Date"],
                                                  value_vars=["Step Count", "Distance",
                                                              "Energy Burned", "Flights Climbed",
                                                              "Walking Double Support Percentage",
                                                              "Walking Speed"])

In [23]:
daily_avg_metrics_melted

          Date       variable       value
0   2023-03-21     Step Count  137.636364
1   2023-03-22     Step Count  354.233333
2   2023-03-23     Step Count  109.125000
3   2023-03-24     Step Count   64.666667
4   2023-03-25     Step Count  117.000000
..         ...            ...         ...
67  2023-03-28  Walking Speed    4.902000
68  2023-03-29  Walking Speed    4.234737
69  2023-03-30  Walking Speed    4.434545
70  2023-03-31  Walking Speed    3.468000


In [24]:
# Treemap of daily averages for the different metrics over the recorded dates
fig = px.treemap(daily_avg_metrics_melted,
                 path=["variable"],
                 values="value",
                 color="variable",
                 hover_data=["value"],
                 title="Daily Averages for Different Metrics")
fig.show()

- The above graph represents each health and fitness metric as a rectangular tile. Because the treemap groups rows only by metric, the size of each tile corresponds to that metric's daily averages summed over all dates, and the colour of the tile identifies the metric. Hovering over a tile displays this aggregated value.

- The Step Count metric dominates the visualization because its numerical values are much higher than those of the other metrics, which makes variations in the other metrics difficult to see.
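The dominance of Step Count can be checked numerically before plotting. The sketch below assumes that a treemap with `path=["variable"]` sums the `value` column over all rows that share a metric, and reproduces that aggregation with a plain `groupby`. The two-day `daily` frame is a hypothetical stand-in for `daily_avg_metrics`, with values rounded from the table above.

```python
import pandas as pd

# Hypothetical two-day, two-metric stand-in for daily_avg_metrics.
daily = pd.DataFrame({
    "Date": ["2023-03-21", "2023-03-22"],
    "Step Count": [137.6, 354.2],
    "Walking Speed": [4.35, 3.50],
})
melted = daily.melt(id_vars=["Date"],
                    value_vars=["Step Count", "Walking Speed"])

# Sum the daily averages per metric (the assumed tile size), then
# express each metric as a share of the combined total.
tile_totals = melted.groupby("variable")["value"].sum()
share = tile_totals / tile_totals.sum()
print(share.round(3))
```

In this sample Step Count accounts for more than 98% of the combined total, which is why the remaining tiles are barely visible. Since the metrics are measured in different units, tile sizes compare raw magnitudes rather than like-for-like quantities.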

In [25]:
# Select metrics excluding Step Count
metrics_to_visualize = ["Distance", "Energy Burned", "Flights Climbed",
                        "Walking Double Support Percentage", "Walking Speed"]


In [26]:
# Reshape data for treemap
daily_avg_metrics_melted = daily_avg_metrics.melt(id_vars=["Date"], value_vars=metrics_to_visualize)


In [27]:
daily_avg_metrics_melted

         Date  variable     value
0  2023-03-21  Distance  0.086225
1  2023-03-22  Distance  0.230261
2  2023-03-23  Distance  0.075796
3  2023-03-24  Distance  0.042067
4  2023-03-25  Distance  0.080747
5  2023-03-26  Distance  0.068760
6  2023-03-27  Distance  0.032664
7  2023-03-28  Distance  0.102727
8  2023-03-29  Distance  0.115884
9  2023-03-30  Distance  0.252494


In [28]:
fig = px.treemap(daily_avg_metrics_melted,
                 path=["variable"],
                 values="value",
                 color="variable",
                 hover_data=["value"],
                 title="Daily Averages for Different Metrics (Excluding Step Count)")
fig.show()