# Google capstone project case 2:
## How Can a Wellness Technology Company
## Play It Smart?
### ELIAS TURK
##### Data analyst

##### Date: 27/9/2023

##### Reach me at : elias.n.turk@gmail.com or https://eliasturk.github.io/



![Screenshot 2023-10-14 132951.png](https://drive.google.com/uc?id=1S9COJDi4iGgG3ioC3zf0RiQxROQoMKmy)




# I. INTRODUCTION (Ask phase)
This report is divided into sections that correspond to the main steps of the data analysis process, making it easier to follow the study in detail. We start by examining the data sources and the procedures used to prepare and process the data. We then move to the core of the analysis and present the insights it produced. Finally, we explain our findings and offer recommendations intended to guide Bellabeat's marketing strategy.

The Bellabeat Data Analysis Report is more than just a report; it is a physical representation of our dedication to using data to foster growth, innovation, and empowerment. Come along with us on this analytical adventure as we enlighten the future.



#### • BUSINESS TASK:
I have been given a crucial task in my capacity as a junior data analyst at Bellabeat: to carefully analyze smart device data in relation to Bellabeat's broad range of products. This job requires a thorough analysis of consumer behavior around smart wellness products such as ours, which include the Bellabeat app, the Leaf fitness tracker, the Time fitness watch, the Spring smart water bottle, and the premium Bellabeat membership program. Each product reflects the user's interests, trends, and behavior in a distinctive way. My goal is to uncover the valuable insights hidden in this data, information that could reshape Bellabeat's marketing approach.



# II. Data collection (Prepare phase)

#### ● Where is your data stored? 
The data used in this analysis is sourced from Fitbit Fitness Tracker Data, which is made available on Kaggle by the user Mobius under the CC0: Public Domain license. The data files are stored in a dedicated dataset folder for this analysis.
#### ● How is the data organized? Is it in long or wide format?
The data is provided in both long and wide formats. For our analysis, we have chosen to work with the long format. The long format is preferred when dealing with time series data as it allows for easier manipulation and analysis of minute-level observations.
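
To make the long versus wide distinction concrete, here is a minimal Python sketch of reshaping a wide table into long format. The column names (`Steps00`, `Steps01`, ...) and the values are hypothetical and only illustrate the idea; they are not copied from the dataset.

```python
# Hypothetical wide-format rows: one row per user and hour,
# one column per minute (names and values are illustrative only).
wide_rows = [
    {"Id": 1, "ActivityHour": "2016-04-12 00:00", "Steps00": 0, "Steps01": 12, "Steps02": 7},
    {"Id": 2, "ActivityHour": "2016-04-12 00:00", "Steps00": 3, "Steps01": 0, "Steps02": 0},
]


def wide_to_long(rows, id_cols, value_prefix):
    """Turn every minute column into its own row (one observation per row)."""
    long_rows = []
    for row in rows:
        for col, value in row.items():
            if col.startswith(value_prefix):
                record = {key: row[key] for key in id_cols}
                # The minute index is the numeric suffix of the column name.
                record["minute"] = int(col[len(value_prefix):])
                record["steps"] = value
                long_rows.append(record)
    return long_rows


long_rows = wide_to_long(wide_rows, id_cols=["Id", "ActivityHour"], value_prefix="Steps")
```

In the long form each minute is a separate row, which is what makes filtering and grouping by time straightforward.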
#### ● Are there issues with bias or credibility in this data? Does your data ROCCC? 
The dataset primarily consists of data from Fitbit users who do not use Bellabeat smart devices. As a result, the findings and insights drawn from this dataset should be interpreted within the context of Fitbit users specifically. It is important to acknowledge that the dataset's demographic scope is limited to Fitbit users, and any conclusions should be considered relevant to this user group. While we have taken steps to ensure data integrity, it's essential to be aware of potential biases within this specific user population when assessing the credibility and generalizability of our analysis.
#### ● How are you addressing licensing, privacy, security, and accessibility? 
The dataset used in this analysis is publicly available and released under the CC0: Public Domain license, which means it can be used for various purposes, including research and analysis. We have also complied with the terms and conditions of data usage outlined by Kaggle and Mobius to ensure ethical and responsible data handling.
#### ● How did you verify the data’s integrity?
We conducted data profiling to identify missing values, outliers, and inconsistencies. Additionally, we cross-verified the data with domain knowledge and compared it with expected patterns. 
#### ● How does it help you answer your question?
This dataset provides valuable insights into the physical activity, heart rate, and sleep monitoring patterns of Fitbit users. By analyzing this data, we aim to answer specific research questions related to health and activity patterns during the specified time frame. The dataset allows us to explore trends and correlations that can contribute to a better understanding of fitness behaviors and health outcomes. We can then translate these findings into business terms to support Bellabeat's marketing strategy!
#### ● Are there any problems with the data?
The dataset is relatively small in terms of the number of users who tracked other features such as sleep, heart rate, and weight. The daily activity data covers 31 days, which is sufficient. The dataset came partly preprocessed, so only minimal preprocessing was needed!

# III. Data Preprocessing (Process phase)

#### ● What tools are you choosing and why?
For joining tables and cleaning data, I used:
1. Excel's Power Query Editor to join hourly data to hourly activity.
2. SQL (MS SQL Server) for data analysis and data cleaning, as well as creating tables for further visualization and analysis.
3. Power BI to visualize the tables and SQL query results.
#### ● Have you ensured your data’s integrity?
I have taken several steps to ensure data integrity while working with the public dataset. First, I ensured that the dataset comes from a reputable source, which is a critical factor in maintaining data integrity. Additionally, I have thoroughly reviewed the dataset's documentation and metadata to understand how the data was collected and processed.

During my analysis, I implemented rigorous data cleaning and validation procedures to identify and address any anomalies, such as missing values, outliers, or inconsistencies. This process helped ensure that the dataset's integrity was maintained throughout my work.

I also kept a close eye on the dataset's version and checked for any updates or corrections provided by the data provider to ensure that I was working with the most accurate and up-to-date information.

In summary, I can confidently say that I have taken all necessary precautions to maintain data integrity and have diligently checked for anomalies while working with the dataset.
#### ● What steps have you taken to ensure that your data is clean?
1. Check for duplicate entries in all datasets.
2. Check for NULL or missing values.
3. Transform the data and data types to work with them.
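
The three checks listed above can be sketched in plain Python as follows. This is only an illustration under assumed inputs: the sample rows are made up, and the actual cleaning in this report was done in SQL.

```python
from collections import Counter

# Hypothetical raw rows as they might look right after import (all strings).
raw_rows = [
    {"Id": "1", "ActivityDate": "2016-04-12", "TotalDistance": "8.5"},
    {"Id": "1", "ActivityDate": "2016-04-12", "TotalDistance": "8.5"},  # exact duplicate
    {"Id": "2", "ActivityDate": "2016-04-12", "TotalDistance": ""},     # missing value
]


def find_duplicates(rows):
    """Step 1: return each row that appears more than once."""
    counts = Counter(tuple(sorted(row.items())) for row in rows)
    return [dict(key) for key, n in counts.items() if n > 1]


def count_missing(rows):
    """Step 2: count empty or None values per column."""
    missing = Counter()
    for row in rows:
        for col, value in row.items():
            if value is None or value == "":
                missing[col] += 1
    return dict(missing)


def clean(rows):
    """Step 3: drop exact duplicates and cast strings to proper types."""
    seen, cleaned = set(), []
    for row in rows:
        key = tuple(sorted(row.items()))
        if key in seen:
            continue  # skip rows already kept once
        seen.add(key)
        distance = row["TotalDistance"]
        cleaned.append({
            "Id": int(row["Id"]),
            "ActivityDate": row["ActivityDate"],
            "TotalDistance": float(distance) if distance else None,
        })
    return cleaned


cleaned_rows = clean(raw_rows)
```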


### Data cleaning:
This is one small part of the data cleaning. I tried to log the most important steps, but many other cleaning procedures were executed at different stages. On-the-fly data cleaning and transformation was done as well.

In [None]:
-- Modifying data types for easier manipulation.
-- 1) TotalDistance from string to numeric
ALTER TABLE daily_activity
ALTER COLUMN TotalDistance float;
-- 2) LoggedActivitiesDistance from string to numeric
ALTER TABLE daily_activity
ALTER COLUMN LoggedActivitiesDistance float;
-- 3) VeryActiveDistance from string to numeric
ALTER TABLE daily_activity
ALTER COLUMN VeryActiveDistance float;
-- 4) ModeratelyActiveDistance from string to numeric
ALTER TABLE daily_activity
ALTER COLUMN ModeratelyActiveDistance float;
-- 5) LightActiveDistance from string to numeric
ALTER TABLE daily_activity
ALTER COLUMN LightActiveDistance float;
-- 6) SedentaryActiveDistance from string to numeric
ALTER TABLE daily_activity
ALTER COLUMN SedentaryActiveDistance float;
-- 7) TotalSleepRecords in sleepDay from string to numeric
ALTER TABLE sleepDay
ALTER COLUMN TotalSleepRecords int;
-- 8) Change ActivityHour to datetime2
ALTER TABLE Bellabeat.dbo.hourly_activity
ALTER COLUMN ActivityHour datetime2;
-- 9) Rename column SleepDay in the sleepDay table to days_tracked
EXEC sp_RENAME 'sleepDay.SleepDay', 'days_tracked', 'COLUMN';
-- 10) Rename column value in the minuteSleep table to quality
EXEC sp_RENAME 'minuteSleep.value', 'quality', 'COLUMN';

-- Removing duplicates from sleepDay: first list the duplicated rows, then delete all but one copy of each
SELECT [Id], [days_tracked], [TotalSleepRecords], [TotalMinutesAsleep], [TotalTimeInBed]
FROM [Bellabeat].[dbo].[sleepDay]
GROUP BY [Id], [days_tracked], [TotalSleepRecords], [TotalMinutesAsleep], [TotalTimeInBed]
HAVING COUNT(*) > 1;

WITH Duplicates AS (
    SELECT
        Id,
        days_tracked,
        TotalSleepRecords,
        TotalMinutesAsleep,
        TotalTimeInBed,
        ROW_NUMBER() OVER (PARTITION BY Id, days_tracked, TotalSleepRecords, TotalMinutesAsleep, TotalTimeInBed ORDER BY (SELECT 0)) AS RowNum
    FROM
        [Bellabeat].[dbo].[sleepDay]
)
DELETE FROM Duplicates
WHERE RowNum > 1;

-- Removing duplicates from minuteSleep: first list the duplicated rows, then delete all but one copy of each
SELECT [Id], [date], [quality], [logId]
FROM [Bellabeat].[dbo].[minuteSleep]
GROUP BY [Id], [date], [quality], [logId]
HAVING COUNT(*) > 1;


WITH Duplicates AS (
    SELECT
        [Id],
        [date],
        [quality],
        [logId],
        ROW_NUMBER() OVER (PARTITION BY [Id], [date], [quality], [logId] ORDER BY (SELECT 0)) AS RowNum
    FROM
        [Bellabeat].[dbo].[minuteSleep]
)
DELETE FROM Duplicates
WHERE RowNum > 1;



### Data transformation: 
This is where I joined tables together and segmented the users who tracked different features by aggregating each group into separate tables.

In [None]:


SELECT
	da.Id,
    da.ActivityDate AS date,
    da.TotalSteps AS Total_Steps,
	ROUND(da.TotalDistance, 2) AS Total_Distance,
	ROUND(da.LoggedActivitiesDistance, 2) AS Logged_Activities_Distance,
	ROUND(da.VeryActiveDistance, 2) AS Very_Active_Distance,
	ROUND(da.ModeratelyActiveDistance, 2) AS Moderately_Active_Distance,
	ROUND(da.LightActiveDistance, 2) AS Light_Active_Distance,
	ROUND(da.SedentaryActiveDistance, 2) As Sedentary_Active_Distance,
	ROUND(da.VeryActiveMinutes, 2) AS Very_Active_Minutes,
	ROUND(da.FairlyActiveMinutes, 2) AS Fairly_Active_Minutes,
	ROUND(da.LightlyActiveMinutes, 2) AS Light_Active_Minutes,
	da.SedentaryMinutes,
	da.Calories,
    ha.daily_avg_intensity , -- Leave Avg_Intensity as is
    ha.daily_total_intensity , -- Leave Total_Intensity as is
    ha.daily_MET
--INTO daily_activity_updated
FROM
    Bellabeat.dbo.daily_activity AS da
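INNER JOIN
    -- DailyAggregated is not defined in this log; it is assumed to hold the hourly intensity and MET data aggregated per user and day
    DailyAggregated AS ha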
ON
    da.Id = ha.Id
    AND da.ActivityDate = ha.date
ORDER BY
    da.Id,
    da.ActivityDate;


/* NOW WE CAN QUERY MORE INFORMATION ABOUT USERS AND THE PREVIOUS QUERY WILL BE USED*/


/* daily_activity_updated contains MET and physical data for 33 users.
Now we need to join this data with the data of the users that enabled other features */



-- Data of the 14 users who tracked heart rate:
-- average heart rate per day for each user
SELECT
    A.ID,
    CAST(A.ActivityHour AS DATE) AS date,
    AVG(B.Value) AS daily_avg_hr
--INTO daily_heartrate
FROM
    Bellabeat.dbo.hourly_activity AS A
INNER JOIN
    Bellabeat.dbo.heartrate_seconds AS B
ON
    A.ID = B.ID
    AND CAST(A.ActivityHour AS DATE) = CAST(B.Time AS DATE)
GROUP BY
    A.ID,
    CAST(A.ActivityHour AS DATE)
ORDER BY
    A.ID,
    CAST(A.ActivityHour AS DATE);

-- Now we join both tables together
USE Bellabeat;
SELECT dau.Id,
	   dau.date,
	   hr.daily_avg_hr 
--INTO daily_activity_hr_data
FROM Bellabeat.dbo.daily_heartrate AS hr
INNER JOIN Bellabeat.dbo.daily_activity_updated AS dau
ON dau.Id = hr.ID
AND dau.date = hr.date
ORDER BY Id,date;

-- JOIN SleepDay aggregates with daily_activity_updated
SELECT
    dau.id,
	dau.date,
    CAST(ROUND((sa.TotalMinutesAsleep / 60.0), 2) AS FLOAT) AS Hours_Slept,
    CAST(ROUND((sa.TotalTimeInBed / 60.0), 2) AS FLOAT) AS Timein_Bed
--INTO daily_activity_sleep_data
FROM
    Bellabeat.dbo.daily_activity_updated AS dau
INNER JOIN Bellabeat.dbo.sleepDay AS sa
ON dau.Id = sa.id
AND dau.date = sa.days_tracked 
ORDER BY Id,date;

USE Bellabeat;
-- Users who enabled sleep and heart
SELECT 
    hr.*,
    sl.Hours_Slept,
    sl.Timein_Bed
    
--INTO daily_activity_sleep_hr_data
FROM 
    Bellabeat.dbo.daily_activity_hr_data hr
INNER JOIN 
    Bellabeat.dbo.daily_activity_sleep_data sl
ON 
    hr.Id = sl.Id
    AND hr.date = sl.date;
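
The aggregation and join logic in the queries above can also be illustrated outside SQL. The sketch below uses made-up readings and hypothetical helper names (`daily_avg_hr`, `inner_join`); it mirrors the idea of averaging second-level heart rate per user and day, then keeping only the user-days present in both tables.

```python
from collections import defaultdict

# Hypothetical second-level heart rate readings: (user id, timestamp, bpm).
heartrate_seconds = [
    (1, "2016-04-12 07:21:00", 60),
    (1, "2016-04-12 07:21:05", 70),
    (1, "2016-04-13 08:00:00", 80),
]

# Hypothetical daily activity rows keyed by (user id, date).
daily_activity = {
    (1, "2016-04-12"): {"Total_Steps": 9000},
    (2, "2016-04-12"): {"Total_Steps": 4000},  # user 2 has no heart rate data
}


def daily_avg_hr(readings):
    """Average the readings per (user, calendar day), like GROUP BY Id, date."""
    grouped = defaultdict(list)
    for user_id, timestamp, bpm in readings:
        grouped[(user_id, timestamp[:10])].append(bpm)  # first 10 chars = date
    return {key: sum(values) / len(values) for key, values in grouped.items()}


def inner_join(left, right_values, right_name):
    """Keep only keys present on both sides, like INNER JOIN ON Id AND date."""
    return {
        key: {**row, right_name: right_values[key]}
        for key, row in left.items()
        if key in right_values
    }


hr = daily_avg_hr(heartrate_seconds)
joined = inner_join(daily_activity, hr, "daily_avg_hr")
```

Because the join is an inner join, users without heart rate data drop out of the result, which is why the joined tables cover fewer users than the activity table.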



#### ● How can you verify that your data is clean and ready to analyze?
I implemented data validation checks to ensure the accuracy and consistency of the data. This included checking for missing values, duplicate records, and data format errors. I also joined tables together using SQL and transformed the data to load it into Power BI for analysis.
Here is my table diagram in Power BI:
![diagram](https://drive.google.com/uc?id=1POazCNc-SZ7OYXthwubkUoFtAzhFZXDS)

Fact tables:
1. daily_activity_updated
2. daily_activity_sleep_data
3. daily_activity_sleep_hr_data

Dim tables:

1. unique_users: Created using DAX in Power BI. It takes the distinct values of ID:
UniqueUsers = VALUES(daily_activity_updated[ID])
2. unique_dates: Also created using DAX, taking the distinct dates:
unique_dates = SUMMARIZE(daily_activity_updated, daily_activity_updated[Date])
3. user_category: Created using the SQL classification query shown later in this section.

### Relationships

1. Many-to-one relationships from the fact tables to unique_users (single direction).
2. Many-to-one relationships from the fact tables to unique_dates (single direction).
3. One-to-one relationship between user_category and unique_users.

#### ● Have you documented your cleaning process so you can review and share those results?
Yes
### Note: The number of users (population) is relatively small.
I discovered that:
1. 33 users track their physical activity.
2. 24 of them also track sleep.
3. 14 of them also track heart rate. (Not included)
4. 12 users track both sleep and heart rate.
5. 3 users track everything, including weight. (Not included in the analysis due to the small sample size)
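
Counts like the ones above come down to set operations on user IDs. The following sketch uses small made-up ID sets (not the real 33 users) to show how overlaps between feature groups can be counted.

```python
# Hypothetical user ID sets per tracked feature (illustrative values only).
activity_users = {1, 2, 3, 4, 5}
sleep_users = {1, 2, 3}
heartrate_users = {2, 3, 4}

# Users who track sleep AND heart rate in addition to physical activity.
sleep_and_hr_users = activity_users & sleep_users & heartrate_users

feature_counts = {
    "activity": len(activity_users),
    "activity + sleep": len(activity_users & sleep_users),
    "activity + heart rate": len(activity_users & heartrate_users),
    "activity + sleep + heart rate": len(sleep_and_hr_users),
}
```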

#### Classifying users into categories based on average steps per day
● Per the following article: https://www.10000steps.org.au/articles/counting-steps/

In [None]:
SELECT
    [Id],
    AVG([Total_Steps]) AS mean_daily_steps,
    AVG([Calories]) AS mean_daily_calories,
    CASE
        WHEN AVG([Total_Steps]) < 5000 THEN 'sedentary'
        WHEN AVG([Total_Steps]) >= 5000 AND AVG([Total_Steps]) < 7500 THEN 'lightly active'
        WHEN AVG([Total_Steps]) >= 7500 AND AVG([Total_Steps]) < 10000 THEN 'fairly active'
        WHEN AVG([Total_Steps]) >= 10000 THEN 'very active'
    END AS user_category
-- INTO user_category
FROM [Bellabeat].[dbo].[daily_activity_updated]
GROUP BY [Id];
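
The thresholds in the CASE expression above can be expressed as a small Python function. The function name `classify_user` and the sample step counts are illustrative assumptions; the cut-offs are the ones used in the query.

```python
from statistics import mean


def classify_user(mean_daily_steps):
    """Map a user's mean daily steps to the categories used in the query."""
    if mean_daily_steps < 5000:
        return "sedentary"
    if mean_daily_steps < 7500:
        return "lightly active"
    if mean_daily_steps < 10000:
        return "fairly active"
    return "very active"


# Hypothetical daily step totals for one user.
daily_steps = [6200, 8100, 7900, 9400]
category = classify_user(mean(daily_steps))
```

Each boundary value falls into the higher category, matching the `>=` comparisons in the SQL.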

# IV. Data Analysis (Analyze phase)

#### ● How should you organize your data to perform analysis on it?
I created a folder and put all the datasets I used in this analysis in it. I then imported these datasets, preprocessed them, and kept them in my MS SQL Server database called Bellabeat, as seen above.
#### ● Has your data been properly formatted?
My data has been properly formatted and preprocessed in MS SQL Server.



### Hypothesis:

Total intensity gets higher as users enable more features.



##### Users tracking physical activity(all 33 users) : daily_activity_updated:

1. Id: User ID or identifier associated with the data record.
2. Date: The date for which the data is recorded.
3. Total_Steps: Total number of steps taken on the specified date.
4. Total_Distance: Total distance covered (e.g., in miles or kilometers) on the specified date.
5. Logged_Activities_Distance: Distance covered during logged activities on the specified date.
6. Very_Active_Distance: Distance covered during very active activities on the specified date.
7. Moderately_Active_Distance: Distance covered during moderately active activities on the specified date.
8. Light_Active_Distance: Distance covered during light active activities on the specified date.
9. Sedentary_Active_Distance: Distance covered during sedentary (inactive) activities on the specified date.
10. Very_Active_Minutes: Total minutes spent in very active activities on the specified date.
11. Fairly_Active_Minutes: Total minutes spent in fairly active activities on the specified date.
12. Light_Active_Minutes: Total minutes spent in light active activities on the specified date.
13. SedentaryMinutes: Total minutes spent in sedentary (inactive) activities on the specified date.
14. Calories: Total calories burned on the specified date.
15. daily_avg_intensity: Daily average intensity level, possibly related to activity or exercise.
16. daily_total_intensity: Daily total intensity level, possibly related to activity or exercise.
17. daily_MET: MET (Metabolic Equivalent of Task) value associated with daily activity.



### Physical intensity analysis: 
#### Segmentation by Activity Level analysis (user_category):
![download](https://drive.google.com/uc?id=1qP0YGbR2q0mg0sJ3j8lJICTh2Uv6vHLP)

1. 8 out of 33 users (24%) are sedentary.
2. 9 out of 33 users (27%) are lightly active.
3. 9 out of 33 users (27%) are fairly active.
4. 7 out of 33 users (21%) are very active.
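
The percentages above follow directly from the segment counts. A short check in Python, using only the counts stated in the list:

```python
# Segment sizes as reported in the list above.
segment_counts = {
    "sedentary": 8,
    "lightly active": 9,
    "fairly active": 9,
    "very active": 7,
}

total_users = sum(segment_counts.values())

# Share of each segment, rounded to whole percent.
segment_share = {
    name: round(100 * count / total_users)
    for name, count in segment_counts.items()
}

# Combined share of the two most active segments.
active_share = round(
    100 * (segment_counts["fairly active"] + segment_counts["very active"]) / total_users
)
```

Because each share is rounded separately, the rounded values add up to 99% rather than 100%.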




##### Steps & Distance 

![download(2)](https://drive.google.com/uc?id=1LdxKV20A5Yz4QYMcGUN-CedKFgED_K2F)


![download(3).png](https://drive.google.com/uc?id=16VnzFsbUiLdyU9Eba0kBHCWkqAcjUpxq)


● We find a strong positive correlation between steps and distance across user categories.

● Very active users tend to have the highest average total distance and steps.




##### MET and total intensity

![download(4).png](https://drive.google.com/uc?id=1Ewz3JkEchqxZQOFF1vUettaRYeGk8IqA)

![download5.png](https://drive.google.com/uc?id=1EDQTXksB2VyQ0D-GpEKdtq9ChrmUcZAH)


● We find a strong positive correlation between MET and intensity across user categories.

● Very active users tend to have the highest MET and intensity averages.


#### Intensity and Calories

![download(6).png](https://drive.google.com/uc?id=1_yITKsNE6NCcf5IaifTQfx3_8Xpn1-EU)


● We find a strong positive correlation between intensity and calories burned.

● Average calories burned are also highest for very active users.

# Share phase -1
### Recommendations :

● Segmentation Focus: Target marketing efforts towards the very active and fairly active user categories, as together they make up nearly half of users (48%) and are more likely to engage.

● Promote Activity Tracking: Emphasize the importance of tracking steps and distances, as there's a strong correlation. Encourage users to set and achieve distance-based goals.

● Highlight MET and Intensity: Promote MET and intensity tracking, especially for highly active users, to emphasize the health benefits of their activity levels.

● Calorie Burn Awareness: Educate users about the relationship between intensity and calories burned, and encourage them to stay active for better calorie management.

● Engage Sedentary Users: Develop targeted campaigns to engage sedentary users and motivate them to increase their activity levels gradually.

● Personalized Content: Use user data to deliver personalized content and recommendations based on their activity level and goals.

● Incentivize Very Active Users: Reward very active users to maintain their high activity levels and serve as brand ambassadors or influencers.

● Community Building: Create a community or social platform where users can share their achievements, encouraging friendly competition and motivation across all user categories.


### Trends over time analysis:

Checking possible patterns and trends.

![download(7).png](https://drive.google.com/uc?id=1dzn6EgiD3QCpW8OCNsDRh-GWv9XKLS30)

● In the first 3 weeks, Saturday had the highest average intensity.

● We notice a big drop in average intensity on the last day the data was recorded (Thursday, May 12).

To investigate this further, let's look at the average steps by date.

![download(8)](https://drive.google.com/uc?id=16Y5cWU9SFJSkHjZGXbqHB1gtjZobIvAS)


● The high intensity on the first 3 Saturdays is explained by the average number of steps taken.

● We notice that Thursday, May 12 also has low average steps, resulting in lower average intensity levels.
Let's check why!

To understand this better, I first searched the internet for anything that happened on this date, but nothing came up. Since we do not have any demographic information about the users, common sense led me to investigate the data further!

![download(9)](https://drive.google.com/uc?id=1BIRDOFbUnsxuMTKeJbDZofLG_5X1OI0J)



● We notice that on that Thursday, the number of users still tracking their data dropped to 19 in total.

● We also notice that user retention kept decreasing (more users stopped tracking over time).

● Fairly active users were the most stable, with only 2 of them stopping tracking, while lightly active users went from 9 users to 3 (the largest drop).

● Very active and sedentary users also saw large drops in retention, of similar size to each other.
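
Retention figures of this kind can be obtained by counting distinct users per date. The sketch below uses made-up tracking records and a hypothetical helper name; it is not the query used to build the chart.

```python
from collections import defaultdict

# Hypothetical tracking records: (user id, date on which data was logged).
records = [
    (1, "2016-05-10"),
    (2, "2016-05-10"),
    (1, "2016-05-11"),
    (1, "2016-05-11"),  # a repeated log for the same user and day
]


def active_users_per_date(rows):
    """Count distinct users that logged data on each date."""
    users_by_date = defaultdict(set)
    for user_id, day in rows:
        users_by_date[day].add(user_id)  # a set ignores repeated logs
    return {day: len(ids) for day, ids in sorted(users_by_date.items())}


retention = active_users_per_date(records)
```

A falling count over consecutive dates is what shows up as a drop in retention.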



#### Physical intensity features by date:

![image-2.png](https://drive.google.com/uc?id=1-t0PsZAk0nEYq0w0gXT6oJx7gg9WonNL)

● The daily averages of the different activity tracking features show a decreasing trend as time goes on.

![image-2.png](https://drive.google.com/uc?id=13l5XNoenmr4vm2Nfuv_-p3a1t1XqtTJM)


● Here we see that most of the activity in our data comes from the fairly active category, followed by the lightly active category, simply because most users fall into these two categories.

● Sedentary users account for the lowest total intensity in both months.

● Total intensity for all categories, including sedentary users, was higher in April than in May.

![image-2.png](https://drive.google.com/uc?id=1ByU7vNLJ2v84H5xkdIsUVcn5acMfQ2pE)


● The decreasing averages over time and the large difference in total intensity between the two months are also explained by more people recording their activity in April than in May, across all categories.

![image.png](https://drive.google.com/uc?id=1h1bDMEBtWeNIzqsQj6PRE6rINqjfsYEy)



● We notice minimal activity between 6 and 11 AM; users become more active from noon until 3 PM and are most active between 5 and 7 PM.

● Saturday has the highest average total steps, followed by Tuesday.
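
A day-of-week comparison like the one above can be sketched by grouping daily step totals by weekday. The dates and step counts below are made up for illustration, and `avg_steps_by_weekday` is a hypothetical helper name.

```python
from collections import defaultdict
from datetime import date

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"]

# Hypothetical daily step totals (illustrative values only).
daily_steps = [
    (date(2016, 4, 16), 12000),  # Saturday
    (date(2016, 4, 23), 10000),  # Saturday
    (date(2016, 4, 19), 9000),   # Tuesday
    (date(2016, 4, 20), 6000),   # Wednesday
]


def avg_steps_by_weekday(rows):
    """Average the step totals for each day of the week."""
    grouped = defaultdict(list)
    for day, steps in rows:
        grouped[WEEKDAYS[day.weekday()]].append(steps)  # weekday(): Monday == 0
    return {name: sum(values) / len(values) for name, values in grouped.items()}


weekday_avg = avg_steps_by_weekday(daily_steps)
busiest_day = max(weekday_avg, key=weekday_avg.get)
```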

# Share phase - 2
### Recommendations :

● Thursday Recovery Strategy: Create a "Thursday Recovery" campaign to address the significant drop in average intensity and user retention on Thursday, May 12. Offer incentives or content to motivate users to stay active on Thursdays.

● User Retention Strategies: Develop targeted strategies to improve user retention. This includes personalized notifications, rewards, or content recommendations to keep users engaged and motivated. Analyze user drop-off points and implement strategies to address them.

● Encourage More Data Tracking in May: Launch a campaign to motivate users to continue tracking their data in May, addressing the observed decline in data recording between April and May. Offer incentives or challenges to keep users engaged.

● Seasonal Campaign: Run a seasonal campaign emphasizing the benefits of staying active in May compared to April. Use data to illustrate the advantages of maintaining fitness levels throughout the year.

● In-App Features: Develop in-app features or notifications that motivate users to set goals and monitor their progress. Gamify the experience to make it more engaging and encourage continued use.

● Feedback and Survey: Collect feedback from users who stopped tracking their data and implement changes based on their input. Understand why users are dropping off and address their concerns to improve retention.

● Promote Midday and Evening Engagement:

Capitalize on the observation that users tend to be more active around noon until 3 PM and are most active between 5 and 7 PM. Create marketing campaigns, notifications, or challenges specifically designed to encourage and motivate user activity during these time slots. Offer rewards or incentives for reaching activity goals during these periods.

● Target Inactive Mornings:

Given that there is minimal activity between 6 and 11 AM, consider running campaigns that target users during this time frame to encourage them to start their day with physical activity. Provide content or challenges to motivate users to be more active in the morning.

● High-Activity Day Focus:

Leverage the insight that Saturday and Tuesday have the highest average total steps. Focus marketing efforts on these two days. Offer special promotions, challenges, or content that encourage users to be more active during these days.

● Weekend Campaigns:

On Saturdays, when user activity is at its peak, launch special weekend campaigns to boost user engagement. Provide exclusive weekend challenges or rewards to keep users active and engaged during the weekend.

These new recommendations focus on addressing specific issues related to the drop in user engagement and retention, encouraging continued data tracking, and making the app experience more engaging.








### Sleep analysis:

![image-3.png](https://drive.google.com/uc?id=15zDn64UmcvFcyjK0bIQT6WNKasMOP2rh)


● 24 of 33 users have enabled sleep tracking: 8 fairly active, 7 sedentary, 5 very active, and 4 lightly active.


![image-3.png](https://drive.google.com/uc?id=1syYkeT1IdozqpfhC46-VcaM-j_WKOG3R)



● Despite being the smallest group, lightly active users have the highest daily sleep average, and very active users have the lowest.


![image.png](https://drive.google.com/uc?id=1RIQtClhN6MtOtGLlwLttOMGccqTTs6dL)


● The high average on April 24 is explained by the absence of sedentary users' data points on that date, combined with a spike for both very active and fairly active users.

![image.png](https://drive.google.com/uc?id=1sHXoroXPKdjG3u-J5w5WoUK5jS02hF7Z)

● Fairly active users tracked their sleep the longest and at the most stable rate (user retention).


![image.png](https://drive.google.com/uc?id=1mm40t9D9McU-W_xQvvdCElTMg45_lv85)


● The average time to fall asleep is highest for very active users at 1 h 20 min, while fairly active users have the lowest average at 48 minutes.


![image-2.png](https://drive.google.com/uc?id=1WEK40z9ve8FoB973j9TnFjMLXt03XkFK)


● We notice that the average hours slept increases on days right after a spike in physical intensity.

![image-2.png](https://drive.google.com/uc?id=12DqJWSeMIHvtb1l1OLtaLc3HukEsamqN)


● There is a moderate negative correlation of -0.48 between hours slept and total steps: as total steps on a given day increase, average hours slept decrease.

● After analysing these results and looking at the very active users' data, I conclude that those who get the highest average sleep take less time to fall asleep. I also conclude that the highly active users in this sample are not healthy: despite a spike in sleep after a day of high average intensity, users who slept less had higher intensity averages.
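
For reference, a correlation coefficient like the one reported above is typically the Pearson coefficient. The sketch below assumes that definition (the report does not state which coefficient the tool computed) and uses made-up values, so it does not reproduce the -0.48 figure.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation: covariance divided by the product of the spreads."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equally long series with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical daily totals: more steps paired with fewer hours slept.
total_steps = [3000, 6000, 9000, 12000]
hours_slept = [8.0, 7.5, 7.8, 6.5]
r = pearson_r(total_steps, hours_slept)
```

A value near -1 or +1 indicates a strong linear relationship, while values around ±0.5 are usually described as moderate.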

# Share phase - 3
### Recommendations:

● Targeted Sleep Analysis Campaigns:

Create separate campaigns for different user activity level groups.
Tailor messages and incentives to encourage better sleep tracking and quality based on each group's needs.

● Promote Healthy Sleep Habits:

Develop content and features to educate users about healthy sleep habits.
Emphasize factors like consistent sleep patterns, adequate sleep duration, and an ideal sleep environment.

● Enhance Data Collection:

Simplify the process of logging sleep data to improve data collection.
Implement reminders and notifications to prompt users to update their sleep information regularly.

● Sleep-Activity Balance:

Educate users on balancing physical activity and sleep.
Provide guidance on optimizing physical activity without compromising sleep quality.

● User Insights for Health Improvement:

Provide personalized health insights based on users' sleep and activity data.
Offer recommendations to improve overall health, considering activity levels, sleep duration, and efficiency.

● Goal-Oriented Sleep Tracking:

Allow users to set sleep-related goals and monitor their progress.
Gamify the experience by rewarding users for achieving their sleep goals.

● Correlation Awareness:

Educate users about the observed negative correlation between steps and sleep duration.
Help users understand how to balance physical activity and rest for better health.

● User-Specific Insights:

Offer personalized insights based on user data, such as tips for falling asleep quickly or improving sleep quality.

● Health Assessment for Highly Active Users:

Conduct health assessments for highly active users to address potential health concerns.
Provide resources and guidance on maintaining a healthy balance between fitness and well-being.

● User Engagement Challenges:

Create sleep tracking and healthy habits challenges.
Encourage users to participate in challenges to improve sleep routines and overall health.

These recommendations are designed to cater to different user groups, considering their unique sleep patterns and activity levels, and encourage healthier sleep habits and overall well-being.







#### Sleep and heart analysis:

![image.png](https://drive.google.com/uc?id=1IWU2kssfKGXveaWFVWcN_W2kgDmGX68T)


● 5 of the 12 users who tracked both heart rate and sleep fall in the fairly active category, followed by 3 sedentary users.

● Very active and fairly active users had the lowest average heart rates, at 66.84 and 72.16 respectively.

![image.png](https://drive.google.com/uc?id=1gufSLe4YITFca_lnbPT8zJ_RJaGRCmZT)


● Categories with higher average heart rates also had higher average hours slept.

![image.png](https://drive.google.com/uc?id=1j-jHgDDp9v6m17TmMdD4LafoC8zSiKUa)

● Looking at the relationship between heart rate, physical activity, and sleep, we can say that there is a weak negative correlation.

![image.png](https://drive.google.com/uc?id=1du4i79CY_QJ_4yfCNVZS6r1nFpHeKOUt)

● The correlation between average heart rate and total intensity is weakly positive. Sedentary and lightly active users lie closer to the trend line than fairly active users, who show a high error value.

# Share phase - 4
### Recommendations:

● Promote Sleep-Heart Rate Relationship Awareness:

Educate users on the correlation between heart rate, physical activity, and sleep. Create content and resources that explain how heart rate fluctuations may impact sleep quality and provide tips for improving sleep patterns.

● Targeted Engagement for Fairly Active Users:

Since fairly active users had a significant presence among those who tracked both heart rate and sleep, focus on creating content and campaigns that cater to their specific needs. Emphasize the importance of maintaining a balanced heart rate for better sleep quality.

● Highlight Benefits for Sedentary Users:

Given that sedentary users are also among those who tracked heart rate and sleep, develop content and campaigns targeting this group. Showcase how increasing physical activity can positively impact both heart rate and sleep quality.

● Encourage Heart Rate Tracking:

Run campaigns to encourage more users to enable heart rate tracking alongside sleep monitoring. Stress the importance of this combination for a comprehensive understanding of one's health and wellness.

● Personalized Health and Sleep Plans:

Utilize the heart rate data to provide personalized health and sleep improvement plans for users. These plans could offer specific recommendations on how to achieve better sleep and maintain an optimal heart rate.

● User Education on Heart Rate Zones:

Provide information about heart rate zones and how they relate to sleep quality. Help users understand the significance of different heart rate ranges and how to manage them for better rest.

● In-App Heart Rate-Sleep Analytics:

Develop in-app features that allow users to analyze their heart rate data in relation to their sleep patterns. This can provide users with insights into how their activities affect their heart rate and, subsequently, their sleep quality.

● Challenges and Rewards:

Create challenges or reward programs that motivate users to maintain lower and healthier heart rates, especially for very active and fairly active users who had the lowest average heart rates. Offer incentives for reaching heart rate and sleep-related goals.

# Hypothesis result :
![image.png](https://drive.google.com/uc?id=1AoFAa2XUwS1aKrLmYCB9-TEqwapJk_HO)



● In line with my hypothesis, we can deduce that users' average intensity does increase as they enable more features. An ANOVA test could confirm this formally, but I did not run one because the pattern is clearly visible in this analysis.
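
For readers who want to run the ANOVA test mentioned above, here is a minimal sketch of the one-way ANOVA F statistic in plain Python. The group values are made up for illustration and are not results from this analysis; turning F into a p-value additionally needs the F distribution, which a library such as SciPy (`scipy.stats.f_oneway`) provides.

```python
def one_way_anova_f(groups):
    """F = (between-group variance) / (within-group variance)."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [sum(g) / len(g) for g in groups]

    # Variation of the group means around the grand mean.
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )
    # Variation of individual values around their own group mean.
    ss_within = sum(
        (x - m) ** 2 for g, m in zip(groups, group_means) for x in g
    )

    df_between = k - 1
    df_within = n_total - k
    return (ss_between / df_between) / (ss_within / df_within)


# Hypothetical average intensities for three groups of users,
# e.g. by number of enabled features (illustrative values only).
f_stat = one_way_anova_f([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
```

A large F value means the differences between group means are large relative to the spread inside each group.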