<h2>The mission statement</h2>

<h4>Bellabeat, a high-tech company that manufactures health-focused smart products, wants to analyse how one of its products is used in order to gain insight into how people already use their smart devices. Using this information, the company would like high-level recommendations for how these trends can inform Bellabeat's marketing strategy.</h4>

## PHASE 1: ASK

### The Key Objectives:

**1. Identify the business task:**
* Analyse how customers use their fitness smart devices so that the organization can better target its marketing to their needs, then make high-level recommendations for how these usage trends can inform Bellabeat's marketing approach.

**2. Consider key stakeholders:**
* The main stakeholders are Urška Sršen, Bellabeat’s co-founder and Chief Creative Officer; Sando Mur, mathematician and Bellabeat’s co-founder; and the rest of the Bellabeat marketing analytics team.

**3. Deliverables (the business task):**
* The business task is to look for usage patterns in smart-device data in order to gain knowledge that can guide marketing decisions. The overarching question is "How do our users use smart devices?", which breaks down into:
1. What are some trends in smart device usage?
2. How could these trends apply to Bellabeat customers?
3. How could these trends help influence Bellabeat’s marketing strategy?


## PHASE 2: PREPARE

### The Key Objectives:

**1. Determine the credibility of the data:**

* This data set contains personal fitness tracker data from thirty Fitbit users. These thirty eligible users consented to the submission of their personal tracker data, including minute-level output for physical activity, heart rate, and sleep monitoring. The data can be used to explore users’ habits.

**2. Data organization**
* The data set comprises daily activity, daily calories, daily steps, heart rate, hourly steps, hourly calories, hourly intensities, minute calories, minute intensities, minute sleep, daily sleep, and weight log tables for 30 Fitbit users. The tables are stored as CSV files.

**3. Sort and filter the data:**
* As my analysis focuses on identifying high-level usage trends, I will concentrate on the daily-level data. The daily activity and sleep tables interest me most because they are likely to reveal meaningful trends; to conduct my study, I will need to combine some tables.

In [None]:
#Importing libraries
library(tidyverse)

In [None]:
#Importing the data
activity <- read.csv("/kaggle/input/fitbit/Fitabase Data 4.12.16-5.12.16/dailyActivity_merged.csv")
sleep <- read.csv("/kaggle/input/fitbit/Fitabase Data 4.12.16-5.12.16/sleepDay_merged.csv")
weight <- read.csv("/kaggle/input/fitbit/Fitabase Data 4.12.16-5.12.16/weightLogInfo_merged.csv")
calories <- read.csv("/kaggle/input/fitbit/Fitabase Data 4.12.16-5.12.16/dailyCalories_merged.csv")

In [None]:
#looking at the data with head and colnames function
head(activity)
colnames(activity)

head(sleep)
colnames(sleep)

head(weight)
colnames(weight)

head(calories)
colnames(calories)

In [None]:
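# Rename each date column to ActivityDate and keep only the date part,
# so the tables can later be joined on Id and ActivityDate.
# extra = "drop" discards any pieces after the second without a warning.
sleep2 <- sleep %>%
  rename(ActivityDate = SleepDay) %>%
  separate(ActivityDate, c("ActivityDate", "Time"), sep = " ", extra = "drop") %>%
  select(-"Time")

weight2 <- weight %>%
  rename(ActivityDate = Date) %>%
  separate(ActivityDate, c("ActivityDate", "Time"), sep = " ", extra = "drop") %>%
  select(-"Time", -"LogId")

# The daily calories table only needs its date column renamed
calories2 <- calories %>%
  rename(ActivityDate = ActivityDay)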


In [None]:
# Let's take a peek at the data
head(activity)
head(sleep2)
head(weight2)
head(calories2)

In [None]:
#Mean of the activity minutes
mean(activity$VeryActiveMinutes)
mean(activity$FairlyActiveMinutes)
mean(activity$LightlyActiveMinutes)
mean(activity$SedentaryMinutes)

In [None]:
# Understanding the summary statistics

##Number of distinct users
n_distinct(activity$Id)
n_distinct(sleep2$Id)
n_distinct(weight2$Id)
n_distinct(calories$Id)

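Counting distinct users is one credibility check; duplicated rows and missing values are another. The following is a minimal sketch on a made-up table (`toy_sleep` and its values are illustrative assumptions, not taken from the Fitbit files). The same calls could be applied to `sleep2`, `weight2`, and the other tables.

```r
library(dplyr)

# Toy stand-in for a daily sleep table (made-up values for illustration only)
toy_sleep <- data.frame(
  Id = c(1, 1, 1, 2),
  ActivityDate = c("4/12/2016", "4/12/2016", "4/13/2016", "4/12/2016"),
  TotalMinutesAsleep = c(327, 327, 384, NA)
)

n_users   <- n_distinct(toy_sleep$Id)     # number of distinct users
n_dupes   <- sum(duplicated(toy_sleep))   # rows that exactly repeat an earlier row
n_missing <- sum(is.na(toy_sleep))        # missing cells across all columns

# Drop exact duplicate rows before any further analysis
toy_sleep_clean <- distinct(toy_sleep)
```
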

### Statistical Summary of the data

#### 1. Daily Activity

In [None]:
activity %>%
    select(TotalSteps,
           TotalDistance,
           VeryActiveDistance,
           ModeratelyActiveDistance,
           LightActiveDistance,
           SedentaryActiveDistance) %>%
    summary()

#### 2. Daily Sleep

In [None]:
sleep2 %>%
    select(TotalSleepRecords,
           TotalMinutesAsleep,
           TotalTimeInBed) %>%
    summary()

#### 3. Daily Calories

In [None]:
calories2 %>%
    select(ActivityDate,
           Calories) %>%
    summary()

#### 4. Weight Log

In [None]:
weight2 %>%
    select(WeightKg,
           BMI,
           IsManualReport) %>%
    summary()

### Key Findings

* Average total steps: 7,638; average total distance: 5.490 km (roughly 5.5 km for 7,638 steps fits a typical stride length, so the distance column appears to be in kilometers rather than miles); average calories burned: 2,304.
* Average time asleep: 419.5 minutes; average time in bed: 458.6 minutes.
* Average weight: 72.04 kg.


## PHASE 3: PROCESS

* The main focus of my analysis will, in my opinion, be on the calories burned by activity type and user as well as the relationship between activity and sleep quality. In order to do that, I'll make some new summarised tables in which I'll group certain data points into easier-to-understand categories for the analysis.

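One way to group data points into easier-to-understand categories is to label each user by average daily steps. The sketch below uses made-up rows, and the cut-offs (5,000 and 10,000 steps) and the `user_type` labels are hypothetical choices for illustration, not values derived from this data set.

```r
library(dplyr)

# Toy daily activity rows (made-up values; column names follow dailyActivity_merged.csv)
toy_activity <- data.frame(
  Id = c(1, 1, 2, 2, 3, 3),
  TotalSteps = c(3000, 4000, 8000, 9000, 12000, 14000),
  Calories = c(1800, 1900, 2200, 2300, 2800, 3000)
)

# Summarise per user, then assign a category from the mean daily steps.
# The thresholds below are assumptions chosen only for this example.
user_summary <- toy_activity %>%
  group_by(Id) %>%
  summarise(mean_steps = mean(TotalSteps),
            mean_calories = mean(Calories),
            .groups = "drop") %>%
  mutate(user_type = case_when(
    mean_steps < 5000  ~ "Sedentary",
    mean_steps < 10000 ~ "Lightly active",
    TRUE               ~ "Very active"
  ))
```

A table like `user_summary` can then be used to compare average calories burned across user types.
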


### Merging the datasets with a Left Join and Inner Join

In [None]:
activity_sleep <- left_join(activity,sleep2,by = c("Id","ActivityDate"))
head(activity_sleep)

In [None]:
calories_weight <- left_join(calories2,weight2,by = c("Id","ActivityDate"))
head(calories_weight)

### Viewing the combined data

In [None]:
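# Join on Id and ActivityDate so that each row is one user-day; Calories is in
# both tables, so it comes through as Calories.x and Calories.y.
finaldata <- inner_join(activity_sleep, calories_weight, by = c("Id", "ActivityDate"))
View(finaldata)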

In [None]:
str(finaldata)

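The choice of join keys determines the size of the combined table. Joining on `Id` alone pairs every day of a user with every other day of the same user, while joining on `Id` and `ActivityDate` keeps one row per user-day. A small sketch with made-up tables (`left` and `right` are illustrative names) shows the difference:

```r
library(dplyr)

# Two toy tables with one row per user per day (made-up values)
left <- data.frame(
  Id = c(1, 1, 2, 2),
  ActivityDate = c("4/12/2016", "4/13/2016", "4/12/2016", "4/13/2016"),
  TotalSteps = c(5000, 7000, 9000, 11000)
)
right <- data.frame(
  Id = c(1, 1, 2, 2),
  ActivityDate = c("4/12/2016", "4/13/2016", "4/12/2016", "4/13/2016"),
  WeightKg = c(70, 70.5, 80, 79.5)
)

# Joining on Id only: every day of a user matches every day of that user.
# Recent dplyr versions warn about this many-to-many match, hence suppressWarnings().
by_id_only <- suppressWarnings(inner_join(left, right, by = "Id"))

# Joining on Id and ActivityDate: one row per user-day
by_id_date <- inner_join(left, right, by = c("Id", "ActivityDate"))

rows_id_only <- nrow(by_id_only)
rows_id_date <- nrow(by_id_date)
```

Comparing `nrow()` before and after a join is a quick way to confirm that the result has the expected granularity.
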
## PHASE 4: ANALYZE
* Most users appear to be sedentary or only lightly active. Although these users form the largest group, those who are fairly active and, most notably, very active burn the most calories. That may not come as a surprise, but it supports the idea that activity level is positively correlated with calories burned, which makes it an important component of any weight-loss plan.

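The link between activity and calories burned can be quantified with a correlation coefficient instead of being judged by eye. Below is a minimal sketch using base R's `cor()` and `lm()` on made-up values. The `toy` data frame is an illustrative assumption; with the real data the same calls would take `activity$TotalSteps` and `activity$Calories`.

```r
# Toy daily totals (made-up values for illustration only)
toy <- data.frame(
  TotalSteps = c(2000, 5000, 8000, 11000, 14000),
  Calories   = c(1700, 2000, 2300, 2500, 2900)
)

# Pearson correlation between steps and calories burned
r <- cor(toy$TotalSteps, toy$Calories)

# Simple linear fit: the slope estimates extra calories burned per extra step
fit <- lm(Calories ~ TotalSteps, data = toy)
slope <- unname(coef(fit)["TotalSteps"])
```
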
In [None]:
# Analysing the relationship between steps taken and Sedentary minutes
ggplot(data = activity) +
  geom_point(mapping = aes(x = TotalSteps, y = SedentaryMinutes), colour = "green") +
  labs(title = 'Relationship between Steps Taken and Sedentary Minutes', x = 'Total Steps', y = 'Sedentary Minutes')

In [None]:
# Analysing the relationship between Minutes Asleep and Time in Bed
ggplot (data = sleep)+
  geom_point(mapping = aes(x=TotalMinutesAsleep, y=TotalTimeInBed),color = "blue") +
labs(title = 'Relationship between Minutes Asleep and Time in Bed',x = 'Minutes Asleep', y = 'Minutes in Bed')

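The gap between time in bed and time asleep can also be computed per record rather than read off the scatter plot. The sketch below uses made-up rows; `MinutesAwakeInBed` and `SleepEfficiency` are hypothetical column names introduced here for illustration.

```r
library(dplyr)

# Toy sleep records (made-up values; column names follow sleepDay_merged.csv)
toy_sleep <- data.frame(
  Id = c(1, 1, 2),
  TotalMinutesAsleep = c(400, 420, 300),
  TotalTimeInBed = c(440, 450, 400)
)

sleep_gap <- toy_sleep %>%
  mutate(
    MinutesAwakeInBed = TotalTimeInBed - TotalMinutesAsleep,  # time in bed but not asleep
    SleepEfficiency   = TotalMinutesAsleep / TotalTimeInBed   # share of bed time spent asleep
  )
```
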
In [None]:
# Analysing the relationship between calories burnt and total steps
ggplot (data = activity)+
  geom_point(mapping = aes(x=TotalSteps, y=Calories),color = "yellow") +
labs(title = 'Relationship between calories burnt and total steps',x = 'Total Steps', y = 'Calories Burnt')

In [None]:
# Duration spent by activity in minutes
mean_activity_min <- c(991,192,13,21)
activity_intensity <- c("Sedentary","Light","Fair", "Very Active")
intensity_min <- data.frame(mean_activity_min, activity_intensity)

ggplot(data = intensity_min) + 
geom_col(aes(x = activity_intensity, y = mean_activity_min), fill = "lightblue") +
labs(title = 'Average Minutes by Intensity', x = 'Activity Intensity', y = 'Average Activity Minutes')

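The mean minutes in the chart above are entered by hand. As an alternative, they could be derived from the data frame so that the chart stays in sync with the data. This is a sketch on made-up rows, assuming the four `*Minutes` columns of the daily activity table; with the real data, `toy_activity` would be replaced by `activity`.

```r
library(dplyr)
library(tidyr)

# Toy daily activity rows (made-up values; column names follow dailyActivity_merged.csv)
toy_activity <- data.frame(
  Id = c(1, 2),
  VeryActiveMinutes = c(10, 30),
  FairlyActiveMinutes = c(5, 15),
  LightlyActiveMinutes = c(180, 220),
  SedentaryMinutes = c(1000, 900)
)

# Average each intensity column, then reshape to one row per intensity level
intensity_min <- toy_activity %>%
  select(ends_with("Minutes")) %>%
  summarise(across(everything(), mean)) %>%
  pivot_longer(everything(),
               names_to = "activity_intensity",
               values_to = "mean_activity_min")
```

The resulting `intensity_min` has the same two columns as the hand-built data frame, so it can be passed to the same `ggplot()` call.
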
## PHASE 5: SHARE

### Key objectives:

**Share my conclusions:**

   **1)** Users typically spend more time in bed than they spend asleep. To help keep them more active, consider adding an app feature that sends an alert when they stay in bed later than usual (this may also benefit their mental health).

   **2)** A feature could direct users' attention to the activity they should perform given their calorie consumption. For instance, many individuals who consume 3,000 calories per day take, on average, fewer than 1,000 steps.

   **3)** An app feature could help users track their activity intensity over time and advise them to focus on whichever intensity level they are doing less of than the others, so that they maintain a balanced mix of activity.
