# How can a wellness company play it smart?

### About the Project
In this project I analyze data for **Bellabeat**, a high-tech manufacturer of health-focused products for women. The analysis follows six steps: **Ask**, **Prepare**, **Process**, **Analyze**, **Share**, and **Act**. I used *Python* to analyze the data and a *spreadsheet program (Google Sheets)* to visualize it.


### About Bellabeat
*Urška Sršen* and *Sando Mur* founded Bellabeat, a high-tech company that manufactures health-focused smart products.
*Sršen* used her background as an artist to develop beautifully designed technology that informs and inspires women around
the world. Collecting data on activity, sleep, stress, and reproductive health has allowed Bellabeat to empower women with
knowledge about their own health and habits. Since it was founded in 2013, Bellabeat has grown rapidly and quickly
positioned itself as a tech-driven wellness company for women.

![](https://mk0bellabeatcomhqlip.kinstacdn.com/wp-content/uploads/2020/10/bb_31.jpg)

### Business Tasks
I will try to answer these questions:
* What are some trends in smart device usage?
* How could these trends apply to Bellabeat customers?
* How could these trends help influence Bellabeat marketing strategy?

<hr style="height:1px;border-top:2px solid #f00" />

### Let's play with Data!

Loading all the data.

In [None]:
import numpy as np 
import pandas as pd 

import os
for dirname, _, filenames in os.walk('/kaggle/input'):
    for filename in filenames:
        print(os.path.join(dirname, filename))

Next, we load each dataset into its own variable.

In [None]:
daily_activity = pd.read_csv("/kaggle/input/fitbit/Fitabase Data 4.12.16-5.12.16/dailyActivity_merged.csv")
daily_steps = pd.read_csv("/kaggle/input/fitbit/Fitabase Data 4.12.16-5.12.16/dailySteps_merged.csv")
minute_sleep = pd.read_csv("/kaggle/input/fitbit/Fitabase Data 4.12.16-5.12.16/minuteSleep_merged.csv")
heartrate_seconds = pd.read_csv("/kaggle/input/fitbit/Fitabase Data 4.12.16-5.12.16/heartrate_seconds_merged.csv")
daily_intensities = pd.read_csv("/kaggle/input/fitbit/Fitabase Data 4.12.16-5.12.16/dailyIntensities_merged.csv")
sleep_day = pd.read_csv("/kaggle/input/fitbit/Fitabase Data 4.12.16-5.12.16/sleepDay_merged.csv")
weightloginfo = pd.read_csv("/kaggle/input/fitbit/Fitabase Data 4.12.16-5.12.16/weightLogInfo_merged.csv")
dailyCalories = pd.read_csv('/kaggle/input/fitbit/Fitabase Data 4.12.16-5.12.16/dailyCalories_merged.csv')

Checking ***daily_activity*** & ***sleep_day***.

In [None]:
daily_activity.head()

In [None]:
sleep_day.head()

All the **sleep records** are timestamped 12:00:00 AM, with one record per day, so the time part carries no information and we can remove it.

In [None]:
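def new_activity(row):
    # Keep only the date part. A fixed-width slice would leave a trailing
    # space on short dates such as "5/1/2016" and break the join below.
    row.SleepDay = row.SleepDay.split(" ")[0]
    return row
sleep_day = sleep_day.apply(new_activity, axis="columns").rename(columns={"SleepDay": "ActivityDate"})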
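
The date strings in this kind of export vary in length (single- or double-digit month and day), so splitting on the first space is safer than relying on character positions. As a minimal sketch, the same clean-up can also be done without a row-wise `apply`. The frame `toy_sleep` and its values are made up for illustration and are not taken from the dataset.

```python
import pandas as pd

# Hypothetical toy frame with timestamps in the "M/D/YYYY H:MM:SS AM" style.
toy_sleep = pd.DataFrame({
    "Id": [1, 1, 2],
    "SleepDay": [
        "4/12/2016 12:00:00 AM",
        "5/1/2016 12:00:00 AM",
        "4/9/2016 12:00:00 AM",
    ],
})

# Keep only the text before the first space, i.e. the date part.
toy_sleep["ActivityDate"] = toy_sleep["SleepDay"].str.split(" ").str[0]
toy_sleep = toy_sleep.drop(columns="SleepDay")

print(toy_sleep["ActivityDate"].tolist())
# ['4/12/2016', '5/1/2016', '4/9/2016']
```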


## Do total steps have any relation to sleep?
Let's combine the two tables into one. After the join, we drop the rows with null values and keep only two columns, **TotalMinutesAsleep** and **TotalSteps**, from **dailyactivitywithsleep**.

In [None]:
left = daily_activity.set_index(['Id', 'ActivityDate'])
right = sleep_day.set_index(['Id', 'ActivityDate'])
# Left join keeps every activity day; sleep columns are NaN where no sleep was logged.
dailyactivitywithsleep = left.join(right).reset_index()
# Drop duplicates only after Id and ActivityDate are columns again, so that
# identical readings on different days or for different users are kept.
dailyactivitywithsleep = dailyactivitywithsleep.drop_duplicates(keep='first').reset_index(drop=True)
stepsnsleep = dailyactivitywithsleep[['TotalSteps', 'TotalMinutesAsleep']].dropna(subset=['TotalMinutesAsleep'])
stepsnsleep.to_csv("stepsnsleep.csv")  # export for the spreadsheet program

After **visualizing** this data in the spreadsheet program, we get this line graph.

![hi](https://lh3.googleusercontent.com/TfzLApuNk_MXqmJbFoiGnlXCT2T28z0-qma8mmjrM_T9KINdMieMS4s97TLFM0hVogwaH7awi1rG9qJItTrixupyvsw3uz2K9hyzfovk28-yokiiLddtbcDtxp996waNI4S5a0xGpQ=w2400)

Here we can see that users who sleep the longest, approximately **800** minutes per day, take between **0 and 5000** steps. On the other hand, users who take **15000-20000** steps per day sleep the least.
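
A chart reading like this can be backed up with a single number. The sketch below computes the Pearson correlation between the two columns; the helper name `steps_sleep_correlation` and the toy values are assumptions made for illustration, not results from the dataset.

```python
import pandas as pd

def steps_sleep_correlation(df: pd.DataFrame) -> float:
    """Pearson correlation between TotalSteps and TotalMinutesAsleep."""
    return df["TotalSteps"].corr(df["TotalMinutesAsleep"])

# Hypothetical toy values, only to show the call; not real measurements.
toy = pd.DataFrame({
    "TotalSteps": [2000, 6000, 10000, 16000],
    "TotalMinutesAsleep": [800, 500, 450, 300],
})

r = steps_sleep_correlation(toy)
print(round(r, 2))  # negative here: more steps go with less sleep in the toy data
```

On the real data the same call would be `steps_sleep_correlation(stepsnsleep)`. A value near 0 would mean the pattern in the chart is weak.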

## Who walks more?
Next, we analyze how far women in different BMI ranges walk.

In [None]:
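def new_activity(row):
    # Keep only the date part so it matches ActivityDate in the activity table;
    # a fixed-width slice would leave a trailing space on dates like "5/2/2016".
    row.Date = row.Date.split(" ")[0]
    return row

weightloginfo = weightloginfo.apply(new_activity, axis="columns").rename(columns={"Date": "ActivityDate"})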
weightloginfo=weightloginfo[['Id','ActivityDate','BMI']]
left = weightloginfo.set_index(['Id','ActivityDate'])
right = dailyactivitywithsleep.set_index(['Id','ActivityDate'])
n=left.join(right)
stepswithBMI=n.reset_index()[["Id","ActivityDate","BMI","TotalDistance"]].dropna(subset=['TotalDistance'])
meanforbestBMI=stepswithBMI.loc[(stepswithBMI.BMI >= 18.5) & (stepswithBMI.BMI <= 24.9)].TotalDistance.mean()
meanforhighBMI=stepswithBMI.loc[(stepswithBMI.BMI > 24.9) & (stepswithBMI.BMI <= 29.9)].TotalDistance.mean()
output=pd.DataFrame({'Best BMI': [meanforbestBMI], 'High BMI': [meanforhighBMI]})
output.to_csv("average_distance_with_BMI.csv")

Here is the resulting visualization.

![](https://lh3.googleusercontent.com/3Xw9nWgQRn6CzSrisT-t7St6dPJc5TzHEp99fJ1uyjhATxOuwoP9ZX3SGkDP6ECLXWLKflWX5z8G-pDc1BuD9SIA0MvVG9QQc8I5gBm3qIFVmyxKMOq00Y30u2jLRjj8oW8p22IUng=w2400)
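
The two BMI bands above can also be built in one step with `pd.cut`, which makes it easier to add more bands later. This is a sketch under assumptions: the helper `mean_distance_by_bmi` and the toy values are hypothetical, and the bin edges simply mirror the 18.5-24.9 and 24.9-29.9 ranges used in this notebook.

```python
import pandas as pd

def mean_distance_by_bmi(df: pd.DataFrame) -> pd.Series:
    """Average TotalDistance per BMI band."""
    bands = pd.cut(
        df["BMI"],
        bins=[18.5, 24.9, 29.9],          # [18.5, 24.9] and (24.9, 29.9]
        labels=["Best BMI", "High BMI"],
        include_lowest=True,              # keep BMI == 18.5 in the first band
    )
    # Rows whose BMI falls outside both bands get no label and are left out.
    return df.groupby(bands, observed=False)["TotalDistance"].mean()

# Hypothetical toy values, not taken from the dataset.
toy_bmi = pd.DataFrame({
    "BMI": [22.0, 23.0, 27.0, 28.0, 35.0],
    "TotalDistance": [4.0, 6.0, 8.0, 10.0, 2.0],
})

result = mean_distance_by_bmi(toy_bmi)
print(result)
```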

## When does heart rate peak?
From the **heartrate_seconds** dataset, we'll calculate the average heart rate for each hour of the day.

In [None]:
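# Extract the hour of day ('00'-'23') from each heart-rate timestamp.
hours = pd.to_datetime(heartrate_seconds['Time']).dt.strftime('%H')
heartrate_seconds['Hour'] = hours
# Average heart rate for each hour, across all users and days.
hourvsvalue = heartrate_seconds.groupby('Hour').Value.mean()
hourvsvalue.to_csv('hourvsheartrate.csv')  # to_csv returns None, so don't reassign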

Have a look at this.

![](https://lh3.googleusercontent.com/ebtwfgXJzUnEjtqKEU-fp54YRxz_-N3NsPagJPSBkaUs2OuzLof613J_XEpJP4m7tIeFsqwIevYWkfWKWfeWOiA1Sqrj-OsgL0RY3nxDDFAtWui5T-tF2QHghE0rJBy0vNpxoYgM5g=w2400)

Heart rate appears to be lowest in the early morning, around **3:30 AM-4:00 AM**, and highest from the **afternoon to the evening**, which suggests that most users exercise or do heavy activity at that time.
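
The lowest and highest hours can also be read directly from the hourly averages instead of from the chart. A minimal sketch, assuming a Series indexed by hour like `hourvsvalue`; the helper name and the toy numbers are made up for illustration.

```python
import pandas as pd

def quietest_and_busiest_hour(hourly_mean: pd.Series) -> tuple:
    """Return the index labels of the lowest and highest hourly averages."""
    return hourly_mean.idxmin(), hourly_mean.idxmax()

# Hypothetical hourly averages, not taken from the dataset.
toy_hourly = pd.Series(
    [60.0, 58.0, 75.0, 82.0],
    index=["00", "03", "12", "18"],
    name="Value",
)

low, high = quietest_and_busiest_hour(toy_hourly)
print(low, high)  # 03 18
```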

## Now, let's play with Active minutes
Here we can see how women spend their active time, broken down by intensity.

In [None]:
# Total active time per day is the sum of the three intensity levels.
daily_intensities['TotalActiveMinutes'] = (
    daily_intensities['LightlyActiveMinutes']
    + daily_intensities['FairlyActiveMinutes']
    + daily_intensities['VeryActiveMinutes']
)
avg_total = daily_intensities['TotalActiveMinutes'].mean()
# Share of the average active time spent at each intensity level, in percent.
prcntg_lightly = (daily_intensities['LightlyActiveMinutes'].mean() / avg_total) * 100
prntg_fairly = (daily_intensities['FairlyActiveMinutes'].mean() / avg_total) * 100
prntg_very = (daily_intensities['VeryActiveMinutes'].mean() / avg_total) * 100
prcntg_lightly, prntg_fairly, prntg_very

![](https://lh3.googleusercontent.com/HgwdzIgnPUFZ8IOjSLrvpmyzk7QPtpv3wz4K0qex7GV9uk6-o2QS2LAgmEIUT6h8v0BbXVJnJEI5DmHO68EjMHzM3EoD6S5szGNSWlk8cGuJAkgpnQprPLSO-BwqrEpyo16DgqdEWQ=w2400)

Most of the time, activity is *light*.
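
The three percentages can also be computed in one expression, because the mean of the summed column equals the sum of the three column means. This is a sketch, not the method used above; `activity_share` and the toy values are hypothetical.

```python
import pandas as pd

ACTIVE_COLUMNS = ["LightlyActiveMinutes", "FairlyActiveMinutes", "VeryActiveMinutes"]

def activity_share(df: pd.DataFrame) -> pd.Series:
    """Percentage of average active time spent at each intensity level."""
    means = df[ACTIVE_COLUMNS].mean()
    return means / means.sum() * 100

# Hypothetical toy values, not taken from the dataset.
toy_intensity = pd.DataFrame({
    "LightlyActiveMinutes": [140, 180],
    "FairlyActiveMinutes": [10, 30],
    "VeryActiveMinutes": [20, 20],
})

share = activity_share(toy_intensity)
print(share.round(1))
```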

## On which day do women burn the most calories?

In [None]:
dailyCalories['ActivityDay'] = pd.to_datetime(dailyCalories['ActivityDay'])
dailyCalories['dayOfWeek'] = dailyCalories['ActivityDay'].dt.day_name()
# Average calories burned per weekday, exported for the spreadsheet program.
Calory_day = dailyCalories.groupby('dayOfWeek').Calories.mean()
Calory_day.to_csv("Calory_per_day.csv")

Look at this bar chart.

![](https://lh3.googleusercontent.com/QEPLSft9EmD1-3avdiXOQM_Odb-NqPuqgYGzHHBAOLvQQIVzffHhL8JLfDtp5LinmF5xODTbXcsJA0T_mFNOX_8JXuZDGM-nqr22fTNwtu5Yl-xY0Z7LmZy7ZXOBoI7-Bf4atQQukg=w2400)

Calories burned are lowest on ***Thursday*** and highest on ***Tuesday***.
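
One practical detail: `groupby` sorts weekday names alphabetically, so the exported rows start with Friday. The sketch below puts the averages in calendar order with `reindex`. The helper `mean_calories_by_weekday`, the `WEEKDAYS` list, and the toy values are assumptions for illustration.

```python
import pandas as pd

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]

def mean_calories_by_weekday(df: pd.DataFrame) -> pd.Series:
    """Average Calories per weekday, ordered Monday to Sunday."""
    days = pd.to_datetime(df["ActivityDay"]).dt.day_name()
    # Weekdays with no rows show up as NaN after reindexing.
    return df.groupby(days)["Calories"].mean().reindex(WEEKDAYS)

# Hypothetical toy values, not taken from the dataset.
toy_calories = pd.DataFrame({
    "ActivityDay": ["4/12/2016", "4/14/2016", "4/19/2016"],
    "Calories": [2000, 1800, 2200],
})

by_day = mean_calories_by_weekday(toy_calories)
print(by_day)
```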

<hr style="height:1px;border-top:2px solid #f00" />

## Insights for Marketing Strategy
The following data-driven insights can help the marketing team:
* According to the [Sleep Foundation](https://www.sleepfoundation.org/women-sleep), natural sleep time for an adult woman should be at least **500** minutes. In our ***TotalMinutesAsleep Vs TotalSteps*** viz, women who take 5000-10000 steps per day appear to have a good sleep record. Walking close to 10000 steps a day is also a standard for women recommended by the [Centers for Disease Control and Prevention (CDC)](https://www.cdc.gov/diabetes/prevention/pdf/postcurriculum_session8.pdf). So we can tell non-users that Bellabeat's products are a good choice for building healthy walking and sleeping habits.
* From the [American Cancer Society](https://www.cancer.org/cancer/cancer-causes/diet-physical-activity/body-weight-and-cancer-risk/adult-bmi.html)'s website, we know that the ideal BMI range for adults is 18.5-24.9, and a BMI over 24.9 indicates overweight. In our ***Distance Vs BMI*** viz, overweight women walk a longer distance on average, which may mean they are more motivated to lose weight by walking. So Bellabeat products can be positioned as a way for overweight women to stay motivated while working towards a healthy weight.
* Suppose a user follows a diet plan that prescribes how much time she should spend walking, and her instructor says that during a fifty-minute walk she should do more fairly active exercise. She can use Bellabeat products to follow such instructions, as we saw in the ***Type of Activity*** phase.
* A moderate heart rate is generally desirable, and too much exercise or activity can be bad for a user. Heart rate also slows down during sleep, as we saw in the ***Heart rate Vs Hour*** graph. So Bellabeat products can be a good assistant for deciding when to sleep and for measuring the type of activity.
* The ***Calories Vs Days*** graph shows the good trend of burning calories on weekdays as well as on weekends. So when a user is busy with her daily activities, reminders from Bellabeat products can help her keep burning calories.

### And that's it!