
<font color="green">*To start working on this notebook, or any other notebook that we will use in this course, we will need to save our own copy of it. We can do this by clicking File > Save a Copy in Drive. We will then be able to make edits to our own copy of this notebook.*</font>

---

# Implementing Statistics with Functions - Lab

## Introduction

Step into the vibrant arena of fitness analytics, where we don the hats of data analysts for a cutting-edge health and fitness company. Our mission? To sculpt personalized workout plans that seamlessly integrate with the diverse needs of two distinct client groups—Group A, faithful to traditional workouts, and Group B, venturing into the uncharted territory of "HIIT Fusion" fitness.

As we embark on this analytics expedition, we wield the tools of statistical analysis, leveraging the Measures of Central Tendency and Measures of Dispersion acquired in prior lessons. Our focus lies in unraveling the stories told by fitness metrics: average step counts, calories torched per workout, workout duration distributions, and beyond. These insights are the building blocks for crafting workout plans that go beyond the ordinary, shaping a fitness experience that resonates uniquely with each group.

Prepare to infuse your statistical expertise into the dynamic landscape of fitness analytics, where data illuminates the path to tailor-made workout excellence!

## Objectives

By the end of this lab, you will be able to:

- Calculate the mean, median, and mode for a given dataset.
- Interpret the results to understand the central tendencies of the data.
- Explore the range, interquartile range (IQR), variance, and standard deviation calculations.
- Apply measures of dispersion to assess and quantify the spread of data.

## Import libraries

In [None]:
import numpy as np

## PulsePrecision Fitness Analytics

Welcome to the PulsePrecision Fitness Analytics Lab, where we, esteemed data analysts, will navigate the intricate landscape of fitness metrics for two distinct client groups. Group A adheres to the tried-and-true routine of traditional workouts, while Group B pioneers the unexplored territory of "HIIT Fusion" fitness. Our mission is to unearth insights that will sculpt personalized workout plans, ensuring an optimal fitness experience for each group.

In [None]:
# Run this cell without changes

# Set random seed for reproducibility
np.random.seed(42)

# Generate fitness data for Group A (Traditional Workouts)
group_a_data = {
  'step_count': np.random.randint(5000, 15000, 50),
  'calories_burned': np.random.randint(100, 600, 50),
  'workout_duration_minutes': np.random.randint(20, 90, 50),
}

# Generate fitness data for Group B (HIIT Fusion)
group_b_data = {
  'step_count': np.random.randint(3000, 12000, 50),
  'calories_burned': np.random.randint(150, 700, 50),
  'workout_duration_minutes': np.random.randint(15, 60, 50),
  'intensity_rating': np.random.randint(1, 10, 50),
  'interval_count': np.random.randint(5, 15, 50),
}
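
The bounds passed to `np.random.randint` in the cell above are half-open: the lower bound is included and the upper bound is excluded. A minimal sketch of that behaviour, using an arbitrary seed and sample size chosen only for illustration (they are not part of the lab data):

```python
import numpy as np

# Same bounds as the intensity_rating draw in the lab cell
low, high = 1, 10

np.random.seed(0)  # arbitrary seed, used only for this illustration
sample = np.random.randint(low, high, 1000)

# The lower bound is inclusive and the upper bound is exclusive,
# so every value lies in 1..9 and 10 is never drawn.
print(sample.min() >= low)   # True
print(sample.max() < high)   # True
```

This is why an intensity rating generated with `randint(1, 10, 50)` can span at most 8 points from lowest to highest.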

In [None]:
# Run this cell without changes
group_a_data

{'step_count': array([12270,  5860, 10390, 10191, 10734, 11265,  5466,  9426, 10578,
        13322,  6685,  5769, 11949,  7433, 10311, 10051, 11420,  6184,
         9555,  8385, 11396, 13666, 14274,  7558, 12849,  7047,  7747,
        14167, 14998,  5189,  7734,  8005,  9658,  6899, 12734,  6267,
         6528,  8556,  8890, 13838, 10393, 13792, 13433, 12513,  7612,
        12041, 14555, 11235, 10486, 12099]),
 'calories_burned': array([554, 527, 363, 530, 134, 305, 180, 519, 149, 459, 487, 101, 489,
        153, 205, 359, 409, 576, 290, 501, 317, 143, 261, 301, 545, 583,
        369, 450, 403, 370, 555, 561, 314, 351, 289, 395, 312, 307, 336,
        437, 466, 152, 379, 509, 316, 351, 287, 479, 592, 140]),
 'workout_duration_minutes': array([48, 34, 64, 84, 28, 20, 27, 82, 30, 27, 54, 54, 52, 24, 60, 47, 26,
        31, 53, 52, 67, 42, 81, 56, 63, 54, 84, 66, 22, 20, 24, 33, 46, 28,
        34, 61, 70, 82, 71, 23, 42, 34, 62, 48, 55, 32, 51, 78, 47, 85])}

In [None]:
# Run this cell without changes
group_b_data

{'step_count': array([ 5693,  6627,  8450,  4663,  8592, 10392,  4306,  9776,  8864,
        10526, 11901,  8575,  8530,  7413,  6748,  3663,  4998, 10994,
         4495,  6304,  6763,  8232,  4853,  9585,  4291,  6581, 10554,
        10280,  4636,  6696,  3698,  7737,  3854, 11164,  8855, 10392,
         9528,  8249,  8172,  4707,  8791,  8535,  7931,  6510,  3202,
         7218, 11958,  7389,  5327, 11004]),
 'calories_burned': array([347, 660, 293, 350, 273, 336, 475, 613, 498, 552, 495, 660, 296,
        297, 638, 487, 622, 300, 564, 447, 412, 293, 495, 151, 453, 403,
        602, 186, 309, 158, 382, 248, 357, 280, 553, 301, 203, 269, 569,
        571, 253, 403, 376, 261, 659, 622, 248, 302, 487, 312]),
 'workout_duration_minutes': array([30, 55, 50, 47, 18, 47, 28, 35, 34, 22, 21, 17, 31, 47, 26, 36, 36,
        44, 52, 52, 59, 22, 41, 41, 48, 35, 44, 47, 42, 47, 19, 33, 18, 49,
        31, 58, 42, 44, 43, 20, 49, 55, 51, 38, 43, 45, 49, 47, 35, 46]),
 'intensity_rating': array([7

## Lab Tasks




**Task 1: Central Tendency for Group A**
- Calculate the mean, median, and mode for the step count in Group A.
- Interpret the results to understand the central tendencies of the step count data.

In [None]:
# np and group_a_data come from the cells above
count_group_a = group_a_data['step_count']

# Calculate mean
mean = np.mean(count_group_a)

# Calculate median
median = np.median(count_group_a)

# Calculate mode: np.bincount counts each non-negative integer and
# argmax returns the first (smallest) value with the highest count.
# Every step count in Group A occurs exactly once, so this reports the
# minimum value rather than a meaningful mode.
mode = np.bincount(count_group_a).argmax()

print(f"The Mean for Group A Step Count is: {mean}")
print(f"The Median for Group A Step Count is: {median}")
print(f"The Mode for Group A Step Count is: {mode}")


The Mean for Group A Step Count is: 10068.06
The Median for Group A Step Count is: 10350.5
The Mode for Group A Step Count is: 5189
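
Because `np.bincount(...).argmax()` reports only the smallest of any tied values, it can hide the fact that a dataset has several modes or none worth naming. One possible way to see every tied value is sketched below. The helper `all_modes` and both sample arrays are hypothetical and are not part of the lab data.

```python
import numpy as np

def all_modes(values):
    """Return every value tied for the highest count, plus that count."""
    unique_values, counts = np.unique(values, return_counts=True)
    top_count = counts.max()
    return unique_values[counts == top_count], int(top_count)

# Small hypothetical samples, not taken from the lab data
repeated = np.array([3, 7, 7, 2, 3, 9])
distinct = np.array([12, 5, 8, 20])

modes, count = all_modes(repeated)
print(modes.tolist(), count)   # [3, 7] 2

modes, count = all_modes(distinct)
print(modes.tolist(), count)   # [5, 8, 12, 20] 1

# For comparison, bincount + argmax reports only the smallest tied value
print(int(np.bincount(distinct).argmax()))   # 5
```

When the top count is 1, every value is tied and the mode tells us little about the center of the data.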


**Task 2: Central Tendency for Group B**
- Calculate the mean, median, and mode for the calories burned in Group B.
- Interpret the results to understand the central tendencies of the calories burned data.


In [None]:
# np and group_b_data come from the cells above

# Calories burned data for Group B
calories_group_b = group_b_data['calories_burned']

# Calculate mean
mean = np.mean(calories_group_b)

# Calculate median
median = np.median(calories_group_b)

# Calculate mode: several values tie at two occurrences each, and
# argmax returns the smallest of them.
mode = np.bincount(calories_group_b).argmax()

print(f"The Mean for Group B Calories Burned is: {mean}")
print(f"The Median for Group B Calories Burned is: {median}")
print(f"The Mode for Group B Calories Burned is: {mode}")


The Mean for Group B Calories Burned is: 406.42
The Median for Group B Calories Burned is: 379.0
The Mode for Group B Calories Burned is: 248
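
For interpretation, the gap between the mean and the median is a quick hint about skew. In the output above the mean (406.42) sits about 27 calories above the median (379.0), which suggests that some high-calorie sessions pull the mean upward. A minimal sketch of that comparison follows. The function `describe_center` and the sample readings are hypothetical, chosen so the effect is easy to see.

```python
import numpy as np

def describe_center(values):
    """Return the mean, the median, and the gap between them."""
    values = np.asarray(values, dtype=float)
    mean = float(values.mean())
    median = float(np.median(values))
    return {
        "mean": mean,
        "median": median,
        # Positive gap: high values pull the mean above the median
        "mean_minus_median": mean - median,
    }

# Hypothetical calorie readings with two unusually large sessions
calories = [300, 310, 320, 330, 650, 700]

summary = describe_center(calories)
print(summary)
# {'mean': 435.0, 'median': 325.0, 'mean_minus_median': 110.0}
```

A gap near zero would instead point to a roughly symmetric distribution, where either measure describes the center well.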


**Task 3: Dispersion Analysis for Group A**
- Explore the range, interquartile range (IQR), variance, and standard deviation for the workout duration in Group A.
- Apply measures of dispersion to assess and quantify the spread of workout duration data.

In [None]:
# np and group_a_data come from the cells above

# Workout duration data for Group A
workout_duration = group_a_data['workout_duration_minutes']

# Calculate range (named duration_range to avoid shadowing the built-in range)
duration_range = np.ptp(workout_duration)

# Calculate interquartile range (IQR): 75th percentile minus 25th percentile
q1, q3 = np.percentile(workout_duration, [25, 75])
iqr = q3 - q1

# Calculate variance (population variance, ddof=0)
variance = np.var(workout_duration)

# Calculate standard deviation (population, ddof=0)
std_dev = np.std(workout_duration)

# Display the results
print(f"Range for Group A Workout Duration is: {duration_range}")
print(f"Interquartile Range (IQR) for Group A Workout Duration is: {iqr}")
print(f"Variance for Group A Workout Duration is: {variance}")
print(f"Standard Deviation for Group A Workout Duration is: {std_dev}")


Range for Group A Workout Duration is: 65
Interquartile Range (IQR) for Group A Workout Duration is: 31.5
Variance for Group A Workout Duration is: 382.3344
Standard Deviation for Group A Workout Duration is: 19.55337311054029
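
Since this lab is about implementing statistics with functions, the four dispersion measures can be wrapped in one reusable helper. The sketch below assumes the conventional definition of the IQR (75th percentile minus 25th percentile) and exposes NumPy's `ddof` argument so we can switch between population (`ddof=0`) and sample (`ddof=1`) variance. The name `dispersion_summary` and the sample durations are hypothetical.

```python
import numpy as np

def dispersion_summary(values, ddof=0):
    """Return range, IQR, variance, and standard deviation for values."""
    values = np.asarray(values, dtype=float)
    q1, q3 = np.percentile(values, [25, 75])
    return {
        "range": float(np.ptp(values)),   # max minus min
        "iqr": float(q3 - q1),            # spread of the middle 50%
        "variance": float(np.var(values, ddof=ddof)),
        "std_dev": float(np.std(values, ddof=ddof)),
    }

# Hypothetical workout durations in minutes
durations = [20, 30, 40, 50, 60]

population = dispersion_summary(durations)        # divides by n
sample = dispersion_summary(durations, ddof=1)    # divides by n - 1

print(population["range"], population["iqr"])       # 40.0 20.0
print(population["variance"], sample["variance"])   # 200.0 250.0
```

`np.var` and `np.std` default to `ddof=0`, so the figures printed in this lab are population statistics. With 50 observations the sample versions would be slightly larger.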


**Task 4: Dispersion Analysis for Group B**
- Explore the range, interquartile range (IQR), variance, and standard deviation for the intensity rating in Group B.
- Apply measures of dispersion to assess and quantify the spread of intensity rating data.

In [None]:
# np and group_b_data come from the cells above

# Intensity rating data for Group B
intensity_rating = group_b_data['intensity_rating']

# Calculate range (named rating_range to avoid shadowing the built-in range)
rating_range = np.ptp(intensity_rating)

# Calculate interquartile range (IQR): 75th percentile minus 25th percentile
q1, q3 = np.percentile(intensity_rating, [25, 75])
iqr = q3 - q1

# Calculate variance (population variance, ddof=0)
variance = np.var(intensity_rating)

# Calculate standard deviation (population, ddof=0)
std_dev = np.std(intensity_rating)

# Display the results
print(f"Range for Group B Intensity Rating is: {rating_range}")
print(f"Interquartile Range (IQR) for Group B Intensity Rating is: {iqr}")
print(f"Variance for Group B Intensity Rating is: {variance}")
print(f"Standard Deviation for Group B Intensity Rating is: {std_dev}")


Range for Group B Intensity Rating is: 8
Variance for Group B Intensity Rating is: 7.0384
Standard Deviation for Group B Intensity Rating is: 2.6529983038064686
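
To see where the quartiles behind an IQR come from, it can help to compute a percentile by hand. The sketch below reproduces the linear interpolation that `np.percentile` uses by default: locate the fractional position `q / 100 * (n - 1)` in the sorted data, then interpolate between the two neighbouring values. The function `percentile_linear` and the sample ratings are hypothetical.

```python
import numpy as np

def percentile_linear(values, q):
    """Percentile by linear interpolation between sorted neighbours."""
    ordered = np.sort(np.asarray(values, dtype=float))
    position = (q / 100) * (len(ordered) - 1)   # fractional index
    lower = int(np.floor(position))
    upper = int(np.ceil(position))
    fraction = position - lower
    return float(ordered[lower] + fraction * (ordered[upper] - ordered[lower]))

# Hypothetical intensity ratings on a 1-9 scale
ratings = [1, 2, 2, 4, 5, 7, 8, 9]

q1 = percentile_linear(ratings, 25)   # position 1.75, between 2 and 2
q3 = percentile_linear(ratings, 75)   # position 5.25, between 7 and 8
print(q1, q3, q3 - q1)                # 2.0 7.25 5.25

# The hand-rolled values agree with NumPy's default method
print(np.isclose(q1, np.percentile(ratings, 25)))   # True
print(np.isclose(q3, np.percentile(ratings, 75)))   # True
```

The IQR is then `q3 - q1`, the width of the interval that holds the middle half of the ratings.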


## Summary

In the PulsePrecision Fitness Analytics Lab, we worked with fitness datasets for two client groups, calculating the mean, median, and mode for step count and calories burned to describe the central tendency of each group. In the dispersion analysis, we computed the range, interquartile range (IQR), variance, and standard deviation to quantify the spread of the workout duration and intensity rating data. Along the way, we practiced our analytical skills and saw how data can inform personalized fitness plans. In the upcoming lesson, we'll look more closely at the relationships between variables using covariance and correlation.