# PROJECT INFORMATION

# **EDA on Gym Members Workouts**

# Exploratory Data Analysis (EDA) on a gym members exercise dataset

## Demographic Analysis

- Examine the age distribution of gym members
- Analyze gender breakdown and any patterns related to gender
- Look at the relationship between age, gender, and other variables
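
A minimal sketch of the demographic checks listed above, assuming the `Age` and `Gender` column names from the Kaggle CSV. The ten rows are copied from the `df_gym.head(20)` output shown in the data section of this notebook so the block runs on its own; on the full data, `df_gym` would take the place of `sample`. The age-bin edges are an arbitrary choice made for illustration.

```python
import pandas as pd

# Ten rows copied from the df_gym.head(20) output shown in this notebook.
sample = pd.DataFrame({
    "Age": [56, 46, 32, 25, 38, 56, 36, 40, 28, 28],
    "Gender": ["Male", "Female", "Female", "Male", "Male",
               "Female", "Male", "Female", "Male", "Male"],
})

# Summary statistics of the age distribution (count, mean, quartiles, ...).
age_summary = sample["Age"].describe()

# Gender breakdown as absolute counts.
gender_counts = sample["Gender"].value_counts()

# Age groups by gender. The bin edges are an assumption, not from the source.
age_group = pd.cut(
    sample["Age"],
    bins=[17, 29, 39, 49, 59],
    labels=["18-29", "30-39", "40-49", "50-59"],
)
age_by_gender = pd.crosstab(age_group, sample["Gender"])

print(age_summary, gender_counts, age_by_gender, sep="\n\n")
```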

## Physical Characteristics

- Investigate the distribution of height, weight, and BMI among members
- Explore correlations between physical characteristics and other variables like fitness goals or workout preferences
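
One way to sketch these checks is to recompute BMI from weight and height, confirm it agrees with the recorded `BMI` column, and then correlate the physical measurements with a workout variable. The rows are copied from the `df_gym.head(20)` output in the data section; using `Session_Duration (hours)` as the "other variable" is an illustrative choice.

```python
import pandas as pd

# Ten rows copied from the df_gym.head(20) output shown in this notebook.
sample = pd.DataFrame({
    "Weight (kg)": [88.3, 74.9, 68.1, 53.2, 46.1, 58.0, 70.3, 69.7, 121.7, 101.8],
    "Height (m)": [1.71, 1.53, 1.66, 1.70, 1.79, 1.68, 1.72, 1.51, 1.94, 1.84],
    "BMI": [30.2, 32.0, 24.71, 18.41, 14.39, 20.55, 23.76, 30.57, 32.34, 30.07],
    "Session_Duration (hours)": [1.69, 1.30, 1.11, 0.59, 0.64,
                                 1.59, 1.49, 1.27, 1.03, 1.08],
})

# BMI = weight (kg) / height (m)^2; compare against the recorded column.
bmi_recomputed = sample["Weight (kg)"] / sample["Height (m)"] ** 2
max_bmi_gap = (bmi_recomputed - sample["BMI"]).abs().max()

# Distribution summaries for the physical measurements.
physical_summary = sample[["Weight (kg)", "Height (m)", "BMI"]].describe()

# Pearson correlation of every column with session duration.
corr_with_duration = sample.corr()["Session_Duration (hours)"]

print(f"Largest gap between recomputed and recorded BMI: {max_bmi_gap:.4f}")
print(physical_summary.round(2))
print(corr_with_duration.round(2))
```

A small gap suggests the `BMI` column is derived from the other two, in which case it carries no independent information in a correlation analysis.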

## Membership Patterns

- Analyze the distribution of membership types
- Examine the relationship between membership type and workout frequency or duration

## Fitness Goals and Behaviors

- Study the distribution of fitness goals among members
- Investigate how fitness goals relate to workout preferences and durations
- Analyze the relationship between fitness goals and supplement usage

## Workout Habits

- Examine the distribution of preferred workout types
- Analyze average workout durations and how they vary across different member segments
- Investigate the frequency of gym visits and any patterns that emerge
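
The three checks above can be sketched with `value_counts` and `groupby`. The sample rows are copied from the `df_gym.head(20)` output in the data section, and `Workout_Type` is treated as the "preferred workout type", which is an assumption about what that column records.

```python
import pandas as pd

# Ten rows copied from the df_gym.head(20) output shown in this notebook.
sample = pd.DataFrame({
    "Workout_Type": ["Yoga", "HIIT", "Cardio", "Strength", "Strength",
                     "HIIT", "Cardio", "Cardio", "Strength", "Cardio"],
    "Session_Duration (hours)": [1.69, 1.30, 1.11, 0.59, 0.64,
                                 1.59, 1.49, 1.27, 1.03, 1.08],
    "Workout_Frequency (days/week)": [4, 4, 4, 3, 3, 5, 3, 3, 4, 3],
})

# How many members report each workout type.
type_counts = sample["Workout_Type"].value_counts()

# Session duration per workout type: mean, median, and group size.
duration_by_type = (
    sample.groupby("Workout_Type")["Session_Duration (hours)"]
    .agg(["mean", "median", "count"])
)

# Distribution of weekly visits, ordered by number of days.
visit_distribution = (
    sample["Workout_Frequency (days/week)"].value_counts().sort_index()
)

print(type_counts, duration_by_type.round(2), visit_distribution, sep="\n\n")
```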

## Experience and Fitness Levels

- Look at the distribution of years of gym experience among members
- Analyze how experience relates to fitness level, workout habits, and goals
- Investigate the relationship between fitness level and other variables like BMI or workout duration
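
The dataset shown later in this notebook has a single `Experience_Level` column coded 1 to 3 rather than separate "years of experience" and "fitness level" fields. The sketch below treats that column as an ordinal proxy for both, which is an assumption, and compares group means on the rows copied from the `df_gym.head(20)` output.

```python
import pandas as pd

# Ten rows copied from the df_gym.head(20) output shown in this notebook.
sample = pd.DataFrame({
    "Experience_Level": [3, 2, 2, 1, 1, 3, 2, 2, 2, 1],
    "Workout_Frequency (days/week)": [4, 4, 4, 3, 3, 5, 3, 3, 4, 3],
    "Session_Duration (hours)": [1.69, 1.30, 1.11, 0.59, 0.64,
                                 1.59, 1.49, 1.27, 1.03, 1.08],
    "BMI": [30.2, 32.0, 24.71, 18.41, 14.39, 20.55, 23.76, 30.57, 32.34, 30.07],
})

# Number of members at each experience level.
level_counts = sample["Experience_Level"].value_counts().sort_index()

# Mean workout habits and BMI per experience level.
means_by_level = sample.groupby("Experience_Level").mean()

print(level_counts)
print(means_by_level.round(2))
```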

## Dietary Preferences and Supplement Usage

- Examine the distribution of dietary preferences among members
- Analyze how dietary preferences relate to fitness goals or workout habits
- Investigate patterns in supplement usage and their relationship to other variables

## Personal Trainer Utilization

- Analyze the percentage of members using personal trainers
- Investigate how personal trainer usage relates to fitness goals, workout habits, and results

## Performance Metrics

- Examine the distribution of average heart rates during workouts
- Analyze patterns in calories burned per session and how they relate to other variables
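
A sketch of two derived metrics that make sessions of different length comparable: calories burned per hour, and average heart rate as a fraction of the recorded maximum. Both derived column names (`Calories_per_hour`, `BPM_ratio`) are hypothetical and not part of the source dataset; the rows come from the `df_gym.head(20)` output in the data section.

```python
import pandas as pd

# Ten rows copied from the df_gym.head(20) output shown in this notebook.
sample = pd.DataFrame({
    "Workout_Type": ["Yoga", "HIIT", "Cardio", "Strength", "Strength",
                     "HIIT", "Cardio", "Cardio", "Strength", "Cardio"],
    "Max_BPM": [180, 179, 167, 190, 188, 168, 174, 189, 185, 169],
    "Avg_BPM": [157, 151, 122, 164, 158, 156, 169, 141, 127, 136],
    "Session_Duration (hours)": [1.69, 1.30, 1.11, 0.59, 0.64,
                                 1.59, 1.49, 1.27, 1.03, 1.08],
    "Calories_Burned": [1313.0, 883.0, 677.0, 532.0, 556.0,
                        1116.0, 1385.0, 895.0, 719.0, 808.0],
})

# Distribution of average heart rate during workouts.
avg_bpm_summary = sample["Avg_BPM"].describe()

# Hypothetical derived columns: energy rate and relative heart-rate intensity.
sample["Calories_per_hour"] = (
    sample["Calories_Burned"] / sample["Session_Duration (hours)"]
)
sample["BPM_ratio"] = sample["Avg_BPM"] / sample["Max_BPM"]

# Mean energy rate and intensity per workout type.
metrics_by_type = (
    sample.groupby("Workout_Type")[["Calories_per_hour", "BPM_ratio"]].mean()
)

print(avg_bpm_summary.round(1))
print(metrics_by_type.round(2))
```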

## Time-based Analysis

- If the dataset includes temporal information, look for trends or patterns over time in membership, workout habits, or fitness outcomes

By focusing on these areas, we can gain a comprehensive understanding of the gym members' characteristics, behaviors, and outcomes. This analysis can provide valuable insights for gym management, personal trainers, and fitness program developers.


### HYPOTHESIS

Define your project's hypothesis here.
You should have more than one, but the project will be built around one main hypothesis.

In [2]:
# Write your main hypothesis in Markdown
hipotesis_1 = ""

# **Hypotheses**

## Hypotheses Related to Demographics

1. **Age and Fitness Level**: 
   - **Hypothesis**: Older gym members have a lower fitness level compared to younger members.
   
2. **Gender Differences in Workout Preferences**:
   - **Hypothesis**: Male gym members prefer strength training more than female gym members, who prefer cardio workouts.
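
For hypothesis 2, a first look could be a row-normalized cross-tabulation of gender against workout type, so each row shows the share of that gender choosing each type. This is a descriptive sketch on the ten rows copied from the `df_gym.head(20)` output in the data section; it is not a significance test, and ten rows are far too few to support a conclusion.

```python
import pandas as pd

# Ten rows copied from the df_gym.head(20) output shown in this notebook.
sample = pd.DataFrame({
    "Gender": ["Male", "Female", "Female", "Male", "Male",
               "Female", "Male", "Female", "Male", "Male"],
    "Workout_Type": ["Yoga", "HIIT", "Cardio", "Strength", "Strength",
                     "HIIT", "Cardio", "Cardio", "Strength", "Cardio"],
})

# Share of each workout type within each gender (every row sums to 1).
preference_share = pd.crosstab(
    sample["Gender"], sample["Workout_Type"], normalize="index"
)

print(preference_share.round(2))
```

On the full dataset, the raw counts from `pd.crosstab` without `normalize` would be the input to a chi-square test of independence.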

## Hypotheses Related to Physical Characteristics

3. **BMI and Workout Duration**:
   - **Hypothesis**: Members with a higher BMI tend to have shorter average workout durations compared to those with a normal BMI.

4. **Height and Weight Correlation**:
   - **Hypothesis**: There is a positive correlation between height and weight among gym members.
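
Hypothesis 4 can be sketched as a Pearson correlation with a one-sided permutation test, which avoids distributional assumptions by shuffling one variable to simulate "no association". The number of permutations and the random seed are arbitrary choices, and the ten rows come from the `df_gym.head(20)` output in the data section.

```python
import numpy as np
import pandas as pd

# Ten rows copied from the df_gym.head(20) output shown in this notebook.
height = pd.Series([1.71, 1.53, 1.66, 1.70, 1.79, 1.68, 1.72, 1.51, 1.94, 1.84])
weight = pd.Series([88.3, 74.9, 68.1, 53.2, 46.1, 58.0, 70.3, 69.7, 121.7, 101.8])

# Observed Pearson correlation between height and weight.
observed_r = height.corr(weight)

# Null distribution: correlation after randomly shuffling the weights.
rng = np.random.default_rng(0)   # fixed seed so the run is repeatable
n_permutations = 2000            # arbitrary; more gives a finer p-value
height_values = height.to_numpy()
weight_values = weight.to_numpy()
permuted_r = np.array([
    np.corrcoef(height_values, rng.permutation(weight_values))[0, 1]
    for _ in range(n_permutations)
])

# One-sided p-value: share of shuffles at least as correlated as observed.
# The +1 terms count the observed arrangement itself.
p_value = (np.sum(permuted_r >= observed_r) + 1) / (n_permutations + 1)

print(f"Pearson r = {observed_r:.3f}, permutation p-value = {p_value:.4f}")
```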

## Hypotheses Related to Membership Patterns

5. **Membership Type and Frequency of Visits**:
   - **Hypothesis**: Members with premium memberships visit the gym more frequently than those with basic memberships.

6. **Years of Experience and Workout Frequency**:
   - **Hypothesis**: Members with more years of gym experience visit the gym more frequently than newer members.

## Hypotheses Related to Fitness Goals

7. **Fitness Goals and Supplement Usage**:
   - **Hypothesis**: Members aiming for weight loss are more likely to use dietary supplements compared to those focused on muscle gain.

8. **Impact of Personal Trainers on Fitness Goals**:
   - **Hypothesis**: Members who use personal trainers are more likely to achieve their fitness goals than those who do not.

## Hypotheses Related to Performance Metrics

9. **Calories Burned and Average Heart Rate**:
   - **Hypothesis**: There is a positive correlation between average heart rate during workouts and calories burned per session.

10. **Workout Duration and Calories Burned**:
    - **Hypothesis**: Longer workout durations are associated with higher calories burned per session.
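
Hypotheses 9 and 10 both concern linear association with calories burned, so a minimal sketch is a pair of Pearson correlations plus a least-squares slope for duration. The rows are copied from the `df_gym.head(20)` output in the data section; the straight-line model is an assumption made for illustration.

```python
import numpy as np
import pandas as pd

# Ten rows copied from the df_gym.head(20) output shown in this notebook.
sample = pd.DataFrame({
    "Avg_BPM": [157, 151, 122, 164, 158, 156, 169, 141, 127, 136],
    "Session_Duration (hours)": [1.69, 1.30, 1.11, 0.59, 0.64,
                                 1.59, 1.49, 1.27, 1.03, 1.08],
    "Calories_Burned": [1313.0, 883.0, 677.0, 532.0, 556.0,
                        1116.0, 1385.0, 895.0, 719.0, 808.0],
})

# Hypothesis 9: average heart rate vs. calories burned.
r_bpm = sample["Avg_BPM"].corr(sample["Calories_Burned"])

# Hypothesis 10: session duration vs. calories burned.
r_duration = sample["Session_Duration (hours)"].corr(sample["Calories_Burned"])

# Least-squares line: calories = slope * duration + intercept.
slope, intercept = np.polyfit(
    sample["Session_Duration (hours)"], sample["Calories_Burned"], deg=1
)

print(f"r(Avg_BPM, calories) = {r_bpm:.3f}")
print(f"r(duration, calories) = {r_duration:.3f}")
print(f"Estimated extra calories per additional hour: {slope:.1f}")
```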

## Hypotheses Related to Dietary Preferences

11. **Dietary Preferences and Fitness Levels**:
    - **Hypothesis**: Members with specific dietary preferences (e.g., vegetarian, vegan) have different fitness levels compared to those without dietary restrictions.

12. **Impact of Diet on Workout Performance**:
    - **Hypothesis**: Members who follow a specific diet plan report higher average workout durations than those who do not adhere to any diet.

These hypotheses guide the analysis. Each one should be tested with a statistical method suited to its variable types, such as a correlation test for two numeric variables or a chi-square test for two categorical ones.


## DATA COLLECTION

### DATASETS AND ALTERNATIVE DATA SOURCES

Include here a view of the dataset or datasets you will use to evaluate your hypothesis. <br>
Also state where the data comes from and its source.

In [3]:
import pandas as pd

In [4]:
df_gym = pd.read_csv("data/gym_members_exercise_tracking.csv")

# Source: https://www.kaggle.com/datasets/valakhorasani/gym-members-exercise-dataset/code

Use head() to display the main datasets you will work with.

In [5]:
df_gym.head(20)

|   | Age | Gender | Weight (kg) | Height (m) | Max_BPM | Avg_BPM | Resting_BPM | Session_Duration (hours) | Calories_Burned | Workout_Type | Fat_Percentage | Water_Intake (liters) | Workout_Frequency (days/week) | Experience_Level | BMI |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 56 | Male | 88.3 | 1.71 | 180 | 157 | 60 | 1.69 | 1313.0 | Yoga | 12.6 | 3.5 | 4 | 3 | 30.2 |
| 1 | 46 | Female | 74.9 | 1.53 | 179 | 151 | 66 | 1.3 | 883.0 | HIIT | 33.9 | 2.1 | 4 | 2 | 32.0 |
| 2 | 32 | Female | 68.1 | 1.66 | 167 | 122 | 54 | 1.11 | 677.0 | Cardio | 33.4 | 2.3 | 4 | 2 | 24.71 |
| 3 | 25 | Male | 53.2 | 1.7 | 190 | 164 | 56 | 0.59 | 532.0 | Strength | 28.8 | 2.1 | 3 | 1 | 18.41 |
| 4 | 38 | Male | 46.1 | 1.79 | 188 | 158 | 68 | 0.64 | 556.0 | Strength | 29.2 | 2.8 | 3 | 1 | 14.39 |
| 5 | 56 | Female | 58.0 | 1.68 | 168 | 156 | 74 | 1.59 | 1116.0 | HIIT | 15.5 | 2.7 | 5 | 3 | 20.55 |
| 6 | 36 | Male | 70.3 | 1.72 | 174 | 169 | 73 | 1.49 | 1385.0 | Cardio | 21.3 | 2.3 | 3 | 2 | 23.76 |
| 7 | 40 | Female | 69.7 | 1.51 | 189 | 141 | 64 | 1.27 | 895.0 | Cardio | 30.6 | 1.9 | 3 | 2 | 30.57 |
| 8 | 28 | Male | 121.7 | 1.94 | 185 | 127 | 52 | 1.03 | 719.0 | Strength | 28.9 | 2.6 | 4 | 2 | 32.34 |
| 9 | 28 | Male | 101.8 | 1.84 | 169 | 136 | 64 | 1.08 | 808.0 | Cardio | 29.7 | 2.7 | 3 | 1 | 30.07 |


In [6]:
df_MTA = pd.read_csv("data/com_corp_mta.csv")

In [9]:
df_MTA.head(10)

|   | Tiempo | Peso | Cambio | IMC | Grasa corporal | Masa muscular esquelética | Masa ósea | Agua corporal | Unnamed: 8 |
|---|---|---|---|---|---|---|---|---|---|
| 0 | Oct 29, 2024 |  |  |  |  |  |  |  |  |
| 1 | 7:06 am | 75.6 kg | 0.2 kg | 25.6 | 15.4 % | 29.8 kg | 4.7 kg | 61.7 % |  |
| 2 | Oct 28, 2024 |  |  |  |  |  |  |  |  |
| 3 | 6:19 am | 75.4 kg | 0.7 kg | 25.5 | 14.7 % | 29.7 kg | 4.7 kg | 62.3 % |  |
| 4 | Oct 27, 2024 |  |  |  |  |  |  |  |  |
| 5 | 8:14 am | 76.1 kg | 0.1 kg | 25.7 | 15.4 % | 29.9 kg | 4.7 kg | 61.8 % |  |
| 6 | Oct 26, 2024 |  |  |  |  |  |  |  |  |
| 7 | 8:35 am | 76.2 kg | 0.5 kg | 25.7 | 15.5 % | 29.9 kg | 4.7 kg | 61.7 % |  |
| 8 | Oct 25, 2024 |  |  |  |  |  |  |  |  |
| 9 | 6:57 am | 75.7 kg | 0.2 kg | 25.6 | 15 % | 29.8 kg | 4.7 kg | 62.1 % |  |
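
The `df_MTA.head(10)` output alternates a date-only row with a measurement row, and the measurements carry their units as text (`75.6 kg`, `15.4 %`). Below is a minimal sketch of one way to tidy that layout, assuming every date row is immediately followed by its measurement rows and that `Peso` is empty only on date rows. The function name `tidy_body_composition` and the `timestamp` index are hypothetical; the four input rows are copied from the output above.

```python
import pandas as pd

# First four rows copied from the df_MTA.head(10) output.
raw = pd.DataFrame({
    "Tiempo": ["Oct 29, 2024", "7:06 am", "Oct 28, 2024", "6:19 am"],
    "Peso": [None, "75.6 kg", None, "75.4 kg"],
    "Cambio": [None, "0.2 kg", None, "0.7 kg"],
    "IMC": [None, 25.6, None, 25.5],
    "Grasa corporal": [None, "15.4 %", None, "14.7 %"],
    "Masa muscular esquelética": [None, "29.8 kg", None, "29.7 kg"],
    "Masa ósea": [None, "4.7 kg", None, "4.7 kg"],
    "Agua corporal": [None, "61.7 %", None, "62.3 %"],
})

# Columns whose values are stored as "<number> <unit>" strings.
UNIT_COLUMNS = ["Peso", "Cambio", "Grasa corporal",
                "Masa muscular esquelética", "Masa ósea", "Agua corporal"]


def tidy_body_composition(df):
    """Return one row per measurement, indexed by a full timestamp."""
    # Drop columns that are empty everywhere (e.g. a trailing "Unnamed" column).
    df = df.dropna(axis="columns", how="all")

    # Date rows are the ones with no weight recorded.
    is_date_row = df["Peso"].isna()

    # Carry each date down onto the measurement rows that follow it.
    dates = df["Tiempo"].where(is_date_row).ffill()

    tidy = df.loc[~is_date_row].copy()
    tidy["timestamp"] = pd.to_datetime(
        dates[~is_date_row] + " " + tidy["Tiempo"],
        format="%b %d, %Y %I:%M %p",
    )

    # Strip the unit suffix and convert to float.
    for column in UNIT_COLUMNS:
        tidy[column] = (
            tidy[column]
            .str.extract(r"(-?\d+(?:\.\d+)?)", expand=False)
            .astype(float)
        )

    return tidy.drop(columns="Tiempo").set_index("timestamp").sort_index()


clean = tidy_body_composition(raw)
print(clean)
```

With a datetime index in place, the weight series can be resampled or plotted over time, which covers the time-based part of the analysis plan.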


In [7]:
# First dataset
# df_1.head()

In [8]:
# Next dataset...
# df_2.head()