# 1. Data Overview

In [18]:
import pandas as pd

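# Load the dataset
df = pd.read_csv('/home/mw/input/11011446/gym_members_data.csv')

print('Basic dataset information:')
df.info()

# Get the number of rows and columns
rows, columns = df.shape

if rows < 100 and columns < 20:
    # Small table (fewer than 100 rows and fewer than 20 columns): print everything
    print('Full dataset contents:')
    print(df.to_csv(sep='\t', na_rep='nan'))
else:
    # Larger table: print only the first few rows
    print('First few rows of the dataset:')
    print(df.head().to_csv(sep='\t', na_rep='nan'))

Basic dataset information: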
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 973 entries, 0 to 972
Data columns (total 16 columns):
 #   Column                         Non-Null Count  Dtype  
---  ------                         --------------  -----  
 0   Index                          973 non-null    int64  
 1   Age                            973 non-null    int64  
 2   Gender                         973 non-null    object 
 3   Weight (kg)                    973 non-null    float64
 4   Height (m)                     973 non-null    float64
 5   Max_BPM                        973 non-null    int64  
 6   Avg_BPM                        973 non-null    int64  
 7   Resting_BPM                    973 non-null    int64  
 8   Session_Duration (hours)       973 non-null    float64
 9   Calories_Burned                973 non-null    float64
 10  Workout_Type                   973 non-null    object 
 11  Fat_Percentage                 973 non-null    float64
 12  Water_Intake (liters)          973 non-nul

# 2. Data Preprocessing

## 2.1 Missing Value Detection

In [19]:
import numpy as np

# Count missing values per column
print('Missing value counts:')
print(df.isnull().sum())

# Descriptive statistics for the numeric columns, rounded to two decimals
numeric_columns = df.select_dtypes(include=[np.number]).columns
print('Descriptive statistics for numeric columns:')
print(df[numeric_columns].describe().round(2))

Missing value counts:
Index                            0
Age                              0
Gender                           0
Weight (kg)                      0
Height (m)                       0
Max_BPM                          0
Avg_BPM                          0
Resting_BPM                      0
Session_Duration (hours)         0
Calories_Burned                  0
Workout_Type                     0
Fat_Percentage                   0
Water_Intake (liters)            0
Workout_Frequency (days/week)    0
Experience_Level                 0
BMI                              0
dtype: int64
Descriptive statistics for numeric columns:
        Index     Age  Weight (kg)  Height (m)  Max_BPM  Avg_BPM  Resting_BPM  \
count  973.00  973.00       973.00      973.00   973.00   973.00       973.00   
mean   487.00   38.68        73.85        1.72   179.88   143.77        62.22   
std    281.03   12.18        21.21        0.13    11.53    14.35         7.33   
min      1.00   18.00        40.00        1.50   160.00   120.00        50.00   
2

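### 2.1.1 Missing Value Analysis

Every column has zero missing values, so the dataset is complete and the analysis is not at risk of bias from missing data.

### 2.1.2 Descriptive Statistics for Numeric Columns

- **Age**: The mean is 38.68 years with a standard deviation of 12.18, so ages are fairly spread out. The minimum of 18 and the maximum of 59 bound the range, which suggests the dataset covers adults only.
- **Weight (kg)**: The mean weight is 73.85 kg, and the standard deviation of 21.21 indicates large differences between individuals. The maximum is 129.90 kg, which is high but plausible; whether it counts as an outlier depends on the context.
- **Other metrics**: The remaining variables can be read in the same way. For example, Max_BPM has a mean of 179.88 and a standard deviation of 11.53, a relatively narrow spread.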

## 2.2 Outlier Detection

In [20]:
# Detect and handle outliers (using the Age column as an example, with the IQR method)
Q1 = df['Age'].quantile(0.25)
Q3 = df['Age'].quantile(0.75)
IQR = Q3 - Q1
lower_bound = Q1 - 1.5 * IQR
upper_bound = Q3 + 1.5 * IQR

# Select the rows whose Age falls outside the bounds
outliers = df[(df['Age'] < lower_bound) | (df['Age'] > upper_bound)]
print('Outliers in the Age column:')
print(outliers)

# Handle outliers by replacing them with the lower or upper bound
df['Age'] = np.where(df['Age'] < lower_bound, lower_bound, df['Age'])
df['Age'] = np.where(df['Age'] > upper_bound, upper_bound, df['Age'])

Outliers in the Age column:
Empty DataFrame
Columns: [Index, Age, Gender, Weight (kg), Height (m), Max_BPM, Avg_BPM, Resting_BPM, Session_Duration (hours), Calories_Burned, Workout_Type, Fat_Percentage, Water_Intake (liters), Workout_Frequency (days/week), Experience_Level, BMI]
Index: []


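Using the Age column as an example, the IQR method finds no outliers. With Q1 = 28 and Q3 = 49, the IQR is 21, so the bounds are -3.5 and 80.5, and every age in the dataset (18 to 59) lies inside them. The replacement step therefore changes no values. The same method can be applied to other columns if needed.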
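The IQR rule used for Age can be applied to every numeric column in one pass. The sketch below is a minimal illustration on a small made-up frame rather than the gym dataset; the helper name `iqr_outlier_counts` and the toy values are assumptions introduced here.

```python
import pandas as pd


def iqr_outlier_counts(frame: pd.DataFrame, k: float = 1.5) -> pd.Series:
    """Count values outside [Q1 - k*IQR, Q3 + k*IQR] for each numeric column."""
    numeric = frame.select_dtypes(include="number")
    q1 = numeric.quantile(0.25)
    q3 = numeric.quantile(0.75)
    iqr = q3 - q1
    lower = q1 - k * iqr
    upper = q3 + k * iqr
    # Comparisons align on column labels; sum() counts the True cells per column.
    outside = (numeric < lower) | (numeric > upper)
    return outside.sum()


# Toy data (hypothetical values, not taken from the gym dataset).
toy = pd.DataFrame({
    "a": [1, 2, 3, 4, 100],
    "b": [10.0, 11.0, 12.0, 13.0, 14.0],
    "label": ["x", "y", "x", "y", "x"],
})
counts = iqr_outlier_counts(toy)
print(counts)
```

Counting first, rather than clipping straight away, makes it easier to decide per column whether the flagged values are errors or just legitimate extremes.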

# 3. Overall Data Characteristics

## 3.1 Distribution of Members by Gender

In [21]:
import matplotlib.pyplot as plt

# Count members by gender
gender_counts = df['Gender'].value_counts()

# Set the figure resolution
plt.rcParams['figure.dpi'] = 300

# All plot labels are in English, so no CJK font such as SimHei is configured
# (requesting a font that is not installed triggers repeated findfont warnings)

# Draw the pie chart
plt.pie(gender_counts, labels=gender_counts.index, autopct='%1.1f%%')
plt.title('The distribution of different genders')
plt.show()

# Print the member counts by gender
print('Member distribution by gender:\n', gender_counts)

Member distribution by gender:
 Gender
Male      511
Female    462
Name: count, dtype: int64


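Men slightly outnumber women in this dataset: 511 members (52.5%) versus 462 (47.5%). This may mean that the gym appeals somewhat more to men, that its acquisition channels reach men more effectively, or that its target audience skews male. The data alone cannot distinguish between these explanations.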

## 3.2 Average Calories Burned by Workout Type

In [22]:
# Compute the mean calories burned per workout type, rounded to two decimals
grouped_data = df.groupby('Workout_Type')['Calories_Burned'].mean().round(2).reset_index()

# Draw a bar chart of mean calories burned per workout type
plt.bar(grouped_data['Workout_Type'], grouped_data['Calories_Burned'])
plt.xlabel('Workout_Type')
plt.ylabel('Calories_Burned')
plt.title('Distribution of consumption under different exercise types')

# Add a value label above each bar
for i in range(len(grouped_data['Workout_Type'])):
    plt.text(i, grouped_data['Calories_Burned'][i], grouped_data['Calories_Burned'][i], ha='center', size=8)

# Rotate the x-axis labels
plt.xticks(rotation=90)

# Show the figure
plt.show()

print('Mean calories burned by workout type:\n', grouped_data)

Mean calories burned by workout type:
   Workout_Type  Calories_Burned
0       Cardio           884.51
1         HIIT           925.81
2     Strength           910.70
3         Yoga           903.19


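Average calories burned differ somewhat across workout types. HIIT (high-intensity interval training) has the highest average at 925.81, possibly because it alternates intense effort with short rests and keeps the heart rate elevated. Cardio has the lowest average at 884.51, and Strength (910.70) and Yoga (903.19) fall in between. The gap between the highest and lowest averages is only 41.30, under 5% of the Cardio average, so these differences are modest. Calories_Burned is also a per-session total, so it reflects session length as well as intensity.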
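To put the differences between workout types in proportion, the gap between the highest and lowest means can be expressed relative to the lowest mean. The short sketch below uses the four averages printed above; the variable names are illustrative only.

```python
# Mean Calories_Burned per workout type, copied from the output above.
mean_calories = {
    "Cardio": 884.51,
    "HIIT": 925.81,
    "Strength": 910.70,
    "Yoga": 903.19,
}

highest = max(mean_calories, key=mean_calories.get)
lowest = min(mean_calories, key=mean_calories.get)

# Absolute gap and gap relative to the lowest mean.
gap = mean_calories[highest] - mean_calories[lowest]
relative_gap = gap / mean_calories[lowest]

print(f"{highest} vs {lowest}: gap = {gap:.2f}, relative gap = {relative_gap:.1%}")
```

For scale, the descriptive statistics report an overall standard deviation of 272.64 for Calories_Burned, far larger than the gap between any two workout types.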

## 3.3 Correlation Between Age and Calories Burned

In [23]:
import seaborn as sns

# Compute the Pearson correlation between age and calories burned, rounded to two decimals
age_calorie_corr = df['Age'].corr(df['Calories_Burned']).round(2)

# Draw the scatter plot
sns.scatterplot(x='Age', y='Calories_Burned', data=df)
plt.xlabel('Age')
plt.xticks(rotation=45)
plt.ylabel('Calories_Burned')
plt.title('Age VS Calories_Burned')

# Show the figure
plt.show()

print('Pearson correlation between age and calories burned:', age_calorie_corr)

Pearson correlation between age and calories burned: -0.15


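The Pearson correlation coefficient between age and calories burned is -0.15 (rounded to two decimals). The coefficient ranges from -1 to 1, and a value this close to 0 indicates a weak negative linear relationship. Calories burned may therefore decline slightly with age, but the trend is not pronounced. Other factors, such as workout intensity and workout type, probably matter more than age.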
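As a reminder of what the coefficient measures, the sketch below computes Pearson's r directly from deviations about the mean and compares it with `Series.corr`. It runs on a few made-up values, not on the gym dataset, and the helper name `pearson_r` is an assumption introduced for illustration.

```python
import numpy as np
import pandas as pd


def pearson_r(x: pd.Series, y: pd.Series) -> float:
    """Pearson r: sum of co-deviations divided by the product of deviation norms."""
    dx = x - x.mean()
    dy = y - y.mean()
    return float((dx * dy).sum() / np.sqrt((dx ** 2).sum() * (dy ** 2).sum()))


# Toy data (hypothetical values, not taken from the gym dataset).
age = pd.Series([20, 30, 40, 50, 60])
calories = pd.Series([950, 900, 940, 880, 870])

manual = pearson_r(age, calories)
builtin = age.corr(calories)  # pandas uses the Pearson method by default

print(round(manual, 4), round(builtin, 4))
```

Because r only captures linear association, a value near 0 does not rule out a non-linear relationship, which is one reason to inspect the scatter plot as well.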

# 4. Descriptive Statistics for Member Profiles

In [24]:
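# Descriptive statistics for the numeric variables (age, weight, height, heart rate, etc.), rounded to two decimals
numeric_cols = [
    'Age', 'Weight (kg)', 'Height (m)', 'Max_BPM', 'Avg_BPM', 'Resting_BPM',
    'Session_Duration (hours)', 'Calories_Burned', 'Fat_Percentage',
    'Water_Intake (liters)', 'Workout_Frequency (days/week)', 'Experience_Level', 'BMI',
]
numeric_stats = df[numeric_cols].describe().round(2)
print('Descriptive statistics for numeric variables:')
print(numeric_stats)

# Descriptive statistics for the categorical variables (gender, workout type)
categorical_stats = df[['Gender', 'Workout_Type']].describe()
print('\nDescriptive statistics for categorical variables:')
print(categorical_stats)

Descriptive statistics for numeric variables: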
          Age  Weight (kg)  Height (m)  Max_BPM  Avg_BPM  Resting_BPM  \
count  973.00       973.00      973.00   973.00   973.00       973.00   
mean    38.68        73.85        1.72   179.88   143.77        62.22   
std     12.18        21.21        0.13    11.53    14.35         7.33   
min     18.00        40.00        1.50   160.00   120.00        50.00   
25%     28.00        58.10        1.62   170.00   131.00        56.00   
50%     40.00        70.00        1.71   180.00   143.00        62.00   
75%     49.00        86.00        1.80   190.00   156.00        68.00   
max     59.00       129.90        2.00   199.00   169.00        74.00   

       Session_Duration (hours)  Calories_Burned  Fat_Percentage  \
count                    973.00           973.00          973.00   
mean                       1.26           905.42           24.98   
std                        0.34           272.64            6.26   
min                        0.50           303.00          

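## 4.1 Numeric Variables

- **Age**: Members are between 18 and 59 years old, with a mean of about 38.68, so the membership is mostly young and middle-aged adults. The standard deviation of 12.18 shows a fair amount of spread.
- **Weight (kg)**: Weight ranges from 40 to 129.9 kg with a mean of 73.85 kg; the standard deviation of 21.21 indicates large differences between members.
- **Workout metrics**: The average session lasts 1.26 hours, the average calories burned per session is 905.42, and the average fat percentage is 24.98%. These figures give a sense of workout intensity and results.

## 4.2 Categorical Variables

- **Gender**: There are 511 male members and 973 - 511 = 462 female members, so men slightly outnumber women.
- **Workout_Type**: There are 4 workout types. The most popular is Strength, chosen by 258 members.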

# 5. Member Workout Preference Analysis

## 5.1 Workout Type Preferences by Gender

In [25]:
import matplotlib.pyplot as plt

# Count members by gender and workout type
gender_workout = df.groupby(['Gender', 'Workout_Type'])['Index'].count().unstack()
print('Workout type preferences by gender:')
print(gender_workout)

# Draw a bar chart of workout type preferences by gender
gender_workout.plot(kind='bar')
plt.title('Gender preferences for exercise types')
plt.xlabel('Gender')
plt.xticks(rotation=45)
plt.ylabel('Number of people')
plt.show()

Workout type preferences by gender:
Workout_Type  Cardio  HIIT  Strength  Yoga
Gender                                    
Female           126   107       123   106
Male             129   114       135   133



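Men and women choose each workout type in broadly similar numbers. Men outnumber women in every type, which partly reflects the larger number of men overall (511 versus 462). The gap is smallest for Cardio (129 versus 126) and largest for Yoga (133 versus 106), with Strength in between (135 versus 123). As a share within each gender, Strength accounts for about 26% of both men and women, so the counts alone do not show a marked male preference for strength training. The clearest difference in shares is Yoga, at about 26% of men versus 23% of women.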
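Because the two genders differ in size, raw counts can be converted to shares within each gender before comparing preferences. The sketch below rebuilds the cross-tabulation from the counts printed above and normalizes each row; it is an illustration, not part of the original notebook.

```python
import pandas as pd

# Counts copied from the cross-tabulation printed above.
gender_workout = pd.DataFrame(
    {
        "Cardio": [126, 129],
        "HIIT": [107, 114],
        "Strength": [123, 135],
        "Yoga": [106, 133],
    },
    index=pd.Index(["Female", "Male"], name="Gender"),
)

# Divide each row by its total to get the share of each workout type within a gender.
row_totals = gender_workout.sum(axis=1)
shares = gender_workout.div(row_totals, axis=0).mul(100).round(1)

print(shares)
```

Row-normalized shares answer "what do members of this gender tend to choose?", whereas the raw counts mix that question with how many members of each gender there are.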

## 5.2 Workout Type Preferences by Experience Level

In [26]:
# Count members by experience level and workout type
level_workout = df.groupby(['Experience_Level', 'Workout_Type'])['Index'].count().unstack()
print('\nWorkout type preferences by experience level:')
print(level_workout)

# Draw a bar chart of workout type preferences by experience level
level_workout.plot(kind='bar')
plt.title('Preferences for exercise types at different experience levels')
plt.xlabel('Experience_Level')
plt.xticks(rotation=45)
plt.ylabel('Number of people')
plt.show()


Workout type preferences by experience level:
Workout_Type      Cardio  HIIT  Strength  Yoga
Experience_Level                              
1                    109    85        97    85
2                    102    87       116   101
3                     44    49        45    53


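Level 2 is the largest group with 406 members, followed by level 1 with 376 and level 3 with 191, so membership drops sharply only at the highest level. Level 2 members choose Strength most often (116), while level 1 members choose Cardio most often (109). One possible reading is that beginners favor cardio and that some members move toward strength training as they gain experience. The lower counts at level 3 most likely reflect the smaller number of highly experienced members, although these members may also follow more individualized routines.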
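The group sizes and the most common workout type at each level can be read off the table programmatically. The sketch below rebuilds the table from the printed counts; the column names in the `summary` frame are chosen here for illustration.

```python
import pandas as pd

# Counts copied from the table printed above.
level_workout = pd.DataFrame(
    {
        "Cardio": [109, 102, 44],
        "HIIT": [85, 87, 49],
        "Strength": [97, 116, 45],
        "Yoga": [85, 101, 53],
    },
    index=pd.Index([1, 2, 3], name="Experience_Level"),
)

# Total members per experience level and the workout type with the highest count.
level_totals = level_workout.sum(axis=1)
top_type = level_workout.idxmax(axis=1)

summary = pd.DataFrame({"members": level_totals, "most_common_type": top_type})
print(summary)
```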

## 5.3 Relationship Between Workout Frequency and Session Duration

In [27]:
# Compute the mean session duration per workout frequency, rounded to two decimals
frequency_duration = df.groupby('Workout_Frequency (days/week)')['Session_Duration (hours)'].mean().round(2)
print('\nWorkout frequency versus session duration:')
print(frequency_duration)

# Draw a line chart of workout frequency versus mean session duration
frequency_duration.plot(kind='line', marker='o')
plt.title('Workout_Frequency VS Session_Duration')
plt.xlabel('Workout_Frequency (days/week)')
plt.xticks(rotation=45)
plt.ylabel('Session_Duration (hours)')
plt.show()


Workout frequency versus session duration:
Workout_Frequency (days/week)
2    1.00
3    1.14
4    1.39
5    1.77
Name: Session_Duration (hours), dtype: float64


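The pattern is clear: the more often members work out, the longer their average session, rising from 1.00 hours at 2 days per week to 1.77 hours at 5 days per week. Members who train more often may be fitter and more motivated and so able to sustain longer sessions, or they may follow more structured plans in which volume per session grows with frequency. The table shows an association only and does not establish which way the cause runs.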
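Since frequency and session length rise together, total weekly training time grows faster than either one alone. The rough sketch below multiplies each frequency by its mean session duration from the table above; it assumes every session in a week has the recorded duration, which the dataset does not confirm.

```python
# Mean session duration (hours) per workout frequency (days/week), copied from the output above.
mean_session_hours = {2: 1.00, 3: 1.14, 4: 1.39, 5: 1.77}

# Approximate weekly training time = days per week * mean hours per session.
weekly_hours = {
    days: round(days * hours, 2) for days, hours in mean_session_hours.items()
}

for days, total in weekly_hours.items():
    print(f"{days} days/week: about {total:.2f} hours per week")
```

Under that assumption, members training 5 days a week spend more than four times as many hours per week as those training 2 days a week (about 8.85 versus 2.00 hours).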

# 6. Workout Type Selection Analysis

## 6.1 Average Calories Burned, Fat Percentage, and Water Intake by Workout Type

In [28]:
# Compute mean calories burned, fat percentage, and water intake per workout type, rounded to two decimals
workout_stats = df.groupby('Workout_Type')[['Calories_Burned', 'Fat_Percentage', 'Water_Intake (liters)']].mean().round(2)
print('Mean calories burned, fat percentage, and water intake by workout type:')
print(workout_stats)

# Draw one bar chart per metric, side by side
fig, axes = plt.subplots(1, 3, figsize=(15, 5))

workout_stats['Calories_Burned'].plot(kind='bar', ax=axes[0])
axes[0].set_title('Average calorie expenditure by type of exercise')
axes[0].set_xlabel('Workout_Type')
axes[0].set_ylabel('Calories_Burned')
axes[0].tick_params(axis='x', rotation=45)

workout_stats['Fat_Percentage'].plot(kind='bar', ax=axes[1])
axes[1].set_title('The average fat percentage of different exercise types')
axes[1].set_xlabel('Workout_Type')
axes[1].set_ylabel('Fat_Percentage')
axes[1].tick_params(axis='x', rotation=45)

workout_stats['Water_Intake (liters)'].plot(kind='bar', ax=axes[2])
axes[2].set_title('Average water intake for different types of exercise')
axes[2].set_xlabel('Workout_Type')
axes[2].set_ylabel('Water_Intake (liters)')
axes[2].tick_params(axis='x', rotation=45)

plt.tight_layout()
plt.show()

Mean calories burned, fat percentage, and water intake by workout type:
              Calories_Burned  Fat_Percentage  Water_Intake (liters)
Workout_Type                                                        
Cardio                 884.51           25.40                   2.62
HIIT                   925.81           24.46                   2.65
Strength               910.70           25.46                   2.60
Yoga                   903.19           24.48                   2.64



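HIIT (high-intensity interval training) has the highest average calories burned at 925.81, which is consistent with it being a more intense workout, although Calories_Burned is a per-session total rather than a per-hour rate. Cardio has the lowest average at 884.51. For fat percentage, Strength and Cardio members have slightly higher averages (25.46 and 25.40) than HIIT and Yoga members (24.46 and 24.48). The difference is about one percentage point, and because the data is a single snapshot, it describes who chooses each workout type rather than how effective the workout is for fat loss. Average water intake is nearly identical across types, between 2.60 and 2.65 liters, probably because basic hydration needs during exercise are similar.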

## 6.2 Experience Level Distribution by Workout Type

In [29]:
# Count members by workout type and experience level
workout_level = df.groupby(['Workout_Type', 'Experience_Level'])['Index'].count().unstack()
print('\nExperience level distribution by workout type:')
print(workout_level)

# Draw a bar chart of the experience level distribution for each workout type
workout_level.plot(kind='bar')
plt.title('The distribution of different experience levels under different exercise types')
plt.xlabel('Workout_Type')
plt.xticks(rotation=45)
plt.ylabel('Number of people')
plt.show()


Experience level distribution by workout type:
Experience_Level    1    2   3
Workout_Type                  
Cardio            109  102  44
HIIT               85   87  49
Strength           97  116  45
Yoga               85  101  53




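For Cardio, level 1 has the most members (109), and the count falls as experience rises (102 at level 2, 44 at level 3), which suggests Cardio is popular with beginners. For Strength, level 2 has the most members (116), which may indicate that members turn to strength training after gaining some experience. Yoga has the highest level 3 count of any workout type (53), which may mean that experienced members are more likely to appreciate yoga and stick with it. As a share of each type's members, however, level 3 accounts for about 22% of both Yoga and HIIT, so this difference is small.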
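One way to check whether these patterns go beyond chance variation is a chi-square test of independence on the count table. The sketch below computes the statistic by hand with NumPy from the counts printed above. It is an added illustration rather than part of the original analysis, and the constant `CRITICAL_5PCT_DOF6` is the standard tabulated 5% critical value for 6 degrees of freedom.

```python
import numpy as np

# Counts copied from the table above (rows: workout types, columns: levels 1 to 3).
observed = np.array(
    [
        [109, 102, 44],  # Cardio
        [85, 87, 49],    # HIIT
        [97, 116, 45],   # Strength
        [85, 101, 53],   # Yoga
    ],
    dtype=float,
)

row_totals = observed.sum(axis=1, keepdims=True)
col_totals = observed.sum(axis=0, keepdims=True)
n = observed.sum()

# Expected counts if workout type and experience level were independent.
expected = row_totals * col_totals / n

# Pearson chi-square statistic and its degrees of freedom.
chi2 = float(((observed - expected) ** 2 / expected).sum())
dof = (observed.shape[0] - 1) * (observed.shape[1] - 1)

# Standard tabulated 5% critical value for 6 degrees of freedom.
CRITICAL_5PCT_DOF6 = 12.592
print(f"chi2 = {chi2:.2f}, dof = {dof}, exceeds 5% critical value: {chi2 > CRITICAL_5PCT_DOF6}")
```

On these counts the statistic comes out at roughly 5.8, below the critical value, so the table by itself does not provide strong evidence that workout choice depends on experience level. The interpretations above are best treated as hypotheses.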