1) Frailty is physical weakness: a lack of health or strength. In females, reduced grip strength is
correlated with higher frailty scores, and vice versa. Hand grip strength can be quantified by
measuring the static force with which the hand squeezes a dynamometer. The
force is most commonly measured in kilograms or pounds. The table below presents
data from 10 female participants. Height is measured in inches, Weight in pounds, Age in
years, and Grip strength in kilograms. Frailty is a qualitative attribute indicating the presence or
absence of symptoms. (10 points)
Height (in)  Weight (lb)  Age (yr)  Grip strength (kg)  Frailty
65.8         112          30        30                  N
71.5         136          19        31                  N
69.4         153          45        29                  N
68.2         142          22        28                  Y
67.8         144          29        24                  Y
68.7         123          50        26                  N
69.8         141          51        22                  Y
70.1         136          23        20                  Y
67.9         112          17        19                  N
66.8         120          39        31                  N
Based on the table above, you must design AND implement a three‑stage workflow (ingest
→ process → analyze) with code and organized outputs (see the case study in Chapter
3). Save the raw data in a CSV file, read it into a pandas DataFrame, and then
perform the following:
a. Unit standardization
i. Height_m = Height_in * 0.0254
ii. Weight_kg = Weight_lb * 0.45359237
b. Feature engineering
i. BMI = Weight_kg / (Height_m ** 2) (round to 2 decimals).
ii. AgeGroup (categorical): "<30", "30–45", "46–60", ">60" based on Age_yr.
c. Categorical → numeric encoding
i. Binary encoding: Frailty_binary (Y→1, N→0, store as int8).
ii. One‑hot encode AgeGroup into columns: AgeGroup_<30, AgeGroup_30–45,
AgeGroup_46–60, AgeGroup_>60
d. EDA & Reporting
i. Compute a summary table (mean/median/std) for the numeric columns; save it to
reports/findings.md.
ii. Quantify the strength ↔ frailty relation: compute the correlation between Grip_kg
and Frailty_binary, and report it.


1. Ingest


In [138]:
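import pandas as pd
import os

In [162]:
# Column-wise copy of the ten participants in the problem table.
data = {
    "Height_in": [65.8, 71.5, 69.4, 68.2, 67.8, 68.7, 69.8, 70.1, 67.9, 66.8],
    "Weight_lb": [112, 136, 153, 142, 144, 123, 141, 136, 112, 120],
    "Age_yr": [30, 19, 45, 22, 29, 50, 51, 23, 17, 39],
    "Grip_kg": [30, 31, 29, 28, 24, 26, 22, 20, 19, 31],
    "Frailty": list("NNNYYNYYNN"),
}
os.makedirs("data", exist_ok=True)
raw_path = "data/raw_frailty.csv"
pd.DataFrame(data).to_csv(raw_path, index=False)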

In [164]:
df = pd.read_csv(raw_path)
df.head(11)

,Height_in,Weight_lb,Age_yr,Grip_kg,Frailty
0,65.8,112,30,30,N
1,71.5,136,19,31,N
2,69.4,153,45,29,N
3,68.2,142,22,28,Y
4,67.8,144,29,24,Y
5,68.7,123,50,26,N
6,69.8,141,51,22,Y
7,70.1,136,23,20,Y
8,67.9,112,17,19,N
9,66.8,120,39,31,N
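
Before moving on to processing, it can help to confirm that the ingested frame has the shape the later steps rely on. The following is a minimal sketch of such a check, not part of the original notebook: `validate_raw` and `EXPECTED_COLUMNS` are hypothetical names introduced here, and the two-row `sample` frame only stands in for the full CSV.

```python
import pandas as pd

# Hypothetical constant: the columns the process stage expects to find.
EXPECTED_COLUMNS = ["Height_in", "Weight_lb", "Age_yr", "Grip_kg", "Frailty"]


def validate_raw(df):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    missing = [c for c in EXPECTED_COLUMNS if c not in df.columns]
    if missing:
        # Without the columns, the remaining checks cannot run.
        problems.append(f"missing columns: {missing}")
        return problems
    if df[EXPECTED_COLUMNS].isna().any().any():
        problems.append("missing values present")
    bad_labels = set(df["Frailty"].dropna()) - {"Y", "N"}
    if bad_labels:
        problems.append(f"unexpected Frailty labels: {sorted(bad_labels)}")
    return problems


# Two rows copied from the problem table, used only to exercise the check.
sample = pd.DataFrame({
    "Height_in": [65.8, 68.2],
    "Weight_lb": [112, 142],
    "Age_yr": [30, 22],
    "Grip_kg": [30, 28],
    "Frailty": ["N", "Y"],
})
print(validate_raw(sample))
```

In the notebook itself the same function could be called on `df` right after `pd.read_csv(raw_path)`.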


2. Process

In [149]:
df["Height_m"] = df["Height_in"] * 0.0254
df["Weight_kg"] = df["Weight_lb"] * 0.45359237


In [150]:
df["BMI"] = (df["Weight_kg"] / (df["Height_m"]**2)).round(2)
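
As a sanity check on the vectorized formula, the BMI of the first participant (65.8 in, 112 lb) can be worked out by hand. This is only an illustrative sketch; the constant names `INCH_TO_M` and `LB_TO_KG` are introduced here and do not appear in the notebook.

```python
# Conversion factors from the task statement.
INCH_TO_M = 0.0254
LB_TO_KG = 0.45359237

# First row of the table: 65.8 in, 112 lb.
height_m = 65.8 * INCH_TO_M
weight_kg = 112 * LB_TO_KG

# BMI = weight in kilograms divided by height in metres squared.
bmi = round(weight_kg / height_m**2, 2)
print(bmi)  # 18.19
```

The result agrees with the BMI reported for row 0 of the processed frame.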

In [151]:
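def make_age_group(age):
    """Map an age in years to its AgeGroup label."""
    if age < 30:
        return "<30"
    elif age <= 45:
        return "30–45"
    elif age <= 60:
        # Chained upper bounds keep fractional ages (e.g. 45.5) out of ">60".
        return "46–60"
    else:
        return ">60"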

In [152]:
df["AgeGroup"] = df["Age_yr"].apply(make_age_group)
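
The same binning could also be expressed with `pd.cut`, which returns a categorical column that remembers all four labels. The sketch below assumes ages are recorded in whole years, so that "<30" is equivalent to "at most 29"; the `ages` series simply repeats the Age column from the table.

```python
import numpy as np
import pandas as pd

# Age column from the problem table.
ages = pd.Series([30, 19, 45, 22, 29, 50, 51, 23, 17, 39], name="Age_yr")

labels = ["<30", "30–45", "46–60", ">60"]
# Right-closed bins: (-inf, 29], (29, 45], (45, 60], (60, inf).
# Assumption: whole-year ages, so "<30" means "<= 29".
age_group = pd.cut(ages, bins=[-np.inf, 29, 45, 60, np.inf], labels=labels)

print(age_group.tolist())
```

For fractional ages the bin edges would need to be reconsidered, which is why the explicit function above may be the safer choice in that case.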

In [153]:
df["Frailty_binary"] = df["Frailty"].map({"N":0, "Y":1}).astype("int8")


In [154]:
age_dummies = pd.get_dummies(df["AgeGroup"], prefix="AgeGroup")
df = pd.concat([df, age_dummies], axis=1)
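
Note that `pd.get_dummies` only creates columns for labels that actually occur, so with no participant over 60 the required `AgeGroup_>60` column is not produced. One possible way to guarantee all four columns, sketched here on a small stand-in series rather than the notebook's `df`, is to declare the categories explicitly before encoding.

```python
import pandas as pd

labels = ["<30", "30–45", "46–60", ">60"]
# Stand-in for df["AgeGroup"]; note that ">60" never occurs.
age_group = pd.Series(["30–45", "<30", "46–60", "<30"], name="AgeGroup")

# Declaring every allowed category up front makes get_dummies emit a column
# for each one, even when a category has no rows.
age_group = age_group.astype(pd.CategoricalDtype(categories=labels, ordered=True))
age_dummies = pd.get_dummies(age_group, prefix="AgeGroup", dtype="int8")

print(age_dummies.columns.tolist())
```

Passing `dtype="int8"` also keeps the indicators as 0/1 integers; recent pandas versions otherwise return boolean columns.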


In [155]:
os.makedirs("processed", exist_ok=True)
df.to_csv("processed/processed_frailty.csv", index=False)


In [158]:
df.head(11)

,Height_in,Weight_lb,Age_yr,Grip_kg,Frailty,Height_m,Weight_kg,BMI,AgeGroup,Frailty_binary,AgeGroup_30–45,AgeGroup_46–60,AgeGroup_<30
0,65.8,112,30,30,N,1.67132,50.802345,18.19,30–45,0,1,0,0
1,71.5,136,19,31,N,1.8161,61.688562,18.7,<30,0,0,0,1
2,69.4,153,45,29,N,1.76276,69.399633,22.33,30–45,0,1,0,0
3,68.2,142,22,28,Y,1.73228,64.410117,21.46,<30,1,0,0,1
4,67.8,144,29,24,Y,1.72212,65.317301,22.02,<30,1,0,0,1
5,68.7,123,50,26,N,1.74498,55.791862,18.32,46–60,0,0,1,0
6,69.8,141,51,22,Y,1.77292,63.956524,20.35,46–60,1,0,1,0
7,70.1,136,23,20,Y,1.78054,61.688562,19.46,<30,1,0,0,1
8,67.9,112,17,19,N,1.72466,50.802345,17.08,<30,0,0,0,1
9,66.8,120,39,31,N,1.69672,54.431084,18.91,30–45,0,1,0,0


3. Analyze

In [159]:
# Mean, median (the 50% quantile) and standard deviation of each numeric column.
summary = df.describe().transpose()[["mean", "50%", "std"]]
summary = summary.rename(columns={"50%": "median"})
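
An alternative that asks for the three statistics directly, instead of slicing them out of `describe()`, might look like the sketch below. It runs on a small stand-in frame (`df_demo`, a name introduced here) holding only the Grip and Frailty columns from the table.

```python
import pandas as pd

# Stand-in frame: one numeric and one text column from the problem table.
df_demo = pd.DataFrame({
    "Grip_kg": [30, 31, 29, 28, 24, 26, 22, 20, 19, 31],
    "Frailty": list("NNNYYNYYNN"),
})

# Keep numeric columns only, then compute the three statistics per column.
numeric = df_demo.select_dtypes(include="number")
summary_demo = numeric.agg(["mean", "median", "std"]).transpose()

print(summary_demo)
```

The resulting columns are already named `mean`, `median` and `std`, so no rename step is needed.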

In [160]:
corr = df["Grip_kg"].corr(df["Frailty_binary"])


In [161]:
os.makedirs("reports", exist_ok=True)
with open("reports/findings.md", "w", encoding="utf-8") as f:
    f.write("# Frailty Study Report\n\n")
    f.write("## Summary Statistics\n\n")
    f.write(summary.to_markdown())  # requires the optional `tabulate` package
    f.write("\n\n")
    f.write("## Correlation Analysis\n\n")
    f.write(f"Correlation between Grip strength and Frailty_binary: {corr:.3f}\n")

summary, corr


(                      mean      median        std
 Height_in        68.600000   68.450000   1.670662
 Weight_lb       131.900000  136.000000  14.231811
 Age_yr           32.500000   29.500000  12.860361
 Grip_kg          26.000000   27.000000   4.521553
 Height_m          1.742440    1.738630   0.042435
 Weight_kg        59.828834   61.688562   6.455441
 BMI              19.682000   19.185000   1.780972
 Frailty_binary    0.400000    0.000000   0.516398
 AgeGroup_30–45    0.300000    0.000000   0.483046
 AgeGroup_46–60    0.200000    0.000000   0.421637
 AgeGroup_<30      0.500000    0.500000   0.527046,
 -0.4758668672668007)
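
Because `Frailty_binary` takes only the values 0 and 1, the Pearson correlation reported above is a point-biserial correlation: it is driven by the difference between the mean grip strength of the frail and non-frail groups. The sketch below recomputes it from that definition using the Grip and Frailty values in the table; the variable names are illustrative only.

```python
import numpy as np

# Grip strength (kg) and frailty indicator (Y -> 1, N -> 0) from the table.
grip = np.array([30, 31, 29, 28, 24, 26, 22, 20, 19, 31], dtype=float)
frail = np.array([0, 0, 0, 1, 1, 0, 1, 1, 0, 0])

mean_frail = grip[frail == 1].mean()      # mean grip of frail participants
mean_not_frail = grip[frail == 0].mean()  # mean grip of non-frail participants
p = frail.mean()                          # proportion of frail participants

# Point-biserial formula, using the population standard deviation (ddof=0).
r_pb = (mean_frail - mean_not_frail) / grip.std(ddof=0) * np.sqrt(p * (1 - p))

# Plain Pearson correlation for comparison.
r_pearson = np.corrcoef(grip, frail)[0, 1]

print(round(r_pb, 3), round(r_pearson, 3))
```

Both values come out at about -0.476: in this sample, frail participants have a lower mean grip strength (23.5 kg versus about 27.7 kg). With only ten participants, this moderate negative association should be read with caution.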