# Lab 1 · Predicting Heart Rate Response

We will use a small dataset of 12 volunteers to see how dose, weight, and age explain the change in heart rate (Δ bpm) after a cardioactive study drug.

### Learning goals
- Load a CSV pharmacology dataset with pandas.
- Separate input features from the target variable.
- Fit a simple linear regression model using scikit-learn.
- Interpret the intercept and coefficients, and evaluate the fit with two intuitive metrics (R² and mean absolute error).

### Step-by-step plan
1. Import the libraries we need.
2. Read the dataset from `../data/heart_rate_response.csv`.
3. Look at the raw numbers to understand their ranges.
4. Split the data, train a linear regression model on the training part, and inspect the coefficients.
5. Compare predicted vs measured heart rate responses.

In [None]:
from pathlib import Path
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error, r2_score

data_path = Path('..') / 'data' / 'heart_rate_response.csv'
df = pd.read_csv(data_path)
print(f"Loaded {len(df)} participants.")
df
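
Before exploring, it can help to confirm that the file contains the columns the later cells rely on and to count any missing values. The sketch below is one possible check, not part of the original lab: `check_dataset` and `EXPECTED_COLS` are hypothetical names, and the `toy` rows are made up so the snippet runs without the CSV.

```python
import pandas as pd

# Column names used by the later cells in this lab.
EXPECTED_COLS = ['dose_mg_per_kg', 'weight_kg', 'age_years', 'delta_hr_bpm']


def check_dataset(df: pd.DataFrame) -> dict:
    """Report which expected columns are absent and how many values are missing."""
    missing_cols = [c for c in EXPECTED_COLS if c not in df.columns]
    present = [c for c in EXPECTED_COLS if c in df.columns]
    n_missing_values = int(df[present].isna().sum().sum())
    return {'missing_columns': missing_cols, 'missing_values': n_missing_values}


# Made-up rows for illustration only; these are NOT the lab data.
toy = pd.DataFrame({
    'dose_mg_per_kg': [0.5, 1.0, 1.5],
    'weight_kg': [60.0, 72.0, None],   # one deliberately missing weight
    'age_years': [25, 40, 55],
})                                     # 'delta_hr_bpm' deliberately left out
report = check_dataset(toy)
print(report)
```

In the notebook, calling `check_dataset(df)` on the loaded data should ideally report no missing columns and zero missing values.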

### Exercise 1 · Explore the dataset
- Use `.describe()` to review the ranges.
- Plot dose and age against the heart rate change to look for trends.
- Question to discuss: does age appear to reduce the response?

In [None]:
# describe() is not the last expression in this cell, so print it explicitly.
print(df.describe())

plt.figure(figsize=(5, 4))
plt.scatter(df['dose_mg_per_kg'], df['delta_hr_bpm'], color='steelblue')
plt.xlabel('Dose (mg/kg)')
plt.ylabel('Δ Heart Rate (bpm)')
plt.title('Dose vs response')
plt.show()

plt.figure(figsize=(5, 4))
plt.scatter(df['age_years'], df['delta_hr_bpm'], color='darkorange')
plt.xlabel('Age (years)')
plt.ylabel('Δ Heart Rate (bpm)')
plt.title('Age vs response')
plt.show()
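
The scatter plots give a visual impression; a pairwise Pearson correlation puts a rough number on the same trends. The sketch below uses made-up rows (not the lab data) so it runs on its own. With only 12 participants, any such correlation is a noisy estimate, and it looks at each feature separately rather than adjusting for the others.

```python
import pandas as pd

# Made-up rows for illustration only; these are NOT the lab data.
toy = pd.DataFrame({
    'dose_mg_per_kg': [0.5, 1.0, 1.5, 2.0],
    'age_years': [30, 60, 40, 50],
    'delta_hr_bpm': [4.0, 5.0, 11.0, 13.0],
})

# Pearson correlation of each feature with the response (the response itself is dropped).
corr_with_response = toy.corr()['delta_hr_bpm'].drop('delta_hr_bpm')
print(corr_with_response.round(2))
```

In the notebook, the same expression can be applied to `df` to see whether the sign of the age correlation agrees with what the plot suggests.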

### Exercise 2 · Prepare features and target
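We will treat dose, weight, and age as input features (`X`) and the change in heart rate as the target (`y`). The cell below also holds out 25% of the participants as a test set and fits the model on the remaining rows.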

In [None]:
feature_cols = ['dose_mg_per_kg', 'weight_kg', 'age_years']
X = df[feature_cols]
y = df['delta_hr_bpm']

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42
)

model = LinearRegression()
model.fit(X_train, y_train)

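# The intercept is the predicted Δ bpm when every feature equals zero
# (zero dose, zero weight, zero age), so read it as a fitting constant.
print(f"Intercept (Δ bpm at all-zero features): {model.intercept_:.2f}")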
for name, coef in zip(feature_cols, model.coef_):
    print(f"{name}: {coef:.2f} Δ bpm per unit")
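
To make the meaning of the printed numbers concrete: a linear regression prediction is the intercept plus each coefficient multiplied by its feature value. The sketch below checks this by hand against `predict`. It uses a made-up dataset generated from a known rule (not the lab data), and `toy_model` and `new_person` are hypothetical names.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

feature_cols = ['dose_mg_per_kg', 'weight_kg', 'age_years']

# Made-up rows for illustration only; these are NOT the lab data.
toy = pd.DataFrame({
    'dose_mg_per_kg': [0.5, 1.0, 1.5, 2.0, 1.0, 0.5],
    'weight_kg': [60.0, 72.0, 80.0, 65.0, 90.0, 70.0],
    'age_years': [25, 40, 55, 30, 60, 45],
})
# Response built from a known rule: 2 + 8*dose + 0*weight - 0.1*age (no noise).
toy['delta_hr_bpm'] = 2 + 8 * toy['dose_mg_per_kg'] - 0.1 * toy['age_years']

toy_model = LinearRegression().fit(toy[feature_cols], toy['delta_hr_bpm'])

# One hypothetical participant.
new_person = pd.DataFrame(
    [{'dose_mg_per_kg': 1.2, 'weight_kg': 75.0, 'age_years': 50}]
)

# Prediction by hand: intercept + sum(coefficient * feature value).
values = new_person[feature_cols].iloc[0].to_numpy(dtype=float)
by_hand = float(toy_model.intercept_ + np.dot(toy_model.coef_, values))
by_model = float(toy_model.predict(new_person[feature_cols])[0])
print(f"by hand: {by_hand:.3f}  model.predict: {by_model:.3f}")
```

Because the toy response contains no noise, the fit recovers the rule it was generated from; with real measurements the coefficients are only estimates.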

### Exercise 3 · Evaluate and interpret
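Use the test split to measure how well the model generalizes. With 12 participants and `test_size=0.25`, the test set holds only 3 people, so treat R² and MAE as rough indications rather than precise estimates.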

In [None]:
y_pred = model.predict(X_test)

r2 = r2_score(y_test, y_pred)
mae = mean_absolute_error(y_test, y_pred)
print(f"R²: {r2:.3f}")
print(f"MAE: {mae:.2f} bpm")

results = X_test.copy()
results['actual_delta_hr'] = y_test.values
results['predicted_delta_hr'] = y_pred.round(2)
results
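
An MAE value is easier to judge next to a simple reference. One common choice, sketched below as an optional extra rather than part of the original lab, is the error obtained by predicting the training-set mean for every test participant. `baseline_mae` is a hypothetical helper and the numbers are made up.

```python
import numpy as np
from sklearn.metrics import mean_absolute_error


def baseline_mae(y_train, y_test) -> float:
    """MAE obtained by predicting the training-set mean for every test row."""
    constant_pred = np.full(len(y_test), np.mean(y_train))
    return float(mean_absolute_error(y_test, constant_pred))


# Made-up numbers for illustration only; these are NOT the lab data.
toy_y_train = [4.0, 6.0, 8.0, 10.0]   # mean is 7.0
toy_y_test = [5.0, 9.0, 7.0]          # absolute errors: 2, 2, 0
toy_baseline = baseline_mae(toy_y_train, toy_y_test)
print(f"Baseline MAE: {toy_baseline:.2f} bpm")
```

In the notebook, `baseline_mae(y_train, y_test)` can be compared with `mae`: a model MAE well below the baseline suggests the features carry useful information.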

### Try it yourself
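- Change one of the feature values in a copy of `X_test` (for example, raise one participant's dose), call `model.predict` on the modified copy, and compare the result with the original prediction. Editing `results` itself has no effect, because the model never reads it.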
- How big must the dose change be to offset a 10-year age increase?
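
One way to approach the last question: in a linear model, two changes cancel when `coef_dose * Δdose + coef_age * Δage = 0`, which gives `Δdose = -coef_age * Δage / coef_dose`. The sketch below applies that formula and then verifies it with a what-if prediction. It is an illustration under assumptions: `dose_change_to_offset_age` is a hypothetical helper, the data are made up (not the lab data), and the formula only makes sense when the dose coefficient is not zero.

```python
import pandas as pd
from sklearn.linear_model import LinearRegression

feature_cols = ['dose_mg_per_kg', 'weight_kg', 'age_years']


def dose_change_to_offset_age(model, feature_cols, age_increase_years):
    """Dose change (mg/kg) that cancels the predicted effect of an age increase."""
    coefs = dict(zip(feature_cols, model.coef_))
    return -coefs['age_years'] * age_increase_years / coefs['dose_mg_per_kg']


# Made-up rows for illustration only; these are NOT the lab data.
toy = pd.DataFrame({
    'dose_mg_per_kg': [0.5, 1.0, 1.5, 2.0, 1.0, 0.5],
    'weight_kg': [60.0, 72.0, 80.0, 65.0, 90.0, 70.0],
    'age_years': [25, 40, 55, 30, 60, 45],
})
# Response built from a known rule: 2 + 8*dose - 0.1*age (no noise).
toy['delta_hr_bpm'] = 2 + 8 * toy['dose_mg_per_kg'] - 0.1 * toy['age_years']
toy_model = LinearRegression().fit(toy[feature_cols], toy['delta_hr_bpm'])

extra_dose = dose_change_to_offset_age(toy_model, feature_cols, 10)
print(f"Extra dose to offset +10 years: {extra_dose:.3f} mg/kg")

# What-if check: modify a copy of one row, never the original data.
person = toy[feature_cols].iloc[[0]].copy()
older = person.copy()
older['age_years'] = older['age_years'] + 10
older['dose_mg_per_kg'] = older['dose_mg_per_kg'] + extra_dose

before = float(toy_model.predict(person)[0])
after = float(toy_model.predict(older)[0])
print(f"Prediction before: {before:.3f}, after both changes: {after:.3f}")
```

In the notebook, passing the fitted `model` and `feature_cols` to the helper gives the answer for the lab data. The result inherits the uncertainty of coefficients estimated from a handful of participants.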