# Regression Models

A hands-on introduction to regression using the Diabetes dataset bundled with scikit-learn. We fit a linear regression model on three features (`bmi`, `bp`, `s1`), then inspect its predictions and errors on a held-out test set.

In [None]:
from sklearn.datasets import load_diabetes
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error, r2_score
import pandas as pd
import matplotlib.pyplot as plt
%matplotlib inline

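# Load the dataset as pandas objects and keep three features:
# body mass index (bmi), average blood pressure (bp), total serum cholesterol (s1).
data = load_diabetes(as_frame=True)
X = data.data[['bmi', 'bp', 's1']]
y = data.target  # disease progression one year after baseline

# Hold out 20% of the rows for testing; fix the seed for reproducibility.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Fit ordinary least squares and evaluate on the held-out rows.
model = LinearRegression().fit(X_train, y_train)
preds = model.predict(X_test)
print('MSE:', mean_squared_error(y_test, preds))
print('R2:', r2_score(y_test, preds))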
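
MSE is expressed in squared target units, so it can be hard to read on its own. The sketch below repeats the same split and fit so that it runs on its own, then prints the fitted coefficients and the RMSE (the square root of MSE, in the target's own units). The names `coef_table` and `rmse` are introduced here for illustration only. Note that `load_diabetes` returns features that are already mean-centered and scaled, so the coefficients are not in the original clinical units.

```python
import numpy as np
import pandas as pd
from sklearn.datasets import load_diabetes
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# Same data, features, and split as the cell above.
data = load_diabetes(as_frame=True)
X = data.data[["bmi", "bp", "s1"]]
y = data.target
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)
model = LinearRegression().fit(X_train, y_train)

# Pair each coefficient with its feature name (illustrative helper table).
coef_table = pd.Series(model.coef_, index=X.columns, name="coefficient")
print(coef_table)
print("Intercept:", model.intercept_)

# RMSE is the square root of MSE, reported in the target's units.
mse = mean_squared_error(y_test, model.predict(X_test))
rmse = np.sqrt(mse)
print("RMSE:", rmse)
```

Because all three features share the same scaling, the relative sizes of the coefficients give a rough sense of which feature moves the prediction most, although correlated features can make that reading unreliable.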

In [None]:
# Plot predicted vs actual
plt.figure(figsize=(6,6))
plt.scatter(y_test, preds, alpha=0.7)
# Dashed diagonal marks perfect predictions (predicted == actual)
plt.plot([y_test.min(), y_test.max()],[y_test.min(), y_test.max()], 'r--')
plt.xlabel('Actual')
plt.ylabel('Predicted')
plt.title('Predicted vs Actual')
plt.show()
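
The scatter plot shows errors only as distance from the diagonal. As a complementary view, the following sketch tabulates the residuals (actual minus predicted) and lists the test rows with the largest absolute error. It repeats the split and fit so that it is self-contained; `summary` and `worst` are names assumed here for illustration.

```python
import pandas as pd
from sklearn.datasets import load_diabetes
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Same data, features, and split as the cells above.
data = load_diabetes(as_frame=True)
X = data.data[["bmi", "bp", "s1"]]
y = data.target
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)
preds = LinearRegression().fit(X_train, y_train).predict(X_test)

# One row per test sample: actual value, prediction, and residual.
summary = pd.DataFrame({"actual": y_test, "predicted": preds})
summary["residual"] = summary["actual"] - summary["predicted"]

# Distribution of the residuals (mean, spread, quartiles).
print(summary["residual"].describe())

# The five test rows the model misses by the widest margin.
order = summary["residual"].abs().sort_values(ascending=False).index
worst = summary.loc[order].head(5)
print(worst)
```

A residual mean far from zero, or a handful of very large residuals, would suggest the model is systematically biased or that a few samples dominate the MSE.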

## Next steps
- Try adding polynomial features or interaction terms.
- Compare with regularized models (Ridge/Lasso).
- Use cross-validation for more robust error estimates.
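
As a starting point for the steps above, here is a minimal sketch that compares plain linear regression, Ridge, Lasso, and a degree-2 polynomial expansion using 5-fold cross-validation. The `alpha=1.0` values, the polynomial degree, and the fold count are arbitrary illustrative choices, not tuned settings, and the `candidates` and `cv_r2` names are assumptions made for this example. The regularized models are wrapped in a `StandardScaler` pipeline because the dataset's features are scaled to very small values, which would otherwise make a given `alpha` shrink the coefficients much more strongly.

```python
from sklearn.datasets import load_diabetes
from sklearn.linear_model import Lasso, LinearRegression, Ridge
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler

data = load_diabetes(as_frame=True)
X = data.data[["bmi", "bp", "s1"]]
y = data.target

# Candidate models; hyperparameters are illustrative assumptions.
candidates = {
    "linear": LinearRegression(),
    "ridge": make_pipeline(StandardScaler(), Ridge(alpha=1.0)),
    "lasso": make_pipeline(StandardScaler(), Lasso(alpha=1.0)),
    "poly2 + linear": make_pipeline(
        PolynomialFeatures(degree=2, include_bias=False),  # squares and interactions
        StandardScaler(),
        LinearRegression(),
    ),
}

# 5-fold cross-validated R2 for each candidate.
cv_r2 = {}
for name, estimator in candidates.items():
    scores = cross_val_score(estimator, X, y, cv=5, scoring="r2")
    cv_r2[name] = scores.mean()
    print(f"{name:>15}: mean R2 = {scores.mean():.3f} (std {scores.std():.3f})")
```

Putting the scaler and feature expansion inside the pipeline means they are refit on each training fold, so no information from the validation fold leaks into preprocessing. To tune `alpha` rather than fix it, `RidgeCV` and `LassoCV` are possible alternatives.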