# Regression Models

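The bias-variance tradeoff is central to choosing among the models below: regularized models such as ridge and lasso accept a small amount of bias in exchange for lower variance. See the [Wikipedia article on the bias-variance tradeoff](https://en.wikipedia.org/wiki/Bias%E2%80%93variance_tradeoff).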

### Linear Regression
We model the target as $y = Xw + b$ and minimize the mean squared error (MSE), where $\hat{y}_i$ is the prediction for sample $i$:
$$\text{MSE}=\frac{1}{n}\sum_{i=1}^n (y_i-\hat{y}_i)^2$$
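
As a quick check of the formula, the sketch below computes the MSE by hand and compares it with scikit-learn's `mean_squared_error`. The arrays `y_true` and `y_pred` are made-up values used only for illustration; they are not taken from the house-price data.

```python
import numpy as np
from sklearn.metrics import mean_squared_error

# Hypothetical targets and predictions, chosen only for illustration.
y_true = np.array([3.0, -0.5, 2.0, 7.0])
y_pred = np.array([2.5, 0.0, 2.0, 8.0])

# Direct translation of the formula: the average of the squared residuals.
mse_manual = np.mean((y_true - y_pred) ** 2)

# scikit-learn's helper computes the same quantity.
mse_sklearn = mean_squared_error(y_true, y_pred)

print(mse_manual, mse_sklearn)
```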

In [1]:
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
X = pd.read_csv('../data/house_prices.csv')
y = X.pop('SalePrice')  # remove the target column from the feature table
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = LinearRegression().fit(X_train, y_train)
model.score(X_test, y_test)  # R^2 on the held-out test set

0.9999996298908006

### Ridge Regression
Ridge regression adds an L2 penalty on the weights to the squared-error loss, which shrinks the coefficients toward zero:
$$\hat{w}=\arg\min_w \|y-Xw\|_2^2+\alpha \|w\|_2^2$$

In [ ]:
from sklearn.linear_model import Ridge
ridge = Ridge(alpha=1.0).fit(X_train, y_train)
ridge.score(X_test, y_test)
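
When no intercept is fitted, the ridge objective has the closed-form solution $\hat{w} = (X^\top X + \alpha I)^{-1} X^\top y$. The sketch below checks this against `Ridge(fit_intercept=False)` on synthetic data. The names `X_demo`, `y_demo`, and `w_true` are hypothetical and exist only for this illustration.

```python
import numpy as np
from sklearn.linear_model import Ridge

# Synthetic data (an assumption for illustration, not the house-price data).
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(50, 3))
w_true = np.array([1.5, -2.0, 0.5])
y_demo = X_demo @ w_true + rng.normal(scale=0.1, size=50)

alpha = 1.0

# Closed form: solve (X^T X + alpha * I) w = X^T y.
w_closed = np.linalg.solve(
    X_demo.T @ X_demo + alpha * np.eye(X_demo.shape[1]),
    X_demo.T @ y_demo,
)

# scikit-learn's estimator, with the intercept disabled to match the formula.
w_sklearn = Ridge(alpha=alpha, fit_intercept=False).fit(X_demo, y_demo).coef_

print(w_closed)
print(np.allclose(w_closed, w_sklearn, atol=1e-6))
```

With the default `fit_intercept=True`, the intercept is not penalized, so the comparison would need centered data instead.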

### Lasso Regression
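Lasso regression uses an L1 penalty, which drives some coefficients exactly to zero and so encourages sparse models:
$$\hat{w}=\arg\min_w \|y-Xw\|_2^2+\alpha \|w\|_1$$
Note that scikit-learn's `Lasso` scales the squared-error term by $1/(2n)$, so its `alpha` is not numerically identical to the $\alpha$ in this formula.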

In [ ]:
from sklearn.linear_model import Lasso
lasso = Lasso(alpha=0.1).fit(X_train, y_train)
lasso.score(X_test, y_test)
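
To see the sparsity effect in isolation, the sketch below fits ordinary least squares and lasso on synthetic data in which only two of ten features influence the target. The data, the names `X_demo`, `y_demo`, `w_true`, and the noise level are all assumptions made for illustration.

```python
import numpy as np
from sklearn.linear_model import Lasso, LinearRegression

# Synthetic data: only the first two of ten features carry signal.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(200, 10))
w_true = np.zeros(10)
w_true[:2] = [3.0, -2.0]
y_demo = X_demo @ w_true + rng.normal(scale=0.5, size=200)

ols = LinearRegression().fit(X_demo, y_demo)
lasso_demo = Lasso(alpha=0.1).fit(X_demo, y_demo)

# Count coefficients that are exactly zero under each model.
n_zero_ols = int(np.sum(ols.coef_ == 0))
n_zero_lasso = int(np.sum(lasso_demo.coef_ == 0))

print("OLS zero coefficients:  ", n_zero_ols)
print("Lasso zero coefficients:", n_zero_lasso)
print("Lasso coefficients:", np.round(lasso_demo.coef_, 3))
```

Least squares typically assigns a small nonzero weight to every irrelevant feature, whereas lasso tends to set most of them exactly to zero while also shrinking the informative coefficients somewhat.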

### Advantages and Disadvantages
**Advantages**:

- Coefficients are simple to interpret.
- Fast to train.

**Disadvantages**:

- Assumes a linear relationship between the features and the target.
- Sensitive to outliers, because squared error weights large residuals heavily.
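
The outlier sensitivity can be illustrated with a minimal sketch: ten points lying exactly on the line $y = 2x + 1$, refitted after a single target value is corrupted. The data and the size of the corruption (`+50`) are arbitrary choices for illustration.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Ten points lying exactly on y = 2x + 1 (illustrative data).
x = np.arange(10, dtype=float).reshape(-1, 1)
y_clean = 2.0 * x.ravel() + 1.0

# Corrupt a single target value to create one outlier.
y_outlier = y_clean.copy()
y_outlier[-1] += 50.0

slope_clean = LinearRegression().fit(x, y_clean).coef_[0]
slope_outlier = LinearRegression().fit(x, y_outlier).coef_[0]

print(f"slope without outlier: {slope_clean:.3f}")
print(f"slope with one outlier: {slope_outlier:.3f}")
```

One corrupted point out of ten is enough to more than double the fitted slope in this setup.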

### Usage Example

In [2]:
model.predict(X_test.iloc[:2])

array([202.22093264, -12.19762234])

### Linear Regression Refresher

Linear regression models a target variable $y$ as a linear combination of features $x_1, \ldots, x_p$:

$$\hat{y} = \beta_0 + \beta_1 x_1 + \cdots + \beta_p x_p$$

The optimal coefficients minimize the residual sum of squares, where $X$ includes a leading column of ones so that the intercept $\beta_0$ is part of $\beta$:

$$\min_\beta \|y - X\beta\|^2,$$

which, provided $X^\top X$ is invertible, has the closed-form solution (the normal equation):

$$\hat{\beta} = (X^\top X)^{-1} X^\top y.$$
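
A minimal NumPy sketch of the normal equation follows, using synthetic data with known coefficients. The names `features`, `X_design`, and `y_demo` are hypothetical. The code solves the linear system rather than forming the inverse explicitly, which is the usual numerical practice, and compares the result with `np.linalg.lstsq`.

```python
import numpy as np

# Synthetic data generated from known coefficients (4.0, 2.0, -1.0).
rng = np.random.default_rng(0)
n, p = 100, 2
features = rng.normal(size=(n, p))
y_demo = 4.0 + features @ np.array([2.0, -1.0]) + rng.normal(scale=0.1, size=n)

# Prepend a column of ones so the intercept beta_0 is estimated too.
X_design = np.column_stack([np.ones(n), features])

# Solve (X^T X) beta = X^T y instead of inverting X^T X.
beta_normal = np.linalg.solve(X_design.T @ X_design, X_design.T @ y_demo)

# Least-squares solver for comparison; it avoids forming X^T X at all.
beta_lstsq, *_ = np.linalg.lstsq(X_design, y_demo, rcond=None)

print(beta_normal)
print(np.allclose(beta_normal, beta_lstsq))
```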


### Worked Example: Interpreting Coefficients

```python
import numpy as np
from sklearn.linear_model import LinearRegression

X = np.array([[800], [1000], [1200]])  # square footage
y = np.array([150_000, 200_000, 230_000])
model = LinearRegression().fit(X, y)

print(model.intercept_)
print(model.coef_)
print(model.predict([[900]]))
```

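The intercept is the predicted price for a home with zero square feet, and the coefficient is the expected change in price for each additional square foot. For this data the fitted coefficient is 200 and the intercept is about -6,667. The negative intercept has no physical meaning, because zero square feet lies far outside the observed range. The final line predicts the price of a 900 square-foot home: $200 \times 900 - 6{,}667 \approx 173{,}333$.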


### Exercises & Further Reading
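1. Fit Ridge and Lasso models on the same data and compare their test scores with plain linear regression.
2. Perform cross-validation to estimate how well each model generalizes.
3. Read the [scikit-learn guide to linear models](https://scikit-learn.org/stable/modules/linear_model.html).
4. Plot residuals for a fitted model and comment on any patterns.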
5. Use `PolynomialFeatures` to fit a quadratic curve and compare it to the linear fit.
6. Derive the normal equation yourself and verify the solution using NumPy.
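
As a possible starting point for the cross-validation exercise, the sketch below runs 5-fold cross-validation with `cross_val_score`. It uses synthetic data from `make_regression` instead of the house-price file, and the choices of `Ridge(alpha=1.0)`, `cv=5`, and `scoring="r2"` are assumptions to adapt.

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_val_score

# Synthetic regression problem (a stand-in for the real dataset).
X_demo, y_demo = make_regression(
    n_samples=200, n_features=5, noise=10.0, random_state=0
)

# One R^2 score per fold; the model is refitted on each training split.
scores = cross_val_score(Ridge(alpha=1.0), X_demo, y_demo, cv=5, scoring="r2")

print("per-fold R^2:", scores)
print("mean and std:", scores.mean(), scores.std())
```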
