Overfitting happens when a model is too complex and fits the training data too closely,
which makes it perform poorly on new data. To avoid this, we use regularization techniques such as Lasso and Ridge
regression, which simplify the model by limiting the size of its coefficients.

#### Lasso regression (L1)

- It helps prevent overfitting by penalizing large coefficients, which is useful when the number of predictors is large.
- It automatically selects the most important features by shrinking the coefficients of less significant features to exactly zero.
- A penalty term proportional to the sum of the absolute values of the coefficients (λ Σ|βⱼ|) is added to the loss function.
- Selecting the right lambda value is important (scikit-learn calls this parameter `alpha`). Cross-validation is used to find the value that best balances model complexity and predictive performance.
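
The feature-selection behaviour described in the list above can be illustrated with a minimal sketch on synthetic data. The data, the true coefficients, the variable names (`X_demo`, `y_demo`, `lasso_demo`) and `alpha=0.1` are all assumptions chosen for illustration, not values from this notebook.

```python
import numpy as np
from sklearn.linear_model import Lasso

# Synthetic data (assumption): 10 features, only the first 3 influence y
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(200, 10))
true_coef = np.array([3.0, -2.0, 1.5, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0])
y_demo = X_demo @ true_coef + rng.normal(scale=0.5, size=200)

# alpha is scikit-learn's name for lambda; 0.1 is an arbitrary illustrative value
lasso_demo = Lasso(alpha=0.1)
lasso_demo.fit(X_demo, y_demo)

# Count the coefficients that the L1 penalty pushed to exactly zero
n_zero = int(np.sum(lasso_demo.coef_ == 0))
print("Lasso coefficients:", np.round(lasso_demo.coef_, 3))
print("Coefficients set exactly to zero:", n_zero)
```

With a larger `alpha`, more coefficients would be expected to reach zero; with a very small one, Lasso behaves much like ordinary least squares.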

#### Ridge regression (L2)

- Shrinks all coefficients toward zero, but none become exactly zero.
- Useful when there is multicollinearity (high correlation among features) or when you want all features to contribute.
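
To make the multicollinearity point concrete, here is a small sketch comparing ordinary least squares with ridge on two nearly identical features. The synthetic data, the names `X_collinear`, `y_collinear`, `ols`, `ridge_demo`, and the choice `alpha=1.0` are assumptions for illustration only.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge

# Synthetic data (assumption): x2 is almost an exact copy of x1
rng = np.random.default_rng(0)
x1 = rng.normal(size=200)
x2 = x1 + rng.normal(scale=0.01, size=200)
X_collinear = np.column_stack([x1, x2])
y_collinear = 2.0 * x1 + 2.0 * x2 + rng.normal(scale=0.5, size=200)

# Unpenalized fit versus an L2-penalized fit on the same data
ols = LinearRegression().fit(X_collinear, y_collinear)
ridge_demo = Ridge(alpha=1.0).fit(X_collinear, y_collinear)

print("OLS coefficients:  ", ols.coef_)
print("Ridge coefficients:", ridge_demo.coef_)
```

With near-duplicate features, the unpenalized coefficients can be unstable, while ridge tends to spread the weight across both features and keeps both nonzero.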


In [1]:
import numpy as np
import pandas as pd
from sklearn.linear_model import RidgeCV, LassoCV
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error, r2_score

In [2]:
from sklearn.datasets import fetch_california_housing
data = fetch_california_housing()
X = pd.DataFrame(data.data, columns=data.feature_names)
y = data.target

In [3]:
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)


##### Ridge Regression with Cross-Validation
- Tests 50 values of alpha, log-spaced from 10^-3 to 10^3
- Uses 5-fold CV
- Chooses the alpha with the best average cross-validated score
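
The three steps above can be sketched as an explicit loop. This is a conceptual illustration on synthetic data, not a description of how `RidgeCV` is implemented internally; `X_cv`, `y_cv`, `alpha_grid`, `mean_scores` and `best_alpha` are hypothetical names, and the `make_regression` settings are assumptions.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_val_score

# Synthetic regression data (assumption), standing in for the housing data
X_cv, y_cv = make_regression(n_samples=300, n_features=8, noise=10.0, random_state=0)

# Same grid shape as in the notebook: 50 values from 10^-3 to 10^3
alpha_grid = np.logspace(-3, 3, 50)

mean_scores = []
for alpha in alpha_grid:
    # 5-fold CV; for a regressor the default score is R² on each held-out fold
    scores = cross_val_score(Ridge(alpha=alpha), X_cv, y_cv, cv=5)
    mean_scores.append(scores.mean())

# Keep the alpha whose average held-out score is highest
best_alpha = alpha_grid[int(np.argmax(mean_scores))]
print("Best alpha from the manual search:", best_alpha)
```
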

In [4]:
# Try a range of alpha values
alphas = np.logspace(-3, 3, 50)  # 10^-3 to 10^3

ridge_cv = RidgeCV(alphas=alphas, cv=5)
ridge_cv.fit(X_train, y_train)

print("Best alpha (Ridge):", ridge_cv.alpha_)


Best alpha (Ridge): 8.286427728546842


In [5]:
y_pred_ridge = ridge_cv.predict(X_test)
print("Ridge R²:", r2_score(y_test, y_pred_ridge))
print("Ridge RMSE:", np.sqrt(mean_squared_error(y_test, y_pred_ridge)))

Ridge R²: 0.5763294311871543
Ridge RMSE: 0.7451051718977542


##### Lasso Regression with Cross-Validation

In [6]:
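# Reuse the same alpha grid; max_iter is raised above the default to give the solver room to converge
lasso_cv = LassoCV(alphas=alphas, cv=5, max_iter=10000)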
lasso_cv.fit(X_train, y_train)

print("Best alpha (Lasso):", lasso_cv.alpha_)


Best alpha (Lasso): 0.001


In [7]:
y_pred_lasso = lasso_cv.predict(X_test)
print("Lasso R²:", r2_score(y_test, y_pred_lasso))
print("Lasso RMSE:", np.sqrt(mean_squared_error(y_test, y_pred_lasso)))


Lasso R²: 0.5773121026225017
Lasso RMSE: 0.7442405630689862


##### Comparison between Ridge and Lasso

In [8]:
coef_comparison = pd.DataFrame({
    'Feature': X.columns,
    'Ridge Coeff': ridge_cv.coef_,
    'Lasso Coeff': lasso_cv.coef_
})
print(coef_comparison)


      Feature  Ridge Coeff  Lasso Coeff
0      MedInc     0.447339     0.444698
1    HouseAge     0.009738     0.009783
2    AveRooms    -0.120803    -0.115737
3   AveBedrms     0.769052     0.741172
4  Population    -0.000002    -0.000002
5    AveOccup    -0.003523    -0.003510
6    Latitude    -0.419735    -0.418522
7   Longitude    -0.433473    -0.431828
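
One way to read the table is to count how many Lasso coefficients are exactly zero and to see where the two models differ most. The sketch below rebuilds the table from the printed (rounded) values so that it runs on its own; in the notebook the same checks could be applied to `coef_comparison` directly. The names `coef_table`, `n_lasso_zero` and `Abs Diff` are hypothetical.

```python
import pandas as pd

# Coefficients copied from the printed table above (rounded to 6 decimals)
coef_table = pd.DataFrame({
    "Feature": ["MedInc", "HouseAge", "AveRooms", "AveBedrms",
                "Population", "AveOccup", "Latitude", "Longitude"],
    "Ridge Coeff": [0.447339, 0.009738, -0.120803, 0.769052,
                    -0.000002, -0.003523, -0.419735, -0.433473],
    "Lasso Coeff": [0.444698, 0.009783, -0.115737, 0.741172,
                    -0.000002, -0.003510, -0.418522, -0.431828],
})

# How many features did Lasso drop entirely?
n_lasso_zero = int((coef_table["Lasso Coeff"] == 0).sum())

# Where do the two models disagree the most?
coef_table["Abs Diff"] = (coef_table["Ridge Coeff"] - coef_table["Lasso Coeff"]).abs()

print("Lasso coefficients equal to zero:", n_lasso_zero)
print(coef_table.sort_values("Abs Diff", ascending=False))
```

None of the printed Lasso coefficients is zero, and the two columns are nearly identical. That is consistent with Lasso selecting alpha = 0.001, the smallest value in the grid, where the L1 penalty is weak; it may be worth extending the grid toward smaller values to check whether the optimum lies outside it.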
