### Regularization in Machine Learning:
- Regularization is a technique in machine learning that prevents overfitting by adding a penalty term to the model's loss function, which encourages simpler and more generalizable models.
- Prevents overfitting - Adds constraints to the model to reduce the risk of memorizing noise in the training data.
- Improves generalization - Encourages simpler models that perform better on new, unseen data.
  - High variance - Overfitting
  - High bias - Underfitting
  - Low bias & low variance - Good balance

### Types of regularization:
#### 1. LASSO Regularization (or) L1:
- LASSO stands for Least Absolute Shrinkage and Selection Operator.
- It adds the absolute value of each coefficient's magnitude as a penalty term to the loss function (L).
- This penalty can shrink some coefficients to exactly zero, which helps in selecting only the important features and ignoring the less important ones.
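- Formula, cost = (1/n) Σᵢ (y(i) - y^(i))² + α Σⱼ |w(j)|
  - where,
  - i runs over the n datapoints and j runs over the m features
  - m - No. of features
  - n - No. of datapoints
  - y(i) - Actual target value
  - y^(i) - Predicted target value
  - w(j) - Coefficients of the features
  - α - Regularization parameter that controls the strength of regularization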
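
To make the formula concrete, here is a minimal sketch that evaluates this cost for a linear model in plain Python. The function name `lasso_cost` and the toy numbers are illustrative assumptions, not part of any library. Note that scikit-learn's `Lasso` scales the squared-error term by 1/(2n) rather than 1/n, so its internal objective differs from this one by that factor on the error term.

```python
def lasso_cost(X, y, w, alpha):
    """Mean squared error plus alpha times the sum of |w_j| (no intercept)."""
    n = len(y)
    # Prediction for each data point: dot product of its features with w.
    y_hat = [sum(x_j * w_j for x_j, w_j in zip(row, w)) for row in X]
    mse = sum((y_i - p_i) ** 2 for y_i, p_i in zip(y, y_hat)) / n
    l1_penalty = alpha * sum(abs(w_j) for w_j in w)
    return mse + l1_penalty


# Toy data (hypothetical): n = 2 data points, m = 2 features.
X = [[1.0, 0.0], [0.0, 1.0]]
y = [1.0, 2.0]
w = [1.0, -1.0]

cost = lasso_cost(X, y, w, alpha=0.5)
print(cost)  # 4.5 (error term) + 1.0 (penalty) = 5.5
```

Setting `alpha=0` leaves only the error term, so the gap between the two values is exactly the L1 penalty.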
 


#### Implementation in Python:
- lasso = Lasso(alpha=0.1) - Creates a Lasso regression model with regularization strength alpha set to 0.1.
- X, y = make_regression(n_samples=100, n_features=5, noise=0.1, random_state=42) - Generates a regression dataset with 100 samples, 5 features and a small amount of noise.
- X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42) - Splits the data into 80% training and 20% testing sets.

```python
from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
from sklearn.linear_model import Lasso
from sklearn.metrics import mean_squared_error

# Synthetic regression data: 100 samples, 5 features, small Gaussian noise.
X, y = make_regression(n_samples=100, n_features=5, noise=0.1, random_state=42)
# Hold out 20% of the data for testing.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Fit Lasso with regularization strength alpha=0.1.
lasso = Lasso(alpha=0.1)
lasso.fit(X_train, y_train)

# Evaluate on the held-out test set.
y_pred = lasso.predict(X_test)
mse = mean_squared_error(y_test, y_pred)
print(f"MSE:{mse}")
print("Coefficients:", lasso.coef_)
```

Output:

```
MSE:0.06362439921332955
Coefficients: [60.50305581 98.52475354 64.3929265  56.96061238 35.52928502]
```
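
In the run above all five coefficients stay non-zero. That is expected: `make_regression` treats up to 10 features as informative by default, so every one of the 5 features carries signal. The feature-selection behaviour is easier to see when some features are pure noise. The sketch below assumes only 2 informative features (`n_informative=2`) and a stronger penalty (`alpha=1.0`); both values are chosen for illustration only, and with them the penalty should set at least some of the uninformative coefficients to exactly zero.

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso

# Assumption for illustration: only 2 of the 5 features carry signal.
X, y = make_regression(
    n_samples=100, n_features=5, n_informative=2, noise=0.1, random_state=42
)

lasso = Lasso(alpha=1.0)
lasso.fit(X, y)

# Count the coefficients that the L1 penalty set exactly to zero.
n_zero = int((lasso.coef_ == 0).sum())
print("Coefficients:", lasso.coef_)
print("Exactly zero:", n_zero)
```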


#### 2. Ridge Regularization (or) L2:
- It adds the squared magnitude of each coefficient as a penalty term to the loss function (L).
- It handles multicollinearity by shrinking the coefficients of correlated features instead of eliminating them.
- Formula, cost = (1/n) Σᵢ (y(i) - y^(i))² + α Σⱼ w(j)²
  - where,
  - i runs over the n datapoints and j runs over the m features
  - m - No. of features
  - n - No. of datapoints
  - y(i) - Actual target value
  - y^(i) - Predicted target value
  - w(j) - Coefficients of the features
  - α - Regularization parameter that controls the strength of regularization
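
The difference between "shrinking to zero" (L1) and "shrinking without eliminating" (L2) can be seen in a simplified setting. The sketch below assumes a single feature, no intercept, and a feature scaled so that (1/n) Σ x(i)² = 1. Under those assumptions the error term reduces to (w - w_ols)² plus a constant, where `w_ols` is the unregularized coefficient, and each penalized cost has a closed-form minimizer. The function names are hypothetical helpers, not library calls.

```python
def lasso_shrink(w_ols, alpha):
    """Minimizer of (w - w_ols)**2 + alpha * |w|: soft-thresholding at alpha / 2."""
    if abs(w_ols) <= alpha / 2.0:
        return 0.0  # small coefficients are eliminated entirely
    return w_ols - alpha / 2.0 if w_ols > 0 else w_ols + alpha / 2.0


def ridge_shrink(w_ols, alpha):
    """Minimizer of (w - w_ols)**2 + alpha * w**2: proportional shrinkage."""
    return w_ols / (1.0 + alpha)


alpha = 1.0
for w_ols in (3.0, 0.4, -0.2):
    print(w_ols, lasso_shrink(w_ols, alpha), ridge_shrink(w_ols, alpha))
```

In this simplified case the L1 penalty maps any coefficient with |w_ols| ≤ α/2 to exactly zero, while the L2 penalty only rescales it, so a non-zero coefficient never becomes zero.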

#### Implementation in Python:
- ridge = Ridge(alpha=1.0) - Creates a Ridge regression model with regularization strength alpha set to 1.0.

```python
from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error
from sklearn.linear_model import Ridge

# Same synthetic dataset and 80/20 split as in the Lasso example.
X, y = make_regression(n_samples=100, n_features=5, noise=0.1, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Fit Ridge with regularization strength alpha=1.0.
ridge = Ridge(alpha=1.0)
ridge.fit(X_train, y_train)
y_pred = ridge.predict(X_test)

# Evaluate on the held-out test set.
mse = mean_squared_error(y_test, y_pred)
print("Mean Squared Error:", mse)
print("Coefficients:", ridge.coef_)
```

Output:

```
Mean Squared Error: 4.114050771972782
Coefficients: [59.87954432 97.15091098 63.24364738 56.31999433 35.34591136]
```
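
The point about multicollinearity can be checked on a deliberately extreme case. The sketch below builds two perfectly correlated features (the second column is a copy of the first) and fits Ridge on them. The data, the seed, and the true coefficient of 3.0 are assumptions made up for this illustration.

```python
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)
x = rng.normal(size=200)

# Two perfectly correlated features: the second column duplicates the first.
X = np.column_stack([x, x])
y = 3.0 * x + rng.normal(scale=0.1, size=200)

ridge = Ridge(alpha=1.0)
ridge.fit(X, y)

# Ridge spreads the effect across both copies instead of dropping one.
print("Coefficients:", ridge.coef_)
```

Because the L2 penalty is smallest when the weight is split evenly, both coefficients come out equal and their sum stays close to the true effect, rather than one feature being eliminated.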


#### 3. Elastic Net Regression:
- It is a combination of L1 and L2 regularization.
- The penalty term adds both the absolute values of the weights (L1) and the squared values of the weights (L2).
- An extra hyperparameter controls the ratio between the L1 and L2 penalties (l1_ratio in scikit-learn).
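
The notes give no formula for Elastic Net, so the sketch below assumes scikit-learn's parameterization of the penalty: `alpha * l1_ratio * Σ|w| + 0.5 * alpha * (1 - l1_ratio) * Σw²`. The function name `elastic_net_penalty` and the example weights are hypothetical.

```python
def elastic_net_penalty(w, alpha, l1_ratio):
    """Elastic Net penalty, assuming scikit-learn's parameterization."""
    l1 = sum(abs(w_j) for w_j in w)   # sum of absolute weights
    l2 = sum(w_j ** 2 for w_j in w)   # sum of squared weights
    return alpha * l1_ratio * l1 + 0.5 * alpha * (1.0 - l1_ratio) * l2


w = [1.0, -2.0]
for l1_ratio in (0.0, 0.5, 1.0):
    # l1_ratio=0.0 is a pure L2 penalty, l1_ratio=1.0 is a pure L1 penalty.
    print(l1_ratio, elastic_net_penalty(w, alpha=1.0, l1_ratio=l1_ratio))
```

For these weights the penalty is 2.5 with `l1_ratio=0.0`, 2.75 with `l1_ratio=0.5`, and 3.0 with `l1_ratio=1.0`.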


#### Implementation in Python:
- model = ElasticNet(alpha=1.0, l1_ratio=0.5) - Creates an Elastic Net regression model with regularization strength alpha set to 1.0 and L1/L2 mixing ratio set to 0.5.

```python
from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
from sklearn.linear_model import ElasticNet
from sklearn.metrics import mean_squared_error

# Same synthetic dataset and 80/20 split as in the previous examples.
X, y = make_regression(n_samples=100, n_features=5, noise=0.1, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Fit Elastic Net with alpha=1.0 and an equal mix of L1 and L2 penalties.
model = ElasticNet(alpha=1.0, l1_ratio=0.5)
model.fit(X_train, y_train)

# Evaluate on the held-out test set.
y_pred = model.predict(X_test)
mse = mean_squared_error(y_test, y_pred)

print("MSE:", mse)
print("Coefficients:", model.coef_)
```

Output:

```
MSE: 2662.329268376173
Coefficients: [41.2685658  60.6166494  34.45391474 37.4873701  26.29561474]
```
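
The MSE here is far larger than in the Lasso and Ridge runs, and the coefficients are noticeably smaller. A likely explanation is that `alpha=1.0` is a strong penalty for this nearly noise-free dataset, so the model underfits. The sketch below refits the same model over a small grid of `alpha` values to show how the test error responds; the grid itself is a hypothetical choice for illustration.

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import ElasticNet
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

X, y = make_regression(n_samples=100, n_features=5, noise=0.1, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Hypothetical grid of regularization strengths, chosen only for illustration.
results = {}
for alpha in (0.01, 0.1, 1.0):
    model = ElasticNet(alpha=alpha, l1_ratio=0.5)
    model.fit(X_train, y_train)
    results[alpha] = mean_squared_error(y_test, model.predict(X_test))
    print(f"alpha={alpha}: MSE={results[alpha]:.4f}")
```

On data like this, smaller `alpha` values should give a lower test MSE. In practice `alpha` and `l1_ratio` are usually tuned with cross-validation rather than fixed by hand.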
