# Regularization in ML
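- Regularization is a technique used in Machine Learning to reduce overfitting and improve how well a model generalizes to unseen data.
- It adds a penalty term to the loss function that the model minimizes, not to the prediction formula. For linear regression, the model stays $y = B_0 + B_1X_1 + \epsilon$ ($\epsilon$ here is the error term), while the training objective becomes $\text{loss} = \text{MSE} + \lambda \cdot \text{penalty}$, where the penalty is a function of the coefficients and $\lambda$ is the penalty level.
- There is no single best value for the regularization level. It is a hyperparameter (`alpha` in scikit-learn's `Lasso`, `Ridge`, and `ElasticNet`), so you control its intensity through hyperparameter tuning to get the best outcome.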
- Types of Regularization:
    - L1 Regularization (Lasso)
        - **Technique** It adds a penalty proportional to the sum of the absolute values of the coefficients: $\lambda \sum_j |\beta_j|$
        - **Effect** It encourages sparsity in the model, meaning it can shrink some coefficients to exactly zero. This makes Lasso useful for feature selection: features whose coefficients reach zero drop out of the model, which also helps reduce overfitting.
    - L2 Regularization (Ridge)
        - **Technique** It adds a penalty proportional to the sum of the squared coefficients: $\lambda \sum_j \beta_j^2$
        - **Effect** It prevents the coefficients from growing too large, thereby reducing the variance of the model. Unlike L1, L2 regularization does not set coefficients to exactly zero; it shrinks them toward zero.
    - ElasticNet
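        - **Technique** It combines L1 and L2 regularization by adding a weighted mix of both penalties to the loss function (in scikit-learn's `ElasticNet`, the mix is set by `l1_ratio`)
        - **Effect** Because you can control the level of both L1 and L2, it is recommended when you have highly correlated features: Lasso alone tends to keep one feature from a correlated group and drop the others, while the L2 part tends to spread the weight across the group.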
- Applying Regularization:
    - To apply it for Linear Regression, switch from `LinearRegression()` to `Lasso()`, `Ridge()`, or `ElasticNet()`.
    - In other algorithms, regularization is set through hyperparameters. For example, `LogisticRegression` has a `penalty` parameter that accepts `'l1'`, `'l2'`, `'elasticnet'`, or `None` (default `'l2'`). Not every solver supports every penalty.
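
The differences between the regularization types can be seen on a small example. The sketch below is not part of the original notebook: it assumes synthetic data from `make_regression` (10 features, only 3 informative) and arbitrary strengths (`alpha=5.0`, `l1_ratio=0.5`), then counts how many coefficients each model sets to exactly zero. The last part shows one possible way to tune the strength, using `GridSearchCV` over a hypothetical `alpha_grid`.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression, Lasso, Ridge, ElasticNet
from sklearn.model_selection import GridSearchCV

# Synthetic data (an assumption for illustration): 10 features, only 3 carry signal
X, y = make_regression(n_samples=200, n_features=10, n_informative=3,
                       noise=10.0, random_state=0)

# Same data, four estimators; alpha and l1_ratio values are arbitrary choices
models = {
    "LinearRegression": LinearRegression(),
    "Lasso": Lasso(alpha=5.0),
    "Ridge": Ridge(alpha=5.0),
    "ElasticNet": ElasticNet(alpha=5.0, l1_ratio=0.5),
}

zero_counts = {}
for name, model in models.items():
    model.fit(X, y)
    # Count coefficients that were shrunk to exactly zero
    zero_counts[name] = int(np.sum(model.coef_ == 0))
    print(f"{name:16s} zero coefficients: {zero_counts[name]}")

# Tuning the regularization strength with 5-fold cross-validation
alpha_grid = [0.01, 0.1, 1.0, 10.0, 100.0]  # hypothetical search grid
search = GridSearchCV(Ridge(), param_grid={"alpha": alpha_grid}, cv=5)
search.fit(X, y)
best_alpha = search.best_params_["alpha"]
print("best Ridge alpha:", best_alpha)
```

On data like this, Lasso should zero out the uninformative features while Ridge keeps every coefficient non-zero. The exact counts and the selected `alpha` depend on the data and the grid.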

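- Coefficient minimization formula (L1 penalty): $\beta = \beta - \lambda \cdot \text{sign}(\beta)$
- Where: $\beta$ is the coefficient, $\lambda$ is the regularization intensity/parameter, and $\text{sign}(\beta)$ is the sign of the coefficient (+1 if positive, -1 if negative). This is the part of the coefficient update that comes from the L1 penalty: each step moves the coefficient toward zero by a fixed amount $\lambda$, regardless of how large the coefficient is.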
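
To make the formula concrete, here is a minimal NumPy sketch. The function names `l1_shrink_step` and `soft_threshold` are hypothetical, and the step covers only the penalty part of an update (a full training step would also include the gradient of the data loss and a learning rate). The second function is the soft-thresholding variant, which clamps a coefficient at zero instead of letting it overshoot; this is one way exact zeros arise under an L1 penalty.

```python
import numpy as np

def l1_shrink_step(beta, lam):
    """Penalty-only step from the formula: beta - lam * sign(beta)."""
    beta = np.asarray(beta, dtype=float)
    return beta - lam * np.sign(beta)

def soft_threshold(beta, lam):
    """Shrink toward zero by lam, but stop at zero instead of crossing it."""
    beta = np.asarray(beta, dtype=float)
    return np.sign(beta) * np.maximum(np.abs(beta) - lam, 0.0)

beta = np.array([2.0, -1.5, 0.0])
stepped = l1_shrink_step(beta, lam=0.1)
# Positive coefficients decrease, negative ones increase, zero stays at zero
print(stepped)

small = np.array([0.05, -2.0])
clamped = soft_threshold(small, lam=0.1)
# 0.05 is smaller than lam, so it is set to exactly zero; -2.0 moves to -1.9
print(clamped)
```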

![reg](https://www.googleapis.com/download/storage/v1/b/kaggle-forum-message-attachments/o/inbox%2F17277811%2F75f6d401b8efcc9329cde3ffe0bf6d71%2Fridge2.png?generation=1723038136194204&alt=media)

# Automating Regression Model Fitting with Hyperparameter Tuning and Evaluation

In [1]:
import pandas as pd

from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression, Lasso, Ridge, ElasticNet

#model evaluation
from sklearn.metrics import mean_squared_error, r2_score

In [2]:
path = '/Users/bassel_instructor/Documents/Datasets/'

df = pd.read_csv(path+'insurance.csv')
df.head()

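| | age | sex | bmi | children | smoker | region | expenses |
|---|---|---|---|---|---|---|---|
| 0 | 19 | female | 27.9 | 0 | yes | southwest | 16884.92 |
| 1 | 18 | male | 33.8 | 1 | no | southeast | 1725.55 |
| 2 | 28 | male | 33.0 | 3 | no | southeast | 4449.46 |
| 3 | 33 | male | 22.7 | 0 | no | northwest | 21984.47 |
| 4 | 32 | male | 28.9 | 0 | no | northwest | 3866.86 |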


`expenses` is the target: the amount of medical expenses.

In [3]:
df.isna().sum()

age         0
sex         0
bmi         0
children    0
smoker      0
region      0
expenses    0
dtype: int64

In [4]:
for col in ['sex', 'smoker', 'region']:
    print(col, ':', df[col].unique())

sex : ['female' 'male']
smoker : ['yes' 'no']
region : ['southwest' 'southeast' 'northwest' 'northeast']


- `region` is nominal (not ordinal), so we use One Hot Encoding (`get_dummies()`)
- `sex` and `smoker` are binary, so we can use either One Hot Encoding or label encoding (`map()` or `factorize()`)

In [5]:
df_org = df.copy()

In [6]:
df = pd.get_dummies(data=df, columns=['region'], dtype=int)
df.head()

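| | age | sex | bmi | children | smoker | expenses | region_northeast | region_northwest | region_southeast | region_southwest |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 19 | female | 27.9 | 0 | yes | 16884.92 | 0 | 0 | 0 | 1 |
| 1 | 18 | male | 33.8 | 1 | no | 1725.55 | 0 | 0 | 1 | 0 |
| 2 | 28 | male | 33.0 | 3 | no | 4449.46 | 0 | 0 | 1 | 0 |
| 3 | 33 | male | 22.7 | 0 | no | 21984.47 | 0 | 1 | 0 | 0 |
| 4 | 32 | male | 28.9 | 0 | no | 3866.86 | 0 | 1 | 0 | 0 |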


In [7]:
df['sex'], sex_mapping = pd.factorize(df['sex'])
df['smoker'], smoker_mapping = pd.factorize(df['smoker'])
df.head()

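| | age | sex | bmi | children | smoker | expenses | region_northeast | region_northwest | region_southeast | region_southwest |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 19 | 0 | 27.9 | 0 | 0 | 16884.92 | 0 | 0 | 0 | 1 |
| 1 | 18 | 1 | 33.8 | 1 | 1 | 1725.55 | 0 | 0 | 1 | 0 |
| 2 | 28 | 1 | 33.0 | 3 | 1 | 4449.46 | 0 | 0 | 1 | 0 |
| 3 | 33 | 1 | 22.7 | 0 | 1 | 21984.47 | 0 | 1 | 0 | 0 |
| 4 | 32 | 1 | 28.9 | 0 | 1 | 3866.86 | 0 | 1 | 0 | 0 |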
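
Note that `pd.factorize()` numbers categories in order of first appearance, which is why `smoker` ended up as `yes` = 0 and `no` = 1 in the output above (the first row is a smoker). The sketch below, on a small hypothetical `toy` frame rather than the real dataset, shows how the returned uniques can be turned into a lookup, and how `map()` could be used instead when you want to choose the codes yourself. The column names `smoker_flag` and `sex_flag` are made up for illustration.

```python
import pandas as pd

# Toy frame (an assumption for illustration), mirroring the first rows of the data
toy = pd.DataFrame({"sex": ["female", "male", "male"],
                    "smoker": ["yes", "no", "no"]})

# factorize returns integer codes plus the unique values in order of appearance
codes, uniques = pd.factorize(toy["smoker"])
smoker_lookup = dict(enumerate(uniques))  # code -> original label
print(codes, smoker_lookup)

# Alternative: map() with an explicit dictionary, so the codes are chosen by you
toy["smoker_flag"] = toy["smoker"].map({"no": 0, "yes": 1})
toy["sex_flag"] = toy["sex"].map({"female": 0, "male": 1})
print(toy)
```

With an explicit mapping, a positive coefficient on `smoker_flag` reads directly as "smoking increases expenses", which can make the fitted model easier to interpret.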
