## <span style="color:Yellowgreen">**L1 (Lasso) and L2 (Ridge) Regularization**</span>

### **L1 Regularization (Lasso)**
- Definition: Adds a penalty term proportional to the sum of the absolute values of the coefficients (∑|wᵢ|).
- Key Effects:
  1. Encourages sparse solutions by shrinking some coefficients to exactly zero, effectively performing feature selection.
  2. Useful when you suspect only a few features are significant.
- Applications: When interpretability and feature selection are important.

### **L2 Regularization (Ridge)**
- Definition: Adds a penalty term proportional to the sum of the squared coefficients (∑ wᵢ²).
- Key Effects:
  1. Shrinks coefficients toward zero but does not set them exactly to zero.
  2. Useful when features are correlated (multicollinearity): it does not remove the correlation, but it keeps the coefficient estimates stable.
  3. Stabilizes solutions by reducing variance, at the cost of some added bias.
- Applications: When all features are expected to contribute, even weakly.
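
The difference between the two penalties is easiest to see on a single coefficient. The sketch below assumes an orthonormal design (XᵀX = I), where each penalized coefficient has a simple closed form in terms of the ordinary least-squares coefficient. The function names, the coefficients, and the penalty strength `lam` are made up for illustration and are not part of the notebook.

```python
# Closed-form effect of each penalty on one coefficient, assuming an
# orthonormal design (X^T X = I). All names and numbers are illustrative.

def lasso_shrink(w_ols, lam):
    """Soft-thresholding: minimizer of 0.5*(w - w_ols)**2 + lam*abs(w)."""
    if w_ols > lam:
        return w_ols - lam
    if w_ols < -lam:
        return w_ols + lam
    return 0.0  # coefficients smaller than lam in magnitude become exactly zero


def ridge_shrink(w_ols, lam):
    """Minimizer of 0.5*(w - w_ols)**2 + 0.5*lam*w**2."""
    return w_ols / (1.0 + lam)  # scaled toward zero, never exactly zero


ols_coefs = [3.0, -0.4, 0.2, -2.0]  # made-up least-squares coefficients
lam = 0.5

lasso_coefs = [lasso_shrink(w, lam) for w in ols_coefs]
ridge_coefs = [ridge_shrink(w, lam) for w in ols_coefs]

print("Lasso:", lasso_coefs)  # [2.5, 0.0, 0.0, -1.5]
print("Ridge:", ridge_coefs)  # every entry shrunk by 1/(1 + lam), none zero
```

The two small coefficients are removed by the L1 rule, while the L2 rule only scales every coefficient by the same factor.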


In [1]:
import pandas as pd
import numpy as np

In [2]:
# Suppress Warnings for clean notebook
import warnings
warnings.filterwarnings('ignore')

In [3]:
df=pd.read_csv(r"F:\Machine Learning all Algorithms\15 Regularization\Melbourne_housing_FULL.csv")
df

Unnamed: 0,Suburb,Address,Rooms,Type,Price,Method,SellerG,Date,Distance,Postcode,...,Bathroom,Car,Landsize,BuildingArea,YearBuilt,CouncilArea,Lattitude,Longtitude,Regionname,Propertycount
0,Abbotsford,68 Studley St,2,h,,SS,Jellis,3/09/2016,2.5,3067.0,...,1.0,1.0,126.0,,,Yarra City Council,-37.80140,144.99580,Northern Metropolitan,4019.0
1,Abbotsford,85 Turner St,2,h,1480000.0,S,Biggin,3/12/2016,2.5,3067.0,...,1.0,1.0,202.0,,,Yarra City Council,-37.79960,144.99840,Northern Metropolitan,4019.0
2,Abbotsford,25 Bloomburg St,2,h,1035000.0,S,Biggin,4/02/2016,2.5,3067.0,...,1.0,0.0,156.0,79.0,1900.0,Yarra City Council,-37.80790,144.99340,Northern Metropolitan,4019.0
3,Abbotsford,18/659 Victoria St,3,u,,VB,Rounds,4/02/2016,2.5,3067.0,...,2.0,1.0,0.0,,,Yarra City Council,-37.81140,145.01160,Northern Metropolitan,4019.0
4,Abbotsford,5 Charles St,3,h,1465000.0,SP,Biggin,4/03/2017,2.5,3067.0,...,2.0,0.0,134.0,150.0,1900.0,Yarra City Council,-37.80930,144.99440,Northern Metropolitan,4019.0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
34852,Yarraville,13 Burns St,4,h,1480000.0,PI,Jas,24/02/2018,6.3,3013.0,...,1.0,3.0,593.0,,,Maribyrnong City Council,-37.81053,144.88467,Western Metropolitan,6543.0
34853,Yarraville,29A Murray St,2,h,888000.0,SP,Sweeney,24/02/2018,6.3,3013.0,...,2.0,1.0,98.0,104.0,2018.0,Maribyrnong City Council,-37.81551,144.88826,Western Metropolitan,6543.0
34854,Yarraville,147A Severn St,2,t,705000.0,S,Jas,24/02/2018,6.3,3013.0,...,1.0,2.0,220.0,120.0,2000.0,Maribyrnong City Council,-37.82286,144.87856,Western Metropolitan,6543.0
34855,Yarraville,12/37 Stephen St,3,h,1140000.0,SP,hockingstuart,24/02/2018,6.3,3013.0,...,,,,,,Maribyrnong City Council,,,Western Metropolitan,6543.0


In [4]:
df.nunique()

Suburb             351
Address          34009
Rooms               12
Type                 3
Price             2871
Method               9
SellerG            388
Date                78
Distance           215
Postcode           211
Bedroom2            15
Bathroom            11
Car                 15
Landsize          1684
BuildingArea       740
YearBuilt          160
CouncilArea         33
Lattitude        13402
Longtitude       14524
Regionname           8
Propertycount      342
dtype: int64

In [5]:
cols_to_use=['Suburb', 'Rooms', 'Type', 'Price', 'Method', 'SellerG', 'Regionname', 'Propertycount', 
               'Distance', 'CouncilArea', 'Bedroom2', 'Bathroom', 'Car', 'Landsize', 'BuildingArea' ]

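# .copy() gives an independent frame, so later column assignments do not act on a slice of the original
df=df[cols_to_use].copy()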

In [6]:
df.head()

Unnamed: 0,Suburb,Rooms,Type,Price,Method,SellerG,Regionname,Propertycount,Distance,CouncilArea,Bedroom2,Bathroom,Car,Landsize,BuildingArea
0,Abbotsford,2,h,,SS,Jellis,Northern Metropolitan,4019.0,2.5,Yarra City Council,2.0,1.0,1.0,126.0,
1,Abbotsford,2,h,1480000.0,S,Biggin,Northern Metropolitan,4019.0,2.5,Yarra City Council,2.0,1.0,1.0,202.0,
2,Abbotsford,2,h,1035000.0,S,Biggin,Northern Metropolitan,4019.0,2.5,Yarra City Council,2.0,1.0,0.0,156.0,79.0
3,Abbotsford,3,u,,VB,Rounds,Northern Metropolitan,4019.0,2.5,Yarra City Council,3.0,2.0,1.0,0.0,
4,Abbotsford,3,h,1465000.0,SP,Biggin,Northern Metropolitan,4019.0,2.5,Yarra City Council,3.0,2.0,0.0,134.0,150.0


In [7]:
df.shape

(34857, 15)

### **Checking for NaN values**

In [8]:
df.isna().sum()

Suburb               0
Rooms                0
Type                 0
Price             7610
Method               0
SellerG              0
Regionname           3
Propertycount        3
Distance             1
CouncilArea          3
Bedroom2          8217
Bathroom          8226
Car               8728
Landsize         11810
BuildingArea     21115
dtype: int64

### **Handling Missing values**

In [9]:
# Missing values in some features can be filled with zero, which acts as a separate "not available" class
# e.g. 0 for Propertycount or Bedroom2 marks a missing value
# e.g. 0 for Car means the house has no car parking

cols_to_fillna=['Propertycount', 'Bedroom2', 'Bathroom', 'Car', 'Distance']
df[cols_to_fillna]=df[cols_to_fillna].fillna(0)

# Other continuous features are imputed with their mean to keep things quick, since our focus is on
# reducing overfitting with Lasso and Ridge regression
df['Landsize']=df['Landsize'].fillna(df['Landsize'].mean())
df['BuildingArea']=df['BuildingArea'].fillna(df['BuildingArea'].mean())

df.isna().sum()

Suburb              0
Rooms               0
Type                0
Price            7610
Method              0
SellerG             0
Regionname          3
Propertycount       0
Distance            0
CouncilArea         3
Bedroom2            0
Bathroom            0
Car                 0
Landsize            0
BuildingArea        0
dtype: int64
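
The two fill strategies used above can be checked on a tiny frame. This is a minimal sketch with made-up rows; only the column names `Car` and `Landsize` are borrowed from the dataset.

```python
import numpy as np
import pandas as pd

# Made-up rows; only the column names come from the notebook.
toy = pd.DataFrame({
    "Car": [1.0, np.nan, 2.0],
    "Landsize": [100.0, np.nan, 300.0],
})

# Strategy 1: treat a missing count as zero
toy["Car"] = toy["Car"].fillna(0)

# Strategy 2: replace a missing continuous value with the column mean.
# mean() skips NaN by default, so the mean here is (100 + 300) / 2 = 200.
toy["Landsize"] = toy["Landsize"].fillna(toy["Landsize"].mean())

print(toy)
```

Note that mean imputation pulls every filled row to the same value, which is acceptable for a quick demonstration but may hide real variation in the data.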

**Drop rows with a missing Price: it is our target variable, so we won't impute it. The same call also drops the few rows still missing Regionname or CouncilArea.**

In [10]:
df.dropna(inplace=True)

In [11]:
df.shape

(27244, 15)

### **One hot encoding the categorical features**

In [12]:
dumy=pd.get_dummies(df, dtype='int', drop_first=True)
dumy

Unnamed: 0,Rooms,Price,Propertycount,Distance,Bedroom2,Bathroom,Car,Landsize,BuildingArea,Suburb_Aberfeldie,...,CouncilArea_Moorabool Shire Council,CouncilArea_Moreland City Council,CouncilArea_Nillumbik Shire Council,CouncilArea_Port Phillip City Council,CouncilArea_Stonnington City Council,CouncilArea_Whitehorse City Council,CouncilArea_Whittlesea City Council,CouncilArea_Wyndham City Council,CouncilArea_Yarra City Council,CouncilArea_Yarra Ranges Shire Council
1,2,1480000.0,4019.0,2.5,2.0,1.0,1.0,202.000000,160.2564,0,...,0,0,0,0,0,0,0,0,1,0
2,2,1035000.0,4019.0,2.5,2.0,1.0,0.0,156.000000,79.0000,0,...,0,0,0,0,0,0,0,0,1,0
4,3,1465000.0,4019.0,2.5,3.0,2.0,0.0,134.000000,150.0000,0,...,0,0,0,0,0,0,0,0,1,0
5,3,850000.0,4019.0,2.5,3.0,2.0,1.0,94.000000,160.2564,0,...,0,0,0,0,0,0,0,0,1,0
6,4,1600000.0,4019.0,2.5,3.0,1.0,2.0,120.000000,142.0000,0,...,0,0,0,0,0,0,0,0,1,0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
34852,4,1480000.0,6543.0,6.3,4.0,1.0,3.0,593.000000,160.2564,0,...,0,0,0,0,0,0,0,0,0,0
34853,2,888000.0,6543.0,6.3,2.0,2.0,1.0,98.000000,104.0000,0,...,0,0,0,0,0,0,0,0,0,0
34854,2,705000.0,6543.0,6.3,2.0,1.0,2.0,220.000000,120.0000,0,...,0,0,0,0,0,0,0,0,0,0
34855,3,1140000.0,6543.0,6.3,0.0,0.0,0.0,593.598993,160.2564,0,...,0,0,0,0,0,0,0,0,0,0
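
To see what `pd.get_dummies` with `drop_first=True` does, here is a minimal sketch on a made-up four-row frame (the values are illustrative; only the column names come from the dataset). Each categorical column becomes one 0/1 column per category, except the first category, which is dropped and acts as the baseline.

```python
import pandas as pd

# Made-up rows; 'Type' has three categories: h, t, u
toy = pd.DataFrame({"Rooms": [2, 3, 2, 4], "Type": ["h", "u", "t", "h"]})

encoded = pd.get_dummies(toy, dtype="int", drop_first=True)

# 'Type_h' is dropped as the baseline; a row with Type 'h' has zeros in both dummy columns
print(encoded.columns.tolist())  # ['Rooms', 'Type_t', 'Type_u']
print(encoded)
```

In the real dataset, columns such as Suburb (351 unique values) and SellerG (388 unique values) each expand into hundreds of dummy columns, which gives the linear model a lot of room to overfit.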


### **Let's separate our dataset into features (X) and target (y)**

In [13]:
X=dumy.drop('Price', axis=1)

y=dumy['Price']


### **Let's split the data into train and test sets, train our Linear Regression Model on the training set, and check its R² score on the test set**

In [None]:
from sklearn.model_selection import train_test_split
train_X, test_X, train_y, test_y = train_test_split(X, y, test_size=0.3, random_state=2)           # This particular split is used to demonstrate the effect of regularization

In [31]:
from sklearn.linear_model import LinearRegression
reg = LinearRegression()
reg.fit(train_X, train_y)

In [32]:
reg.score(test_X, test_y)

0.13853683161631236

In [33]:
reg.score(train_X, train_y)

0.6827792395792723
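
For a regressor, scikit-learn's `score` method returns the coefficient of determination R², not a classification-style accuracy. The sketch below computes R² by hand on made-up numbers; `r2_score_manual` is an illustrative helper, not a library function.

```python
def r2_score_manual(y_true, y_pred):
    """R^2 = 1 - (residual sum of squares) / (total sum of squares)."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


y_true = [1.0, 2.0, 3.0, 4.0]  # made-up targets

perfect = r2_score_manual(y_true, y_true)                # 1.0: predictions match exactly
baseline = r2_score_manual(y_true, [2.5] * 4)            # 0.0: always predicting the mean
worse = r2_score_manual(y_true, [4.0, 3.0, 2.0, 1.0])    # -3.0: worse than the mean

print(perfect, baseline, worse)
```

A score close to 0 therefore means the model is barely better than predicting the average price for every house.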

**Here the training R² score is 0.68 but the test R² score is only 0.14, which is very low.**
**Plain linear regression is clearly overfitting the data, so let's try regularized models.**

### Using Lasso (L1 Regularized) Regression Model

In [36]:
from sklearn.linear_model import Lasso
lasso_reg=Lasso(alpha=50, max_iter=100, tol=0.1)    # Higher values of alpha lead to stronger shrinkage of the coefficients toward zero.

In [37]:
lasso_reg.fit(train_X, train_y)

In [38]:
lasso_reg.score(test_X, test_y)

0.6636111369404488
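
The effect of `alpha` on sparsity can be checked by counting the non-zero coefficients of a fitted Lasso model. The sketch below uses synthetic data from `make_regression` rather than the housing data, so the sizes, the list of alphas, and the variable names are assumptions chosen for illustration.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso

# Synthetic data: 50 features, of which only 5 actually influence the target
X_demo, y_demo = make_regression(
    n_samples=200, n_features=50, n_informative=5, noise=10.0, random_state=0
)

alphas = [0.1, 1.0, 10.0, 100.0]
nonzero_counts = []
for alpha in alphas:
    model = Lasso(alpha=alpha, max_iter=10000)
    model.fit(X_demo, y_demo)
    # coef_ holds one weight per feature; count the ones Lasso kept
    nonzero_counts.append(int(np.sum(model.coef_ != 0)))

for alpha, count in zip(alphas, nonzero_counts):
    print(f"alpha={alpha:>6}: {count} non-zero coefficients out of {X_demo.shape[1]}")
```

The same check on the notebook's model would be `np.sum(lasso_reg.coef_ != 0)`, which shows how many of the dummy columns the model kept.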

### Using Ridge (L2 Regularized) Regression Model

In [40]:
from sklearn.linear_model import Ridge
ridge_reg=Ridge(alpha=50, max_iter=100, tol=0.1)
ridge_reg.fit(train_X, train_y)

In [42]:
ridge_reg.score(test_X, test_y)

0.6670848945194959

In [41]:
ridge_reg.score(train_X, train_y)

0.6622376739684328

**We can observe that Lasso and Ridge regression help when a plain Linear Regression model overfits: the test R² score rose from about 0.14 to about 0.66 (Lasso) and 0.67 (Ridge). These scores are not especially high, but the improvement over the unregularized model is substantial.
L1 and L2 regularization are also used in neural networks.**
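
The value `alpha=50` above was fixed by hand. One common alternative is to let cross-validation choose it from a list of candidates. The sketch below does that with scikit-learn's `RidgeCV` and `LassoCV` on synthetic data; the candidate list, the data sizes, and the variable names are assumptions for illustration, not values tuned for the housing data.

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import LassoCV, RidgeCV
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the housing features
X_demo, y_demo = make_regression(
    n_samples=300, n_features=40, n_informative=8, noise=15.0, random_state=0
)
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.3, random_state=2
)

candidate_alphas = [0.01, 0.1, 1.0, 10.0, 50.0]  # illustrative grid

# Each estimator picks the alpha with the best cross-validated score on the training set
ridge_cv = RidgeCV(alphas=candidate_alphas).fit(X_tr, y_tr)
lasso_cv = LassoCV(alphas=candidate_alphas, cv=5, max_iter=10000).fit(X_tr, y_tr)

print("Ridge: chosen alpha =", ridge_cv.alpha_, "| test R^2 =", ridge_cv.score(X_te, y_te))
print("Lasso: chosen alpha =", lasso_cv.alpha_, "| test R^2 =", lasso_cv.score(X_te, y_te))
```

Because the test set is not used to pick `alpha`, the reported test score remains an honest estimate of performance on unseen data.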