# Ridge, Lasso, and Elastic Net Regression Tutorial 

This chapter explains Ridge, Lasso, and Elastic Net regression. All three are linear regression models with an added penalty on the size of the coefficients, which keeps the model simple and reduces overfitting. You’ll learn how to use gradient descent with simple update equations to find the coefficients for these models on a small dataset. After reading, you’ll know:

- How Ridge, Lasso, and Elastic Net improve on linear regression.
- How to update coefficients using simple math.

Let’s get started!

### ✅1. Tutorial Data Set

We use the same dataset as the linear regression tutorial, with input x and target y:

| x | y |
|---|---|
| 1 | 1 |
| 2 | 3 |
| 4 | 3 |
| 3 | 2 |
| 5 | 5 |


---

### ✅2. Ridge Regression

Ridge Regression builds a model of the form y = B0 + B1 · x, but adds a penalty that keeps B1 from getting too big. This helps the model work better on new data.
The penalty shrinks B1 a little at each step, in proportion to its current size.

We use gradient descent to update B0 and B1. For each data point:

1. Predict y with the current B0 and B1.
2. Calculate the error: error = predicted y − real y.
3. Update the coefficients using these equations:

        B0 = B0 − α · error
        B1 = B1 − α · error · x − α · λ · B1

Here, α is the step size (e.g., 0.01), λ is the penalty strength (e.g., 0.1), and the term α · λ · B1 shrinks B1.
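
The update rule above can be sketched in Python as a single-step function. The name `ridge_step` and its argument names are illustrative assumptions rather than part of the tutorial; the defaults mirror the example values of α and λ.

```python
def ridge_step(b0, b1, x, y, alpha=0.01, lam=0.1):
    """Apply one Ridge gradient-descent update for a single data point."""
    error = (b0 + b1 * x) - y                            # predicted y minus real y
    new_b0 = b0 - alpha * error                          # the intercept is not penalized
    new_b1 = b1 - alpha * error * x - alpha * lam * b1   # last term shrinks B1
    return new_b0, new_b1


# One update starting from B0 = 0, B1 = 0 on the point (x = 1, y = 1)
b0, b1 = ridge_step(0.0, 0.0, x=1, y=1)
print(b0, b1)
```

Returning the new values instead of mutating shared state keeps each step easy to check by hand.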


### ✅2.1 Ridge Regression Example
Start with B0 = 0, B1 = 0, α = 0.01, λ = 0.1. For the first data point (x = 1, y = 1):

- Predict: y = 0 + 0 · 1 = 0
- Error: error = 0 − 1 = −1
- Update B0: B0 = 0 − 0.01 · (−1) = 0.01
- Update B1: B1 = 0 − 0.01 · (−1) · 1 − 0.01 · 0.1 · 0 = 0.01

We repeat this for all 5 data points (one epoch), then run 3 more epochs, for 4 epochs and 20 iterations in total. After 20 iterations, we might get B0 ≈ 0.23, B1 ≈ 0.78.
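
The whole procedure can be sketched as a short loop. This assumes the points are visited in the order they appear in the dataset table; `train_ridge` is a hypothetical helper name, and the printed coefficients depend on that visiting order, so treat them as illustrative.

```python
# (x, y) pairs from the tutorial dataset, in table order
DATA = [(1, 1), (2, 3), (4, 3), (3, 2), (5, 5)]


def train_ridge(data, alpha=0.01, lam=0.1, epochs=4):
    """Run per-point Ridge updates for a number of passes over the data."""
    b0, b1 = 0.0, 0.0
    for _ in range(epochs):
        for x, y in data:
            error = (b0 + b1 * x) - y
            # Tuple assignment evaluates the right-hand side first, so both
            # updates use the old coefficients, as in the worked example.
            b0, b1 = b0 - alpha * error, b1 - alpha * error * x - alpha * lam * b1
    return b0, b1


b0, b1 = train_ridge(DATA)  # 4 epochs x 5 points = 20 iterations
print(round(b0, 2), round(b1, 2))
```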

---
### ✅3 Lasso Regression
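Lasso Regression also uses y = B0 + B1 · x, but its penalty can make B1 exactly zero, which is useful if some inputs don’t matter. The penalty pushes B1 toward zero with a fixed nudge. The update equations are:

    B0 = B0 − α · error
    B1 = B1 − α · error · x − α · λ

The α · λ term is a constant push that makes B1 smaller. This simplified form assumes B1 is not negative, which holds on this dataset; in general the push is α · λ · sign(B1), so that it always points toward zero.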
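
A minimal sketch of the Lasso update as written above. `lasso_step` is an illustrative name, not part of the tutorial. The code follows the tutorial's simplified rule with a constant α · λ push; a more general version would multiply that term by the sign of B1.

```python
def lasso_step(b0, b1, x, y, alpha=0.01, lam=0.1):
    """Apply one simplified Lasso update for a single data point."""
    error = (b0 + b1 * x) - y        # predicted y minus real y
    new_b0 = b0 - alpha * error      # the intercept is not penalized
    # Constant push of size alpha * lam, independent of how large B1 is.
    # Simplification: assumes B1 >= 0 (general form uses alpha * lam * sign(b1)).
    new_b1 = b1 - alpha * error * x - alpha * lam
    return new_b0, new_b1


# One update starting from B0 = 0, B1 = 0 on the point (x = 1, y = 1)
b0, b1 = lasso_step(0.0, 0.0, x=1, y=1)
print(b0, b1)
```

Compared with the Ridge rule, the only change is the final term: a fixed amount is subtracted instead of a fraction of B1.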

### ✅3.1 Lasso Regression Example
Using B0 = 0, B1 = 0, α = 0.01, λ = 0.1, for the first data point (x = 1, y = 1):

- Predict: y = 0 + 0 · 1 = 0
- Error: error = 0 − 1 = −1
- Update B0: B0 = 0 − 0.01 · (−1) = 0.01
- Update B1: B1 = 0 − 0.01 · (−1) · 1 − 0.01 · 0.1 = 0.01 − 0.001 = 0.009

After 20 iterations, we might get B0 ≈ 0.23, B1 ≈ 0.77, slightly smaller than Ridge.

---

### ✅4 Elastic Net Regression
Elastic Net combines the Ridge and Lasso penalties. It shrinks B1 in proportion to its size like Ridge and pushes it toward zero by a fixed amount like Lasso. The update equations are:

    B0 = B0 − α · error
    B1 = B1 − α · error · x − α · λ1 − α · λ2 · B1

Here, λ1 (e.g., 0.05) is the Lasso penalty strength, and λ2 (e.g., 0.05) is the Ridge penalty strength.
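
The combined update can be sketched the same way. `elastic_net_step`, `lam1`, and `lam2` are illustrative names standing in for λ1 and λ2; as with the Lasso rule in this tutorial, the constant push assumes B1 is not negative.

```python
def elastic_net_step(b0, b1, x, y, alpha=0.01, lam1=0.05, lam2=0.05):
    """Apply one simplified Elastic Net update for a single data point."""
    error = (b0 + b1 * x) - y                  # predicted y minus real y
    new_b0 = b0 - alpha * error                # the intercept is not penalized
    new_b1 = (b1
              - alpha * error * x              # data-fit term
              - alpha * lam1                   # Lasso part: constant push
              - alpha * lam2 * b1)             # Ridge part: proportional shrink
    return new_b0, new_b1


# One update starting from B0 = 0, B1 = 0 on the point (x = 1, y = 1)
b0, b1 = elastic_net_step(0.0, 0.0, x=1, y=1)
print(b0, b1)
```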

                                
### ✅4.1 Elastic Net Example
Set B0 = 0, B1 = 0, α = 0.01, λ1 = 0.05, λ2 = 0.05. For the first data point (x = 1, y = 1):

- Predict: y = 0 + 0 · 1 = 0
- Error: error = 0 − 1 = −1
- Update B0: B0 = 0 − 0.01 · (−1) = 0.01
- Update B1: B1 = 0 − 0.01 · (−1) · 1 − 0.01 · 0.05 − 0.01 · 0.05 · 0 = 0.01 − 0.0005 = 0.0095

After 20 iterations, we might get B0 ≈ 0.23, B1 ≈ 0.78.
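
Because the Elastic Net update contains both penalty terms, setting λ2 = 0 gives the Lasso update and setting λ1 = 0 gives the Ridge update. The sketch below relies on that to run all three models through one hypothetical `train` function, assuming the points are visited in table order for 4 epochs. The exact numbers it prints depend on those assumptions.

```python
# (x, y) pairs from the tutorial dataset, in table order
DATA = [(1, 1), (2, 3), (4, 3), (3, 2), (5, 5)]


def train(data, lam1, lam2, alpha=0.01, epochs=4):
    """Per-point gradient descent with a constant (lam1) and a proportional (lam2) penalty."""
    b0, b1 = 0.0, 0.0
    for _ in range(epochs):
        for x, y in data:
            error = (b0 + b1 * x) - y
            b0, b1 = (b0 - alpha * error,
                      b1 - alpha * error * x - alpha * lam1 - alpha * lam2 * b1)
    return b0, b1


ridge = train(DATA, lam1=0.0, lam2=0.1)    # Ridge: proportional shrink only
lasso = train(DATA, lam1=0.1, lam2=0.0)    # Lasso: constant push only
enet = train(DATA, lam1=0.05, lam2=0.05)   # Elastic Net: both terms

for name, (b0, b1) in [("ridge", ridge), ("lasso", lasso), ("elastic net", enet)]:
    print(f"{name}: B0 = {b0:.3f}, B1 = {b1:.3f}")
```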

---
### ✅5 Predictions
Using the Ridge coefficients (B0 = 0.23, B1 = 0.78) as an example, we predict:

| x | y | Prediction |
|---|---|------------|
| 1 | 1 | 1.01       |
| 2 | 3 | 1.79       |
| 4 | 3 | 3.35       |
| 3 | 2 | 2.57       |
| 5 | 5 | 4.13       |

The error (RMSE) is about 0.73, similar to the linear regression tutorial, which suggests a reasonable fit on this small training set.
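
The predictions and the RMSE can be recomputed with a few lines of Python. This is a minimal sketch that plugs in the rounded coefficients B0 = 0.23 and B1 = 0.78 from the text; the variable names are illustrative.

```python
import math

xs = [1, 2, 4, 3, 5]   # inputs from the tutorial dataset
ys = [1, 3, 3, 2, 5]   # targets from the tutorial dataset
b0, b1 = 0.23, 0.78    # rounded Ridge coefficients quoted in the text

# Prediction for each x using y = B0 + B1 * x
preds = [b0 + b1 * x for x in xs]

# Root mean squared error: square the errors, average them, take the square root
rmse = math.sqrt(sum((p - y) ** 2 for p, y in zip(preds, ys)) / len(ys))

for x, y, p in zip(xs, ys, preds):
    print(f"x = {x}, y = {y}, prediction = {p:.2f}")
print(f"RMSE = {rmse:.2f}")
```

RMSE is measured in the same units as y, so it can be read as a typical size of the prediction error.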


### ✅6 Summary
You learned how Ridge, Lasso, and Elastic Net Regression add penalties to linear regression to make better models. You saw how to update coefficients using simple equations. In the next chapter, you’ll explore other regression techniques.

---


### ✅Ridge Regression (L2 Regularization)

```python
# Ridge Regression (L2 regularization)

from pandas import read_csv
from sklearn.model_selection import KFold
from sklearn.model_selection import cross_val_score
from sklearn.linear_model import Ridge

# Load the dataset: the first 13 columns are the inputs, column 14 is the target
filename = 'boston.csv'
dataframe = read_csv(filename)
array = dataframe.values
X = array[:, 0:13]
Y = array[:, 13]

# Evaluate with 10-fold cross-validation
num_folds = 10
kfold = KFold(n_splits=num_folds)
model = Ridge(alpha=1)  # alpha sets the strength of the L2 penalty

# scikit-learn reports the negated MSE so that higher scores are better
scoring = 'neg_mean_squared_error'
results = cross_val_score(model, X, Y, cv=kfold, scoring=scoring)
print(results.mean())

# Predict the output for a specific input
model.fit(X, Y)
test = [[0.06860, 0.00, 2.890, 0, 0.4450, 7.4160, 62.50, 3.4952, 2, 276.0, 18.00, 396.90, 6.19]]
print(model.predict(test))
```

Output:

```
-34.07824620925927
[31.99969554]
```


### ✅LASSO Regression (L1 Regularization)


```python
# LASSO Regression (L1 regularization)

from pandas import read_csv
from sklearn.model_selection import KFold
from sklearn.model_selection import cross_val_score
from sklearn.linear_model import Lasso

# Load the dataset: the first 13 columns are the inputs, column 14 is the target
filename = 'boston.csv'
dataframe = read_csv(filename)
array = dataframe.values
X = array[:, 0:13]
Y = array[:, 13]

# Evaluate with 10-fold cross-validation
num_folds = 10
kfold = KFold(n_splits=num_folds)
model = Lasso(alpha=0.5)  # alpha sets the strength of the L1 penalty

# scikit-learn reports the negated MSE so that higher scores are better
scoring = 'neg_mean_squared_error'
results = cross_val_score(model, X, Y, cv=kfold, scoring=scoring)
print(results.mean())

# Predict the output for a specific input
model.fit(X, Y)
test = [[0.06860, 0.00, 2.890, 0, 0.4450, 7.4160, 62.50, 3.4952, 2, 276.0, 18.00, 396.90, 6.19]]
print(model.predict(test))
```

Output:

```
-32.98763988638431
[30.30931607]
```


### ✅ElasticNet Regression

```python
# ElasticNet Regression (combined L1 and L2 regularization)

from pandas import read_csv
from sklearn.model_selection import KFold
from sklearn.model_selection import cross_val_score
from sklearn.linear_model import ElasticNet

# Load the dataset: the first 13 columns are the inputs, column 14 is the target
filename = 'boston.csv'
dataframe = read_csv(filename)
array = dataframe.values
X = array[:, 0:13]
Y = array[:, 13]

# Evaluate with 10-fold cross-validation
num_folds = 10
kfold = KFold(n_splits=num_folds)
# alpha sets the overall penalty strength; l1_ratio sets the mix between
# the L1 and L2 penalties (0 = pure L2, 1 = pure L1)
model = ElasticNet(alpha=0.3, l1_ratio=0.2)

# scikit-learn reports the negated MSE so that higher scores are better
scoring = 'neg_mean_squared_error'
results = cross_val_score(model, X, Y, cv=kfold, scoring=scoring)
print(results.mean())

# Predict the output for a specific input
model.fit(X, Y)
test = [[0.06860, 0.00, 2.890, 0, 0.4450, 7.4160, 62.50, 3.4952, 2, 276.0, 18.00, 396.90, 6.19]]
print(model.predict(test))
```

Output:

```
-29.680630733514626
[30.15094065]
```
