# Elastic Net Regression 

## What is Elastic Net Regression?
Elastic Net is a regularized version of Linear Regression that tends to make better predictions when you have lots of features (columns), especially if some features are correlated with each other.

It combines the ideas of Lasso Regression (L1 penalty) and Ridge Regression (L2 penalty).

## Combined L1 and L2 Penalties
### L1 Penalty (from Lasso):
- Helps the model ignore unimportant features by making their coefficients exactly zero (feature selection)

### L2 Penalty (from Ridge):
- Helps the model avoid very large coefficients by shrinking them toward zero (but not exactly to zero)

**Elastic Net = L1 + L2**
- Combines both effects: it can drop unimportant features (L1) while keeping the remaining coefficients small (L2)
- Helps when you have many correlated features (when some columns are similar)
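
To make the difference between the two penalties concrete, here is a minimal sketch for a single coefficient. It assumes a simplified one-dimensional objective, $ 0.5 (w - z)^2 $ plus a penalty, where $ z $ is the unpenalized estimate. The helper names `l1_shrink` and `l2_shrink` and the sample values are illustrative assumptions, not part of any library.

```python
import numpy as np

def l1_shrink(z, lam):
    # Minimizer of 0.5*(w - z)**2 + lam*|w| (soft-thresholding):
    # estimates with |z| <= lam become exactly zero.
    return np.sign(z) * np.maximum(np.abs(z) - lam, 0.0)

def l2_shrink(z, lam):
    # Minimizer of 0.5*(w - z)**2 + 0.5*lam*w**2:
    # every estimate is scaled down, but none becomes exactly zero.
    return z / (1.0 + lam)

# Hypothetical unpenalized estimates for five coefficients
z = np.array([-2.0, -0.3, 0.1, 0.5, 3.0])
lam = 0.5

w_l1 = l1_shrink(z, lam)
w_l2 = l2_shrink(z, lam)

print("L1:", w_l1)
print("L2:", w_l2)
```

With these values, the L1 rule sets the three small estimates to exactly zero, while the L2 rule only scales every estimate down.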

## Elastic Net Cost Function
The formula looks like this:

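$$
\text{Cost} = \text{MSE} + \alpha \left[ \lambda_1 \sum_j |w_j| + \lambda_2 \sum_j w_j^2 \right]
$$

Where:
- MSE: Mean Squared Error (the usual regression loss)
- L1 term: sum of absolute values of the weights (|w|)
- L2 term: sum of squared weights (w²)
- α (alpha): Overall strength of regularization
- λ₁ and λ₂: How much weight to give the L1 and L2 terms. scikit-learn replaces them with a single parameter, l1_ratio, so that λ₁ = l1_ratio and λ₂ = 1 − l1_ratio. (scikit-learn's own objective also multiplies the squared-error and L2 terms by 1/2, so its numbers differ slightly from this simplified formula.)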
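
The formula can be sketched as a small NumPy function. This is a hedged illustration, taking λ₁ = l1_ratio and λ₂ = 1 − l1_ratio as the worked example later in this document does. The function names and the toy data are assumptions made for this sketch. The second function follows the scaling used in scikit-learn's documented ElasticNet objective, which halves the squared-error and L2 terms.

```python
import numpy as np

def elastic_net_cost(X, y, w, alpha, l1_ratio):
    # Simplified cost from the text:
    # MSE + alpha * [l1_ratio * sum|w| + (1 - l1_ratio) * sum(w^2)]
    residuals = y - X @ w
    mse = np.mean(residuals ** 2)
    l1 = np.sum(np.abs(w))
    l2 = np.sum(w ** 2)
    return mse + alpha * (l1_ratio * l1 + (1 - l1_ratio) * l2)

def sklearn_style_objective(X, y, w, alpha, l1_ratio):
    # Scaling used by scikit-learn's ElasticNet (no intercept here):
    # 1/(2n) * ||y - Xw||^2 + alpha*l1_ratio*||w||_1
    #   + 0.5*alpha*(1 - l1_ratio)*||w||_2^2
    n = X.shape[0]
    residuals = y - X @ w
    return (np.sum(residuals ** 2) / (2 * n)
            + alpha * l1_ratio * np.sum(np.abs(w))
            + 0.5 * alpha * (1 - l1_ratio) * np.sum(w ** 2))

# Toy data (hypothetical): two features, weights that fit y exactly
X = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
y = np.array([1.0, 2.0, 3.0])
w = np.array([1.0, 2.0])

print(elastic_net_cost(X, y, w, alpha=0.5, l1_ratio=0.5))
print(sklearn_style_objective(X, y, w, alpha=0.5, l1_ratio=0.5))
```

Because the weights fit the toy data exactly, both printed values consist only of the penalty terms, which makes the effect of the different scaling easy to see.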

## Hyperparameter Tuning
Hyperparameters are the settings that you choose before training your model. For Elastic Net, the main ones are:

### alpha
- How much you want to regularize (penalize large coefficients)
- Higher alpha = more regularization

### l1_ratio
- How much weight to put on L1 (Lasso) versus L2 (Ridge)
- l1_ratio = 0 → pure Ridge
- l1_ratio = 1 → pure Lasso
- Between 0 and 1 → Elastic Net
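
As a rough illustration of how l1_ratio changes the fitted model, the sketch below fits scikit-learn's `ElasticNet` with a few l1_ratio values and counts the nonzero coefficients. The synthetic data (two informative features, four pure-noise features), the seed, and `alpha=0.1` are assumptions chosen for this example, so the exact counts will differ on other data.

```python
import numpy as np
from sklearn.linear_model import ElasticNet

# Synthetic data (an assumption for illustration):
# features 0 and 1 drive y, features 2..5 are pure noise.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 6))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=0.1, size=200)

nonzero_counts = {}
for l1_ratio in [0.1, 0.5, 1.0]:
    model = ElasticNet(alpha=0.1, l1_ratio=l1_ratio)
    model.fit(X, y)
    # Count coefficients that were not set exactly to zero
    nonzero_counts[l1_ratio] = int(np.count_nonzero(model.coef_))
    print(f"l1_ratio={l1_ratio}: {nonzero_counts[l1_ratio]} nonzero coefficients")
```

With l1_ratio = 1 (pure Lasso) the noise features are dropped on this data. Smaller l1_ratio values lean toward Ridge-style shrinkage and may keep more coefficients nonzero.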

## How to Choose the Best Values?
1. Try different combinations of alpha and l1_ratio
2. Use cross-validation to test which settings make your model perform best (not overfitting, not underfitting)
3. In Python, you can do this easily with scikit-learn's ElasticNetCV
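
The steps above can be sketched with `ElasticNetCV`, which cross-validates over a grid of alpha and l1_ratio values. The synthetic data, the candidate grids, and `cv=5` are assumptions chosen for this example. In practice you would pass your own `X` and `y` and pick grids suited to your problem.

```python
import numpy as np
from sklearn.linear_model import ElasticNetCV

# Synthetic data (an assumption for illustration)
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 6))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=0.1, size=200)

# Candidate values to try (hypothetical grids)
candidate_alphas = np.logspace(-3, 0, 4)      # 0.001, 0.01, 0.1, 1.0
candidate_l1_ratios = [0.1, 0.5, 0.9, 1.0]

# 5-fold cross-validation over every (alpha, l1_ratio) combination
model = ElasticNetCV(
    l1_ratio=candidate_l1_ratios,
    alphas=candidate_alphas,
    cv=5,
)
model.fit(X, y)

print("best alpha:", model.alpha_)
print("best l1_ratio:", model.l1_ratio_)
print("coefficients:", model.coef_)
```

After fitting, `alpha_` and `l1_ratio_` hold the combination with the best cross-validated error, and the model is refit on the full data with those values.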




## Step-by-Step Example

Suppose you have these data points:

| x | y |
|---|---|
| 1 | 2 |
| 2 | 3 |
| 3 | 6 |

You want to fit a line:

$$
y = w \cdot x + b
$$

Let’s set **$ b = 0 $** (to keep it simple).

Assume, for this example, that training produced **$ w = 1.5 $** (an assumed value, not a fitted one).

### Hyperparameters:
- $ \alpha = 1 $
- $ \text{l1\_ratio} = 0.6 $

### Elastic Net Cost Function:

$$
\text{Cost} = \frac{1}{n} \sum (y_i - w x_i)^2 + \alpha \left[ \text{l1\_ratio} \times |w| + (1 - \text{l1\_ratio}) \times w^2 \right]
$$

---

### Step 1a: Calculate MSE

$$
\text{MSE} = \frac{1}{3} \left[ (2 - 1.5 \times 1)^2 + (3 - 1.5 \times 2)^2 + (6 - 1.5 \times 3)^2 \right]
$$

Calculate each term:
- $ 2 - 1.5 \times 1 = 0.5 \rightarrow (0.5)^2 = 0.25 $
- $ 3 - 1.5 \times 2 = 0 \rightarrow (0)^2 = 0 $
- $ 6 - 1.5 \times 3 = 1.5 \rightarrow (1.5)^2 = 2.25 $

$$
\text{MSE} = \frac{1}{3} (0.25 + 0 + 2.25) = \frac{1}{3} (2.5) \approx 0.833
$$

---

### Step 1b: Calculate L1 and L2 Terms

- $ |w| = |1.5| = 1.5 $
- $ w^2 = (1.5)^2 = 2.25 $

- **L1 part:** $ 0.6 \times 1.5 = 0.9 $
- **L2 part:** $ 0.4 \times 2.25 = 0.9 $

Total regularization: $ 0.9 + 0.9 = 1.8 $

---

### Step 1c: Total Elastic Net Cost

$$
\text{Cost} = \text{MSE} + \alpha \times (\text{L1 part} + \text{L2 part}) = 0.833 + 1 \times (0.9 + 0.9) \approx 2.633
$$

---

**Final Answer:** The Elastic Net cost for this example is approximately **2.633**.

```python
import numpy as np

# Data and the assumed model parameters
x = np.array([1, 2, 3])
y = np.array([2, 3, 6])
w = 1.5
b = 0

# Hyperparameters
alpha = 1
l1_ratio = 0.6

# 1. Predict y
y_pred = w * x + b

# 2. Calculate MSE
mse = np.mean((y - y_pred)**2)

# 3. Calculate L1 and L2 penalties
l1 = abs(w)
l2 = w**2

# 4. Weight the penalties by l1_ratio and add them up
l1_part = l1_ratio * l1
l2_part = (1 - l1_ratio) * l2
reg_total = l1_part + l2_part

# 5. Total Elastic Net cost
cost = mse + alpha * reg_total
print(cost)
```

Output:

```
2.6333333333333333
```
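
The walkthrough assumes $ w = 1.5 $ rather than fitting it. As a hedged follow-up, the sketch below finds the weight that minimizes this document's simplified cost for the same data, with $ b = 0 $. Setting the derivative of the cost to zero gives a soft-thresholding formula, and a brute-force grid search checks it. The helper names and the grid are assumptions for this sketch. scikit-learn scales its objective differently, so its fitted coefficient need not match.

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0])
y = np.array([2.0, 3.0, 6.0])
alpha, l1_ratio = 1.0, 0.6
n = len(x)

def cost(w):
    # Simplified Elastic Net cost from the worked example (b = 0)
    mse = np.mean((y - w * x) ** 2)
    return mse + alpha * (l1_ratio * abs(w) + (1 - l1_ratio) * w ** 2)

def soft_threshold(z, t):
    # Shrinks z toward zero by t; returns 0 when |z| <= t
    return np.sign(z) * max(abs(z) - t, 0.0)

# Closed-form minimizer obtained by setting d(cost)/dw = 0
numerator = soft_threshold(2 * np.sum(x * y) / n, alpha * l1_ratio)
denominator = 2 * np.sum(x ** 2) / n + 2 * alpha * (1 - l1_ratio)
w_star = numerator / denominator

# Brute-force check on a grid of candidate weights (step 0.001)
grid = np.linspace(-1.0, 4.0, 5001)
w_grid = grid[np.argmin([cost(w) for w in grid])]

print("closed form:", w_star)
print("grid search:", w_grid)
print("cost at w_star:", cost(w_star), "cost at 1.5:", cost(1.5))
```

Under these assumptions the minimizer comes out near 1.65, with a slightly lower cost than the assumed $ w = 1.5 $.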