# Power Transformations in Feature Engineering | [Link](https://github.com/AdilShamim8/50-Days-of-Machine-Learning/tree/main/Day%2015%20Power%20Transformer)

Power transformations help make data more Gaussian-like by stabilizing variance and reducing skewness. Two popular methods are the Box‑Cox transform and the Yeo‑Johnson transform.

---

## 1. Box‑Cox Transformation

### Formula

<p>For a feature <i>x</i> (where <i>x &gt; 0</i>), the Box-Cox transformation is defined as:</p>

$$  
x^{(\lambda)} =   
\begin{cases}  
\frac{x^\lambda - 1}{\lambda} & \text{if } \lambda \neq 0, \\
\log(x) & \text{if } \lambda = 0.  
\end{cases}  
$$  

<ul>  
    <li><strong>x</strong>: Original feature value (must be positive)</li>  
    <li><strong>λ</strong>: Transformation parameter (estimated from the data, typically by maximum likelihood)</li>  
    <li><strong>x<sup>(λ)</sup></strong>: Transformed feature value</li>  
</ul>  
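
To make the formula concrete, here is a minimal NumPy sketch that applies it for a fixed λ. The helper name `box_cox` is hypothetical and not part of any library, and λ is passed in by hand here rather than estimated from the data.

```python
import numpy as np

def box_cox(x, lam):
    """Apply the Box-Cox formula for a fixed lambda (hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    if np.any(x <= 0):
        raise ValueError("Box-Cox requires strictly positive values")
    if lam == 0:
        return np.log(x)              # the lambda = 0 branch
    return (x ** lam - 1.0) / lam     # the lambda != 0 branch

x = np.array([1.0, 2.0, 3.0, 10.0, 20.0])
print(box_cox(x, 0.5))   # compresses large values, similar to a square root
print(box_cox(x, 0.0))   # natural log
print(box_cox(x, 1.0))   # equals x - 1: a shift only, shape unchanged
```

The λ = 0 branch is the limit of the general branch as λ approaches 0, which is why the two cases join smoothly.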

### When to Use

- Use Box‑Cox when all feature values are strictly positive.
- It is especially useful when data exhibits strong right skewness.
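
As a rough illustration of the skewness point, the sketch below measures sample skewness before and after a Box-Cox transform. The log-normal sample is synthetic and chosen only because it is strictly positive and right-skewed; it is an assumption for demonstration, not data from this tutorial.

```python
import numpy as np
from scipy.stats import skew
from sklearn.preprocessing import PowerTransformer

# Synthetic right-skewed, strictly positive sample (illustrative only)
rng = np.random.default_rng(0)
x = rng.lognormal(mean=0.0, sigma=1.0, size=1000).reshape(-1, 1)

pt = PowerTransformer(method="box-cox")
x_t = pt.fit_transform(x)

skew_before = skew(x.ravel())
skew_after = skew(x_t.ravel())
print(f"Skewness before: {skew_before:.2f}")
print(f"Skewness after:  {skew_after:.2f}")
print(f"Estimated lambda: {pt.lambdas_[0]:.2f}")
```

For log-normal data the estimated λ should land near 0, since the log is the transform that makes such data Gaussian.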

---

## 2. Yeo‑Johnson Transformation

### Formula

The Yeo-Johnson transform extends the Box-Cox method to handle zero and negative values as well as positive ones:  

For \( x \geq 0 \):  

$$  
x^{(\lambda)} =   
\begin{cases}  
\frac{(x+1)^\lambda - 1}{\lambda} & \text{if } \lambda \neq 0, \\
\log(x+1) & \text{if } \lambda = 0.  
\end{cases}  
$$  

For \( x < 0 \):  

$$  
x^{(\lambda)} =   
\begin{cases}  
-\frac{(-x+1)^{2-\lambda} - 1}{2-\lambda} & \text{if } \lambda \neq 2, \\
-\log(-x+1) & \text{if } \lambda = 2.  
\end{cases}  
$$  

<ul>  
    <li><strong>x</strong>: Original feature value (can be negative, zero, or positive)</li>  
    <li><strong>λ</strong>: Transformation parameter (learned from data)</li>  
    <li><strong>x<sup>(λ)</sup></strong>: Transformed feature value</li>  
</ul>  
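
The piecewise definition can be sketched in NumPy as follows. The helper name `yeo_johnson` is hypothetical, λ is fixed by hand rather than estimated, and zero is handled by the non-negative branch.

```python
import numpy as np

def yeo_johnson(x, lam):
    """Apply the Yeo-Johnson formula for a fixed lambda (hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    out = np.empty_like(x)
    pos = x >= 0  # zero belongs to the non-negative branch

    # Non-negative values: Box-Cox applied to x + 1
    if lam != 0:
        out[pos] = ((x[pos] + 1.0) ** lam - 1.0) / lam
    else:
        out[pos] = np.log1p(x[pos])

    # Negative values: mirrored form with exponent 2 - lambda
    if lam != 2:
        out[~pos] = -(((-x[~pos] + 1.0) ** (2.0 - lam) - 1.0) / (2.0 - lam))
    else:
        out[~pos] = -np.log1p(-x[~pos])
    return out

x = np.array([-3.0, -1.0, 0.0, 1.0, 3.0])
print(yeo_johnson(x, 1.0))  # lambda = 1 leaves the data unchanged
print(yeo_johnson(x, 0.0))  # log branch for x >= 0
print(yeo_johnson(x, 2.0))  # log branch for x < 0
```

Note that λ = 1 is the identity for Yeo-Johnson, whereas for Box-Cox it shifts the data by 1.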

### When to Use

- Use Yeo‑Johnson when your feature contains zero or negative values.
- It offers variance stabilization similar to that of Box‑Cox while accepting a broader range of data.
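
The choice between the two methods can be checked directly. In the sketch below, the small array `X` is made up for illustration; scikit-learn's `PowerTransformer` raises a `ValueError` when Box-Cox is fitted on non-positive data, while Yeo-Johnson fits the same data without complaint.

```python
import numpy as np
from sklearn.preprocessing import PowerTransformer

# Made-up feature containing a negative value and a zero
X = np.array([[-2.0], [0.0], [1.0], [4.0], [9.0]])

try:
    PowerTransformer(method="box-cox").fit(X)
    box_cox_error = None
except ValueError as err:
    box_cox_error = err
    print("Box-Cox failed:", err)

# Yeo-Johnson accepts the same data
X_yj = PowerTransformer(method="yeo-johnson").fit_transform(X)
print(X_yj.ravel())
```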

---

## 3. Python Code Examples

Below are Python code snippets using scikit‑learn’s `PowerTransformer` to apply both Box‑Cox and Yeo‑Johnson transformations.

### 3.1 Box‑Cox Transformation Example

> **Note:** Box‑Cox requires all values to be strictly positive.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import PowerTransformer
import matplotlib.pyplot as plt

# Create a sample dataset (all values must be > 0)
data_boxcox = pd.DataFrame({
    'Feature': [1, 2, 3, 10, 20, 30, 50, 100]
})

# Initialize PowerTransformer with method='box-cox'
pt_boxcox = PowerTransformer(method='box-cox', standardize=True)

# Apply the Box-Cox transformation
data_boxcox['BoxCox_Transformed'] = pt_boxcox.fit_transform(data_boxcox[['Feature']])

print("Box-Cox Transformation:")
print(data_boxcox)

# Plot original vs. Box-Cox transformed data
plt.figure(figsize=(10, 4))
plt.subplot(1, 2, 1)
plt.plot(data_boxcox['Feature'], marker='o')
plt.title("Original Data")
plt.xlabel("Index")
plt.ylabel("Feature Value")

plt.subplot(1, 2, 2)
plt.plot(data_boxcox['BoxCox_Transformed'], marker='o', color='green')
plt.title("Box-Cox Transformed Data")
plt.xlabel("Index")
plt.ylabel("Transformed Value")
plt.tight_layout()
plt.show()
```
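
After fitting, it can be useful to inspect what the transformer learned. The following sketch reuses the same sample values in a plain NumPy array (the variable names are illustrative) to read the estimated λ from the `lambdas_` attribute, confirm that `standardize=True` yields zero mean and unit variance, and map the values back with `inverse_transform`.

```python
import numpy as np
from sklearn.preprocessing import PowerTransformer

# Same sample values as above, as a single-column array
X = np.array([[1], [2], [3], [10], [20], [30], [50], [100]], dtype=float)

pt = PowerTransformer(method="box-cox", standardize=True)
X_t = pt.fit_transform(X)

# One estimated lambda per feature column
print("Estimated lambda:", pt.lambdas_[0])

# standardize=True rescales the transformed feature
print("Mean:", X_t.mean(), "Std:", X_t.std())

# inverse_transform undoes both the scaling and the power transform
X_back = pt.inverse_transform(X_t)
print("Round trip OK:", np.allclose(X_back, X))
```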

---

### 3.2 Yeo‑Johnson Transformation Example

> **Note:** Yeo‑Johnson can handle positive, zero, and negative values.

```python
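# Imports repeated so this block runs on its own
import pandas as pd
from sklearn.preprocessing import PowerTransformer
import matplotlib.pyplot as plt

# Create a sample dataset with negative, zero, and positive values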
data_yeojohnson = pd.DataFrame({
    'Feature': [-10, -5, -1, 0, 1, 5, 10, 20, 50]
})

# Initialize PowerTransformer with method='yeo-johnson'
pt_yeojohnson = PowerTransformer(method='yeo-johnson', standardize=True)

# Apply the Yeo-Johnson transformation
data_yeojohnson['YeoJohnson_Transformed'] = pt_yeojohnson.fit_transform(data_yeojohnson[['Feature']])

print("\nYeo-Johnson Transformation:")
print(data_yeojohnson)

# Plot original vs. Yeo-Johnson transformed data
plt.figure(figsize=(10, 4))
plt.subplot(1, 2, 1)
plt.plot(data_yeojohnson['Feature'], marker='o')
plt.title("Original Data")
plt.xlabel("Index")
plt.ylabel("Feature Value")

plt.subplot(1, 2, 2)
plt.plot(data_yeojohnson['YeoJohnson_Transformed'], marker='o', color='orange')
plt.title("Yeo-Johnson Transformed Data")
plt.xlabel("Index")
plt.ylabel("Transformed Value")
plt.tight_layout()
plt.show()
```

---

## Conclusion

Power transformations, specifically the Box‑Cox and Yeo‑Johnson transforms, make a feature's distribution more Gaussian-like and stabilize its variance. Use Box‑Cox when your data is strictly positive and Yeo‑Johnson when it includes zero or negative values. The scikit‑learn `PowerTransformer` makes it simple to integrate these transformations into your preprocessing pipelines.
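
As one possible way to wire this into a pipeline, the sketch below chains `PowerTransformer` with a linear model using `make_pipeline`, so that λ is estimated on the training split only. The synthetic data, the choice of `LinearRegression`, and the split sizes are all assumptions made for illustration.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PowerTransformer

# Synthetic, strictly positive skewed feature; target is linear in log(x)
rng = np.random.default_rng(42)
X = rng.lognormal(mean=0.0, sigma=1.0, size=(200, 1))
y = 3.0 * np.log(X[:, 0]) + rng.normal(scale=0.1, size=200)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# The pipeline fits lambda on X_train and reuses it when scoring X_test
model = make_pipeline(PowerTransformer(method="box-cox"), LinearRegression())
model.fit(X_train, y_train)

r2 = model.score(X_test, y_test)
print(f"Test R^2: {r2:.3f}")
```

Fitting the transformer inside the pipeline avoids leaking information from the test split into the estimated λ.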