# Notebook 06 — Polynomial Features & Feature Interactions
📁 File name: 06_polynomial_features.ipynb

This notebook teaches how to expand your feature set by adding polynomial terms and interactions using `PolynomialFeatures` from scikit-learn, a technique that lets linear models capture non-linear relationships.

📒 Notebook Sections
1. Title & Intro
2. What Are Polynomial Features?
3. Select Numeric Columns
4. Generate Polynomial Features
5. Compare Different Degrees (1 vs 2 vs 3)
6. When to Use Polynomial Features
7. Summary & What’s Next

## 1. Title & Introduction (Markdown)
### 06 — Polynomial Features & Interactions

In this notebook, we’ll explore how to expand our feature space using **polynomial combinations** of numeric features.

We’ll use scikit-learn’s `PolynomialFeatures` to:

- Generate higher-order (nonlinear) terms  
- Create interaction terms between features  
- Understand when and how to use them

## 2. What Are Polynomial Features? (Markdown)
### What Are Polynomial Features?

Polynomial features transform your existing numeric columns into new ones that capture:

- Powers (e.g. \( x^2, x^3 \))
- Interactions (e.g. \( x_1 \cdot x_2 \))

Example:

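Given `x1` and `x2`, `degree=2` with `include_bias=False` produces five columns, listed here in scikit-learn's output order:

- `x1`
- `x2`
- `x1^2`
- `x1 x2` (the interaction \( x_1 \cdot x_2 \))
- `x2^2`

With the default `include_bias=True`, a constant column of ones is added in front of these.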



## 3. Select Numeric Columns

In [None]:
import pandas as pd

# Load dataset
df = pd.read_csv("../data/sample_data.csv")

# Select 2–3 numeric columns for simplicity
numeric_cols = ["Age", "Income"]  # Replace with relevant columns from your dataset
df[numeric_cols].head()
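
If you are not sure which columns are numeric, pandas can list them for you. The sketch below uses a small hypothetical frame (`toy`, with a made-up `City` column) rather than the real dataset. It also drops rows with missing values, on the assumption that the downstream transformer will reject NaN input.

```python
import pandas as pd

# Hypothetical stand-in for the real dataset
toy = pd.DataFrame({
    "Age": [25, 32, 47, None],
    "Income": [40000.0, 52000.0, 88000.0, 61000.0],
    "City": ["Oslo", "Lima", "Pune", "Kyiv"],
})

# Keep only numeric dtypes; the text column "City" is excluded
numeric_candidates = toy.select_dtypes(include="number").columns.tolist()

# Drop rows with missing values in the selected columns
clean = toy[numeric_candidates].dropna()

print(numeric_candidates)
print(clean.shape)
```

Dropping rows is only one option. Imputing missing values keeps more data and may suit your dataset better.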

## 4. Generate Polynomial Features (Degree 2)

In [None]:
from sklearn.preprocessing import PolynomialFeatures

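# degree=2 adds squares and pairwise products; include_bias=False omits the constant column of ones
poly = PolynomialFeatures(degree=2, include_bias=False)
# Note: the selected columns must not contain NaN; impute or drop missing values first
poly_features = poly.fit_transform(df[numeric_cols])

# Column names for expanded features
feature_names = poly.get_feature_names_out(numeric_cols)

# Reuse df's index so the expanded frame stays row-aligned with the original
df_poly = pd.DataFrame(poly_features, columns=feature_names, index=df.index)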
df_poly.head()

## 5. Compare Different Degrees

In [None]:
for deg in [1, 2, 3]:
    poly = PolynomialFeatures(degree=deg, include_bias=False)
    transformed = poly.fit_transform(df[numeric_cols])
    print(f"Degree {deg} → Output shape: {transformed.shape}")

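As the degree increases, the number of output features grows quickly. With \( n \) input features and degree \( d \), the expansion has \( \binom{n+d}{d} - 1 \) columns when `include_bias=False`, so the two columns used here give 2, 5, and 9 features for degrees 1, 2, and 3.

- Degree 1: Only original features  
- Degree 2: Adds squares + pairwise interactions  
- Degree 3: Adds cubes + mixed terms such as \( x_1^2 x_2 \); true 3-way interactions (\( x_1 x_2 x_3 \)) appear only with three or more input columns

Be careful: a large expansion can lead to **overfitting** or slow training.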
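
The shapes printed by the loop can be predicted without fitting anything. A full polynomial expansion of \( n \) features up to degree \( d \) has \( \binom{n+d}{d} \) terms including the constant, so one fewer without the bias column. The helper names below (`n_poly_features`, `count_by_enumeration`) are hypothetical and use only the standard library; the second cross-checks the formula by listing every monomial.

```python
from itertools import combinations_with_replacement
from math import comb


def n_poly_features(n_features, degree, include_bias=False):
    """Number of columns in a full polynomial expansion up to `degree`."""
    total = comb(n_features + degree, degree)
    return total if include_bias else total - 1


def count_by_enumeration(n_features, degree):
    """Cross-check: count every monomial of degree 1..degree explicitly."""
    return sum(
        1
        for d in range(1, degree + 1)
        for _ in combinations_with_replacement(range(n_features), d)
    )


for n in (2, 10, 50):
    counts = [n_poly_features(n, d) for d in (1, 2, 3)]
    print(f"{n:>2} input columns -> degrees 1/2/3 give {counts}")
```

With 2 input columns the counts are 2, 5, and 9. With 50 input columns, degree 3 already yields 23425 columns, which is why high degrees are rarely practical on wide datasets.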


## 6. When to Use Polynomial Features (Markdown)
### When to Use Polynomial Features

✅ Useful when:

- You suspect non-linear relationships
- You use linear models (e.g., Linear/Logistic Regression)
- You want to enrich the feature space

⚠️ Avoid when:

- You already use non-linear models (trees, XGBoost), which learn non-linearities and interactions on their own
- Your data is small (risk of overfitting)
- You have too many input columns (curse of dimensionality)
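
One way to decide is to compare cross-validated scores with and without the expansion. The sketch below is an illustration under stated assumptions, not a recipe: the data is synthetic with a deliberately quadratic target, and the choice of `Ridge` with `alpha=1.0`, 5 folds, and the helper name `cv_r2` are all arbitrary.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler

# Synthetic data: the target depends on a square and an interaction
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = X[:, 0] ** 2 + X[:, 0] * X[:, 1] + rng.normal(scale=0.1, size=200)


def cv_r2(degree):
    """Mean 5-fold R^2 of a regularized linear model on expanded features."""
    model = make_pipeline(
        PolynomialFeatures(degree=degree, include_bias=False),
        StandardScaler(),  # expanded columns can differ widely in scale
        Ridge(alpha=1.0),  # regularization limits overfitting on the extra columns
    )
    return cross_val_score(model, X, y, cv=5, scoring="r2").mean()


linear_r2 = cv_r2(1)
quadratic_r2 = cv_r2(2)
print(f"degree 1: R^2 = {linear_r2:.3f}")
print(f"degree 2: R^2 = {quadratic_r2:.3f}")
```

Putting the expansion inside a pipeline means it is refit within each fold. On real data, a small or negative gain from the higher degree suggests the extra columns are not worth their cost.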


## 7. Summary / What’s Next (Markdown)
### Summary

- Polynomial features expand numeric features with non-linear combinations
- `PolynomialFeatures(degree=2)` creates squares and pairwise interactions (plus a constant bias column unless `include_bias=False`)
- Be cautious with high degrees due to feature explosion

**Next Up**: `07_custom_transformations.ipynb`  
We’ll learn how to define your **own transformations** using `FunctionTransformer` or custom Python classes!
