# Polynomial Regression Demo
This notebook shows how polynomial regression can model nonlinear relationships by expanding features with powers of the original input.
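
As a minimal sketch of what "expanding features with powers" means, the cell below applies `PolynomialFeatures` to three made-up inputs. The values in `x` and the names `poly` and `x_expanded` are illustrative only and are not used elsewhere in the notebook.

```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures

# Three sample inputs with a single feature each (illustrative values).
x = np.array([[1.0], [2.0], [3.0]])

# degree=3 without the bias column yields the columns x, x^2, x^3.
poly = PolynomialFeatures(degree=3, include_bias=False)
x_expanded = poly.fit_transform(x)

print(x_expanded)
# [[ 1.  1.  1.]
#  [ 2.  4.  8.]
#  [ 3.  9. 27.]]
```

A linear model fitted on these expanded columns is linear in its coefficients but can trace a curve in the original input.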

## Imports

In [None]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import Pipeline
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from sklearn.datasets import make_regression
sns.set_theme()

## Generate nonlinear data
`make_regression` produces a target that is linear in the feature plus noise, so we add a quadratic term (`0.8 * x**2`) to create curvature.

In [None]:
X, y = make_regression(n_samples=200, n_features=1, noise=8.0, random_state=42, bias=30.0)
y = y + 0.8 * np.square(X[:, 0])
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

In [None]:
plt.scatter(X, y, alpha=0.6)
plt.title('Synthetic nonlinear data')
plt.xlabel('Feature X')
plt.ylabel('Target y')
plt.show()

## Build a polynomial regression pipeline
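Increasing the polynomial degree adds flexibility. Because the data contain a quadratic term by construction, a degree of 1 cannot capture the curvature, while a very high degree can start fitting noise. Try changing the `degree` variable to see underfitting vs. overfitting.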

In [None]:
degree = 3
model = Pipeline([
    ('poly', PolynomialFeatures(degree=degree, include_bias=False)),
    ('linreg', LinearRegression())
])
model.fit(X_train, y_train)
y_pred = model.predict(X_test)
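
One way to compare underfitting and overfitting is to fit several degrees and look at training and test error side by side. The sketch below assumes the same synthetic data as above (recreated so the cell runs on its own). The list of degrees and the names `candidate` and `results` are arbitrary choices for illustration.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import PolynomialFeatures

# Recreate the notebook's synthetic data so this cell is self-contained.
X, y = make_regression(n_samples=200, n_features=1, noise=8.0, random_state=42, bias=30.0)
y = y + 0.8 * np.square(X[:, 0])
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Fit one pipeline per candidate degree and record train/test MSE.
results = {}
for d in (1, 2, 3, 5, 10):
    candidate = Pipeline([
        ('poly', PolynomialFeatures(degree=d, include_bias=False)),
        ('linreg', LinearRegression()),
    ])
    candidate.fit(X_train, y_train)
    results[d] = {
        'train_mse': mean_squared_error(y_train, candidate.predict(X_train)),
        'test_mse': mean_squared_error(y_test, candidate.predict(X_test)),
    }

for d, scores in results.items():
    print(f"degree={d:2d}  train MSE={scores['train_mse']:.2f}  test MSE={scores['test_mse']:.2f}")
```

Training error cannot increase as the degree grows, because each model contains the previous one as a special case. A test error that stops improving or rises while training error keeps falling is the usual sign of overfitting.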

## Evaluate
We report MAE, MSE, RMSE, and R².

In [None]:
mse = mean_squared_error(y_test, y_pred)
metrics = {
    'MAE': mean_absolute_error(y_test, y_pred),
    'MSE': mse,
    'RMSE': np.sqrt(mse),
    'R2': r2_score(y_test, y_pred)
}
metrics
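
To make the four metrics concrete, the sketch below computes them by hand with NumPy and checks them against scikit-learn. The arrays `y_true` and `y_hat` are toy values invented for illustration; they are not results from this notebook.

```python
import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

# Toy values chosen only for illustration.
y_true = np.array([3.0, 5.0, 7.0, 9.0])
y_hat = np.array([2.5, 5.5, 7.0, 10.0])

residuals = y_true - y_hat                   # [0.5, -0.5, 0.0, -1.0]

mae_manual = np.mean(np.abs(residuals))      # mean absolute error
mse_manual = np.mean(residuals ** 2)         # mean squared error
rmse_manual = np.sqrt(mse_manual)            # same units as the target

# R^2 = 1 - (residual sum of squares) / (total sum of squares around the mean)
ss_res = np.sum(residuals ** 2)
ss_tot = np.sum((y_true - y_true.mean()) ** 2)
r2_manual = 1.0 - ss_res / ss_tot

print(mae_manual, mse_manual, rmse_manual, r2_manual)

# The manual values should agree with scikit-learn's implementations.
assert np.isclose(mae_manual, mean_absolute_error(y_true, y_hat))
assert np.isclose(mse_manual, mean_squared_error(y_true, y_hat))
assert np.isclose(r2_manual, r2_score(y_true, y_hat))
```

RMSE is often the easiest of the four to interpret because it is expressed in the units of the target, while R² is unitless and compares the model against always predicting the mean.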

## Visualize the fitted curve

In [None]:
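# Dense grid for a smooth curve. The +/-5 margin extends past the observed
# range of X, so the ends of the plotted curve are extrapolation.
X_plot = np.linspace(X.min() - 5, X.max() + 5, 300).reshape(-1, 1)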
y_plot = model.predict(X_plot)
plt.figure(figsize=(8,5))
plt.scatter(X_train, y_train, alpha=0.5, label='train')
plt.scatter(X_test, y_test, alpha=0.8, label='test')
plt.plot(X_plot, y_plot, color='darkorange', linewidth=2.5, label='prediction')
plt.xlabel('Feature X')
plt.ylabel('Target y')
plt.title(f'Polynomial Regression (degree={degree})')
plt.legend()
plt.show()

### Notes
- A higher degree gives a more flexible curve but increases the risk of overfitting.
- Regularization (Ridge/Lasso) penalizes large coefficients and can help tame them.
- For multi-dimensional features, consider scaling before polynomial expansion.
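
The notes on regularization and scaling can be sketched as follows. This is one possible setup, not part of the notebook's original pipeline: the helper `make_poly_model`, the choice of `degree=10`, and `alpha=10.0` are assumptions made for illustration, and the data are regenerated under new names (`X_demo`, `y_demo`) so the cell runs on its own.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler

# Same synthetic data recipe as earlier in the notebook.
X_demo, y_demo = make_regression(n_samples=200, n_features=1, noise=8.0, random_state=42, bias=30.0)
y_demo = y_demo + 0.8 * np.square(X_demo[:, 0])


def make_poly_model(regressor, degree=10):
    """Hypothetical helper: scale, expand to polynomial features, then regress."""
    return Pipeline([
        ('scale', StandardScaler()),
        ('poly', PolynomialFeatures(degree=degree, include_bias=False)),
        ('reg', regressor),
    ])


# Fit an unregularized and a Ridge-regularized model with the same features.
ols = make_poly_model(LinearRegression()).fit(X_demo, y_demo)
ridge = make_poly_model(Ridge(alpha=10.0)).fit(X_demo, y_demo)

# Compare the overall size of the fitted coefficient vectors.
ols_norm = np.linalg.norm(ols.named_steps['reg'].coef_)
ridge_norm = np.linalg.norm(ridge.named_steps['reg'].coef_)
print(f"OLS coefficient norm:   {ols_norm:.2f}")
print(f"Ridge coefficient norm: {ridge_norm:.2f}")
```

In practice `alpha` would be tuned, for example with cross-validation, rather than fixed by hand.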