# Polynomial regression

In [1]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns; sns.set_theme(font_scale=.75)

from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures
from sklearn.model_selection import cross_val_score

We create a dataset with a nonlinear relationship (note the `X**2` term used when generating the values in `y`).

In [33]:
m = 100

rng = np.random.default_rng(seed=42)

X = 5 * rng.random(m) - 2  # m values drawn uniformly from [-2, 3)
y = 4 + 3 * X**2 + 5 * X + rng.random(m) * 8  # Nonlinear relationship + uniform noise on [0, 8)

The nonlinear function looks like this:

$y = 4 + 3x^2 + 5x + \epsilon$

In [None]:
sns.scatterplot(x=X, y=y, alpha=.8, s=25)

### Train models

#### Without `PolynomialFeatures`

In [None]:
X_reshaped = X.reshape(-1, 1)  # The model expects an (m, n) matrix: m samples, n features
m1 = LinearRegression()
m1.fit(X_reshaped, y)
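
The reshape step can be illustrated on a tiny array. This is a minimal sketch with made-up values (the array contents are assumptions, not data from this notebook); it only shows how `reshape(-1, 1)` turns a 1-D array into a single-column matrix.

```python
import numpy as np

X_small = np.array([0.5, -1.0, 2.0])   # 1-D array, shape (3,)
X_small_col = X_small.reshape(-1, 1)   # -1 lets NumPy infer the row count

print(X_small.shape)       # (3,)
print(X_small_col.shape)   # (3, 1)
```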

In [None]:
# Sanity check: column 0 of the (m, 1) matrix holds the same values as the flattened 1-D array
(X_reshaped[:, 0] == X_reshaped.reshape(-1)).all()

In [None]:
x = np.linspace(-2, 3, m)  # Evenly spaced x values for drawing the fitted line

sns.scatterplot(x=X, y=y, alpha=.8, s=25)
sns.lineplot(x=x, y=m1.predict(x.reshape(-1, 1)), c="red", alpha=.65, lw=2, label="Predictions")

In [None]:
cross_val_score(m1, X_reshaped, y, scoring="neg_root_mean_squared_error").mean() * -1
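
The sign flip in the cell above comes from scikit-learn's convention that higher scores are better, so error metrics are reported negated. The sketch below uses toy data (the seed, sizes, and variable names are assumptions for illustration) to show that the scores are non-positive and that negating them gives ordinary RMSE values, one per fold.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

rng_demo = np.random.default_rng(seed=0)       # hypothetical seed for this sketch
X_demo = rng_demo.random((60, 1))              # toy features, shape (60, 1)
y_demo = 2 * X_demo[:, 0] + rng_demo.normal(scale=0.1, size=60)

scores = cross_val_score(
    LinearRegression(), X_demo, y_demo,
    scoring="neg_root_mean_squared_error",     # RMSE with the sign flipped
)
rmse_per_fold = -scores                        # back to ordinary, non-negative RMSE

print(len(scores))            # one score per fold (5 folds by default)
print(rmse_per_fold.mean())   # average RMSE across folds
```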

In [None]:
y.mean()

In [None]:
m1.intercept_, m1.coef_

m1: $y = 13.2 + 7.2x$

#### With `PolynomialFeatures`

In [None]:
m2 = make_pipeline(PolynomialFeatures(degree=2), LinearRegression())
m2.fit(X_reshaped, y)
cross_val_score(m2, X_reshaped, y).mean()  # No `scoring` given, so this is the default R² score, not RMSE as for m1
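
To compare the two models on equal terms, both can be scored with the same metric. The following is a sketch that regenerates the data with the same recipe as earlier in this notebook; the helper `cv_rmse` and the name `X_col` are hypothetical and introduced only for this illustration.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

# Same data-generating recipe as earlier in the notebook
m = 100
rng = np.random.default_rng(seed=42)
X = 5 * rng.random(m) - 2
y = 4 + 3 * X**2 + 5 * X + rng.random(m) * 8
X_col = X.reshape(-1, 1)

def cv_rmse(model):
    """Mean cross-validated RMSE (hypothetical helper, 5 folds by default)."""
    scores = cross_val_score(model, X_col, y, scoring="neg_root_mean_squared_error")
    return -scores.mean()

rmse_linear = cv_rmse(LinearRegression())
rmse_poly = cv_rmse(make_pipeline(PolynomialFeatures(degree=2), LinearRegression()))

print(rmse_linear, rmse_poly)   # the degree-2 pipeline should have the lower RMSE
```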

In [None]:
sns.scatterplot(x=X, y=y, alpha=.8, s=25)
sns.lineplot(x=x, y=m2.predict(x.reshape(-1, 1)), c="red", alpha=.65, lw=2, label="Predictions") # type: ignore

`m2` is a pipeline. To access its individual steps, such as the final *estimator*, we use `named_steps`.
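
The step names used with `named_steps` are generated by `make_pipeline` from the lowercased class names. A minimal sketch on an unfitted pipeline (the variable names `pipe` and `step_names` are assumptions for illustration):

```python
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

pipe = make_pipeline(PolynomialFeatures(degree=2), LinearRegression())

step_names = list(pipe.named_steps)   # keys are the lowercased class names
print(step_names)                     # ['polynomialfeatures', 'linearregression']

# Indexing the pipeline by position returns the same step object
last_step = pipe[-1]
print(last_step is pipe.named_steps["linearregression"])   # True
```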

In [None]:
m2.named_steps["polynomialfeatures"].n_output_features_

In [None]:
m2.named_steps["linearregression"].intercept_, m2.named_steps["linearregression"].coef_

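True: $y = 4 + 3x^2 + 5x + \epsilon$

m2: $y = 8.2 + 2.9x^2 + 4.8x$

The fitted intercept is close to 8 rather than 4 because the noise `rng.random(m) * 8` is uniform on $[0, 8)$ with mean 4, and that mean is absorbed into the intercept.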
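
As a rough check of how the fitted coefficients relate to the generating formula, the same recipe can be run with a much larger sample. This is a sketch under assumptions: the sample size `n` and the names `X_big`, `y_big`, `noise`, and `model` are hypothetical and not part of the notebook above.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

n = 100_000                               # hypothetical, far larger than m = 100
rng = np.random.default_rng(seed=42)
X_big = 5 * rng.random(n) - 2
noise = rng.random(n) * 8                 # uniform on [0, 8), so its mean is about 4
y_big = 4 + 3 * X_big**2 + 5 * X_big + noise

model = make_pipeline(PolynomialFeatures(degree=2), LinearRegression())
model.fit(X_big.reshape(-1, 1), y_big)

lin = model.named_steps["linearregression"]
print(noise.mean())     # close to 4
print(lin.intercept_)   # close to 4 + 4 = 8
print(lin.coef_)        # columns [1, x, x^2]: roughly [0, 5, 3]
```

The coefficient on the constant column should come out as 0, since `LinearRegression` fits the intercept separately.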