### What is polynomial regression?
You can regard polynomial regression as a generalized case of linear regression. You assume a polynomial dependence between the output and the inputs, so the estimated regression function is a polynomial as well.

In other words, in addition to linear terms like 𝑏₁𝑥₁, your regression function 𝑓 can include nonlinear terms such as 𝑏₂𝑥₁², 𝑏₃𝑥₁³, or even 𝑏₄𝑥₁𝑥₂, 𝑏₅𝑥₁²𝑥₂.

The simplest example of polynomial regression has a single independent variable, and the estimated regression function is a polynomial of degree two: 𝑓(𝑥) = 𝑏₀ + 𝑏₁𝑥 + 𝑏₂𝑥².

### What difference does it make in creating the estimator?

Coding-wise? Pretty much no change!
Despite the inclusion of nonlinear terms, the goal remains unchanged: calculate the weights 𝑏₀, 𝑏₁, and 𝑏₂ that minimize the SSR (sum of squared residuals).
Each power of an input can simply be treated as a separate input. In addition to x<sub>1</sub>, the model receives x<sub>1</sub><sup>2</sup>, ..., x<sub>1</sub><sup>n</sup> as extra columns, so the function stays linear in the weights.
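
The idea can be sketched without scikit-learn at all. The example below is a minimal illustration, assuming made-up data generated exactly from 𝑦 = 1 + 2𝑥 + 3𝑥²; it stacks 1, 𝑥, and 𝑥² as columns and solves an ordinary least-squares problem with NumPy's `np.linalg.lstsq` (a stand-in for the estimator, not the method used in the cells of this notebook).

```python
import numpy as np

# Made-up data that follows y = 1 + 2x + 3x^2 exactly (an assumption for this sketch)
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = 1 + 2 * x + 3 * x**2

# Treat 1, x and x^2 as three separate input columns
X = np.column_stack([np.ones_like(x), x, x**2])

# Ordinary least squares: find the weights b that minimize the SSR
b, *_ = np.linalg.lstsq(X, y, rcond=None)
ssr = np.sum((y - X @ b) ** 2)

print(b)    # approximately [1. 2. 3.]
print(ssr)  # close to zero, since the data is noise-free
```

Because the problem is linear in 𝑏₀, 𝑏₁, and 𝑏₂, the same solver that handles plain linear regression recovers the polynomial's weights.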

> For more information on regression, see the markdown notes in SimpleLinearRegression.py

In [1]:
# Import packages
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures

In [12]:
# Create sample data
x = np.array([5, 15, 25, 35, 45, 55]).reshape((-1, 1))  # 2D
y = np.array([15, 11, 2, 8, 25, 32])                    # 1D

# Convert the data to polynomial features:
# fit the transformer and apply it in one step
x_ = PolynomialFeatures(degree=2, include_bias=False).fit_transform(x)

# print new data
print(x_)

[[   5.   25.]
 [  15.  225.]
 [  25.  625.]
 [  35. 1225.]
 [  45. 2025.]
 [  55. 3025.]]


<b>include_bias</b> is a Boolean (True by default) that decides whether to include the bias, or intercept, column of 1 values (True) or not (False). It is set to False in the cell above because LinearRegression fits its own intercept by default.
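
As a sketch of the alternative, assuming the same sample data, the cell below keeps `include_bias=True` and switches off the model's own intercept with `fit_intercept=False`. The names `x_bias` and `model_bias` are introduced here only for illustration.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures

x = np.array([5, 15, 25, 35, 45, 55]).reshape((-1, 1))
y = np.array([15, 11, 2, 8, 25, 32])

# include_bias=True prepends a column of ones to the features
x_bias = PolynomialFeatures(degree=2, include_bias=True).fit_transform(x)
print(x_bias[0])  # [ 1.  5. 25.]

# The ones column acts as the intercept, so the model should not add another
model_bias = LinearRegression(fit_intercept=False).fit(x_bias, y)

# coef_ now holds b0, b1 and b2 together; intercept_ stays at 0.0
print(model_bias.intercept_)
print(model_bias.coef_)
```

Both setups describe the same fitted polynomial; the only difference is whether 𝑏₀ is reported in `intercept_` or as the first entry of `coef_`.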

In [9]:
# Create Model and fit
model = LinearRegression().fit(x_, y)

In [11]:
# Show R square
print(f"coefficient of determination: {model.score(x_, y)}")

# Show b0
print(f"intercept: {model.intercept_}")

# Show weights b1 and b2 (one per column of x_)
print(f"coefficients: {model.coef_}")

coefficient of determination: 0.8908516262498563
intercept: 21.372321428571418
coefficients: [-1.32357143  0.02839286]
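
To see where the coefficient of determination comes from, the sketch below recomputes it by hand as 1 − SSR/SST. It assumes the rounded intercept and coefficients printed above, copied in as constants, so the result agrees with `model.score` only up to rounding.

```python
import numpy as np

x = np.array([5, 15, 25, 35, 45, 55], dtype=float)
y = np.array([15, 11, 2, 8, 25, 32], dtype=float)

# Weights copied from the printed output above (rounded values)
b0, b1, b2 = 21.372321428571418, -1.32357143, 0.02839286

# Evaluate f(x) = b0 + b1*x + b2*x^2 on the training inputs
y_hat = b0 + b1 * x + b2 * x**2

ssr = np.sum((y - y_hat) ** 2)     # sum of squared residuals
sst = np.sum((y - y.mean()) ** 2)  # total sum of squares
r2 = 1 - ssr / sst
print(r2)  # roughly 0.8909
```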


In [14]:
# Predict on the training inputs
y_pred = model.predict(x_)
print(y_pred)

[15.46428571  7.90714286  6.02857143  9.82857143 19.30714286 34.46428571]
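
Predicting for an input the model has not seen follows the same pattern, with one extra step: the new value has to pass through the same fitted transformer first. The sketch below assumes a hypothetical new input 𝑥 = 60 and keeps the transformer in a variable (`transformer`, a name introduced here) so it can be reused.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures

x = np.array([5, 15, 25, 35, 45, 55]).reshape((-1, 1))
y = np.array([15, 11, 2, 8, 25, 32])

# Keep the fitted transformer so new inputs get the same columns
transformer = PolynomialFeatures(degree=2, include_bias=False)
x_ = transformer.fit_transform(x)
model = LinearRegression().fit(x_, y)

# Hypothetical new input; it must be 2D, just like the training x
x_new = np.array([[60]])
x_new_ = transformer.transform(x_new)  # [[  60. 3600.]]
y_new = model.predict(x_new_)

# The prediction equals b0 + b1*x + b2*x^2 evaluated at x = 60
manual = model.intercept_ + model.coef_[0] * 60 + model.coef_[1] * 60**2
print(y_new, manual)
```

Note that `transform` is used for the new data rather than `fit_transform`, since the transformer was already fitted on the training inputs.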
