* Polynomial regression is a special case of multiple linear regression: the relationship between the input and the response is estimated as an nth-degree polynomial, while the model stays linear in its coefficients. Polynomial regression is sensitive to outliers, so even one or two outliers can badly affect its performance.

* To fit such a model, we need to transform the array of inputs so that it includes non-linear terms such as x^2.

* **PolynomialFeatures generates polynomial and interaction features. It builds a new feature matrix consisting of all polynomial combinations of the features with degree less than or equal to the specified degree. For example, if an input sample is two-dimensional and of the form [a, b], the degree-2 polynomial features are [1, a, b, a^2, ab, b^2].**


* Feature interaction is the combined effect of several features on the prediction of a variable. Another aspect of feature interaction is how one feature varies with respect to another feature it interacts with. Terms that capture this, such as the product ab, are often referred to as interaction variables.
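
To make the [a, b] example concrete, here is a minimal sketch that expands a single two-feature sample. The values a = 2 and b = 3 are chosen only for illustration and do not come from the data used later.

```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures

# One sample with two features: a = 2, b = 3 (illustrative values)
sample = np.array([[2, 3]])

# Default settings: degree=2, interaction_only=False, include_bias=True
features = PolynomialFeatures(degree=2).fit_transform(sample)

# Columns are ordered as [1, a, b, a^2, ab, b^2]
print(features)
```

The fifth column, ab, is the interaction term: it is the only degree-2 feature that depends on both inputs at once.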

In [1]:
#Importing the necessary libraries
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures

* The input data must be a **two-dimensional array**, which is why the reshape() function is used.
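
As a quick sketch of what reshape() does here, the snippet below compares the shape of the array before and after the call. The variable names `flat` and `column` are only illustrative.

```python
import numpy as np

flat = np.array([5, 15, 25, 35, 45, 55])
print(flat.shape)    # one-dimensional: (6,)

# -1 lets NumPy infer the number of rows; 1 fixes a single column
column = flat.reshape((-1, 1))
print(column.shape)  # two-dimensional: (6, 1)
```

Each row is then one sample and the single column is the one input feature.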

In [10]:
#Loading the data
x = np.array([5,15,25,35,45,55]).reshape((-1,1))
y = np.array([15,11,2,8,25,32])

In [13]:
x

array([[ 5],
       [15],
       [25],
       [35],
       [45],
       [55]])

In [14]:
y

array([15, 11,  2,  8, 25, 32])

In [15]:
#Transforming the input data
transformer = PolynomialFeatures(degree=2,include_bias=False)

* **degree** - is an integer (default=2) that represents the degree of the polynomial regression function.


* **interaction_only** - is a Boolean (default=False) that decides whether to include only interaction features (True) or all features (False).


* **include_bias** - is a Boolean (default=True) that decides whether to include the bias column (True) or not (False). The bias column is the feature in which all polynomial powers are zero, i.e. a column of ones that acts as the intercept term in a linear model.
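
The effect of these parameters is easiest to see side by side. The following is a small sketch on a single sample [a, b] = [2, 3]; the sample values and variable names are assumptions made for illustration.

```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures

sample = np.array([[2, 3]])  # a = 2, b = 3

# All degree-2 features, with the bias column: [1, a, b, a^2, ab, b^2]
full = PolynomialFeatures(degree=2).fit_transform(sample)

# Only products of distinct features, so no squares: [1, a, b, ab]
inter = PolynomialFeatures(degree=2, interaction_only=True).fit_transform(sample)

# Same as `full` but without the leading column of ones: [a, b, a^2, ab, b^2]
no_bias = PolynomialFeatures(degree=2, include_bias=False).fit_transform(sample)

print(full)
print(inter)
print(no_bias)
```

Dropping the bias column is a reasonable choice when the features are passed to LinearRegression, which fits its own intercept by default.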

In [16]:
transformer.fit(x)

PolynomialFeatures(include_bias=False)

* Once the transformer is fitted, it is ready to create a new, modified input array, which is done using the transform() function.

In [18]:
x_transform = transformer.transform(x)

* The three steps above (creating, fitting, and applying the transformer) can be replaced with a single line of code using the fit_transform() function.

In [19]:
x_transform = PolynomialFeatures(degree=2,include_bias=False).fit_transform(x)
x_transform

array([[   5.,   25.],
       [  15.,  225.],
       [  25.,  625.],
       [  35., 1225.],
       [  45., 2025.],
       [  55., 3025.]])
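
As a sanity check, the sketch below confirms that calling fit() and then transform() gives the same array as a single fit_transform() call. The names `two_step` and `one_step` are introduced here only for the comparison.

```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures

x = np.array([5, 15, 25, 35, 45, 55]).reshape((-1, 1))

# Separate fit and transform calls
transformer = PolynomialFeatures(degree=2, include_bias=False)
transformer.fit(x)
two_step = transformer.transform(x)

# Combined call
one_step = PolynomialFeatures(degree=2, include_bias=False).fit_transform(x)

print(np.array_equal(two_step, one_step))  # True
```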

In [20]:
#Creating the model
model = LinearRegression()

In [21]:
#Fitting the model
model.fit(x_transform,y)

LinearRegression()

* The score() function returns the coefficient of determination, R^2.

In [22]:
#Calculating the score
r_sq = model.score(x_transform,y)
r_sq

0.8908516262498564
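
To see where this number comes from, R^2 can be recomputed by hand as 1 - SS_res / SS_tot, where SS_res is the sum of squared residuals and SS_tot is the total sum of squares around the mean of y. The sketch below rebuilds the model so that it runs on its own; `ss_res`, `ss_tot`, and `r_sq_manual` are names introduced here.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures

x = np.array([5, 15, 25, 35, 45, 55]).reshape((-1, 1))
y = np.array([15, 11, 2, 8, 25, 32])

x_transform = PolynomialFeatures(degree=2, include_bias=False).fit_transform(x)
model = LinearRegression().fit(x_transform, y)
y_pred = model.predict(x_transform)

ss_res = np.sum((y - y_pred) ** 2)    # variation left unexplained by the model
ss_tot = np.sum((y - y.mean()) ** 2)  # total variation in y
r_sq_manual = 1 - ss_res / ss_tot

print(np.isclose(r_sq_manual, model.score(x_transform, y)))  # True
```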

In [24]:
print(f"Intercept:{model.intercept_}")
print(f"Coefficients:{model.coef_}")

Intercept:21.37232142857144
Coefficients:[-1.32357143  0.02839286]


In [25]:
#Making predictions
y_pred = model.predict(x_transform)
y_pred

array([15.46428571,  7.90714286,  6.02857143,  9.82857143, 19.30714286,
       34.46428571])
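
These predictions are just the fitted polynomial evaluated at each input: intercept + b1*x + b2*x^2. The following sketch rebuilds the model and checks this by hand; `b0`, `b1`, `b2`, and `manual` are names introduced here for illustration.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures

x = np.array([5, 15, 25, 35, 45, 55]).reshape((-1, 1))
y = np.array([15, 11, 2, 8, 25, 32])

x_transform = PolynomialFeatures(degree=2, include_bias=False).fit_transform(x)
model = LinearRegression().fit(x_transform, y)

b0 = model.intercept_
b1, b2 = model.coef_  # coefficients of x and x^2, in that order

# Evaluate the polynomial directly on the original one-dimensional inputs
x_flat = x[:, 0]
manual = b0 + b1 * x_flat + b2 * x_flat ** 2

print(np.allclose(manual, model.predict(x_transform)))  # True
```

Note that predict() expects the transformed inputs, so any new x value would need to go through the same PolynomialFeatures transformation first.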

In [28]:
print(f"Coefficient of determination:{r_sq}")
print(f"Intercept:{model.intercept_}")
print(f"Coefficients:{model.coef_}")
print(f"Predicted response:{y_pred}")

Coefficient of determination:0.8908516262498564
Intercept:21.37232142857144
Coefficients:[-1.32357143  0.02839286]
Predicted response:[15.46428571  7.90714286  6.02857143  9.82857143 19.30714286 34.46428571]
