<h1><b>The Polynomial Regression Algorithm</b></h1>
<p align="justify">In this exercise you will study the <b><i>polynomial regression</i></b> algorithm. To train the model you will use the data in the file <b><i><a href="https://github.com/netmode/Stochastic-Processes-and-Optimization-in-Machine-Learning-Lab/blob/main/lab1/data2.csv">data2.csv</a></i></b>. The application is to estimate the pressure of a system given its temperature.</p>
<p align="justify">This Notebook includes commands for (a) loading the training data contained in the file <b><i>data2.csv</i></b>, (b) fitting a <b><i>linear regression</i></b> model to the training data, (c) fitting a <b><i>polynomial regression</i></b> model, whose degree is set by the <b><i>degree</i></b> parameter, to the training data, and (d) visualizing the training data together with the regression lines of the two models. Both regression models are trained with the <b><i>Scikit-Learn</i></b> library for <b><i>Python</i></b>. More information about the <b><i>polynomial regression</i></b> algorithm and the exercise code can be found <a href="https://www.geeksforgeeks.org/python-implementation-of-polynomial-regression/">here</a>.</p>
<p align="justify">In this exercise, you are asked to observe how the shape of the regression line changes for different values of the <b><i>degree</i></b> parameter, based on the provided training data.</p>
<p align="justify">First, you will install and load the necessary libraries.</p>


In [None]:
!pip install numpy
!pip install matplotlib
!pip install pandas
!pip install scikit-learn

import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression

Then, you will load the training data provided from the file <b><i>data2.csv</i></b>.

In [None]:
data = pd.read_csv('data2.csv')
print(data)

    sno  Temperature  Pressure
0     1            0    0.0002
1     2            5    0.0010
2     3           10    0.0017
3     4           15    0.0040
4     5           20    0.0200
5     6           25    0.0400
6     7           30    0.0100
7     8           35    0.0340
8     9           40    0.0500
9    10           45    0.0600
10   11           50    0.0700
11   12           55    0.0800
12   13           60    0.0900
13   14           65    0.1000
14   15           70    0.1100


Now, you will create the input and output variables for the machine learning algorithms from the provided training data.

In [None]:
X = data.iloc[:, 1:2].values
y = data.iloc[:, 2].values

print("Input")

print(X)

print("Labels")

print(y)
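The two slices above are deliberately different: `1:2` keeps the input as a two-dimensional array with one column, which is the shape Scikit-Learn estimators expect for features, while the plain index `2` yields a one-dimensional array of labels. A minimal sketch of the difference, using a hypothetical three-row frame `demo` built from the first rows of the printout above (the names `demo`, `X_demo` and `y_demo` are illustrative only):

```python
import pandas as pd

# Hypothetical three-row frame with the same columns as data2.csv.
demo = pd.DataFrame({
    "sno": [1, 2, 3],
    "Temperature": [0, 5, 10],
    "Pressure": [0.0002, 0.0010, 0.0017],
})

X_demo = demo.iloc[:, 1:2].values  # column slice -> 2-D array with one column
y_demo = demo.iloc[:, 2].values    # single column index -> 1-D array

print(X_demo.shape)  # (3, 1)
print(y_demo.shape)  # (3,)
```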

Next, you will train the <b><i>linear regression</i></b> model using the training data above.

In [None]:
lin = LinearRegression()
lin.fit(X, y)
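Once fitted, the linear model can be inspected directly. The sketch below is self-contained, so it re-enters the temperature and pressure values from the printout above instead of reading the file; the variable names `temperature`, `pressure` and `model` are assumptions made for this illustration.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Temperature and pressure values copied from the data2.csv printout above.
temperature = np.arange(0, 75, 5).reshape(-1, 1)
pressure = np.array([0.0002, 0.0010, 0.0017, 0.0040, 0.0200,
                     0.0400, 0.0100, 0.0340, 0.0500, 0.0600,
                     0.0700, 0.0800, 0.0900, 0.1000, 0.1100])

model = LinearRegression().fit(temperature, pressure)

# The fitted line is: pressure = intercept + slope * temperature.
print("slope:", model.coef_[0])
print("intercept:", model.intercept_)
# score() returns the coefficient of determination R^2 on the given data.
print("R^2 on training data:", model.score(temperature, pressure))
```

The R^2 value gives a single-number baseline against which the polynomial fits in the following cells can be compared.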

<h3><b><i>Question 1</i></b></h3> <p align="justify">Run the following code cells, one for each value of the <b><i>degree</i></b> parameter in {<i>2, 3, 4, 5, 8, 10, 12, 15</i>}, and record the output plots. How does the shape of the regression line of the <b><i>polynomial regression</i></b> model change as the value of the <b><i>degree</i></b> parameter increases? How does the training time of the <b><i>polynomial regression</i></b> model change as the value of the <b><i>degree</i></b> parameter increases?</p> <br>

<h4><b><i>degree</i></b> = 2</h4>

In [None]:
# Expand the temperature column into polynomial terms up to the chosen degree.
poly = PolynomialFeatures(degree=2)
X_poly = poly.fit_transform(X)

# Polynomial regression is linear regression on the expanded features.
lin2 = LinearRegression()
lin2.fit(X_poly, y)

plt.scatter(X, y, color='blue', label='Training Examples')

plt.plot(X, lin.predict(X), color='red', label='Linear Regression Line')
plt.plot(X, lin2.predict(poly.transform(X)), color='green', label='Polynomial Regression Line')
plt.legend()
plt.show()
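To see what `PolynomialFeatures` hands to the linear model, it helps to apply it to a tiny input. The sketch below uses two hypothetical temperature values (`X_demo` is not part of the exercise data) and degree 3, so each row becomes the powers 1, x, x^2, x^3 of the original value.

```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures

# Two hypothetical temperature values, shaped as a single column like X.
X_demo = np.array([[2.0], [3.0]])

expander = PolynomialFeatures(degree=3)
X_expanded = expander.fit_transform(X_demo)

# Columns are 1, x, x^2, x^3: rows [1, 2, 4, 8] and [1, 3, 9, 27].
print(X_expanded)
```

The linear model then learns one coefficient per column, which is why the fitted curve can bend even though the estimator itself is linear.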

<h4><b><i>degree</i></b> = 3</h4>

In [None]:
# Expand the temperature column into polynomial terms up to the chosen degree.
poly = PolynomialFeatures(degree=3)
X_poly = poly.fit_transform(X)

# Polynomial regression is linear regression on the expanded features.
lin2 = LinearRegression()
lin2.fit(X_poly, y)

plt.scatter(X, y, color='blue', label='Training Examples')

plt.plot(X, lin.predict(X), color='red', label='Linear Regression Line')
plt.plot(X, lin2.predict(poly.transform(X)), color='green', label='Polynomial Regression Line')
plt.legend()
plt.show()

<h4><b><i>degree</i></b> = 4</h4>

In [None]:
# Expand the temperature column into polynomial terms up to the chosen degree.
poly = PolynomialFeatures(degree=4)
X_poly = poly.fit_transform(X)

# Polynomial regression is linear regression on the expanded features.
lin2 = LinearRegression()
lin2.fit(X_poly, y)

plt.scatter(X, y, color='blue', label='Training Examples')

plt.plot(X, lin.predict(X), color='red', label='Linear Regression Line')
plt.plot(X, lin2.predict(poly.transform(X)), color='green', label='Polynomial Regression Line')
plt.legend()
plt.show()

<h4><b><i>degree</i></b> = 5</h4>

In [None]:
# Expand the temperature column into polynomial terms up to the chosen degree.
poly = PolynomialFeatures(degree=5)
X_poly = poly.fit_transform(X)

# Polynomial regression is linear regression on the expanded features.
lin2 = LinearRegression()
lin2.fit(X_poly, y)

plt.scatter(X, y, color='blue', label='Training Examples')

plt.plot(X, lin.predict(X), color='red', label='Linear Regression Line')
plt.plot(X, lin2.predict(poly.transform(X)), color='green', label='Polynomial Regression Line')
plt.legend()
plt.show()

<h4><b><i>degree</i></b> = 8</h4>

In [None]:
# Expand the temperature column into polynomial terms up to the chosen degree.
poly = PolynomialFeatures(degree=8)
X_poly = poly.fit_transform(X)

# Polynomial regression is linear regression on the expanded features.
lin2 = LinearRegression()
lin2.fit(X_poly, y)

plt.scatter(X, y, color='blue', label='Training Examples')

plt.plot(X, lin.predict(X), color='red', label='Linear Regression Line')
plt.plot(X, lin2.predict(poly.transform(X)), color='green', label='Polynomial Regression Line')
plt.legend()
plt.show()

<h4><b><i>degree</i></b> = 10</h4>

In [None]:
# Expand the temperature column into polynomial terms up to the chosen degree.
poly = PolynomialFeatures(degree=10)
X_poly = poly.fit_transform(X)

# Polynomial regression is linear regression on the expanded features.
lin2 = LinearRegression()
lin2.fit(X_poly, y)

plt.scatter(X, y, color='blue', label='Training Examples')

plt.plot(X, lin.predict(X), color='red', label='Linear Regression Line')
plt.plot(X, lin2.predict(poly.transform(X)), color='green', label='Polynomial Regression Line')
plt.legend()
plt.show()

<h4><b><i>degree</i></b> = 12</h4>

In [None]:
# Expand the temperature column into polynomial terms up to the chosen degree.
poly = PolynomialFeatures(degree=12)
X_poly = poly.fit_transform(X)

# Polynomial regression is linear regression on the expanded features.
lin2 = LinearRegression()
lin2.fit(X_poly, y)

plt.scatter(X, y, color='blue', label='Training Examples')

plt.plot(X, lin.predict(X), color='red', label='Linear Regression Line')
plt.plot(X, lin2.predict(poly.transform(X)), color='green', label='Polynomial Regression Line')
plt.legend()
plt.show()

<h4><b><i>degree</i></b> = 15</h4>

In [None]:
# Expand the temperature column into polynomial terms up to the chosen degree.
poly = PolynomialFeatures(degree=15)
X_poly = poly.fit_transform(X)

# Polynomial regression is linear regression on the expanded features.
lin2 = LinearRegression()
lin2.fit(X_poly, y)

plt.scatter(X, y, color='blue', label='Training Examples')

plt.plot(X, lin.predict(X), color='red', label='Linear Regression Line')
plt.plot(X, lin2.predict(poly.transform(X)), color='green', label='Polynomial Regression Line')
plt.legend()
plt.show()
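The cells above differ only in the value of `degree`, so the training-time part of Question 1 can also be approached with a loop. The following is a rough sketch, not part of the original exercise: it re-enters the values from the data2.csv printout, times each fit with `time.perf_counter`, and reports the training mean squared error. The names `temperature`, `pressure` and `results` are assumptions made for this illustration.

```python
import time

import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.preprocessing import PolynomialFeatures

# Temperature and pressure values copied from the data2.csv printout.
temperature = np.arange(0, 75, 5).reshape(-1, 1)
pressure = np.array([0.0002, 0.0010, 0.0017, 0.0040, 0.0200,
                     0.0400, 0.0100, 0.0340, 0.0500, 0.0600,
                     0.0700, 0.0800, 0.0900, 0.1000, 0.1100])

results = {}  # degree -> (fit time in seconds, training MSE)
for degree in [2, 3, 4, 5, 8, 10, 12, 15]:
    start = time.perf_counter()
    features = PolynomialFeatures(degree=degree).fit_transform(temperature)
    model = LinearRegression().fit(features, pressure)
    elapsed = time.perf_counter() - start

    mse = mean_squared_error(pressure, model.predict(features))
    results[degree] = (elapsed, mse)
    print(f"degree={degree:2d}  fit time={elapsed * 1e3:.3f} ms  training MSE={mse:.3e}")
```

With only 15 training points a single timing is dominated by measurement noise, so it is worth repeating each fit several times and averaging before drawing conclusions.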

<h3><b><i>Question 2</i></b></h3>

You are also provided with the file <b><i>data2b.csv</i></b>, which is the file <b><i>data2.csv</i></b> with one additional record, <b><i>(9, 38, 0.3)</i></b>. Run the following program for the parameter value <b><i>degree = 15</i></b>. How large is the change you observe in the shape of the regression line compared to the corresponding case in Question 1? What conclusion can you draw about the <b><i>polynomial regression</i></b> algorithm?

In [None]:
data = pd.read_csv('data2b.csv')
print(data)

X = data.iloc[:, 1:2].values
y = data.iloc[:, 2].values

lin = LinearRegression()
lin.fit(X, y)

# Expand the temperature column into polynomial terms up to the chosen degree.
poly = PolynomialFeatures(degree=15)
X_poly = poly.fit_transform(X)

# Polynomial regression is linear regression on the expanded features.
lin2 = LinearRegression()
lin2.fit(X_poly, y)

plt.scatter(X, y, color='blue', label='Training Examples')

plt.plot(X, lin.predict(X), color='red', label='Linear Regression Line')
plt.plot(X, lin2.predict(poly.transform(X)), color='green', label='Polynomial Regression Line')
plt.legend()
plt.show()
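The visual comparison asked for in Question 2 can be complemented with a number. The sketch below is an illustration under stated assumptions: it rebuilds both datasets in memory from the data2.csv printout, treating data2b.csv as the same points plus the record with temperature 38 and pressure 0.3, and it measures how far the fitted curve moves on a dense temperature grid. The helper `fit_curve` and the names `grid` and `shifts` are hypothetical and not part of the exercise code.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures

# Temperature and pressure values copied from the data2.csv printout.
temperature = np.arange(0, 75, 5, dtype=float)
pressure = np.array([0.0002, 0.0010, 0.0017, 0.0040, 0.0200,
                     0.0400, 0.0100, 0.0340, 0.0500, 0.0600,
                     0.0700, 0.0800, 0.0900, 0.1000, 0.1100])

# Assumed reconstruction of data2b.csv: the same points plus (38, 0.3).
temperature_b = np.append(temperature, 38.0)
pressure_b = np.append(pressure, 0.3)


def fit_curve(x, y, degree, grid):
    """Fit polynomial regression on (x, y) and evaluate it on grid."""
    expander = PolynomialFeatures(degree=degree)
    model = LinearRegression().fit(expander.fit_transform(x.reshape(-1, 1)), y)
    return model.predict(expander.transform(grid.reshape(-1, 1)))


# Dense grid, so changes between the training points are also captured.
grid = np.linspace(0, 70, 281)

shifts = {}  # degree -> largest absolute change of the fitted curve
for degree in (2, 15):
    before = fit_curve(temperature, pressure, degree, grid)
    after = fit_curve(temperature_b, pressure_b, degree, grid)
    shifts[degree] = float(np.max(np.abs(after - before)))
    print(f"degree={degree:2d}  max change in prediction={shifts[degree]:.4f}")
```

Comparing the two printed values indicates how strongly a single extra record affects a low-degree fit versus a high-degree fit on this data.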