# Polynomial regression

Polynomial regression lets us build a nonlinear model by fitting a polynomial curve to the data.

We have seen how to solve a linear problem. But in everyday life, we meet a lot of other models that are not linear: curves, sinusoids, ...

If we observe the contamination rate of a pandemic, the plot will not be a straight line but will probably look like an exponential curve.

For example, if we fit a linear model to this data:

![polynomial](./assets/polynom_1.JPG)

We can see that the bias of our predictions is high. We cannot say that our model is effective.

But the following model already gives me much more confidence.

![](./assets/poly_2.JPG)

A polynomial can be of any degree: the higher the degree, the more complex the shapes it can fit.

<img src="https://upload.wikimedia.org/wikipedia/commons/1/16/Lsf.gif"/>

Source: <a href=https://upload.wikimedia.org/wikipedia/commons/1/16/Lsf.gif>Wikipedia</a>

Although polynomial models allow us to model non-linear relationships, they belong to the family of linear models. In the term "linear model", the adjective "linear" refers to the parameters of the model and the fact that their effects are added together. This is indeed the case here. Moreover, simple linear regression is just polynomial regression of degree 1.
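To make the "linear in the parameters" idea concrete, here is a minimal sketch on toy data (the quadratic and the variable names are assumptions for illustration, not the notebook's dataset). The curve is nonlinear in `x`, yet the parameters can be recovered with ordinary least squares because the model is a weighted sum of the columns `x^2`, `x` and `1`.

```python
import numpy as np

# Toy data generated from a known quadratic: y = 2x^2 - 3x + 1 (no noise).
x = np.linspace(-2, 2, 50)
y = 2 * x**2 - 3 * x + 1

# The model a*x^2 + b*x + c is nonlinear in x but linear in (a, b, c):
# it is a weighted sum of the columns x^2, x and 1.
A = np.column_stack([x**2, x, np.ones_like(x)])

# Ordinary least squares recovers the three parameters.
coeffs, *_ = np.linalg.lstsq(A, y, rcond=None)
print(coeffs)
```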

## Variables studied

In [None]:
import numpy as np
from sklearn.datasets import make_regression
import matplotlib.pyplot as plt
import pandas as pd

First of all, we will load our dataset. This is a fake dataset for the example.

In [None]:
df = pd.read_csv("./data/poly.csv")
df.head()

In [None]:
df.shape

As you can see we now have 200 rows, 1 feature and 1 target.

**Exercise:** Create the `X` and `y` variables and define which column will be the target and which column will be the feature.
They must be of type `numpy.ndarray`. Since there is only one feature, `X` contains a single column.

### Relationship between variables

**Exercise:** Use matplotlib (or other) to display the dataset as a scatter plot.

**Exercise:** Show correlation coefficients.

As we can see, the correlation coefficient remains high even though the relationship is not perfectly linear.
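As an illustration of why this happens, the sketch below computes the Pearson correlation on toy data (an assumption, not the notebook's dataset) where `y` is exactly `x` squared. The relationship is clearly curved, yet the coefficient stays close to 1 because `y` still increases steadily with `x`.

```python
import numpy as np

# Toy data: a purely quadratic relationship on [0, 1].
x = np.linspace(0, 1, 200)
y = x**2

# np.corrcoef returns the 2x2 correlation matrix; the off-diagonal
# entry is the Pearson correlation between x and y.
r = np.corrcoef(x, y)[0, 1]
print(r)
```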

### Split the dataset


You now know the process!

**Exercise:** Import `train_test_split` from sklearn and split the dataset and create the variables `X_train`, `X_test`, `y_train`, `y_test`.

##  Load and fit the model (with scikit-learn)

This time there is a little change. 
We have a single feature in our dataset, but the polynomial model is a special case of multiple regression, so we need several features to apply it. We will have to create these features ourselves. This practice has a name: feature engineering.


Let's imagine that we want a polynomial regression of degree 2.
We will need to add one feature.
This feature is simply a power of $x$.

$[x, x^2]$

So $x^2$ is the new feature.

If you want a 3-degree polynomial model, you will have to add 2 features in this case.

$[x, x^2, x^3]$
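A minimal sketch of this feature expansion with NumPy follows. The helper name `add_powers` is hypothetical and only serves the illustration.

```python
import numpy as np

def add_powers(x, degree):
    """Return a 2-D array whose columns are x, x^2, ..., x^degree."""
    x = np.asarray(x, dtype=float).reshape(-1, 1)  # one column, one row per sample
    return np.hstack([x**p for p in range(1, degree + 1)])

# Three samples expanded to degree 3: each row is [x, x^2, x^3].
features = add_powers([1, 2, 3], degree=3)
print(features)
```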

To do this, we will need to create a pipeline. 
A pipeline is a processing chain that will execute a set of functions and pass arguments between them.
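To show what such a chain looks like, here is a small sketch on toy data (the data and the names `X_toy`, `y_toy` and `toy_pipeline` are assumptions, not the notebook's dataset). It uses scikit-learn's `make_pipeline`, which is one possible way to build a pipeline: the first step expands the features, the second fits a linear regression on the expanded features.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

# Toy data: a noiseless quadratic, y = 0.5 * x^2 + 1.
X_toy = np.linspace(-3, 3, 40).reshape(-1, 1)  # scikit-learn expects a 2-D X
y_toy = 0.5 * X_toy.ravel() ** 2 + 1.0

# Step 1 creates the polynomial features, step 2 fits the linear model.
toy_pipeline = make_pipeline(PolynomialFeatures(degree=2), LinearRegression())
toy_pipeline.fit(X_toy, y_toy)

# R^2 on the training data; close to 1 here because the data is noiseless.
toy_score = toy_pipeline.score(X_toy, y_toy)
print(toy_score)
```

Calling `fit` or `predict` on the pipeline runs every step in order, so the feature expansion is applied automatically to any new data.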

First of all, we need to define the number of degrees.
 
**Exercise:** Create a `degree` variable with 1 as value. (We will change this value later)

**Exercise:** Create a pipeline with sklearn. This pipeline must contain the `PolynomialFeatures` and `LinearRegression` classes. Don't forget to set the number of degrees for the `PolynomialFeatures`.


**Exercise:** Fit your model.

**Exercise:** Use a scatter plot and display your predictions on `X_test`.

If you see a straight line it is because we have set the number of degrees to one. This confirms that the linear regression is indeed a polynomial model of degree 1.

**Exercise:** Change the number of degrees and train your model again. You must try to fit the curve as well as possible while limiting the number of degrees, to save some resources from your machine.

## From scratch

Again a few changes. This time we'll just have to add new features manually. 

### Transform to matrix

$$
Y = X \cdot \theta
$$

The $Y$ vector is the same as before:

$$Y =
\begin{bmatrix}
y^{(1)}\\
y^{(2)}\\
y^{(3)}\\
... \\
y^{(m)}\\
\end{bmatrix}
$$ 


The theta vector will have as many rows as there are features, plus one (for the constant).
$$ \theta =
\begin{bmatrix}
a\\
b\\
c\\
... \\
\end{bmatrix}
$$

The $X$ initially looks like this: 

$$ X =
\begin{bmatrix}
x^{(1)}\\
x^{(2)}\\
x^{(3)}\\
... \\
x^{(m)}\\
\end{bmatrix}
$$

If we want to add a degree to the polynomial, it adds a feature to our $X$. And this feature will contain $x^2$.

Example of polynomial of degree 2:

$$ X =
\begin{bmatrix}
x^{(1)} & (x^{(1)})^2\\
x^{(2)} & (x^{(2)})^2\\
x^{(3)} & (x^{(3)})^2\\
\dots & \dots\\
x^{(m)} & (x^{(m)})^2\\
\end{bmatrix}
$$

Example of polynomial of degree 3: (In this case the third feature will be of power 3.)

$$ X =
\begin{bmatrix}
x^{(1)} & (x^{(1)})^2 & (x^{(1)})^3\\
x^{(2)} & (x^{(2)})^2 & (x^{(2)})^3\\
x^{(3)} & (x^{(3)})^2 & (x^{(3)})^3\\
\dots & \dots & \dots\\
x^{(m)} & (x^{(m)})^2 & (x^{(m)})^3\\
\end{bmatrix}
$$

And so on and so forth. Of course, don't forget at the end to add a feature with only 1s.

$$ X =
\begin{bmatrix}
x^{(1)} & (x^{(1)})^2 & \dots & (x^{(1)})^k & 1\\
x^{(2)} & (x^{(2)})^2 & \dots & (x^{(2)})^k & 1\\
x^{(3)} & (x^{(3)})^2 & \dots & (x^{(3)})^k & 1\\
\dots & \dots & \dots & \dots & \dots\\
x^{(m)} & (x^{(m)})^2 & \dots & (x^{(m)})^k & 1\\
\end{bmatrix}
$$
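As a quick sanity check of the matrix form, the sketch below builds a small design matrix of degree 2 with its column of 1s and multiplies it by a hand-picked `theta`. The values and the names `X_design` and `y_hat` are assumptions chosen only for the illustration.

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0])

# Columns: x, x^2, and a constant 1 for the intercept.
X_design = np.column_stack([x, x**2, np.ones_like(x)])

# One coefficient per column, in the same order: y = 3x + 2x^2 + 1.
theta = np.array([[3.0], [2.0], [1.0]])

# The matrix product evaluates the polynomial for every sample at once.
y_hat = X_design @ theta  # shape (3, 1)
print(y_hat.ravel())
```

The order of the coefficients in `theta` must match the order of the columns in the matrix, which is why the constant comes last here.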

**Exercise:** Create a matrix `X` for a 3-degree polynomial $[x, x^2, x^3, 1]$

**Exercise:** Initialize the random `theta` vector, with 4 elements (because `X` has four columns).

**Exercise:** Create the `model`. It is always the same:

$$Y = X \cdot \theta $$

**Exercise:** Create a `MSE` function. It is always the same too.

**Exercise:** Create a `grad` function. Again, it is always the same.

**Exercise:** 
Again...
1. Create a `gradient_descent` function that receives as parameter `X`, `y`, `theta`, `learning_rate` and `n_iterations`.
2. In the function, create a variable `cost_history`: an array filled with zeros, of length `n_iterations`. We will use it to plot the learning curve of the model.
3. Create a loop that iterates up to `n_iterations`.
4. In the loop, update `theta` with the gradient descent formula: `theta = theta - learning_rate * grad(X, y, theta)`.
5. In the loop, update `cost_history[i]` with the values of `MSE(X,y,theta)`.
6. Return `theta` and `cost_history`
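The steps above can be sketched as follows. This is one possible version, not a reference solution. It assumes that `MSE` is the mean of the squared errors and that `grad` is its exact gradient; your own definitions may use a different constant factor. The toy data and the names `X_toy`, `y_toy`, `true_theta` and `theta_toy` are also assumptions, and `theta` starts at zero rather than at random values so that the run is reproducible.

```python
import numpy as np

def model(X, theta):
    return X @ theta

def MSE(X, y, theta):
    # Assumed convention: mean of the squared errors.
    m = len(y)
    return ((model(X, theta) - y) ** 2).sum() / m

def grad(X, y, theta):
    # Gradient of the MSE above with respect to theta.
    m = len(y)
    return (2 / m) * X.T @ (model(X, theta) - y)

def gradient_descent(X, y, theta, learning_rate, n_iterations):
    cost_history = np.zeros(n_iterations)
    for i in range(n_iterations):
        theta = theta - learning_rate * grad(X, y, theta)
        cost_history[i] = MSE(X, y, theta)
    return theta, cost_history

# Toy data: a noiseless degree-2 polynomial with columns [x, x^2, 1].
x = np.linspace(-1, 1, 100).reshape(-1, 1)
X_toy = np.hstack([x, x**2, np.ones_like(x)])
true_theta = np.array([[1.0], [2.0], [0.5]])
y_toy = X_toy @ true_theta

theta_toy, history = gradient_descent(
    X_toy, y_toy, np.zeros((3, 1)), learning_rate=0.1, n_iterations=2000
)
print(theta_toy.ravel(), history[-1])
```

On this toy problem the cost decreases at every iteration and `theta_toy` ends up very close to `true_theta`. With a learning rate that is too large, the cost would grow instead of shrinking.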

### Train your model 

**Exercise:** Create variables `n_iterations` and `learning_rate`.

**Exercise:** Create variables `theta_final`, `cost_history` and call `gradient_descent()`.

**Exercise:** Create a `predictions` variable that contains `model(X, theta_final)`.


**Exercise:** Display your `predictions` and the true values of the dataset.

It should look like this.

<img src="./assets/poly3.JPG" />

**Exercise:** Plot `cost_history`.

In [None]:
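def coef_determination(y, pred):
    """Coefficient of determination (R²): 1 means a perfect fit."""
    u = ((y - pred)**2).sum()      # residual sum of squares
    v = ((y - y.mean())**2).sum()  # total sum of squares
    return 1 - u/v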

In [None]:
coef_determination(y, predictions)
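As a cross-check, the same quantity is available in scikit-learn as `r2_score`. The sketch below compares both on a few hand-picked values (the arrays `y_true` and `y_pred` are assumptions for the illustration).

```python
import numpy as np
from sklearn.metrics import r2_score

def coef_determination(y, pred):
    u = ((y - pred)**2).sum()      # residual sum of squares
    v = ((y - y.mean())**2).sum()  # total sum of squares
    return 1 - u/v

y_true = np.array([3.0, 5.0, 7.0, 9.0])
y_pred = np.array([2.5, 5.5, 7.0, 9.5])

r2_manual = coef_determination(y_true, y_pred)
r2_sklearn = r2_score(y_true, y_pred)  # argument order: true values first
print(r2_manual, r2_sklearn)
```

Both values should agree. A score close to 1 means that the predictions explain most of the variance of the target.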

**Exercise:** Try to improve your model by adding a degree to your polynomial model.

Good, you must feel like this now: 

![](https://media.giphy.com/media/DHqth0hVQoIzS/giphy.gif)

## Where to go next?

Linear models might look simple but they can get very complicated. You might look into **Ridge Regression** or **Lasso Regression** if you want to further deepen your knowledge.

- [Statquest - Regularization explained (Lasso & Ridge)](https://youtu.be/Q81RR3yKn30)