# Polynomial Regression

Polynomial regression is a form of regression analysis in which the relationship between the independent variable x and the dependent variable y is modelled as an nth degree polynomial in x.

Polynomial regression fits a nonlinear relationship between the value of x and the corresponding conditional mean of y, denoted E(y | x).

Although polynomial regression fits a nonlinear model to the data, as a statistical estimation problem it is linear, in the sense that the regression function E(y | x) is linear in the unknown parameters that are estimated from the data. For this reason, polynomial regression is considered to be a special case of multiple linear regression.

In general, we can model the expected value of y as an nth degree polynomial, yielding the general polynomial regression model

$$
y=\beta_{0}+\beta_{1} x+\beta_{2} x^{2}+\beta_{3} x^{3}+\cdots+\beta_{n} x^{n}+\varepsilon
$$

Conveniently, these models are all linear from the point of view of estimation, since the regression function is linear in the unknown parameters β₀, β₁, .... Therefore, for least-squares analysis, the computational and inferential problems of polynomial regression can be completely addressed using the techniques of multiple regression. This is done by treating x, x², ... as distinct independent variables in a multiple regression model.
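The idea of treating each power of x as its own independent variable can be sketched in a few lines. This is a minimal, dependency-free illustration; the helper names `polynomial_features` and `predict` are hypothetical and chosen only for this example.

```python
def polynomial_features(xs, degree):
    """Return one row [1, x, x**2, ..., x**degree] for each input value."""
    return [[x ** k for k in range(degree + 1)] for x in xs]


def predict(row, beta):
    """Evaluate the model for one feature row: a plain linear combination."""
    return sum(b * f for b, f in zip(beta, row))


xs = [0.0, 1.0, 2.0, 3.0]
rows = polynomial_features(xs, degree=2)

# Assumed coefficients for illustration: y = 1 + 2x + 3x^2
beta = [1.0, 2.0, 3.0]
for x, row in zip(xs, rows):
    print(x, row, predict(row, beta))
```

Once the rows are built, nothing in `predict` knows that the columns are powers of the same x. The prediction is linear in `beta`, which is why the multiple regression machinery applies unchanged.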

# Matrix calculation

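The polynomial regression model

$$
y_{i}=\beta_{0}+\beta_{1} x_{i}+\beta_{2} x_{i}^{2}+\cdots+\beta_{m} x_{i}^{m}+\varepsilon_{i} \quad (i=1,2, \ldots, n)
$$

can be expressed in matrix form in terms of a design matrix $\mathbf{X}$, a response vector $\vec{y}$, a parameter vector $\vec{\beta}$, and a vector $\vec{\varepsilon}$ of random errors. The i-th row of $\mathbf{X}$ and of $\vec{y}$ contains the x and y values of the i-th data sample, so the model can be written as a system of linear equations: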

$$
\left[ \begin{array}{c}{y_{1}} \\ {y_{2}} \\ {y_{3}} \\ {\vdots} \\ {y_{n}}\end{array}\right]=\left[ \begin{array}{ccccc}{1} & {x_{1}} & {x_{1}^{2}} & {\ldots} & {x_{1}^{m}} \\ {1} & {x_{2}} & {x_{2}^{2}} & {\ldots} & {x_{2}^{m}} \\ {1} & {x_{3}} & {x_{3}^{2}} & {\ldots} & {x_{3}^{m}} \\ {\vdots} & {\vdots} & {\vdots} & {\ddots} & {\vdots} \\ {1} & {x_{n}} & {x_{n}^{2}} & {\ldots} & {x_{n}^{m}}\end{array}\right] \left[ \begin{array}{c}{\beta_{0}} \\ {\beta_{1}} \\ {\beta_{2}} \\ {\vdots} \\ {\beta_{m}}\end{array}\right]+\left[ \begin{array}{c}{\varepsilon_{1}} \\ {\varepsilon_{2}} \\ {\varepsilon_{3}} \\ {\vdots} \\ {\varepsilon_{n}}\end{array}\right]
$$

which, in pure matrix notation, is written as

$$
\vec{y}=\mathbf{X} \vec{\beta}+\vec{\varepsilon}
$$

The vector of estimated polynomial regression coefficients (using ordinary least squares estimation) is

$$
\widehat{\vec{\beta}}=\left(\mathbf{X}^{\top} \mathbf{X}\right)^{-1} \mathbf{X}^{\top} \vec{y}
$$

assuming m < n, which is required for the matrix $\mathbf{X}^{\top} \mathbf{X}$ to be invertible. Since $\mathbf{X}$ is a Vandermonde matrix, the invertibility condition is guaranteed to hold if all the $x_{i}$ values are distinct. This is the unique least-squares solution.
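The estimator above can be illustrated with a small standard-library sketch that builds the design matrix, forms the normal equations, and solves them by Gaussian elimination. The function names (`design_matrix`, `solve_linear_system`, `fit_polynomial`) and the sample data are assumptions made for this example only. Solving the normal equations directly is a swapped-in simplification: numerical libraries usually prefer QR- or SVD-based least-squares routines such as `numpy.linalg.lstsq`, because the normal equations become ill-conditioned as the degree grows.

```python
def design_matrix(xs, m):
    """Build the n x (m + 1) matrix X with rows [1, x_i, ..., x_i**m]."""
    return [[x ** j for j in range(m + 1)] for x in xs]


def solve_linear_system(A, b):
    """Solve A z = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    # Work on an augmented copy so the inputs are left untouched.
    M = [row[:] + [b_i] for row, b_i in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular or nearly singular")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    # Back substitution on the upper-triangular system.
    z = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][j] * z[j] for j in range(i + 1, n))
        z[i] = (M[i][n] - s) / M[i][i]
    return z


def fit_polynomial(xs, ys, m):
    """Return beta_hat solving (X^T X) beta = X^T y for a degree-m polynomial."""
    X = design_matrix(xs, m)
    p = m + 1
    XtX = [[sum(row[a] * row[b] for row in X) for b in range(p)] for a in range(p)]
    Xty = [sum(row[a] * y for row, y in zip(X, ys)) for a in range(p)]
    return solve_linear_system(XtX, Xty)


# Noise-free sample data generated from y = 1 + 2x + 3x^2 (illustrative only).
xs = [-2.0, -1.0, 0.0, 1.0, 2.0]
ys = [1 + 2 * x + 3 * x ** 2 for x in xs]
beta_hat = fit_polynomial(xs, ys, m=2)
print(beta_hat)
```

Because the data are noise-free and the five x values are distinct, the recovered coefficients should agree with 1, 2, and 3 up to floating-point error. If too few of the x values are distinct for the chosen degree, `solve_linear_system` raises `ValueError`, which mirrors the invertibility condition stated above.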

# The Bias vs Variance trade-off
**Bias** refers to the error introduced by overly simplistic assumptions in the model. A high-bias model is unable to capture the patterns in the data, which results in **under-fitting**.

**Variance** refers to the error introduced when a model is complex enough to fit the noise in the training data. A high-variance model passes through most of the training points but changes substantially when the training data change, which results in **over-fitting**.
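
For squared-error loss, the trade-off can be stated as a decomposition of the expected prediction error. The following is a sketch under assumptions that do not appear in the text above: the data are generated as $y=f(x)+\varepsilon$ with $\mathbb{E}[\varepsilon]=0$ and $\operatorname{Var}(\varepsilon)=\sigma^{2}$, and $\hat{f}$ is the model fitted on a randomly drawn training set, with the expectation taken over training sets and noise.

$$
\mathbb{E}\left[\left(y-\hat{f}(x)\right)^{2}\right]=\underbrace{\left(\mathbb{E}[\hat{f}(x)]-f(x)\right)^{2}}_{\text{bias}^{2}}+\underbrace{\mathbb{E}\left[\left(\hat{f}(x)-\mathbb{E}[\hat{f}(x)]\right)^{2}\right]}_{\text{variance}}+\underbrace{\sigma^{2}}_{\text{irreducible noise}}
$$

In polynomial regression, a degree that is too low tends to make the first term large, while a degree that is too high tends to make the second term large. The noise term does not depend on the model.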

<img src="img.png">