# Least squares example 2 - A polynomial

In this example we will fit a polynomial. We are going to generate some values for the relation

\begin{equation}
y = 3 + 4x + 5x^2
\end{equation}

and check if we can recover the parameters 3, 4, and 5 by doing a least squares fit.

## Generate data for $y=3 + 4x + 5x^2$

In [None]:
# We import some libraries for generating values and plotting:
import numpy as np
import seaborn as sns
from matplotlib import pyplot as plt

sns.set_theme(style="ticks", context="notebook", palette="muted")
%matplotlib inline

In [None]:
# Generate some values we will use for solving the least squares problem:
x = np.arange(-11, 11, 0.5)
y = 3 + 4 * x + 5 * x**2
# Also plot them:
fig, ax = plt.subplots(constrained_layout=True)
ax.set(xlabel="x", ylabel="y = 3 + 4*x + 5*x²")
ax.scatter(x, y)
sns.despine(fig=fig)

## Matrix solution

In [None]:
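# Design matrix with one column per term: 1, x, and x²
ones = np.ones_like(x)
X = np.column_stack((ones, x, x**2))
# Solve the normal equations: b = (XᵀX)⁻¹ Xᵀ y
b = np.linalg.inv(X.T @ X) @ X.T @ y
print(b)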
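
For reference, the cell above evaluates the normal-equation solution of the least squares problem. The symbols $\mathbf{X}$, $\mathbf{y}$, and $\mathbf{b}$ below are simply chosen to mirror the variable names `X`, `y`, and `b` in the code, and $n$ is the number of data points:

\begin{equation}
\mathbf{X} =
\begin{bmatrix}
1 & x_1 & x_1^2 \\
\vdots & \vdots & \vdots \\
1 & x_n & x_n^2
\end{bmatrix},
\qquad
\mathbf{b} = (\mathbf{X}^\top \mathbf{X})^{-1} \mathbf{X}^\top \mathbf{y}
\end{equation}

This expression is only usable when $\mathbf{X}^\top \mathbf{X}$ is invertible, which is the case here because the three columns are linearly independent.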

In [None]:
# Same fit, now as the matrix product of the pseudoinverse of X and y:
b = np.linalg.pinv(X) @ y
print(b)
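
As an independent cross-check, NumPy's polynomial module can fit the same quadratic directly. This is a minimal sketch, not part of the original example; the name `b_check` is only introduced here for illustration. `numpy.polynomial.polynomial.polyfit` returns the coefficients ordered from the constant term upwards, so they can be compared with 3, 4, and 5 as they are.

```python
import numpy as np
from numpy.polynomial import polynomial as P

# Same data as in the example above
x = np.arange(-11, 11, 0.5)
y = 3 + 4 * x + 5 * x**2

# Fit a degree-2 polynomial; coefficients come back lowest degree first
b_check = P.polyfit(x, y, 2)
print(b_check)
```

The printed values should agree with the matrix solution up to rounding.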

## Solution with scikit-learn

In [None]:
from sklearn.linear_model import LinearRegression

X2 = np.column_stack((x, x**2))
model = LinearRegression()
model.fit(X2, y)
print(model.intercept_)
print(model.coef_)

## Solution with statsmodels

In [None]:
import statsmodels.api as sm

X3 = sm.add_constant(np.column_stack((x, x**2)))
model_s = sm.OLS(y, X3)
result = model_s.fit()
print(result.summary())

# Least squares example 3 - Dependence between variables
Here, we will just check what happens when we have linear dependence between the variables.
We will generate some values for the relation

\begin{equation}
y = 3 + 2 x_1 + x_2
\end{equation}

and at the same time we define

\begin{equation}
x_2 = 2 x_1
\end{equation}

Of course, this means that the equation we are fitting really is

\begin{equation}
y = 3 + 2 x_1 + x_2 = 3 + 2 x_1 + 2 x_1 = 3 + 4 x_1
\end{equation}

We shall see how well least squares deals with this.

In [None]:
# Generate some values we will use for solving the least squares problem:
x1 = np.arange(-11, 11, 0.5)
x2 = 2 * x1
y = 3 + 2 * x1 + x2

## Solution with matrices


In [None]:
ones = np.ones_like(x1)
X = np.column_stack((ones, x1, x2))
b = np.linalg.inv(X.T @ X) @ X.T @ y
print(b)

The code above should fail, or at best return numerically meaningless values, since $\mathbf{X}^\top \mathbf{X}$ cannot be inverted here. (Why?)
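
Whether `np.linalg.inv` raises an error or silently returns very large numbers can depend on floating-point rounding, so a defensive version of the cell is sketched below. It is an illustration only: the names `XtX`, `inverted`, and `cond` are introduced here, and the condition number is used as a rough indicator of how close the matrix is to being singular.

```python
import numpy as np

# Same data as in the example above
x1 = np.arange(-11, 11, 0.5)
x2 = 2 * x1
X = np.column_stack((np.ones_like(x1), x1, x2))
XtX = X.T @ X

# Try the inversion, but do not assume which way it goes
try:
    np.linalg.inv(XtX)
    inverted = True
except np.linalg.LinAlgError as exc:
    inverted = False
    print("Inversion failed:", exc)

# A huge (or infinite) condition number signals a (near-)singular matrix
cond = np.linalg.cond(XtX)
print("Inverted without an error:", inverted)
print("Condition number:", cond)
```

Even if no exception is raised, a condition number this large means the result of the inversion should not be trusted.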

We can inspect $\mathbf{X}^\top \mathbf{X}$ and print out the rank, which is the number of linearly independent columns.

In [None]:
print(X.T @ X)
print(np.linalg.matrix_rank(X.T @ X))
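
The rank deficiency can also be seen on $\mathbf{X}$ itself. The sketch below uses a hand-picked vector, called `null_vector` here purely for illustration, that encodes the relation $2 x_1 - x_2 = 0$; multiplying $\mathbf{X}$ by it gives zero for every row, which is what it means for the columns to be linearly dependent.

```python
import numpy as np

# Same data as in the example above
x1 = np.arange(-11, 11, 0.5)
x2 = 2 * x1
X = np.column_stack((np.ones_like(x1), x1, x2))

# 0 * (column of ones) + 2 * x1 - 1 * x2 = 0 for every data point
null_vector = np.array([0.0, 2.0, -1.0])
product = X @ null_vector
print("Largest entry of X @ null_vector:", np.max(np.abs(product)))

# X has three columns but only two of them are linearly independent
rank = np.linalg.matrix_rank(X)
print("Rank of X:", rank)
```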

Although we can't do the inversion above, a solution still exists! We can find it by using the pseudoinverse:

In [None]:
# Matrix product of the pseudoinverse of X and y:
b = np.linalg.pinv(X) @ y
print(b)

Here, the coefficients seem to differ from those in the original equation. We shall comment on this after testing out scikit-learn and statsmodels.

## Solution with scikit-learn

In [None]:
from sklearn.linear_model import LinearRegression

X2 = np.column_stack((x1, x2))
model = LinearRegression()
model.fit(X2, y)
print(model.intercept_)
print(model.coef_)

## Solution with statsmodels

In [None]:
import statsmodels.api as sm

X3 = sm.add_constant(np.column_stack((x1, x2)))
model_s = sm.OLS(y, X3)
result = model_s.fit()
print(result.summary())

## Comment about the solution we found
We find the following least squares solution:

\begin{equation}
y = 3 + 0.8 x_1 + 1.6 x_2
\end{equation}

If we use what we know, that $x_2 = 2 x_1$, we get

\begin{equation}
y = 3 + 0.8 x_1 + 1.6 x_2 = 3 + 0.8 x_1 + 1.6 \cdot 2x_1 = 3 + 0.8 x_1 + 3.2 x_1 = 3 + 4x_1
\end{equation}

and this is equal to the original equation. So we do find the correct solution, but we do not find the
original parameters. In fact, if we inspect what we are fitting in more detail

\begin{equation}
y = a + b_1 x_1 + b_2 x_2 = a + b_1 x_1 + 2 b_2 x_1 = a + x_1 (b_1 + 2 b_2)
\end{equation}

we see that many parameter combinations are possible. They only have to satisfy

\begin{equation}
b_1 + 2 b_2 = 4
\end{equation}

and the least squares approach above finds one of these. Let us see if we can find some other solutions by numerically minimizing the squared error, starting from different initial guesses:

In [None]:
from scipy.optimize import minimize


def error(b, X, y):
    """Return the sum of squared residuals for the parameters b."""
    return np.sum((y - X @ b) ** 2)


result = minimize(error, [3, 4, 0], args=(X, y))
b = result.x
print(b)
print("b[1] + 2*b[2]:", b[1] + 2 * b[2])


result = minimize(error, [3, -2.4, 3.2], args=(X, y))
b = result.x
print(b)
print("b[1] + 2*b[2]:", b[1] + 2 * b[2])


result = minimize(error, [3, 5.2, -0.6], args=(X, y))
b = result.x
print(b)
print("b[1] + 2*b[2]:", b[1] + 2 * b[2])


result = minimize(error, [3, 2000, -998], args=(X, y))
b = result.x
print(b)
print("b[1] + 2*b[2]:", b[1] + 2 * b[2])
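
The family of solutions with $b_1 + 2 b_2 = 4$ can also be checked without an optimizer. The sketch below evaluates the squared error for a few hand-picked parameter sets (the array `candidates` is made up for this illustration) and prints their lengths, which hints at why the pseudoinverse picks 0.8 and 1.6: among all exact solutions it returns the one with the smallest norm.

```python
import numpy as np

# Same data as in the example above
x1 = np.arange(-11, 11, 0.5)
x2 = 2 * x1
y = 3 + 2 * x1 + x2
X = np.column_stack((np.ones_like(x1), x1, x2))

# Hand-picked (a, b1, b2), all chosen so that b1 + 2 * b2 = 4
candidates = np.array([
    [3.0, 2.0, 1.0],    # parameters used to generate y
    [3.0, 4.0, 0.0],
    [3.0, 0.8, 1.6],    # the pseudoinverse solution
    [3.0, -2.4, 3.2],
])

# Squared error and Euclidean length of each candidate
errors = np.array([np.sum((y - X @ b) ** 2) for b in candidates])
norms = np.linalg.norm(candidates, axis=1)
for b, err, norm in zip(candidates, errors, norms):
    print(b, "squared error:", err, "norm:", norm)
```

All candidates fit the data equally well; they differ only in how large the coefficients are.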

## Alternative to least squares
Let us finally try a variant of least squares. This one ([Lasso](https://en.wikipedia.org/wiki/Lasso_(statistics))) adds a penalty on the absolute size of the coefficients to the squared error we are minimizing,
which can drive some of the coefficients to exactly zero.
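
For reference, the objective that scikit-learn documents for its `Lasso` estimator can be written as below. The symbols are chosen here to match the rest of this example: $a$ is the intercept, $\mathbf{b}$ holds the coefficients, $\mathbf{x}_i$ is the $i$-th row of variables, $n$ is the number of data points, and $\alpha$ is the penalty strength (the `alpha` argument, which defaults to 1.0).

\begin{equation}
\min_{a,\, \mathbf{b}} \; \frac{1}{2n} \sum_{i=1}^{n} \left( y_i - a - \mathbf{x}_i^\top \mathbf{b} \right)^2 + \alpha \sum_{j} |b_j|
\end{equation}

With $\alpha = 0$ this reduces to ordinary least squares; a larger $\alpha$ shrinks the coefficients more strongly.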

In [None]:
from sklearn.linear_model import Lasso

model_lasso = Lasso()
model_lasso.fit(X2, y)
print(model_lasso.intercept_)
print(model_lasso.coef_)

Note that one of the coefficients is zero here. This means that the Lasso regression above has effectively dropped
one of the variables as unnecessary, and it is just using the other one.
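
One way to see why a coefficient vanishes is to compare norms along the line of exact solutions $b_1 + 2 b_2 = 4$. The sketch below is a simplification under stated assumptions: it ignores the intercept and the overall shrinkage that the penalty also causes, and the grid over `b2` is an arbitrary choice made for illustration. It only asks which exact solution has the smallest sum of absolute values and which has the smallest sum of squares.

```python
import numpy as np

# Pick b2 on a grid; b1 then follows from b1 + 2 * b2 = 4
b2 = np.linspace(0, 3, 301)
b1 = 4 - 2 * b2

# Sum of absolute values (the quantity a Lasso-style penalty measures)
l1 = np.abs(b1) + np.abs(b2)
# Sum of squares (kept small by the minimum-norm pseudoinverse solution)
l2 = b1**2 + b2**2

i_l1 = np.argmin(l1)
i_l2 = np.argmin(l2)
print("Smallest sum of absolute values at b1, b2 =", b1[i_l1], b2[i_l1])
print("Smallest sum of squares at b1, b2 =", b1[i_l2], b2[i_l2])
```

The absolute-value criterion is smallest at a corner where one coefficient is zero, while the sum of squares is smallest at an interior point with both coefficients nonzero.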