## Gradient Descent

First we need to create the design matrix.

For a polynomial of degree $d$ in $n$ variables, the design matrix has one row per observation and one column per monomial of total degree at most $d$, that is, per product $x_1^{a_1}\cdots x_n^{a_n}$ with $a_1+\dots+a_n \le d$.

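The data used here are five points $(x_1, x_2)$, stored as the rows of `D`, with one target value per point in `y`:

`D = np.array([[-1,0],[1,3],[2,1],[2,2],[0,4]])`

`y = np.array([-1,2,0.5,1,4])`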

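For example, the model $f(x_1, x_2) = \beta_3x_1^2+\beta_2x_1x_2+\beta_1x_2^2+\beta_0$ keeps only the quadratic monomials and the intercept (it has no linear terms in $x_1$ or $x_2$). Each data point $(x_1, x_2)$ then contributes the row $(x_1^2,\ x_1x_2,\ x_2^2,\ 1)$, so the design matrix (called `X` in the code) is

$X = 
\begin{bmatrix}
1 & 0 & 0 & 1 \\
1 & 3 & 9 & 1 \\
4 & 2 & 1 & 1 \\
4 & 4 & 4 & 1 \\
0 & 0 & 16 & 1
\end{bmatrix}
$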
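The general rule, one column per monomial of total degree at most $d$, can be sketched as a small helper. This is only an illustration: the function name `polynomial_design_matrix` and its column ordering (by increasing degree) are assumptions, not part of the original notebook. For $d = 2$ and $n = 2$ it produces six columns, of which the example model above uses four.

```python
from itertools import combinations_with_replacement

import numpy as np


def polynomial_design_matrix(points, degree):
    """Return one column per monomial of total degree <= `degree`.

    Hypothetical helper for illustration; columns are ordered by degree.
    """
    points = np.asarray(points, dtype=float)
    n_vars = points.shape[1]
    columns = []
    for k in range(degree + 1):
        # Each multiset of k variable indices is one monomial of degree k,
        # e.g. (0, 1) -> x1 * x2 and (1, 1) -> x2^2. The empty tuple (k = 0)
        # gives the constant column of ones.
        for idx in combinations_with_replacement(range(n_vars), k):
            columns.append(np.prod(points[:, list(idx)], axis=1))
    return np.column_stack(columns)


points = np.array([[-1, 0], [1, 3], [2, 1], [2, 2], [0, 4]])
full = polynomial_design_matrix(points, degree=2)
print(full.shape)  # columns: 1, x1, x2, x1^2, x1*x2, x2^2
```

Selecting the columns for $x_1^2$, $x_1x_2$, $x_2^2$ and the constant from `full` recovers the four-column matrix of the example, up to column order.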


In [1]:
import numpy as np

In [2]:
# Given data
D = np.array([[-1, 0], [1, 3], [2, 1], [2, 2], [0, 4]])
y = np.array([-1, 2, 0.5, 1, 4])

# Build design matrix
X = np.column_stack(
    [
        D[:, 0] ** 2,  # x1^2
        D[:, 0] * D[:, 1],  # x1*x2
        D[:, 1] ** 2,  # x2^2
        np.ones(D.shape[0]),  # intercept
    ]
)

In [None]:
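# Solve the normal equations: beta = (X^T X)^{-1} X^T y
beta = np.linalg.inv(X.T @ X) @ X.T @ y

# Compute the residual sum of squares (RSS)
rss = np.sum((y - X @ beta) ** 2)

print("beta:", beta)  # order matches the columns of X: [beta_3, beta_2, beta_1, beta_0]
print("RSS:", rss)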

beta: [ 0.34865114 -0.06663592  0.33165783 -1.26730459]
RSS: 0.09151487203135802
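
The cell above obtains `beta` in closed form from the normal equations. Since this section is headed "Gradient Descent", the following is a minimal sketch of how the same coefficients could be approached iteratively by minimizing the RSS. The fixed step size (one over the Lipschitz constant of the gradient), the zero starting point, the iteration count, and the names `beta_gd` and `rss_gd` are all illustrative assumptions, not taken from the original notebook.

```python
import numpy as np

# Same data and design matrix as in the cells above.
D = np.array([[-1, 0], [1, 3], [2, 1], [2, 2], [0, 4]])
y = np.array([-1, 2, 0.5, 1, 4])
X = np.column_stack(
    [D[:, 0] ** 2, D[:, 0] * D[:, 1], D[:, 1] ** 2, np.ones(D.shape[0])]
)

# RSS(beta) = ||y - X beta||^2 has gradient 2 * (X^T X beta - X^T y).
A = X.T @ X
b = X.T @ y

# Assumed step size: 1 / L with L = 2 * lambda_max(X^T X), the Lipschitz
# constant of the gradient. A fixed step this small cannot diverge.
step = 1.0 / (2.0 * np.linalg.eigvalsh(A).max())

beta_gd = np.zeros(X.shape[1])  # assumed starting point
for _ in range(100_000):  # assumed iteration budget
    grad = 2.0 * (A @ beta_gd - b)
    beta_gd -= step * grad

rss_gd = np.sum((y - X @ beta_gd) ** 2)
print("beta (gradient descent):", beta_gd)
print("RSS  (gradient descent):", rss_gd)
```

With these settings the iterates should approach the closed-form `beta` printed above. The columns of `X` differ a lot in scale (the $x_2^2$ column reaches 16), so convergence is slow and many iterations are needed; rescaling the columns would be one way to speed it up.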
