## Linear Regression Using Normal Equation (easy)

Write a Python function that performs linear regression using the normal equation. The function should take a matrix X (features) and a vector y (target) as input and return the coefficients of the linear regression model. Round the result to four decimal places; -0.0 is a valid result when a very small negative number is rounded.

In [6]:
X = [[1, 1], [1, 2], [1, 3]]
y = [1, 2, 3]

# output: [0.0, 1.0]
# reasoning: The linear model is y = 0.0 + 1.0*x, perfectly fitting the input data.

The normal equation is:

<div align="left">
$$
\theta = (X^T X)^{-1} X^T y
$$
</div>
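
For the example input above, the formula can be evaluated by hand as a check. The intermediate matrices below are computed from that X and y:

$$
X^T X = \begin{bmatrix} 3 & 6 \\ 6 & 14 \end{bmatrix}, \qquad
X^T y = \begin{bmatrix} 6 \\ 14 \end{bmatrix}, \qquad
(X^T X)^{-1} = \frac{1}{6} \begin{bmatrix} 14 & -6 \\ -6 & 3 \end{bmatrix}
$$

$$
\theta = \frac{1}{6} \begin{bmatrix} 14 \cdot 6 - 6 \cdot 14 \\ -6 \cdot 6 + 3 \cdot 14 \end{bmatrix}
= \begin{bmatrix} 0 \\ 1 \end{bmatrix}
$$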

In [9]:
import numpy as np
def linear_regression_normal_equation(X: list[list[float]], y: list[float]) -> list[float]:
    # convert to numpy arrays
    X = np.array(X) 
    y = np.array(y).reshape(-1,1) # ensure y is a column vector
    
    # compute theta using the normal equation
    theta = np.linalg.inv(X.T @ X) @ (X.T @ y)

    # round to four decimal places, then flatten to a 1D array
    theta = np.round(theta, 4).flatten()

    # convert it to a list
    return theta.tolist()

In [10]:
linear_regression_normal_equation(X, y)

[-0.0, 1.0]
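
The `-0.0` most likely comes from floating-point error in the explicit inverse: the intercept ends up as a tiny negative number, which rounds to negative zero. The following is a minimal sketch of a variant, not the solution above, that solves the system $(X^T X)\theta = X^T y$ with `np.linalg.solve` instead of forming the inverse, and adds `0.0` after rounding so that negative zero becomes `0.0`. The function name `linear_regression_solve` is made up for this illustration.

```python
import numpy as np

def linear_regression_solve(X, y):
    # hypothetical helper: same model, but without an explicit matrix inverse
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)

    # solve (X^T X) theta = X^T y for theta
    theta = np.linalg.solve(X.T @ X, X.T @ y)

    # round, then add 0.0 so that -0.0 is reported as 0.0
    return (np.round(theta, 4) + 0.0).tolist()

theta = linear_regression_solve([[1, 1], [1, 2], [1, 3]], [1, 2, 3])
print(theta)  # [0.0, 1.0]
```

Both variants assume that `X.T @ X` is invertible. If the columns of X are linearly dependent, neither `np.linalg.inv` nor `np.linalg.solve` gives a meaningful answer.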

In [67]:
from sklearn.linear_model import LinearRegression

def linear_regression_normal_equation_sk(X: list[list[float]], y: list[float]) -> list[float]:
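    # X already contains a column of ones, so disable the separate intercept;
    # otherwise the constant term is stored in model.intercept_, not model.coef_
    model = LinearRegression(fit_intercept=False)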
    model.fit(X, y)
    
    theta = model.coef_
    
    return np.round(theta, 4).tolist()

In [68]:
linear_regression_normal_equation_sk(X, y)

[0.0, 1.0]

### Why .reshape()

In [14]:
v_1d = np.array([7, 8, 9])
print(v_1d.shape)
print(v_1d.T.shape)

(3,)
(3,)


In [13]:
v_2d = np.array([7, 8, 9]).reshape(-1,1)
print(v_2d.shape)
print(v_2d.T.shape)

(3, 1)
(1, 3)
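
The two cells above show that transposing a 1D array is a no-op, so a 1D `y` has no row or column orientation, whereas the reshaped array is an explicit column vector. Matrix products work with either form, but the result shapes differ, and elementwise operations between the two forms broadcast silently. A short sketch of both effects (the variable names are illustrative):

```python
import numpy as np

X = np.array([[1, 1], [1, 2], [1, 3]])
y_1d = np.array([1, 2, 3])
y_col = y_1d.reshape(-1, 1)  # explicit column vector, shape (3, 1)

# the matrix product works in both cases, but the result shape differs
print((X.T @ y_1d).shape)   # (2,)
print((X.T @ y_col).shape)  # (2, 1)

# mixing the two forms broadcasts to a full matrix instead of raising an error
diff = y_col - y_1d
print(diff.shape)  # (3, 3)
```

Making `y` a column vector up front keeps every intermediate result 2D, which is why the solution flattens `theta` only at the very end.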
