# Multiple Regression

## The Model

You can extend a `simple linear regression` model with additional features (independent variables):

<img src="images/multiple_linear_regression1.png" alt="" style="width: 600px;"/>

In multiple regression the vector of parameters is usually called β.
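
Written out with `k` features per observation, the model can be sketched as follows (the symbol $\varepsilon_i$ for the error term is notation assumed here):

$$
y_i = \alpha + \beta_1 x_{i1} + \dots + \beta_k x_{ik} + \varepsilon_i
$$

Collecting the parameters into $\beta = [\alpha, \beta_1, \dots, \beta_k]$ and prepending a constant 1 to each input, $x_i = [1, x_{i1}, \dots, x_{ik}]$, this becomes a single dot product:

$$
y_i = x_i \cdot \beta + \varepsilon_i
$$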

Assumptions:
- The columns of `x` are `linearly independent`: there is no way to write any one of them as a weighted sum of some of the others. If this assumption fails, we cannot estimate `beta`.
- The columns of `x` are all uncorrelated with the `errors e`. If this assumption fails, our estimates of `beta` will be systematically wrong.
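
To see why linear independence matters, here is a minimal sketch with made-up numbers (`rows`, `beta_a`, and `beta_b` are illustrative assumptions, not data from this section, and `dot` is a local stand-in for the helper imported later). The third column is exactly twice the second, so two different parameter vectors produce identical predictions and the data cannot tell them apart.

```python
from typing import List

Vector = List[float]

def dot(v: Vector, w: Vector) -> float:
    """Sum of componentwise products (local stand-in for scratch.linear_algebra.dot)."""
    return sum(v_i * w_i for v_i, w_i in zip(v, w))

# Hypothetical rows [1, x_i1, x_i2] where x_i2 = 2 * x_i1,
# so the columns are NOT linearly independent.
rows = [[1, 1, 2],
        [1, 2, 4],
        [1, 3, 6]]

beta_a = [1, 2, 0]  # puts all the weight on the second column
beta_b = [1, 0, 1]  # puts all the weight on the third column

predictions_a = [dot(row, beta_a) for row in rows]
predictions_b = [dot(row, beta_b) for row in rows]

# Both parameter vectors give the same prediction for every row.
print(predictions_a == predictions_b)
```

Because both vectors fit any `y` equally well, minimizing the squared error has no unique answer in this situation.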

In [2]:
from scratch.linear_algebra import dot, Vector

# beta = [alpha, beta_1, ..., beta_k]
# x_i = [1, x_i1, ..., x_ik]

def predict(x: Vector, beta: Vector) -> float:
    '''assumes that the first element of x is 1'''
    return dot(x, beta)

# For example, x_i (independent variables vector):
[
    1,  # constant term
    34, # feature 1
    2,  # feature 2
    0   # feature 3...
]

[1, 34, 2, 0]
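
As a worked example of `predict`, the sketch below plugs the vector above into a made-up parameter vector (the values in `example_beta` are purely illustrative assumptions). `dot` is re-implemented locally so the snippet runs without the `scratch` package.

```python
from typing import List

Vector = List[float]

def dot(v: Vector, w: Vector) -> float:
    """Sum of componentwise products (local stand-in for scratch.linear_algebra.dot)."""
    return sum(v_i * w_i for v_i, w_i in zip(v, w))

def predict(x: Vector, beta: Vector) -> float:
    """Assumes that the first element of x is 1."""
    return dot(x, beta)

x_i = [1, 34, 2, 0]              # constant term followed by three features
example_beta = [10, 0.5, -2, 3]  # hypothetical [alpha, beta_1, beta_2, beta_3]

# 10*1 + 0.5*34 + (-2)*2 + 3*0
prediction = predict(x_i, example_beta)
print(prediction)
```

The leading 1 in `x_i` is what lets the intercept `alpha` be treated as just another entry of `beta`.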

As we did in the simple linear model, we’ll choose `beta` to minimize the `sum of squared errors`. Finding an exact solution is not simple to do by hand, which means we’ll need to use gradient descent. The error function is almost identical to the one used for `simple linear regression`, but instead of expecting the parameters `alpha` and `beta`, it takes a vector of arbitrary length:

In [3]:
from typing import List

def error(x: Vector, y: float, beta: Vector) -> float:
    return predict(x, beta) - y

In [4]:
def squared_error(x: Vector, y: float, beta: Vector) -> float:
    return error(x, y, beta) ** 2

In [5]:
x = [1, 2, 3]
y = 30
beta = [4, 4, 4] # so prediction = 4 + 8 + 12 = 24

In [6]:
error(x, y, beta)

-6

In [7]:
squared_error(x, y, beta)

36

In [8]:
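# The gradient of the squared error with respect to beta.
# Since squared_error = (dot(x, beta) - y) ** 2, the partial derivative
# with respect to beta_j is 2 * error * x_j.
def sqerror_gradient(x: Vector, y: float, beta: Vector) -> Vector:
    err = error(x, y, beta)
    return [2 * err * x_i for x_i in x]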

In [9]:
sqerror_gradient(x, y, beta)

[-12, -24, -36]
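
The gradient points in the direction that increases the squared error, so taking a small step in the opposite direction should reduce it. Below is a minimal sketch of that idea on the single data point above. The `learning_rate` of 0.01, the 20 steps, the name `beta_gd`, and the local `dot` helper are assumptions chosen for illustration, not a full fitting procedure.

```python
from typing import List

Vector = List[float]

def dot(v: Vector, w: Vector) -> float:
    """Sum of componentwise products (local stand-in for scratch.linear_algebra.dot)."""
    return sum(v_i * w_i for v_i, w_i in zip(v, w))

def error(x: Vector, y: float, beta: Vector) -> float:
    return dot(x, beta) - y

def squared_error(x: Vector, y: float, beta: Vector) -> float:
    return error(x, y, beta) ** 2

def sqerror_gradient(x: Vector, y: float, beta: Vector) -> Vector:
    err = error(x, y, beta)
    return [2 * err * x_i for x_i in x]

x = [1, 2, 3]
y = 30
beta_gd = [4.0, 4.0, 4.0]  # same starting point as above
learning_rate = 0.01       # assumed step size

initial_loss = squared_error(x, y, beta_gd)

for _ in range(20):
    grad = sqerror_gradient(x, y, beta_gd)
    # Move each parameter a small step against its gradient component.
    beta_gd = [b - learning_rate * g for b, g in zip(beta_gd, grad)]

final_loss = squared_error(x, y, beta_gd)
print(initial_loss, final_loss)
```

With only one data point, many different `beta` vectors fit it perfectly, so this shows the loss shrinking rather than a meaningful estimate. Fitting a real model would apply the same update across a whole dataset.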