# Chapter 15. Multiple Regression

In [6]:
from __future__ import division
from collections import Counter
from functools import partial
from linear_algebra import dot, vector_add
from statistics import median, standard_deviation
from probability import normal_cdf
from gradient_descent import minimize_stochastic
from simple_linear_regression import total_sum_of_squares
import math, random

The VP is impressed by your simple regression model, but you know you can do better.  
You start by collecting more data: for each user you record how many hours they work each day and whether they have a PhD.  
You can use this additional data to improve your model.  
Accordingly, you hypothesize a linear model with more independent variables:  

$\normalsize \text{minutes} = \alpha + \beta_1 \text{friends} + \beta_2 \text{work hours} + \beta_3 \text{PhD} + \epsilon$  

For the PhD category we can use a dummy variable (see Chapter 11) that equals 1 for users *with* a PhD and 0 for users *without* a PhD.

## The Model

Recall that in Chapter 14 we fit a model of the form:  

$\Large y_i = \alpha + \beta x_i + \epsilon_i$  

Now imagine that each input $\normalsize x_i$ is not a single number, but is instead a vector of $\normalsize k$ numbers $\normalsize \;{x_i}_1, {x_i}_2, \ldots, {x_i}_k$.  
The multiple regression model assumes that:  

$\Large y_i = \alpha + \beta_1{x_i}_1 + \ldots + \beta_k{x_i}_k + \epsilon_i$
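
As a concrete instance, the minutes model from the start of the chapter corresponds to $\normalsize k = 3$, reading ${x_i}_1$ as user $i$'s number of friends, ${x_i}_2$ as their work hours, and ${x_i}_3$ as the PhD dummy variable:

$\Large \text{minutes}_i = \alpha + \beta_1 \text{friends}_i + \beta_2 \text{work hours}_i + \beta_3 \text{PhD}_i + \epsilon_i$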

In multiple regression the vector of parameters is usually called $\normalsize \beta$.  
We'll want this to include the constant term as well, which we can achieve by adding a column of ones to our data:

$\Large \beta = [\alpha, \beta_1, \ldots, \beta_k]$

and:

$\Large x_i = [1, {x_i}_1, \ldots, {x_i}_k]$

Then our model is:

In [7]:
def predict(x_i, beta):
    """ assumes that the first element of each x_i is 1 """
    return dot(x_i, beta)
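
To see why the leading 1 matters, here is a small self-contained sketch. It defines its own `dot` instead of importing the one from `linear_algebra`, and every number in it is invented for illustration, not a fitted coefficient or real data.

```python
def dot(v, w):
    """Sum of componentwise products (stand-in for linear_algebra.dot)."""
    return sum(v_i * w_i for v_i, w_i in zip(v, w))

def predict(x_i, beta):
    """Assumes that the first element of x_i is 1."""
    return dot(x_i, beta)

# Invented coefficients: [alpha, beta_1, beta_2, beta_3]
beta = [30.0, 1.0, -2.0, 1.0]

# Invented input: [constant, friends, work hours, PhD dummy]
x_i = [1, 10, 8, 0]

# The leading 1 multiplies alpha, so the intercept is included automatically:
# 30.0*1 + 1.0*10 + (-2.0)*8 + 1.0*0
prediction = predict(x_i, beta)
print(prediction)
```

Because the constant rides along as an ordinary component of `x_i`, the intercept needs no special handling anywhere else in the code.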

In this particular case, our independent variable `x` will be a list of vectors, each holding four entries: the constant term 1, the number of friends, the work hours per day, and the PhD dummy variable.
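
A minimal sketch of how such a list could be assembled, assuming the raw data arrives as hypothetical `(friends, hours, has_phd)` tuples. The `users` records and the `to_vector` helper are made up for this illustration and are not part of the chapter's dataset.

```python
# Hypothetical raw records: (friends, work hours per day, has PhD)
users = [(12, 6, False), (30, 9, True), (5, 8, False)]

def to_vector(friends, hours, has_phd):
    """Build [constant, friends, work hours, PhD dummy] for one user."""
    return [1, friends, hours, 1 if has_phd else 0]

# x is a list of vectors, one per user, each starting with the constant 1
x = [to_vector(*user) for user in users]

for x_i in x:
    print(x_i)
```

Each row begins with 1, so it can be passed straight to `predict` together with a four-element `beta`.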

## Further Assumptions of the Least Squares Model