## 13.2 The multivariable linear regression model 

Suppose we wish to relate an outcome ($Y$) to $p$ predictor variables $(X_1, X_2, ..., X_p)$. The appropriate multivariable linear regression model is a straightforward extension of the simple linear regression model: 

$$ 
y_i = \beta_0 + \beta_1 x_{1i} + \beta_2 x_{2i} + \dots + \beta_p x_{pi} + \epsilon_i \text{ with } \epsilon_i \sim NID(0,\sigma^2),
$$ 

where

$$
\begin{align}
y_i &= \text{value of the dependent variable for the } i\text{th participant}\\
x_{ji} &= \text{value of the } j\text{th predictor variable for the } i\text{th participant}. 
\end{align}
$$

The parameters in the model are interpreted as follows:

+ $\beta_0$ is the intercept. It is the expected value of $Y$ when all the $X_j$ are zero.
+ $\beta_j$ is the expected change in $Y$ for a one-unit increase in $X_j$ *with all the other covariates held constant*. 

The $\beta_j$ are the **regression coefficients** (also known as **partial regression coefficients**). Each one measures the effect of one covariate, controlling (or adjusting) for all of the others. 
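
To see why $\beta_j$ has this interpretation, compare the expected outcome for two participants whose covariates are identical except that $X_j$ differs by one unit (writing $x_1, \dots, x_p$ for the fixed covariate values):

$$
\begin{align}
&E(Y \mid x_1, \dots, x_j + 1, \dots, x_p) - E(Y \mid x_1, \dots, x_j, \dots, x_p) \\
&\quad = \beta_j (x_j + 1) - \beta_j x_j \\
&\quad = \beta_j.
\end{align}
$$

All the other terms cancel because the remaining covariates take the same values in both expectations.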

### 13.2.1 The multivariable linear regression model in matrix notation

As with the simple linear regression model, the multivariable linear regression model can be expressed using matrix algebra:

$$
\mathbf{Y}=\mathbf{X}\mathbf{\beta}+\mathbf{\epsilon} \text{ where } \mathbf{\epsilon} \sim N(\mathbf{0},\mathbf{I}\sigma^2)
$$

$$
\begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix} =
\begin{bmatrix} 1 & x_{11} & x_{21} & \dots & x_{p1} \\ 1 & x_{12} & x_{22} & \dots & x_{p2} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 1 & x_{1n} & x_{2n} & \dots & x_{pn} \end{bmatrix}
\begin{bmatrix} \beta_0 \\ \beta_1 \\ \beta_2 \\ \vdots \\ \beta_p \end{bmatrix} +
\begin{bmatrix} \epsilon_1 \\ \epsilon_2 \\ \vdots \\ \epsilon_n \end{bmatrix}
$$

In this formulation, $\mathbf{X}$ is an $n \times (p+1)$ matrix, $\mathbf{Y}$ and $\mathbf{\epsilon}$ are vectors of length $n$, whilst $\mathbf{\beta}$ is a vector of length $(p+1)$.
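
As a minimal sketch of what $\mathbf{X}$ looks like in practice, the R code below builds the design matrix for a small made-up data set with $n = 5$ participants and $p = 2$ predictors. The data frame `dat` and the variable names `x1` and `x2` are hypothetical and chosen only for illustration.

```r
# Hypothetical data: 5 participants, 2 predictors (illustrative values only)
dat <- data.frame(
  x1 = c(2.1, 3.4, 1.8, 4.0, 2.9),
  x2 = c(10, 12, 9, 15, 11)
)

# model.matrix() adds the leading column of 1s for the intercept
X <- model.matrix(~ x1 + x2, data = dat)

X       # one row per participant, one column per parameter
dim(X)  # n rows and p + 1 columns
```

Each row of `X` corresponds to one participant, and the first column of 1s multiplies $\beta_0$.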
 
### 13.2.2 Estimation of the parameters

The regression coefficients in multivariable linear regression can be estimated by minimising the residual sum of squares: 

$$
\begin{align}
SS_{RES} &= \sum_{i=1}^n \hat{\epsilon}_i^2 = \sum_{i=1}^n (y_i-\hat{y}_i)^2 \\
&= \sum_{i=1}^n (y_i-\hat{\beta}_0-\hat{\beta}_1x_{1i}-...-\hat{\beta}_px_{pi})^2 
\end{align}
$$

The closed-form solution is obtained by setting the partial derivatives of $SS_{RES}$ with respect to each parameter estimate to zero and solving the resulting $(p+1)$ simultaneous equations. It can be written succinctly using matrix notation: 

$$
\mathbf{\hat{\beta}}= (\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'\mathbf{Y}
$$
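
The matrix formula can be checked numerically. The following is a rough sketch on simulated data, not the birthweight data: the sample size, the true coefficient values and the variable names are all assumptions made for illustration. It computes $\mathbf{\hat{\beta}}$ from the closed-form expression and compares it with the estimates returned by `lm()`.

```r
set.seed(1)

# Simulated (hypothetical) data: n = 50 participants, p = 2 predictors
n  <- 50
x1 <- rnorm(n)
x2 <- rnorm(n)
y  <- 1 + 2 * x1 - 0.5 * x2 + rnorm(n)

# Design matrix with a leading column of 1s for the intercept
X <- cbind(1, x1, x2)

# Closed-form least squares estimate: solve (X'X) beta = X'y
beta_hat <- solve(crossprod(X), crossprod(X, y))

# The same model fitted with lm()
fit <- lm(y ~ x1 + x2)

# The two sets of estimates should agree up to numerical tolerance
all.equal(as.numeric(beta_hat), unname(coef(fit)))
```

Solving the linear system with `solve(A, b)` avoids forming the inverse explicitly. `lm()` itself uses a QR decomposition, which is numerically more stable than working with $\mathbf{X}'\mathbf{X}$ directly.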

$\mathbf{\hat{\beta}}$ is an unbiased estimator of $\mathbf{\beta}$. Its distribution is as follows:

$$
\mathbf{\hat{\beta}} \sim N(\mathbf{\beta}, (\mathbf{X}'\mathbf{X})^{-1}\sigma^2).
$$

This expresses the fact that the elements of $\mathbf{\hat{\beta}}$ jointly follow a multivariate normal distribution with mean $\mathbf{\beta}$, whose variances and covariances are given by the matrix $(\mathbf{X}'\mathbf{X})^{-1}\sigma^2$. 
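
Both the mean and the variance can be verified directly. As a brief sketch, treating $\mathbf{X}$ as fixed and substituting $\mathbf{Y}=\mathbf{X}\mathbf{\beta}+\mathbf{\epsilon}$ into the estimator gives

$$
\begin{align}
\mathbf{\hat{\beta}} &= (\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'(\mathbf{X}\mathbf{\beta}+\mathbf{\epsilon}) = \mathbf{\beta} + (\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'\mathbf{\epsilon}, \\
E(\mathbf{\hat{\beta}}) &= \mathbf{\beta} + (\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'E(\mathbf{\epsilon}) = \mathbf{\beta}, \\
\text{Var}(\mathbf{\hat{\beta}}) &= (\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'(\mathbf{I}\sigma^2)\mathbf{X}(\mathbf{X}'\mathbf{X})^{-1} = (\mathbf{X}'\mathbf{X})^{-1}\sigma^2.
\end{align}
$$

Normality follows because $\mathbf{\hat{\beta}}$ is a linear function of the normally distributed errors.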

It can also be shown that the following is an unbiased estimator for $\sigma^2$:

$$
\begin{align}
\hat{\sigma}^2 &= \sum_{i=1}^n \frac{\hat{\epsilon}_i^2}{(n-(p+1))}\\
              &=\sum_{i=1}^n (y_i - \hat{\beta}_0 - \hat{\beta}_1x_{1i}- ... - \hat{\beta}_px_{pi})^2/(n-(p+1))
\end{align}
$$
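
The sketch below illustrates this estimator, again on simulated data with assumed (hypothetical) coefficient values and variable names. It computes $\hat{\sigma}^2$ from the residuals, substitutes it for $\sigma^2$ to obtain the estimated variance-covariance matrix of $\mathbf{\hat{\beta}}$, and compares both quantities with those reported by `lm()`.

```r
set.seed(2)

# Simulated (hypothetical) data: n = 40 participants, p = 2 predictors
n  <- 40
x1 <- rnorm(n)
x2 <- rnorm(n)
y  <- 3 + 1.5 * x1 + 0.8 * x2 + rnorm(n, sd = 2)

X <- cbind(1, x1, x2)
p <- ncol(X) - 1  # number of predictors, excluding the intercept

# Least squares estimates and residuals
beta_hat <- solve(crossprod(X), crossprod(X, y))
res      <- y - X %*% beta_hat

# Unbiased estimate of sigma^2: residual sum of squares / (n - (p + 1))
sigma2_hat <- sum(res^2) / (n - (p + 1))

# Estimated variance-covariance matrix of beta_hat
vcov_hat <- sigma2_hat * solve(crossprod(X))

# Compare with the quantities stored in the lm() fit
fit <- lm(y ~ x1 + x2)
all.equal(sigma2_hat, summary(fit)$sigma^2)
all.equal(unname(vcov_hat), unname(vcov(fit)))
```

The square roots of the diagonal elements of `vcov_hat` are the standard errors of the estimated coefficients.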

While it is useful to know how these parameters are estimated, in practice they are often obtained using statistical software. Next, we demonstrate how to perform multivariable regression in R using the birthweight data and discuss the interpretation of the estimated regression coefficients. 
 