---
title: "Multivariate Linear Regression"
description: "Extending linear regression to model a continuous outcome using multiple predictor variables."
image: "Figures/multiple_regression_3D.gif"
categories: [Regression]
order: 3
---


## General Principles
To study how several continuous independent variables jointly relate to a continuous dependent variable (e.g., the effect of weight and age on height), we can use multiple regression. Essentially, we extend [Linear Regression for continuous variable](1.&#32;Linear&#32;Regression&#32;for&#32;continuous&#32;variable.qmd) by adding one regression coefficient $\beta_x$ per independent variable (e.g., $\beta_{weight}$ and $\beta_{age}$).


## Considerations
::: callout-note 
- The same considerations apply as for the [Regression for continuous variable](1.&#32;Linear&#32;Regression&#32;for&#32;continuous&#32;variable.qmd).

- Each regression coefficient $\beta_x$ is interpreted for fixed values of the other independent variable(s): for a given age, $\beta_{weight}$ represents the expected change in the dependent variable (height) for each one-unit increase in weight, holding all other variables (e.g., age) constant.

:::
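
To make the "holding other variables constant" reading concrete, here is a minimal sketch in plain Python. The coefficient values and the helper `predict_height` are hypothetical, chosen only for illustration; they are not estimates from the Howell data used in the example.

```python
# Hypothetical coefficients, chosen only for illustration
alpha = 150.0        # intercept
beta_weight = 0.9    # change in expected height per one-unit increase in weight
beta_age = 0.05      # change in expected height per one-unit increase in age


def predict_height(weight, age):
    """Expected height under the linear model alpha + beta_weight*weight + beta_age*age."""
    return alpha + beta_weight * weight + beta_age * age


# Compare two weights one unit apart while keeping age fixed
for age in (20.0, 40.0, 60.0):
    diff = predict_height(51.0, age) - predict_height(50.0, age)
    print(f"age={age:.0f}: change in expected height = {diff:.2f}")
```

Whatever age is held fixed, the difference between the two predictions equals `beta_weight`, which is what the coefficient is meant to express.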

## Example
Below is example code demonstrating Bayesian multiple linear regression using the Bayesian Inference (BI) package. Data consist of three continuous variables (*height*, *weight*, *age*), and the goal is to estimate the effect of *weight* and *age* on *height*. This example is based on @mcelreath2018statistical.

::: {.panel-tabset group="language"}
### Python

```python
from BI import bi

# Setup device ------------------------------------------------
m = bi(platform='cpu')

# Import Data & Data Manipulation ------------------------------------------------
# Import
data_path = m.load.howell1(only_path = True)
m.data(data_path, sep=';')
m.df = m.df[m.df.age > 18] # Subset data to adults
m.scale(['weight', 'age']) # Normalize

# Define model ------------------------------------------------
def model(height, weight, age):
    # Parameter prior distributions
    alpha = m.dist.normal(0, 0.5, name = 'alpha')
    beta1 = m.dist.normal(0, 0.5, name = 'beta1')
    beta2 = m.dist.normal(0, 0.5, name = 'beta2')
    sigma = m.dist.uniform(0, 50, name = 'sigma')
    # Likelihood
    m.dist.normal(alpha + beta1 * weight + beta2 * age, sigma, obs = height)

# Run MCMC ------------------------------------------------
m.fit(model) # Optimize model parameters through MCMC sampling

# Summary ------------------------------------------------
m.summary() # Get posterior distributions
```

### R
```R
library(BayesianInference)
m=importBI(platform='cpu')

# Import Data & Data Manipulation ------------------------------------------------
m$data(m$load$howell1(only_path = TRUE), sep=';') # Import
m$df = m$df[m$df$age > 18,] # Subset data to adults
m$scale(list('weight', 'age')) # Normalize
m$data_to_model(list('weight', 'height', 'age')) # Send to model (convert to jax array)

# Define model ------------------------------------------------
model <- function(height, weight, age){
  # Parameter prior distributions
  alpha = bi.dist.normal(0, 0.5, name = 'a')
  beta1 = bi.dist.normal(0, 0.5, name = 'b1')
  beta2 = bi.dist.normal(0, 0.5, name = 'b2')   
  sigma = bi.dist.uniform(0, 50, name = 's')
  # Likelihood
  bi.dist.normal(alpha + beta1 * weight + beta2 * age, sigma, obs=height)
}

# Run MCMC ------------------------------------------------
m$fit(model) # Optimize model parameters through MCMC sampling

# Summary ------------------------------------------------
m$summary() # Get posterior distributions

```

### Julia
```Julia
using BayesianInference

# Setup device------------------------------------------------
m = importBI(platform="cpu")

# Import Data & Data Manipulation ------------------------------------------------
# Import
data_path = m.load.howell1(only_path = true)
m.data(data_path, sep=';') 
m.df = m.df[m.df.age > 18] # Subset data to adults
m.scale(["weight", "age"]) # Normalize

# Define model ------------------------------------------------
@BI function model(height, weight, age)
    # Parameter prior distributions
    alpha = m.dist.normal(0, 0.5, name = "alpha")    
    beta1 = m.dist.normal(0, 0.5, name = "beta1")
    beta2 = m.dist.normal(0, 0.5, name = "beta2")
    sigma = m.dist.uniform(0, 50, name = "sigma")
    # Likelihood
    m.dist.normal(alpha + beta1 * weight + beta2 * age, sigma, obs = height)
end

# Run MCMC ------------------------------------------------
m.fit(model)  # Optimize model parameters through MCMC sampling

# Summary ------------------------------------------------
m.summary() # Get posterior distributions
```
:::

::: callout-caution
For R users, if you create the regression coefficients in a single call:

```python
betas = bi.dist.normal(0, 0.5, name = 'regression_coefficients', shape = (2,))
```

you need to index them starting from 0 (not from 1, as is usual in R):

```python
bi.dist.normal(alpha + betas[0] * weight + betas[1] * age, sigma, obs=height)
```
:::

## Mathematical Details
### *Frequentist formulation*
We model the relationship between the independent variables $(X_{[1,i]}, X_{[2,i]}, ..., X_{[K,i]})$ and the dependent variable $Y$ using the following equation:

$$
Y_i = \alpha + \beta_1 X_{[1,i]} + \beta_2 X_{[2,i]} + ... + \beta_K X_{[K,i]} + \epsilon_i
$$

Where:

- $Y_i$ is the dependent variable for observation *i*.
  
- $\alpha$ is the intercept term.
  
- $X_{[1,i]}$, $X_{[2,i]}$, ..., $X_{[K,i]}$ are the values of the independent variables for observation *i*.
  
- $\beta_1$, $\beta_2$, ..., $\beta_K$ are the regression coefficients.
  
- $\epsilon_i$ is the error term for observation *i*; the error terms are assumed to be independent and identically distributed.
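
One common way to estimate $\alpha$ and the $\beta_k$ in this formulation is ordinary least squares. The sketch below is an illustration on synthetic data, not part of the BI workflow: the generating values (`true_alpha`, `true_beta`), the sample size, and the noise scale are all assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data with known coefficients (illustrative values only)
n, K = 500, 2
X = rng.normal(size=(n, K))               # columns: X_1, X_2
true_alpha = 1.0
true_beta = np.array([2.0, -0.5])
eps = rng.normal(scale=0.3, size=n)       # i.i.d. error term
y = true_alpha + X @ true_beta + eps

# Design matrix with a leading column of ones for the intercept
design = np.column_stack([np.ones(n), X])

# Least-squares estimate of (alpha, beta_1, ..., beta_K)
coef, *_ = np.linalg.lstsq(design, y, rcond=None)
alpha_hat, beta_hat = coef[0], coef[1:]

print("alpha_hat:", round(float(alpha_hat), 3))
print("beta_hat:", np.round(beta_hat, 3))
```

With this much data and little noise, the estimates should land close to the generating values.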
  

### *Bayesian formulation*
In the Bayesian formulation, we define each parameter with [<span style="color:#0D6EFD">priors 🛈</span>]{#prior}. We can express the Bayesian model as follows:

$$
Y_i \sim \text{Normal}\left(\alpha + \sum_{k=1}^K \beta_k X_{[k,i]}, \sigma\right)
$$

$$
\alpha \sim \text{Normal}(0,1)
$$

$$
\beta_k \sim \text{Normal}(0,1)
$$

$$
\sigma \sim \text{Uniform}(0, 50)
$$

Where:

- $Y_i$ is the dependent variable for observation *i*. 
  
- $\alpha$ is the intercept term, which in this case has a unit-normal prior.
  
- $\beta_k$ are slope coefficients for the _K_ distinct independent variables, which also have unit-normal priors.
  
- $X_{[1,i]}$, $X_{[2,i]}$, ..., $X_{[K,i]}$ are the values of the independent variables for observation *i*.
  
- $\sigma$ is a standard deviation parameter, which here has a Uniform prior that constrains it to be positive.
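
To show how the likelihood and the priors combine, the following sketch evaluates the unnormalized log posterior of this model with NumPy. It is a hand-written illustration, not how the BI package computes things internally; the helpers `log_normal_pdf` and `log_posterior` and the synthetic data are assumptions made for the example.

```python
import numpy as np


def log_normal_pdf(x, mean, sd):
    """Log density of Normal(mean, sd) evaluated at x."""
    return -0.5 * np.log(2 * np.pi) - np.log(sd) - 0.5 * ((x - mean) / sd) ** 2


def log_posterior(alpha, beta, sigma, X, y):
    """Unnormalized log posterior; X is (n, K), beta is (K,), y is (n,)."""
    # sigma ~ Uniform(0, 50): zero density outside the interval
    if not (0.0 < sigma < 50.0):
        return -np.inf
    log_prior = (
        log_normal_pdf(alpha, 0.0, 1.0)            # alpha ~ Normal(0, 1)
        + np.sum(log_normal_pdf(beta, 0.0, 1.0))   # beta_k ~ Normal(0, 1)
        - np.log(50.0)                             # Uniform(0, 50) density
    )
    mu = alpha + X @ beta                          # linear predictor
    log_lik = np.sum(log_normal_pdf(y, mu, sigma)) # Y_i ~ Normal(mu_i, sigma)
    return log_prior + log_lik


# Synthetic data (illustrative values only)
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 2))
y = 0.5 + X @ np.array([1.0, -1.0]) + rng.normal(scale=0.5, size=200)

near = log_posterior(0.5, np.array([1.0, -1.0]), 0.5, X, y)
far = log_posterior(0.0, np.array([0.0, 0.0]), 0.5, X, y)
print("near the generating values:", near)
print("far from them:", far)
```

Parameter values close to those that generated the data receive a higher log posterior than values far from them, and any $\sigma$ outside $(0, 50)$ is ruled out by the Uniform prior. An MCMC sampler explores this surface rather than evaluating it at hand-picked points.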

## Reference(s)
::: {#refs}
:::