---
title: "Varying Intercepts Models"
description: "Modeling grouped or hierarchical data by allowing the intercept to vary for each group."
categories: [Hierarchical Models, Regression]
image: "Figures/13.png"
order: 16
---


## General Principles

To model the relationship between a dependent variable and an independent variable while allowing for different intercepts across groups or clusters, we can use a *Varying Intercepts* model. This approach is particularly useful when data are grouped (e.g., by subject, location, or time period) and we expect the baseline level of the outcome to vary across these groups.

## Considerations

::: callout-note
- We have the same considerations as for [Regression for a continuous variable](1.&#32;Linear&#32;Regression&#32;for&#32;continuous&#32;variable.qmd).

- The main idea of varying intercepts is to generate an intercept for each group, allowing each group to start at different levels. Thus, the intercept $\alpha_k$ is defined uniquely for each of the $K$ declared groups.

- In the code below, the intercept ```alpha``` for each of the $K$ declared groups is drawn from a shared Normal distribution. Its mean ```a_bar``` and standard deviation ```sigma``` are themselves given priors (hyperpriors): a Normal and an Exponential distribution, respectively.

:::
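
To make the hierarchical prior described above concrete, here is a minimal NumPy sketch that draws one set of group intercepts from shared hyperpriors. It does not use the BI package; the prior values mirror the example in the next section, and the names `K`, `group_id`, and `alpha_per_obs` are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
K = 48  # number of groups (assumed here to match the tanks example)

# Hyperpriors shared by all groups
a_bar = rng.normal(0.0, 1.5)   # population-level mean intercept
sigma = rng.exponential(1.0)   # between-group standard deviation (always positive)

# One intercept per group, all drawn from the same population distribution
alpha = rng.normal(a_bar, sigma, size=K)

# Each observation picks up the intercept of its group by integer indexing
group_id = np.array([0, 0, 1, 47])  # hypothetical group labels for four observations
alpha_per_obs = alpha[group_id]
```

Observations that share a group label receive the same intercept, which is what the indexing `alpha[tank]` does in the models below.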

## Example

Below is an example code snippet demonstrating Bayesian regression with varying intercepts using the Bayesian Inference (BI) package. The data consist of a dependent variable representing individuals' survival (*surv*) and an independent categorical variable (*tank*), which indicates the tank where the individual was born, with a total of 48 tanks. The initial number of individuals in each tank (*density*) is used as the number of binomial trials. This example is based on @mcelreath2018statistical.

::: {.panel-tabset group="language"}
## Python (Raw)

```python
from BI import bi, jnp

# Setup device------------------------------------------------
m = bi(platform='cpu')

# Import Data & Data Manipulation ------------------------------------------------
# Import
from importlib.resources import files
data_path = m.load.reedfrogs(only_path=True)
m.data(data_path, sep=';')
# Manipulate
m.df["tank"] = jnp.arange(m.df.shape[0])

# Define model ------------------------------------------------
def model(tank, surv, density):
    sigma = m.dist.exponential( 1,  name = 'sigma')
    a_bar = m.dist.normal( 0., 1.5,  name = 'a_bar')
    alpha = m.dist.normal( a_bar, sigma, shape= tank.shape, name = 'alpha')
    p = alpha[tank]
    m.dist.binomial(total_count = density, logits = p, obs=surv)

# Run sampler ------------------------------------------------
m.fit(model)

# Diagnostic ------------------------------------------------
m.summary()
```

## Python (Built-in function)

```python
from BI import bi, jnp

# Setup device------------------------------------------------
m = bi(platform='cpu')

# Import Data & Data Manipulation ------------------------------------------------
# Import
from importlib.resources import files
data_path = m.load.reedfrogs(only_path=True)
m.data(data_path, sep=';')
# Manipulate
m.df["tank"] = jnp.arange(m.df.shape[0])

# Define model ------------------------------------------------
def model(tank, surv, density):
    alpha = m.effects.varying_intercept(N_groups=48,group_id=tank, group_name = 'tank')
    m.dist.binomial(total_count = density, logits = alpha, obs=surv)

# Run sampler ------------------------------------------------
m.fit(model)

# Diagnostic ------------------------------------------------
m.summary()
```

## R

```R
library(BayesianInference)

# setup platform------------------------------------------------
m=importBI(platform='cpu')

# Import data ------------------------------------------------
m$data(m$load$reedfrogs(only_path=T), sep=';')
m$df$tank = c(0:(nrow(m$df)-1)) # Manipulate
m$data_to_model(list('tank', 'surv', 'density')) # Manipulate
m$data_on_model$tank = m$data_on_model$tank$astype(jnp$int32) # Manipulate
m$data_on_model$surv = m$data_on_model$surv$astype(jnp$int32) # Manipulate


# Define model ------------------------------------------------
model <- function(tank, surv, density){
  # Parameter prior distributions
  sigma = bi.dist.exponential( 1,  name = 'sigma',shape=c(1))
  a_bar =  bi.dist.normal(0, 1.5, name='a_bar',shape=c(1))
  alpha = bi.dist.normal(a_bar, sigma, name='alpha', shape =c(48))
  p = alpha[tank]
  # Likelihood
  m$dist$binomial(total_count = density, logits = p, obs=surv)
} 

# Run MCMC ------------------------------------------------
m$fit(model) # Draw posterior samples via MCMC

# Summary ------------------------------------------------
m$summary() # Summarize the posterior distributions


```

## Julia
```julia
using BayesianInference

# Setup device------------------------------------------------
m = importBI(platform="cpu")

# Import Data & Data Manipulation ------------------------------------------------
# Import
data_path = m.load.reedfrogs(only_path = true)
m.data(data_path, sep=';')
m.df["tank"] = jnp.arange(m.df.shape[0]) 

# Define model ------------------------------------------------
@BI function model(tank, surv, density)
    alpha = m.effects.varying_intercept(N_groups=48,group_id=tank, group_name = "tank")
    m.dist.binomial(total_count = density, logits = alpha, obs=surv)
end

# Run mcmc ------------------------------------------------
m.fit(model)  # Draw posterior samples via MCMC

# Summary ------------------------------------------------
m.summary() # Summarize the posterior distributions
```
:::
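
Because the likelihood in the models above is parameterized with `logits`, the intercepts `alpha` (and `a_bar`) live on the log-odds scale. The following minimal sketch shows how such values could be converted to survival probabilities with the inverse-logit function. The numbers in `alpha_logit` are made up for illustration and are not model output, and `inv_logit` is a helper defined here rather than part of the BI package.

```python
import numpy as np

def inv_logit(x):
    """Map log-odds to probabilities in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))

# Hypothetical intercepts on the log-odds scale (not actual estimates)
alpha_logit = np.array([-1.0, 0.0, 2.0])

# Corresponding per-group survival probabilities
survival_prob = inv_logit(alpha_logit)
```

A log-odds value of 0 corresponds to a probability of 0.5; negative values map below 0.5 and positive values above it.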

## Mathematical Details

We model the relationship between the independent variable $X$ and the outcome variable $Y$ while allowing a separate intercept $\alpha_k$ for each group, where $k(i)$ denotes the group to which observation $i$ belongs, using the following equations:


$$
Y_{i} \sim \text{Normal}(\mu_{i}, \sigma)
$$
$$
\mu_{i} = \alpha_{[k(i)]} + \beta X_{i}
$$

$$
\beta \sim \text{Normal}(0, 1)
$$

$$
\sigma \sim \text{Exponential}(1)
$$

$$
\alpha_{[k]} \sim \text{Normal}(\bar{\alpha}, \varsigma) 
$$

$$
\bar{\alpha} \sim \text{Normal}(0, 1)
$$
$$
\varsigma \sim \text{Exponential}(1)
$$

Where:

- $Y_{i}$ is the outcome variable for observation $i$.

- $\alpha_{[k(i)]}$ is the varying intercept of the group $k(i)$ to which observation $i$ belongs.

- $\beta$ is the regression coefficient.

- $\sigma$ is a standard deviation parameter, which here has an Exponential prior that constrains it to be positive.

- $\bar{\alpha}$ is the overall mean intercept.

- $\varsigma$ is the standard deviation of the intercepts across groups.
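
The generative process defined by these equations can be sketched as a small NumPy simulation. This is an illustration under stated assumptions rather than a fitted model: the hyperparameters $\bar{\alpha}$, $\varsigma$, $\beta$, and $\sigma$ are fixed to arbitrary values instead of being drawn from their priors, and the names `K`, `n_per_group`, and `alpha_hat` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
K, n_per_group = 5, 200  # illustrative number of groups and observations per group

# Parameters fixed for illustration (in the model they have priors)
alpha_bar, varsigma = 0.0, 1.0  # mean and standard deviation of the intercepts
beta, sigma = 0.5, 0.3          # slope and residual standard deviation

# alpha_k ~ Normal(alpha_bar, varsigma): one intercept per group
alpha = rng.normal(alpha_bar, varsigma, size=K)

# k(i): group membership of each observation
k = np.repeat(np.arange(K), n_per_group)

# mu_i = alpha_[k(i)] + beta * X_i, then Y_i ~ Normal(mu_i, sigma)
X = rng.normal(size=k.size)
mu = alpha[k] + beta * X
Y = rng.normal(mu, sigma)

# Sanity check: with beta known, the group means of Y - beta * X recover alpha_k
alpha_hat = np.array([(Y - beta * X)[k == g].mean() for g in range(K)])
```

With 200 observations per group, `alpha_hat` should lie close to the simulated `alpha`, which shows that each group has its own baseline while sharing a common slope.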


## Notes
::: callout-note 
- We can apply multiple variables similarly to [Chapter 2](2.&#32;Multiple&#32;continuous&#32;Variables.qmd).

- We can apply interaction terms similarly to [Chapter 3](/3.%20Interaction%20between%20continuous%20variables.qmd).

- We can apply categorical variables similarly to [Chapter 4](/4.%20Categorical%20variable.qmd).

- We can apply varying intercepts with any distribution developed in previous chapters.
:::

## Reference(s)
::: {#refs}
:::