# Marginal Models

## Overview

![overview](week-3-img/marginal-models-overview.png)

## What are Marginal models?
* Marginal models are a general class of statistical models for dependent data, where observations within a randomly sampled cluster may be correlated.
* We are interested in estimating overall, population-average relationships between independent variables (IVs) and dependent variables (DVs) across all clusters.

**In marginal models, we don't allow coefficients to vary randomly across clusters (a key feature of multilevel models).**

**Goal:** Make inference about these overall, marginal relationships, with standard errors that reflect the clustering inherent in the study design. <br>
If we fail to account for the fact that observations within the same higher-level cluster are correlated, we risk understating our standard errors. The standard errors, which reflect the sampling variance of the estimated regression coefficients, need to adequately account for these within-cluster correlations.
<br>
**This is a key feature of marginal models.** <br>
And we can do this in a way that does not require the use of random effects.
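
To see how much ignoring clustering can understate a standard error, here is a minimal sketch for the simplest case: the standard error of an overall mean, assuming equal-sized clusters and the same correlation for every pair of observations in a cluster. The function names and the numbers (50 clusters of 5 observations, a within-cluster correlation of 0.5) are illustrative assumptions, not values from the course data.

```python
import math


def naive_se(sd, n_clusters, cluster_size):
    """Standard error of the mean if all observations were independent."""
    return sd / math.sqrt(n_clusters * cluster_size)


def variance_inflation(cluster_size, within_corr):
    """Factor by which within-cluster correlation inflates the variance of the mean.

    Assumes equal cluster sizes and the same correlation for every pair
    of observations in a cluster.
    """
    return 1.0 + (cluster_size - 1) * within_corr


def clustered_se(sd, n_clusters, cluster_size, within_corr):
    """Standard error of the mean that accounts for within-cluster correlation."""
    inflation = variance_inflation(cluster_size, within_corr)
    return naive_se(sd, n_clusters, cluster_size) * math.sqrt(inflation)


# Hypothetical numbers, chosen only for illustration.
naive = naive_se(sd=10.0, n_clusters=50, cluster_size=5)
adjusted = clustered_se(sd=10.0, n_clusters=50, cluster_size=5, within_corr=0.5)
print(f"naive SE:    {naive:.3f}")
print(f"adjusted SE: {adjusted:.3f}")
```

With a positive within-cluster correlation the adjusted standard error is larger than the naive one; with a correlation of zero the two coincide.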

![overview](week-3-img/marginal-vs-multilevel.png)

The graphs show the difference in the fits of the two different types of models.<br>
* Each of those plots corresponds to a different child: child 46, 48, 49 and so on
* Children have up to 5 measurements on a dependent variable called VSAE, collected across time
    * So this is a longitudinal dataset
* We see that different kids have different trajectories in terms of the 5 observations on the dependent variable VSAE
* The **multilevel fit** is reflected by the **dashed lines.**
    * Each kid has their own unique trajectory: some grow fast, some slow. 
    * The random effects capture this variability and allow each kid to have their own unique relationship between the predictor and the dependent variable
* The **marginal model fit** is represented by the **solid line.**
    * The solid line remains the same regardless of which child we are talking about.
    * We are estimating the overall relationship of the predictor with the dependent variable.
        * We are not interested in how that relationship differs between the higher-level clusters (here, children).
        * We don't want to make inference about the between-child variance.
    * At a given time point, the predicted value of the dependent variable is the same regardless of who the child is.
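
The contrast between the two fits can be sketched in a few lines, assuming a simple linear trend over time. The coefficient values and the per-child deviations below are made up for illustration (they are not estimates from the VSAE data), and the function names are hypothetical.

```python
# Hypothetical population-average (fixed) coefficients: intercept and slope.
INTERCEPT = 10.0
SLOPE = 4.0

# Hypothetical child-specific deviations (random effects), used only by the
# multilevel prediction: child id -> (intercept deviation, slope deviation).
CHILD_DEVIATIONS = {
    46: (-2.0, -1.5),
    48: (1.0, 0.5),
    49: (3.0, 2.0),
}


def marginal_prediction(time):
    """Population-average prediction: the same line for every child."""
    return INTERCEPT + SLOPE * time


def multilevel_prediction(child, time):
    """Child-specific prediction: the overall line shifted by that child's deviations."""
    d_intercept, d_slope = CHILD_DEVIATIONS[child]
    return (INTERCEPT + d_intercept) + (SLOPE + d_slope) * time


# Compare the two kinds of prediction for each child at the same time point.
for child in CHILD_DEVIATIONS:
    print(child, marginal_prediction(2), multilevel_prediction(child, 2))
```

The marginal prediction does not take the child as an input at all, which is the solid line in the plots; the multilevel prediction changes from child to child, which is the dashed line.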
    
## Marginal Models: Before we fit them
1. Explicitly select a structure for the mean of the dependent variable
    * We model the distributional features of a given dependent variable
    * We start by selecting a structure that defines the mean of that dependent variable
        * The mean is usually defined by regression coefficients and the predictor variables
    
2. Select a structure for the **variances and covariances** of observations coming from the **same cluster** that makes sense given our study design. This structure needs to capture the variances and covariances that are **not explained** by the predictor variables used to define the mean of the dependent variable.

3. Compare the fits of marginal models with different choices for this variance-covariance structure, and choose the model with the best fit.
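
In matrix notation, steps 1 and 2 can be written compactly. The symbols are introduced here only for illustration: $y_i$ is the vector of observations in cluster $i$, $X_i$ holds the predictor values for that cluster, $\beta$ is the vector of regression coefficients, and $V_i$ is the variance-covariance matrix whose structure is chosen in step 2.

$$
E(y_i) = X_i \beta, \qquad \operatorname{Var}(y_i) = V_i
$$

No random effects appear in this specification: all of the within-cluster dependence is carried by $V_i$.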

## Example

![overview](week-3-img/example-marginal-model.png)

* The above example is one possible covariance structure that we can consider for the data.
* We have to evaluate different covariance structures in marginal models to see which one has the best fit.

* As an alternative to the covariance structure above, we can assume the errors follow an exchangeable covariance structure.
    * This means all observations in the same cluster have the same variance, and every pair of observations has the same constant covariance, regardless of time.
    * So any two observations within a randomly sampled cluster will have the same correlation.
        * This is different from the autoregressive structure, where observations close to each other in time have a stronger correlation than observations farther apart.
    * With the exchangeable correlation structure, it doesn't matter which pair of observations we are talking about; they will always have exactly the same correlation.
        * This kind of structure **makes sense for clustered data where there is no temporal element**.
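
The difference between the two structures is easy to see by building the correlation matrices directly. This is a minimal sketch: the function names are hypothetical, the autoregressive version is assumed to be first-order (AR(1)) with equally spaced time points, and the values (4 observations per cluster, a correlation parameter of 0.6) are chosen only for illustration.

```python
def exchangeable_corr(n_obs, rho):
    """Correlation matrix where every pair of distinct observations has correlation rho."""
    return [[1.0 if i == j else rho for j in range(n_obs)] for i in range(n_obs)]


def ar1_corr(n_obs, rho):
    """First-order autoregressive correlation matrix: rho ** |i - j|.

    Assumes equally spaced time points, so the correlation decays with the lag.
    """
    return [[rho ** abs(i - j) for j in range(n_obs)] for i in range(n_obs)]


def show(name, matrix):
    """Print a matrix row by row with two decimal places."""
    print(name)
    for row in matrix:
        print("  " + "  ".join(f"{value:.2f}" for value in row))


# Hypothetical setup: 4 observations per cluster, rho = 0.6.
show("exchangeable", exchangeable_corr(4, 0.6))
show("AR(1)", ar1_corr(4, 0.6))
```

Multiplying either matrix by a common variance gives the corresponding covariance matrix. In the exchangeable matrix every off-diagonal entry is identical, while in the AR(1) matrix the entries shrink as observations get farther apart in time.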

## When to fit Marginal Models

![overview](week-3-img/when-to-fit-marginal-models.png)

## Why do we fit Marginal Models?

![overview](week-3-img/why-fir-marginal-models.png)

# 1 
A colleague is arguing that they want to use a marginal modeling approach to analyze a set of longitudinal data, and make conclusions about the variability among individuals in their longitudinal trajectories on a dependent variable of interest based on the results. What is the problem with your colleague’s argument?


1. The marginal model does not enable estimation of between-individual variance in coefficients of interest, because random effects of individuals are not included.

**Correct answer: 1.** Marginal models do not include explicit random effects, so inference about between-subject variability in the trajectory of interest is not possible. One can only make inference about the overall trajectory in the dependent variable of interest across all individuals (which may be perfectly fine for some research questions).

2. The standard errors based on the marginal modeling approach will not correctly account for within-individual correlations.


3. The marginal modeling approach is not appropriate for longitudinal data.


4. Nothing; my colleague should proceed with the analysis.