# Marginal Linear Regression Models

![overview](week-3-img/marginal-modelling-approaches.png)

**We have a continuous dependent variable that we assume is marginally normally distributed.**

![overview](week-3-img/marginal-lr-model-1.png)



* For point 1: we assume that the vector of observations on y follows a normal distribution whose mean is defined by the regression function (a linear combination of the fixed effects, i.e. the beta parameters, and the predictor variables denoted by x sub i) and whose variance-covariance structure is defined by V sub i.

* An important part of fitting these models is choosing a structure for the V*i* matrices.
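
As a rough illustration of what "choosing a structure" means, the sketch below builds three common working correlation matrices for a cluster of size `m` with NumPy. The function names and the example value `rho = 0.3` are made up for these notes, and V*i* would additionally be scaled by a variance term.

```python
import numpy as np

def independence_corr(m):
    """Observations in a cluster are treated as uncorrelated."""
    return np.eye(m)

def exchangeable_corr(m, rho):
    """Every pair of observations in a cluster shares the same correlation rho."""
    return (1 - rho) * np.eye(m) + rho * np.ones((m, m))

def ar1_corr(m, rho):
    """Correlation decays as rho**|j - k| with the gap between time points j and k."""
    t = np.arange(m)
    return rho ** np.abs(np.subtract.outer(t, t))

R_exch = exchangeable_corr(4, 0.3)  # constant off-diagonal entries
R_ar1 = ar1_corr(4, 0.3)            # entries shrink away from the diagonal
print(R_exch)
print(R_ar1)
```

The exchangeable matrix needs one correlation parameter, as does the AR(1) matrix, while an unstructured matrix would need a separate parameter for every pair of observations.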

## Generalized Estimating Equations

**It is a technique used for fitting marginal models**
* The score function is a function of the regression parameter (the beta parameters we are interested in).
* We are interested in the estimate of the parameters that solve the equation.

**When fitting models to dependent data using GEE, we seek to estimate the parameters that solve the score function (or estimating equation):**

![overview](week-3-img/gee-eqn-parameters.png)

![overview](week-3-img/gee-eqn-parameters-2.png)

**We want the parameter estimates (which determine mu sub i) that solve this score equation**
* We use iteratively reweighted least squares or Fisher scoring algorithms
* These are all iterative techniques that attempt to identify the estimates of the Beta parameters that solve this score equation.
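
In the usual Liang and Zeger notation (an assumption here, since the slides may use slightly different symbols), the score equation can be written as

$$
S(\beta) = \sum_{i=1}^{n} D_i^{\top} V_i^{-1} \left( y_i - \mu_i(\beta) \right) = 0,
\qquad D_i = \frac{\partial \mu_i}{\partial \beta}.
$$

For the marginal linear model, $\mu_i = X_i \beta$, so $D_i = X_i$ and, for a fixed $V_i$, the solution has the closed form

$$
\hat{\beta} = \left( \sum_{i=1}^{n} X_i^{\top} V_i^{-1} X_i \right)^{-1} \sum_{i=1}^{n} X_i^{\top} V_i^{-1} y_i .
$$

Because $V_i$ depends on working correlation parameters that are themselves estimated from the residuals, the algorithm alternates between updating $\beta$ and updating $V_i$ until the estimates stop changing.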

### About GEE

![overview](week-3-img/about-gee.png)

### How do we make inferences?

* The variances of the parameter estimates (the variances of their sampling distributions) are computed using the **sandwich estimator**, as discussed in the Liang and Zeger (1986) paper
    * The sandwich estimation technique reflects the clustering of the observations
* The estimator is based on a specified working correlation matrix
    * e.g. an exchangeable structure, meaning the observations have a constant correlation within a cluster
* The estimator is also based on the variance-covariance matrix of the observations, computed from the expected means of the fitted model
* Estimators of our fixed-effect parameters (betas) are going to be **consistent** (with a large enough sample size, the estimates will converge to the true values) **EVEN** if we **misspecify the working correlation matrix**
    * This means that even if we choose a bad correlation matrix, we will still get consistent estimates
    * But the badly chosen correlation matrix will **affect** the **standard errors**
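
The points above can be sketched in NumPy for the linear case. This is a minimal illustration under stated assumptions, not the implementation used in the course: the function `fit_linear_gee`, the simple moment estimator for the exchangeable correlation, and the simulated data are all invented for these notes.

```python
import numpy as np

def exch(m, rho):
    """Exchangeable working correlation matrix for a cluster of size m."""
    return (1 - rho) * np.eye(m) + rho * np.ones((m, m))

def fit_linear_gee(y, X, groups, n_iter=50, tol=1e-8):
    """Illustrative linear GEE with an exchangeable working correlation."""
    y, X = np.asarray(y, float), np.asarray(X, float)
    groups = np.asarray(groups)
    clusters = [np.flatnonzero(groups == g) for g in np.unique(groups)]
    n, p = X.shape
    n_pairs = sum(len(c) * (len(c) - 1) // 2 for c in clusters)
    m_max = max(len(c) for c in clusters)

    beta = np.linalg.lstsq(X, y, rcond=None)[0]  # start from the OLS fit
    rho = 0.0
    for _ in range(n_iter):
        resid = y - X @ beta
        phi = resid @ resid / (n - p)  # moment estimate of the variance
        # Moment estimate of the common within-cluster correlation:
        # average product of residual pairs, scaled by the variance.
        cross = sum((resid[c].sum() ** 2 - resid[c] @ resid[c]) / 2
                    for c in clusters)
        rho = cross / (phi * n_pairs) if n_pairs else 0.0
        if m_max > 1:  # keep the working correlation positive definite
            rho = float(np.clip(rho, -0.99 / (m_max - 1), 0.99))
        # Solve the estimating equation for beta given the current rho.
        A, b = np.zeros((p, p)), np.zeros(p)
        for c in clusters:
            W = np.linalg.inv(exch(len(c), rho))
            A += X[c].T @ W @ X[c]
            b += X[c].T @ W @ y[c]
        new_beta = np.linalg.solve(A, b)
        done = np.max(np.abs(new_beta - beta)) < tol
        beta = new_beta
        if done:
            break

    # Sandwich estimator: bread^{-1} @ meat @ bread^{-1}.
    resid = y - X @ beta
    bread, meat = np.zeros((p, p)), np.zeros((p, p))
    for c in clusters:
        W = np.linalg.inv(exch(len(c), rho))
        bread += X[c].T @ W @ X[c]
        score = X[c].T @ W @ resid[c]  # this cluster's score contribution
        meat += np.outer(score, score)
    bread_inv = np.linalg.inv(bread)
    robust_se = np.sqrt(np.diag(bread_inv @ meat @ bread_inv))
    return beta, rho, robust_se

# Simulated clustered data (hypothetical): a shared cluster effect
# induces a within-cluster correlation of about 0.5.
rng = np.random.default_rng(0)
n_clusters, m = 200, 5
groups = np.repeat(np.arange(n_clusters), m)
x = rng.normal(size=n_clusters * m)
cluster_effect = rng.normal(size=n_clusters)[groups]
y = 1.0 + 0.5 * x + cluster_effect + rng.normal(size=n_clusters * m)
X = np.column_stack([np.ones_like(x), x])

beta_hat, rho_hat, se = fit_linear_gee(y, X, groups)
print(beta_hat, rho_hat, se)
```

The "meat" is built from each cluster's observed residuals rather than from the working correlation alone, which is why the resulting standard errors still reflect the clustering when the working structure is not quite right.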
 
![overview](week-3-img/how-to-fit-GEE.png)

#### Inferences:

![overview](week-3-img/inferences-gee.png)

![overview](week-3-img/inferences-gee-2.png)


## Revisiting the ESS Example 

![overview](week-3-img/revisit-ess.png)

**The dependency was introduced by the study design!**

![overview](week-3-img/revisit-ess-2.png)

### Interpretation

**Multilevel Model**
* In the multilevel model, because we have the explicit random interviewer effects in the model, we would say,
    * For a given interviewer, a 1 unit increase in trust in police leads to a 0.12 expected increase in helpfulness

**Marginal Model**
* When we are fitting the overall model across all the interviewers, and we are not explicitly conditioning on these random interviewer effects, we would make our interpretation as follows:
    * **Across all interviewers** (not conditional on any 1 interviewer) a 1 unit increase in trust in police will lead to an expected 0.04 increase in helpfulness.

### Model Diagnostics

* We assumed a constant correlation within interviewers, and our nuisance estimate of that correlation was 0.05.
* The corresponding QIC for this model is 6790.61
    * By itself, the QIC value isn't that useful.
    * We use that value to compare the fit of different models

* Using an unstructured or first-order autoregressive structure doesn't make a lot of sense, as there is **no explicit time ordering** of the cross-sectional observations within each interviewer.
    * We can't really use an **autoregressive** structure as there is no time element. We can't say things like "the observations on persons 1 and 2 have a stronger correlation than the observations on persons 1 and 10."
    * The same applies to the **unstructured correlation structure**
        * It's hard to justify why we would expect different pairwise correlations between observations within the same interviewer depending on when those observations were collected, as there is no time element here.

![overview](week-3-img/what-about-independent-model.png)

## Conclusion

* Marginally, when looking at the overall relationship across interviewers, we did not find as much evidence of a relationship between trust in police and perceived helpfulness as we did when controlling for the interviewer effects explicitly.
* Accounting for the dependency, rather than assuming independence of observations within each interviewer, did seem to improve model fit slightly.
    * **When using GEE, comparing models is very important**

#### Remember:
**When fitting marginal models, we can no longer make inferences about the between-cluster variance.** <br>
We are just controlling for the nuisance correlation within clusters when making inferences about the fixed effects of the variables of interest.

# 1
A co-worker is analyzing longitudinal data that were collected at equally-spaced time points. They ask you for help reviewing the output from several marginal linear regression models fitted in Python, where each model had a different working correlation structure for capturing the correlations of observations within individuals. The QIC values for the models follow: Independent = 5689; First-order Autoregressive = 5570; Exchangeable = 5643; Unstructured = 5612. Which model would you recommend that your colleague interpret?


1. Independent
2. First-order Autoregressive
3. Exchangeable
4. Unstructured

**Answer:** 2. First-order Autoregressive. This model has the lowest QIC value (5570), suggesting the best fit.
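
The comparison in this question amounts to picking the structure with the smallest QIC. A tiny sketch using the values from the question (the dictionary itself is just an illustration):

```python
# QIC values as given in the question, keyed by working correlation structure.
qic = {
    "Independent": 5689,
    "First-order Autoregressive": 5570,
    "Exchangeable": 5643,
    "Unstructured": 5612,
}

# Lower QIC suggests a better fit, so take the key with the minimum value.
best = min(qic, key=qic.get)
print(best, qic[best])
```

Here the lowest QIC also matches a structure that makes sense for the design, since the data are longitudinal with equally spaced time points.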