## 16.4 Collapsibility

We expect the effect estimates for a treatment to change when we adjust for an important confounder. Conversely, when we adjust for a variable which is not a confounder, intuitively we do not expect the estimated treatment effect to change. However, it turns out that this intuition is only correct in certain situations.

In this section we will look at a property of certain estimands which is called ‘collapsibility’. For this we consider a modified DAG as shown in Figure 2. Compared with the DAG in Figure 1 the arrow from Z to X has been removed, indicating an assumption that Z does not affect X. A situation such as this would arise if X is a randomized treatment in a randomized trial, for example.

```{figure} Images/Session 16 Figure 2.jpg
---
height: 300px
name: Figure 2
---
```

We consider the scenario depicted in Figure 2 first for a continuous outcome and then for a binary outcome.

### 16.4.1 Continuous outcome

We will use simulated data in this section. Data were generated on Y, X and Z for 4000 individuals, with 1000 individuals in each of the four groups (X = 0, Z = 0), (X = 0, Z = 1), (X = 1, Z = 0), (X = 1, Z = 1). The outcome Y was generated using the linear model

<center>$Y = 10 + 2X + Z + \epsilon $</center>
<div style="text-align: right"> (15) </div>
    
where the residuals $\epsilon$ follow a normal distribution with mean 0 and variance 1. The data conform to the assumptions in Figure 2: there is no (marginal) association between X and Z, but both X and Z affect Y.
    
Suppose we are interested in the effect of X on Y. As in the previous section, the conditional expectations E(Y | do(X = x), Z = z) can be estimated using E(Y | do(X = x), Z = z) = E(Y | X = x, Z = z). Unlike in the previous section, here there is no confounding by Z, and so the marginal expectation can be written as E(Y | do(X = x)) = E(Y | X = x). That is, in this situation it is legitimate to estimate the marginal treatment effect without having to use standardization (and using standardization will give the same result).
    
Results from two linear regression models are shown below. The regression of Y on X alone provides an estimate of the marginal treatment effect E(Y | do(X = 1)) − E(Y | do(X = 0)) of 1.98. The regression of Y on X and Z provides an estimate of the conditional treatment effect E(Y | do(X = 1), Z = z) − E(Y | do(X = 0), Z = z) of 1.98, which is assumed by the model to be the same for Z = 0 and Z = 1 (and which we know is true because of how the data were simulated; you can check this in the practical using the simulated data by including an interaction term).
    
The marginal and conditional effect estimates are identical, which we expect because there is no confounding by Z.

```{figure} Images/Session 16 Figure 3.jpg
---
height: 300px
name: Figure 3
---
```

Coefficients in the linear regression model, i.e. mean differences, are described as being ‘collapsible’. This means that when there is no effect of Z on X, the marginal treatment effect (which is ‘collapsed’ over Z) is the same as the treatment effect conditional on Z. The implication is that if we adjust for a variable Z and see that the coefficient for X does not change, then Z does not confound the association between X and Y. Note that the standard errors are different: adjusting for Z, which is predictive of Y, reduces the residual variance and hence the standard error of the coefficient for X.
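
The comparison of the two regressions can be reproduced with a short simulation. The following is a minimal sketch, assuming NumPy is available; the seed and the helper `ols` are illustrative choices and not part of the course materials, so the estimates will not match the figure exactly.

```python
import numpy as np

rng = np.random.default_rng(seed=16)  # arbitrary seed, chosen for reproducibility

# Balanced design: 1000 individuals in each of the four (X, Z) groups
x = np.repeat([0.0, 0.0, 1.0, 1.0], 1000)
z = np.repeat([0.0, 1.0, 0.0, 1.0], 1000)
y = 10 + 2 * x + z + rng.normal(0.0, 1.0, size=x.size)


def ols(outcome, *covariates):
    """Ordinary least squares: coefficients and standard errors, intercept first."""
    design = np.column_stack([np.ones_like(outcome), *covariates])
    coef, *_ = np.linalg.lstsq(design, outcome, rcond=None)
    resid = outcome - design @ coef
    sigma2 = resid @ resid / (len(outcome) - design.shape[1])
    se = np.sqrt(sigma2 * np.diag(np.linalg.inv(design.T @ design)))
    return coef, se


coef_marg, se_marg = ols(y, x)      # Y on X alone
coef_cond, se_cond = ols(y, x, z)   # Y on X and Z

print("marginal    coefficient for X:", coef_marg[1], "SE:", se_marg[1])
print("conditional coefficient for X:", coef_cond[1], "SE:", se_cond[1])
```

Because the design is exactly balanced, X and Z are uncorrelated in the sample, so the two coefficients for X agree up to floating-point error, while the standard error is smaller in the model that also includes Z.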

We can show this result algebraically. Consider the linear regression model

<center>$Y = \beta_{0} + \beta_{X}X + \beta_{Z}Z + \epsilon $</center>
<div style="text-align: right"> (16) </div>
    
Under this model, the expectation of Y given X is
    
<center>$E(Y |X) = \beta_{0} + \beta_{X}X + \beta_{Z}E(Z|X) $</center>
<div style="text-align: right"> (17) </div>
    
If X and Z are marginally independent, as in Figure 2, then E(Z|X) = E(Z), and we use the notation E(Z) = $\mu_{Z}$. Then we have

<center>$E(Y |X) = \beta_{0} + \beta_{Z}\mu_{Z} + \beta_{X}X $</center>
<div style="text-align: right"> (18) </div>
    
Therefore, if we fit the regression model for Y with X as the only covariate, the coefficient for X is identical to the coefficient for X in the model which adjusts for Z (i.e. $\beta_{X}$), if X and Z are marginally independent. Note that the intercept changes from $\beta_{0}$ to $\beta_{0} + \beta_{Z}\mu_{Z}$.
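
As a numerical check of equation (18), the sketch below uses the parameter values from the simulation in Section 16.4.1 ($\beta_{0} = 10$, $\beta_{X} = 2$, $\beta_{Z} = 1$, $\mu_{Z} = 0.5$) and regresses the noise-free mean of Y on X alone. It assumes NumPy is available; the variable names are illustrative.

```python
import numpy as np

# Balanced design in which X and Z are independent, so E(Z | X) = E(Z) = 0.5
x = np.repeat([0.0, 0.0, 1.0, 1.0], 1000)
z = np.repeat([0.0, 1.0, 0.0, 1.0], 1000)

beta0, beta_x, beta_z = 10.0, 2.0, 1.0
mean_y = beta0 + beta_x * x + beta_z * z  # E(Y | X, Z), with no residual noise

# Regress the mean outcome on X only (intercept plus X)
design = np.column_stack([np.ones_like(x), x])
(intercept, slope), *_ = np.linalg.lstsq(design, mean_y, rcond=None)

print("intercept:", intercept)  # equation (18) predicts beta0 + beta_z * mu_z = 10.5
print("slope:", slope)          # equation (18) predicts beta_x = 2
```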


### 16.4.2 Binary outcome

Next, we investigate the setting with a binary outcome Y, considering the example data in Table 4.

```{figure} Images/Session 16 Table 4.jpg
---
height: 300px
name: Table 4
---
```

Earlier in this session we considered three ways of measuring the treatment effect for a binary outcome: a risk difference, a risk ratio, and an odds ratio. First, let’s consider the odds ratio. The conditional odds ratios in the Z = 0 and Z = 1 groups are

<center>$\frac{Pr(Y = 1|do(X = 1), Z = 0) Pr(Y = 0|do(X = 0), Z = 0)}{Pr(Y = 1|do(X = 0), Z = 0) Pr(Y = 0|do(X = 1), Z = 0)} = \frac{500 \times 900}{100 \times 500} = 9 $</center>
<div style="text-align: right"> (19) </div>
    
<center>$\frac{Pr(Y = 1|do(X = 1), Z = 1) Pr(Y = 0|do(X = 0), Z = 1)}{Pr(Y = 1|do(X = 0), Z = 1) Pr(Y = 0|do(X = 1), Z = 1)} = \frac{900 \times 500}{500 \times 100} = 9 $</center>
<div style="text-align: right"> (20) </div>
    
The marginal odds ratio is
    
<center>$\frac{Pr(Y = 1|do(X = 1)) Pr(Y = 0|do(X = 0))}{Pr(Y = 1|do(X = 0)) Pr(Y = 0|do(X = 1))} = \frac{1400 \times 1400}{600 \times 600} = 5.44 $</center>
<div style="text-align: right"> (21) </div>
    
So the conditional odds ratio is equal to 9 in both Z groups, telling us there is no interaction between X and Z on the odds ratio scale. However, the marginal odds ratio is 5.44. Odds ratios are ‘non-collapsible’, meaning that even if Z is not a confounder, the marginal and conditional odds ratios for X will generally be different. In this example they are quite different in magnitude (5.44 compared with 9). The implication of this is that if we compare the coefficient for X in a logistic regression of Y on X with that from a logistic regression of Y on X and Z, a change in the coefficient (a log odds ratio) does not necessarily suggest that Z is a confounder. Due to the non-collapsibility of odds ratios, we expect the coefficient for X to change even when Z is not a confounder.
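
The odds ratio calculations can be reproduced in a few lines. This is a minimal sketch: the event counts are those implied by equations (19) to (21), with 1000 individuals in each (X, Z) group, and the function names are illustrative.

```python
# Number of individuals with Y = 1 in each (X, Z) group of size 1000,
# as implied by the counts used in equations (19) to (21)
events = {(0, 0): 100, (1, 0): 500, (0, 1): 500, (1, 1): 900}
n_per_group = 1000


def odds(n_events, n_total):
    """Odds of the outcome: events divided by non-events."""
    return n_events / (n_total - n_events)


def conditional_odds_ratio(z):
    """Odds ratio for X within the stratum Z = z."""
    return odds(events[(1, z)], n_per_group) / odds(events[(0, z)], n_per_group)


def marginal_odds_ratio():
    """Odds ratio for X after collapsing the table over Z."""
    treated = events[(1, 0)] + events[(1, 1)]
    untreated = events[(0, 0)] + events[(0, 1)]
    return odds(treated, 2 * n_per_group) / odds(untreated, 2 * n_per_group)


print("conditional OR, Z = 0:", round(conditional_odds_ratio(0), 2))
print("conditional OR, Z = 1:", round(conditional_odds_ratio(1), 2))
print("marginal OR:          ", round(marginal_odds_ratio(), 2))
```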
    
Next consider the risk difference. The conditional risk differences in the Z = 0 and Z = 1 groups are
    
<center>$Pr(Y = 1|do(X = 1), Z = 0) - Pr(Y = 1|do(X = 0), Z = 0) = \frac{500}{1000} - \frac{100}{1000} = 0.4 $</center>
<div style="text-align: right"> (22) </div>
    
<center>$Pr(Y = 1|do(X = 1), Z = 1) - Pr(Y = 1|do(X = 0), Z = 1) = \frac{900}{1000} - \frac{500}{1000} = 0.4 $</center>
<div style="text-align: right"> (23) </div>
    
The marginal risk difference, which we can estimate without the use of standardization because of the lack of an arrow from Z to X in Figure 2, is
    
<center>$Pr(Y = 1|do(X = 1)) - Pr(Y = 1|do(X = 0)) = \frac{1400}{2000} - \frac{600}{2000} = 0.4 $</center>
<div style="text-align: right"> (24) </div>
    
The conditional risk difference is the same in both Z groups, suggesting no interaction between X and Z on the risk difference scale, and also the conditional risk differences are equal to the marginal risk difference. Risk differences are collapsible.
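
One way to see why is sketched here, under the assumption (as in Figure 2) that Z is not affected by X and is independent of X. The marginal risk under each treatment is the average of the stratum-specific risks over the distribution of Z:

<center>$Pr(Y = 1|do(X = x)) = \sum_{z} Pr(Y = 1|do(X = x), Z = z) Pr(Z = z) $</center>

Taking the difference between x = 1 and x = 0, and using Pr(Z = 0) = Pr(Z = 1) = 0.5 in this example, gives

<center>$Pr(Y = 1|do(X = 1)) - Pr(Y = 1|do(X = 0)) = \sum_{z} \{Pr(Y = 1|do(X = 1), Z = z) - Pr(Y = 1|do(X = 0), Z = z)\} Pr(Z = z) = 0.4 \times 0.5 + 0.4 \times 0.5 = 0.4 $</center>

The marginal risk difference is therefore a weighted average of the conditional risk differences, so when these are all equal the marginal value must match them. No such averaging holds for the odds ratio, because the odds are a non-linear function of the risk.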
    
Finally, consider the risk ratio. The conditional risk ratios are
    
<center>$Pr(Y = 1|do(X = 1), Z = 0)/Pr(Y = 1|do(X = 0), Z = 0) = \frac{500/1000}{100/1000} = 5$</center>
<div style="text-align: right"> (25) </div>    
    
<center>$Pr(Y = 1|do(X = 1), Z = 1)/Pr(Y = 1|do(X = 0), Z = 1) = \frac{900/1000}{500/1000} = 1.8$</center>
<div style="text-align: right"> (26) </div>    
    
and the marginal risk ratio is

<center>$Pr(Y = 1|do(X = 1))/Pr(Y = 1|do(X = 0)) = \frac{1400/2000}{600/2000} = 2.33 $</center>
<div style="text-align: right"> (27) </div>  
    
In this case the risk ratio differs in the Z = 0 and Z = 1 groups - that is, there is an interaction between X and Z. Interactions cannot be depicted in the DAG.
    
Interactions are scale-dependent, meaning that we can have an interaction between X and Z when we measure the treatment effect on one scale but not on another. In this example there is an interaction on the risk ratio scale but not on the risk difference scale or the odds ratio scale. This is actually quite an unusual example in that there is no interaction on either the risk difference or odds ratio scale. In general, if there is no interaction on one scale then there are interactions on the other two scales. Risk ratios are in fact collapsible; however, collapsibility is only a relevant concept when there is no X-by-Z interaction.
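
The scale dependence can be seen by computing all three measures side by side. This is a minimal sketch using the risks implied by equations (22) to (27); the names are illustrative. The final lines check a standard property of the risk ratio that is assumed here rather than stated above: when Z is independent of X, the marginal risk ratio is a weighted average of the conditional risk ratios, with weights proportional to the untreated risk in each stratum multiplied by Pr(Z = z).

```python
# Pr(Y = 1 | do(X = x), Z = z) for each (x, z), from equations (22) to (26)
risk = {(0, 0): 0.1, (1, 0): 0.5, (0, 1): 0.5, (1, 1): 0.9}


def measures(p1, p0):
    """Risk difference, risk ratio and odds ratio comparing risk p1 with p0."""
    return {
        "RD": p1 - p0,
        "RR": p1 / p0,
        "OR": (p1 / (1 - p1)) / (p0 / (1 - p0)),
    }


by_stratum = {z: measures(risk[(1, z)], risk[(0, z)]) for z in (0, 1)}

# Z = 0 and Z = 1 are equally common in both treatment groups
marginal = measures(
    (risk[(1, 0)] + risk[(1, 1)]) / 2,
    (risk[(0, 0)] + risk[(0, 1)]) / 2,
)

for name in ("RD", "RR", "OR"):
    print(name, round(by_stratum[0][name], 2), round(by_stratum[1][name], 2),
          round(marginal[name], 2))

# Marginal RR as a weighted average of conditional RRs (equal Pr(Z = z) cancels)
weights = [risk[(0, 0)], risk[(0, 1)]]
weighted_rr = (
    weights[0] * by_stratum[0]["RR"] + weights[1] * by_stratum[1]["RR"]
) / sum(weights)
print("weighted average of conditional RRs:", round(weighted_rr, 2))
```

Only the risk ratio row differs between the two strata, and the weighted average of the two conditional risk ratios reproduces the marginal value.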


### 16.4.3 Implications for randomised controlled trials

The preceding results have implications for randomised trials with binary outcomes when the treatment effect is quantified using an odds ratio. In RCTs, baseline covariates are sometimes adjusted for. Given the non-collapsibility of odds ratios, the odds ratio for the treatment effect will differ depending on what baseline covariates are adjusted for. This is potentially problematic, as it means that, all other things being equal, different trials may be estimating different quantities simply due to differences in the covariates which are being adjusted for.

There is also the question of which treatment effect we should be interested in estimating. The marginal treatment effect is (arguably) the relevant quantity for making policy decisions, while conditional effects are more relevant for answering how effective a treatment will be for a particular individual (on the basis of the values of their covariates). Marginal quantities refer to a specific population and care should be taken to consider whether marginal estimates from a trial are transportable to a wider population in which a treatment could be used.

### 16.4.4 Implications for observational studies

One approach to deciding whether or not a variable is a confounder for an exposure’s effect on an outcome is to compare the estimated effect of the exposure before and after adjustment for the potential confounder. The above results show that when using logistic regression, we should be aware that a change could be attributable purely to the non-collapsibility of odds ratios. To determine whether this is the case, we could assess the association between the exposure and the potential confounder. If they are independent, any difference between the marginal and conditional odds ratios for the exposure could be attributable to non-collapsibility. In practice, non-collapsibility may not have a big impact on estimates. This is the case when a binary outcome is rare, because then the odds ratio approximates a risk ratio, which is collapsible.
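
The effect of outcome rarity can be illustrated numerically. The sketch below holds the conditional odds ratio at 9 in two equally sized strata of Z (independent of X) and compares the marginal odds ratio for the untreated risks of Section 16.4.2 with a hypothetical rare-outcome scenario; the rare risks 0.001 and 0.005 are made up for illustration.

```python
def odds(p):
    """Convert a risk to odds."""
    return p / (1 - p)


def risk_from_odds(o):
    """Convert odds back to a risk."""
    return o / (1 + o)


def marginal_odds_ratio(untreated_risks, conditional_or):
    """Marginal OR when every Z stratum is equally common, Z is independent
    of X, and the conditional OR is the same in every stratum."""
    treated_risks = [
        risk_from_odds(conditional_or * odds(p0)) for p0 in untreated_risks
    ]
    p1 = sum(treated_risks) / len(treated_risks)
    p0 = sum(untreated_risks) / len(untreated_risks)
    return odds(p1) / odds(p0)


common = marginal_odds_ratio([0.1, 0.5], 9.0)    # risks from Section 16.4.2
rare = marginal_odds_ratio([0.001, 0.005], 9.0)  # hypothetical rare outcome

print("common outcome, marginal OR:", round(common, 2))
print("rare outcome, marginal OR:  ", round(rare, 2))
```

With the common outcome the marginal odds ratio is far from the conditional value of 9, whereas with the rare outcome it is only slightly below it.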
