# 07 Weighting

The exchangeability assumption implies that $\mathbb E(Y^a) = \mathbb E(Y|A=a)$. However, exchangeability is not guaranteed in observational studies.


## Survey Weights


### Re-Weighting

If we could survey the whole population of $N$ units, the mean outcome would simply be the average: $\widehat{\mathbb E}(Y)= \frac 1N \sum_{i=1}^N Y_i$. In practice, we can only cover part of the population. Suppose we observe a sample of $n$ units. The natural estimator is then the sample mean
$$\widehat{\mathbb E}(Y) = \frac 1n \sum_{i=1}^n Y_i.$$

However, once confounding variables are taken into account, we need to be aware that the distribution of the confounders in the sample is generally not identical to their distribution in the general population. Therefore, we re-weight the sample so that its confounder distribution matches that of the general population.

Units whose confounder values are over-represented in the sample should be given less weight, and units whose confounder values are under-represented should be given more weight.
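The re-weighting idea can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: a single binary confounder $C$, population shares that are known, and a toy sample of ten units in which $C=1$ is over-represented. Each unit is weighted by the ratio of the population share to the sample share of its confounder value.

```python
import numpy as np

# Hypothetical population shares of a binary confounder C (assumed known).
pop_share = {0: 0.5, 1: 0.5}

# Hypothetical sample in which C=1 is over-represented (80% of the sample).
c = np.array([1, 1, 1, 1, 1, 1, 1, 1, 0, 0])
y = np.array([10.0] * 8 + [20.0] * 2)  # the outcome depends on C

# Share of each confounder value in the sample.
sample_share = {v: float(np.mean(c == v)) for v in pop_share}

# Weight = population share / sample share:
# over-represented values get weight < 1, under-represented values get weight > 1.
w = np.array([pop_share[int(v)] / sample_share[int(v)] for v in c])

naive_mean = float(y.mean())
weighted_mean = float(np.sum(w * y) / np.sum(w))

print("unweighted sample mean:", naive_mean)
print("re-weighted mean:      ", weighted_mean)
```

In this toy setup the population mean is $0.5\cdot 10 + 0.5\cdot 20 = 15$. The unweighted sample mean is pulled towards the over-represented group, while the re-weighted mean recovers the population value.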

### Horvitz-Thompson Estimator

Suppose unit $w_i\ (i=1,\dotsc,N)$ is sampled with probability $p_i$, and the sample size is $n \ (n<N)$. Then the Horvitz-Thompson estimator is
$$\widehat{\mathbb E}(Y) =\frac 1n \sum_{i\ {\rm is \ sampled}}\frac{1}{p_i}\cdot \frac{n}{N}\cdot Y_i= \frac{1}{N}\sum_{i=1}^N \mathbb I_{i\ {\rm is \ sampled}}\cdot \frac{Y_i}{p_i}.$$

The weight $\frac{1}{p_i}\cdot \frac{n}{N}$ corrects for the unequal sampling probabilities. The estimator is unbiased.

**Proof** Since the $Y_i$ are fixed and $\mathbb E(\mathbb I_{i\ {\rm is \ sampled}}) = p_i$, $$\mathbb E\left[\widehat{\mathbb E}(Y) \right]= \frac{1}{N}\sum_{i=1}^N \mathbb E(\mathbb I_{i\ {\rm is \ sampled}})\cdot \frac{Y_i}{p_i}= \frac{1}{N}\sum_{i=1}^N Y_i.$$

<br>

We can see that the Horvitz-Thompson estimator reduces to the sample mean when every unit has the same probability of being sampled, i.e. $p_1 = \dotsc = p_N = \frac {n}{N}$.
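Unbiasedness can be checked numerically with a small simulation. The sketch below is illustrative only: the population values, the inclusion probabilities, and the sampling scheme (each unit is included independently with probability $p_i$, so the sample size is random) are all assumptions. For that reason it uses the second form of the estimator, $\frac{1}{N}\sum_i \mathbb I_{i\ {\rm is \ sampled}}\cdot \frac{Y_i}{p_i}$.

```python
import numpy as np

rng = np.random.default_rng(0)

N = 1000
y = rng.normal(loc=50.0, scale=10.0, size=N)  # hypothetical fixed population outcomes

# Hypothetical inclusion probabilities that increase with the outcome,
# so units with large Y_i are over-sampled.
p = np.clip(0.1 + 0.4 * (y - y.min()) / (y.max() - y.min()), 0.05, 0.95)


def horvitz_thompson(y, p, sampled):
    """Second form of the estimator: (1/N) * sum over sampled units of Y_i / p_i."""
    return np.sum(y[sampled] / p[sampled]) / len(y)


ht_estimates, naive_estimates = [], []
for _ in range(2000):
    sampled = rng.random(N) < p  # independent Bernoulli(p_i) inclusion indicators
    ht_estimates.append(horvitz_thompson(y, p, sampled))
    naive_estimates.append(y[sampled].mean())

true_mean = float(y.mean())
ht_avg = float(np.mean(ht_estimates))
naive_avg = float(np.mean(naive_estimates))

print("population mean:          ", true_mean)
print("average HT estimate:      ", ht_avg)
print("average unweighted mean:  ", naive_avg)
```

Averaged over many repeated samples, the Horvitz-Thompson estimate should sit close to the population mean, whereas the unweighted sample mean is biased upwards because high-outcome units are sampled more often.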

## g-Formula

### Re-Weighting

We compare the g-formula with the idea behind the Horvitz-Thompson estimator. If we want to estimate $\mathbb E(Y^a)$, a natural candidate is $\mathbb E(Y|A=a)$, which equals $\mathbb E(Y^a)$ when exchangeability holds:

$$\mathbb E(Y|A=a) = \sum_{c\in C}\mathbb E(Y|C=c,A=a)\mathbb P(C=c|A=a) .$$

However, when exchangeability does not hold, the probability that a unit with attribute $c$ is treated, $\mathbb P(A=1|C=c) = e(c)$, depends on $c$ and generally differs from the marginal probability $\mathbb P(A=1)$. Therefore, assuming exchangeability holds conditional on $C$, we assign the inverse weight $\frac{1}{e(c)}\cdot \mathbb P(A=1)$ to each treated unit with attribute $c$:

$$\begin{aligned}\mathbb E(Y^1) &= \sum_{c\in C}\mathbb E(Y|C=c,A=1)\mathbb P(C=c|A=1)\cdot  \frac{\mathbb P(A=1)}{e(c)}
\\&= \sum_{c\in C}\mathbb E(Y|C=c,A=1)\mathbb P(C=c).
\end{aligned}$$

The second equality follows from Bayes' rule: $\mathbb P(C=c|A=1)\,\mathbb P(A=1) = \mathbb P(A=1|C=c)\,\mathbb P(C=c) = e(c)\,\mathbb P(C=c)$.
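The identity can be verified exactly on a small discrete example. The numbers below are hypothetical: a confounder $C$ with three levels, a propensity $e(c)$ for each level, and conditional means $\mathbb E(Y|C=c,A=1)$. The sketch computes the unadjusted mean $\mathbb E(Y|A=1)$, the re-weighted sum, and the standardized sum $\sum_c \mathbb E(Y|C=c,A=1)\mathbb P(C=c)$.

```python
import numpy as np

# Hypothetical discrete confounder with three levels.
p_c = np.array([0.5, 0.3, 0.2])    # P(C=c)
e_c = np.array([0.2, 0.5, 0.8])    # e(c) = P(A=1 | C=c)
mu1_c = np.array([1.0, 2.0, 4.0])  # E(Y | C=c, A=1)

p_a1 = float(np.sum(e_c * p_c))    # P(A=1), by the law of total probability
p_c_given_a1 = e_c * p_c / p_a1    # P(C=c | A=1), by Bayes' rule

# Unadjusted mean among the treated: sum_c E(Y|C=c,A=1) P(C=c|A=1).
unadjusted = float(np.sum(mu1_c * p_c_given_a1))

# Same sum with the weight P(A=1) / e(c) applied to each term.
reweighted = float(np.sum(mu1_c * p_c_given_a1 * p_a1 / e_c))

# Standardization over the marginal distribution of C.
standardized = float(np.sum(mu1_c * p_c))

print("E(Y | A=1):       ", unadjusted)
print("re-weighted sum:  ", reweighted)
print("standardized sum: ", standardized)
```

The re-weighted and standardized sums agree, while the unadjusted mean differs because treated units are concentrated in the levels of $C$ with large $e(c)$.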

### Estimator 

Just as the Horvitz-Thompson estimator has the second form $\frac{1}{N}\sum_{i=1}^N \mathbb I_{i\ {\rm is \ sampled}}\cdot \frac{Y_i}{p_i}$, the g-formula can also be written as a weighted average. It sums over the whole sample but keeps only the treated units:

$$\widehat{\mathbb E}(Y^1) = \frac{1}{N}\sum_{i=1}^N \mathbb I_{A_i=1}\cdot \frac{Y_i}{e(c_i)}.$$


**Proof** We will show that it is unbiased. By consistency, $Y_i = Y_i^1$ whenever $A_i = 1$, so

$$\begin{aligned}\mathbb E\left[\widehat{\mathbb E}(Y^1) \right]&=  \frac{1}{N}\sum_{i=1}^N \mathbb E\left[ \mathbb I_{A_i=1}\cdot \frac{Y_i}{e(c_i)}\right]=   \frac{1}{N}\sum_{i=1}^N \frac{\mathbb P(A_i=1|C=c_i)\,Y_i^1}{e(c_i)}
=   \frac{1}{N}\sum_{i=1}^N Y_i^1 =\mathbb E(Y^1).\end{aligned}
$$

<br>


As a result, the average causal effect (ACE) can be estimated by

$$\widehat{\text{ACE}} =  \frac{1}{N}\sum_{i=1}^N \mathbb I_{A_i=1}\cdot \frac{Y_i}{e(c_i)}\ -\ \frac{1}{N}\sum_{i=1}^N \mathbb I_{A_i=0}\cdot \frac{Y_i}{1-e(c_i)}.$$
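A simulated example may make this estimator more concrete. The data-generating process below is entirely hypothetical: a binary confounder $C$, a propensity $e(c)$ that is assumed known (in practice it would usually have to be estimated), and potential outcomes constructed so that the true effect equals 1. The sketch compares the weighted estimator with the unadjusted difference in group means.

```python
import numpy as np

rng = np.random.default_rng(1)
N = 200_000

# Hypothetical binary confounder and a known propensity e(c) = P(A=1 | C=c).
c = rng.integers(0, 2, size=N)
e = np.where(c == 1, 0.8, 0.2)
a = (rng.random(N) < e).astype(int)  # treatment assigned with probability e(c)

# Hypothetical potential outcomes: C raises the outcome, treatment adds 1.
y0 = 2.0 * c + rng.normal(size=N)
y1 = y0 + 1.0                        # true effect is 1 for every unit
y = np.where(a == 1, y1, y0)         # consistency: we observe Y^A

# Inverse-probability-weighted estimate of E(Y^1) - E(Y^0).
ipw_ace = float(np.mean(a * y / e) - np.mean((1 - a) * y / (1 - e)))

# Unadjusted difference in means, which is confounded by C.
naive = float(y[a == 1].mean() - y[a == 0].mean())

print("IPW estimate:        ", ipw_ace)
print("unadjusted contrast: ", naive)
```

Under these assumptions the weighted estimate should be close to the true effect of 1, while the unadjusted contrast is inflated because treated units are more likely to have $C=1$.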

## Longitudinal Data