# Chapter 10: Random Variability

There are two qualitatively different reasons why causal inferences may be wrong: systematic bias and random variability.

The previous 3 chapters described 3 types of systematic biases:
- selection bias
- measurement bias
- unmeasured confounding

This chapter discusses random variability and how to deal with it.



## 10.1 Identification Versus Estimation

Suppose we want to make inferences about the super-population probability $Pr[Y=1|A=a]$.

We refer to the parameter of interest in the super-population, the probability $Pr[Y=1|A=a]$ in this case, as the *estimand*.

An *estimator* is a rule that takes the data from any sample from the super-population and produces a numerical value for the estimand.

This numerical value for a particular sample is the *estimate* from the sample.

The sample proportion of individuals who develop the outcome among those receiving treatment level $a$, $\hat{Pr}[Y=1|A=a]$, is an estimator of the super-population probability $Pr[Y=1|A=a]$.

The value of $\hat{Pr}[Y=1|A=a]$ in a particular sample is the *point estimate*. Of course, this value depends on which individuals happen to be randomly sampled from the super-population.
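The distinction between estimator and estimate can be sketched in code: the estimator is a function, and each estimate is the number that function returns for one sample. The super-population risks in `TRUE_RISK`, the sample size of 200, and the helper names `draw_sample` and `sample_proportion` are assumptions made up for this illustration.

```python
import random

# Hypothetical super-population probabilities Pr[Y=1|A=a] (made up for illustration).
TRUE_RISK = {0: 0.2, 1: 0.3}

def draw_sample(n, rng):
    """Draw n individuals as (a, y) pairs from the hypothetical super-population."""
    sample = []
    for _ in range(n):
        a = rng.choice([0, 1])                        # treatment level
        y = 1 if rng.random() < TRUE_RISK[a] else 0   # outcome
        sample.append((a, y))
    return sample

def sample_proportion(sample, a):
    """Estimator: proportion with Y=1 among individuals with A=a."""
    outcomes = [y for (level, y) in sample if level == a]
    if not outcomes:
        raise ValueError(f"no individuals with A={a} in the sample")
    return sum(outcomes) / len(outcomes)

rng = random.Random(0)
for i in range(3):
    # Same estimator, different random samples, different estimates.
    estimate = sample_proportion(draw_sample(200, rng), a=1)
    print(f"sample {i}: estimate of Pr[Y=1|A=1] = {estimate:.3f}")
```

Each pass through the loop applies the same rule to a new random sample, so the printed estimates differ from one another even though the estimand stays fixed.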

An estimator is *consistent* for a particular estimand if its estimates get arbitrarily close to the parameter as the sample size increases.
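Consistency can be illustrated with a small simulation: for a sample proportion, the typical distance between estimate and estimand shrinks as the sample size grows. This is only a sketch; the value `TRUE_P = 0.3`, the sample sizes, and the number of repetitions are arbitrary assumptions.

```python
import random

TRUE_P = 0.3  # hypothetical estimand, chosen only for illustration

def estimate(n, rng):
    """Sample proportion from n independent draws with success probability TRUE_P."""
    return sum(rng.random() < TRUE_P for _ in range(n)) / n

def mean_abs_error(n, reps, rng):
    """Average |estimate - TRUE_P| over reps independent samples of size n."""
    return sum(abs(estimate(n, rng) - TRUE_P) for _ in range(reps)) / reps

rng = random.Random(1)
for n in (10, 100, 1000):
    print(n, round(mean_abs_error(n, 200, rng), 4))
```

A simulation like this cannot prove consistency, which is a statement about the limit, but it shows the behavior the definition describes.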

In the absence of systematic biases, statistical theory allows one to quantify the confidence in the point estimate in the form of a confidence interval (CI) around the point estimate.

A common way to construct a 95% CI is the 95% Wald confidence interval, which is centered at the point estimate.

First, estimate the standard error (SE) of the point estimate under the assumption that the study population is a random sample from a much larger super-population.

Second, calculate the upper limit of the 95% Wald confidence interval by adding 1.96 times the estimated SE to the point estimate, and the lower limit by subtracting 1.96 times the estimated SE from the point estimate.

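For example, consider the estimator $\hat{Pr}[Y=1|A=a]=\hat p$ of the super-population parameter $Pr[Y=1|A=a]=p$. Its standard error is
$$\sqrt{\frac{p(1-p)}{n}}$$
(the standard error of a binomial proportion), where $n$ is the number of sampled individuals with $A=a$. Because $p$ is unknown, the standard error is estimated by plugging in $\hat p$:
$$\hat{se}(\hat p)=\sqrt{\frac{\hat p(1-\hat p)}{n}}$$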

Recall that the Wald 95% CI for a parameter $\theta$ based on an estimator $\hat\theta$ is
$$\hat\theta\pm 1.96\times\hat{se}(\hat\theta)$$
where 1.96 is the 97.5th percentile (the 0.975 quantile) of the standard normal distribution, which has mean 0 and variance 1.
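As a minimal sketch, the Wald interval for a proportion can be computed as follows, assuming the standard error is estimated by plugging the sample proportion into the binomial formula, $\sqrt{\hat p(1-\hat p)/n}$. The function name `wald_ci` and the counts in the usage line (30 events among 100 individuals) are hypothetical.

```python
import math
from statistics import NormalDist

def wald_ci(successes, n, level=0.95):
    """Wald confidence interval for a proportion, using the plug-in binomial SE."""
    if n <= 0 or not 0 <= successes <= n:
        raise ValueError("need n > 0 and 0 <= successes <= n")
    p_hat = successes / n                            # point estimate
    se_hat = math.sqrt(p_hat * (1 - p_hat) / n)      # estimated standard error
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)    # about 1.96 when level = 0.95
    return p_hat - z * se_hat, p_hat + z * se_hat

lower, upper = wald_ci(30, 100)
print(f"95% Wald CI: ({lower:.3f}, {upper:.3f})")
```

Nothing in this formula keeps the limits inside $[0, 1]$, and the interval collapses to a single point when the sample proportion is 0 or 1. Both are symptoms of the small-sample problems of Wald intervals.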

A 95% CI is *calibrated* if the estimand is contained in the interval in exactly 95% of random samples, *conservative* if it is contained in more than 95% of random samples, and *anti-conservative* if it is contained in fewer than 95%. A CI is *valid* if it is either calibrated or conservative.

Note that a Wald CI is only guaranteed to be valid in large samples; in small samples it can be anti-conservative (Brown et al., 2001).
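For a binomial proportion, the coverage of the Wald interval can be computed exactly rather than simulated, by summing the binomial probabilities of every outcome whose interval contains the true parameter. The sketch below does this. The function names and the specific values of $n$ and $p$ are assumptions chosen for illustration, and the plug-in standard error $\sqrt{\hat p(1-\hat p)/n}$ is assumed.

```python
import math

def wald_contains(k, n, p, z=1.96):
    """True if the Wald interval built from k events in n trials contains p."""
    p_hat = k / n
    se_hat = math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - z * se_hat <= p <= p_hat + z * se_hat

def wald_coverage(n, p, z=1.96):
    """Exact probability that the Wald interval contains p when K ~ Binomial(n, p)."""
    return sum(
        math.comb(n, k) * p**k * (1 - p) ** (n - k)
        for k in range(n + 1)
        if wald_contains(k, n, p, z)
    )

for n in (10, 50, 400):
    print(n, round(wald_coverage(n, 0.1), 3), round(wald_coverage(n, 0.5), 3))
```

With a small sample and a parameter near 0, the computed coverage falls well below the nominal 95%, which makes the interval anti-conservative in the sense defined above. With a larger sample the coverage moves toward 95%.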

CIs are often classified as either *small-sample* or *large-sample* confidence intervals. Calibrated small-sample confidence intervals are sometimes called *exact* confidence intervals. A large-sample confidence interval is sometimes called an *asymptotic* confidence interval.

Not all consistent estimators can be used to center a valid Wald CI, even in large samples. Most users of statistics will consider an estimator unbiased if it can center a valid Wald interval and biased if it cannot. In this sense, *bias* can refer to the inability to center a valid Wald CI.

## 10.2 Estimation of Causal Effects

## 10.3 The Myth of the Super-Population

## 10.4 The Conditionality "Principle"

## 10.5 The Curse of Dimensionality