## 16. Inference about Independence

This chapter addresses two questions:

1. How do we test if two random variables are independent?
2. How do we estimate the strength of dependence between two random variables?

Recall we write $Y {\perp \!\!\! \perp} Z$ to mean that $Y$ and $Z$ are independent.

When $Y$ and $Z$ are not independent, we say they are **dependent** or **associated** or **related**.

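Note that dependence does not imply causation; an association may or may not reflect a causal effect:
- Smoking is related to heart disease, and here the relationship is causal: quitting smoking reduces the chance of heart disease.
- Owning a TV is related to a lower chance of starving, but here the relationship is not causal: giving a starving person a TV does not make them less hungry.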

### 16.1 Two Binary Variables

Suppose that both $Y$ and $Z$ are binary.  Consider a data set $(Y_1, Z_1), \dots, (Y_n, Z_n)$.  Represent the data as a two-by-two table:

$$
\begin{array}{c|cc|c} 
      & Y = 0  & Y = 1 & \\
\hline
Z = 0 & X_{00} & X_{01} & X_{0\cdot}\\
Z = 1 & X_{10} & X_{11} & X_{1\cdot}\\
 \hline
      & X_{\cdot 0} & X_{\cdot 1} & n = X_{\cdot \cdot}
\end{array}
$$

where $X_{ij}$ represents the number of observations where $(Z_k, Y_k) = (i, j)$.  The dotted subscripts denote sums, e.g. $X_{i\cdot} = \sum_j X_{ij}$.  Denote the corresponding probabilities by:

$$
\begin{array}{c|cc|c} 
      & Y = 0  & Y = 1 & \\
\hline
Z = 0 & p_{00} & p_{01} & p_{0\cdot}\\
Z = 1 & p_{10} & p_{11} & p_{1\cdot}\\
 \hline
      & p_{\cdot 0} & p_{\cdot 1} & 1
\end{array}
$$

where $p_{ij} = \mathbb{P}(Z = i, Y = j)$.  Let $X = (X_{00}, X_{01}, X_{10}, X_{11})$ denote the vector of counts.  Then $X \sim \text{Multinomial}(n, p)$ where $p = (p_{00}, p_{01}, p_{10}, p_{11})$.
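As a minimal sketch of how the table of counts arises from raw data, the snippet below tallies a small sample into $X_{ij}$, the marginal totals, and the relative frequencies $X / n$. The function name `two_by_two` and the sample pairs are hypothetical, chosen only for illustration.

```python
def two_by_two(pairs):
    """Tally X[i][j] = number of observations with (Z_k, Y_k) == (i, j)."""
    X = [[0, 0], [0, 0]]
    for y, z in pairs:          # each pair is (Y_k, Z_k), both in {0, 1}
        X[z][y] += 1            # rows index Z, columns index Y
    return X

# Hypothetical sample of (Y_k, Z_k) pairs, for illustration only.
data = [(0, 0), (1, 0), (1, 1), (1, 1), (0, 1), (0, 0)]

X = two_by_two(data)
row_totals = [sum(row) for row in X]                      # X_{i.}
col_totals = [X[0][j] + X[1][j] for j in (0, 1)]          # X_{.j}
n = sum(row_totals)                                       # X_{..}
p_hat = [[X[i][j] / n for j in (0, 1)] for i in (0, 1)]   # relative frequencies X / n
```

Rows index $Z$ and columns index $Y$, matching the layout of the tables above.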

The **odds ratio** is defined to be

$$ \psi = \frac{p_{00} p_{11}}{p_{01} p_{10}}$$

The **log odds ratio** is defined to be

$$ \gamma = \log \psi$$

**Theorem 16.2**.  The following statements are equivalent:

1. $Y {\perp \!\!\! \perp} Z$
2. $\psi = 1$
3. $\gamma = 0$
4. For $i, j \in \{ 0, 1 \}$, $p_{ij} = p_{i\cdot} p_{\cdot j}$
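The equivalence can be checked numerically. The sketch below, which uses hypothetical marginal probabilities, builds one table satisfying $p_{ij} = p_{i\cdot} p_{\cdot j}$ and one that does not, then computes $\psi$ for each. The helper name `odds_ratio` is an assumption made for this example.

```python
import math

def odds_ratio(p):
    """psi = p00 * p11 / (p01 * p10) for a 2x2 probability table p[i][j]."""
    return (p[0][0] * p[1][1]) / (p[0][1] * p[1][0])

# Independent case: build p_ij = p_i. * p_.j from hypothetical marginals.
pz = [0.3, 0.7]   # P(Z = 0), P(Z = 1)
py = [0.6, 0.4]   # P(Y = 0), P(Y = 1)
p_indep = [[pz[i] * py[j] for j in (0, 1)] for i in (0, 1)]

# Dependent case: probability mass concentrated on the diagonal.
p_dep = [[0.4, 0.1], [0.1, 0.4]]

psi_indep = odds_ratio(p_indep)      # equals 1 up to floating-point rounding
gamma_indep = math.log(psi_indep)    # equals 0 up to floating-point rounding
psi_dep = odds_ratio(p_dep)          # far from 1, so Y and Z are dependent
```

In the independent table the marginal factors cancel in the ratio, which is why $\psi = 1$ regardless of the marginals chosen.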

Now consider testing

$$
H_0: Y {\perp \!\!\! \perp} Z
\quad \text{versus} \quad
H_1: Y {\not\!\perp \!\!\! \perp} Z
$$

First consider the likelihood ratio test.  Under $H_1$, $X \sim \text{Multinomial}(n, p)$ and the MLE is $\hat{p} = X / n$.  Under $H_0$, again $X \sim \text{Multinomial}(n, p)$ but $p$ is subject to the constraint $p_{ij} = p_{i\cdot} p_{\cdot j}$.  This leads to the following test.

**Theorem 16.3 (Likelihood Ratio Test for Independence in a 2-by-2 table)**. 
Let

$$ T = 2 \sum_{i=0}^1 \sum_{j=0}^1 X_{ij} \log \left( \frac{X_{ij} X_{\cdot \cdot}}{X_{i \cdot} X_{\cdot j}} \right)$$

Under $H_0$, $T \leadsto \chi_1^2$.  Thus, an approximate level $\alpha$ test is obtained by rejecting $H_0$ when $T > \chi_{1, \alpha}^2$.
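The statistic $T$ can be computed directly from the counts. The following is a minimal sketch using only the Python standard library; the function name `lrt_independence`, the convention $0 \log 0 = 0$ for empty cells, and the example table are assumptions made for illustration. It relies on the fact that a $\chi_1^2$ variable is the square of a standard Normal, so the critical value is $z_{\alpha/2}^2$ and the tail probability can be written with the complementary error function.

```python
import math
from statistics import NormalDist

def lrt_independence(X, alpha=0.05):
    """Likelihood ratio test of independence for a 2x2 table of counts X[i][j]."""
    n = sum(sum(row) for row in X)
    rows = [sum(row) for row in X]                  # X_{i.}
    cols = [X[0][j] + X[1][j] for j in (0, 1)]      # X_{.j}
    T = 0.0
    for i in (0, 1):
        for j in (0, 1):
            if X[i][j] > 0:                         # assumed convention: 0 * log(0) = 0
                T += X[i][j] * math.log(X[i][j] * n / (rows[i] * cols[j]))
    T *= 2.0
    # chi-squared with 1 df is the square of a standard Normal, so
    # the upper-alpha critical value is z_{alpha/2}^2 and
    # P(chi2_1 > t) = P(|Z| > sqrt(t)) = erfc(sqrt(t / 2)).
    critical = NormalDist().inv_cdf(1 - alpha / 2) ** 2
    p_value = math.erfc(math.sqrt(max(T, 0.0) / 2))
    return T, critical, p_value

# Hypothetical table: rows are Z = 0, 1 and columns are Y = 0, 1.
T, critical, p_value = lrt_independence([[30, 10], [15, 25]])
reject = T > critical
```

The `max(T, 0.0)` guard only protects against a tiny negative value from rounding when the observed and expected counts coincide.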

**Theorem 16.4 (Pearson's $\chi^2$ test for Independence in a 2-by-2 table)**. Let

$$ U = \sum_{i=0}^1 \sum_{j=0}^1 \frac{(X_{ij} - E_{ij})^2}{E_{ij}} $$

where

$$ E_{ij} = \frac{X_{i\cdot} X_{\cdot j}}{n}$$

Under $H_0$, $U \leadsto \chi_1^2$.  Thus, an approximate level $\alpha$ test is obtained by rejecting $H_0$ when $U > \chi_{1, \alpha}^2$.

Here's the intuition for the Pearson test: Under $H_0$, $p_{ij} = p_{i \cdot} p_{\cdot j}$, so the MLE of $p_{ij}$ is $\hat{p}_{ij} = \hat{p}_{i \cdot} \hat{p}_{\cdot j} = \frac{X_{i \cdot}}{n} \frac{X_{\cdot j}}{n}$.  Thus, the expected number of observations in the $(i, j)$ cell is $E_{ij} = n \hat{p}_{ij} = \frac{X_{i \cdot} X_{\cdot j}}{n}$.  The statistic $U$ compares the observed and expected counts.
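The comparison of observed and expected counts can be sketched as follows. The function name `pearson_independence` and the example table are hypothetical, and the code assumes every row and column total is positive so that no $E_{ij}$ is zero.

```python
import math

def pearson_independence(X):
    """Pearson chi-squared statistic for independence in a 2x2 table of counts."""
    n = sum(sum(row) for row in X)
    rows = [sum(row) for row in X]                  # X_{i.}
    cols = [X[0][j] + X[1][j] for j in (0, 1)]      # X_{.j}
    # Expected counts under H0: E_ij = X_{i.} * X_{.j} / n
    E = [[rows[i] * cols[j] / n for j in (0, 1)] for i in (0, 1)]
    U = sum((X[i][j] - E[i][j]) ** 2 / E[i][j]
            for i in (0, 1) for j in (0, 1))
    # Tail probability of chi-squared with 1 df: P(chi2_1 > u) = erfc(sqrt(u / 2))
    p_value = math.erfc(math.sqrt(U / 2))
    return U, E, p_value

# Hypothetical table: rows are Z = 0, 1 and columns are Y = 0, 1.
U, E, p_value = pearson_independence([[30, 10], [15, 25]])
```

For this table the expected counts are 22.5 and 17.5 in each row, and every observed count differs from its expected count by 7.5.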

**Theorem 16.6**. The MLEs of $\psi$ and $\gamma$ are

$$
\hat{\psi} = \frac{X_{00} X_{11}}{X_{01} X_{10}}
, \quad
\hat{\gamma} = \log \hat{\psi}
$$

The asymptotic standard errors (computed from the delta method) are

$$
\begin{align}
\hat{\text{se}}(\hat{\gamma}) &= \sqrt{\frac{1}{X_{00}} + \frac{1}{X_{01}} + \frac{1}{X_{10}} + \frac{1}{X_{11}}}\\
\hat{\text{se}}(\hat{\psi}) &= \hat{\psi} \hat{\text{se}}(\hat{\gamma})
\end{align}
$$

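Yet another test of independence is the Wald test for $\gamma = 0$, based on $W = (\hat{\gamma} - 0) / \hat{\text{se}}(\hat{\gamma})$.  Under $H_0$, $W \leadsto N(0, 1)$, so an approximate level $\alpha$ test rejects $H_0$ when $|W| > z_{\alpha/2}$.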

A $1 - \alpha$ confidence interval for $\gamma$ is $\hat{\gamma} \pm z_{\alpha/2} \hat{\text{se}}(\hat{\gamma})$.

A $1 - \alpha$ confidence interval for $\psi$ can be obtained in two ways.  First, we could use $\hat{\psi} \pm z_{\alpha/2} \hat{\text{se}}(\hat{\psi})$.  Second, since $\psi = e^{\gamma}$ we could use  $\exp \{\hat{\gamma} \pm z_{\alpha/2} \hat{\text{se}}(\hat{\gamma})\}$.  This second method is usually more accurate.
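The estimates, the Wald statistic, and both intervals for $\psi$ can be sketched together. The code takes $\hat{\text{se}}(\hat{\gamma}) = \sqrt{\sum_{i,j} 1 / X_{ij}}$ and $\hat{\text{se}}(\hat{\psi}) = \hat{\psi} \, \hat{\text{se}}(\hat{\gamma})$, the standard delta-method forms. The function name `odds_ratio_inference` and the example table are hypothetical, and every cell count is assumed to be positive.

```python
import math
from statistics import NormalDist

def odds_ratio_inference(X, alpha=0.05):
    """Odds ratio estimates, Wald statistic, and confidence intervals for a 2x2 table.

    Assumes every cell count X[i][j] is positive.
    """
    psi_hat = (X[0][0] * X[1][1]) / (X[0][1] * X[1][0])
    gamma_hat = math.log(psi_hat)
    se_gamma = math.sqrt(sum(1.0 / X[i][j] for i in (0, 1) for j in (0, 1)))
    se_psi = psi_hat * se_gamma                     # delta method
    z = NormalDist().inv_cdf(1 - alpha / 2)         # z_{alpha/2}
    W = gamma_hat / se_gamma                        # Wald statistic for gamma = 0
    ci_gamma = (gamma_hat - z * se_gamma, gamma_hat + z * se_gamma)
    return {
        "psi_hat": psi_hat,
        "gamma_hat": gamma_hat,
        "se_gamma": se_gamma,
        "se_psi": se_psi,
        "W": W,
        "reject": abs(W) > z,
        "ci_gamma": ci_gamma,
        "ci_psi_direct": (psi_hat - z * se_psi, psi_hat + z * se_psi),
        "ci_psi_exp": (math.exp(ci_gamma[0]), math.exp(ci_gamma[1])),
    }

# Hypothetical table: rows are Z = 0, 1 and columns are Y = 0, 1.
result = odds_ratio_inference([[30, 10], [15, 25]])
```

For this table the direct interval for $\psi$ extends below 1 while the exponentiated interval does not. The exponentiated interval also can never contain negative values, which is one reason to prefer it for a quantity that is positive by definition.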