The *New York Times* (January 8, 2004, page A12)
reported the following data on death sentencing and race,
from a study in Maryland:

|               | No Death Sentence | Death Sentence |
| ------------- | ----------------- | -------------- |
| Black Victim  | 641               | 14             |
| White Victim  | 594               | 62             |

Analyze the data using the tools from this chapter.
Interpret the results.
Explain why, based only on this information, you can't make causal conclusions.
(The authors of the study did use much more information in their full report.)

In [1]:
from collections import namedtuple

import numpy as np
import scipy.stats

We have two binary variables, $Y$ and $Z$, defined as follows:
$$
\begin{align*}
    Y = 0 & \iff \text{No Death Sentence} & Z = 0 & \iff \text{Black Victim}\\
    Y = 1 & \iff \text{Death Sentence}    & Z = 1 & \iff \text{White Victim}
\end{align*}
$$
We choose this convention to be consistent with the notes,
where the odds ratio $\psi$ describes the impact of $Z$ on $Y$:
it tells us how much higher (or lower) the odds of $Y = 1$ are when $Z = 1$ than when $Z = 0$.
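
As a worked instance of this convention, here is a sketch that assumes the count table is indexed with rows by $Z$ and columns by $Y$, as in the data above. The notation $X_{zy}$ for the count with $Z = z$ and $Y = y$ is ours and may differ from the notes.
$$
\begin{align*}
    \psi &= \frac{P(Y=1 \mid Z=1)\,/\,P(Y=0 \mid Z=1)}{P(Y=1 \mid Z=0)\,/\,P(Y=0 \mid Z=0)}, &
    \hat{\psi} &= \frac{X_{00} X_{11}}{X_{01} X_{10}} = \frac{641 \cdot 62}{14 \cdot 594} \approx 4.78
\end{align*}
$$
Under independence $\psi = 1$, or equivalently the log odds ratio $\gamma = \log \psi$ equals $0$.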

In [3]:
# Manually enter the data
X = np.array([[641, 14], [594, 62]])

TestResult = namedtuple('TestResult', [
    'name', 'statistic', 'p_value'
])
ConfidenceInterval = namedtuple('ConfidenceInterval', [
    'interval', 'name'
])
OddsRatioMLE = namedtuple('OddsRatioMLE', [
    'psi', 'gamma', 'se_gamma'
])

def report_test_result(test_result):
    """
    Report out the p-value of a test for
    the independence of two binary random variables.
    """
    
    print(
        f"{test_result.name} test has a p-value of {test_result.p_value:.3}"
    )

def lrt_indep(X):
    """
    Perform the likelihood ratio test for
    the independence of two binary random variables.
    
    Return a TestResult instance.
    """
    
    # D_{ij} = X_{i.} X_{.j}
    D = np.tensordot(
        X.sum(axis=1),
        X.sum(axis=0),
        axes=0
    )
    
    # Test statistic
    T = 2*np.sum(X*np.log(X.sum()*X/D))
    
    # p-value
    pval = scipy.stats.chi2.sf(T, df=1)
    
    return TestResult('The likelihood ratio', T, pval)

def pearson_indep(X):
    """
    Perform Pearson's chi-squared test
    for the independence of two binary random variables.
    
    Return a TestResult instance.
    """
    
    # E_{ij} = X_{i.} X_{.j} / X_{..}
    E = np.tensordot(
        X.sum(axis=1),
        X.sum(axis=0),
        axes=0
    )/X.sum()
    
    # Test statistic
    U = np.sum((X-E)**2/E)
    
    # p-value
    pval = scipy.stats.chi2.sf(U, df=1)
    
    return TestResult("Pearson's chi-squared", U, pval)

def odds_ratio_mle(X):
    """
    Compute the MLE for the odds ratio and
    log odds ratio, as well as an estimate
    of the standard error for the log odds ratio.
    
    Return an OddsRatioMLE object.
    """
    
    # Maximum likelihood estimates
    # of the odds ratio and log odds ratio
    psi = (X[0,0]*X[1,1])/(X[0,1]*X[1,0])
    gamma = np.log(psi)
    
    # Estimate of the standard error of the log odds ratio:
    # the square root of the sum of the reciprocal cell counts
    se_gamma = np.sqrt(np.sum(1/X))
    
    return OddsRatioMLE(psi, gamma, se_gamma)

def wald_indep(X):
    """
    Perform the Wald test, using the log odds ratio,
    for the independence of two binary random variables.
    
    Return a TestResult instance.
    """
    
    # Maximum likelihood estimates
    # of the odds ratio and log odds ratio and
    # estimate of the standard error
    mle = odds_ratio_mle(X)
    
    # Test statistic
    W = mle.gamma/mle.se_gamma
    
    # p-value
    pval = 2*scipy.stats.norm.cdf(-np.abs(W))
    
    return TestResult('The Wald', W, pval)

def confidence_interval_odds_ratio(X, alpha=0.05):
    """
    Return a 1 - alpha confidence interval for the odds ratio psi,
    built on the log scale from the standard error of the **log** odds ratio
    and then exponentiated.
    """

    # Maximum likelihood estimates
    # of the odds ratio and log odds ratio and
    # estimate of the standard error
    mle = odds_ratio_mle(X)
    
    # Bounds of the confidence interval
    z = scipy.stats.norm.isf(alpha/2)
    lower_bound = np.exp(mle.gamma - z*mle.se_gamma)
    upper_bound = np.exp(mle.gamma + z*mle.se_gamma)
    
    return ConfidenceInterval(
        (lower_bound, upper_bound),
        'odds ratio'
    )

def report_interval(confidence_interval):
    """
    Report out a confidence interval
    """
    
    lower_bound, upper_bound = confidence_interval.interval
    print(
        f"Confidence interval for the {confidence_interval.name}: "
        f"({lower_bound:4.3}, {upper_bound:4.3})."
    )
    
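def analyze_independence(X):
    """
    Run all three independence tests on the 2x2 table X and
    report a confidence interval for the odds ratio.
    """
    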
    report_test_result(lrt_indep(X))
    report_test_result(pearson_indep(X))
    report_test_result(wald_indep(X))
    report_interval(confidence_interval_odds_ratio(X))

In [4]:
analyze_independence(X)

The likelihood ratio test has a p-value of 4.19e-09
Pearson's chi-squared test has a p-value of 1.46e-08
The Wald test has a p-value of 2.09e-07
Confidence interval for the odds ratio: (2.65, 8.63).
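
As an independent cross-check of the first two p-values, the same tests can be run through `scipy.stats.chi2_contingency`. This is a sketch rather than part of the original analysis: `correction=False` turns off the Yates continuity correction so that the statistic matches Pearson's $U$, and `lambda_="log-likelihood"` selects the likelihood ratio (G) statistic. The variable names are ours.

```python
import numpy as np
from scipy.stats import chi2_contingency

# Same table as above: rows are victim race (Black, White),
# columns are sentence (no death sentence, death sentence).
X = np.array([[641, 14], [594, 62]])

# Pearson's chi-squared test without the Yates continuity correction.
pearson_stat, pearson_p, dof, expected = chi2_contingency(X, correction=False)

# Likelihood ratio (G) test on the same table.
lrt_stat, lrt_p, _, _ = chi2_contingency(
    X, correction=False, lambda_="log-likelihood"
)

print(f"Pearson: U = {pearson_stat:.2f}, p-value = {pearson_p:.3}")
print(f"LRT:     T = {lrt_stat:.2f}, p-value = {lrt_p:.3}")
```

Both p-values should agree with the hand-rolled `pearson_indep` and `lrt_indep` up to floating-point error.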


## Analysis of the results
1. All three tests offer overwhelming evidence that the two random variables are
   **not** independent.
2. The confidence interval for the odds ratio lies entirely above 1.
   The point estimate from `odds_ratio_mle` is $\hat{\psi} \approx 4.78$:
   the *odds* of a death sentence are nearly five times higher when the victim
   is White than when the victim is Black.
3. We **cannot**, with these data alone, argue that there is a *causal*
   relation between the two variables: the association could be produced
   by confounding variables that do not appear in the table. (It could be,
   for example, that cases involving Black victims are more likely to be
   brought before a more lenient judge or jury.)
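
The odds ratio compares odds, not probabilities. The sketch below uses only the counts in the table above to compute the sample proportion of death sentences in each victim group, the ratio of those proportions, and the odds ratio. The variable names are ours.

```python
import numpy as np

# Rows: victim race (Black, White).
# Columns: (no death sentence, death sentence).
X = np.array([[641, 14], [594, 62]])

# Sample proportion of death sentences within each row.
p_death = X[:, 1] / X.sum(axis=1)

# Ratio of proportions (relative risk), White victim vs. Black victim.
risk_ratio = p_death[1] / p_death[0]

# Sample odds ratio, equal to X[0,0]*X[1,1] / (X[0,1]*X[1,0]).
odds = p_death / (1 - p_death)
odds_ratio = odds[1] / odds[0]

print(f"P(death sentence | Black victim) = {p_death[0]:.4f}")
print(f"P(death sentence | White victim) = {p_death[1]:.4f}")
print(f"Ratio of proportions = {risk_ratio:.2f}")
print(f"Odds ratio           = {odds_ratio:.2f}")
```

Because death sentences are rare in both groups, the two ratios are close (about 4.4 and 4.8), but they answer different questions.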