# Ballot-polling SPRT

This notebook explores the ballot-polling SPRT we've developed.

In [1]:
%matplotlib inline
from __future__ import division
import math
import numpy as np
import numpy.random
import scipy as sp
import scipy.stats
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sb

from sprt import ballot_polling_sprt
from hypergeometric import hypergeometric_optim



The vote proportions in the sample match those in the population exactly; the population is simply 50 times larger. The sample contains 2000 votes for candidate $w$, 1800 votes for candidate $\ell$, and 500 invalid votes.

Candidate $w$ earned $46.5\%$ of the votes and candidate $\ell$ earned $41.9\%$, a difference of about $4.7\%$ ($200/4300$). We test the null hypothesis that the two candidates received the same proportion of votes overall against the alternative that the reported vote totals are correct.
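As a quick check of the arithmetic in this setup, the proportions, the margin, and the population size can be recomputed from the three sample counts. This is only a sketch; the dictionary keys are illustrative names, not part of the `sprt` module.

```python
# Sample counts from the text: votes for w, votes for l, invalid ballots
counts = {'w': 2000, 'l': 1800, 'invalid': 500}

n = sum(counts.values())                       # sample size
props = {k: v / n for k, v in counts.items()}  # sample proportions
margin = (counts['w'] - counts['l']) / n       # margin as a share of all ballots
popsize = 50 * n                               # population is 50 times the sample

print(n, popsize)
print({k: round(p, 4) for k, p in props.items()})
print(round(margin, 4))
```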

## Trinomial SPRT without replacement

First, suppose we don't know the number of invalid ballots in the population. We treat it as a nuisance parameter and minimize the likelihood ratio (LR) over its possible values.

In [2]:
alpha = 0.05
# Encoding: 1 = vote for w, 0 = vote for l, NaN = invalid ballot
sample = [1]*2000 + [0]*1800 + [np.nan]*500
popsize = 50*len(sample)
res = ballot_polling_sprt(sample, popsize, alpha, Vw=2000*50, Vl=1800*50)
print(res)

{'pvalue': 0.004640183723708297, 'Nu_used': 25000, 'upper_threshold': 20.0, 'decision': 1, 'sample_proportion': (0.46511627906976744, 0.4186046511627907, 0.11627906976744186), 'Nw_used': 95000, 'lower_threshold': 0.0, 'LR': 215.50870817693178}
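The fields of this result are related in a simple way: in the printed output, the p-value is the reciprocal of the LR and the upper threshold is $1/\alpha$, which is the usual form of Wald's SPRT rejection rule. The check below copies the values from the output above; it illustrates the relationship seen in this output and does not describe how `ballot_polling_sprt` computes these fields internally.

```python
import math

# Values copied from the printed result above
res = {'pvalue': 0.004640183723708297,
       'upper_threshold': 20.0,
       'LR': 215.50870817693178,
       'decision': 1}
alpha = 0.05

pvalue_is_reciprocal = math.isclose(res['pvalue'], 1 / res['LR'], rel_tol=1e-6)
threshold_is_reciprocal = math.isclose(res['upper_threshold'], 1 / alpha)
crosses_threshold = res['LR'] >= res['upper_threshold']

print(pvalue_is_reciprocal, threshold_is_reciprocal, crosses_threshold)
```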


The optimization does the right thing: if we did know that there were $500 \times 50$ invalid votes in the population, we'd get the same result!

In [3]:
res = ballot_polling_sprt(sample, popsize, alpha, Vw=2000*50, Vl=1800*50, number_invalid=500*50)
print(res)

{'pvalue': 0.004640183723708297, 'Nu_used': 25000, 'upper_threshold': 20.0, 'decision': 1, 'sample_proportion': (0.46511627906976744, 0.4186046511627907, 0.11627906976744186), 'Nw_used': 95000.0, 'lower_threshold': 0.0, 'LR': 215.50870817693178}
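To make the agreement between the two calls concrete, here is a minimal sketch of a trinomial likelihood ratio for sampling without replacement, evaluated once with the invalid count fixed and once minimized over it. It is an assumption about the form of the statistic, not the `sprt` module's code: `log_falling` and `log_lr` are hypothetical helpers, the null is taken to be an exact tie among valid votes, and the search is a plain grid over invalid counts that keep the tied vote total an integer.

```python
import math

def log_falling(a, n):
    # log of a*(a-1)*...*(a-n+1), via log-gamma for numerical stability
    return math.lgamma(a + 1) - math.lgamma(a - n + 1)

def log_lr(sample_counts, popsize, Vw, Vl, Nu):
    """Log LR of the alternative (reported totals) to the null (tie) for a
    sample drawn without replacement. Nu is the number of invalid ballots
    assumed under the null. The factor shared by both hypotheses cancels."""
    Wn, Ln, Un = sample_counts
    Vu = popsize - Vw - Vl      # invalid ballots implied by the reported totals
    Nw = (popsize - Nu) // 2    # null: each candidate has Nw votes
    alt = log_falling(Vw, Wn) + log_falling(Vl, Ln) + log_falling(Vu, Un)
    null = log_falling(Nw, Wn) + log_falling(Nw, Ln) + log_falling(Nu, Un)
    return alt - null

counts = (2000, 1800, 500)      # sample: votes for w, votes for l, invalid
popsize = 50 * sum(counts)
Vw, Vl = 2000 * 50, 1800 * 50

# Invalid count known to be 500 * 50
lr_known = math.exp(log_lr(counts, popsize, Vw, Vl, Nu=500 * 50))

# Invalid count unknown: grid search over values consistent with the sample
Wn, Ln, Un = counts
start = Un + (popsize - Un) % 2                 # keep popsize - Nu even
candidates = range(start, popsize - 2 * max(Wn, Ln) + 1, 2)
best_Nu = min(candidates, key=lambda Nu: log_lr(counts, popsize, Vw, Vl, Nu))
lr_min = math.exp(log_lr(counts, popsize, Vw, Vl, best_Nu))

print(lr_known, best_Nu, lr_min)
```

Under these assumptions the fixed-count LR comes out close to the 215.5 printed above, and the minimizing invalid count lands very near 25,000. The LR is nearly flat around its minimum, so the minimized and fixed-count values are almost indistinguishable.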


## What happens when the reported outcome is wrong

In 100 replicates, we draw samples of 500 ballots and conduct the SPRT using the reported results as the alternative hypothesis. We never reject the null.

We do the same for samples of size 1000.

We also test the null hypothesis with the conditional hypergeometric test. The first comparison uses the same risk limit of 5%, while the second cuts the risk limit in half to account for the possibility of escalating the audit.

Candidate | Reported | Actual
---|---|---
A | 5500 | 4900 
B | 4200 | 5000 
Ballots | 10,000 | 10,000 
Diluted margin | 13% | -1% 
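The diluted margins in the table are the vote margin divided by the total number of ballots, including invalid ones. A small sketch of that calculation follows; `diluted_margin` is an illustrative helper, not a function from the modules imported above.

```python
def diluted_margin(votes_a, votes_b, ballots):
    # Margin of A over B as a fraction of all ballots cast
    return (votes_a - votes_b) / ballots

reported = diluted_margin(5500, 4200, 10000)
actual = diluted_margin(4900, 5000, 10000)

print(reported, actual)
```

A negative value means that candidate B actually received more votes, so the reported outcome is wrong.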

In [4]:
np.random.seed(8062018)
alpha = 0.05
# True population: 4900 votes for A (1), 5000 for B (0), 100 invalid (NaN)
population = [1]*4900 + [0]*5000 + [np.nan]*100
popsize = len(population)
reps = 100
rejects_sprt = 0
rejects_hyper = 0
rejects_hyper_red = 0

for i in range(reps):
    sample = np.random.choice(population, replace=False, size=500)
    res = ballot_polling_sprt(sample, popsize, alpha, Vw=5500, Vl=4200)
    if res['decision']==1:
        rejects_sprt += 1
    res2 = hypergeometric_optim(sample, popsize)
    if res2['pvalue'] <= alpha:
        rejects_hyper += 1
    if res2['pvalue'] <= alpha/2:
        rejects_hyper_red += 1

print("Samples of size 500, SPRT rejection rate:", rejects_sprt/reps)
print("Samples of size 500, fixed n hypergeometric rejection rate:", rejects_hyper/reps)
print("Samples of size 500, fixed n hypergeometric rejection rate at reduced risk limit:", rejects_hyper_red/reps)


rejects_sprt = 0
rejects_hyper = 0
rejects_hyper_red = 0

for i in range(reps):
    sample = np.random.choice(population, replace=False, size=1000)
    res = ballot_polling_sprt(sample, popsize, alpha, Vw=5500, Vl=4200)
    if res['decision']==1:
        rejects_sprt += 1
    res2 = hypergeometric_optim(sample, popsize)
    if res2['pvalue'] <= alpha:
        rejects_hyper += 1
    if res2['pvalue'] <= alpha/2:
        rejects_hyper_red += 1

    
print("Samples of size 1000, SPRT rejection rate:", rejects_sprt/reps)
print("Samples of size 1000, fixed n hypergeometric rejection rate:", rejects_hyper/reps)
print("Samples of size 1000, fixed n hypergeometric rejection rate at reduced risk limit:", rejects_hyper_red/reps)

Samples of size 500, SPRT rejection rate: 0.0
Samples of size 500, fixed n hypergeometric rejection rate: 0.03
Samples of size 500, fixed n hypergeometric rejection rate at reduced risk limit: 0.01
Samples of size 1000, SPRT rejection rate: 0.0
Samples of size 1000, fixed n hypergeometric rejection rate: 0.04
Samples of size 1000, fixed n hypergeometric rejection rate at reduced risk limit: 0.01
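The cells in this notebook repeat the same draw-test-count loop for each sample size. One way to factor that pattern out is sketched below. Everything here is hypothetical: `rejection_rate` is an illustrative helper, it uses the standard library's `random.sample` rather than `np.random.choice` (so it would not reproduce the numbers above), and `majority_for_w` is a placeholder decision rule, not an audit test.

```python
import random

def rejection_rate(decide, population, size, reps, seed=0):
    """Fraction of `reps` samples (drawn without replacement) for which
    `decide(sample)` returns True."""
    rng = random.Random(seed)
    rejects = sum(bool(decide(rng.sample(population, size)))
                  for _ in range(reps))
    return rejects / reps

def majority_for_w(sample):
    # Placeholder rule for illustration only: more 1s than 0s in the sample
    return sample.count(1) > sample.count(0)

population = [1]*4900 + [0]*5000 + [float('nan')]*100
for size in (500, 1000):
    print(size, rejection_rate(majority_for_w, population, size, reps=100))
```

In the notebook's setting, `decide` would wrap a call to `ballot_polling_sprt` or `hypergeometric_optim` and return whether the null was rejected.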


## What happens when the reported outcome is right, but counts are off

In 100 replicates, we draw samples of 1000, 2000, and 3000 ballots and conduct the SPRT using the reported results as the alternative hypothesis. The power of the SPRT is lower than that of the fixed-sample-size conditional hypergeometric test, even when the latter's risk limit is halved. However, a population of $N = 10{,}000$ is probably smaller than those for which we'd use Bernoulli ballot polling.

Candidate | Reported | Actual
---|---|---
A | 5500 | 5300 
B | 4200 | 4300 
Ballots | 10,000 | 10,000 
Diluted margin | 13% | 10% 

In [5]:
np.random.seed(8062018)
alpha = 0.05
# True population: 5300 votes for A (1), 4300 for B (0), 400 invalid (NaN)
population = [1]*5300 + [0]*4300 + [np.nan]*400
popsize = len(population)
reps = 100
rejects_sprt = 0
rejects_hyper = 0
rejects_hyper_red = 0

for i in range(reps):
    sample = np.random.choice(population, replace=False, size=1000)
    res = ballot_polling_sprt(sample, popsize, alpha, Vw=5500, Vl=4200)
    if res['decision']==1:
        rejects_sprt += 1
    res2 = hypergeometric_optim(sample, popsize)
    if res2['pvalue'] <= alpha:
        rejects_hyper += 1
    if res2['pvalue'] <= alpha/2:
        rejects_hyper_red += 1

print("Samples of size 1000, SPRT rejection rate:", rejects_sprt/reps)
print("Samples of size 1000, fixed n hypergeometric rejection rate:", rejects_hyper/reps)
print("Samples of size 1000, fixed n hypergeometric rejection rate at reduced risk limit:", rejects_hyper_red/reps)

rejects_sprt = 0
rejects_hyper = 0
rejects_hyper_red = 0

for i in range(reps):
    sample = np.random.choice(population, replace=False, size=2000)
    res = ballot_polling_sprt(sample, popsize, alpha, Vw=5500, Vl=4200)
    if res['decision']==1:
        rejects_sprt += 1
    res2 = hypergeometric_optim(sample, popsize)
    if res2['pvalue'] <= alpha:
        rejects_hyper += 1
    if res2['pvalue'] <= alpha/2:
        rejects_hyper_red += 1

print("Samples of size 2000, SPRT rejection rate:", rejects_sprt/reps)
print("Samples of size 2000, fixed n hypergeometric rejection rate:", rejects_hyper/reps)
print("Samples of size 2000, fixed n hypergeometric rejection rate at reduced risk limit:", rejects_hyper_red/reps)

rejects_sprt = 0
rejects_hyper = 0
rejects_hyper_red = 0

for i in range(reps):
    sample = np.random.choice(population, replace=False, size=3000)
    res = ballot_polling_sprt(sample, popsize, alpha, Vw=5500, Vl=4200)
    if res['decision']==1:
        rejects_sprt += 1
    res2 = hypergeometric_optim(sample, popsize)
    if res2['pvalue'] <= alpha:
        rejects_hyper += 1
    if res2['pvalue'] <= alpha/2:
        rejects_hyper_red += 1

print("Samples of size 3000, SPRT rejection rate:", rejects_sprt/reps)
print("Samples of size 3000, fixed n hypergeometric rejection rate:", rejects_hyper/reps)
print("Samples of size 3000, fixed n hypergeometric rejection rate at reduced risk limit:", rejects_hyper_red/reps)

Samples of size 1000, SPRT rejection rate: 0.61
Samples of size 1000, fixed n hypergeometric rejection rate: 0.97
Samples of size 1000, fixed n hypergeometric rejection rate at reduced risk limit: 0.95
Samples of size 2000, SPRT rejection rate: 0.77
Samples of size 2000, fixed n hypergeometric rejection rate: 1.0
Samples of size 2000, fixed n hypergeometric rejection rate at reduced risk limit: 1.0
Samples of size 3000, SPRT rejection rate: 0.9
Samples of size 3000, fixed n hypergeometric rejection rate: 1.0
Samples of size 3000, fixed n hypergeometric rejection rate at reduced risk limit: 1.0
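With only 100 replicates, each rejection rate carries noticeable Monte Carlo error. As a rough guide, the sketch below computes the usual binomial standard error $\sqrt{\hat p (1 - \hat p)/\text{reps}}$ for the SPRT rates printed above. `mc_standard_error` is an illustrative helper, and the normal-approximation formula is an assumption that is least reliable for rates at or near 0 or 1.

```python
import math

def mc_standard_error(rate, reps):
    # Binomial standard error of an estimated rejection rate
    return math.sqrt(rate * (1 - rate) / reps)

reps = 100
sprt_rates = {1000: 0.61, 2000: 0.77, 3000: 0.9}  # copied from the output above
errors = {size: mc_standard_error(rate, reps)
          for size, rate in sprt_rates.items()}

for size, se in errors.items():
    print(size, sprt_rates[size], round(se, 3))
```

The standard errors are roughly 0.03 to 0.05, which is small compared with the gap between the SPRT and hypergeometric rejection rates at sample sizes 1000 and 2000.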


## What happens when the reported outcome is exactly right, N=100k

In 100 replicates, we draw samples of ballots and conduct the SPRT using the reported results as the alternative hypothesis. This population is ten times larger than the previous one. In Table 1 of the paper, we estimate that, with a risk limit of 5%, a sampling rate of 1% gives 80% power and a sampling rate of 6% gives 99% power using the fixed-$n$ hypergeometric test. How does the SPRT compare?

We run the SPRT with both 5% and 10% risk limits. Using a risk limit of 10% makes the SPRT comparable to running the conditional hypergeometric test with a reduced first-stage risk limit, allowing for further escalation.

Candidate | Reported | Actual
---|---|---
A | 55,000 | 55,000 
B | 45,000 | 45,000 
Ballots | 100,000 | 100,000 
Diluted margin | 10% | 10% 

In [8]:
np.random.seed(8062018)
population = [1]*55000 + [0]*45000
popsize = len(population)
reps = 100

for alpha in [0.05, 0.1]:
    print("Risk limit:", alpha)

    rejects_sprt = 0
    for i in range(reps):
        sample = np.random.choice(population, replace=False, size=1000)
        res = ballot_polling_sprt(sample, popsize, alpha, Vw=55000, Vl=45000)
        if res['decision']==1:
            rejects_sprt += 1

    print("Samples of size 1% of population, SPRT rejection rate:", rejects_sprt/reps)

    rejects_sprt = 0

    for i in range(reps):
        sample = np.random.choice(population, replace=False, size=2000)
        res = ballot_polling_sprt(sample, popsize, alpha, Vw=55000, Vl=45000)
        if res['decision']==1:
            rejects_sprt += 1

    print("Samples of size 2% of population, SPRT rejection rate:", rejects_sprt/reps)

    rejects_sprt = 0

    for i in range(reps):
        sample = np.random.choice(population, replace=False, size=6000)
        res = ballot_polling_sprt(sample, popsize, alpha, Vw=55000, Vl=45000)
        if res['decision']==1:
            rejects_sprt += 1

    print("Samples of size 6% of population, SPRT rejection rate:", rejects_sprt/reps)

Risk limit: 0.05
Samples of size 1% of population, SPRT rejection rate: 0.73
Samples of size 2% of population, SPRT rejection rate: 0.96
Samples of size 6% of population, SPRT rejection rate: 1.0
Risk limit: 0.1
Samples of size 1% of population, SPRT rejection rate: 0.82
Samples of size 2% of population, SPRT rejection rate: 0.94
Samples of size 6% of population, SPRT rejection rate: 1.0


We repeat the experiment with a 2% margin. Table 1 of the paper reports that, at a 5% risk limit, the conditional hypergeometric test needs a 14% sample for 80% power, a 19% sample for 90% power, and a 48% sample for 99% power.

Candidate | Reported | Actual
---|---|---
A | 51,000 | 51,000 
B | 49,000 | 49,000 
Ballots | 100,000 | 100,000 
Diluted margin | 2% | 2% 

In [7]:
np.random.seed(8062018)
population = [1]*51000 + [0]*49000
popsize = len(population)
reps = 100

for alpha in [0.05, 0.1]:
    print("Risk limit:", alpha)
    rejects_sprt = 0
    for i in range(reps):
        sample = np.random.choice(population, replace=False, size=14000)
        res = ballot_polling_sprt(sample, popsize, alpha, Vw=51000, Vl=49000)
        if res['decision']==1:
            rejects_sprt += 1

    print("Samples of size 14% of population, SPRT rejection rate:", rejects_sprt/reps)

    rejects_sprt = 0

    for i in range(reps):
        sample = np.random.choice(population, replace=False, size=19000)
        res = ballot_polling_sprt(sample, popsize, alpha, Vw=51000, Vl=49000)
        if res['decision']==1:
            rejects_sprt += 1

    print("Samples of size 19% of population, SPRT rejection rate:", rejects_sprt/reps)

    rejects_sprt = 0

    for i in range(reps):
        sample = np.random.choice(population, replace=False, size=48000)
        res = ballot_polling_sprt(sample, popsize, alpha, Vw=51000, Vl=49000)
        if res['decision']==1:
            rejects_sprt += 1

    print("Samples of size 48% of population, SPRT rejection rate:", rejects_sprt/reps)

Risk limit: 0.05
Samples of size 14% of population, SPRT rejection rate: 0.44
Samples of size 19% of population, SPRT rejection rate: 0.66
Samples of size 48% of population, SPRT rejection rate: 0.98
Risk limit: 0.1
Samples of size 14% of population, SPRT rejection rate: 0.63
Samples of size 19% of population, SPRT rejection rate: 0.77
Samples of size 48% of population, SPRT rejection rate: 0.99
