# Kaplan-Markov Risk-Limiting Batch Comparison Audits

This notebook demonstrates the `kaplan_markov.py` code in this repository.
It reproduces the full set-up for the audit of the 2010 Boulder County Coroner contest, as documented at:

  http://bcn.boulder.co.us/~neal/elections/boulder-audit-10-11/

For the underlying formulas and a worked example, see:

  Mark Lindeman, "A Kaplan-Markov auditing example using 2008 California data", 1/10/2010 (v. 1.2x, 3/1/2010)
  https://d56fe2f5-a-62cb3a1a-s-sites.googlegroups.com/site/electionaudits/small-batch/kaplan-example-12x.pdf

Another test case in this repository uses the batch data from Lindeman's example, the 2008 election in California’s 3rd Congressional District (CD3), reproduced here in the file `ca-cd3-2018-batches.csv`.

That file is only a subset of the full set of batches from California's Statewide Database (SWDB).
Fully replicating Lindeman's calculations would require the whole dataset.

In [1]:
import csv
from kaplan_markov import audit_data, batch_draws

In [2]:
csvfile = "boulder_2010_coroner_contest_batch_data.csv"

In [3]:
reader = csv.DictReader(open(csvfile, "r", newline=""))

In [4]:
rows = [row for row in reader]

If all the batches of the election had been included in our data, the tally, margins and total_error_bound U would be:

In [5]:
audit_data(rows, "name", "ballots", ["pruett", "hall"], ["hall"], 0)

({'pruett': 33924, 'hall': 49627, 'ballots': 121138},
 {'hall:pruett': 15703},
 8.714322104056551)

But 5347 ballots had not been tallied at the time of the audit, so we reduce every margin by that count (15703 - 5347 = 10356), which increases U:

In [6]:
audit_data(rows, "name", "ballots", ["pruett", "hall"], ["hall"], -5347)

({'pruett': 33924, 'hall': 49627, 'ballots': 121138},
 {'hall:pruett': 10356},
 13.213692545384328)
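
The two U values above are consistent with a simple closed form for a contest with one winner and one loser: the sum over all batches of (ballots + reported winner-minus-loser margin), divided by the adjusted contest margin. The sketch below checks that arithmetic using only the totals shown. `total_error_bound` is a hypothetical helper written for this illustration, not a function from `kaplan_markov.py`, and it assumes the per-batch bounds add up to the contest-wide totals.

```python
def total_error_bound(ballots, reported_margin, margin_adjustment=0):
    """Hypothetical helper (not part of kaplan_markov.py).

    Assumes a one-winner, one-loser contest in which each batch's error
    bound is (batch ballots + reported batch margin) / adjusted margin,
    so the bounds sum to (total ballots + reported margin) / adjusted margin.
    """
    adjusted_margin = reported_margin + margin_adjustment
    if adjusted_margin <= 0:
        raise ValueError("adjusted margin must be positive")
    return (ballots + reported_margin) / adjusted_margin


ballots = 121138
reported_margin = 49627 - 33924  # hall minus pruett

# No adjustment, as in the first audit_data call
u_full = total_error_bound(ballots, reported_margin)

# Margin reduced by the 5347 untallied ballots, as in the second call
u_reduced = total_error_bound(ballots, reported_margin, -5347)

print(u_full, u_reduced)
```

Only the denominator changes between the two calls, which is why shrinking the margin from 15703 to 10356 raises U by the same ratio.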

With U = 13.21 and a risk limit of 10%, the audit would need 30 batch draws if no discrepancies were found (the value below, rounded up):

In [7]:
batch_draws(0.1, 13.21)

29.250753357917453

Meeting a much looser 50% risk limit would take only 9 draws:

In [8]:
batch_draws(0.5, 13.21)

8.805354156502075
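
Both results agree with the Kaplan-Markov sample size for an audit that finds no discrepancies, n = ln(α) / ln(1 - 1/U), rounded up to a whole number of draws. The following is a minimal sketch of that calculation. `draws_needed` is a hypothetical name used for illustration, and nothing here is claimed to be how `batch_draws` is actually implemented.

```python
import math


def draws_needed(risk_limit, total_error_bound):
    """Hypothetical helper: draws required when no discrepancies are found.

    Uses n = ln(risk_limit) / ln(1 - 1/U); the caller rounds up.
    """
    if not 0 < risk_limit < 1:
        raise ValueError("risk_limit must be strictly between 0 and 1")
    if total_error_bound <= 1:
        raise ValueError("total_error_bound U must exceed 1")
    return math.log(risk_limit) / math.log(1 - 1 / total_error_bound)


for alpha in (0.1, 0.5):
    n = draws_needed(alpha, 13.21)
    # Draws are whole batches, so round the fractional result up
    print(alpha, n, math.ceil(n))
```

Because each discrepancy-free draw multiplies the attainable risk by (1 - 1/U), a larger U makes that factor closer to 1 and so increases the number of draws needed for any given risk limit.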