# Stratified Assertion RLA

## Overview of stratified assertion audit tool

The tool requires the following as input per stratum: 

+ audit-specific and contest-specific parameters, such as
    - whether to sample with or without replacement
    - the name of the risk function to use, and any parameters it requires
    - a risk limit for each contest to be audited
    - the social choice function for each contest, including the number of winners
    - candidate identifiers
+ a ballot manifest**
+ a random seed
+ a file of cast vote records **(CVR strata only)**
+ reported results for each contest
+ JSON files of assertions for IRV contests (one file per IRV contest)
+ human reading of voter intent from the paper cards selected for audit

** The ballot manifest could list only cards purported to contain the contests under audit (manifest_type == "STYLE"), or could include cards that might not contain the contest (manifest_type == "ALL"). These are treated differently. If the sample is to be drawn only from cards that, according to the CVR/manifest, contain the contest, and a sampled card turns out not to contain the contest, that is considered a discrepancy and is dealt with using the "phantoms to zombies" approach. It is assumed that every CVR/manifest entry corresponds to a card in the manifest, but there might be cards cast in the contest for which there is no corresponding CVR/manifest entry. In that case, phantom records are created to ensure that the audit is still truly risk-limiting.

Given an independent (i.e., not relying on the voting system) upper bound on the number of cards that contain the contest, if the number of CVRs/manifest entries that contain the contest does not exceed that bound, we can sample from paper purported to contain the contest and use the "zombies" approach (Banuelos & Stark) to deal with missing CVRs/manifest entries. This can greatly increase the efficiency of the audit if the contest is on only a small percentage of the cast cards.
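As a rough illustration of the bookkeeping described above, the sketch below pads the records up to the independent upper bound with phantoms. The helper name `count_phantoms` is hypothetical and is not part of the tool; the actual logic lives in `prep_dominion_manifest` and may differ.

```python
def count_phantoms(upper_bound, n_records):
    """Number of phantom records needed so that records + phantoms == upper_bound.

    upper_bound: independent upper bound on cards that contain the contest
    n_records:   number of CVRs / manifest entries that contain the contest
    (Hypothetical helper, for illustration only.)
    """
    if n_records > upper_bound:
        # The approach requires that the records not exceed the bound.
        raise ValueError("more records than the independent upper bound allows")
    return upper_bound - n_records


print(count_phantoms(146662, 146662))  # 0
print(count_phantoms(1000, 990))       # 10
```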

Any sampled phantom card (i.e., a card for which there is no CVR) is treated as if its CVR is a non-vote (which it is), and as if its MVR were the least favorable possible (a "zombie" producing the greatest doubt in every assertion, separately). Any sampled card for which there is a CVR is compared to its corresponding MVR. If the card turns out not to contain the contest (even though the CVR says it does), the MVR is treated in the least favorable way for each assertion (i.e., as a zombie rather than as a non-vote).
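To make the phantom and zombie treatment concrete, here is a minimal sketch for a single two-candidate plurality assertion. It assumes an assorter that assigns 1 to a vote for the reported winner, 0 to a vote for the reported loser, and 1/2 to anything else; the function names are hypothetical and are not taken from `assertion_audit_utils`.

```python
def assort(vote, winner, loser):
    """Assumed plurality assorter: 1 for the winner, 0 for the loser, 1/2 otherwise."""
    if vote == winner:
        return 1.0
    if vote == loser:
        return 0.0
    return 0.5  # non-vote or vote for another candidate


def overstatement(cvr_vote, mvr_vote, winner, loser, zombie=False):
    """CVR assorter value minus MVR assorter value.

    A zombie MVR gets the least favorable value, 0, whatever the card shows.
    """
    mvr_value = 0.0 if zombie else assort(mvr_vote, winner, loser)
    return assort(cvr_vote, winner, loser) - mvr_value


# Phantom card: the CVR is a non-vote (1/2), the MVR is a zombie (0).
print(overstatement(None, None, "15", "16", zombie=True))  # 0.5
# CVR shows a vote for the winner, but the card lacks the contest: zombie MVR.
print(overstatement("15", None, "15", "16", zombie=True))  # 1.0
# Same card, if the MVR were (incorrectly) scored as a mere non-vote.
print(overstatement("15", None, "15", "16"))               # 0.5
```

The last two lines show why the distinction matters: scoring a missing contest as a zombie rather than as a non-vote doubles the overstatement in this example.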

The tool helps select cards for the stratified audit, and reports when the audit has found sufficiently strong evidence to stop.
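The stopping rule can be sketched as follows, assuming a contest is finished once every assertion's p-value is at or below the contest's risk limit. The helper `audit_status` and the example p-values are made up for illustration; the tool's own reporting is done by `summarize_status`.

```python
def audit_status(p_values, risk_limit):
    """Return (done, attained_risk, remaining) for one contest.

    p_values: dict mapping assertion label -> current p-value
    (Hypothetical helper, for illustration only.)
    """
    remaining = {a: p for a, p in p_values.items() if p > risk_limit}
    attained_risk = max(p_values.values())  # the weakest assertion sets the risk
    return len(remaining) == 0, attained_risk, remaining


p_values = {"A v B": 0.008, "A v C": 0.31}  # made-up values
done, risk, remaining = audit_status(p_values, risk_limit=0.05)
print(done, risk, sorted(remaining))  # False 0.31 ['A v C']
```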

The tool exports a log of all the audit inputs except the CVR file, but including the auditors' manually determined voter intent from the audited cards in each stratum.

The current version uses a single sample per stratum to audit all contests. It is possible to refine things to target smaller contests.

In [1]:
from __future__ import division, print_function

import math
import json
import warnings
import numpy as np
import pandas as pd
import csv
import copy

from collections import OrderedDict
from IPython.display import display, HTML

from cryptorandom.cryptorandom import SHA256
from cryptorandom.sample import sample_by_index

from assertion_audit_utils import \
    Assertion, Assorter, CVR, TestNonnegMean, check_audit_parameters, find_margins,\
    find_fisher_p_values, find_sample_size, new_sample_size, prep_sample, summarize_status,\
    write_fisher_audit_parameters
from dominion_tools import \
    prep_dominion_manifest, sample_from_cvr, sample_from_manifest, write_cards_sampled


## Audit parameters.

* `seed`: the numeric seed for the pseudo-random number generator used to draw the sample
* `replacement`: whether to sample with replacement. If the sample is drawn with replacement, gamma must also be specified.
* `risk_function`: the function to be used to measure risk. Options are `kaplan_markov`, `kaplan_wald`, `kaplan_kolmogorov`, `wald_sprt`, `kaplan_martingale`. Not all risk functions work with every social choice function. `wald_sprt` applies only to plurality contests.
* `g`: a parameter to hedge against the possibility of observing a maximum overstatement. Require $g \in [0, 1)$ for `kaplan_kolmogorov`, `kaplan_markov`, and `kaplan_wald`.
* `N_cards`: an upper bound on the number of pieces of paper cast in the contest. This should be derived independently of the voting system. A ballot consists of one or more cards.
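To illustrate the role of `g`, here is a minimal sketch of a Kaplan-Markov-style p-value for the hypothesis that nonnegative values have mean at most `t`. It assumes the values arrive in random order and takes the smallest running product seen so far. The function name is hypothetical, and the sketch is not guaranteed to match `TestNonnegMean.kaplan_markov`.

```python
import math


def kaplan_markov_sketch(x, t=0.5, g=0.0):
    """Smallest running product of (t + g) / (x_i + g), capped at 1.

    x: nonnegative observations, in the order drawn
    t: hypothesized mean under the null
    g: padding in [0, 1); keeps a single zero from freezing the product
    """
    if any(xi < 0 for xi in x):
        raise ValueError("observations must be nonnegative")
    running, p_min = 1.0, 1.0
    for xi in x:
        if xi + g == 0:
            running = math.inf  # with g = 0, a zero observation is unrecoverable
        else:
            running *= (t + g) / (xi + g)
        p_min = min(p_min, running)
    return p_min


sample = [0.0] + [1.0] * 5          # one zero, then five ones
print(kaplan_markov_sketch(sample, g=0.0))  # 1.0
print(kaplan_markov_sketch(sample, g=0.1))  # roughly 0.29
```

With `g = 0` the initial zero makes the product infinite, so no later evidence can lower the p-value; with `g = 0.1` the test recovers.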

----

* `cvr_file`: filename for CVRs (input)
* `manifest_file`: filename for ballot manifest (input)
* `manifest_type`: "STYLE" if the manifest is supposed to list only cards that contain the contests under audit; "ALL" if the manifest contains all cards cast in the election
* `assertion_file`: filename of assertions for IRV contests, in RAIRE format (input)
* `sample_file`: filename for sampled card identifiers (output)
* `mvr_file`: filename for manually ascertained votes from sampled cards (input)
* `log_file`: filename for audit log (output)

----

* `error_rate`: expected rate of 1-vote overstatements. Recommended value $\ge$ 0.001 if there are hand-marked ballots. Larger values increase the initial sample size, but make it more likely that the audit will conclude in a single round if the audit finds errors.

* `contests`: a dict of contest-specific data 
    + the keys are unique contest identifiers for contests under audit
    + the values are dicts with keys:
        - `risk_limit`: the risk limit for the audit of this contest
        - `cards_cast`: an upper bound on the number of cast cards that contain the contest
        - `choice_function`: `plurality`, `supermajority`, or `IRV`
        - `n_winners`: number of winners for majority contests. (Multi-winner IRV not supported; multi-winner super-majority is nonsense)
        - `share_to_win`: for super-majority contests, the fraction of valid votes required to win, e.g., 2/3.
        - `candidates`: list of names or identifiers of candidates
        - `reported_winners` : list of identifier(s) of candidate(s) reported to have won. Length should equal `n_winners`.
        - `assertion_file`: filename for a set of json descriptors of Assertions (see technical documentation) that collectively imply the reported outcome of the contest is correct. Required for IRV; ignored for other social choice functions
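The constraints listed above can be checked mechanically before the audit starts. The sketch below is a hypothetical helper, separate from the tool's own `check_audit_parameters`; it tests only the rules stated in this list.

```python
def check_contest(contest):
    """Return a list of problems found in one contest dict (empty if none).

    Hypothetical helper, for illustration only.
    """
    problems = []
    if not 0 < contest["risk_limit"] < 1:
        problems.append("risk_limit must be strictly between 0 and 1")
    choice = contest["choice_function"]
    if choice not in ("plurality", "supermajority", "IRV"):
        problems.append("unknown choice_function: {}".format(choice))
    if len(contest["reported_winners"]) != contest["n_winners"]:
        problems.append("len(reported_winners) must equal n_winners")
    if any(w not in contest["candidates"] for w in contest["reported_winners"]):
        problems.append("every reported winner must be a candidate")
    if choice in ("IRV", "supermajority") and contest["n_winners"] != 1:
        problems.append("{} contests must have exactly one winner".format(choice))
    if choice == "IRV" and "assertion_file" not in contest:
        problems.append("IRV contests require an assertion_file")
    if choice == "supermajority" and "share_to_win" not in contest:
        problems.append("supermajority contests require share_to_win")
    return problems


irv_contest = {"risk_limit": 0.05, "choice_function": "IRV", "n_winners": 1,
               "candidates": ["15", "16", "17", "18"], "reported_winners": ["15"],
               "assertion_file": "./Data/SF2019Nov8Assertions.json"}
print(check_contest(irv_contest))  # []
```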

In [2]:
seed = 93686630803205229070  # use, e.g., 20 rolls of a 10-sided die. Seed doesn't have to be numeric
replacement = False
g = 0.1  # padding for risk functions in both strata

sample_file = './Data/sample_combined.csv'
log_file = './Data/log_combined.json'

### CVR stratum parameters

In [3]:
risk_function_1 = "kaplan_martingale"
risk_fn_1 = lambda x, t: TestNonnegMean.kaplan_martingale(x, N=N_cards_s1, t=t)[0]

# Another option for the risk function:
#risk_function_1 = "kaplan_kolmogorov"
#risk_fn_1 = lambda x, t: TestNonnegMean.kaplan_kolmogorov(x=np.array(x), N=N_cards_s1, t=t, g=g)

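# NOTE: these assignments override the kaplan_martingale choice made above,
# so kaplan_markov is the risk function actually used for stratum 1.
risk_function_1 = "kaplan_markov"
risk_fn_1 = lambda x, t: TestNonnegMean.kaplan_markov(x=np.array(x), t=t, g=g)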

#risk_function_1 = "kaplan_wald"
#risk_fn_1 = lambda x, t: TestNonnegMean.kaplan_wald(x=np.array(x), t=t, g=g)

N_cards_s1 = 146662 # VBM turnout per SF Elections release 12
        # https://sfelections.sfgov.org/november-5-2019-election-results-summary

In [4]:
cvr_file = './Data/SFDA2019_PrelimReport12VBMJustDASheets.raire'
manifest_file_s1 = './Data/N19 ballot manifest with WH location for RLA Upload VBM 11-14.xlsx'
manifest_type_s1 = 'STYLE'  # every card should contain the contest
mvr_file_s1 = './Data/mvr_prepilot_test.json'
#mvr_file = './Data/mvrTest-PR12-DA-VBM-AllBallots-4TargetedErrors.json'

In [5]:
error_rate = 0.002      # expect 2 1-vote overstatements per 1000 ballots in stratum 1

### No CVR stratum parameters

In [6]:
risk_function_2 = "kaplan_martingale"
risk_fn_2 = lambda x, t: TestNonnegMean.kaplan_martingale(x, N=N_cards_s2, t=t)[0]

# Another option for the risk function:
#risk_function_2 = "kaplan_kolmogorov"
#risk_fn_2 = lambda x, t: TestNonnegMean.kaplan_kolmogorov(x=np.array(x), N=N_cards_s2, t=t, g=g, random_order=True)

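# NOTE: these assignments override the kaplan_martingale choice made above,
# so kaplan_markov is the risk function actually used for stratum 2.
risk_function_2 = "kaplan_markov"
risk_fn_2 = lambda x, t: TestNonnegMean.kaplan_markov(x=np.array(x), t=t, g=g, random_order=False)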

#risk_function_2 = "kaplan_wald"
#risk_fn_2 = lambda x, t: TestNonnegMean.kaplan_wald(x=np.array(x), t=t, g=g)

N_cards_s2 = 3000  # card count for stratum 2; set below the manifest's ballot count for calculation purposes

In [7]:
manifest_file_s2 = './Data/N19 ballot manifest with WH location for RLA Upload VBM 11-14.xlsx'
manifest_type_s2 = 'STYLE'
mvr_file_s2 = './Data/mvr_prepilot_test.json'

### Contest parameters

In [8]:
# contests to audit. Edit with details of your contest (e.g., Contest 339 is the DA race)
contests = {'339':{'risk_limit':0.05,
                     'choice_function':'IRV',
                     'n_winners':1,
                     'candidates':['15','16','17','18'],
                     'reported_winners' : ['15'],
                     'assertion_file' : './Data/SF2019Nov8Assertions.json'
                    }
           }

Example of other social choice functions:

    contests = {'city_council': {'risk_limit': 0.05,
                                 'choice_function': 'plurality',
                                 'n_winners': 3,
                                 'candidates': ['Doug', 'Emily', 'Frank', 'Gail', 'Harry'],
                                 'reported_winners': ['Doug', 'Emily', 'Frank']
                                },
                'measure_1': {'risk_limit': 0.05,
                              'choice_function': 'supermajority',
                              'share_to_win': 2/3,
                              'n_winners': 1,
                              'candidates': ['yes', 'no'],
                              'reported_winners': ['yes']
                             }
               }

## Make assertions

In [9]:
# read the assertions for the IRV contest
for c in contests:
    if contests[c]['choice_function'] == 'IRV':
        with open(contests[c]['assertion_file'], 'r') as f:
            contests[c]['assertion_json'] = json.load(f)['audits'][0]['assertions']

In [10]:
# construct the dict of dicts of assertions for each contest
all_assertions = Assertion.make_all_assertions(contests)

In [11]:
all_assertions

{'339': {'18 v 17 elim 15 16 45': <assertion_audit_utils.Assertion at 0x1317f94cb48>,
  '17 v 16 elim 15 18 45': <assertion_audit_utils.Assertion at 0x1317f94c588>,
  '15 v 18 elim 16 17 45': <assertion_audit_utils.Assertion at 0x1317f94ca48>,
  '18 v 16 elim 15 17 45': <assertion_audit_utils.Assertion at 0x1317f94c988>,
  '17 v 16 elim 15 45': <assertion_audit_utils.Assertion at 0x1317f94c688>,
  '15 v 17 elim 16 45': <assertion_audit_utils.Assertion at 0x1317f94c948>,
  '15 v 17 elim 16 18 45': <assertion_audit_utils.Assertion at 0x1317f94c8c8>,
  '18 v 16 elim 15 45': <assertion_audit_utils.Assertion at 0x1317f94c888>,
  '15 v 16 elim 17 45': <assertion_audit_utils.Assertion at 0x1317f94c488>,
  '15 v 16 elim 17 18 45': <assertion_audit_utils.Assertion at 0x1317f94c2c8>,
  '15 v 16 elim 18 45': <assertion_audit_utils.Assertion at 0x1317f94c408>,
  '15 v 16 elim 45': <assertion_audit_utils.Assertion at 0x1317f94cac8>,
  '15 v 45': <assertion_audit_utils.Assertion at 0x1317f94cf88>}}
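The labels in the output above appear to combine a winner, a loser, and a list of already-eliminated candidates. The sketch below shows one way such labels could be derived from RAIRE-style JSON. The nesting `['audits'][0]['assertions']` comes from the cell that reads the file; the field names `winner`, `loser`, and `already_eliminated`, and the inline JSON itself, are assumptions for illustration and are not copied from the real assertion file.

```python
import json

# Illustrative stand-in for an assertion file; the field names are assumed.
raire_text = """
{"audits": [{"assertions": [
    {"winner": "15", "loser": "45", "already_eliminated": []},
    {"winner": "15", "loser": "16", "already_eliminated": ["17", "18", "45"]}
]}]}
"""


def assertion_label(a):
    """Build a label such as '15 v 16 elim 17 18 45' from one assertion dict."""
    label = a["winner"] + " v " + a["loser"]
    eliminated = a.get("already_eliminated") or []
    if eliminated:
        label += " elim " + " ".join(eliminated)
    return label


assertions = json.loads(raire_text)["audits"][0]["assertions"]
labels = [assertion_label(a) for a in assertions]
print(labels)  # ['15 v 45', '15 v 16 elim 17 18 45']
```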

## Stratum 1: CVR - Ballot Comparison Audit

### Read the ballot manifest

In [12]:
# special for Primary/Dominion manifest format
manifest_s1 = pd.read_excel(manifest_file_s1)

### Read the CVRs 

In [13]:
cvr_input = []
with open(cvr_file) as f:
    cvr_reader = csv.reader(f, delimiter=',', quotechar='"')
    for row in cvr_reader:
        cvr_input.append(row)

print("Read {} rows".format(len(cvr_input)))

Read 146664 rows


In [14]:
# Import CVRs
cvr_list = CVR.from_raire(cvr_input)
print("After merging, there are CVRs for {} cards".format(len(cvr_list)))

After merging, there are CVRs for 146662 cards


In [15]:
# turn RAIRE-style identifiers into Dominion's style by substituting "-" for "_"
for c in cvr_list:
    c.set_id(str(c.id).replace("_","-"))

In [16]:
for i in range(10):
    print(str(cvr_list[i]))

id: 99813-1-1 votes: {'339': {'17': 1}} phantom: False
id: 99813-1-3 votes: {'339': {'16': 1}} phantom: False
id: 99813-1-6 votes: {'339': {'18': 1, '17': 2, '15': 3, '16': 4}} phantom: False
id: 99813-1-8 votes: {'339': {'18': 1}} phantom: False
id: 99813-1-9 votes: {'339': {'': 1}} phantom: False
id: 99813-1-11 votes: {'339': {'16': 1, '17': 2, '15': 3, '18': 4}} phantom: False
id: 99813-1-13 votes: {'339': {'15': 1, '16': 2, '17': 3, '18': 4}} phantom: False
id: 99813-1-16 votes: {'339': {'15': 1}} phantom: False
id: 99813-1-17 votes: {'339': {'15': 1}} phantom: False
id: 99813-1-19 votes: {'339': {'16': 1}} phantom: False


In [17]:
# Check that there is a CVR for every card cast in the contest. If not, add phantoms.

n_cvrs = len(cvr_list)
manifest_s1, manifest_cards_s1, phantom_cards_s1 = prep_dominion_manifest(manifest_s1, N_cards_s1, n_cvrs)

manifest_s1

Unnamed: 0,Tray #,Tabulator Number,Batch Number,Total Ballots,VBMCart.Cart number,cum_cards
0,1,99808,78,116,3,116
1,1,99808,77,115,3,231
2,1,99808,79,120,3,351
3,1,99808,81,76,3,427
4,1,99808,80,116,3,543
...,...,...,...,...,...,...
5476,3506,99815,86,2,19,292557
5477,3506,99815,84,222,19,292779
5478,3506,99815,83,346,19,293125
5479,3506,99815,82,332,19,293457


In [18]:
# Create CVRs and MVRs for phantom cards
# If the sample draws a phantom card, these CVRs will be used in the comparison.
# phantom MVRs should be treated as zeros by the Assorter for every contest
phantom_vrs = []
for i in range(phantom_cards_s1):
    phantom_vrs.append(CVR(id='phantom-1-'+str(i+1), votes={}, phantom = True))  # matches expected RAIRE id for parsing later
    
cvr_list = cvr_list + phantom_vrs

print("Created {} phantom records".format(len(phantom_vrs)))

Created 0 phantom records


In [19]:
manifest_cards_s1

293555

In [20]:
# find the mean of the assorters for the CVRs and check whether the assertions are met
min_margin = find_margins(contests, all_assertions, cvr_list)

print("minimum assorter margin {}".format(min_margin))
for c in contests:
    print("margins in contest {}".format(c))
    for a, m in contests[c]['margins'].items():
        print(a, m)


minimum assorter margin 0.019902906001554532
margins in contest 339
18 v 17 elim 15 16 45 0.045792366120740224
17 v 16 elim 15 18 45 0.019902906001554532
15 v 18 elim 16 17 45 0.028923647570604505
18 v 16 elim 15 17 45 0.0830003681935334
17 v 16 elim 15 45 0.058079120699294995
15 v 17 elim 16 45 0.08064120222007065
15 v 17 elim 16 18 45 0.10951712099930444
18 v 16 elim 15 45 0.14875018750596603
15 v 16 elim 17 45 0.13548158350492967
15 v 16 elim 17 18 45 0.1365247985163165
15 v 16 elim 18 45 0.16666893946625572
15 v 16 elim 45 0.15626406294745743
15 v 45 0.2956457705472446


In [21]:
check_audit_parameters(risk_function_1, g, contests, error_rate)

In [22]:
write_fisher_audit_parameters(log_file, seed, replacement, contests, [risk_function_1, risk_function_2], \
                                g, [N_cards_s1, 0], [manifest_cards_s1, 0], [phantom_cards_s1, 0], n_cvrs, error_rate)

### Set up for sampling

### Find initial sample size

In [23]:
# find initial sample size
risk_fn_1_ss = lambda x: risk_fn_1(x, t=1/2)
ss_fn = lambda m, r: TestNonnegMean.initial_sample_size(\
                        risk_function=risk_fn_1_ss, N=N_cards_s1, margin=m,\
                        error_rate=error_rate, alpha=r)
sample_size_s1 = find_sample_size(contests, all_assertions, sample_size_function=ss_fn)

print(sample_size_s1)

# override the computed initial sample size with a fixed value
sample_size_s1 = 200
print(sample_size_s1)


424
200


### Draw the first sample

In [24]:
# draw the initial sample
prng = SHA256(seed)
sample_s1 = sample_by_index(N_cards_s1, sample_size_s1, prng=prng) # 1-indexed
n_phantom_sample_s1 = np.sum([cvr_list[i-1].phantom for i in sample_s1])  # sample is 1-indexed; cvr_list is 0-indexed
print("The sample includes {} phantom cards.".format(n_phantom_sample_s1))

The sample includes 0 phantom cards.


In [25]:
cvr_sample_lookup, cvr_sample, mvr_phantoms_sample = sample_from_cvr(cvr_list, manifest_s1, sample_s1)

In [26]:
# write the sample
write_cards_sampled(sample_file, cvr_sample_lookup, print_phantoms=False)

### Read the audited sample data

In [27]:
with open(mvr_file_s1) as f1:
    mvr_json_s1 = json.load(f1)

mvr_sample_s1 = CVR.from_dict(mvr_json_s1['ballots'])

In [28]:
# add MVRs for phantoms
mvr_sample_s1 = mvr_sample_s1 + mvr_phantoms_sample

In [29]:
prep_sample(mvr_sample_s1, cvr_sample)

## Stratum 2: No CVR - Ballot Polling Audit

### Process the ballot manifest

In [30]:
# Read manifest
manifest_s2 = pd.read_excel(manifest_file_s2)

# Add phantoms if necessary
manifest_s2, manifest_cards_s2, phantom_cards_s2 = prep_dominion_manifest(manifest_s2, N_cards_s2)

manifest_s2

Unnamed: 0,Tray #,Tabulator Number,Batch Number,Total Ballots,VBMCart.Cart number,cum_cards
0,1,99808,78,116,3,116
1,1,99808,77,115,3,231
2,1,99808,79,120,3,351
3,1,99808,81,76,3,427
4,1,99808,80,116,3,543
...,...,...,...,...,...,...
5476,3506,99815,86,2,19,292557
5477,3506,99815,84,222,19,292779
5478,3506,99815,83,346,19,293125
5479,3506,99815,82,332,19,293457


In [31]:
check_audit_parameters(risk_function_2, g, contests)

In [32]:
write_fisher_audit_parameters(log_file, seed, replacement, contests, [risk_function_1, risk_function_2], \
                                g, [N_cards_s1, N_cards_s2], [manifest_cards_s1, manifest_cards_s2], \
                                [phantom_cards_s1, phantom_cards_s2], n_cvrs, error_rate)

### Set up for sampling (Average Sample Number) 

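$ASN \approx 2\ln(1/\alpha)/x^2$, where $\alpha$ is the risk limit and $x$ is the margin. $ASN$ grows as $\alpha$ and $x$ shrink, so using the smallest risk limit and the smallest margin across contests gives the largest (most conservative) value.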
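The approximation can be evaluated directly. The sketch below uses a hypothetical helper, `asn_ballot_polling`, and caps the result at the stratum size, since a sample drawn without replacement cannot exceed the number of cards.

```python
import math


def asn_ballot_polling(alpha, margin):
    """Approximate average sample number: 2 ln(1/alpha) / margin^2."""
    return 2 * math.log(1 / alpha) / margin ** 2


# A 10% margin at a 5% risk limit.
print(round(asn_ballot_polling(0.05, 0.1)))  # 599

# The smallest assorter margin reported earlier in this notebook.
asn = asn_ballot_polling(0.05, 0.019902906001554532)
n_cards = 3000  # stratum size used in this example
print(min(n_cards, math.ceil(asn)))  # 3000
```

For the smallest margin in this audit the approximation is roughly 15,000 cards, far more than the stratum holds, so the cap binds.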

In [33]:
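# TODO initial_sample_size-like function
# NOTE: this takes the square root of the ASN expression 2*ln(1/alpha)/margin**2,
# so it yields a much smaller sample size than that expression itself would.
sample_size_s2 = int(min(N_cards_s2, np.sqrt(2*np.log(1/min(contests[c]['risk_limit'] for c in contests))/min_margin**2)))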
print(sample_size_s2)

122


In [34]:
prng = SHA256(seed)
sample_s2 = sample_by_index(N_cards_s2, sample_size_s2, prng=prng)
n_phantom_sample_s2 = np.sum([i>manifest_cards_s2 for i in sample_s2]) 
print("The sample includes {} phantom cards.".format(n_phantom_sample_s2))

The sample includes 0 phantom cards.


In [35]:
manifest_sample_lookup = sample_from_manifest(manifest_s2, sample_s2)

In [36]:
# write the sample
write_cards_sampled(sample_file, manifest_sample_lookup, print_phantoms=True)

In [37]:
with open(mvr_file_s2) as f2:
    mvr_json_s2 = json.load(f2)

mvr_sample_s2 = CVR.from_dict(mvr_json_s2['ballots'])

for i in range(10):
    print(mvr_sample_s2[i])

id: 99807-3-2 votes: {'339': {'16': 3, '17': 2, '18': 1}} phantom: False
id: 99809-27-41 votes: {'339': {'15': 3, '16': 1, '17': 4, '18': 2}} phantom: False
id: 99807-4-20 votes: {'339': {'15': 1, '17': 2}} phantom: False
id: 99805-68-45 votes: {'339': {'15': 4, '16': 1, '17': 2, '18': 3}} phantom: False
id: 99805-30-44 votes: {'339': {'15': 3, '16': 2, '17': 1, '18': 4}} phantom: False
id: 99805-30-89 votes: {'339': {'15': 2, '17': 1}} phantom: False
id: 99808-28-57 votes: {'339': {'17': 1}} phantom: False
id: 99811-26-37 votes: {'339': {'18': 1}} phantom: False
id: 99804-19-38 votes: {'339': {'15': 2, '18': 1}} phantom: False
id: 99802-15-23 votes: {'339': {'15': 3, '16': 4, '17': 1, '18': 2}} phantom: False


## Fisher's Combining Function

In [38]:
pvalue_tests = [risk_function_1, risk_function_2]
pvalue_funs = [risk_fn_1, risk_fn_2]
manifest_types = [manifest_type_s1, manifest_type_s2]
N_cards = [N_cards_s1, N_cards_s2]
mvr_samples = [mvr_sample_s1, mvr_sample_s2]

In [39]:
fisher_p_max = find_fisher_p_values(contests, all_assertions, pvalue_tests, pvalue_funs, g, \
                                    manifest_types, N_cards, mvr_samples, cvr_sample)
print("maximum assertion p-value {}".format(fisher_p_max))

done = summarize_status(contests, all_assertions)


maximum assertion p-value 0.5140167509135976
p-values for assertions in contest 339
18 v 17 elim 15 16 45 0.10840969417437585
17 v 16 elim 15 18 45 0.5140167509135976
15 v 18 elim 16 17 45 0.31264039571878244
18 v 16 elim 15 17 45 0.008046409385325304
17 v 16 elim 15 45 0.048403963039792
15 v 17 elim 16 45 0.009582338739282825
15 v 17 elim 16 18 45 0.001068282318011593
18 v 16 elim 15 45 4.6312908403844943e-05
15 v 16 elim 17 45 0.00013623347326918722
15 v 16 elim 17 18 45 0.00012523034747780049
15 v 16 elim 18 45 1.0513308880688577e-05
15 v 16 elim 45 2.4956505131012996e-05
15 v 45 9.271139411737295e-11

contest 339 audit INCOMPLETE at risk limit 0.05. Attained risk 0.5140167509135976
assertions remaining to be proved:
18 v 17 elim 15 16 45: current risk 0.10840969417437585
17 v 16 elim 15 18 45: current risk 0.5140167509135976
15 v 18 elim 16 17 45: current risk 0.31264039571878244


In [40]:
write_fisher_audit_parameters(log_file, seed, replacement, contests, [risk_function_1, risk_function_2], \
                                g, [N_cards_s1, N_cards_s2], [manifest_cards_s1, manifest_cards_s2], \
                                [phantom_cards_s1, phantom_cards_s2], n_cvrs, error_rate)

### END

In [41]:
#TODO new sample size for two strata