# Assertion RLA

## Overview of the assertion audit tool

The tool requires as input:

+ audit-specific and contest-specific parameters, such as
    - whether to sample with or without replacement
    - the name of the risk function to use, and any parameters it requires
    - a risk limit for each contest to be audited
    - the social choice function for each contest, including the number of winners
    - candidate identifiers
+ a ballot manifest**
+ a random seed
+ a file of cast vote records
+ reported results for each contest
+ JSON files of assertions for IRV contests (one file per IRV contest)
+ human reading of voter intent from the paper cards selected for audit

** The ballot manifest could list only cards purported to contain the contests under audit (manifest_type == "STYLE"), or could include cards that might not contain the contest (manifest_type == "ALL"). These are treated differently. If the sample is to be drawn only from cards that, according to the CVR, contain the contest, and a sampled card turns out not to contain the contest, that is considered a discrepancy, dealt with using the "phantoms to zombies" approach.
It is assumed that every CVR corresponds to a card in the manifest, but there might be cards cast in the contest for which there is no corresponding CVR. In that case, phantom records are created to ensure that the audit is still truly risk-limiting.

The tool helps select cards for audit, and reports when the audit has found sufficiently strong evidence to stop.

The tool exports a log of all the audit inputs except the CVR file, but including the auditors' manually determined voter intent from the audited cards.

The current version uses a single sample to audit all contests. It would be possible to refine things to target smaller contests.

In [1]:
from __future__ import division, print_function

import math
import json
import warnings
import numpy as np
import pandas as pd
import csv

from collections import OrderedDict
from IPython.display import display, HTML

from cryptorandom.cryptorandom import SHA256
from cryptorandom.sample import sample_by_index

from assertion_audit_utils import \
    Assertion, Assorter, CVR, TestNonnegMean, check_audit_parameters, write_audit_parameters
from dominion_tools import \
    prep_dominion_manifest, sample_from_manifest, write_cards_sampled


## Audit parameters

* `seed`: the numeric seed for the pseudo-random number generator used to draw the sample
* `replacement`: whether to sample with replacement. If the sample is drawn with replacement, `g` (gamma) must also be specified.
* `risk_function`: the function to be used to measure risk. Options are `kaplan_markov`, `kaplan_wald`, `kaplan_kolmogorov`, `wald_sprt`, and `kaplan_martingale`. Not all risk functions work with every social choice function. `wald_sprt` applies only to plurality contests.
* `g`: a parameter to hedge against the possibility of observing a maximum overstatement. Requires $g \in [0, 1)$ for `kaplan_markov` and `kaplan_wald`.
* `N_cards`: an upper bound on the number of pieces of paper cast in the contest. This should be derived independently of the voting system. A ballot consists of one or more cards.
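
To illustrate why `g` matters, here is a minimal sketch of a Kaplan-Markov-style P-value for the hypothesis that the mean of a nonnegative population is at most `t`. It is not the tool's `kaplan_markov` implementation: the function name `kaplan_markov_sketch` is hypothetical, and the sketch assumes an i.i.d. sample drawn with replacement. With `g = 0`, a single observation equal to zero makes the product infinite, so the P-value can never fall below 1 again; a small positive `g` keeps the test able to recover.

```python
import numpy as np

def kaplan_markov_sketch(x, t=0.5, g=0.0):
    """
    Illustrative P-value for the hypothesis that the population mean is <= t,
    given nonnegative observations x drawn i.i.d. with replacement.
    Hypothetical helper for exposition, not the tool's implementation.
    """
    x = np.asarray(x, dtype=float)
    if np.any(x < 0):
        raise ValueError("observations must be nonnegative")
    if not 0 <= g < 1:
        raise ValueError("g must be in [0, 1)")
    # Each observation contributes a factor (t+g)/(x_i+g).
    # If g == 0 and some x_i == 0, that factor is infinite.
    with np.errstate(divide="ignore"):
        terms = (t + g) / (x + g)
    # Markov's inequality: the product bounds the P-value; cap it at 1.
    return float(min(1.0, np.prod(terms)))

# One zero observation followed by ten observations equal to 1
x = [0] + [1] * 10
print(kaplan_markov_sketch(x, g=0.0))   # stuck at 1 because of the zero
print(kaplan_markov_sketch(x, g=0.1))   # padding lets the evidence accumulate
```

The price of a positive `g` is that each error-free observation shrinks the P-value by a smaller factor, so more cards are needed when no errors are found.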

----

* `cvr_file`: filename for CVRs (input)
* `manifest_file`: filename for ballot manifest (input)
* `manifest_type`: "STYLE" if the manifest is supposed to list only cards that contain the contests under audit; "ALL" if the manifest contains all cards cast in the election
* `assertion_file`: filename of assertions for IRV contests, in RAIRE format
* `sample_file`: filename for sampled card identifiers (output)
* `mvr_file`: filename for manually ascertained votes from sampled cards (input)
* `log_file`: filename for audit log (output)

----

* `error_rates`: dict of expected error rates. The keys are
    + `o1_rate`: expected rate of 1-vote overstatements. Recommended value $\ge$ 0.001 if there are hand-marked ballots. Larger values increase the initial sample size, but make it more likely that the audit will conclude in a single round if the audit finds errors.
    + `o2_rate`: expected rate of 2-vote overstatements. Recommended value 0.
    + `u1_rate`: expected rate of 1-vote understatements. Recommended value 0.
    + `u2_rate`: expected rate of 2-vote understatements. Recommended value 0.

* `contests`: a dict of contest-specific data 
    + the keys are unique contest identifiers for contests under audit
    + the values are dicts with keys:
        - `risk_limit`: the risk limit for the audit of this contest
        - `cards_cast`: an upper bound on the number of cast cards that contain the contest
        - `choice_function`: `plurality`, `supermajority`, or `IRV`
        - `n_winners`: number of winners for plurality contests. (Multi-winner IRV is not supported; multi-winner super-majority is nonsense.)
        - `share_to_win`: for super-majority contests, the fraction of valid votes required to win, e.g., 2/3.
        - `candidates`: list of names or identifiers of candidates
        - `reported_winners` : list of identifier(s) of candidate(s) reported to have won. Length should equal `n_winners`.
        - `assertion_file`: filename for a set of json descriptors of Assertions (see technical documentation) that collectively imply the reported outcome of the contest is correct. Required for IRV; ignored for other social choice functions
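
The constraints in the list above can be checked mechanically before starting the audit. The sketch below is a hypothetical helper, `check_contests_sketch`, written only from the descriptions in this section; it is not the tool's `check_audit_parameters`, whose checks may differ.

```python
def check_contests_sketch(contests):
    """Hypothetical helper (not part of the tool): list problems in a `contests` dict."""
    problems = []
    for cid, c in contests.items():
        scf = c.get('choice_function')
        winners = c.get('reported_winners', [])
        n_winners = c.get('n_winners')
        if not 0 < c.get('risk_limit', 0) < 1:
            problems.append("{}: risk_limit must be strictly between 0 and 1".format(cid))
        if scf not in ('plurality', 'supermajority', 'IRV'):
            problems.append("{}: unknown choice_function {!r}".format(cid, scf))
        if len(winners) != n_winners:
            problems.append("{}: len(reported_winners) != n_winners".format(cid))
        if not set(winners) <= set(c.get('candidates', [])):
            problems.append("{}: a reported winner is not a candidate".format(cid))
        # multi-winner IRV is unsupported; multi-winner super-majority makes no sense
        if scf in ('supermajority', 'IRV') and n_winners != 1:
            problems.append("{}: {} contests must have exactly one winner".format(cid, scf))
        if scf == 'supermajority' and not 0 < c.get('share_to_win', 0) < 1:
            problems.append("{}: share_to_win must be strictly between 0 and 1".format(cid))
        if scf == 'IRV' and not c.get('assertion_file'):
            problems.append("{}: IRV contests require an assertion_file".format(cid))
    return problems

example = {'measure_1': {'risk_limit': 0.05,
                         'choice_function': 'supermajority',
                         'share_to_win': 2/3,
                         'n_winners': 1,
                         'candidates': ['yes', 'no'],
                         'reported_winners': ['yes']}}
print(check_contests_sketch(example))
```

Returning a list of messages rather than raising on the first problem lets an auditor fix every issue in the parameter file in one pass.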

In [2]:
seed = 12345678901234567890  # use, e.g., 20 rolls of a 10-sided die. Seed doesn't have to be numeric
replacement = True  # Sampling without replacement isn't implemented
risk_function = "kaplan_martingale"
g = 0.1
N_cards = 206036 # per SF Elections release 9

In [3]:
cvr_file = './Data/SFDA_2019_Nov8Partial.raire'
manifest_file = './Data/N19 ballot manifest with WH location for RLA Upload.xlsx'
manifest_type = 'STYLE'
sample_file = './Data/sample.csv'
mvr_file = './Data/mvr.json'
log_file = './Data/log.json'

In [4]:
error_rates = {'o1_rate':0.002,      # expect 2 1-vote overstatements per 1000 ballots in the CVR stratum
               'o2_rate':0,          # expect 0 2-vote overstatements
               'u1_rate':0,          # expect 0 1-vote understatements
               'u2_rate':0}          # expect 0 2-vote understatements

In [5]:
# contests to audit. Contest 339 is the DA race
contests = {'339':{'risk_limit':0.05,
                     'choice_function':'IRV',
                     'n_winners':1,
                     'candidates':['15','16','17','18','45'],
                     'reported_winners' : ['15'],
                     'assertion_file' : './Data/SF2019Nov8Assertions.json'
                    }
           }

Example of other social choice functions:

    contests = {'city_council': {'risk_limit': 0.05,
                                 'choice_function': 'plurality',
                                 'n_winners': 3,
                                 'candidates': ['Doug', 'Emily', 'Frank', 'Gail', 'Harry'],
                                 'reported_winners': ['Doug', 'Emily', 'Frank']
                                },
                'measure_1': {'risk_limit': 0.05,
                              'choice_function': 'supermajority',
                              'share_to_win': 2/3,
                              'n_winners': 1,
                              'candidates': ['yes', 'no'],
                              'reported_winners': ['yes']
                             }
               }

## Find audit parameters and conduct audit

* Import contest data
    - ballot manifest
    - cast vote records (CVRs)

* Set up data for the phantom/zombie treatment of missing CVRs and cards not listed in the manifest
    - create empty CVRs and MVRs for unaccounted-for cards that could contain the contests

* For each contest:
    - find claimed outcome by applying SCF to CVRs
    - complain if claimed outcome disagrees with reported outcome
    - construct assertions that imply contest outcome is correct
    - for each assertion:
        + find generalized diluted margin
        
* Find initial (incremental) sample size from smallest diluted margin, for the sampling plan
    - Complain if expected error rates imply any assertion is incorrect

* For each assertion:
    - Initialize discrepancy counts to zero (o1, o2, u1, u2)
    - Initialize measured risk to 1

* While measured risk for any assertion exceeds its risk limit:
    - expand sample by estimated increment
        + identify sampled cards in manifest
        + update the log file with incremental sample
    - import audit results when cards have been audited
    - for each assertion:
        + for each sampled card:
            - increment discrepancy count for the assertion
        + find measured risk
    - update log file with new measured risks
    - if any measured risk exceeds its risk limit:
        + estimate incremental sample required to complete the audit

In [6]:
# read the assertions for the IRV contest
for c in contests:
    if contests[c]['choice_function'] == 'IRV':
        with open(contests[c]['assertion_file'], 'r') as f:
            contests[c]['assertion_json'] = json.load(f)['audits'][0]['assertions']

In [7]:
# construct the dict of dicts of assertions for each contest
all_assertions = Assertion.make_all_assertions(contests)

In [8]:
all_assertions

{'339': {'18 v 17 elim 15 16 45': <assertion_audit_utils.Assertion at 0x1143590f0>,
  '17 v 16 elim 15 18 45': <assertion_audit_utils.Assertion at 0x114359160>,
  '15 v 18 elim 16 17 45': <assertion_audit_utils.Assertion at 0x1143599e8>,
  '18 v 16 elim 15 17 45': <assertion_audit_utils.Assertion at 0x114359c88>,
  '17 v 16 elim 15 45': <assertion_audit_utils.Assertion at 0x114359cf8>,
  '15 v 17 elim 16 45': <assertion_audit_utils.Assertion at 0x114359c50>,
  '15 v 17 elim 16 18 45': <assertion_audit_utils.Assertion at 0x114359ba8>,
  '18 v 16 elim 15 45': <assertion_audit_utils.Assertion at 0x114359a58>,
  '15 v 16 elim 17 45': <assertion_audit_utils.Assertion at 0x114348c50>,
  '15 v 16 elim 17 18 45': <assertion_audit_utils.Assertion at 0x114348cc0>,
  '15 v 16 elim 18 45': <assertion_audit_utils.Assertion at 0x114348d68>,
  '15 v 16 elim 45': <assertion_audit_utils.Assertion at 0x114348b70>,
  '15 v 45': <assertion_audit_utils.Assertion at 0x114348da0>}}

## Read the ballot manifest

In [9]:
# special for SF/Dominion manifest format
manifest = pd.read_excel(manifest_file)

FileNotFoundError: [Errno 2] No such file or directory: './Data/N19 ballot manifest with WH location for RLA Upload'

## Read the CVRs 

In [None]:
cvr_input = []
with open(cvr_file) as f:
    cvr_reader = csv.reader(f, delimiter=',', quotechar='"')
    for row in cvr_reader:
        cvr_input.append(row)

print("Read {} rows".format(len(cvr_input)))

In [None]:
# Import CVRs
cvr_list = CVR.from_raire(cvr_input)
print("After merging, there are CVRs for {} cards".format(len(cvr_list)))

In [None]:
for i in range(10):
    print(cvr_list[i].id)

In [None]:
# Check that there is a CVR for every card cast in the contest. If not, add phantoms.

n_cvrs = len(cvr_list)
manifest, manifest_cards, phantom_cards = prep_dominion_manifest(manifest, N_cards, n_cvrs)

manifest

In [None]:
# Create CVRs and MVRs for phantom cards
# If the sample draws a phantom card, these CVRs will be used in the comparison.
# phantom MVRs should be treated as zeros by the Assorter for every contest
phantom_vrs = []
for i in range(phantom_cards):
    phantom_vrs.append(CVR(id='phantom_1_'+str(i+1), votes={}, phantom = True))  # matches expected RAIRE id for parsing later
    
cvr_list = cvr_list + phantom_vrs

print("Created {} phantom records".format(len(phantom_vrs)))

In [None]:
manifest_cards

In [None]:
# find the mean of the assorters for the CVRs and check whether the assertions are met
assorter_means = {}
min_mean = np.inf  # np.infty was removed in NumPy 2.0
for c in contests.keys():
    contests[c]['cvr_means'] = {}
    for asrtn in all_assertions[c]:
        # find mean of the assertion for the CVRs
        amean = all_assertions[c][asrtn].assorter_mean(cvr_list)
        if amean < 1/2:
            warnings.warn("assertion {} not satisfied by CVRs: mean value is {}".format(asrtn, amean))
        contests[c]['cvr_means'][asrtn] = amean
        min_mean = np.min([min_mean, amean])

print("minimum assorter mean {}".format(min_mean))

In [None]:
check_audit_parameters(risk_function, g, error_rates, contests)

In [None]:
write_audit_parameters(log_file, seed, replacement, risk_function, g, N_cards, n_cvrs, \
                       manifest_cards, phantom_cards, error_rates, contests)

## Set up for sampling

## Estimate initial sample size

In [None]:
# find initial sample size
initial_size = 100 # FIX ME
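
Until the placeholder above is replaced, a rough figure can be obtained from the smallest assorter mean. The sketch below is one possible back-of-the-envelope estimate, not the tool's method. It assumes a ballot-level comparison audit in which every assorter is bounded above by 1, takes the margin to be `2 * min_mean - 1`, uses a Kaplan-Markov-style test with `g = 0`, and assumes the audit finds no discrepancies. Under those assumptions each error-free card multiplies the measured risk by `1 - margin/2`. The name `initial_size_sketch` is hypothetical, and the notebook's configured `kaplan_martingale` risk function would give a different number.

```python
import math

def initial_size_sketch(min_mean, risk_limit):
    """
    Hypothetical estimate of the number of cards needed to reach risk_limit
    when no discrepancies are found (Kaplan-Markov-style test, g = 0).
    """
    margin = 2 * min_mean - 1          # assumed relation between mean and margin
    if margin <= 0:
        raise ValueError("assorter mean must exceed 1/2 for the assertion to hold")
    # smallest n with (1 - margin/2)**n <= risk_limit
    return math.ceil(math.log(risk_limit) / math.log(1 - margin / 2))

print(initial_size_sketch(min_mean=0.55, risk_limit=0.05))
```

Any expected errors (for instance a nonzero `o1_rate`) or a positive `g` would make the required sample larger than this figure.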

## Draw the first sample

In [None]:
# draw the initial sample
prng = SHA256(seed)
sample = sample_by_index(N_cards, initial_size, prng=prng) # 1-indexed
phantoms = np.sum(sample > manifest_cards)
print("The sample includes {} phantom cards, which will be treated conservatively.".format(phantoms))

In [None]:
# look up the sampled cards in the manifest
# sample_from_manifest is the lookup function imported from dominion_tools above
sample_cards = sample_from_manifest(manifest, sample)

# write the sample
write_cards_sampled(sample_file, sample_cards, print_phantoms=False)

## Read the audited sample data

In [None]:
with open(mvr_file) as f:
    mvr_json = json.load(f)

mvr = CVR.from_dict(mvr_json['ballots'])

In [None]:
for i in range(5):
    print(mvr[i].id, mvr[i].votes)

## Find measured risks for all assertions

### Deal with Phantoms/Zombies

Given an independent (i.e., not relying on the voting system) upper bound on the number of cards that contain the contest, if the number of CVRs that contain the contest does not exceed that bound, we can sample from paper purported to contain the contest and use the "zombies" approach (Banuelos & Stark) to deal with missing CVRs. This can greatly increase the efficiency of the audit if the contest is on only a small percentage of the cast cards.

Any sampled phantom card (i.e., a card for which there are no CVRs) is treated as if its CVR is a non-vote (which it is), and as if its MVR was least favorable (a "zombie" producing the greatest doubt in every assertion, separately). Any sampled card for which
there is a CVR is compared to its corresponding CVR. 
If the card turns out not to contain the contest (despite the fact that the CVR says it does), the MVR is treated in the least favorable way for each assertion (i.e., as a zombie rather than as a non-vote).
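
As a concrete illustration of these rules, the sketch below computes the overstatement for a single plurality assertion ("winner beat loser"). Everything in it is hypothetical and for exposition only: votes are represented as a dict mapping candidate to `True`, and the assorter takes the value 1 for a vote for the winner, 0 for a vote for the loser, and 1/2 otherwise. It is not the tool's `Assorter` class.

```python
def plurality_assort(votes, winner, loser):
    """Assorter value: 1 for the winner, 0 for the loser, 1/2 otherwise."""
    w, l = bool(votes.get(winner)), bool(votes.get(loser))
    if w and not l:
        return 1.0
    if l and not w:
        return 0.0
    return 0.5   # non-vote, overvote, or vote for some other candidate

def overstatement_sketch(cvr_votes, mvr_votes, winner, loser,
                         cvr_is_phantom=False, card_has_contest=True):
    """Overstatement (CVR assorter value minus MVR assorter value) for one card."""
    # A phantom has no CVR: treat the CVR as a non-vote.
    cvr_value = 0.5 if cvr_is_phantom else plurality_assort(cvr_votes, winner, loser)
    if cvr_is_phantom or not card_has_contest:
        # "zombie": the MVR takes the least favorable value for this assertion
        mvr_value = 0.0
    else:
        mvr_value = plurality_assort(mvr_votes, winner, loser)
    return cvr_value - mvr_value

# phantom card: CVR non-vote against a zombie MVR
print(overstatement_sketch({}, {}, 'Doug', 'Gail', cvr_is_phantom=True))
# CVR shows a vote for the winner, but the card does not contain the contest
print(overstatement_sketch({'Doug': True}, {}, 'Doug', 'Gail', card_has_contest=False))
```

Because the zombie value is chosen separately for each assertion, the same sampled card can contribute different overstatements to different assertions.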

In [None]:
# adjust for phantoms

In [None]:
# Identify assertions not yet confirmed

In [None]:
# Log the status of the audit 

## Escalation: how many more cards should be audited?

This tool estimates how many more cards will need to be audited to confirm any remaining contests. The enlarged sample size is based on:

* cards already sampled
* assumption that we will continue to see errors at the same rate observed in the sample
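
One way to turn those two ingredients into a number is sketched below. It is a heuristic, not the tool's method (which is still to be written): it assumes a ballot-level comparison audit with assorters bounded by 1, a Kaplan-Markov-style test with `g = 0`, and that the only errors are 1-vote overstatements occurring at the observed rate. Under those assumptions an error-free card multiplies the measured risk by `1 - margin/2` and a 1-vote overstatement multiplies it by `2 - margin`. The function name `additional_cards_sketch` and its parameters are hypothetical.

```python
import math

def additional_cards_sketch(p_current, risk_limit, margin, o1_rate):
    """
    Hypothetical estimate of how many more cards are needed, assuming the
    observed 1-vote overstatement rate persists. Returns None if the audit
    is not expected to reach the risk limit at that error rate.
    """
    if p_current <= risk_limit:
        return 0                       # already confirmed
    # expected change in log(measured risk) per additional card
    drift = ((1 - o1_rate) * math.log(1 - margin / 2)
             + o1_rate * math.log(2 - margin))
    if drift >= 0:
        return None                    # errors outweigh the evidence
    return math.ceil(math.log(risk_limit / p_current) / drift)

print(additional_cards_sketch(p_current=0.3, risk_limit=0.05, margin=0.1, o1_rate=0.002))
```

Working with the expected log-multiplier keeps the estimate simple, but it ignores chance variation in the number of errors, so an audit planned this way can still need a further round.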

In [None]:
sample_sizes_new = {}

# TBD


In [None]:
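# augment the sample
# reset the seed so the draw is reproducible from the start
prng = SHA256(seed)
old_sample = sample
# sample_size (the enlarged total sample size) must be set by the TBD estimate above
sample = sample_by_index(N_cards, sample_size, prng=prng)
# cards in the enlarged sample that were not already drawn
incremental_sample = np.sort(list(set(sample) - set(old_sample)))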
