# Assertion RLA

Tool to audit generalized assertions about contests, including assertions for RLAs of IRV contests.

This tool can audit any number of contests simultaneously using the same sample.
The contests can be audited to different risk limits.
The contests can have different social choice functions, including majority or super-majority,
plurality, multi-winner plurality, and IRV.

The audit is "simultaneous" across contests in the following sense: 

>If the reported outcome of a contest is incorrect, the chance that the audit stops without a full hand count is at most the risk limit for that contest. 

The sample is drawn as a random sample of individual ballots, with or without replacement, from a single pool of all ballots cast in the contest(s).

The approach taken here uses a new abstract framing of RLAs.
It involves constructing a set of _assertions_ about each contest. The assertions are predicates on the set of votes, that is, they are either true or false, depending on what votes the whole set of ballots shows. 

Each assertion is characterized by an _assorter_.  An assorter assigns a nonnegative number to each ballot, depending on what votes the ballot shows.

For instance, in a two-candidate plurality contest in which Alice was reported to have beaten Bob, an assorter might assign the value 1 to a ballot if the ballot has a vote for Alice and the value 0 to a ballot if it has a vote for Bob.
If every ballot has a vote either for Alice or for Bob, Alice beat Bob iff the average value of the assorter applied to all the ballots is greater than 1/2.

What if there are undervotes, overvotes, or invalid ballots--or votes for other candidates? Let the assorter assign the value 1/2 to such ballots. Then it is still the case that Alice got more votes than Bob iff the average value of the assorter applied to all the ballots is greater than 1/2.
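
The two-candidate assorter can be sketched in a few lines of Python. The ballot representation (a string naming the candidate voted for, or `None` for an undervote, overvote, or invalid ballot) and the function names are assumptions made for illustration; they are not the tool's API.

```python
def plurality_assorter(vote, winner="Alice", loser="Bob"):
    """Assign 1 to a vote for the winner, 0 to a vote for the loser, 1/2 otherwise."""
    if vote == winner:
        return 1.0
    if vote == loser:
        return 0.0
    return 0.5  # undervote, overvote, invalid ballot, or vote for someone else


def assorter_mean(votes, assorter):
    """Average value of the assorter over all ballots."""
    return sum(assorter(v) for v in votes) / len(votes)


# hypothetical ballots: 5 for Alice, 4 for Bob, 2 invalid, 1 for a third candidate
ballots = ["Alice"] * 5 + ["Bob"] * 4 + [None] * 2 + ["Cindy"]
mean = assorter_mean(ballots, plurality_assorter)
alice_beat_bob = mean > 0.5
```

Here the mean is (5 + 3/2)/12, which exceeds 1/2 exactly because Alice has more votes than Bob.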

### Multi-winner plurality

The reported winner(s) of a (multi-winner) plurality contest really won if every reported winner got more votes than every reported loser.
If we construct (#winners)x(#losers) assorters, one for each (winner, loser) pair, then the reported outcome is correct if the average of each assorter applied to the ballots is greater than 1/2.
This is essentially the approach taken in Stark (20??) to reduce auditing multi-winner contests to auditing a set of two-candidate contests.
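
A minimal sketch of the pairwise construction follows. It assumes each ballot is represented as a set of selected candidates and that a ballot selecting both members of a pair (or neither) is assigned 1/2. The helper names are hypothetical; the actual tool builds its assertions with `Assertion.make_all_assertions`.

```python
from itertools import product


def make_pairwise_assorter(winner, loser):
    """Assorter for one (winner, loser) pair; a ballot is a set of selected candidates."""
    def assorter(selections):
        w = 1 if winner in selections else 0
        l = 1 if loser in selections else 0
        return (w - l + 1) / 2  # 1 if only winner, 0 if only loser, 1/2 otherwise
    return assorter


winners = ["Doug", "Emily", "Frank"]
losers = ["Gail", "Harry"]

# one assorter per (winner, loser) pair: (#winners) x (#losers) in total
assorters = {w + " v " + l: make_pairwise_assorter(w, l)
             for w, l in product(winners, losers)}

# hypothetical ballots, each a set of up to three selections
ballots = ([{"Doug", "Emily", "Frank"}] * 4
           + [{"Doug", "Gail", "Harry"}] * 2
           + [{"Emily", "Frank", "Gail"}] * 2
           + [set()])

means = {name: sum(a(b) for b in ballots) / len(ballots)
         for name, a in assorters.items()}
outcome_correct = all(m > 0.5 for m in means.values())
```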


### IRV

Blom et al. (2018) show how to reduce the correctness of a reported IRV winner to the correctness of the reported winners of a set of two-"generalized candidate" contests.
The "generalized candidates" in those contests are not necessarily the candidates in the original contest; they are just two mutually exclusive (but not exhaustive) conditions that a ballot might satisfy.
If the set of cast ballots has more ballots that satisfy the first condition than the second, the assertion is true.
Thus if we define an assorter to assign 1 to ballots that satisfy the first condition, 0
to ballots that satisfy the second condition, and 1/2 to ballots that satisfy neither,
the assertion is true iff the average of the assorter over the collection of cast ballots
is greater than 1/2.
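
The same idea can be sketched for arbitrary pairs of conditions. The two predicates below ("Alice is ranked first" and "Bob is ranked above Alice, or Bob is ranked and Alice is not") are one illustrative pair of mutually exclusive conditions chosen for this example; they are not taken from the assertion file, and the ballot representation (a list of candidates in preference order) is an assumption.

```python
def make_assorter(first_condition, second_condition):
    """Build an assorter from two mutually exclusive predicates on a ballot."""
    def assorter(ranking):
        if first_condition(ranking):
            return 1.0
        if second_condition(ranking):
            return 0.0
        return 0.5  # ballot satisfies neither condition
    return assorter


def alice_first(ranking):
    return len(ranking) > 0 and ranking[0] == "Alice"


def bob_above_alice(ranking):
    if "Bob" not in ranking:
        return False
    if "Alice" not in ranking:
        return True
    return ranking.index("Bob") < ranking.index("Alice")


assorter = make_assorter(alice_first, bob_above_alice)

# hypothetical ranked ballots, most preferred candidate first
rankings = [["Alice", "Bob"], ["Alice"], ["Cindy", "Alice", "Bob"],
            ["Bob", "Cindy"], ["Cindy"]]
values = [assorter(r) for r in rankings]
assertion_true = sum(values) / len(values) > 0.5
```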

### Super-majority

Suppose that a candidate must get at least a fraction $f \in [1/2, 1)$ of the valid votes
to win. Stark (20??) shows how to audit this social choice function, but it can also be expressed in terms of an assertion that the average of an assorter applied to the cast ballots is greater than 1/2.

Alice really won a super-majority contest with required winning fraction $f$ iff

$$ \mbox{(votes for Alice)} > f \times \left ( \mbox{(votes for Alice)} + \mbox{(votes for everyone else)} \right )
$$

$$  (1-f) \times \mbox{(votes for Alice)} > f \times \mbox{(votes for everyone else)} $$

$$ \frac{1-f}{f} \times \mbox{(votes for Alice)} > \mbox{(votes for everyone else)}. $$

Define an assorter as follows: it assigns the value $1/(2f)$ to a ballot if it contains
a vote for Alice, the value 0 if the ballot contains a valid vote for any other candidate
in the contest, and the value 1/2 if the ballot does not have a valid vote.
Suppose that a fraction $p > f$ of the valid votes are for Alice, and that a fraction $q$ of
the ballots have valid votes.
Then the average of this assorter over the ballots is

$$
pq/(2f) + (1-q)/2 > q/2 + (1-q)/2 = 1/2,
$$

where the strict inequality holds because $p > f$ (provided $q > 0$).

Again, using assorters reduces auditing to the question of whether the average of a list of non-negative
numbers is greater than 1/2.
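
As a numerical check of the construction, here is a small sketch of the super-majority assorter. The function name, its arguments, and the vote counts are assumptions made for illustration.

```python
def supermajority_assorter(vote, winner, candidates, share_to_win):
    """1/(2f) for the winner, 0 for any other valid vote, 1/2 for no valid vote."""
    if vote == winner:
        return 1 / (2 * share_to_win)
    if vote in candidates:
        return 0.0
    return 0.5


def mean_for_counts(yes, no, invalid, share_to_win=2 / 3):
    """Assorter mean for hypothetical counts of 'yes', 'no', and invalid ballots."""
    ballots = ["yes"] * yes + ["no"] * no + [None] * invalid
    values = [supermajority_assorter(b, "yes", ["yes", "no"], share_to_win)
              for b in ballots]
    return sum(values) / len(values)


# 70% of valid votes for 'yes' exceeds f = 2/3, so the mean exceeds 1/2
mean_wins = mean_for_counts(yes=70, no=30, invalid=20)
# 60% of valid votes for 'yes' falls short of f = 2/3, so the mean is below 1/2
mean_loses = mean_for_counts(yes=60, no=40, invalid=20)
```

With 70 of 100 valid votes and 20 invalid ballots, $p = 0.7$ and $q = 100/120$, so the mean is $pq/(2f) + (1-q)/2 = 62.5/120$.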
   

## Assertion audits

A contest is audited to risk limit $\alpha$ if the negation of each assertion about the 
contest can be rejected at significance level $\alpha$.
The audit of a contest continues until all the assertions about it have been established,
or until there has been a full manual tally.

The audit stops short of a full tally only if all the assertions are confirmed.
The chance of confirming any assertion that is false is at most $\alpha$, so if there
are one or more false assertions (i.e., if the reported outcome is incorrect), 
the chance that the audit continues to a full hand
count is at least $1-\alpha$ (and possibly larger).
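
The stopping rule can be sketched as follows, assuming each assertion has a current p-value for the hypothesis that it is false. The dictionary layout and function name are hypothetical.

```python
def unconfirmed_assertions(p_values, risk_limit):
    """Names of assertions whose negation has not been rejected at the risk limit."""
    return [name for name, p in p_values.items() if p > risk_limit]


# hypothetical p-values for three assertions about one contest
p_values = {"Doug v Gail": 0.01, "Doug v Harry": 0.20, "Emily v Gail": 0.05}
remaining = unconfirmed_assertions(p_values, risk_limit=0.05)

# the audit of the contest may stop only when nothing remains unconfirmed
audit_can_stop = len(remaining) == 0
```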

We directly test whether the complement of an assertion is false by testing the hypothesis that the average of the assorter applied to the cast ballots is less than or equal to 1/2.

## Ballot-polling assertion audits

These are immediate. Each assertion can be tested by testing the hypothesis
that the average of the assorter over the cast ballots is less than or equal to 1/2,
for instance, using Wald's SPRT (as in BRAVO and Gentle Introduction), or some other method.
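
A minimal sketch of a BRAVO-style SPRT for one assertion follows. It assumes sampling with replacement, an assorter that takes only the values 0, 1/2, and 1, and an alternative hypothesis given by the reported share of the first condition among ballots that satisfy either condition. The function name and the sample are illustrative.

```python
def sprt_p_value(assorter_values, reported_share):
    """Sequential p-value for the hypothesis that the assorter mean is <= 1/2.

    reported_share is the reported fraction of 1s among ballots valued 0 or 1;
    it must lie strictly between 1/2 and 1.
    """
    if not 0.5 < reported_share < 1:
        raise ValueError("reported_share must be in (1/2, 1)")
    t = 1.0  # likelihood ratio
    p = 1.0  # smallest value of 1/t seen so far
    for x in assorter_values:
        if x == 1:
            t *= reported_share / 0.5
        elif x == 0:
            t *= (1 - reported_share) / 0.5
        # ballots valued 1/2 leave the likelihood ratio unchanged
        p = min(p, 1 / t)
    return p


# hypothetical sample: 30 ballots valued 1 and 10 valued 0
sample = [1, 1, 1, 0] * 10
p = sprt_p_value(sample, reported_share=0.6)
confirmed = p <= 0.05
```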

## Ballot-comparison assertion audits

Suppose that we apply the assorters for a contest to the CVRs, and all of them have averages
greater than 1/2.
Then the assertions are true for the actual ballots
provided the CVRs inflated the average value of each assorter by less than the _assorter margin_: the average of the assorter applied to the reported CVRs, minus 1/2.

By how much could error in an individual CVR inflate the value of the assorter compared
to the value the assorter would have for the actual ballot?
Since the assorter does not assign a negative value to any ballot, the _overstatement error_
for a CVR is at most the value the assorter assigned to that CVR.
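
The bound can be illustrated with a small sketch. The assorter, the ballot representation (a candidate name or `None`), and the helper names are assumptions for illustration only.

```python
def assorter(vote):
    """Two-candidate plurality assorter for 'Alice v Bob' (illustrative)."""
    if vote == "Alice":
        return 1.0
    if vote == "Bob":
        return 0.0
    return 0.5


def overstatement(cvr_vote, mvr_vote):
    """Amount by which the CVR inflates the assorter relative to the actual ballot."""
    return assorter(cvr_vote) - assorter(mvr_vote)


def assorter_margin(cvr_votes):
    """Average of the assorter over the CVRs, minus 1/2."""
    return sum(assorter(v) for v in cvr_votes) / len(cvr_votes) - 0.5


cvrs = ["Alice"] * 6 + ["Bob"] * 3 + [None]
margin = assorter_margin(cvrs)

# worst case for a CVR showing Alice: the paper ballot actually shows Bob
worst = overstatement("Alice", "Bob")
# because the assorter is nonnegative, the error never exceeds the CVR's own value
bounded = all(overstatement(c, m) <= assorter(c)
              for c in ["Alice", "Bob", None]
              for m in ["Alice", "Bob", None])
```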

For each CVR $i$, 



## Overview of the assertion audit tool

The tool requires as input:

+ audit-specific and contest-specific parameters, such as
    - whether to sample with or without replacement
    - a risk limit for each contest to be audited
    - the social choice function for each contest, including the number of winners
    - candidate identifiers
+ a ballot manifest
+ a random seed
+ a file of cast vote records
+ reported results for each contest
+ assertions about each contest
+ human reading of voter intent from the paper ballots selected for audit

The tool helps select ballots for audit, and reports when the audit has found sufficiently strong evidence to stop.

The tool exports a log of all the audit inputs except the CVR file, but including the auditors' manually determined voter intent from the audited ballots.

In [1]:
from __future__ import division, print_function

from ipywidgets import interact, interactive, fixed, interact_manual, Dropdown, Layout, Box
import ipywidgets as widgets
from IPython.display import display, HTML

from collections import OrderedDict
from itertools import product
import math
import json
import warnings

import numpy as np
from ballot_comparison import ballot_comparison_pvalue
from assertion_audit_utils import \
    Assertion, Assorter, CVR, \
    check_audit_parameters, write_audit_parameters, write_ballots_sampled

from cryptorandom.cryptorandom import SHA256
from cryptorandom.sample import sample_by_index

from suite_tools import write_audit_results, \
        check_valid_vote_counts, \
        check_overvote_rates, find_winners_losers, print_reported_votes, \
        estimate_n, estimate_escalation_n, \
        parse_manifest, unique_manifest, find_ballot, \
        audit_contest

# Audit parameters.

* `seed`: the numeric seed for the pseudo-random number generator used to draw the sample
* `replacement`: whether to sample with replacement. If the sample is drawn with replacement, gamma must also be specified.
* `gamma`: the gamma parameter used in the ballot-level comparison method from Lindeman and Stark (2012), based on Stark (2010). Require gamma $\ge$ 1.
gamma=1.03905 is a common value; it makes 2-vote overstatements "cost" 5 times more than 1-vote overstatements. Smaller values yield smaller sample sizes when there are no two-vote overstatement errors.
* `N_ballots`: an upper bound on the number of ballots cast in the contest. This should be derived independently of the voting system.
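
The role of `gamma` can be checked numerically. The sketch below assumes the Kaplan-Markov form of the ballot-level comparison p-value associated with Lindeman and Stark (2012), in which each discrepancy type multiplies the p-value by a factor that depends only on gamma. The function names are hypothetical and this is not the tool's `ballot_comparison_pvalue`.

```python
import math


def comparison_p_value(n, o1, o2, u1, u2, diluted_margin, gamma=1.03905):
    """Kaplan-Markov-style p-value (assumed form); requires gamma > 1."""
    if gamma <= 1:
        raise ValueError("this sketch requires gamma > 1")
    log_p = (n * math.log(1 - diluted_margin / (2 * gamma))
             - o1 * math.log(1 - 1 / (2 * gamma))   # 1-vote overstatements
             - o2 * math.log(1 - 1 / gamma)         # 2-vote overstatements
             - u1 * math.log(1 + 1 / (2 * gamma))   # 1-vote understatements
             - u2 * math.log(1 + 1 / gamma))        # 2-vote understatements
    return min(1.0, math.exp(log_p))


def overstatement_cost_ratio(gamma):
    """Cost of a 2-vote overstatement relative to a 1-vote overstatement, on the log scale."""
    return math.log(1 - 1 / gamma) / math.log(1 - 1 / (2 * gamma))


ratio = overstatement_cost_ratio(1.03905)
p_clean = comparison_p_value(63, 0, 0, 0, 0, diluted_margin=0.1)
p_one_error = comparison_p_value(63, 1, 0, 0, 0, diluted_margin=0.1)
```

Under this assumed form, `ratio` comes out very close to 5, which is the sense in which a 2-vote overstatement "costs" 5 times as much as a 1-vote overstatement.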


----

* `cvr_file`: filename for CVRs (input)
* `manifest_file`: filename for ballot manifest (input)
* `assertion_file`: filename of assertions for IRV contests, in RAIRE format
* `mvr_file`: filename for manually ascertained votes from sampled ballots (input)
* `log_file`: filename for audit log (output)

----

* `error_rates`: dict of expected error rates. The keys are
    + `o1_rate`: expected rate of 1-vote overstatements. Recommended value $\ge$ 0.001 if there are hand-marked ballots. Larger values increase the initial sample size, but make it more likely that the audit will conclude in a single round if the audit finds errors
    + `o2_rate`: expected rate of 2-vote overstatements. Recommended value 0.
    + `u1_rate`: expected rate of 1-vote understatements. Recommended value 0.
    + `u2_rate`: expected rate of 2-vote understatements. Recommended value 0.

* `contests`: a dict of contest-specific data 
    + the keys are unique contest identifiers for contests under audit
    + the values are dicts with keys:
        - `risk_limit`: the risk limit for the audit of this contest
        - `ballots_cast`: an upper bound on the number of cast ballots that contain the contest
        - `choice_function`: `plurality`, `supermajority`, or `IRV`
        - `n_winners`: number of winners for plurality contests. (Multi-winner IRV not supported; multi-winner super-majority is nonsense)
        - `share_to_win`: for super-majority contests, the fraction of valid votes required to win, e.g., 2/3.
        - `candidates`: list of names or identifiers of candidates
        - `reported_winners` : list of identifier(s) of candidate(s) reported to have won. Length should equal `n_winners`.
        - `assertions`: a set of Assertions (see technical documentation) that collectively imply the reported outcome is correct

In [2]:
seed = 12345678901234567890  # use, e.g., 20 rolls of a 10-sided die. Seed doesn't have to be numeric
replacement = True  # Sampling without replacement isn't implemented
gamma=1.03905
N_ballots = 300000

In [3]:
cvr_file = './Data/cvr.json'
manifest_file = './Data/manifest.csv'
mvr_file = './Data/mvr.csv'
log_file = './Data/log.json'

In [4]:
error_rates = {'o1_rate':0.002,      # expect 2 1-vote overstatements per 1000 ballots
               'o2_rate':0,          # expect 0 2-vote overstatements
               'u1_rate':0,          # expect 0 1-vote understatements
               'u2_rate':0}          # expect 0 2-vote understatements

In [5]:
# contests to audit

contests = {'mayor':{'risk_limit':0.05,
                     'choice_function':'IRV',
                     'n_winners':1,
                     'candidates':['Alice','Bob','Cindy'],
                     'reported_winners' : ['Alice'],
                     'assertion_file' : './Data/assertion.json'
                    },
            'city_council':{'risk_limit':0.05,
                     'choice_function':'plurality',
                     'n_winners':3,
                     'candidates':['Doug','Emily','Frank','Gail','Harry'],
                     'reported_winners' : ['Doug', 'Emily', 'Frank']
                    },
            'measure_1':{'risk_limit':0.05,
                     'choice_function':'supermajority',
                     'share_to_win':2/3,
                     'n_winners':1,
                     'candidates':['yes','no'],
                     'reported_winners' : ['yes']
                    }                  
           }

In [6]:
check_audit_parameters(gamma, error_rates, contests)
write_audit_parameters(log_file, seed, replacement, gamma, N_ballots, error_rates, contests)

## Find audit parameters and conduct audit

* For each contest:
    - find claimed outcome by applying SCF to CVRs
    - complain if claimed outcome disagrees with reported outcome
    - construct assertions that imply contest outcome is correct
    - for each assertion:
        + find generalized diluted margin
        
* Find initial (incremental) sample size from smallest diluted margin, for the sampling plan
    - Complain if expected error rates imply any assertion is incorrect

* For each assertion:
    - Initialize discrepancy counts to zero (o1, o2, u1, u2)
    - Initialize measured risk to 1
* While measured risk for any assertion exceeds its risk limit:
    - expand sample by estimated increment
        + identify ballots in manifest
        + update the log file with incremental sample
    - import audit results when ballots have been audited
    - for each assertion:
        + for each sampled ballot:
            - increment discrepancy count for the assertion
        + find measured risk
    - update log file with new measured risks
    - if any measured risk exceeds its risk limit:
        + estimate incremental sample required to complete the audit

In [7]:
# read the assertions for the IRV contest
for c in contests:
    if contests[c]['choice_function'] == 'IRV':
        with open(contests[c]['assertion_file'], 'r') as f:
            contests[c]['assertions'] = {} # json.load(f)

In [8]:
# construct the dict of dicts of assertions for each contest
all_assertions = Assertion.make_all_assertions(contests)

In [9]:
all_assertions

{'mayor': {},
 'city_council': {'Doug v Gail': <assertion_audit_utils.Assertion at 0x11faf1b38>,
  'Doug v Harry': <assertion_audit_utils.Assertion at 0x11fb763c8>,
  'Emily v Gail': <assertion_audit_utils.Assertion at 0x11fb76358>,
  'Emily v Harry': <assertion_audit_utils.Assertion at 0x11fb76278>,
  'Frank v Gail': <assertion_audit_utils.Assertion at 0x11fb76320>,
  'Frank v Harry': <assertion_audit_utils.Assertion at 0x11fb761d0>},
 'measure_1': {'yes v all': <assertion_audit_utils.Assertion at 0x11faf1b00>}}

## Read the CVRs 

In [None]:
# read the cast vote records
with open(cvr_file, 'r') as f:
    # do something
    # cvrs = ???
    pass

In [None]:
# find the mean of the assorters for the CVRs and check whether the assertions are met
for c in contests.keys():
    contests[c]['cvr_means'] = {}
    for asrtn in all_assertions[c]:
        # find mean of the assorter for the CVRs
        amean = all_assertions[c][asrtn].assorter.assorter_mean(cvrs)
        # the assertion requires the mean to be strictly greater than 1/2
        if amean <= 1/2:
            warnings.warn("assertion " + asrtn + " not satisfied by CVRs: mean value is " + str(amean))
        contests[c]['cvr_means'][asrtn] = amean

## Set up for sampling

In [None]:
# read the ballot manifest
manifest = read_manifest_from_csv(manifest_file)

In [None]:
# expand the ballot manifest into a dict. keys are batches, values are ballot numbers.
cvr_manifest_parsed = parse_manifest(manifest)

In [None]:
# find contest results
for c in contests.keys():
    contests[c]['winners'] = find_winners(contests[c])

In [None]:
# assign each ballot a unique ID
unique_cvr_manifest = unique_manifest(cvr_manifest_parsed)

In [None]:
# look up sample ballots

cvr_sample = []
for s in sample1:
    original_ballot_label, batch_label, which_ballot = find_ballot(s, \
                                                                   unique_cvr_manifest, \
                                                                   cvr_manifest_parsed)
    cvr_sample.append([s, batch_label, which_ballot])

cvr_sample.sort(key=lambda x: x[2]) # Sort second on order within batches
cvr_sample.sort(key=lambda x: x[1]) # Sort first based on batch label
cvr_sample.insert(0,["sampled ballot", "batch label", "which ballot in batch"])

display(HTML(
    '<table><tr>{}</tr></table>'.format(
        '</tr><tr>'.join(
            '<td>{}</td>'.format('</td><td>'.join(str(_) for _ in row)) for row in cvr_sample)
        )
 ))

# Enter the sample data

In [None]:
# Find audit p-values across assertions

In [None]:
# Identify assertions not yet confirmed

In [None]:
# Log the status of the audit 

# Escalation: how many more ballots should be drawn?

This tool estimates how many more ballots will need to be audited to confirm any remaining contests. The enlarged sample size is based on:

* ballots already sampled
* the assumption that we will continue to see overstatements and understatements at the same rates observed in the sample

In [None]:
sample_sizes_new = {}

# TBD
