# Assertion RLA

**From 20 October 2022, major re-write and restructuring of the code and classes.**

**From 18 May 2022, consistent sampling seems to be working.**

**From 1 April 2022, integrating alpha_mart, p_history, and other features.**

**From 24 October 2021, dev version implementing consistent sampling to target smaller contests.**

## Overview of the assertion audit tool

The tool requires as input:

+ audit-specific and contest-specific parameters, such as
    - whether to sample with or without replacement
    - the name of the risk function to use, and any parameters it requires
    - a risk limit for each contest to be audited
    - the social choice function for each contest, including the number of winners
    - candidate identifiers for each contest
    - reported winner(s) for each contest
    - an upper bound on the number of ballot cards that contain each contest
    - an upper bound on the total number of cards across all contests
    - whether to use card style information to target sampling
+ a ballot manifest (see below)
+ a random seed
+ a file of cast vote records (for ballot-comparison audits)
+ reported vote tallies for each contest (for ballot-polling audits of plurality, supermajority, and approval social choice functions)
+ json files of assertions for IRV contests (one file per IRV contest)
+ human reading of voter intent from the paper cards selected for audit

`use_style` controls whether the sample is drawn from all cards (`use_style == False`) or card style information is used
to target the cards that purport to contain each contest (`use_style == True`).
In the current implementation, card style information is inferred from cast-vote records, with additional 'phantom' CVRs if there could be more cards that contain a contest than is accounted for in the CVRs.
Errors in the card style information are treated conservatively using the "phantoms-to-evil-zombies" (~2EZ) approach ([Banuelos & Stark, 2012](https://arxiv.org/abs/1207.3413)) so that the risk limit remains valid, even if the CVRs misrepresent
which cards contain which contests.

The two ways of sampling are treated differently. 
If the sample is to be drawn only from cards that--according to the CVR--contain a particular contest, and a sampled card turns out not to
contain that contest, that is considered a discrepancy, dealt with using the ~2EZ approach.
It is assumed that every CVR corresponds to a card in the manifest, but there might
be cards cast in the contest for which there is no corresponding CVR. In that case,
phantom CVRs are created to ensure that the audit is still truly risk-limiting.

Given an independent (i.e., not relying on the voting system) upper bound on the number of cards that contain the contest, if the number of CVRs that contain the contest does not exceed that bound, we can sample from paper purported to contain the contest and use the ~2EZ approach to deal with missing CVRs. This can greatly increase the efficiency of the audit if 
some contests appear on only a small percentage of the cast cards ([Glazer, Spertus, and Stark, 2021](https://dl.acm.org/doi/10.1145/3457907)).
If there are more CVRs than the upper bound on the number of cards, extra CVRs can be deleted provided
that deletion does not change any contest outcome. See [Stark, 2022](https://arxiv.org/abs/2207.01362).
(However, if there are more CVRs than cards, that is evidence of a process failure.)

Any sampled phantom card (i.e., a card for which there is no CVR) is treated as if its CVR is a non-vote (which it is), and as if its MVR were the least favorable possible (an "evil zombie" producing the greatest doubt in every assertion, separately). Any sampled card for which there is a CVR is compared to its corresponding CVR.
If the card turns out not to contain the contest (despite the fact that the CVR says it does), the MVR is treated in the least favorable way for each assertion (i.e., as a zombie rather than as a non-vote).
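The rules above can be illustrated with a minimal sketch for a single two-candidate plurality assertion in a ballot-level comparison audit with `use_style == True`. The helpers `assort` and `comparison_value` and their arguments are hypothetical names chosen for this illustration, not the tool's API. The sketch assumes the usual overstatement-assorter form `(1 - overstatement/u) / (2 - margin/u)`, whose maximum matches the `2/(2-assorter.margin/assorter.upper_bound)` bound used in the workflow.

```python
from typing import Optional


def assort(vote: Optional[str], winner: str, loser: str) -> float:
    """Plurality assorter: 1 for the winner, 0 for the loser, 1/2 otherwise."""
    if vote == winner:
        return 1.0
    if vote == loser:
        return 0.0
    return 0.5  # non-vote, or a vote for some other candidate


def comparison_value(cvr_vote: Optional[str], mvr_vote: Optional[str], *,
                     phantom_cvr: bool, card_has_contest: bool,
                     winner: str, loser: str,
                     margin: float, upper_bound: float = 1.0) -> float:
    """Value assigned to one sampled card (hypothetical helper, assumed formula)."""
    # A phantom CVR is scored as a non-vote.
    a_cvr = 0.5 if phantom_cvr else assort(cvr_vote, winner, loser)
    # A missing card, or a card without the contest, is an "evil zombie":
    # its MVR gets the least favorable score, 0.
    a_mvr = assort(mvr_vote, winner, loser) if card_has_contest else 0.0
    overstatement = a_cvr - a_mvr
    return (1 - overstatement / upper_bound) / (2 - margin / upper_bound)


kw = dict(winner="No", loser="Yes", margin=0.1)
clean = comparison_value("No", "No", phantom_cvr=False, card_has_contest=True, **kw)
phantom = comparison_value(None, None, phantom_cvr=True, card_has_contest=False, **kw)
zombie = comparison_value("No", None, phantom_cvr=False, card_has_contest=False, **kw)
print(clean, phantom, zombie)
```

With a margin of 0.1, a card that matches its CVR scores `1/1.9`, a sampled phantom scores `0.5/1.9`, and a card whose CVR shows a vote for the winner but which turns out not to contain the contest scores 0.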

The tool helps select cards for audit, and reports when the audit has found sufficiently strong evidence to stop.

The tool exports a log of all the audit inputs except the CVR file, but including the auditors' manually determined voter intent from the audited cards.

The pre-10/2021 version used a single sample to audit all contests. 

### Internal workflow

+ Read overall audit information (including the seed) and contest information
+ Read assertions for IRV contests and construct assertions for all other contests
+ Read ballot manifest
+ Read CVRs. Every CVR should have a corresponding manifest entry.
+ Prepare ~2EZ:
    - `N_phantoms = max_cards - cards_in_manifest`
    - If `N_phantoms < 0`, complain
    - Else create `N_phantoms` phantom cards
    - For each contest `c`:
        + `N_c` is the input upper bound on the number of cards that contain `c`
        + if `N_c is None`, `N_c = max_cards - non_c_cvrs`, where `non_c_cvrs` is #CVRs that don't contain `c`
        + `C_c` is the number of CVRs that contain the contest
        + if `C_c > N_c`, complain
        + else if `N_c - C_c > N_phantoms`, complain
        + else:
            - Consider contest `c` to be on the first `N_c - C_c` phantom CVRs
            - Consider contest `c` to be on the first `N_c - C_c` phantom ballots
+ Create Assertions for every Contest. This involves also creating an Assorter for every Assertion, and a `NonnegMean` test
for every Assertion.
+ Calculate assorter margins for all assorters:
    - If `not use_style`, apply the Assorter to all cards and CVRs, including phantoms
    - Else apply the assorter only to cards/CVRs reported to contain the contest, including phantoms that contain the contest
+ Set `assertion.test.u` to the appropriate value for each assertion: `assorter.upper_bound` for polling audits or 
      `2/(2-assorter.margin/assorter.upper_bound)` for ballot-level comparison audits
+ Estimate starting sample size for the specified sampling design (w/ or w/o replacement, stratified, etc.), for chosen risk function, use of card-style information, etc.:
    - User-specified criterion, controlled by parameters. Examples:
        + expected sample size for completion, on the assumption that there are no errors
        + 90th percentile of sample size for completion, on the assumption that errors are not more frequent than specified
    - If `not use_style`, base estimate on sampling from the entire manifest, i.e., smallest assorter margin
    - Else use consistent sampling:
        + Augment each CVR (including phantoms) with a probability of selection, `p`, initially 0
        + For each contest `c`:
            - Find sample size `n_c` that meets the criterion 
            - For each non-phantom CVR that contains the contest, set `p = max(p, n_c/N_c)` 
        + Estimated sample size is the sum of `p` over all non-phantom CVRs
+ Draw the random sample:
    - Use the specified design, including using consistent sampling for style information
    - Express sample cards in terms of the manifest
    - Export
+ Read manual interpretations of the cards (MVRs)
+ Calculate attained risk for each assorter
    - Use ~2EZ to deal with phantom CVRs or cards; the treatment depends on whether `use_style == True`
+ Report
+ Estimate incremental sample size if any assorter nulls have not been rejected
+ Draw incremental sample; etc
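
The "Prepare ~2EZ" bookkeeping in the workflow above can be sketched as follows. This is a minimal illustration under stated assumptions, not the tool's implementation: `prepare_phantoms` and its argument layout are hypothetical, "complain" is rendered as raising `ValueError`, and the contest `countywide` is invented to exercise the `N_c is None` branch. The Proposition 19 numbers (2648 CVRs, bound 2658) come from the example later in this notebook.

```python
from typing import Dict, Optional, Tuple


def prepare_phantoms(max_cards: int, cards_in_manifest: int, n_cvrs: int,
                     contests: Dict[str, Tuple[Optional[int], int]]
                     ) -> Tuple[int, Dict[str, int]]:
    """contests maps contest id -> (N_c or None, C_c).

    Returns the number of phantom cards and, per contest, how many of the
    first phantoms are considered to contain that contest.
    """
    n_phantoms = max_cards - cards_in_manifest
    if n_phantoms < 0:
        raise ValueError("manifest lists more cards than max_cards")
    phantoms_with_contest = {}
    for c, (N_c, C_c) in contests.items():
        if N_c is None:
            # No bound supplied: every card not known to lack c might contain it.
            non_c_cvrs = n_cvrs - C_c
            N_c = max_cards - non_c_cvrs
        if C_c > N_c:
            raise ValueError(f"{c}: more CVRs than the upper bound on cards")
        if N_c - C_c > n_phantoms:
            raise ValueError(f"{c}: not enough phantoms to reach the bound")
        phantoms_with_contest[c] = N_c - C_c
    return n_phantoms, phantoms_with_contest


n_phantoms, per_contest = prepare_phantoms(
    max_cards=10010, cards_in_manifest=10000, n_cvrs=10000,
    contests={"Proposition 19": (2658, 2648),   # bound and CVR count
              "countywide": (None, 10000)})     # hypothetical contest, no bound
print(n_phantoms, per_contest)
```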

# Audit parameters

The overall audit involves information that is the same across contests, encapsulated in
a dict called `audit`:

* `seed`: the numeric seed for the pseudo-random number generator used to draw the sample (for SHA256 PRNG)
* `sim_seed`: seed for simulations to estimate sample sizes (for Mersenne Twister PRNG)
* `quantile`: quantile of the sample size to use for setting initial sample size
* `cvr_file`: filename for CVRs (input)
* `manifest_file`: filename for ballot manifest (input)
* `use_style`: Boolean. If True, use card style information (inferred from CVRs) to target samples. If False, sample from all cards, regardless of the contest.
* `sample_file`: filename for sampled card identifiers (output)
* `mvr_file`: filename for manually ascertained votes from sampled cards (input)
* `log_file`: filename for audit log (output)
* `error_rate_1`: expected rate of 1-vote overstatements. Recommended value $\ge$ 0.001 if there are hand-marked ballots. Larger values increase the initial sample size, but make it more likely that the audit will conclude after a single round even if the audit finds errors
* `error_rate_2`: expected rate of 2-vote overstatements. 2-vote overstatements should be extremely rare. Recommended value: 0. Larger values increase the initial sample size, but make it more likely that the audit will conclude after a single round even if the audit finds errors
* `reps`: number of replications to use to estimate sample sizes. If `reps is None`, uses a deterministic method
* `quantile`: quantile of sample size to estimate. Not used if `reps is None`
* `strata`: a dict describing the strata. Keys are stratum identifiers; values are dicts containing:
    + `max_cards`: an upper bound on the number of pieces of paper cast in the contest. This should be derived independently of the voting system. A ballot consists of one or more cards.
    + `replacement`: whether to sample from this stratum with replacement. 
    + `use_style`: True if the sample in that stratum uses card-style information.
    + `audit_type`: one of `Audit.AUDIT_TYPE.POLLING`, `Audit.AUDIT_TYPE.BALLOT_COMPARISON`, or `Audit.AUDIT_TYPE.BATCH_COMPARISON`; only `POLLING` and `BALLOT_COMPARISON` are currently implemented.
    + `test`: the name of the function to be used to measure risk. Options are `kaplan_markov`, `kaplan_wald`, `kaplan_kolmogorov`, `wald_sprt`, `kaplan_mart`, and `alpha_mart`. Not all risk functions work with every social choice function or every sampling method.
    + `estimator`: the estimator to be used by the risk function. Options are [FIX ME!]
    + `test_kwargs`: keyword arguments for the risk function

----

* `contests`: a dict of contest-specific data 
    + the keys are unique contest identifiers for contests under audit
    + the values are Contest objects with attributes:
        - `risk_limit`: the risk limit for the audit of this contest
        - `cards`: an upper bound on the number of cast cards that contain the contest
        - `choice_function`: `Contest.SOCIAL_CHOICE_FUNCTION.PLURALITY`, 
          `Contest.SOCIAL_CHOICE_FUNCTION.SUPERMAJORITY`, or `Contest.SOCIAL_CHOICE_FUNCTION.IRV`
        - `n_winners`: number of winners for majority contests. (Multi-winner IRV not supported)
        - `share_to_win`: for super-majority contests, the fraction of valid votes required to win, e.g., 2/3. (`share_to_win*n_winners` must be less than 100%.)
        - `candidates`: list of names or identifiers of candidates
        - `reported_winners` : list of identifier(s) of candidate(s) reported to have won. Length should equal `n_winners`.
        - `assertion_file`: filename for a set of json descriptors of Assertions (see technical documentation) that collectively imply the reported outcome of the contest is correct. Required for IRV; ignored for other social choice functions
        - `audit_type`: the audit strategy. Currently `Audit.AUDIT_TYPE.POLLING` (ballot-polling audits) and 
           `Audit.AUDIT_TYPE.BALLOT_COMPARISON` (ballot-level comparison audits) are implemented. 
           HYBRID and STRATIFIED are planned.
        - `test`: the risk function for the audit. Default is `NonnegMean.alpha_mart`, the alpha supermartingale test
        - `estim`: estimator for the alternative hypothesis for the test. Default is `NonnegMean.shrink_trunc`
        - `use_style`: True to use style information from CVRs to target the sample. False for polling audits or for sampling from all ballots for every contest.
        - other keys and values are added by the software, including `cvrs`, the number of CVRs that contain the contest, and `p`, the sampling fraction expected to be required to confirm the contest

In [1]:
import math
import json
import warnings
import numpy as np
import pandas as pd
import csv
import copy

from collections import OrderedDict
from IPython.display import display, HTML

from cryptorandom.cryptorandom import SHA256, int_from_hash
from cryptorandom.sample import sample_by_index

from Audit import Audit, Assertion, Assorter, Contest, CVR, Stratum
from NonnegMean import NonnegMean
from Dominion import Dominion
from Hart import Hart

In [2]:
audit = Audit.from_dict({
         'seed':           12345678901234567890,
         'sim_seed':       314159265,
         'cvr_file':       '/Users/amanda/Downloads/oc_cvrs.zip', 
         #'cvr_file':       '/Users/Jake/Desktop/oc_cvrs.zip', 
         'manifest_file':  'Data/OC_mock_manifest_detailed.xlsx',
         'sample_file':    '',
         'mvr_file':       '',
         'log_file':       'Data/OC_example_log.json',
         'quantile':       0.8,
         'error_rate_1':   0.001,
         'error_rate_2':   0.0001,
         'reps':           100,
         'strata':         {'stratum_1': {'max_cards':   10010, 
                                          'use_style':   True,
                                          'replacement': False,
                                          'audit_type':  Audit.AUDIT_TYPE.BALLOT_COMPARISON,
                                          'test':        NonnegMean.alpha_mart,
                                          'estimator':   NonnegMean.optimal_comparison,
                                          'test_kwargs': {}
                                         }
                           }
        })

# find upper bound on total cards across strata
audit.max_cards = np.sum([s.max_cards for s in audit.strata.values()])

In [3]:
#cvr_zip = "/Users/Jake/Desktop/oc_cvrs.zip"
cvr_zip = "/Users/amanda/Downloads/oc_cvrs.zip"
cvr_list = Hart.read_cvrs_zip(cvr_zip, size = 10000)

In [4]:
CVR.tabulate_votes(cvr_list)

defaultdict(<function Audit.CVR.tabulate_votes.<locals>.<lambda>()>,
            {'Proposition 19': defaultdict(int,
                         {'Yes': 1165, 'NA': 174, 'No': 1309}),
             'Proposition 20': defaultdict(int,
                         {'No': 2281, 'Yes': 1644, 'NA': 254}),
             'Proposition 21': defaultdict(int,
                         {'Yes': 2457, 'No': 4949, 'NA': 389}),
             'Proposition 22': defaultdict(int,
                         {'Yes': 5830, 'No': 3011, 'NA': 358}),
             'Proposition 23': defaultdict(int,
                         {'Yes': 2866, 'No': 5862, 'NA': 470}),
             'Proposition 24': defaultdict(int,
                         {'Yes': 5105, 'No': 4339, 'NA': 530}),
             'Proposition 25': defaultdict(int,
                         {'No': 5660, 'Yes': 3737, 'NA': 575}),
             'AA-City of Orange': defaultdict(int,
                         {'No': 283, 'Yes': 164, 'NA': 34}),
             'Proposition 14': defa

In [5]:
cvr_list[0].votes

{'Proposition 19': {'Yes': True},
 'Proposition 20': {'No': True},
 'Proposition 21': {'Yes': True},
 'Proposition 22': {'Yes': True},
 'Proposition 23': {'Yes': True},
 'Proposition 24': {'Yes': True},
 'Proposition 25': {'No': True},
 'AA-City of Orange': {'No': True}}

In [6]:
# Contests to audit
## Just choose Prop 19 for now to test
contest_dict = {'Proposition 19':{
                   'name': 'Prop 19',
                   'risk_limit': 0.05,
                   'cards': 1165 + 174 + 1309 + 10, # should create 10 phantoms
                   'choice_function': Contest.SOCIAL_CHOICE_FUNCTION.PLURALITY, # in @Contest, not @Audit
                   'n_winners': 1,
                   'candidates': ['Yes','No'],
                   'winner': ['No'],
                   'assertion_file': None,
                   'audit_type': Audit.AUDIT_TYPE.BALLOT_COMPARISON,
                   'test': NonnegMean.alpha_mart,
                   'estim': NonnegMean.optimal_comparison
                  },
                'Proposition 20':{
                   'name': 'Prop 20',
                   'risk_limit': 0.05,
                   'cards': 2281 + 1644 + 254 + 10, # should create 10 phantoms
                   'choice_function': Contest.SOCIAL_CHOICE_FUNCTION.PLURALITY, # in @Contest, not @Audit
                   'n_winners': 1,
                   'candidates': ['Yes','No'],
                   'winner': ['No'],
                   'assertion_file': None,
                   'audit_type': Audit.AUDIT_TYPE.BALLOT_COMPARISON,
                   'test': NonnegMean.alpha_mart,
                   'estim': NonnegMean.optimal_comparison
                  },
                'Proposition 21':{
                   'name': 'Prop 21',
                   'risk_limit': 0.05,
                   'cards': 2457 + 4949 + 389 + 10, # should create 10 phantoms
                   'choice_function': Contest.SOCIAL_CHOICE_FUNCTION.PLURALITY, # in @Contest, not @Audit
                   'n_winners': 1,
                   'candidates': ['Yes','No'],
                   'winner': ['No'],
                   'assertion_file': None,
                   'audit_type': Audit.AUDIT_TYPE.BALLOT_COMPARISON,
                   'test': NonnegMean.alpha_mart,
                   'estim': NonnegMean.optimal_comparison
                  },
                'V-City of Laguna Woods':{
                   'name': 'Measure V',
                   'risk_limit': 0.05,
                   'cards': 46 + 32 + 6 + 10, # should create 10 phantoms
                   'choice_function': Contest.SOCIAL_CHOICE_FUNCTION.PLURALITY, # in @Contest, not @Audit
                   'n_winners': 1,
                   'candidates': ['Yes','No'],
                   'winner': ['No'],
                   'assertion_file': None,
                   'audit_type': Audit.AUDIT_TYPE.BALLOT_COMPARISON,
                   'test': NonnegMean.alpha_mart,
                   'estim': NonnegMean.optimal_comparison
                  }
               }

contests = Contest.from_dict_of_dicts(contest_dict)

Example of other social choice functions:

        contests =  {'city_council':{'name': 'City Council',
                             'risk_limit':0.05,
                             'cards': None,
                             'choice_function': Contest.SOCIAL_CHOICE_FUNCTION.PLURALITY,
                             'n_winners':3,
                             'candidates':['Doug','Emily','Frank','Gail','Harry'],
                             'winner' : ['Doug', 'Emily', 'Frank']
                            },
                        'measure_1':{'name': 'Measure 1',
                             'risk_limit':0.05,
                             'cards': 65432,
                             'choice_function': Contest.SOCIAL_CHOICE_FUNCTION.SUPERMAJORITY,
                             'share_to_win':2/3,
                             'n_winners':1,
                             'candidates':['yes','no'],
                             'winner' : ['yes']
                            }                  
                      }

In [7]:
# read the assertions for the IRV contest
for c in contests:
    if contests[c].choice_function == Contest.SOCIAL_CHOICE_FUNCTION.IRV:
        with open(contests[c].assertion_file, 'r') as f:
            contests[c].assertion_json = json.load(f)['audits'][0]['assertions']

In [8]:
# construct the dict of dicts of assertions for each contest
Assertion.make_all_assertions(contests)

True

In [9]:
audit.check_audit_parameters(contests)

## Read the ballot manifest

In [10]:
# special for Primary/Dominion manifest format
manifest = pd.read_excel(audit.manifest_file)

In [11]:
manifest.head()

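| Unnamed: 0 | Batch Name | Number of Ballots | Container | Tabulator | cum_cards |
|---|---|---|---|---|---|
| 0 | 0 | 1706 | 1 | 99808 | 1706 |
| 1 | 10 | 64 | 1 | 99808 | 1770 |
| 2 | 100 | 80 | 1 | 99808 | 1850 |
| 3 | 101 | 6 | 1 | 99808 | 1856 |
| 4 | 102 | 69 | 1 | 99808 | 1925 |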
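
The `cum_cards` column above is what lets a sampled card number be expressed in terms of the manifest. A minimal sketch of that lookup, using only the five batches shown and the standard library: `locate_card` is a hypothetical helper, and the 1-based numbering of cards and positions is an assumption made for this illustration, not necessarily the tool's convention.

```python
from bisect import bisect_left
from itertools import accumulate

# (batch name, number of ballots) for the five manifest rows shown above
batches = [("0", 1706), ("10", 64), ("100", 80), ("101", 6), ("102", 69)]
cum_cards = list(accumulate(n for _, n in batches))  # running totals per batch


def locate_card(card: int):
    """Map a 1-based card number to (batch name, 1-based position in batch)."""
    if not 1 <= card <= cum_cards[-1]:
        raise ValueError("card number outside the manifest")
    i = bisect_left(cum_cards, card)          # first batch whose total reaches card
    first_before = cum_cards[i] - batches[i][1]  # cards in all earlier batches
    return batches[i][0], card - first_before


print(locate_card(1706), locate_card(1707), locate_card(1925))
```

Card 1706 is the last card of batch `0`, card 1707 is the first card of batch `10`, and card 1925 is the last card of batch `102`.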


## Read the CVR data and create CVR objects

In [12]:
# for ballot-level comparison audits
#cvr_list, cvrs_read, unique_ids = CVR.from_raire_file(audit.cvr_file)
cvr_list = Hart.read_cvrs_zip(audit.cvr_file, size = 10000) 

In [13]:
cvr_list[0].id # SHOULD SWITCH THIS TO - to be in line with dominion I believe

'274_2'

In [14]:
# check whether the manifest accounts for every card
# it doesn't because phantoms 
audit.max_cards, np.sum(manifest['Number of Ballots'])

(10010, 10000)

In [15]:
# Check that there is a card in the manifest for every card (possibly) cast. If not, add phantoms.
manifest, manifest_cards, phantom_cards = Hart.prep_manifest(manifest, audit.max_cards, len(cvr_list))
#manifest



In [16]:
## Note: for some reason prep manifest turns all the columns into string type..
audit.max_cards, np.sum(manifest['Number of Ballots'].astype(int))

(10010, 10010)

## Create CVRs for phantom cards

In [17]:
# For Comparison Audits Only
#----------------------------

# If the sample draws a phantom card, these CVRs will be used in the comparison.
# phantom MVRs should be treated as zeros by the Assorter for every contest

# setting use_style = False to generate phantoms

cvr_list, phantom_vrs = CVR.make_phantoms(audit=audit, contests=contests, 
                                          cvr_list=cvr_list, prefix='phantom-1-')
print(f"Created {phantom_vrs} phantom records")

Created 13 phantom records


In [18]:
# find the mean of the assorters for the CVRs and check whether the assertions are met
min_margin = Assertion.set_margins_from_cvrs(audit=audit, contests=contests, cvr_list=cvr_list)

print(f'minimum assorter margin {min_margin}')
Contest.print_margins(contests)

minimum assorter margin 0.05417607223476306
margins in contest Proposition 19:
	assertion No v Yes: 0.05417607223476306
margins in contest Proposition 20:
	assertion No v Yes: 0.15206493196466941
margins in contest Proposition 21:
	assertion No v Yes: 0.31928251121076223
margins in contest V-City of Laguna Woods:
	assertion No v Yes: 0.14893617021276606


In [19]:
audit.write_audit_parameters(contests=contests) 

## Set up for sampling

## Find initial sample size

In [20]:
# find initial sample size 
# error here, there are some CVRs that aren't receiving a p, 
# which then causes an error when they are summed over. 
# See line 746-749 in Audit.py;
# one fix is to set the p for CVRs not in the contest to 0; 
# another is to skip counting them in the sum...
sample_size = audit.find_sample_size(contests, cvrs=cvr_list)  
print(f'{sample_size=}\n{[(i, c.sample_size) for i, c in contests.items()]}')

sample_size=199
[('Proposition 19', 133), ('Proposition 20', 47), ('Proposition 21', 22), ('V-City of Laguna Woods', 40)]
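
The overall estimate is not the sum of the per-contest sizes, because one card can serve several contests. A minimal sketch of the rule from the workflow (`p = max(p, n_c/N_c)`, then sum `p` over non-phantom CVRs), with made-up numbers: `estimated_sample_size` is a hypothetical helper, and the toy contests `A` and `B` are not from the data. Starting every CVR at `p = 0` also covers CVRs that contain no audited contest, the case flagged in the code comment above.

```python
from typing import Dict, List, Set


def estimated_sample_size(cvr_contests: List[Set[str]],
                          n_c: Dict[str, int], N_c: Dict[str, int]) -> float:
    """Sum of selection probabilities over non-phantom CVRs.

    cvr_contests holds, for each non-phantom CVR, the set of contests on it.
    """
    total = 0.0
    for contests_on_card in cvr_contests:
        p = 0.0  # CVRs with no audited contest keep p = 0
        for c in contests_on_card:
            if c in n_c:
                p = max(p, n_c[c] / N_c[c])
        total += p
    return total


# Toy example: 100 cards with only A, 50 with A and B, 50 with neither.
cvrs = [{"A"}] * 100 + [{"A", "B"}] * 50 + [set()] * 50
size = estimated_sample_size(cvrs, n_c={"A": 30, "B": 20}, N_c={"A": 150, "B": 50})
print(round(size, 6))
```

Here contest `A` needs a sampling fraction of 30/150 = 0.2 and contest `B` needs 20/50 = 0.4, so cards containing both get `p = 0.4` and the estimate is 100(0.2) + 50(0.4) = 40 cards, fewer than 30 + 20 = 50.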


## Draw the first sample

In [21]:
# draw the initial sample using consistent sampling
prng = SHA256(audit.seed)
CVR.assign_sample_nums(cvr_list, prng)
#sampled_cvr_indices needs to be an array for Hart.sample_from_cvrs?
#why are we subtracting 1 from it, i.e. `enumerate(sample-1)` in Hart.sample_from_cvrs
sampled_cvr_indices = CVR.consistent_sampling(cvr_list=cvr_list, contests=contests)
n_sampled_phantoms = np.sum(sampled_cvr_indices > manifest_cards)
print(f'The sample includes {n_sampled_phantoms} phantom cards.')

The sample includes 5 phantom cards.
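
The idea behind consistent sampling can be sketched in a few lines: every card gets one pseudo-random number, and each contest takes the cards with the smallest numbers among those that contain it, so cards shared between contests are reused. This is a sketch of the general technique, not of `CVR.consistent_sampling`. The helpers `sample_num` and `consistent_sample` are hypothetical, and hashing the seed with the card identifier is a stand-in for the tool's SHA256 PRNG.

```python
import hashlib
from typing import Dict, Set


def sample_num(seed: str, card_id: str) -> int:
    """Hypothetical stand-in for the PRNG-assigned sample number of a card."""
    digest = hashlib.sha256(f"{seed},{card_id}".encode()).digest()
    return int.from_bytes(digest, "big")


def consistent_sample(cards: Dict[str, Set[str]], n_c: Dict[str, int],
                      seed: str) -> Set[str]:
    """cards maps card id -> contests on the card; n_c maps contest -> sample size."""
    nums = {cid: sample_num(seed, cid) for cid in cards}
    sampled: Set[str] = set()
    for contest, n in n_c.items():
        # Cards containing this contest, ordered by their sample numbers
        in_contest = sorted((cid for cid, cs in cards.items() if contest in cs),
                            key=nums.get)
        sampled.update(in_contest[:n])
    return sampled


# Toy data: contest A is on all 20 cards, contest B only on even-numbered cards.
cards = {f"card-{i}": ({"A"} if i % 2 else {"A", "B"}) for i in range(20)}
sample = consistent_sample(cards, {"A": 6, "B": 3}, seed="12345")
print(sorted(sample))
```

Because both contests rank cards by the same numbers, the union contains between 6 and 9 cards rather than always 9.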


In [22]:
len(cvr_list), manifest_cards, audit.max_cards

(10013, 10000, 10010)

In [23]:
# for comparison audit
cards_to_retrieve, sample_order, cvr_sample, mvr_phantoms_sample = \
    Hart.sample_from_cvrs(cvr_list, manifest, sampled_cvr_indices)

# for polling audit
# cards_to_retrieve, sample_order, mvr_phantoms_sample = Dominion.sample_from_manifest(manifest, sample)

In [24]:
cvr_sample[69].id

'phantom-1-3'

In [26]:
# write the sample
#Dominion.write_cards_sampled(audit.sample_file, cards_to_retrieve, print_phantoms=False)

## Read the audited sample data

In [27]:
# for real data
# with open(audit.mvr_file) as f:
#     mvr_json = json.load(f)

# mvr_sample = CVR.from_dict(mvr_json['ballots'])

# for simulated data, no errors
mvr_sample = cvr_sample

## Find measured risks for all assertions

In [28]:
### ISSUE HERE because phantom has no selection order which flags error
# in prep_comparison_sample in Audit.py
# find the position of the phantom in the sample
for j, x in enumerate(mvr_sample):
    if x.id == "phantom-1-3":
        print(j)
    
print(mvr_sample[69].id)

69
phantom-1-3


In [29]:
sample_order[mvr_sample[69].id]

{'selection_order': 69, 'serial': 10003}

In [30]:
mvr_sample.sort(key = lambda x: sample_order[x.id]["selection_order"])

In [31]:
## SEE ABOVE FOR WHY THIS IS FLAGGING ERROR
CVR.prep_comparison_sample(mvr_sample, cvr_sample, sample_order)  # for comparison audit
# CVR.prep_polling_sample(mvr_sample, sample_order)  # for polling audit

In [32]:
p_max = Assertion.set_p_values(contests=contests, mvr_sample=mvr_sample, cvr_sample=cvr_sample)
print(f'maximum assertion p-value {p_max}')
done = audit.summarize_status(contests)

maximum assertion p-value 1.0

p-values for assertions in contest Proposition 19
	No v Yes: 0.24999608501283765

contest Proposition 19 audit INCOMPLETE at risk limit 0.05. Attained risk 0.24999608501283765
assertions remaining to be proved:
	contest_id: Proposition 19 winner: No loser: Yes assorter: contest_id: Proposition 19
upper bound: 1, winner defined: False, loser defined: False, assort defined: True p-value: 0.24999608501283765 margin: 0.05417607223476306 test: test: <bound method NonnegMean.alpha_mart of <NonnegMean.NonnegMean object at 0x7fea12120460>> estim: <bound method NonnegMean.optimal_comparison of <NonnegMean.NonnegMean object at 0x7fea12120460>> upper bound u: 1.0278422273781902 N: 2658 null mean t: 0.5 kwargs: {'g': 0.1} p-history length: 136 proved: False sample_size: 133 assorter upper bound: 1: current risk 0.24999608501283765

p-values for assertions in contest Proposition 20
	No v Yes: 1.5042537829185744e-05

contest Proposition 20 AUDIT COMPLETE at risk limit 

In [33]:
# Log the status of the audit 
audit.write_audit_parameters(contests)

In [74]:
print(contests['Proposition 20'])

{'id': 'Proposition 20', 'name': 'Prop 20', 'risk_limit': 0.05, 'cards': 4189, 'choice_function': 'PLURALITY', 'n_winners': 1, 'share_to_win': None, 'candidates': ['Yes', 'No'], 'winner': ['No'], 'assertion_file': None, 'audit_type': 'BALLOT_COMPARISON', 'test': <function NonnegMean.alpha_mart at 0x7fea31a2b0d0>, 'g': 0.1, 'estim': <function NonnegMean.optimal_comparison at 0x7fea31a2b4c0>, 'use_style': True, 'assertions': {'No v Yes': <Audit.Assertion object at 0x7fea121202b0>}, 'tally': None, 'sample_size': 47, 'cvrs': 4176, 'margins': {'No v Yes': 0.15206493196466941}, 'p_values': {'No v Yes': 1.5042537829185744e-05}, 'proved': {'No v Yes': 1}, 'max_p': 1.5042537829185744e-05}


# How many more cards should be audited?

Estimate how many more cards will need to be audited to confirm any remaining contests. The enlarged sample size is based on:

* cards already sampled
* the assumption that we will continue to see errors at the same rate observed in the sample

In [77]:
sample_size = audit.find_sample_size(contests, cvr_list, mvr_sample)
print(f'{sample_size=}\n{[(i, c.sample_size) for i, c in contests.items()]}')

sample_size=363
[('Proposition 19', 133), ('Proposition 20', 0), ('Proposition 21', 0), ('V-City of Laguna Woods', 40)]


In [None]:
# augment the sample
# reset the seed
prng = SHA256(seed)
old_sample = sample
sample = sample_by_index(max_cards, new_size, prng=prng)
incremental_sample = np.sort(list(set(sample) - set(old_sample)))
n_phantom_sample = np.sum([cvr_list[i].phantom for i in incremental_sample])
print("The incremental sample includes {} phantom cards.".format(n_phantom_sample))

In [None]:
cvr_sample_lookup_new, cvr_sample_new, mvr_phantoms_sample_new = \
                sample_from_cvrs(cvr_list, manifest, incremental_sample)
write_cards_sampled(sample_file, cvr_sample_lookup_new, print_phantoms=False)

In [None]:
# mvr_json should contain the complete set of mvrs, including those in previous rounds

with open(mvr_file) as f:
    mvr_json = json.load(f)

mvr_sample = CVR.from_dict(mvr_json['ballots']) 

In [None]:
# compile entire sample
cvr_sample_lookup, cvr_sample, mvr_phantoms_sample = sample_from_cvrs(cvr_list, manifest, sample)

In [None]:
# add MVRs for phantoms
mvr_sample = mvr_sample + mvr_phantoms_sample

## Find measured risks for all assertions

In [None]:
prep_sample(mvr_sample, cvr_sample)
p_max = find_p_values(contests, mvr_sample, cvr_sample, manifest_type, \
                      risk_function= risk_fn)
print("maximum assertion p-value {}".format(p_max))
done = summarize_status(contests)

In [None]:
# Log the status of the audit 
write_audit_parameters(log_file, seed, replacement, risk_function, g, max_cards, len(cvr_list), \
                       manifest_cards, phantom_cards, error_rate, contests)

In [None]:
x = np.ones(5)
y = x
y[3]=2
x

In [None]:
AUDIT_TYPES = (IRV:="IRV", other:="OTHER")

In [None]:
AUDIT_TYPES

In [None]:
import math
import numpy as np
import scipy as sp
import json
import csv
import warnings
import typing
from numpy import testing
from collections import OrderedDict, defaultdict
from cryptorandom.cryptorandom import SHA256, random, int_from_hash
from cryptorandom.sample import random_permutation
from cryptorandom.sample import sample_by_index

from CVR import CVR
from Audit import Audit, Assertion, Assorter, Contest
from NonnegMean import NonnegMean

In [None]:
rate = 0.01
N = int(10**4)
margin = 0.1
upper_bound = 1
u = 2/(2-margin/upper_bound)
m = (1 - rate*upper_bound/margin)/(2*upper_bound/margin - 1)
one_over = 1/3.8 # 0.5/(2-margin)
clean = 1/1.9    # 1/(2-margin)

AvB = Contest.from_dict({'id': 'AvB',
                     'name': 'AvB',
                     'risk_limit': 0.05,
                     'cards': 10**4,
                     'choice_function': Audit.SOCIAL_CHOICE_FUNCTION.PLURALITY,
                     'n_winners': 1,
                     'candidates': ['Alice','Bob', 'Carol'],
                     'winners': ['Alice'],
                     'audit_type': Audit.AUDIT_TYPE.BALLOT_COMPARISON,
                     'test': NonnegMean.kaplan_markov,
                     'tally': {'Alice': 3000, 'Bob': 2000, 'Carol': 1000},
                     'g': 0.1,
                     'use_style': True
                })
losers = list(set(AvB.candidates)-set(AvB.winners))
AvB.assertions = Assertion.make_plurality_assertions(AvB, winners=AvB.winners, losers=losers)
AvB.find_margins_from_tally()

In [None]:
# first test
for a_id, a in AvB.assertions.items():
    sam_size1 = a.find_sample_size(data=np.ones(10), prefix=True, rate=rate, reps=None, quantile=0.5, seed=1234567890)
    # Kaplan-Markov martingale is \prod (t+g)/(x+g). For x = [1, 1, ...], sample size should be:
    ss1 = math.ceil(np.log(AvB.risk_limit)/np.log((a.test.t+a.test.g)/(1+a.test.g)))
    print(f'{sam_size1} {ss1}')
    assert sam_size1 == ss1
    # For "clean", the term is (1/2+g)/(clean+g); for one_over
    # it is (1/2+g)/(one_over+g). 
    clean = 1/(2-a.margin/a.assorter.upper_bound)
    over = clean/2
    c = (a.test.t+a.test.g)/(clean+a.test.g)
    o = (a.test.t+a.test.g)/(clean/2+a.test.g)
    sam_size2 = a.find_sample_size(data=None, prefix=True, rate=rate, reps=10**3, quantile=0.5, seed=1234567890)
    ss2 = 1+math.ceil(np.log(AvB.risk_limit/o)/np.log(c))
    assert sam_size2 == ss2
    


In [None]:
c=0.957983193277311; o=1.652173913043478

In [None]:
o*c**(sam_size2-2)
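
The closed-form sample sizes used in the Kaplan-Markov test above can be reproduced without the audit classes. A minimal sketch using only the constants from the earlier cell (`risk_limit = 0.05`, null mean `t = 0.5`, `g = 0.1`, `margin = 0.1`, `upper_bound = 1`): it assumes, as the comments in the test do, that each sampled value `x` multiplies the running product by `(t+g)/(x+g)` and that the audit stops once the product is at most the risk limit.

```python
import math

risk_limit, t, g = 0.05, 0.5, 0.1
margin, upper_bound = 0.1, 1.0

# Data all equal to 1: each term is (t+g)/(1+g), so n terms suffice when
# ((t+g)/(1+g))**n <= risk_limit.
ss1 = math.ceil(math.log(risk_limit) / math.log((t + g) / (1 + g)))

# Comparison audit: a card that matches its CVR has value `clean`;
# a one-vote overstatement has value clean/2.
clean = 1 / (2 - margin / upper_bound)
c = (t + g) / (clean + g)        # factor contributed by a clean card
o = (t + g) / (clean / 2 + g)    # factor contributed by a one-vote overstatement

# One overstatement followed by k clean cards: need o * c**k <= risk_limit.
k = math.ceil(math.log(risk_limit / o) / math.log(c))
ss2 = 1 + k
print(ss1, c, o, ss2)
```

The values of `c` and `o` computed this way match the constants hard-coded in the scratch cell above.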