# Demonstrate selection of samples for RLA

Show how the selections used in the Orange County 2018 Primary risk-limiting audit can be reproduced.

To demonstrate other audits, change the parameters defined below.

Based on Rivest's `sampler` algorithm and code. Install the Python 3 port from
  https://github.com/nealmcb/rivest-sampler-tests/tree/python3-port

In [1]:
import sys
sys.path.insert(0, '../../rivest-sampler-tests/src')

In [2]:
import demo_selections
import codecs
import csv
from pprint import pprint

## Define parameters

In [3]:
countyname = "Orange County"
manifestname = "combined-manifest.csv"
seed = "81330464974734480366"

In [4]:
# Read the ballot manifest: each row gives a batch name and its sheet count
batches = []
with open(manifestname, newline='', encoding='iso-8859-1') as manifest:
    reader = csv.reader(manifest)
    next(reader)  # Skip header row
    for row in reader:
        batchname = row[0]
        count = int(row[1])
        batches.append(demo_selections.Batch(countyname, "", batchname, count, ""))

N = sum(batch.cardcount for batch in batches)
print("There are a total of %d ballot sheets in the manifest.\n" % N)

demo_selections.sampler_example(N, 20, seed, 3)

There are a total of 1447871 ballot sheets in the manifest.

First we show how the seed is used by the sampler algorithm to select ballots
 from the CVR file. The seed, 81330464974734480366, is paired up
 with the numbers between 1 and the number of selections we need.
 We show details for the first 3:

sha256('81330464974734480366,1')
 = 185e10255b09240b64e90a9695e7b269c92f3a3922b0a2c07073424903b3962e base 16
 = 11021703425134315629919584077641002952166874100921455850668298360203563472430 base 10.
 = 413627 mod 1447871. So the cvr_number of selected CVR number 1 is 413627.
sha256('81330464974734480366,2')
 = d824cacae12e120cda675e00da600120246ffc2e40584de36f879ee9466b737c base 16
 = 97764581410703058116217141074916508336601353442740087313844070728367696737148 base 10.
 = 1356617 mod 1447871. So the cvr_number of selected CVR number 2 is 1356617.
sha256('81330464974734480366,3')
 = a079d5cb992aaa98868984f42ca48255154bb4bcec86cd13bba504fc4d8af486 base 16
 = 72585319829132179025304998760
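
The hashing step described in the output above can be sketched in a few lines. This is a minimal illustration, not the `demo_selections` or `sampler` code itself: the function name `sketch_draw` is hypothetical, and reducing the hash directly modulo N is an assumption based on the printed output. The real library may apply a further offset or treat repeated picks differently.

```python
import hashlib


def sketch_draw(seed, draw_number, population):
    """Hash 'seed,draw_number' with SHA-256 and reduce it modulo population.

    Hypothetical helper for illustration only; the modulo convention is an
    assumption taken from the printed output, not from the library source.
    """
    hash_input = "%s,%d" % (seed, draw_number)
    digest_hex = hashlib.sha256(hash_input.encode("ascii")).hexdigest()
    # Read the 64 hex digits as one large integer, then reduce modulo N.
    return int(digest_hex, 16) % population


seed = "81330464974734480366"
N = 1447871  # total ballot sheets reported for the manifest
for i in range(1, 4):
    print(i, sketch_draw(seed, i, N))
```

Because the result depends only on the seed, the draw number, and N, anyone holding the public seed and the manifest total can recompute each draw independently.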

In [5]:
# Print the first 20 selections in the order they were picked
selections = demo_selections.select_ballots_to_audit(seed, batches, 20)
pprint(selections)

['Orange County--1-3-288-117',
 'Orange County--48256-242',
 'Orange County--1-3-388-203',
 'Orange County--1-9-19-150',
 'Orange County--1-4-174-66',
 'Orange County--1-1-64-290',
 'Orange County--1-6-468-72',
 'Orange County--1-3-445-92',
 'Orange County--1-1-55-256',
 'Orange County--1-5-125-321',
 'Orange County--14075-163',
 'Orange County--1-1-142-206',
 'Orange County--1-2-442-66',
 'Orange County--1-2-346-264',
 'Orange County--1-8-267-53',
 'Orange County--1-6-4-105',
 'Orange County--1-3-173-99',
 'Orange County--1-2-13-223',
 'Orange County--1-4-323-77',
 'Orange County--1-2-354-291']
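
Each identifier above ends with a batch name and a position within that batch. One way to turn a sheet number drawn from the whole manifest into such a pair is to walk the running totals of the batch counts. The sketch below assumes 1-based sheet numbers and positions; `locate_sheet` and the toy manifest are hypothetical and are not taken from `demo_selections`, which may number sheets differently.

```python
from bisect import bisect_left
from itertools import accumulate


def locate_sheet(batch_counts, sheet_number):
    """Map a 1-based sheet number to (batch name, 1-based position in batch).

    batch_counts is a list of (name, count) pairs in manifest order.
    Hypothetical helper; the numbering convention is an assumption.
    """
    names = [name for name, _ in batch_counts]
    running_totals = list(accumulate(count for _, count in batch_counts))
    if not 1 <= sheet_number <= running_totals[-1]:
        raise ValueError("sheet number outside the manifest")
    # Find the first batch whose running total reaches the sheet number.
    i = bisect_left(running_totals, sheet_number)
    sheets_before = running_totals[i - 1] if i else 0
    return names[i], sheet_number - sheets_before


# Toy manifest with made-up batch names and counts
toy_manifest = [("1-1-1", 300), ("1-1-2", 250), ("1-1-3", 400)]
print(locate_sheet(toy_manifest, 301))  # ('1-1-2', 1)
```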


In [6]:
# Print the same 20 selections sorted by imprintedId (natural order)
sorted_selections = sorted(selections, key=demo_selections.natural_keys)
pprint(sorted_selections)

['Orange County--1-1-55-256',
 'Orange County--1-1-64-290',
 'Orange County--1-1-142-206',
 'Orange County--1-2-13-223',
 'Orange County--1-2-346-264',
 'Orange County--1-2-354-291',
 'Orange County--1-2-442-66',
 'Orange County--1-3-173-99',
 'Orange County--1-3-288-117',
 'Orange County--1-3-388-203',
 'Orange County--1-3-445-92',
 'Orange County--1-4-174-66',
 'Orange County--1-4-323-77',
 'Orange County--1-5-125-321',
 'Orange County--1-6-4-105',
 'Orange County--1-6-468-72',
 'Orange County--1-8-267-53',
 'Orange County--1-9-19-150',
 'Orange County--14075-163',
 'Orange County--48256-242']
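
The ordering above places `1-1-55` before `1-1-142`, which a plain string sort would not do. A natural sort key that compares digit runs as integers produces this kind of ordering. The sketch below is an assumption about the general technique; `natural_key_sketch` is a hypothetical name, and `demo_selections.natural_keys` may be implemented differently.

```python
import re


def natural_key_sketch(text):
    """Split text into alternating non-digit and digit runs for sorting.

    re.split with a capturing group puts the digit runs at odd indices,
    so those are converted to int and compared numerically.
    """
    parts = re.split(r"(\d+)", text)
    return [int(part) if i % 2 else part for i, part in enumerate(parts)]


ids = [
    "Orange County--1-1-142-206",
    "Orange County--14075-163",
    "Orange County--1-1-55-256",
    "Orange County--1-9-19-150",
]
for imprinted_id in sorted(ids, key=natural_key_sketch):
    print(imprinted_id)
```

Sorting by this key keeps ballots from the same scanner and batch together and in numeric order, which is convenient when pulling paper from storage.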


In [7]:
# For each of the 8 consecutive groups of 20 selections, print the ballot ID that sorts first
selections = demo_selections.select_ballots_to_audit(seed, batches, 160)
for index in range(0, 160, 20):
    sheet = selections[index:index+20]
    sheet.sort(key=demo_selections.natural_keys)
    print("%d: %s" % (index, sheet[0]))

0: Orange County--1-1-55-256
20: Orange County--1-1-2-54
40: Orange County--1-1-128-202
60: Orange County--1-1-359-15
80: Orange County--1-1-402-142
100: Orange County--1-1-67-85
120: Orange County--1-1-444-146
140: Orange County--1-1-18-302
