# Montgomery County OH, Risk-Limiting Audit of 2020 Primary

This notebook provides some simple guidance on, and a way to document, the risk-limiting audit in Montgomery County, Ohio, of the [2020 Primary election, March 17, 2020](https://montgomerycountyoh.epulseadmin2.com/election_results/30/preview?import_id=69439).

See also the [Principles and Best Practices for Post-Election Tabulation Audits | ElectionAudits.org](https://electionaudits.org/principles/).

## Before the audit, publish election results and manifest to be audited

Election results as of 2020-05-11: https://r7j7u2j8.rocketcdn.me/wp-content/uploads/2020/05/03172020es-Official-Final-With-Write-ins.pdf

Ballot manifest: **montgomery_co_ballot_manifest_final.csv**

    SHA256 sum: c9a27d612e1a6001abe26f83ee74404602b0fa545383232c46871e3944b90b14

Commit to the election results by publishing this document, tweeting a hash of it, etc.
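One way to produce or re-check a digest like the one above is with Python's standard `hashlib`. This is a minimal sketch: the helper name `sha256_of_file` is made up for illustration, and it assumes the manifest sits in the current directory under the filename quoted above.

```python
import hashlib
from pathlib import Path

def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

# Assumed location of the manifest; adjust to wherever the file actually lives.
manifest = Path("montgomery_co_ballot_manifest_final.csv")
if manifest.exists():
    print(sha256_of_file(manifest))
```

Anyone holding a copy of the manifest can run the same check and compare the printed digest with the published one.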

In [3]:
# Establish 10% risk limit:
risk_limit = 0.1

## Next, roll dice to establish random seed:

Random seed (20 digits): **49986805286195567111**
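If the seed is built from 20 rolls of a ten-sided die, one digit per roll (an assumption; the notebook does not say which dice were used), the rolls can be recorded and joined as below. The helper name `seed_from_rolls` is hypothetical. Keeping the seed as a string preserves any leading zeros.

```python
def seed_from_rolls(rolls, digits=20):
    """Join ten-sided die rolls (each 0-9) into a fixed-length seed string."""
    if len(rolls) != digits:
        raise ValueError(f"expected {digits} rolls, got {len(rolls)}")
    if any(not 0 <= r <= 9 for r in rolls):
        raise ValueError("each roll must be a digit from 0 to 9")
    return "".join(str(r) for r in rolls)

# The digits of the seed recorded above, entered one roll at a time
rolls = [4, 9, 9, 8, 6, 8, 0, 5, 2, 8, 6, 1, 9, 5, 5, 6, 7, 1, 1, 1]
seed = seed_from_rolls(rolls)
print(seed)
```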

## Estimate round sizes

Initialize software, load election results data, and show round sizes for a 70%, 80%, and 90% chance of finishing the audit in the round.

In [1]:
import pandas as pd
import json
from IPython.display import display, HTML

# For the Athena calculations, we either import locally (after the repo was cloned)
# or we first clone it and then use it (e.g., when run in Google Colab)
shell = get_ipython().__class__.__name__ 

if shell == 'Shell':
    # imports when launched in e.g., Google Colab
    !git clone https://github.com/filipzz/athena.git r2b2
    from r2b2.code.athena.athena import AthenaAudit
    from r2b2.code.athena.contest import Contest
    from r2b2.code.athena.audit import Audit
else: # shell ==  'ZMQInteractiveShell' or shell == 'TerminalInteractiveShell'
    # local imports if you run it with e.g., Jupyter
    from athena.athena import AthenaAudit
    from athena.contest import Contest
    from athena.audit import Audit

In [2]:
# Results, based on data at https://r7j7u2j8.rocketcdn.me/wp-content/uploads/2020/05/03172020es-Official-Final-With-Write-ins.pdf
# results_file is a local path here; a URL may be used instead
results_file = "athena/test_data/2020_montgomery-0511-formatted.json"

In [4]:
# Use a context manager so the file is closed after loading
with open(results_file, 'r') as f:
    results = json.load(f)

In [5]:
results['total_ballots']

69743

In [6]:
# For each contest, display reported results and sample sizes
htmlout = ""
for contest in results['contests']:
    htmlout += f"<H1>Contest: {contest}</H1>\n"
    w = Audit("athena", risk_limit)

    w.read_election_results(results_file)

    w.load_contest(contest)
    htmlout += w.show_election_results().render()

    round_sizes = w.predict_round_sizes([.7, .8, .9])
    df_rs = pd.DataFrame({f'{pstop:.0%}': ss for pstop, ss in round_sizes}, index=['Sample size'])
    htmlout += "<p>Sample sizes by stopping probability:" + df_rs.to_html()

display(HTML(htmlout))

Unnamed: 0,Candidates,Results
0,Bennet,51
1,Biden,29011
2,Bloomberg,702
3,Buttigieg,525
4,Gabbard,137
5,Klobuchar,406
6,Patrick,27
7,Sanders,5713
8,Steyer,62
9,Warren,1118

Unnamed: 0,70%,80%,90%
Sample size,21,25,29

Unnamed: 0,Candidates,Results
0,Moyer,10130
1,Tims,25047

Unnamed: 0,70%,80%,90%
Sample size,44,62,80

Unnamed: 0,Candidates,Results
0,Fogel,16867
1,Griggs,3643

Unnamed: 0,70%,80%,90%
Sample size,35,41,58

Unnamed: 0,Candidates,Results
0,Dodge,24425
1,Rountree,10888

Unnamed: 0,70%,80%,90%
Sample size,68,72,101

Unnamed: 0,Candidates,Results
0,Lieberman,28353
1,West,6865

Unnamed: 0,70%,80%,90%
Sample size,24,24,38

Unnamed: 0,Candidates,Results
0,Anderson,1964
1,Flanders,1454
2,Turner,24224

Unnamed: 0,70%,80%,90%
Sample size,19,20,24

Unnamed: 0,Candidates,Results
0,Antani,14866
1,Robinson,2885
2,Selby,5317

Unnamed: 0,70%,80%,90%
Sample size,77,83,115

Unnamed: 0,Candidates,Results
0,Stubbs,2341
1,Young,6644

Unnamed: 0,70%,80%,90%
Sample size,156,171,257

Unnamed: 0,Candidates,Results
0,Scearce,8538
1,Setzer,15691

Unnamed: 0,70%,80%,90%
Sample size,144,167,228


## Enter the election data and random seed into Arlo

Publish the ballot selection information from Arlo.

# Actual results

See if the evidence supports finishing the audit.

Do this for each contest.

Enter the sample tally data below for the first round

## Presidential Primary (D)

In [7]:
contest = 'd_president'

In [8]:
w = Audit("athena", risk_limit)
w.read_election_results(results_file)
w.load_contest(contest)

In [9]:
w.set_observations(240, 150, [0, 118, 4, 2, 0, 1, 0, 20, 0, 4, 1])
w.present_state()



	Audit Successfully completed!
	LR:		4.517698142736021e+16	[needs to be > 1]
	p-value:	6.66819221488288e-18	[needs to be <= 0.1]


Unnamed: 0,Candidates,Results,Round 1,Total,Required
0,Bennet,51,0.0,0.0,
1,Biden,29011,118.0,118.0,107.0
2,Bloomberg,702,4.0,4.0,
3,Buttigieg,525,2.0,2.0,
4,Gabbard,137,0.0,0.0,
5,Klobuchar,406,1.0,1.0,
6,Patrick,27,0.0,0.0,
7,Sanders,5713,20.0,20.0,
8,Steyer,62,0.0,0.0,
9,Warren,1118,4.0,4.0,


## City Commissioner 1-2-21 (D)

In [10]:
contest = 'd_cc_1_2_2021'

In [11]:
w = Audit("athena", risk_limit)
w.read_election_results(results_file)
w.load_contest(contest)

In [12]:
w.set_observations(240, 140, [89, 51])
w.present_state()



	Audit Successfully completed!
	LR:		68.3886512239166	[needs to be > 1]
	p-value:	0.0008920047915490925	[needs to be <= 0.1]


Unnamed: 0,Candidates,Results,Round 1,Total,Required
0,Dodge,24425,89.0,89.0,84.0
1,Rountree,10888,51.0,51.0,
2,,Sum,140.0,,
3,,LR,68.3887,,
4,,P-Value,0.0009,,
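As a cross-check on the LR figure, the standard two-candidate (BRAVO-style) likelihood ratio of the reported result against a tie can be computed by hand from the reported totals and the sample tally. This is a sketch with a made-up helper name, not Athena's own code, and it does not reproduce Athena's p-value, which is a different quantity. With the Dodge/Rountree numbers it should land close to the LR printed above.

```python
import math

def bravo_likelihood_ratio(reported_winner, reported_loser, sample_winner, sample_loser):
    """Likelihood ratio of the reported vote share versus a tie (two candidates)."""
    p_w = reported_winner / (reported_winner + reported_loser)
    # Sum logs rather than multiplying powers, to stay stable for large samples
    log_lr = (sample_winner * math.log(p_w / 0.5)
              + sample_loser * math.log((1 - p_w) / 0.5))
    return math.exp(log_lr)

# Reported totals: Dodge 24425, Rountree 10888; round 1 sample: 89 and 51
lr = bravo_likelihood_ratio(24425, 10888, 89, 51)
print(f"{lr:.4f}")
```

Each ballot for the reported winner multiplies the ratio by `p_w / 0.5` and each ballot for the loser multiplies it by `(1 - p_w) / 0.5`, so a sample that tracks the reported shares pushes the ratio up quickly.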


## City Commissioner 1-2-21 (R)

In [13]:
contest = 'r_cc_1_2_2021'

In [14]:
w = Audit("athena", risk_limit)
w.read_election_results(results_file)
w.load_contest(contest)

In [15]:
w.set_observations(240, 80, [20, 49])
w.present_state()



	Audit Successfully completed!
	LR:		292.3643415969275	[needs to be > 1]
	p-value:	0.0018886244991318493	[needs to be <= 0.1]


Unnamed: 0,Candidates,Results,Round 1,Total,Required
0,Scearce,8538,20.0,20.0,
1,Setzer,15691,49.0,49.0,41.0
2,,Sum,80.0,,
3,,LR,292.3643,,
4,,P-Value,0.0019,,


# Ballot-by-Ballot resampling

In [13]:
cvrs = '/srv/voting/audit/oh/montgomery/Montgomery_Tally_Sheets-s1.csv'

In [68]:
cvrdf = pd.read_csv(cvrs)
cvrdf.fillna(0)

In [69]:
cvrdf

Unnamed: 0,Audit Board,Batch Name,Ballot Number,Storage Location,Tabulator,Ticket Numbers,Already Audited,Bennet,Biden,Bloomberg,...,O/U.4,Unnamed: 49,Scearce.1,Setzer.1,O/U.5,Unnamed: 53,Nonpartisan.1,Unnamed: 55,Unnamed: 56,Unnamed: 57
0,Audit Board #1,A0006,118,,,0.002343,N,,,1.0,...,,,,,,,,,,
1,Audit Board #1,A0006,180,,,0.002369,N,,,1.0,...,,,,,,,,,,
2,Audit Board #1,A0014,176,,,0.001015,N,,,,...,,,,,,,,,,
3,Audit Board #1,A0032,125,,,0.000374,N,,,,...,,,,,,,,,,
4,Audit Board #1,A0033,172,,,0.001433,N,,,,...,,,,,,,,,,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
237,Audit Board #4,EARLY-6,344,,,0.001888,N,,,,...,,,,,,,,,,
238,Audit Board #4,EARLY-6,384,,,0.003033,N,,,,...,,,,,,,,,,
239,Audit Board #4,EARLY-6,667,,,0.002330,N,,,,...,,,,,,,,,,
240,,,,,,,,,,,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,,,


In [71]:
# Drop the totals rows at the end of the sheet.
# They should be checked against the per-ballot sums first.
cvrdf = cvrdf.drop([240, 241]).sort_values('Ticket Numbers')

In [75]:
cvrdf = cvrdf.fillna(0)

In [37]:
cvrdf.columns

Index(['Audit Board', 'Batch Name', 'Ballot Number', 'Storage Location',
       'Tabulator', 'Ticket Numbers', 'Already Audited', 'Bennet', 'Biden',
       'Bloomberg', 'Booker', 'Buttigieg', 'Gabbard', 'Klobuchar', 'Patrick',
       'Sanders', 'Steyer', 'Warren', 'Write-In', 'O/U', 'Unnamed: 20',
       'Dodge', 'Rountree', 'O/U.1', 'Unnamed: 24', 'Scearce', 'Setzer',
       'O/U.2', 'Unnamed: 28', 'Nonpartisan', 'Total Ballots', 'Unnamed: 31',
       'Bennet.1', 'Biden.1', 'Bloomberg.1', 'Booker.1', 'Buttigieg.1',
       'Gabbard.1', 'Klobuchar.1', 'Patrick.1', 'Sanders.1', 'Steyer.1',
       'Warren.1', 'Write-In.1', 'O/U.3', 'Unnamed: 45', 'Dodge.1',
       'Rountree.1', 'O/U.4', 'Unnamed: 49', 'Scearce.1', 'Setzer.1', 'O/U.5',
       'Unnamed: 53', 'Nonpartisan.1', 'Unnamed: 55', 'Unnamed: 56',
       'Unnamed: 57'],
      dtype='object')

In [76]:
cvrdf

Unnamed: 0,Audit Board,Batch Name,Ballot Number,Storage Location,Tabulator,Ticket Numbers,Already Audited,Bennet,Biden,Bloomberg,...,O/U.4,Unnamed: 49,Scearce.1,Setzer.1,O/U.5,Unnamed: 53,Nonpartisan.1,Unnamed: 55,Unnamed: 56,Unnamed: 57
193,Audit Board #4,A0094,27,0.0,0.0,0.000018,N,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
143,Audit Board #3,B0035,64,0.0,0.0,0.000032,N,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
129,Audit Board #3,A0123,102,0.0,0.0,0.000040,N,0.0,1.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
140,Audit Board #3,B0019,85,0.0,0.0,0.000058,N,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
218,Audit Board #4,B0092,63,0.0,0.0,0.000062,N,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
39,Audit Board #1,B0071,81,0.0,0.0,0.003160,N,0.0,1.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
204,Audit Board #4,A0132,151,0.0,0.0,0.003166,N,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
96,Audit Board #2,B0089,160,0.0,0.0,0.003176,N,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
74,Audit Board #2,A0093,14,0.0,0.0,0.003178,N,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0


In [43]:
core_columns = ['Ticket Numbers', 'Batch Name', 'Ballot Number']
d_cc = ['Dodge', 'Rountree', 'O/U.1']
r_cc = ['Scearce', 'Setzer', 'O/U.2']

In [77]:
repdf = cvrdf[core_columns + r_cc]

In [78]:
repdf

Unnamed: 0,Ticket Numbers,Batch Name,Ballot Number,Scearce,Setzer,O/U.2
193,0.000018,A0094,27,1.0,0.0,0.0
143,0.000032,B0035,64,0.0,0.0,1.0
129,0.000040,A0123,102,0.0,0.0,0.0
140,0.000058,B0019,85,0.0,1.0,0.0
218,0.000062,B0092,63,0.0,0.0,0.0
...,...,...,...,...,...,...
39,0.003160,B0071,81,0.0,0.0,0.0
204,0.003166,A0132,151,1.0,0.0,0.0
96,0.003176,B0089,160,0.0,0.0,0.0
74,0.003178,A0093,14,0.0,1.0,0.0


In [79]:
repdf.describe()

Unnamed: 0,Ticket Numbers,Scearce,Setzer,O/U.2
count,240.0,240.0,240.0,240.0
mean,0.001648,0.033333,0.15,0.045833
std,0.0009,0.179881,0.357818,0.209561
min,1.8e-05,0.0,0.0,0.0
25%,0.000893,0.0,0.0,0.0
50%,0.00165,0.0,0.0,0.0
75%,0.002421,0.0,0.0,0.0
max,0.003189,1.0,1.0,1.0


In [80]:
repdf.sum()

Ticket Numbers                                             0.395569
Batch Name        A0094B0035A0123B0019B0092B0012A0053A0151A0134B...
Ballot Number     2764102856367195741451091161827111810019411417...
Scearce                                                           8
Setzer                                                           36
O/U.2                                                            11
dtype: object
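Because the sheet is sorted by ticket number, the tally for any prefix of the sample can be recomputed, for example to see what a smaller round would have observed. The sketch below uses a small made-up frame in place of `cvrdf`; the helper name `tally_first_n` is hypothetical, and the column names are the only thing borrowed from the real tally sheet.

```python
import pandas as pd

# Toy stand-in for the tally sheet: one row per sampled ballot (made-up values)
toy = pd.DataFrame({
    "Ticket Numbers": [0.0031, 0.0002, 0.0017, 0.0009],
    "Scearce": [0, 1, 0, 0],
    "Setzer": [1, 0, 1, 0],
    "O/U.2": [0, 0, 0, 1],
})

def tally_first_n(df, candidate_columns, n):
    """Sum candidate marks over the n ballots with the smallest ticket numbers."""
    ordered = df.sort_values("Ticket Numbers")
    return ordered.head(n)[candidate_columns].sum().astype(int).tolist()

# Tally for the first three ballots in ticket order
print(tally_first_n(toy, ["Scearce", "Setzer"], 3))
```

The returned list is in the order of `candidate_columns`, so it can be compared directly with a per-candidate tally entered by hand.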

In [19]:
dpresdf = cvrdf

Unnamed: 0,Audit Board,Batch Name,Ballot Number,Storage Location,Tabulator,Ticket Numbers,Already Audited,Bennet,Biden,Bloomberg,...,O/U.4,Unnamed: 49,Scearce.1,Setzer.1,O/U.5,Unnamed: 53,Nonpartisan.1,Unnamed: 55,Unnamed: 56,Unnamed: 57
0,Audit Board #1,A0006,118,,,0.002343383,N,,,1,...,,,,,,,,,,
1,Audit Board #1,A0006,180,,,0.002368673,N,,,1,...,,,,,,,,,,
2,Audit Board #1,A0014,176,,,0.001015005,N,,,,...,,,,,,,,,,
3,Audit Board #1,A0032,125,,,0.000373813,N,,,,...,,,,,,,,,,
4,Audit Board #1,A0033,172,,,0.001433460,N,,,,...,,,,,,,,,,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
237,Audit Board #4,EARLY-6,344,,,0.001888002,N,,,,...,,,,,,,,,,
238,Audit Board #4,EARLY-6,384,,,0.003032824,N,,,,...,,,,,,,,,,
239,Audit Board #4,EARLY-6,667,,,0.002330398,N,,,,...,,,,,,,,,,
240,,,,,,,,,,,...,0,0,0,0,0,0,0,,,


# Other examples

## Set up a contest

In [16]:
contest = 'd_president'

In [17]:
w = Audit("athena", risk_limit)
w.read_election_results(results_file)
w.load_contest(contest)

## Tally the first round of samples and check the results
See if the evidence supports finishing the audit.

Do this for each contest.

Enter the sample tally data below for the first round

_**Note, this is just EXAMPLE DATA from test11**_

In [18]:
w.set_observations(200, 100, [0,60,2,2,0,0,0,30,0,6,0])
w.present_state()



	Round: 1 audit failed
	LR:		0.07865222182791687	[needs to be > 1]
	Delta:		12.714198998572465	[needs to be < 1]
	p-value:	0.0010301588480453989	[needs to be <= 0.1]
	both conditions are required to be satisfied.


Unnamed: 0,Candidates,Results,Round 1,Total,Required
0,Bennet,51,0.0,0.0,
1,Biden,29011,60.0,60.0,62.0
2,Bloomberg,702,2.0,2.0,
3,Buttigieg,525,2.0,2.0,
4,Gabbard,137,0.0,0.0,
5,Klobuchar,406,0.0,0.0,
6,Patrick,27,0.0,0.0,
7,Sanders,5713,30.0,30.0,
8,Steyer,62,0.0,0.0,
9,Warren,1118,6.0,6.0,


Note that the results differ somewhat from the test case: test11 gives a delta of 15.63488304 (LR of 0.06395), while here the delta is 12.714 (LR of 0.0787). In both cases delta is the reciprocal of LR. The difference is presumably because the contest details have been updated by a few percent.

## Continue until audit is completed

If there isn't enough evidence yet to complete the audit, pull more ballots and enter more observations as in the last cell.

Enter the incremental data from each round, not cumulative results.
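If the audit boards report running totals, the per-round increments can be derived by subtracting the previous cumulative tally. This is a small sketch with a hypothetical helper name; the numbers are the example data used in this section.

```python
def incremental_tally(previous_cumulative, current_cumulative):
    """Return per-candidate counts added since the previous cumulative tally."""
    if len(previous_cumulative) != len(current_cumulative):
        raise ValueError("tallies must list the same candidates in the same order")
    increments = [cur - prev for prev, cur in zip(previous_cumulative, current_cumulative)]
    if any(delta < 0 for delta in increments):
        raise ValueError("a cumulative tally cannot decrease between rounds")
    return increments

round1 = [0, 60, 2, 2, 0, 0, 0, 30, 0, 6, 0]
cumulative_after_round2 = [0, 130, 3, 3, 0, 0, 0, 53, 0, 11, 0]
round2 = incremental_tally(round1, cumulative_after_round2)
print(round2)  # [0, 70, 1, 1, 0, 0, 0, 23, 0, 5, 0]
```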

In [19]:
round_sizes = w.predict_round_sizes([.7, .8, .9])

In [20]:
round_sizes

[[0.7, 115], [0.8, 125], [0.9, 138]]

In [21]:
w.set_observations(200, 100, [0,70,1,1,0,0,0,23,0,5,0])
w.present_state()



	Audit Successfully completed!
	LR:		2513.0443937554787	[needs to be > 1]
	p-value:	6.441264615738567e-06	[needs to be <= 0.1]


Unnamed: 0,Candidates,Results,Round 1,Round 2,Total,Required
0,Bennet,51,0.0,0.0,0.0,
1,Biden,29011,60.0,70.0,130.0,126.0
2,Bloomberg,702,2.0,1.0,3.0,
3,Buttigieg,525,2.0,1.0,3.0,
4,Gabbard,137,0.0,0.0,0.0,
5,Klobuchar,406,0.0,0.0,0.0,
6,Patrick,27,0.0,0.0,0.0,
7,Sanders,5713,30.0,23.0,53.0,
8,Steyer,62,0.0,0.0,0.0,
9,Warren,1118,6.0,5.0,11.0,


## Publish and share this notebook

Also incorporate the final Arlo audit report.

# A different example

## Set up a contest

In [22]:
contest = 'r_senator'

In [23]:
w = Audit("athena", risk_limit)
w.read_election_results(results_file)
w.load_contest(contest)

## Tally the first round of samples and check the results
See if the evidence supports finishing the audit.

Do this for each contest.

Enter the sample tally data below for the first round

_**Note, this is just EXAMPLE DATA....**_

In [24]:
w.set_observations(115, 30, [16, 5, 9])
w.present_state()



	Round: 1 audit failed
	LR:		1.538908805781586	[needs to be > 1]
	Delta:		0.64981108447951	[needs to be < 1]
	p-value:	0.126973980090786	[needs to be <= 0.1]
	both conditions are required to be satisfied.


Unnamed: 0,Candidates,Results,Round 1,Total,Required
0,Antani,14866,16.0,16.0,17.0
1,Robinson,2885,5.0,5.0,
2,Selby,5317,9.0,9.0,
3,,Sum,30.0,,
4,,LR,1.5389,,
5,,P-Value,0.127,,


## Continue until audit is completed

If there isn't enough evidence yet to complete the audit, pull more ballots and enter more observations as in the last cell.

Enter the incremental data from each round, not cumulative results.

In [25]:
round_sizes = w.predict_round_sizes([.7, .8, .9])

In [26]:
round_sizes

[[0.7, 129], [0.8, 161], [0.9, 202]]

In [27]:
w.set_observations(202, 77, [37, 10, 20])
w.present_state()



	Audit Successfully completed!
	LR:		7.018633183315828	[needs to be > 1]
	p-value:	0.015665256319630765	[needs to be <= 0.1]


Unnamed: 0,Candidates,Results,Round 1,Round 2,Total,Required
0,Antani,14866,16.0,37.0,53.0,52.0
1,Robinson,2885,5.0,10.0,15.0,
2,Selby,5317,9.0,20.0,29.0,
3,,Sum,30.0,77.0,,
4,,LR,1.5389,7.0186,,
5,,P-Value,0.127,0.0157,,
