Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fail gracefully when there is no precinct information #650

Merged
merged 9 commits into from Jun 9, 2023

Conversation

artoonie
Copy link
Collaborator

@artoonie artoonie commented Apr 4, 2023

See the comments in #621.

In sum: tabulating by precinct fails because there is no precinct on these files. How should we handle that? Should we set "Precinct Portion" to be equal to "Precinct"? Should we prevent tabulating by precinct altogether?

This will require updating the tests that have precinct blank: https://github.com/BrightSpots/rcv/blob/develop/src/test/resources/network/brightspots/rcv/test_data/dominion_wyoming/dominion_wyoming_contest_1_expected.csv

Note: I will add a test when the design decisions are validated.

@@ -275,6 +278,10 @@ private int parseCvrFile(
String precinctPortion = this.precinctPortions.get(precinctPortionId);
String ballotTypeId = adjudicatedData.get("BallotTypeId").toString();

if (precinct == null && precinctPortion != null) {
precinct = precinctPortion;
}
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Based on the comments on line 261, this is a workaround to have precincts populated for older files that only has precinctPortion

@artoonie artoonie linked an issue Apr 4, 2023 that may be closed by this pull request
@HEdingfield
Copy link
Contributor

@tarheel @chughes297 any thoughts on @artoonie's questions here?

@tarheel
Copy link
Contributor

tarheel commented May 28, 2023

You'll need to talk to @chughes297 et al about this. My guess is that it's not valid to take the "precinct portion" and use it as the precinct, and that the tabulator should just fail gracefully whenever the user tries to tabulate by precinct but the precinct info is incomplete (i.e. not present for every single CVR) or missing entirely.

Btw, I don't think that @chughes297 is seeing his GitHub notifs, so I'd recommend emailing him directly and also seeing if he can adjust his notif settings for the future.

@chughes297
Copy link
Collaborator

Agree with @tarheel, I'm hesitant about populating this info because I don't know how precinct portion and precinct differ. I would really rather not produce precinct-level results that are inaccurate because of a mistaken assumption about how those two files relate to one another.

Also I just updated my notification settings so I should get emails just when I get tagged. I was getting emails for every comment on the rctab repo which was not helpful for actually seeing things when I got tagged...

@artoonie
Copy link
Collaborator Author

Great, so new approach would be to allow "Tabulate by Precinct," and if some CVRs don't have a precinct, gracefully fail.

@chughes297
Copy link
Collaborator

chughes297 commented May 31, 2023 via email

@artoonie artoonie changed the base branch from develop to cleanup-precinct-ids June 5, 2023 18:46
@artoonie artoonie force-pushed the feature/issue-621_no-dominion-precincts branch from 3d5e5a3 to 296d1f6 Compare June 5, 2023 18:48
@artoonie artoonie marked this pull request as ready for review June 5, 2023 18:53
@artoonie
Copy link
Collaborator Author

artoonie commented Jun 5, 2023

This is ready for review.

  1. I copied and pasted the dominion_no_precinct_data test. I wonder if we should begin allowing non-standalone tests: allow a relative path from one test to another, because this 86,000 line copy-pasta feels terrible. The new test would do well to just point to the other test's data (or to create a shared_test_data folder?)
  2. This now fails gracefully with an error message about not having precinct data, instead of an NPE.

Base automatically changed from cleanup-precinct-ids to develop June 9, 2023 17:26
# Conflicts:
#	src/main/java/network/brightspots/rcv/BaseCvrReader.java
#	src/main/java/network/brightspots/rcv/ClearBallotCvrReader.java
#	src/main/java/network/brightspots/rcv/CommonDataFormatReader.java
#	src/main/java/network/brightspots/rcv/CsvCvrReader.java
#	src/main/java/network/brightspots/rcv/DominionCvrReader.java
#	src/main/java/network/brightspots/rcv/StreamingCvrReader.java
#	src/main/java/network/brightspots/rcv/Tabulator.java
#	src/main/java/network/brightspots/rcv/TabulatorSession.java
@@ -598,4 +598,11 @@ void genericCsvTest() {
void noOneMeetsMinimumTest() {
runTabulationTest("no_one_meets_minimum", TabulationAbortedException.class.toString());
}

@Test
@DisplayName("gracefully fail when tabulate-by-precinct option set without any precincts in CVR")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@artoonie said:

I copied and pasted the dominion_no_precinct_data test. I wonder if we should begin allowing non-standalone tests: allow a relative path from one test to another, because this 86,000 line copy-pasta feels terrible. The new test would do well to just point to the other test's data (or to create a shared_test_data folder?)

Agreed, and I like this idea. The only potential issue is that it makes the tests non-hermetic (i.e. changing the input file for one could cause others to unintentionally fail). I don't think this is a good reason not to do this though.

Could you please create a follow-up issue for this (linking to this comment), and maybe we can sneak it into v1.4.0 if it seems relatively low effort?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep! #692

I think we can avoid the first issue by only sharing CVRs in a common directory, so it's clear that those CVRs are shared.

Comment on lines 855 to 858
if (precinctIds.size() == 0) {
Logger.severe("\"Tabulate by Precinct\" enabled, but CVRs don't list precincts.");
throw new TabulationAbortedException(false);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't it be more efficient to put this in the constructor just before initPrecinctRoundTallies() is called?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes but the constructor doesn't currently throw a TabulationAbortedException (nor do I think it should?), so it felt natural here since it's also where we check for other precinct-level issues.

Let me know if you think that's a good enough reason, or if there's another natural spot for this check to occur.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see any reason not to have the constructor throw a TabulationAbortedException (in fact, it's already caught in the place where the constructor is called).

I'd personally move it there because I like cleaner logs and not going forward with other calcs to no end, but it's not a super big deal. Will go ahead and approve and let you make the call either way. 👍

@artoonie artoonie changed the title Populate older dominion files with precinct information Fail gracefully when there is no precinct information Jun 9, 2023
Comment on lines 855 to 858
if (precinctIds.size() == 0) {
Logger.severe("\"Tabulate by Precinct\" enabled, but CVRs don't list precincts.");
throw new TabulationAbortedException(false);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see any reason not to have the constructor throw a TabulationAbortedException (in fact, it's already caught in the place where the constructor is called).

I'd personally move it there because I like cleaner logs and not going forward with other calcs to no end, but it's not a super big deal. Will go ahead and approve and let you make the call either way. 👍

@artoonie artoonie merged commit a8308e8 into develop Jun 9, 2023
1 check passed
@artoonie artoonie deleted the feature/issue-621_no-dominion-precincts branch June 9, 2023 19:09
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

NPE when tabulating by precinct
4 participants