deprecate the concept of "codes" (almost) entirely #690

Merged
merged 6 commits into from
Jun 14, 2023
13 changes: 7 additions & 6 deletions config_file_documentation.txt
@@ -123,10 +123,10 @@ Config file must be valid JSON format. Examples can be found in the test_data fo
 example: "Duson, Jill C."
 value: string of length [1..1000]

-"code" optional
-candidate code which may appear in CVRs in lieu of full candidate name
-example: "JCD"
-value: string of length [1..1000]
+"aliases" optional
+aliases which may appear in CVRs in addition to candidate name
+example: ["JCD"], or ["Jimmy", "Jim"]
+value: list containing string(s) of length [1..1000]

 "excluded" optional
 candidate should be ignored during tabulation
@@ -178,8 +178,9 @@ Config file must be valid JSON format. Examples can be found in the test_data fo
 if selecting a loser the tied candidate who appears latest is selected as the loser

 "generatePermutation"
-on config load candidate names are sorted alphabetically by candidate code, or if code is not present, candidate name
-a randomly ordered candidate permutation is created using Collections.shuffle() with the randomSeed specified in the input config file
+on config load candidate names are sorted alphabetically by candidate name
+a randomly ordered candidate permutation is created using Collections.shuffle() with the randomSeed specified
+in the input config file
 during tabulation, in the event of a tie at the end of a round, this permutation is consulted
 if selecting a winner: the tied candidate in this round who appears earliest is selected
 if selecting a loser: the tied candidate who appears latest is selected
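The "generatePermutation" tie-break described above can be sketched roughly as follows. This is a minimal illustration, not the project's Tabulator code: the class and method names are hypothetical, and it assumes the seed is passed to `Collections.shuffle()` through a `java.util.Random` instance.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical sketch of the documented tie-break; names are illustrative only.
final class PermutationTiebreakSketch {

  // Sort names alphabetically, then shuffle with a seeded Random so that the
  // same seed always produces the same permutation (assumed seeding approach).
  static List<String> generatePermutation(List<String> candidateNames, long randomSeed) {
    List<String> permutation = new ArrayList<>(candidateNames);
    Collections.sort(permutation);
    Collections.shuffle(permutation, new Random(randomSeed));
    return permutation;
  }

  // Selecting a winner: the tied candidate who appears earliest is chosen.
  static String selectWinner(List<String> permutation, List<String> tied) {
    for (String candidate : permutation) {
      if (tied.contains(candidate)) {
        return candidate;
      }
    }
    throw new IllegalArgumentException("no tied candidate appears in the permutation");
  }

  // Selecting a loser: the tied candidate who appears latest is chosen.
  static String selectLoser(List<String> permutation, List<String> tied) {
    for (int i = permutation.size() - 1; i >= 0; i--) {
      if (tied.contains(permutation.get(i))) {
        return permutation.get(i);
      }
    }
    throw new IllegalArgumentException("no tied candidate appears in the permutation");
  }

  public static void main(String[] args) {
    List<String> permutation = generatePermutation(List.of("Carol", "Alice", "Bob"), 1234L);
    System.out.println("permutation: " + permutation);
    System.out.println("winner: " + selectWinner(permutation, List.of("Alice", "Bob")));
    System.out.println("loser: " + selectLoser(permutation, List.of("Alice", "Bob")));
  }
}
```

Sorting before shuffling is what makes the result independent of the order in which candidates are listed in the config, which is why this change to sort by name rather than by code matters for reproducibility.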
50 changes: 21 additions & 29 deletions src/main/java/network/brightspots/rcv/ContestConfig.java
@@ -91,12 +91,10 @@ class ContestConfig {
   private final String sourceDirectory;
   // Used to track a sequential multi-seat race
   private final List<String> sequentialWinners = new LinkedList<>();
-  // Candidate display names (no aliases or codes)
+  // Candidate display names (no aliases)
   private Set<String> candidateNames;
   // Mapping from any candidate alias to the candidate's display name
   private Map<String, String> candidateAliasesToNameMap;
-  // Mapping from any candidate alias to the candidate's code
-  private Map<String, String> candidateAliasesToCodeMap;
   // A list of any validation errors
   private Set<ValidationError> validationErrors = new HashSet<>();

@@ -519,20 +517,20 @@ private void validateOutputSettings() {
     }
   }

-  // checks for conflicts between a candidate name and other name/codes or other reserved strings
-  // param: candidateString is a candidate name or code
-  // param: field is either "name" or "code"
-  // param: candidateStringsSeen is a running set of names/codes we've already encountered
+  // checks for conflicts between a candidate name and other aliases or reserved strings
+  // param: candidateString is a candidate name or alias
+  // param: candidateStringsSeen is a running set of names / aliases we've already encountered
   private boolean candidateStringAlreadyInUseElsewhere(
-      String candidateString, String field, Set<String> candidateStringsSeen) {
+      String candidateString, Set<String> candidateStringsSeen) {
     boolean inUse = false;
     if (candidateStringsSeen.contains(candidateString)) {
       inUse = true;
-      Logger.severe("Duplicate candidate %ss are not allowed: %s", field, candidateString);
+      Logger.severe("Duplicate candidate names or aliases are not allowed: %s", candidateString);
     } else {
       for (CvrSource source : getRawConfig().cvrFileSources) {
         inUse =
-            stringAlreadyInUseElsewhereInSource(candidateString, source, "a candidate " + field);
+            stringAlreadyInUseElsewhereInSource(
+                candidateString, source, "a candidate name or alias");
         if (inUse) {
           break;
         }
@@ -611,13 +609,16 @@ private void validateCandidates() {

       // Ensure the candidate name and all aliases are unique, both within each candidate and
       // across candidates.
-      candidate.createStreamOfNameAndAllAliases().forEach(nameOrAlias -> {
-        if (candidateStringAlreadyInUseElsewhere(nameOrAlias, "name", candidateNameSet)) {
-          validationErrors.add(ValidationError.CANDIDATE_DUPLICATE_NAME);
-        } else {
-          candidateNameSet.add(nameOrAlias);
-        }
-      });
+      candidate
+          .createStreamOfNameAndAllAliases()
+          .forEach(
+              nameOrAlias -> {
+                if (candidateStringAlreadyInUseElsewhere(nameOrAlias, candidateNameSet)) {
+                  validationErrors.add(ValidationError.CANDIDATE_DUPLICATE_NAME);
+                } else {
+                  candidateNameSet.add(nameOrAlias);
+                }
+              });
     }

     if (getNumDeclaredCandidates() < 1) {
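The uniqueness rule enforced in the hunk above (every name and alias must be unique, both within one candidate and across candidates) can be illustrated with a running set. This is a simplified sketch with hypothetical names; it deliberately omits the additional check against reserved strings in each CVR source that `candidateStringAlreadyInUseElsewhere` performs.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
import java.util.stream.Stream;

// Hypothetical sketch: detect names/aliases that collide within or across candidates.
final class CandidateUniquenessSketch {

  // Input maps each candidate's display name to its list of aliases.
  // Returns every string that was encountered more than once.
  static Set<String> findDuplicates(Map<String, List<String>> nameToAliases) {
    Set<String> seen = new HashSet<>();
    Set<String> duplicates = new TreeSet<>();
    for (Map.Entry<String, List<String>> entry : nameToAliases.entrySet()) {
      // Mirror the diff: stream the aliases followed by the candidate name.
      Stream.concat(entry.getValue().stream(), Stream.of(entry.getKey()))
          .forEach(
              nameOrAlias -> {
                // Set.add returns false when the string was already present.
                if (!seen.add(nameOrAlias)) {
                  duplicates.add(nameOrAlias);
                }
              });
    }
    return duplicates;
  }

  public static void main(String[] args) {
    Map<String, List<String>> candidates =
        Map.of(
            "Duson, Jill C.", List.of("JCD"),
            "Smith, Jim", List.of("Jimmy", "Jim", "JCD"));
    System.out.println("duplicates: " + findDuplicates(candidates));
  }
}
```

Because names and aliases share one set, an alias that repeats the candidate's own name is flagged just like a collision between two different candidates.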
@@ -1063,19 +1064,15 @@ String getNameForCandidate(String nameOrAlias) {
     return candidateAliasesToNameMap.get(nameOrAlias);
   }

-  String getCodeForCandidate(String nameOrAlias) {
Contributor

In this comment, you mentioned possibly reverting / getting rid of some lines in TabulatorSession. Is that no longer feasible?

Collaborator Author

I pushed a commit showing the effect of that. Depends on what we want -- we can get rid of getNameForCandidate in ResultsWriter, which generates results that use the candidate code instead of the candidate name. See the diff here.

I'm not sure how the _expected.csvs are used, so I don't know which format is more useful, but it seems likely that it's more useful with canonical names (i.e. reverting #3f121b7, the next commit) and keeping the code as it was.

Let me know which you prefer.

Contributor

The _expected.csvs are just used for our tests.

Hard for me to answer this question... might be best for @tarheel or @chughes297 to weigh in? It appears that with the above commit you linked, it leaves our tests as-is, which is probably preferred?

Collaborator Author

Are the outputted CSVs never used by election administrators? If so, then I suppose it doesn't matter.

Contributor

The real output CSVs are definitely used by admins. @HEdingfield is just referring specifically to the _expected.csv files that are in the test directories.

Apologies if I missed a relevant part of the discussion -- I'm just looking at this comment thread -- but I think ideally the output CSV would have the canonical names in one column and then perhaps a separate column that lists any aliases that are used for each candidate (separated by a delimiter).

Collaborator Author

Great. I think that makes sense too. I reverted the last commit, and this PR should be ready to go now.

Contributor

Hm, let's move adding both canonical names and aliases to a separate task?

@chughes297 @tarheel do y'all think it's worth creating a new issue for this? If so, is it something we'd need to include in 1.4.0?

Contributor

Yes, I think it would be useful to list aliases in the output spreadsheet. I doubt it's a requirement for v1.4, but @chughes297 can comment.

Collaborator

Yes, let's create a separate issue, but it's not a huge priority for me to include in 1.4.0.

Contributor

Created #705 for this.

-    return candidateAliasesToCodeMap.get(nameOrAlias);
-  }
-
   ArrayList<String> getCandidatePermutation() {
     return candidatePermutation;
   }

-  void setCandidateExclusionStatus(String candidateCode, boolean excluded) {
+  void setCandidateExclusionStatus(String candidateName, boolean excluded) {
     if (excluded) {
-      excludedCandidates.add(candidateCode);
+      excludedCandidates.add(candidateName);
     } else {
-      excludedCandidates.remove(candidateCode);
+      excludedCandidates.remove(candidateName);
     }
   }

@@ -1085,7 +1082,6 @@ void setCandidateExclusionStatus(String candidateCode, boolean excluded) {
   // 3) add uwi candidate if needed
   private void processCandidateData() {
     candidateAliasesToNameMap = new HashMap<>();
-    candidateAliasesToCodeMap = new HashMap<>();
     candidateNames = new HashSet<>();

     if (rawConfig.candidates != null) {
@@ -1101,7 +1097,6 @@ private void processCandidateData() {
         aliases.forEach(nameOrAlias -> {
           // duplicate names and aliases get caught in validation
           candidateAliasesToNameMap.put(nameOrAlias, name);
-          candidateAliasesToCodeMap.put(nameOrAlias, candidate.getCode());
         });
       }
     }
@@ -1112,8 +1107,6 @@ private void processCandidateData() {
       candidateNames.add(Tabulator.UNDECLARED_WRITE_IN_OUTPUT_LABEL);
       candidateAliasesToNameMap.put(
           Tabulator.UNDECLARED_WRITE_IN_OUTPUT_LABEL, Tabulator.UNDECLARED_WRITE_IN_OUTPUT_LABEL);
-      candidateAliasesToCodeMap.put(
-          Tabulator.UNDECLARED_WRITE_IN_OUTPUT_LABEL, Tabulator.UNDECLARED_WRITE_IN_OUTPUT_LABEL);
     }
   }
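With the code map removed, lookups rely on a single alias-to-name map. The sketch below shows one way such a map could be built and queried. The class and method names are hypothetical, and it assumes (as the `nameOrAlias` parameter of `getNameForCandidate` suggests, but the diff does not fully show) that each display name also maps to itself.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of a single alias -> display-name lookup table.
final class AliasLookupSketch {

  // Build a map in which every alias, and the display name itself (assumption),
  // resolves to the candidate's display name.
  static Map<String, String> buildAliasToNameMap(Map<String, List<String>> nameToAliases) {
    Map<String, String> aliasToName = new HashMap<>();
    for (Map.Entry<String, List<String>> entry : nameToAliases.entrySet()) {
      String name = entry.getKey();
      aliasToName.put(name, name);
      for (String alias : entry.getValue()) {
        // Duplicates are assumed to have been rejected earlier, during validation.
        aliasToName.put(alias, name);
      }
    }
    return aliasToName;
  }

  public static void main(String[] args) {
    Map<String, String> lookup =
        buildAliasToNameMap(Map.of("Duson, Jill C.", List.of("JCD")));
    // A CVR containing either string resolves to the same display name.
    System.out.println(lookup.get("JCD"));
    System.out.println(lookup.get("Duson, Jill C."));
  }
}
```

A string that is neither a name nor an alias returns `null` from the map, so callers in a design like this need to treat `null` as "unrecognized candidate".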

@@ -1164,7 +1157,6 @@ enum ValidationError {
     CVR_UNDERVOTE_LABEL_UNEXPECTEDLY_DEFINED,
     CVR_CONTEST_ID_UNEXPECTEDLY_DEFINED,
     CANDIDATE_NAME_MISSING,
-    CANDIDATE_CODE_INVALID,
     CANDIDATE_DUPLICATE_NAME,
     CANDIDATE_NO_CANDIDATES_SPECIFIED,
     CANDIDATE_ALL_EXCLUDED,
Original file line number Diff line number Diff line change
@@ -181,7 +181,7 @@ public void runAdditionalValidations(List<CastVoteRecord> castVoteRecords)

   private void validateNamesAreInContest(List<CastVoteRecord> castVoteRecords)
       throws CastVoteRecord.CvrParseException {
-    // build a lookup map for candidates codes to optimize Cvr parsing
+    // build a lookup map to optimize CVR parsing
     Map<String, Set<String>> contestIdToCandidateNames = new HashMap<>();
     for (Candidate candidate : this.candidates) {
       Set<String> candidates;
33 changes: 10 additions & 23 deletions src/main/java/network/brightspots/rcv/RawContestConfig.java
@@ -217,11 +217,6 @@ public static class Candidate {
     private boolean excluded;
     private List<String> aliases = new ArrayList<String>();

-    // The code is a special alias which is used in the output files instead of
-    // the display name. Other than output displays, it is not handled specially:
-    // it is just another alias.
-    private String code;
HEdingfield marked this conversation as resolved.
-
     Candidate() {
     }

@@ -251,14 +246,6 @@ public void setAliases(List<String> aliases) {
       this.aliases = new ArrayList<>(aliases);
     }

-    public String getCode() {
-      return code;
-    }
-
-    public void setCode(String code) {
-      this.code = code;
-    }
-
     public boolean isExcluded() {
       return excluded;
     }
@@ -267,6 +254,15 @@ public void setExcluded(boolean excluded) {
       this.excluded = excluded;
     }

+
+    // This is deprecated and replaced by aliases, but we need to leave it in place
+    // here for the purpose of supporting automatic migration from older config versions.
+    private void setCode(String code) {
+      if (code != null && !code.isEmpty()) {
+        this.aliases.add(code);
+      }
+    }
+
     /**
      * A stream of all aliases (which is guaranteed to be unique) and the candidate name
      * (which is not guaranteed to be unique, i.e. it may exist in the list twice)
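The retained `setCode` setter folds a legacy "code" value into the alias list. A standalone sketch of that migration behavior follows. The class name is hypothetical, and how the real setter gets invoked (presumably by the JSON deserializer when an older config still contains a "code" field) is an assumption not shown in this diff.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of migrating a deprecated "code" field into aliases.
final class LegacyCodeMigrationSketch {
  private final List<String> aliases = new ArrayList<>();

  // Mirrors the setter in the diff: a non-empty legacy code becomes one more alias;
  // null or empty values are ignored.
  void setCode(String code) {
    if (code != null && !code.isEmpty()) {
      aliases.add(code);
    }
  }

  // Return a defensive copy so callers cannot mutate the internal list.
  List<String> getAliases() {
    return List.copyOf(aliases);
  }

  public static void main(String[] args) {
    LegacyCodeMigrationSketch candidate = new LegacyCodeMigrationSketch();
    candidate.setCode("JCD"); // as if read from an older config file
    candidate.setCode("");    // ignored
    candidate.setCode(null);  // ignored
    System.out.println(candidate.getAliases());
  }
}
```

Keeping the setter but dropping the field and getter means old configs still load, while nothing downstream can read a "code" any more.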
@@ -278,23 +274,17 @@ public Stream<String> createStreamOfNameAndAllAliases() {
       if (!isNullOrBlank(this.name)) {
         otherNames.add(this.name);
       }
-      if (!isNullOrBlank(this.code)) {
-        otherNames.add(this.code);
-      }

       return Stream.concat(this.aliases.stream(), otherNames.stream());
     }

     /**
-     * For display purposes, get a semicolon-separated list of aliases, including the code.
+     * For display purposes, get a semicolon-separated list of aliases.
      *
      * @return a potentially-empty string
      */
     public String getSemicolonSeparatedAliases() {
       Stream<String> s = this.aliases.stream();
-      if (this.code != null) {
-        s = Stream.concat(s, Stream.of(this.code));
-      }
       return String.join("; ", s.toList());
     }
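The review thread in this PR suggests that output CSVs could eventually carry the canonical name in one column and a delimiter-separated alias list in another (tracked separately as #705). The sketch below shows how the semicolon join used by `getSemicolonSeparatedAliases` could feed such a row. It is purely illustrative: the class, the `outputRow` helper, and the two-column layout are assumptions, not part of this PR.

```java
import java.util.List;

// Hypothetical sketch: canonical name plus a semicolon-separated alias column.
final class AliasColumnSketch {

  // Same delimiter as getSemicolonSeparatedAliases in the diff; empty list -> empty string.
  static String joinAliases(List<String> aliases) {
    return String.join("; ", aliases);
  }

  // Assumed two-column layout: [canonical name, joined aliases].
  static List<String> outputRow(String canonicalName, List<String> aliases) {
    return List.of(canonicalName, joinAliases(aliases));
  }

  public static void main(String[] args) {
    System.out.println(outputRow("Duson, Jill C.", List.of("JCD")));
    System.out.println(outputRow("Smith, Jim", List.of("Jimmy", "Jim")));
  }
}
```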

@@ -309,9 +299,6 @@ public void trimNameAndAllAliases() {
       if (name != null) {
         name = name.trim();
       }
-      if (code != null) {
-        code = code.trim();
-      }
       if (aliases != null) {
         aliases.replaceAll(s -> s.trim());
       }
6 changes: 3 additions & 3 deletions src/main/java/network/brightspots/rcv/ResultsWriter.java
@@ -441,8 +441,8 @@ private void addHeaderRows(CSVPrinter csvPrinter, String precinct) throws IOExce
       // make sure we list them in order of election
       Collections.sort(winningRounds);
       for (int round : winningRounds) {
-        for (String candidateCode : roundToWinningCandidates.get(round)) {
-          winners.add(config.getNameForCandidate(candidateCode));
+        for (String candidateName : roundToWinningCandidates.get(round)) {
+          winners.add(config.getNameForCandidate(candidateName));
         }
       }
       csvPrinter.printRecord("Winner(s)", String.join(", ", winners));
@@ -577,7 +577,7 @@ private void printRankings(String undeclaredWriteInLabel, Contest contest, CSVPr
           if (selection.equals(Tabulator.UNDECLARED_WRITE_IN_OUTPUT_LABEL)) {
             selection = undeclaredWriteInLabel;
           } else {
-            selection = config.getCodeForCandidate(selection);
+            selection = config.getNameForCandidate(selection);
           }
           csvPrinter.print(selection);
         } else {
3 changes: 1 addition & 2 deletions src/main/java/network/brightspots/rcv/TabulatorSession.java
@@ -15,7 +15,7 @@
  * Output results
  * Design: TabulatorSession also stores state metadata which exists outside tabulation results:
  * config object, resolved output, and logging paths, tabulation object, and CVR data including
- * precinct codes discovered while parsing CVR files.
+ * precinct IDs discovered while parsing CVR files.
  * Conditions: During tabulation, validation, and conversion.
  * Version history: see https://github.com/BrightSpots/rcv.
  */
@@ -29,7 +29,6 @@
 import java.text.SimpleDateFormat;
 import java.util.ArrayList;
 import java.util.Date;
-import java.util.HashSet;
 import java.util.LinkedList;
 import java.util.List;
 import java.util.Map;