Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Aliases as a replacement for candidate codes #639

Merged
merged 10 commits into from
May 8, 2023

Conversation

artoonie
Copy link
Collaborator

@artoonie artoonie commented Mar 15, 2023

Implementation Details

An alias is just a general case of the old concept of a "Code". Candidates now have three fields: name (aka display name, or canonical name), code (only used in the output CSVs - see Question below), and aliases (generalized codes).

In the tabulation logic, all names encountered are translated from whatever form they're in to the canonical name. All logic is now done on the canonical name, not on the code or any alias.

I have designed this so that the concept of code can be removed entirely in time, but have not done so yet.

Question: removing the concept of "Code" entirely

The current logic on develop is for output CSVs to print the Candidate code rather than the name. In all other cases, code is treated exactly like an alias.

To ensure 1-to-1 consistency with develop, I have opted to leave both "Code" and "Aliases" internally, even though the UI now only shows "Aliases." Removing the concept of "Code" will be trivial, and would clean up the code, but would result in test files differing. I'm not sure if there are other consequences.

Question: Do you agree with the current implementation (keeping "Code" internally so there are no changes to any output files), or do you prefer a cleaner implementation (done by removing "Code" and updating all tests that used "Code").
Answer: In a separate pull request, we'll get rid of code

Screenshot: an old Dominion config.

Notice the "Code" is successfully shown as an "Alias"
image

Screenshot: multiple aliases

Semicolon-separated, which will be easy to translate back and forth into a list when the Table becomes editable.
image

Test details (on both JSON and XLSX tests)

  1. Ensures the non-alias name is the name used in the JSON summary output
  2. Ensures that both aliases and names are treated identically otherwise
  3. Ensures unused aliases don't cause issues
  4. Ensures unused names don't cause issues (i.e. the "Code" use case, where only the alias is used in the CVR and not the name)

@artoonie artoonie linked an issue Mar 15, 2023 that may be closed by this pull request
Still TODO:
- UI (test, remove Candidate Code)
- Serialization (test)
- Decide whether we need to keep Candidate Codes for the output files or not
@artoonie artoonie removed the WIP label Mar 16, 2023
@artoonie artoonie mentioned this pull request Mar 16, 2023
@artoonie artoonie force-pushed the feature/issue-591_aliases branch 3 times, most recently from 92edaee to cb1080d Compare March 16, 2023 03:19
@HEdingfield
Copy link
Contributor

Question: Do you agree with the current implementation (keeping "Code" internally so there are no changes to any output files), or do you prefer a cleaner implementation (done by removing "Code" and updating all tests that used "Code").

I personally don't see a big benefit to keeping the "Code" field around at all. We're guaranteed that names will be unique, so why not just display that in the output? Let's confirm with @rrojas350 though.

Copy link
Contributor

@HEdingfield HEdingfield left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Haven't reviewed everything yet, but here are some comments to get your started!

src/main/java/network/brightspots/rcv/ContestConfig.java Outdated Show resolved Hide resolved
src/main/java/network/brightspots/rcv/ContestConfig.java Outdated Show resolved Hide resolved
@@ -315,23 +315,25 @@ private int parseCvrFile(
}
Integer candidateId = (Integer) rankingMap.get("CandidateId");
String candidateCode = candidateId.toString();
Set<String> candidates = contestIdToCandidateCodes.get(contestId);
if (!candidates.contains(candidateCode)) {
String candidateName = config.getNameFor(candidateCode);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you still calling this candidateCode on purpose?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, Dominion uses a code which we translate here to the display name. In other files I use candidateName not candidateDisplayName, but I'm happy to rename the variable here to clarify (or add a comment).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about we just call this candidateId or dominionCandidateId then, since that's what it's apparently called in the Dominion format? Just trying to clear the way for #663 so there's no confusion around this there.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if you missed this comment, but I think it'd be nice to take care of this still.

Copy link
Collaborator Author

@artoonie artoonie May 5, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like your comment didn't submit before :) both your comments show as appearing 5 days ago. Yes, I can rename this variable dominionCandidateId

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh weird, sorry about that. Looks like we got our wires crossed a bit on this, so I just added a commit to clear it up. Please give it a look over to verify that it makes sense.

@HEdingfield
Copy link
Contributor

artoonie force-pushed the feature/issue-591_aliases branch from cb1080d to f0cec41
3 days ago
@artoonie
Make CastVoteRecord use 50% less memory

Not sure why this was done, as that code hasn't been reviewed yet. Could you please keep it out of this branch?

@artoonie
Copy link
Collaborator Author

artoonie commented Mar 20, 2023

Ugh sorry, let me do some git cleanup. I'm using the github desktop client for the first time and must have made some mistake.

Update: Fixed. Sincere apologies. Will avoid the GitHub desktop client for now since it seems to do a lot behind the scenes that I don’t understand.

@artoonie artoonie changed the title WIP: Aliases instead of (in addition to) candidate codes Aliases instead of (in addition to) candidate codes Mar 20, 2023
@ggilber1
Copy link

ggilber1 commented Mar 20, 2023

Can someone explain to me in English (and without acronyms) what is being done here? Thanks.
I'll assign this to @rrojas350 since you are probably at at least an intermediate level of technical comprehension.

@artoonie
Copy link
Collaborator Author

Can someone explain to me in English (and without acronyms) what is being done here? Thanks. I'll assign this to @rrojas350 since you are probably at at least an intermediate level of technical comprehension.

Check out the Issue linked to this pull request here: #591
In sum: there are often cases where a candidate has multiple names across different CVRs -- for example, write-ins, or multi-vendor tabulations. This provides support for those cases: a candidate can now have aliases, and we will look for both their name and their aliases in CVRs.

@ggilber1
Copy link

ggilber1 commented Mar 20, 2023 via email

@artoonie
Copy link
Collaborator Author

@ggilber1 It now treats Candidate Codes and Aliases in the same manner. An election administrator using ballots with candidate codes would enter the Code in the new textbox. An election administrator using aliases would enter the aliases in that same textbox. They can also mix and match codes and aliases if they have a multi-vendor tabulation that needs both codes and aliases.

This probably does not affect the auto-populating of candidate names in any way, though I haven't given that enough thought to be confident about it.

@rrojas350
Copy link

@HEdingfield If I'm understanding correctly, I'm ok getting rid of candidate code in the output.

Judging from @artoonie's reply to @ggilber1, it sounds like candidate codes on the front end shouldn't cause any issues when trying to tabulate Hart and Dominion CVRs, which is the only concern I have regarding candidate codes.

@ggilber1
Copy link

ggilber1 commented Mar 20, 2023 via email

@andyanderson
Copy link

Also need to understand how this issue relates to the issue of auto-populating candidate names in the config tile.

Currently the software ignores CVRs for candidates who aren’t pre-populated in the software, i.e. write-ins. Auto-populating them eliminates the need for an administrator to enter them by hand. Then all they need to do is correlate them in the alias table if there are multiple variants (likely).

src/test/java/network/brightspots/rcv/TabulatorTests.java Outdated Show resolved Hide resolved
src/main/java/network/brightspots/rcv/ContestConfig.java Outdated Show resolved Hide resolved
*
* @return a potentially-empty string
*/
public String getSemicolonSeparatedAliases() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The way I kind of had it working before was double-clicking the cell would allow you to edit it and type right in there, then you had to press enter to commit it. Definitely open to other designs though. It'd probably actually be really cool if we had a modal pop up with the full text area to edit, but that would take some work.

Maybe we can go with a happy medium, like a pipe | since that's at least typeable on a standard keyboard and my gut tells me it's less likely to appear in an alias than a semicolon??

Alternatively, a no-semicolon check would work too. I'm happy either way. @chughes297 might have useful thoughts on this.

@artoonie artoonie changed the title Aliases instead of (in addition to) candidate codes Aliases as a replacement for candidate codes Apr 29, 2023
Copy link
Contributor

@HEdingfield HEdingfield left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very close now! I think we'll all breathe a sigh of relief once this one is submitted.

src/test/java/network/brightspots/rcv/TabulatorTests.java Outdated Show resolved Hide resolved
src/test/java/network/brightspots/rcv/TabulatorTests.java Outdated Show resolved Hide resolved
src/test/java/network/brightspots/rcv/TabulatorTests.java Outdated Show resolved Hide resolved
src/main/java/network/brightspots/rcv/ContestConfig.java Outdated Show resolved Hide resolved
*
* @return a potentially-empty string
*/
public String getSemicolonSeparatedAliases() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good. I do like the semicolons from a UX perspective, because non-technical people are used to seeing that.

I propose this: we leave the semicolons as delimiters in the column, but we plan to do a pop-up modal with a text area for editing. Then, maybe in the backend, it always makes sure to submit the newline version for saving, even though it's displaying the semicolons to users in the column. Then we can just split on newlines when saving it.

Does that make sense? I believe this could open up a path for us to allow for semicolons in aliases.

Noting this discussion in #590.

@artoonie
Copy link
Collaborator Author

All comments resolved. Thanks for catching the accidentally-updated test -- perhaps we should stop the UI from forcing you to re-save a config if the version is "TEST"? It's a pain to constantly revert those. I can make a ticket for that if you think it'd help y'all as well.

@@ -315,23 +315,25 @@ private int parseCvrFile(
}
Integer candidateId = (Integer) rankingMap.get("CandidateId");
String candidateCode = candidateId.toString();
Set<String> candidates = contestIdToCandidateCodes.get(contestId);
if (!candidates.contains(candidateCode)) {
String candidateName = config.getNameFor(candidateCode);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if you missed this comment, but I think it'd be nice to take care of this still.

*
* @return a potentially-empty string
*/
public String getSemicolonSeparatedAliases() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, that sounds perfect. Does that require code changes to this PR, or is it purely something you'd want to handle in future work around this?

@HEdingfield
Copy link
Contributor

Happy to add a commit that takes care of this last bit if you'd like (also happy to wait if you'd rather do it yourself), just let me know.

Copy link
Contributor

@tarheel tarheel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for taking forever to comment on this. It makes sense to me! And I definitely agree that we should dump candidate codes.


/**
* A stream of all aliases (which is guaranteed to be unique) and the candidate name
* (which is not guaranteed to be unique, i.e. it may exist in the list twice)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why might the name be present twice?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's valid for a candidate name to be equal to a candidate code, and therefore for a candidate name to be equal to an alias.

Comment on lines +1008 to +1010
if (candidateNames == null) {
candidateNames = new HashSet<>();
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: why not just initialize this when you declare it?

Copy link
Collaborator Author

@artoonie artoonie May 5, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. Nothing else is initialized when declared
  2. This was a request from @HEdingfield to add this:

#639 (comment)
#639 (comment)

(Looks like the direct links may not work...but, see comments above on ContestConfig.java

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think either way would probably work fine. Also agree with @artoonie that we should make a pass on all this at some point to make it consistent.

Comment on lines +1039 to +1041
candidateAliasesToNameMap = new HashMap<>();
candidateAliasesToCodeMap = new HashMap<>();
candidateNames = new HashSet<>();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

similarly, why not just initialize these when you declare them?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just following the existing model, a la validateCandidates. I don't mind changing these to init when declared, but perhaps we should have a larger movement to do that consistently across this and other files?

*
* @return a potentially-empty string
*/
public String getSemicolonSeparatedAliases() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're saying that it'll just use standard JSON list format, right?

Copy link
Contributor

@HEdingfield HEdingfield left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I think this is all set! I added a small commit that fixes a couple small bugs and cleans stuff up more, so will let you go ahead and look that over before you merge it. Great work on this!

@@ -315,23 +315,25 @@ private int parseCvrFile(
}
Integer candidateId = (Integer) rankingMap.get("CandidateId");
String candidateCode = candidateId.toString();
Set<String> candidates = contestIdToCandidateCodes.get(contestId);
if (!candidates.contains(candidateCode)) {
String candidateName = config.getNameFor(candidateCode);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh weird, sorry about that. Looks like we got our wires crossed a bit on this, so I just added a commit to clear it up. Please give it a look over to verify that it makes sense.

Comment on lines +1008 to +1010
if (candidateNames == null) {
candidateNames = new HashSet<>();
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think either way would probably work fine. Also agree with @artoonie that we should make a pass on all this at some point to make it consistent.

@artoonie artoonie merged commit b1367a2 into develop May 8, 2023
@artoonie artoonie deleted the feature/issue-591_aliases branch May 8, 2023 17:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Alias for Candidates
6 participants