Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

🗑 Remove delimiters in hashes #536

Open
lmarie79 opened this issue Feb 8, 2022 · 4 comments
Open

🗑 Remove delimiters in hashes #536

lmarie79 opened this issue Feb 8, 2022 · 4 comments
Labels
enhancement New feature or request

Comments

@lmarie79
Copy link
Collaborator

lmarie79 commented Feb 8, 2022

Feature Request

Description
@benaloh proposed to not use any delimiters between parameters of hashes. The idea would be simplify the hash itself. It also removes confusion on whether any delimiters are used between parameters of hashes.

Another option would be to be more clear of exactly what the delimiters are and how they are used in documentation.

Proposed Solution

This would involve removing the delimiters shown in these lines:
https://github.com/microsoft/electionguard-python/blob/e35b68dc3175243ffaf872e9a0ea56c5b8a05830/src/electionguard/hash.py#L73

https://github.com/microsoft/electionguard-python/blob/e35b68dc3175243ffaf872e9a0ea56c5b8a05830/src/electionguard/hash.py#L100

@lmarie79 lmarie79 added enhancement New feature or request triage Waiting to be triaged labels Feb 8, 2022
@keithrfung keithrfung changed the title Remove delimiters in hashes 🗑 Remove delimiters in hashes Feb 8, 2022
@keithrfung keithrfung added question Further information is requested and removed triage Waiting to be triaged labels Feb 8, 2022
@keithrfung
Copy link
Collaborator

This is marked as question because we are unsure whether to implement this despite it being simple fix.

@keithrfung keithrfung removed the question Further information is requested label Feb 9, 2022
@benaloh
Copy link

benaloh commented Feb 28, 2022

We need to be a bit careful here. We can and should remove delimiters wherever possible, but we must make certain that parameters are always of predetermined length. We have to avoid "no table" and "not able" looking the same.

For most of our hashes (especially, generation of challenges), we are hashing a fixed sequence of fixed-length arguments. Here we can just drop the delimiters and concatenate. For less constrained data (e.g., the ballot coding file), we either need delimiters or the file itself should specify the lengths of each argument.

@danwallach
Copy link
Collaborator

I'm not sure I like nuking the delimiters. Consider that we're currently converting everything to base16. That means we can easily construct an ElementModP for which a corresponding number of ElementModQ values, concatenated together, have the exact same hex digits. The delimiters prevent these from hashing identically.

If anything, I'm concerned that strings are passed through to the hash function without any escaping. That's only an issue if a string can be externally specified. The solution would be use the kind of string escaper that shows up when encoding for JSON or for CSV.

@benaloh
Copy link

benaloh commented Mar 1, 2022

The concern above is precisely why the arguments must be of known fixed-length. If the input to a hash is an ElementModP, then one can't swap it with a sequence of 16 ElementModQ values and have it be a legitimate input.

Having identical inputs of different forms is a concern even with delimiters. One could say that "A|B" with A and B as numbers could collide with a hex string when the delimiter "|" is written as its ASCII equivalent "7C". We must always check that the inputs are of proper form regardless, and once we do, we no longer need delimiters for known fixed-length inputs.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

4 participants