
Review metadata validation hashing #937

Closed
jsdw opened this issue Apr 27, 2023 · 1 comment


jsdw commented Apr 27, 2023

We have two non-allocating hash functions: xor and concat_and_hash. xor just XORs two hashes together. concat_and_hash concatenates the two 32-byte hashes into one 64-byte value and then hashes that to get a 32-byte hash out again.

  • xor is faster, but XORing A, B, C together gives the same result as C, B, A or A, C, B etc; ie the result is the same irrespective of the order of the things we're XORing.
  • If you xor two identical hashes you end up with 0's everywhere, which can obscure a lot of mismatches (ie xor(A,A) == xor(B,B) == xor(C,C), so all of them would appear identical when they are in fact things we probably want to have different hashes).
  • concat_and_hash is slower (it actually hashes), but concat_and_hash(A,B) gives different output from concat_and_hash(B,A); ie the order is preserved.
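To make the distinction concrete, here's a minimal sketch of the two combinators. The real implementations work over a proper 32-byte hash; std's DefaultHasher (64-bit output, repeated to fill 32 bytes) is just a stand-in here:

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

type Hash32 = [u8; 32];

// XOR two 32-byte hashes together: fast, but commutative.
fn xor(a: Hash32, b: Hash32) -> Hash32 {
    let mut out = [0u8; 32];
    for i in 0..32 {
        out[i] = a[i] ^ b[i];
    }
    out
}

// Concatenate two 32-byte hashes and hash the 64 bytes back down to 32.
// Stand-in hasher only; the point is that the inputs' order matters.
fn concat_and_hash(a: Hash32, b: Hash32) -> Hash32 {
    let mut hasher = DefaultHasher::new();
    a.hash(&mut hasher);
    b.hash(&mut hasher);
    let h = hasher.finish().to_le_bytes();
    let mut out = [0u8; 32];
    for (i, byte) in out.iter_mut().enumerate() {
        *byte = h[i % 8];
    }
    out
}

fn main() {
    let a = [1u8; 32];
    let b = [2u8; 32];

    // XOR is commutative: ordering information is lost.
    assert_eq!(xor(a, b), xor(b, a));
    // XORing identical hashes collapses to all zeroes.
    assert_eq!(xor(a, a), [0u8; 32]);
    assert_eq!(xor(a, a), xor(b, b));
    // concat_and_hash preserves order.
    assert_ne!(concat_and_hash(a, b), concat_and_hash(b, a));
}
```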

We use xor fairly liberally. We also allocate in a few places (ie allocate a Vec, append some stuff to it, sort it, and hash that).

We should:

  1. Ensure that we use concat_and_hash everywhere order etc matters, and that we aren't over-using xor.
  2. See whether we can get rid of the allocations; can we just XOR eg pallet hashes together rather than do any sorting based on pallet names? Things like hashing the pallet name into the per-pallet hashes will help ensure they are unique.
  3. Think about validation in terms of DecodeAsType and EncodeAsType, ie if field names in some struct change places, that's mostly OK now for instance (this is an optimisation though; we can be stricter too if we want)
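Point 2 above can be sketched roughly as follows. The names hash_pallet and combine are hypothetical (they're not the actual subxt API), and DefaultHasher again stands in for the real 32-byte hash; the idea is that once each pallet's name is mixed into its own hash, the per-pallet hashes can be folded together with XOR in any order, so no Vec allocation or sorting by pallet name is needed:

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

type Hash32 = [u8; 32];

// Hypothetical per-pallet hash: mixing the pallet name into its hash helps
// ensure that two distinct pallets never produce the same hash.
fn hash_pallet(name: &str, body_hash: Hash32) -> Hash32 {
    let mut hasher = DefaultHasher::new();
    name.hash(&mut hasher);
    body_hash.hash(&mut hasher);
    let h = hasher.finish().to_le_bytes();
    let mut out = [0u8; 32];
    for (i, byte) in out.iter_mut().enumerate() {
        *byte = h[i % 8];
    }
    out
}

// Fold per-pallet hashes together with XOR: no Vec, no sorting, and the
// result is the same whatever order the pallets are visited in.
fn combine(pallets: impl Iterator<Item = Hash32>) -> Hash32 {
    pallets.fold([0u8; 32], |mut acc, h| {
        for i in 0..32 {
            acc[i] ^= h[i];
        }
        acc
    })
}

fn main() {
    let balances = hash_pallet("Balances", [1u8; 32]);
    let system = hash_pallet("System", [2u8; 32]);

    // Visiting pallets in either order yields the same combined hash.
    let ab = combine([balances, system].into_iter());
    let ba = combine([system, balances].into_iter());
    assert_eq!(ab, ba);
}
```

XOR's order-insensitivity, a liability when hashing ordered fields, is exactly what makes it suitable here: pallet order carries no meaning, so losing it is free.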

Ultimately we want validation to be as fast as possible (so that people have as few reasons as possible to opt out) but also to protect as well as possible against things that DecodeAsType and EncodeAsType would consider different.


jsdw commented May 17, 2023

Closed by #959

jsdw closed this as completed May 17, 2023