Pyspec SSZ; HTR caching of Vector. #1481

protolambda · 2019-11-15T16:34:19Z

Although I would still like to see pyspec-ssz replaced with py-ssz, or moved out and refactored further, it looks like py-ssz is not quite ready, or maybe not the right choice. Moving it out and refactoring also brings different problems: keeping it in sync, and readability / size (py-ssz pyrsistent looks great, but is also ~10x more code to go through as a reader).

The two big pain points for mainnet ssz tree-hashing are:

the large validator registry
the large vectors of data

In tests, however, the validator registry is not that big; it is really the vector data that is slow without caching: 8192 roots to merkleize together.

So this PR splits a vector into smaller vectors during hash-tree-root, and caches the results. On a modification, it removes the affected cache entry. Caching is only active for large enough vectors whose elements are of an immutable type.
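The idea can be sketched roughly as follows. This is a minimal illustration and not the code in this PR: `CachedRootVector`, `merkleize`, and the `sub_size` parameter are hypothetical names, the element type is assumed to be a 32-byte root (one chunk per element), and both the vector length and the sub-vector size are assumed to be powers of two so that no zero-padding is needed.

```python
from hashlib import sha256
from typing import List, Optional


def hash_pair(a: bytes, b: bytes) -> bytes:
    return sha256(a + b).digest()


def merkleize(chunks: List[bytes]) -> bytes:
    """Merkle root of a power-of-two number of 32-byte chunks."""
    layer = list(chunks)
    while len(layer) > 1:
        layer = [hash_pair(layer[i], layer[i + 1]) for i in range(0, len(layer), 2)]
    return layer[0]


class CachedRootVector:
    """Hypothetical fixed-length vector of 32-byte roots with per-sub-vector root caching."""

    def __init__(self, items: List[bytes], sub_size: int = 64):
        n = len(items)
        # Assumption: power-of-two sizes, so sub-vector roots line up with subtree roots.
        assert n > 0 and n & (n - 1) == 0
        assert 0 < sub_size <= n and sub_size & (sub_size - 1) == 0
        self.items = list(items)
        self.sub_size = sub_size
        self.cache: List[Optional[bytes]] = [None] * (n // sub_size)

    def __getitem__(self, i: int) -> bytes:
        return self.items[i]

    def __setitem__(self, i: int, value: bytes) -> None:
        self.items[i] = value
        # Invalidate only the cache entry of the sub-vector containing index i.
        self.cache[i // self.sub_size] = None

    def hash_tree_root(self) -> bytes:
        # Recompute only the sub-vector roots that were invalidated.
        for k, root in enumerate(self.cache):
            if root is None:
                start = k * self.sub_size
                self.cache[k] = merkleize(self.items[start:start + self.sub_size])
        # Combine the (mostly cached) sub-vector roots into the full root.
        return merkleize(self.cache)
```

With 8192 elements and a sub-vector size of 64, a single modification in this sketch re-hashes one 64-element sub-vector plus the 128 cached sub-roots, instead of all 8192 leaves.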

A quick bench shows a ~80x improvement for a Vector[Bytes32, 8192] when modifying elements in a rotation (like the historical vectors in the spec): https://gist.github.com/protolambda/4509db7f91d07b40a65ca3daf1e37685

I will write some tests and a bench of the BeaconState later.

Functionally, this does not change SSZ or the spec. And although it is not too pretty, it helps to make mainnet test generation more bearable.

Note: this is based on the branch of the other SSZ PR, which I would like to merge first, and then update the base.

protolambda · 2019-12-03T19:40:39Z

Update: inclined to specialize the pyspec-ssz implementation for merkle proofs and caching by doing something like I describe here and like in this POC, which would make this temporary caching hack unnecessary.

protolambda · 2019-12-28T21:43:22Z

Update: I'm experimenting with a new Python SSZ implementation built with binary trees as backings, thus caching every single hash in-place by default. See https://github.com/protolambda/pymerkles

So far it:

supports every SSZ type (except Union...)
binary tree backings (also partial trees!) work
initial phase of implementing deserialization (required for running tests)
compatible with spec, but needs more testing (see spec experiment file)
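To illustrate what a binary-tree backing with in-place hash caching could look like, here is a minimal sketch. It is not the pymerkles API: `Node`, `Leaf`, `Pair`, `build`, and `set_leaf` are hypothetical names, and the leaf count is assumed to be a power of two. Each pair node memoizes its root, and a modification rebuilds only the path to the changed leaf while sharing all untouched subtrees, including their cached hashes.

```python
from hashlib import sha256
from typing import List, Optional


class Node:
    def root(self) -> bytes:
        raise NotImplementedError


class Leaf(Node):
    def __init__(self, value: bytes):
        self.value = value

    def root(self) -> bytes:
        return self.value


class Pair(Node):
    def __init__(self, left: Node, right: Node):
        self.left = left
        self.right = right
        self._root: Optional[bytes] = None  # computed lazily, then cached

    def root(self) -> bytes:
        if self._root is None:
            self._root = sha256(self.left.root() + self.right.root()).digest()
        return self._root


def build(leaves: List[bytes]) -> Node:
    """Build a tree over a power-of-two number of 32-byte leaves."""
    nodes: List[Node] = [Leaf(x) for x in leaves]
    while len(nodes) > 1:
        nodes = [Pair(nodes[i], nodes[i + 1]) for i in range(0, len(nodes), 2)]
    return nodes[0]


def set_leaf(node: Node, depth: int, index: int, value: bytes) -> Node:
    """Return a new tree with one leaf replaced; untouched subtrees are shared."""
    if depth == 0:
        return Leaf(value)
    # The bit at position depth-1 of the index selects left (0) or right (1).
    if (index >> (depth - 1)) & 1 == 0:
        return Pair(set_leaf(node.left, depth - 1, index, value), node.right)
    return Pair(node.left, set_leaf(node.right, depth - 1, index, value))
```

In this sketch, updating one leaf in a tree of depth d creates d new pair nodes, so recomputing the root afterwards costs d hashes; everything else is read from the cached roots of the shared subtrees.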

When I complete the serialize/deserialize part, and when it passes the pyspec tests, we can swap out the pyspec implementation and avoid caching hacks like the one in this PR.

protolambda · 2020-01-02T22:36:25Z

Closing this in favor of #1552 and future iterations of that.

quick 80 times ssz vector htr speedup; caching vectors

625f35c

protolambda added the scope:SSZ Simple Serialize label Nov 15, 2019

protolambda closed this Jan 2, 2020

protolambda deleted the vector-caching branch February 9, 2020 00:01
