Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Full support for tuples #665

Closed
JustinDrake opened this issue Feb 20, 2019 · 9 comments
Closed

Full support for tuples #665

JustinDrake opened this issue Feb 20, 2019 · 9 comments
Labels
scope:SSZ Simple Serialize

Comments

@JustinDrake
Copy link
Collaborator

SSZ has full support for lists, but only partial support for arrays (specifically, it can support arrays of bytes with bytesN). I'd argue we want full support for fixed-sized arrays. The reason is that fixed-sized arrays appear in various places where we do not want to mix in the array length. One example is when computing the shard_data_root, and another one appeared here. Other examples include latest_randao_mixes, latest_block_roots, latest_active_index_roots, latest_slashed_balances which are all fixed size, and where including the data length is a nuisance.

Ideally we want to use hash_tree_root over merkle_root pretty much everywhere, and support for fixed-sized arrays will allow this.

(Side note regarding Go: In go we have "arrays" with the [n] notation for a fixed n. For example [32]byte would be the equivalent of Bytes32. We also have "slices" with the [] which have dynamic size. My suggestion is to define SSZ for slices and arrays.)

@JustinDrake JustinDrake added the scope:SSZ Simple Serialize label Feb 20, 2019
@djrtwo
Copy link
Contributor

djrtwo commented Feb 21, 2019

I would lean toward favoring pythonic notation because the rest of the spec uses it.
"slices" in the go nomenclature are just dynamically sized arrays?

Why is including the data length on these arrays a "nuisance"? Is there any loss other than a tiny increase in size of the hashed item?

@JustinDrake
Copy link
Collaborator Author

JustinDrake commented Feb 21, 2019

Why is including the data length on these arrays a "nuisance"?

I think the idea to use SSZ everywhere (as opposed to the lower-level merkle_root) was dismissed some time ago (on our Telegram, or maybe our weekly call) because:

  • Having the length complicates the proof-of-custody challenges
  • Having the length complicates STARK-based erasure coding

In addition to the above, I would say that we want SSZ to be maximally friendly to merkleisation with the hope of SSZ becoming a standard. It's also slightly more efficient and friendly for the fixed-size components of state to be exposed as pure Merkle roots.

I would lean toward favoring pythonic notation

Right, we should use whatever notation is most Pythonic! :)

"slices" in the go nomenclature are just dynamically sized arrays?

Basically. I communicated the idea in Go because I'm not sure what the Python equivalent is.

@JustinDrake
Copy link
Collaborator Author

I'm not sure what the Python equivalent is.

I think in Python a fixed-size list is called a "tuple".

@JustinDrake JustinDrake changed the title Full support for fixed-sized arrays Full support for tuples Feb 24, 2019
@JustinDrake JustinDrake mentioned this issue Feb 24, 2019
@hwwhww
Copy link
Contributor

hwwhww commented Feb 25, 2019

latest_randao_mixes, latest_block_roots, latest_active_index_roots, latest_slashed_balances which are all fixed size

Off this topic

  1. Since we use the fixed size arrays for recording recent data. It seems we assume that "State list lengths" are un-changeable constants in the future forks?
  2. (feature creep alert!) Possible requirements: if we adjust SLOTS_PER_EPOCH to 4, LATEST_RANDAO_MIXES_LENGTH is changed from ~36 days to ~24 days... and somehow people want to increase LATEST_RANDAO_MIXES_LENGTH for some reason...
  3. If it's fixed forever, should we re-check if these lengths are appropriate? I remember that we changed the unit of latest_randao_mixes from slot to epoch, but LATEST_RANDAO_MIXES_LENGTH number is the same. Is 2**13 an arbitrary number?

@djrtwo
Copy link
Contributor

djrtwo commented Feb 26, 2019

I vote for the following notation:

  • "type[]" -- is a variable length list of type type
  • "type[N]" -- is a fixed length list of length N and of type type

@JustinDrake
Copy link
Collaborator Author

It seems we assume that "State list lengths" are un-changeable constants in the future forks?

In theory a fork could change the state list lengths.

2. (feature creep alert!)

Not sure I understand this 😂

If it's fixed forever, should we re-check if these lengths are appropriate?

I do intend to fine-tune all the constant towards the end. But as mentioned above, the sizes are not fixed forever.

I vote for the following notation

Agreed!

@hwwhww
Copy link
Contributor

hwwhww commented Feb 27, 2019

Not sure I understand this 😂

I was saying I was afraid that I'm putting more unnecessary requirements and make it a more complicated design in the early stage. 😆

In theory a fork could change the state list lengths.

It seems we will need extra code to handle length extension. And the "suture" between the forks is less clean.

For example

state.latest_block_roots[slot % LATEST_BLOCK_ROOTS_LENGTH]


LATEST_BLOCK_ROOTS_LENGTH: 8
after processed block 15
when processed block 16
[root_8, root_11, ...., root_14, root_15]

get_block_root(state, slot) can get block root between [state.slot - LATEST_BLOCK_ROOTS_LENGTH, state.slot) = [15-8, 15)
get_block_root(state, 11) = state.latest_block_roots[slot % LATEST_BLOCK_ROOTS_LENGTH] = root_11


####
Increase `LATEST_BLOCK_ROOTS_LENGTH` from 8 to 16

LATEST_BLOCK_ROOTS_LENGTH: 16
after processed block 15
[root_9, root_11, ...., root_14, ZERO_HASH, .... ZERO_HASH]
get_block_root(state, slot) can get block root between [state.slot - LATEST_BLOCK_ROOTS_LENGTH, state.slot) = [15-16, 15) <--- oh no

We would need to either (i) fill the missing empty spots in the new array in the fork slot or (ii) prevent the empty spots from being accessed before they are filled in the future slots.

@JustinDrake
Copy link
Collaborator Author

fill the missing empty spots in the new array in the fork slot

That seems reasonable.

JustinDrake added a commit that referenced this issue Feb 27, 2019
* Implement tuples and a chunk-size reduction to 32 bytes (see #665 and #679)
* Simplify presentation where appropriate. For example, deserialisation is implicit from serialisation (similar to `bls_sign` being implicit from `bls_verify`) and is left as an implementation exercise.
* Dramatically reduce spec size and hopefully improve readability.
@JustinDrake
Copy link
Collaborator Author

Implemented in 57971aa (see also #723)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
scope:SSZ Simple Serialize
Projects
None yet
Development

No branches or pull requests

3 participants