ssz: byte type and canonical JSON mapping #3506

merged 4 commits into from Jan 11, 2024
Changes from 2 commits
2 changes: 1 addition & 1 deletion specs/altair/beacon-chain.md
@@ -71,7 +71,7 @@ Altair is the first beacon chain hard fork. Its main features are:

| Name | SSZ equivalent | Description |
| - | - | - |
- | `ParticipationFlags` | `uint8` | a succinct representation of 8 boolean participation flags |
+ | `ParticipationFlags` | `byte` | a succinct representation of 8 boolean participation flags |

## Constants

42 changes: 41 additions & 1 deletion ssz/simple-serialize.md
@@ -10,6 +10,7 @@
- [Basic types](#basic-types)
- [Composite types](#composite-types)
- [Variable-size and fixed-size](#variable-size-and-fixed-size)
- [Byte](#byte)
- [Aliases](#aliases)
- [Default values](#default-values)
- [`is_zero`](#is_zero)
@@ -25,6 +26,7 @@
- [Merkleization](#merkleization)
- [Summaries and expansions](#summaries-and-expansions)
- [Implementations](#implementations)
- [JSON mapping](#json-mapping)

<!-- END doctoc generated TOC please keep comment here to allow auto update -->
<!-- /TOC -->
@@ -41,6 +43,7 @@
### Basic types

* `uintN`: `N`-bit unsigned integer (where `N in [8, 16, 32, 64, 128, 256]`)
* `byte`: 8-bit opaque data container, equivalent in serialization and hashing to `uint8`
* `boolean`: `True` or `False`

### Composite types
@@ -69,15 +72,20 @@

We recursively define "variable-size" types to be lists, unions, `Bitlist` and all types that contain a variable-size type. All other types are said to be "fixed-size".

### Byte

Although the SSZ serialization of `byte` is equivalent to that of `uint8`, the former is used for opaque data while the latter is intended as a number.
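
The distinction can be illustrated with a minimal sketch. The helper names `serialize_uint8` and `serialize_byte` are hypothetical and not part of the spec; the point is only that both produce the same single byte on the wire, while the caller's intent (number versus opaque data) differs.

```python
def serialize_uint8(value: int) -> bytes:
    """Serialize a number in [0, 255] as one SSZ byte (hypothetical helper)."""
    if not 0 <= value <= 0xFF:
        raise ValueError("uint8 out of range")
    return value.to_bytes(1, "little")


def serialize_byte(value: bytes) -> bytes:
    """Serialize one opaque byte; the wire form matches uint8 (hypothetical helper)."""
    if len(value) != 1:
        raise ValueError("byte must be exactly one octet")
    return bytes(value)


# The same 8 bits, once treated as a number and once as opaque flag data.
flags = 0b0000_0101
assert serialize_uint8(flags) == serialize_byte(bytes([flags]))
```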

### Aliases

For convenience we alias:

* `bit` to `boolean`
- * `byte` to `uint8` (this is a basic type)
* `BytesN` and `ByteVector[N]` to `Vector[byte, N]` (this is *not* a basic type)
* `ByteList[N]` to `List[byte, N]`

Aliases are semantically equivalent to their underlying type and therefore share canonical representations both in SSZ and in related formats.

### Default values
Assuming a helper function `default(type)` which returns the default value for `type`, we can recursively define the default value for all types.

@@ -256,3 +264,35 @@ We similarly define "summary types" and "expansion types". For example, [`Beacon
## Implementations

See https://github.com/ethereum/eth2.0-specs/issues/2138 for a list of current known implementations.

## JSON mapping

The canonical JSON mapping assigns to each SSZ type a corresponding JSON encoding, enabling an SSZ schema to also define the JSON encoding.

When decoding JSON data, all fields in the SSZ schema must be present with a value. Parsers may ignore additional JSON fields.
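
A minimal sketch of this decoding rule, assuming a hypothetical schema shape (a mapping from field name to a parser function) and hypothetical names `decode_container`, `parse_uint`, `parse_boolean`, and `CHECKPOINT_SCHEMA`. It also assumes that a JSON `null` counts as a missing value, which the text above does not state explicitly. Range checks for a specific `uintN` width are left out for brevity.

```python
import json
import re
from typing import Any, Callable, Dict

# Hypothetical schema shape: field name -> parser for that field's JSON value.
Schema = Dict[str, Callable[[Any], Any]]


def parse_uint(value: Any) -> int:
    """Parse a `uintN` from its JSON form, a decimal string."""
    if not isinstance(value, str) or not re.fullmatch(r"[0-9]+", value):
        raise ValueError(f"expected a decimal string, got {value!r}")
    return int(value)


def parse_boolean(value: Any) -> bool:
    """Parse a `boolean` from a JSON bool."""
    if not isinstance(value, bool):
        raise ValueError(f"expected a JSON bool, got {value!r}")
    return value


def decode_container(text: str, schema: Schema) -> Dict[str, Any]:
    obj = json.loads(text)
    if not isinstance(obj, dict):
        raise ValueError("a Container must be a JSON object")
    # Assumption: a JSON null is rejected in the same way as a missing key.
    missing = [name for name in schema if obj.get(name) is None]
    if missing:
        raise ValueError(f"missing fields: {missing}")
    # Only schema fields are read, so additional JSON fields are ignored.
    return {name: parse(obj[name]) for name, parse in schema.items()}


CHECKPOINT_SCHEMA: Schema = {"epoch": parse_uint, "finalized": parse_boolean}
decoded = decode_container(
    '{"epoch": "12", "finalized": false, "note": "ignored"}', CHECKPOINT_SCHEMA
)
```

Here `decoded` contains only `epoch` and `finalized`; the extra `note` field is dropped, while omitting `epoch` would raise `ValueError`.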

| SSZ | JSON | Example |
| --- | --- | --- |
| `uintN` | string | `"0"` |
| `byte` | hex-byte-string | `"0x00"` |
| `boolean` | bool | `false` |
| `Container` | object | `{ "field": ... }` |
| `Vector[type, N]` | array | `[element, ...]` |
| `Vector[byte, N]` | hex-byte-string | `"0x1122"` |
| `Bitvector[N]` | hex-byte-string | `"0x1122"` |
| `List[type, N]` | array | `[element, ...]` |
| `List[byte, N]` | hex-byte-string | `"0x1122"` |
| `Bitlist[N]` | hex-byte-string | `"0x1122"` |
| `Union[type_0, type_1, ...]` | selector-object | `{ "selector": number, "data": type_N }` |

Integers are encoded as strings to avoid loss of precision in 64-bit values.
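
The precision issue can be reproduced with a short sketch. It assumes that `parse_int=float` in Python's `json.loads` is a fair stand-in for a parser that stores every JSON number as an IEEE 754 double, as JavaScript does; the field name `balance` is made up for illustration.

```python
import json

MAX_UINT64 = 2**64 - 1

# Assumption: parse_int=float emulates a parser that keeps numbers as doubles.
as_number = json.loads(f'{{"balance": {MAX_UINT64}}}', parse_int=float)
# The canonical mapping carries the same value as a decimal string instead.
as_string = json.loads(f'{{"balance": "{MAX_UINT64}"}}')

lossy = int(as_number["balance"])
exact = int(as_string["balance"])

assert lossy != MAX_UINT64  # the double rounded the value
assert exact == MAX_UINT64  # the string round-trips exactly
```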

Aliases are encoded as their underlying type.

`hex-byte-string` is a `0x`-prefixed hex encoding of byte data, as it would appear in an SSZ stream.

`List` and `Vector` of `byte` (and aliases thereof) are encoded as `hex-byte-string`. `Bitlist` and `Bitvector` similarly map their SSZ-byte encodings to a `hex-byte-string`.
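
As a sketch of how these values could be produced, the hypothetical helpers below (`to_hex_byte_string`, `bitvector_bytes`, `bitlist_bytes`) first build the SSZ byte encoding and then hex-encode it. The bit packing follows the SSZ serialization rules defined elsewhere in this spec: bits are packed little-endian within each byte, and a `Bitlist` carries one extra delimiter bit marking its length.

```python
from typing import Sequence


def to_hex_byte_string(data: bytes) -> str:
    """0x-prefixed hex of the bytes as they would appear in an SSZ stream."""
    return "0x" + data.hex()


def bitvector_bytes(bits: Sequence[bool]) -> bytes:
    """SSZ bytes of a Bitvector: bit i goes to byte i // 8, position i % 8."""
    out = bytearray((len(bits) + 7) // 8)
    for i, bit in enumerate(bits):
        out[i // 8] |= int(bit) << (i % 8)
    return bytes(out)


def bitlist_bytes(bits: Sequence[bool]) -> bytes:
    """SSZ bytes of a Bitlist: same packing plus a delimiter bit at index len(bits)."""
    out = bytearray(len(bits) // 8 + 1)
    for i, bit in enumerate(bits):
        out[i // 8] |= int(bit) << (i % 8)
    out[len(bits) // 8] |= 1 << (len(bits) % 8)
    return bytes(out)


vector_json = to_hex_byte_string(bytes([0x11, 0x22]))
bitvector_json = to_hex_byte_string(bitvector_bytes([True, False, True, False]))
bitlist_json = to_hex_byte_string(bitlist_bytes([True, False, True]))
```

Because of the delimiter bit, a `Bitlist` and a `Bitvector` holding the same bits generally map to different hex strings.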

`Union` is encoded as an object with a `selector` and `data` field, where the contents of `data` change according to the selector.
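
A minimal sketch of the selector-object shape, assuming a hypothetical `encode_union` helper and a made-up two-option union of a `uint64` and a byte list. Each option supplies its own JSON encoder, so the shape of `data` follows the selector.

```python
import json
from typing import Any, Callable, Dict, Sequence


def encode_union(
    selector: int, value: Any, encoders: Sequence[Callable[[Any], Any]]
) -> Dict[str, Any]:
    """Build the selector-object for a Union value (hypothetical helper)."""
    if not 0 <= selector < len(encoders):
        raise ValueError("selector out of range")
    # The encoder for `data` is chosen by the selector.
    return {"selector": selector, "data": encoders[selector](value)}


# Hypothetical Union[uint64, ByteList[N]]: option 0 is a number, option 1 is bytes.
OPTIONS = [
    lambda number: str(number),       # uintN -> decimal string
    lambda data: "0x" + data.hex(),   # byte list -> hex-byte-string
]

as_uint = json.dumps(encode_union(0, 7, OPTIONS))
as_bytes = json.dumps(encode_union(1, b"\x11\x22", OPTIONS))
```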

> This encoding is used in [beacon-APIs](https://github.com/ethereum/beacon-APIs) with one exception: the `ParticipationFlags` type for the `getStateV2` response, although it is an alias of `uint8`, is encoded as a list of numbers. Future versions of the beacon API may address this incompatibility.
Review comment (Collaborator):

Was it not previously agreed to encode uint8, uint16, uint32 as numbers (because they fit in the JS size), and only encode uint64 as string? I believe that's why ParticipationFlags is the way it is.
AFAIK the specs do not encode uint8, uint16, uint32 anywhere else, so changing it wouldn't be too difficult, but I still prefer to avoid unnecessary number stringification.

Reply (Contributor Author):

> Was it not previously agreed to encode uint8, uint16, uint32 as numbers

no idea and no strong opinion either way.

> AFAIK the specs do not encode uint8

https://github.com/ethereum/consensus-specs/blob/dev/specs/deneb/beacon-chain.md#blob is the only (other) offender I can find that potentially also should be byte-ified.

> I believe that's why ParticipationFlags is the way it is.

I'd pin it on lack of interest since the two types were the same in this repo ;) i.e. the Python code liberally uses uint8 where byte is intended, for simplicity, since the two are after all... the same. Flags are not numbers in most CS-inspired interpretations of numbers that I can think of, so byte it becomes. We'll have to deal with the fallout at some point maybe, but this way we start with a clean slate at spec level.
