Initial proposal for binary protocol #2
Merged
8 commits
0ae967c
move from https://github.com/w3c/trace-context/pull/215/files
f5370c1
added MUST for ordering
63ad8f2
added a link to 256 limit and explained why spec is taking about byte…
8b5f9a5
use trace-flags to describe the field
253cfb6
addressed PRfeedback and more
9cf6b67
added note about endianess of bytes
9df6928
Fix typo (#3)
SLdragon
7b00826
Update spec/20-binary-format.md
SergeyKanzhelev
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,106 @@ | ||
# Binary format | ||
|
||
The binary format document describes how to encode the values of the two | ||
fields, `traceparent` and `tracestate`. This specification does not define how | ||
these fields are stored or sent as part of a binary payload. A basic | ||
implementation may serialize each field as the size of the field followed by | ||
its value. | ||
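
As an illustration of the "size followed by value" framing mentioned above, the sketch below prefixes each encoded field with a one byte size. The one byte size prefix and the names `frameField` and `unframeFields` are assumptions made for this example; the specification leaves the framing open.

```js
// Hypothetical framing helper: one size byte followed by the field value.
// The one byte size prefix is an assumption, not part of the specification.
function frameField(value) {
  if (value.length > 255) {
    throw new RangeError("field does not fit a one byte size prefix");
  }
  return Uint8Array.from([value.length, ...value]);
}

// Splits a buffer of consecutive framed fields back into their values.
function unframeFields(buffer) {
  const fields = [];
  let cursor = 0;
  while (cursor < buffer.length) {
    const size = buffer[cursor];
    cursor += 1;
    if (cursor + size > buffer.length) {
      throw new RangeError("truncated field");
    }
    fields.push(buffer.slice(cursor, cursor + size));
    cursor += size;
  }
  return fields;
}
```

A `tracestate` with many long entries can exceed 255 bytes, so a real payload format would likely need a wider size field than this sketch uses.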
|
||
The specification operates on bytes: unsigned 8-bit integer values | ||
from `0` to `255`. The representation of a byte as a set of bits (big or | ||
little endian) MUST be defined by the underlying platform and is out of | ||
scope of this specification. | ||
|
||
## `Traceparent` binary format | ||
|
||
The field `traceparent` encodes the version of the protocol and the fields | ||
`trace-id`, `parent-id` and `trace-flags`. Each field starts with a one byte | ||
field identifier, with the field value following immediately after it. Field | ||
identifiers are used as markers for additional verification of the value | ||
consistency and may be used in the future for the versioning of the | ||
`traceparent` field. | ||
|
||
``` abnf | ||
traceparent = version version_format | ||
version = 1BYTE ; version is 0 in the current spec | ||
version_format = "{ 0x0 }" trace-id "{ 0x1 }" parent-id "{ 0x2 }" trace-flags | ||
trace-id = 16BYTES | ||
parent-id = 8BYTES | ||
trace-flags = 1BYTE ; only the least significant bit is used | ||
``` | ||
|
||
An unknown field identifier (anything other than `0`, `1` and `2`) should be | ||
treated as an invalid `traceparent`. A `trace-id` or a `parent-id` consisting | ||
entirely of zeroes invalidates the `traceparent` as well. | ||
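
The all-zero rule can be checked with a small helper. This is a minimal sketch; `isAllZero` and `hasValidIds` are hypothetical names, and the sketch assumes that either identifier being all zeroes is enough to invalidate the `traceparent`.

```js
// Returns true when every byte in the array is zero.
function isAllZero(bytes) {
  return bytes.every((b) => b === 0);
}

// Assumption: an all-zero trace-id OR an all-zero parent-id is invalid.
// The length checks follow the 16BYTES and 8BYTES sizes in the grammar.
function hasValidIds(traceId, parentId) {
  return (
    traceId.length === 16 &&
    parentId.length === 8 &&
    !isAllZero(traceId) &&
    !isAllZero(parentId)
  );
}
```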
|
||
## Serialization of `traceparent` | ||
|
||
Implementations MUST serialize fields in the order of their field identifiers. | ||
In other words, the `trace-id` field is serialized first, `parent-id` | ||
second and `trace-flags` third. | ||
|
||
Field identifiers should be treated as unsigned byte numbers and should be | ||
encoded in big-endian bit order. | ||
|
||
The fields `trace-id` and `parent-id` are defined as byte arrays, NOT as | ||
long numbers. The first element of an array MUST be copied first. When an | ||
array is represented as a memory block of 16 bytes, serialization of | ||
`trace-id` is identical to a `memcpy` call on that memory block. This | ||
may be a concern for implementations casting these fields to integers: | ||
the protocol does NOT define whether those byte arrays are ordered as big | ||
endian or little endian, or whether they have a sign bit. | ||
|
||
If padding of the field is required (`traceparent` needs to be serialized into | ||
a bigger buffer), any number of bytes can be appended to the end of the | ||
serialized value. | ||
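
The serialization rules above can be sketched as follows. The function name `serializeTraceparent`, the constant names, and the choice of `0` as the padding byte are assumptions made for this example; the specification only says that any number of bytes can be appended.

```js
// Field identifiers from the grammar above.
const TRACE_ID_FIELD = 0;
const PARENT_ID_FIELD = 1;
const TRACE_FLAGS_FIELD = 2;

// Serializes version 0 of traceparent into 29 bytes, or into a larger
// zero-filled buffer when `bufferSize` asks for padding.
function serializeTraceparent(traceId, parentId, traceFlags, bufferSize = 29) {
  if (traceId.length !== 16 || parentId.length !== 8) {
    throw new RangeError("trace-id must be 16 bytes and parent-id 8 bytes");
  }
  if (bufferSize < 29) {
    throw new RangeError("buffer is too small for traceparent");
  }
  const out = new Uint8Array(bufferSize); // zero-initialized, so padding is 0
  let cursor = 0;
  out[cursor++] = 0; // version
  out[cursor++] = TRACE_ID_FIELD;
  out.set(traceId, cursor); // byte-for-byte copy, first element first
  cursor += 16;
  out[cursor++] = PARENT_ID_FIELD;
  out.set(parentId, cursor);
  cursor += 8;
  out[cursor++] = TRACE_FLAGS_FIELD;
  out[cursor++] = traceFlags & 0xff;
  return out;
}
```

The identifiers are copied with `set`, which mirrors the `memcpy` behaviour described above and avoids any integer conversion.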
|
||
## `traceparent` example | ||
|
||
``` js | ||
{0, | ||
0, 75, 249, 47, 53, 119, 179, 77, 166, 163, 206, 146, 157, 0, 14, 71, 54, | ||
1, 52, 240, 103, 170, 11, 169, 2, 183, | ||
2, 1} | ||
``` | ||
|
||
This corresponds to: | ||
|
||
- `trace-id` is | ||
`{75, 249, 47, 53, 119, 179, 77, 166, 163, 206, 146, 157, 0, 14, 71, 54}` or | ||
`4bf92f3577b34da6a3ce929d000e4736`. | ||
- `parent-id` is `{52, 240, 103, 170, 11, 169, 2, 183}` or `34f067aa0ba902b7`. | ||
- `trace-flags` is `1` with the meaning `recorded` is true. | ||
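
The correspondence between the byte list and the hexadecimal strings can be checked with a short script. This is only an illustration; `toHex` is a hypothetical helper, and the offsets follow the field layout described above.

```js
// Converts a byte array to its lowercase hexadecimal string form.
function toHex(bytes) {
  return Array.from(bytes, (b) => b.toString(16).padStart(2, "0")).join("");
}

const example = [0,
  0, 75, 249, 47, 53, 119, 179, 77, 166, 163, 206, 146, 157, 0, 14, 71, 54,
  1, 52, 240, 103, 170, 11, 169, 2, 183,
  2, 1];

// Byte 0 is the version, byte 1 the trace-id field identifier,
// bytes 2..17 the trace-id, byte 18 the parent-id field identifier,
// bytes 19..26 the parent-id, byte 27 the trace-flags field identifier.
const traceIdHex = toHex(example.slice(2, 18)); // "4bf92f3577b34da6a3ce929d000e4736"
const parentIdHex = toHex(example.slice(19, 27)); // "34f067aa0ba902b7"
const recorded = (example[28] & 1) === 1; // true
```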
|
||
## `tracestate` binary format | ||
|
||
A list of up to 32 name-value pairs. Each list member starts with the 1 byte | ||
field identifier `0`. A list member consists of a single byte key length | ||
followed by the key, and a single byte value length followed by the encoded | ||
value. Note that a single byte length field can express lengths from `0` to | ||
`255`. The limit on key and value length is defined by the [trace | ||
context](https://w3c.github.io/trace-context/#header-value) | ||
specification. Strings are transmitted in ASCII encoding. | ||
|
||
|
||
``` abnf | ||
tracestate = list-member 0*31( list-member ) | ||
list-member = "0" key-len key value-len value | ||
key-len = 1BYTE ; length of the key string | ||
value-len = 1BYTE ; length of the value string | ||
``` | ||
|
||
A zero length key (`key-len == 0`) indicates the end of the `tracestate`. So | ||
when `tracestate` is serialized into a buffer that is longer than it | ||
requires, `{ 0, 0 }` (field id `0` and key-len `0`) indicates the end of | ||
the `tracestate`. | ||
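
A serializer for this layout might look like the sketch below. The name `serializeTracestate`, the `[key, value]` pair input shape and the optional `bufferSize` parameter are assumptions made for this example, and the sketch assumes that keys and values contain ASCII characters only.

```js
// Serializes [key, value] string pairs into the tracestate binary form.
// Assumption: keys and values are ASCII, so one character is one byte.
function serializeTracestate(entries, bufferSize) {
  if (entries.length > 32) {
    throw new RangeError("tracestate holds at most 32 list members");
  }
  const bytes = [];
  for (const [key, value] of entries) {
    if (key.length === 0 || key.length > 255 || value.length > 255) {
      throw new RangeError("key or value length does not fit the format");
    }
    bytes.push(0, key.length); // field identifier, then key-len
    for (const ch of key) bytes.push(ch.charCodeAt(0));
    bytes.push(value.length); // value-len
    for (const ch of value) bytes.push(ch.charCodeAt(0));
  }
  const size = bufferSize === undefined ? bytes.length : bufferSize;
  if (size < bytes.length) {
    throw new RangeError("buffer is too small for tracestate");
  }
  // The buffer is zero-initialized, so any spare bytes after the last list
  // member already read as the { 0, 0 } end marker.
  const out = new Uint8Array(size);
  out.set(bytes, 0);
  return out;
}
```

For example, `serializeTracestate([["foo", "1"]], 8)` produces the list member followed by zero bytes, the first two of which act as the end marker.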
|
||
## `tracestate` example | ||
|
||
``` js | ||
{ 0, 3, 102, 111, 111, 16, 51, 52, 102, 48, 54, 55, 97, 97, 48, 98, 97, 57, 48, 50, 98, 55, | ||
0, 3, 98, 97, 114, 4, 48, 46, 50, 53, } | ||
|
||
``` | ||
|
||
This corresponds to 2 tracestate entries: | ||
|
||
`foo=34f067aa0ba902b7,bar=0.25` |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,25 @@ | ||
# Rationale for decision on binary format | ||
|
||
The binary format is similar to protobuf encoding but has no dependency on | ||
the protobuf project. It uses one byte field identifiers in front of field | ||
values. | ||
|
||
## Field identifiers | ||
|
||
The protocol uses field identifiers for fields like `trace-id`, `parent-id`, | ||
`trace-flags` and `tracestate` entries. The purpose of the field | ||
identifiers is two-fold. First, they allow existing fields to be removed or | ||
new ones to be added going forward. Second, they provide an additional layer | ||
of validation of the format. | ||
|
||
## How can we add new fields | ||
|
||
If we follow the rule that new ids are always appended at the end of the | ||
buffer, we can add up to 127. After that we can either use varint encoding or | ||
just reserve 255 as a continuation byte. The assumption at the moment is | ||
that the specification will never get to this point. | ||
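
To illustrate why 127 is a natural boundary: in a protobuf-style base-128 varint the most significant bit of each byte is a continuation flag, so values up to 127 fit in one byte and stay identical to the plain one byte identifier. The sketch below is illustrative only; `encodeVarint` is a hypothetical name and this specification does not adopt varint encoding.

```js
// Encodes a non-negative integer as a base-128 varint: 7 payload bits per
// byte, least significant group first, high bit set on all but the last byte.
function encodeVarint(value) {
  if (!Number.isInteger(value) || value < 0) {
    throw new RangeError("value must be a non-negative integer");
  }
  const bytes = [];
  let rest = value;
  do {
    let byte = rest & 0x7f;
    rest = Math.floor(rest / 128);
    if (rest > 0) byte |= 0x80; // more bytes follow
    bytes.push(byte);
  } while (rest > 0);
  return bytes;
}

const oneByte = encodeVarint(127); // [127]
const twoBytes = encodeVarint(128); // [128, 1]
```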
|
||
## Why custom binary protocol | ||
|
||
We did not find a non-proprietary, widely used binary protocol that can be | ||
used in this specification. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,86 @@ | ||
# De-serialization algorithms | ||
|
||
This is a non-normative section that describes de-serialization algorithms | ||
that may be used to parse `traceparent` and `tracestate` field values. | ||
|
||
## De-serialization of `traceparent` | ||
|
||
Let's assume the algorithm takes a buffer (a byte array) and can set | ||
and shift a cursor in the buffer, as well as check whether the end of | ||
the buffer was reached or will be reached after reading the given number | ||
of bytes. This algorithm can work on a stream of bytes. De-serialization | ||
of `traceparent` MAY be done in the following sequence: | ||
|
||
1. If the buffer is empty - RETURN invalid status `BUFFER_EMPTY`. Set a cursor | ||
   to the first byte. | ||
2. Read the `version` byte at the cursor position. Shift the cursor by `1` byte. | ||
3. If at the end of the buffer - RETURN invalid status `TRACEPARENT_INCOMPLETE`. | ||
4. **Parse `trace-id`**. Read the field identifier byte at the cursor | ||
   position and shift the cursor by `1` byte. If NOT `0` - go to step | ||
   `8. Report invalid field`. Otherwise - check that the remaining buffer | ||
   size is more than or equal to `16` bytes. If shorter - RETURN invalid | ||
   status `TRACE_ID_TOO_SHORT`. Otherwise read the next `16` bytes for | ||
   `trace-id` and shift the cursor to the end of those `16` bytes. | ||
5. **Parse `parent-id`**. Read the field identifier byte at the cursor | ||
   position and shift the cursor by `1` byte. If NOT `1` - go to step | ||
   `8. Report invalid field`. Otherwise - check that the remaining buffer | ||
   size is more than or equal to `8` bytes. If shorter - RETURN invalid | ||
   status `PARENT_ID_TOO_SHORT`. Otherwise read the next `8` bytes for | ||
   `parent-id` and shift the cursor to the end of those `8` bytes. | ||
6. **Parse `trace-flags`**. Read the field identifier byte at the cursor | ||
   position and shift the cursor by `1` byte. If NOT `2` - go to step | ||
   `8. Report invalid field`. Otherwise - check the remaining size of the | ||
   buffer. If at the end of the buffer - RETURN invalid status. Otherwise - | ||
   read the `trace-flags` byte. The least significant bit represents the | ||
   `recorded` value. | ||
7. RETURN status `OK` if `version` is `0` or status `DOWNGRADED_TO_ZERO` | ||
   otherwise. | ||
8. **Report invalid field**. If `version` is `0` - RETURN invalid status | ||
   `INVALID_FIELD_ID`. If `version` has any other value - RETURN invalid | ||
   status `INCOMPATIBLE_VERSION`. | ||
|
||
_Note_ that invalid status names are given for readability and are not part of | ||
the specification. | ||
|
||
_Note_ that parsing should not treat any additional bytes at the end of the | ||
buffer as an invalid status. Those bytes can be added for padding purposes. | ||
Optionally, an implementation can check that the buffer is at least `29` bytes | ||
long as a very first step if this check is not expensive. | ||
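
The `traceparent` steps above can be sketched as a single function. This is a non-normative illustration: `parseTraceparent` and the returned object shape are assumptions, `TRACE_FLAGS_MISSING` is a hypothetical name for the status that step 6 leaves unnamed, and a buffer that ends before a field identifier is treated here as that field being too short.

```js
// Parses a traceparent buffer and returns a status plus the decoded fields.
function parseTraceparent(buffer) {
  if (buffer.length === 0) return { status: "BUFFER_EMPTY" };
  let cursor = 0;
  const version = buffer[cursor++];
  if (cursor >= buffer.length) return { status: "TRACEPARENT_INCOMPLETE" };

  // Step 8: the reported status depends on the version byte.
  const invalidField = () => ({
    status: version === 0 ? "INVALID_FIELD_ID" : "INCOMPATIBLE_VERSION",
  });

  // trace-id: field identifier 0 followed by 16 bytes.
  if (buffer[cursor++] !== 0) return invalidField();
  if (buffer.length - cursor < 16) return { status: "TRACE_ID_TOO_SHORT" };
  const traceId = buffer.slice(cursor, cursor + 16);
  cursor += 16;

  // parent-id: field identifier 1 followed by 8 bytes.
  if (cursor >= buffer.length) return { status: "PARENT_ID_TOO_SHORT" };
  if (buffer[cursor++] !== 1) return invalidField();
  if (buffer.length - cursor < 8) return { status: "PARENT_ID_TOO_SHORT" };
  const parentId = buffer.slice(cursor, cursor + 8);
  cursor += 8;

  // trace-flags: field identifier 2 followed by 1 byte.
  if (cursor >= buffer.length) return { status: "TRACE_FLAGS_MISSING" };
  if (buffer[cursor++] !== 2) return invalidField();
  if (cursor >= buffer.length) return { status: "TRACE_FLAGS_MISSING" };
  const traceFlags = buffer[cursor++];

  // Bytes after this point are ignored, since they may be padding.
  return {
    status: version === 0 ? "OK" : "DOWNGRADED_TO_ZERO",
    version,
    traceId,
    parentId,
    traceFlags,
    recorded: (traceFlags & 1) === 1,
  };
}
```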
|
||
## De-serialization of `tracestate` | ||
|
||
Let's assume the algorithm takes a buffer (a byte array) and can set | ||
and shift a cursor in the buffer, as well as check whether the end of | ||
the buffer was reached or will be reached after reading the given number | ||
of bytes. The algorithm also uses the `version` value parsed from | ||
`traceparent`. If `version` was not given - value `0` SHOULD be used. This | ||
algorithm can work on a stream of bytes. De-serialization of `tracestate` | ||
MAY be done in the following sequence: | ||
|
||
1. If at the end of the buffer - RETURN status `OK`. Otherwise set a | ||
   cursor to the first byte. | ||
2. **Parse `list-member` field identifier**. Read the field identifier | ||
   byte at the cursor position and shift the cursor by `1` byte. If NOT `0` | ||
   and `version` is `0` - RETURN invalid status `INVALID_FIELD_ID`. If NOT | ||
   `0` and `version` has any other value - RETURN invalid status | ||
   `INCOMPATIBLE_VERSION`. | ||
3. **Parse key**. | ||
   1. If at the end of the buffer - RETURN status `OK`. This situation | ||
      indicates that the `tracestate` value was padded with `0`. | ||
   2. Read the `key-len` byte. Shift the cursor by `1` byte. If the value of | ||
      `key-len` is `0` - RETURN status `OK`. This situation indicates an | ||
      explicit end of the `tracestate`. | ||
   3. Check that the buffer has `key-len` more bytes. If not - RETURN | ||
      `KEY_TOO_SHORT`. | ||
   4. Read `key-len` bytes as `key`. Shift the cursor by `key-len` bytes. | ||
4. **Parse value**. | ||
   1. If at the end of the buffer - RETURN status `INCOMPLETE_LIST_MEMBER`. | ||
   2. Read the `value-len` byte. Shift the cursor by `1` byte. If the value of | ||
      `value-len` is `0` - add `list-member` with the `key` and empty | ||
      `value` to the `tracestate` list. RETURN status `OK`. | ||
   3. Check that the buffer has `value-len` more bytes. If not - RETURN | ||
      `VALUE_TOO_SHORT`. | ||
   4. Read `value-len` bytes as `value`. Shift the cursor by `value-len` | ||
      bytes. | ||
   5. Add `list-member` with the `key` and `value` to the `tracestate` | ||
      list. | ||
5. If at the end of the buffer - RETURN status `OK`. Otherwise go to step | ||
   `2. Parse list-member field identifier`. |
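
The `tracestate` steps above can be sketched as a loop. This is a non-normative illustration: `parseTracestate` and the returned object shape are assumptions. The sketch also assumes that reaching the end of the buffer right after a complete list member ends parsing with status `OK`.

```js
// Parses a tracestate buffer into [key, value] pairs plus a status.
function parseTracestate(buffer, version = 0) {
  const entries = [];
  let cursor = 0;
  // Strings are ASCII, so each byte maps to one character.
  const ascii = (bytes) => String.fromCharCode(...bytes);

  while (cursor < buffer.length) {
    // Field identifier of the list member.
    if (buffer[cursor++] !== 0) {
      const status = version === 0 ? "INVALID_FIELD_ID" : "INCOMPATIBLE_VERSION";
      return { status, entries };
    }
    // Key: end of buffer here means the value was padded with a single 0.
    if (cursor >= buffer.length) break;
    const keyLen = buffer[cursor++];
    if (keyLen === 0) break; // explicit end marker { 0, 0 }
    if (buffer.length - cursor < keyLen) return { status: "KEY_TOO_SHORT", entries };
    const key = ascii(buffer.slice(cursor, cursor + keyLen));
    cursor += keyLen;

    // Value.
    if (cursor >= buffer.length) return { status: "INCOMPLETE_LIST_MEMBER", entries };
    const valueLen = buffer[cursor++];
    if (valueLen === 0) {
      entries.push([key, ""]);
      break; // step 4.2 returns OK after an empty value
    }
    if (buffer.length - cursor < valueLen) return { status: "VALUE_TOO_SHORT", entries };
    entries.push([key, ascii(buffer.slice(cursor, cursor + valueLen))]);
    cursor += valueLen;
  }
  return { status: "OK", entries };
}
```

Joining the parsed entries with `entries.map(([k, v]) => k + "=" + v).join(",")` gives the textual form used in the `tracestate` example.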