VarU64

A simple variable-length encoding for unsigned 64 bit integers.

Specification

To decode a varu64, look at the first byte. If its value is below 248, the value itself is the encoded number. Else, the first byte determines the further length of the encoding:

first byte	number of additional bytes
248	1
249	2
250	3
251	4
252	5
253	6
254	7
255	8

Following the first byte are length many bytes. These bytes are the big-endian representation of the encoded number.

Of all possible representations for a number that this scheme admits, the shortest one is its unique, valid encoding. Decoders must indicate an error if a value uses an encoding that is longer than necessary.

Remarks/Properties

Whether the first byte signifies a length can be checked efficiently by testing whether the first 5 bits are set to 1. In that case, the length itself is 1 plus the value of the last three bits. In other words additional_bytes = (first_byte & 0xf8 == 0xf8) ? (first_byte & 7) + 1 : 0.

The length of an encoded value can be determined by solely looking at the first byte.

The canonical encoding means that if there are >= 2 additional bytes, then there may be no leading zero bytes. If there is just one additional bytes, then the additional byte must be >= 248. That means 0-247 is encoded in one byte, but 248-255 is encoded in two bytes.

Due to the canonicity requirement of only allowing the shortest possible encoding, there is a bijection between unsigned 64 bit integers and encodings.

The cost for the simplicity and canonicty of this format are a (somewhat) large number of unused byte strings. On the plus side, these can be used as extension points.

Related work: This has been inspired by the issues in the multiformats varint repository, in particular issues #8 and #12.

License

The specification (this file) is licensed as CC BY-SA 4.0, the code in this repository is licensed under AGPL-3.0

Name		Name	Last commit message	Last commit date
Latest commit History 38 Commits
fuzz		fuzz
src		src
.gitignore		.gitignore
Cargo.toml		Cargo.toml
LICENSE		LICENSE
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fuzz

fuzz

src

src

.gitignore

.gitignore

Cargo.toml

Cargo.toml

LICENSE

LICENSE

README.md

README.md

Repository files navigation

VarU64

Specification

Remarks/Properties

License

About

Releases

Packages

Languages

License

dominictarr/varu64-rs

Folders and files

Latest commit

History

Repository files navigation

VarU64

Specification

Remarks/Properties

License

About

Resources

License

Stars

Watchers

Forks

Languages