Rewording Huffman coding
Yoric committed May 13, 2019
1 parent 922d481 commit 833f6c0
Showing 1 changed file with 18 additions and 11 deletions.
29 changes: 18 additions & 11 deletions format.md
@@ -62,7 +62,7 @@ ProbabilityTable ::= ProbabilityTableUnreachable # Compression artifact. A t
```

The probability tables are written down in an order extracted from the grammar and define a model
-`table: (parent type, my type) -> [(value, probabilities)]`.
+`huffman_at: (parent type, my type) -> HuffmanTable`.

FIXME: Specify how the order is extracted from the grammar.

@@ -113,21 +113,28 @@ LazyAST ::= Node
In the definition of `AST`, for each `i`, `LazyPartByteLen[i]` represents the number
of bytes used to store the item of the sub-ast `LazyAST[i]`.

-# Nodes and probabilities
+# Nodes

-To decode a node, we need to know the type of its parent.
+Nodes are stored as sequences of Huffman-encoded values. Note that the encoding uses
+numerous distinct Huffman tables. Each `(parent tag, value type)` pair determines the
+Huffman table to be used to decode the next few bits in the sequence.

```
RootNode ::= Value(ε)*
Node(parent) ::= t=Tag(parent) Field(t)*
-Tag(parent) ::= Primitive(parent)
-Value(parent) ::= "" # If field is lazy
-| Node(parent) # If field is an interface or sum of interfaces
-| Primitive(parent) # If field is a primitive value
-| List(parent) # If field is a list
+Tag(parent) ::= Primitive(parent, TAG)
+Value(parent) ::= "" # If field is lazy
+| Node(parent) # If field is an interface or sum of interfaces
+| List(parent) # If field is a list
+| Primitive(parent, U32) # If field is a u32
+| Primitive(parent, I32) # If field is a i32
+| Primitive(parent, F64) # ...
+| Primitive(parent, StringIndex)
+| Primitive(parent, OptionalStringIndex)
List(parent) ::= ListLength(parent) Value(parent)*
-ListPength(parent) ::= Value((parent, 'list-length')) # List lengths are u32 values with a special parent
-Primitive(parent) ::= bit*
+ListLength(parent) ::= Primitive(ListLength<parent>, U32) # List lengths are u32 values with a special parent
+Primitive(parent, type) ::= bit*
```

-To determine how many bits need to be read to decode a `Primitive(parent)`, it is sufficient for a decoder to know the distribution of probabilities for `(parent, expected type)`, as stored in the `ProbabilityPrelude`.
+In every instance of `Primitive(parent, type)`, we use the Huffman table defined as `huffman_at` (see above)
+to both determine the number of bits to read and interpret these bits as a value of the corresponding `type`.
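
The role of `huffman_at` when reading a `Primitive(parent, type)` can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the reference decoder: the dictionary-based table layout, the function name `decode_primitive`, the tag names `Script` and `Module`, and the use of an empty string for the root parent `ε` are all hypothetical, and the table contents are made up for the example.

```python
from typing import Dict, Iterator, Tuple

# Assumption: a Huffman table is modelled as a map from a prefix-free
# bit string (e.g. "10") to the decoded value. The real format stores
# tables differently; this layout is only for illustration.
HuffmanTable = Dict[str, object]

# Hypothetical stand-in for the model `huffman_at`:
# (parent type, my type) -> HuffmanTable. Contents are invented.
huffman_at: Dict[Tuple[str, str], HuffmanTable] = {
    ("", "TAG"): {"0": "Script", "1": "Module"},
    ("Script", "U32"): {"0": 0, "10": 1, "11": 2},
}


def decode_primitive(bits: Iterator[int], parent: str, type_: str) -> object:
    """Read bits until they form a code of the table for (parent, type_).

    The table alone decides how many bits are consumed and which value
    they stand for, so no explicit length is stored in the stream.
    """
    table = huffman_at[(parent, type_)]
    longest = max(len(code) for code in table)
    code = ""
    for bit in bits:
        code += str(bit)
        if code in table:
            return table[code]
        if len(code) >= longest:
            raise ValueError(f"no code matches bits {code!r}")
    raise ValueError("bit stream ended in the middle of a code")


# Usage: one tag followed by two u32 values whose parent is that tag.
stream = iter([0, 1, 0, 1, 1])
tag = decode_primitive(stream, "", "TAG")
first = decode_primitive(stream, tag, "U32")
second = decode_primitive(stream, tag, "U32")
print(tag, first, second)
```

Because every table is prefix-free, the decoder can stop at the first matching code; the same bits would decode differently, and with a different length, under another `(parent, type)` pair.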
