Add appendix with example file decoding 'by hand' #120
As an appendix with examples was suggested in ietf-wg-cellar#120 and the example felt out of place here, it is removed. Wording of the remainder is slightly improved
rfc_backmatter.md
Outdated
| The vendor string is reference libFLAC 1.3.3 20190804, the field contents of the only field is title=Québec. The vorbis comment field is 13 bytes but only 12 characters in size, because it contains one character needing 2 bytes to represent. | ||
| The vendor string is reference libFLAC 1.3.3 20190804, the field contents of the only field is TITLE=Щелкунчик. The vorbis comment field is 24 bytes but only 15 characters in size, because it contains 9 characters each needing 2 bytes to represent. |
TITLE= Матрёшка would be funnier (sorry, I could not resist ;-)
Not a bad idea actually. I was looking for short words in scripts other than Latin that would not be considered offensive in any way, which was rather hard as I don't speak or read any Russian, Greek, Chinese etc. Матрёшка is one character shorter than Щелкунчик, so it is probably a better fit, as all those UTF-8 code points still make it look cluttered. Maybe you know something even shorter in Hebrew?
Now that I think of it: if I shorten the vendor string, I don't have to change the whole example even with a longer vorbis comment field, so there is less room for error.
Short, well known and anything but offensive would be: שלום
Thanks, I'll add that in and force-push this PR branch to amend the last commit.
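For anyone who wants to double-check the byte and character counts discussed in this thread, a few lines of Python are enough. This is only an illustrative sketch: the helper name `utf8_stats` is made up here, and the Hebrew word is written with escape sequences purely to avoid rendering issues.

```python
def utf8_stats(field):
    """Return (number of characters, number of UTF-8 bytes) for a comment field."""
    return len(field), len(field.encode("utf-8"))


# Field from the first version of the example (one 2-byte character).
quebec = utf8_stats("title=Qu\u00e9bec")
# Field from the second version (nine 2-byte Cyrillic characters).
nutcracker = utf8_stats("TITLE=Щелкунчик")
# Field using the Hebrew suggestion (four 2-byte characters).
shalom = utf8_stats("TITLE=\u05e9\u05dc\u05d5\u05dd")

print(quebec, nutcracker, shalom)
```

The first two results should match the numbers quoted in the diff: 12 characters in 13 bytes, and 15 characters in 24 bytes.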
rfc_backmatter.md
Outdated
| Start | Length | Contents | Description | ||
| :------|:--------|:-------------------|:----------------- | ||
| 0x7e+0 | 1 bit | 0b0 | Last metadata block |
Another copy-paste error
| 0x7e+0 | 1 bit | 0b0 | Last metadata block | |
| 0x7e+0 | 1 bit | 0b1 | Last metadata block |
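To make the meaning of that single bit concrete, here is a minimal sketch of how a reader might split a 4-byte metadata block header into the last-block flag, the block type and the block length. The function name and the example bytes are hypothetical and not taken from the example file.

```python
def parse_metadata_block_header(header):
    """Split a 4-byte metadata block header into (is_last, block_type, length)."""
    if len(header) != 4:
        raise ValueError("a metadata block header is 4 bytes long")
    is_last = bool(header[0] >> 7)              # first bit: last metadata block flag
    block_type = header[0] & 0x7F               # next 7 bits: block type
    length = int.from_bytes(header[1:4], "big")  # 24-bit length of the block body
    return is_last, block_type, length


# Made-up header: flag set, block type 1, body length 6.
print(parse_metadata_block_header(bytes([0x81, 0x00, 0x00, 0x06])))
# Same header with the flag cleared, as in the copy-paste error.
print(parse_metadata_block_header(bytes([0x01, 0x00, 0x00, 0x06])))
```

With the flag left at 0b0, a decoder following this logic would expect yet another metadata block instead of the first audio frame.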
rfc_backmatter.md
Outdated
| 0x15+4 | 36 bit | 0b0000, 0x00000001 | Total no. of samples 1 | ||
| 0x1a | 16 byte | (...) | MD5 signature | ||
| The minimum and maximum blocksize are both 4096. This was apparently the blocksize the encoder was intending to use for this audio, but as only 1 interchannel sample was provided, no frames with size 4096 are actually present in this file. This is because even in fixed blocksize streams, the size of the last frame can be smaller. |
| The minimum and maximum blocksize are both 4096. This was apparently the blocksize the encoder was intending to use for this audio, but as only 1 interchannel sample was provided, no frames with size 4096 are actually present in this file. This is because even in fixed blocksize streams, the size of the last frame can be smaller. | |
| The minimum and maximum blocksize are both 4096. This was apparently the blocksize the encoder planned to use, but as only 1 interchannel sample was provided, no frames with 4096 samples are actually present in this file. |
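As a cross-check of the streaminfo table, the 34-byte block body can be split into its fields with a short script. This is a sketch only: `parse_streaminfo` is a made-up helper, and in `EXAMPLE_BODY` only the blocksizes, the total number of samples and the MD5 signature come from this example. The frame sizes, sample rate, channel count and bit depth are assumed values chosen for illustration.

```python
def parse_streaminfo(body):
    """Split the 34-byte STREAMINFO body into its fields."""
    if len(body) != 34:
        raise ValueError("a STREAMINFO body is 34 bytes long")
    value = int.from_bytes(body, "big")
    remaining = len(body) * 8

    def take(bits):
        # Return the next `bits` bits, reading from the most significant end.
        nonlocal remaining
        remaining -= bits
        return (value >> remaining) & ((1 << bits) - 1)

    info = {}
    info["min_blocksize"] = take(16)
    info["max_blocksize"] = take(16)
    info["min_framesize"] = take(24)
    info["max_framesize"] = take(24)
    info["sample_rate"] = take(20)
    info["channels"] = take(3) + 1          # stored as (channels - 1)
    info["bits_per_sample"] = take(5) + 1   # stored as (bits per sample - 1)
    info["total_samples"] = take(36)        # interchannel samples
    info["md5"] = take(128).to_bytes(16, "big").hex()
    return info


EXAMPLE_BODY = bytes.fromhex(
    "1000" "1000"                        # min and max blocksize: 4096
    "00000f" "00000f"                    # frame sizes (assumed)
    "0ac442f000000001"                   # 44100 Hz, 2 ch, 16 bit (assumed), 1 sample
    "3e84b41807dc690307586a3dad1a2e0f"   # MD5 signature
)
print(parse_streaminfo(EXAMPLE_BODY))
```

The 36-bit total sample count straddles a byte boundary (it starts at 0x15+4), which is why the sketch treats the whole body as one big integer rather than slicing bytes.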
rfc_backmatter.md
Outdated
| The frame ends with 6 padding bits and a 2 byte frame CRC | ||
| To decode this subframe, 21 predictions have to calculated and added to their corresponding residuals. This is a sequential process: as each prediction uses previous samples, it is not possible to start this decoding halfway a subframe or decode a subframe with parallel threads. |
| To decode this subframe, 21 predictions have to calculated and added to their corresponding residuals. This is a sequential process: as each prediction uses previous samples, it is not possible to start this decoding halfway a subframe or decode a subframe with parallel threads. | |
| To decode this subframe, 21 predictions have to be calculated and added to their corresponding residuals. This is a sequential process: as each prediction uses previous samples, it is not possible to start this decoding halfway a subframe or decode a subframe with parallel threads. |
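The sequential nature of this step is easy to see in code. Below is a minimal sketch of restoring samples from warm-up samples and residuals with a linear predictor. The function name is hypothetical, and the coefficients, shift and sample values in the usage lines are made up rather than taken from the example subframe.

```python
def restore_lpc(warmup, residuals, coefficients, shift):
    """Rebuild samples one by one from warm-up samples and residuals.

    coefficients[0] applies to the most recent sample. Every prediction
    needs the samples restored before it, so the loop cannot be split up.
    """
    order = len(coefficients)
    if order < 1 or len(warmup) != order:
        raise ValueError("need exactly one warm-up sample per coefficient")
    samples = list(warmup)
    for residual in residuals:
        history = reversed(samples[-order:])  # most recent sample first
        prediction = sum(c * s for c, s in zip(coefficients, history)) >> shift
        samples.append(prediction + residual)
    return samples


# Made-up data: a second-order predictor that continues a straight line.
print(restore_lpc([0, 1], [0, 0], [2, -1], 0))
# Made-up data: a first-order predictor with a constant residual.
print(restore_lpc([10], [1, 1, 1], [1], 0))
```

Each pass through the loop reads values appended by earlier passes, which is the reason decoding cannot start halfway through a subframe.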
rfc_backmatter.md
Outdated
| This informational appendix contains short example FLAC files and short parts of FLAC files which are decoded step by step. These examples provide a more engaging way to understand the FLAC format than the formal specification. The text explaining these examples assumes the reader has at least cursorily read the specification and refers to the specification for explanation of the terminology used. These examples mostly focus on the layout of several metadata blocks and subframe types and the implications of certain aspects (for example wasted bits and stereo decorrelation) on this layout. | ||
| The examples feature (parts of) files generated by various FLAC encoders. These are presented in hexadecimal or binary format, followed by tables and text referring to various features by their starting bit positions in these representations. These starting positions (shortened to 'start' in the tables) are a hexadecimal byte position and a start bit within that byte, separated by a plus sign. Counts for these start at zero. For example, a feature starting at the 3rd bit of the 17th byte is referred to as starting at 0x10+2. |
Perhaps stress a little more that these examples are informational and could contain errors, despite thorough checking?
| The examples feature (parts of) files generated by various FLAC encoders. These are presented in hexadecimal or binary format, followed by tables and text referring to various features by their starting bit positions in these representations. These starting positions (shortened to 'start' in the tables) are a hexadecimal byte position and a start bit within that byte, separated by a plus sign. Counts for these start at zero. For example, a feature starting at the 3rd bit of the 17th byte is referred to as starting at 0x10+2. | |
| The examples feature (parts of) files generated by various FLAC encoders. These are presented in hexadecimal or binary format, followed by tables and text referring to various features by their starting bit positions in these representations. Each starting position (shortened to 'start' in the tables) is a hexadecimal byte position and a start bit within that byte, separated by a plus sign. Counts for these start at zero. For example, a feature starting at the 3rd bit of the 17th byte is referred to as starting at 0x10+2. | |
| All data in this appendix has been thoroughly verified. However, as this appendix is informational, in case any information here conflicts with statements in the formal specification, the latter takes precedence. |
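The start notation described in this paragraph maps directly onto a bit offset, which may help when checking the tables by script. The following is a small sketch with made-up helper names (`parse_start`, `read_bits`), not something the appendix itself defines.

```python
def parse_start(start):
    """Convert a start position such as '0x10+2' (or '0x1a') to a bit offset."""
    byte_part, _, bit_part = start.partition("+")
    bit = int(bit_part) if bit_part else 0
    if not 0 <= bit <= 7:
        raise ValueError("the bit position within a byte must be 0..7")
    return int(byte_part, 16) * 8 + bit


def read_bits(data, start, length):
    """Read `length` bits from `data`, beginning at the given start position."""
    offset = parse_start(start)
    total_bits = len(data) * 8
    if offset + length > total_bits:
        raise ValueError("field runs past the end of the data")
    value = int.from_bytes(data, "big")
    shift = total_bits - offset - length
    return (value >> shift) & ((1 << length) - 1)


# The 3rd bit of the 17th byte: byte index 16, bit index 2.
print(parse_start("0x10+2"))
# Three bits starting at the 3rd bit of a single made-up byte 0b10110000.
print(read_bits(bytes([0b10110000]), "0x00+2", 3))
```

Because both counts start at zero, bit 0 is the most significant bit of its byte in this sketch.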
rfc_backmatter.md
Outdated
| Anywhere a number of samples is mentioned (blocksize, total number of samples, sample rate), interchannel samples are meant. | ||
| The MD5 sum (starting at 0x1a) is 0x3e84 b418 07dc 6903 0758 6a3d ad1a 2e0f. This is validated after decoding the samples. |
| The MD5 sum (starting at 0x1a) is 0x3e84 b418 07dc 6903 0758 6a3d ad1a 2e0f. This is validated after decoding the samples. | |
| The MD5 sum (starting at 0x1a) is 0x3e84 b418 07dc 6903 0758 6a3d ad1a 2e0f. This will be validated after decoding the samples. |
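For readers who want to repeat that validation themselves, here is a rough sketch of computing the MD5 sum over decoded samples. It assumes, as I understand the format, that the sum covers the interleaved samples stored as signed little-endian values padded to whole bytes. The function name and the sample values in the usage line are made up.

```python
import hashlib


def md5_of_samples(interchannel_samples, bits_per_sample):
    """Return the MD5 hex digest of decoded audio.

    interchannel_samples: iterable of tuples, one value per channel.
    Assumption: samples are hashed interleaved, signed, little-endian,
    using the smallest whole number of bytes that fits the bit depth.
    """
    bytes_per_sample = (bits_per_sample + 7) // 8
    digest = hashlib.md5()
    for frame in interchannel_samples:
        for sample in frame:
            digest.update(sample.to_bytes(bytes_per_sample, "little", signed=True))
    return digest.hexdigest()


# Made-up stereo data: a single interchannel sample at 16 bits per sample.
print(md5_of_samples([(1, -1)], 16))
```

Comparing the returned digest with the 16 bytes stored at 0x1a is then a plain string comparison.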
rfc_backmatter.md
Outdated
| 0x34+1 | 6 bit | 0b000001 | verbatim subframe | ||
| 0x34+7 | 1 bit | 0b1 | wasted bits present | ||
| 0x35+0 | 4 bit | 0b0001 | 4 wasted bits | ||
| 0x35+4 | 14 bit | 0b0010, 0x8b | 12-bit unencoded sample |
| 0x35+4 | 14 bit | 0b0010, 0x8b | 12-bit unencoded sample | |
| 0x35+4 | 12 bit | 0b0010, 0x8b | 12-bit unencoded sample |
good catch!
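The corrected length can be sanity-checked by decoding the rows of this table in code. The sketch below assumes a bit depth of 16 for this subframe (12 stored bits plus 4 wasted bits) and assumes the bit at 0x34+0 is a zero padding bit. The class and function names are made up.

```python
class BitReader:
    """Reads bits from a byte string, most significant bit first."""

    def __init__(self, data):
        self.bits = "".join(f"{byte:08b}" for byte in data)
        self.pos = 0

    def read(self, count):
        if self.pos + count > len(self.bits):
            raise EOFError("not enough bits left")
        chunk = self.bits[self.pos:self.pos + count]
        self.pos += count
        return int(chunk, 2) if count else 0

    def read_unary(self):
        """Count zero bits up to the terminating one bit."""
        zeros = 0
        while self.read(1) == 0:
            zeros += 1
        return zeros


def decode_verbatim_sample(data, bits_per_sample):
    """Decode one sample of a verbatim subframe; returns (wasted_bits, sample)."""
    reader = BitReader(data)
    if reader.read(1) != 0:
        raise ValueError("expected a zero padding bit")
    if reader.read(6) != 0b000001:
        raise ValueError("not a verbatim subframe")
    wasted = reader.read_unary() + 1 if reader.read(1) else 0
    width = bits_per_sample - wasted
    raw = reader.read(width)
    if raw >= 1 << (width - 1):   # undo two's complement for negative values
        raw -= 1 << width
    return wasted, raw << wasted  # shift the wasted bits back in


# Bytes 0x34..0x36 as implied by the table rows: 0b00000011, 0b00010010, 0x8b.
print(decode_verbatim_sample(bytes([0x03, 0x12, 0x8B]), 16))
```

With a 14-bit sample length the reader would run past the three bytes shown, which is one way the original typo could have been caught.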
rfc_backmatter.md
Outdated
| 0 | 533 | 533 | -267 | ||
| 0 | 268 | 268 | 134 | ||
| It can be calculated that using a Rice code is more efficient than storing values unencoded. The rice code (excluding the partition order and parameter) takes 197 bits. Storing unencoded, the largest value (-13172) would need 15 bits for storing, so 15*15 = 225 which is larger. |
| It can be calculated that using a Rice code is more efficient than storing values unencoded. The rice code (excluding the partition order and parameter) takes 197 bits. Storing unencoded, the largest value (-13172) would need 15 bits for storing, so 15*15 = 225 which is larger. | |
| It can be calculated that using a Rice code is in this case more efficient than storing values unencoded. The rice code (excluding the partition order and parameter) is 199 bits in length. The largest residual value (-13172) would need 15 bits to be stored unencoded, so storing all 15 samples with 15 bits results in a sequence with a length of 225 bits. |
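The comparison in this paragraph can be reproduced with a few helper functions. This is a sketch with made-up names. The Rice parameter of 10 used in the usage lines is an assumption chosen so that the quotient is 0, as in the two table rows above; only the values 533, -267, 268, 134 and -13172 come from the example.

```python
def fold(residual):
    """Map a signed residual to a non-negative number (zigzag folding)."""
    return 2 * residual if residual >= 0 else -2 * residual - 1


def unfold(folded):
    """Inverse of fold()."""
    return folded >> 1 if folded % 2 == 0 else -((folded + 1) >> 1)


def rice_bits(residuals, parameter):
    """Bits needed to Rice-code the residuals: unary quotient, stop bit, remainder."""
    return sum((fold(r) >> parameter) + 1 + parameter for r in residuals)


def signed_bits(value):
    """Smallest two's complement width that can hold the value."""
    return (value if value >= 0 else ~value).bit_length() + 1


# The two table rows: folded 533 is residual -267, folded 268 is residual 134.
print(unfold(533), unfold(268))
# Cost of storing 15 residuals unencoded at the width of the largest one.
print(15 * signed_bits(-13172))
# Cost of Rice-coding just the two rows above with an assumed parameter of 10.
print(rice_bits([-267, 134], 10))
```

Running `rice_bits` over all 15 residuals of the partition with the actual parameter would give the figure to compare against 225 bits.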
In a previous PR, I added an example to the coded residual section which felt out of place. These examples can provide a more hands-on way to understand the FLAC specification for readers that need it. Also, they can be used by people proofreading this specification to do cross-referencing, as they are 'redundant'.
In the second example, mmark and xml2rfc rendered the following: "the only field is title=Qué (U+00E9)bec", which seemed unclear to me. With this commit, it is rendered as "the only field is TITLE=שלום (U+05E9 U+05DC U+05D5 U+05DD)", which I think is clearer.
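The code point list in that rendering can be regenerated from the string itself, which is a quick way to check it for typos. A tiny sketch, with the word written as escape sequences to sidestep rendering issues:

```python
title = "\u05e9\u05dc\u05d5\u05dd"

# Format every character as U+XXXX, in logical (storage) order.
code_points = " ".join(f"U+{ord(ch):04X}" for ch in title)
print(code_points)
```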
I just force-pushed this PR branch to make it mergeable after #124 was merged. I also rebased all typo fixes into the first commit, and added the 3 example files as FLAC files. I'll probably merge it after having another look at it tomorrow.
In a previous PR (and also in the rewrite), I added an example to the coded residual section which felt out of place. Such an example didn't seem to fit in a formal specification, so the idea came to write an appendix with more thorough examples.
These examples can provide a more hands-on way to understand the FLAC specification for readers that need it. Also, they can be used by people proofreading this specification to do cross-referencing, as they are 'redundant'.
If this turns out to be a welcome addition, I would like to add more examples to this appendix in the future, specifically:
Please provide feedback