# Introduction

EBML, short for Extensible Binary Meta Language, specifies a binary format
aligned with octets (bytes) and inspired by the principle of XML (a framework for
structuring data).

The goal of this document is to define a generic, binary, space-efficient
format that can be used to define more complex formats using an EBML
Schema. EBML is used by the multimedia container, Matroska
[@?I-D.ietf-cellar-matroska]. The applicability of EBML for other
use cases is beyond the scope of this document.

The definition of the EBML format recognizes the idea behind HTML and XML
as a good one: separate structure and semantics, allowing the same structural
layer to be used with multiple, possibly widely differing, semantic
layers. Except for the EBML Header and a few Global Elements, this
specification does not define particular EBML format semantics; however, this
specification is intended to define how other EBML-based formats can be
defined, such as the audio/video container formats Matroska and WebM [@?WebM].

EBML uses a simple approach of building Elements upon three pieces of data (tag, length, and value), as this approach is well known, easy to parse, and allows selective data parsing. The EBML structure additionally allows for hierarchical arrangement to support complex structural formats in an efficient manner.

A typical EBML file has the following structure:

~~~
EBML Header (master)
  + DocType (string)
  + DocTypeVersion (unsigned integer)
EBML Body Root (master)
  + ElementA (utf-8)
  + Parent (master)
    + ElementB (integer)
  + Parent (master)
    + ElementB (integer)
~~~

# Notation and Conventions

The key words "**MUST**", "**MUST NOT**",
"**REQUIRED**", "**SHALL**", "**SHALL NOT**",
"**SHOULD**", "**SHOULD NOT**",
"**RECOMMENDED**", "**NOT RECOMMENDED**",
"**MAY**", and "**OPTIONAL**" in this document are to be interpreted as
described in BCP 14 [@!RFC2119] [@!RFC8174]
when, and only when, they appear in all capitals, as shown here.

This document defines specific terms in order to define the format and
application of `EBML`. Specific terms are defined below:

`EBML`:
: Extensible Binary Meta Language

`EBML Document Type`:
: A name provided by an `EBML Schema` to designate a particular
implementation of `EBML` for a data format (e.g., Matroska and WebM).

`EBML Schema`:
: A standardized definition for the structure of an `EBML Document Type`.

`EBML Document`:
: A datastream comprised of only two components, an `EBML Header` and
an `EBML Body`.

`EBML Reader`:
: A data parser that interprets the semantics of an `EBML Document`
and creates a way for programs to use `EBML`.

`EBML Stream`:
: A file that consists of one or more `EBML Documents` that are
concatenated together.

`EBML Header`:
: A declaration that provides processing instructions and identification of
the `EBML Body`. The `EBML Header` is analogous to an XML Declaration
[@!XML] (see Section 2.8 on "Prolog and Document Type Declaration").

`EBML Body`:
: All data of an `EBML Document` following the `EBML Header`.

`Variable-Size Integer`:
: A compact variable-length binary value that defines its own length.

`VINT`:
: Also known as `Variable-Size Integer`.

`EBML Element`:
: A foundation block of data that contains three parts: an `Element ID`,
an `Element Data Size`, and `Element Data`.

`Element ID`:
: A binary value, encoded as a `Variable-Size Integer`,
used to uniquely identify a defined `EBML Element` within a specific
`EBML Schema`.

`Element Data Size`:
: An expression, encoded as a `Variable-Size Integer`, of the length
in octets of `Element Data`.

`VINTMAX`:
: The maximum possible value that can be stored as `Element Data Size`.

`Unknown-Sized Element`:
: An `Element` with an unknown `Element Data Size`.

`Element Data`:
: The value(s) of the `EBML Element`, which is identified by its
`Element ID` and `Element Data Size`. The form of the `Element Data` is
defined by this document and the corresponding `EBML Schema` of the
Element's `EBML Document Type`.

`Root Level`:
: The starting level in the hierarchy of an `EBML Document`.

`Root Element`:
: A mandatory, nonrepeating `EBML Element` that occurs at the top
level of the path hierarchy within an `EBML Body` and contains all other
`EBML Elements` of the `EBML Body`, excepting optional `Void Elements`.

`Top-Level Element`:
: An `EBML Element` defined to only occur as a `Child Element`
of the `Root Element`.

`Master Element`:
: The `Master Element` contains zero, one, or many other `EBML Elements`.

`Child Element`:
: A `Child Element` is a relative term to describe the `EBML Elements`
immediately contained within a `Master Element`.

`Parent Element`:
: A relative term to describe the `Master Element` that contains a
specified element. For any specified `EBML Element` that is not at
`Root Level`, the `Parent Element` refers to the `Master Element`
in which that `EBML Element` is directly contained.

`Descendant Element`:
: A relative term to describe any `EBML Elements` contained within a
`Master Element`, including any of the `Child Elements` of its
`Child Elements`, and so on.

`Void Element`:
: An `Element` used to overwrite data or
reserve space within a `Master Element` for later use.

`Element Name`:
: The human-readable name of the `EBML Element`.

`Element Path`:
: The hierarchy of `Parent Element` where the `EBML Element`
is expected to be found in the `EBML Body`.

`Empty Element`:
: An `EBML Element` that has an `Element Data Size` with all
`VINT_DATA` bits set to zero, which indicates
that the `Element Data` of the `Element` is zero octets in
length.

# Structure

EBML uses a system of Elements to compose an EBML Document. EBML Elements incorporate three parts: an Element ID, an Element Data Size, and Element Data. The Element Data, which is described by the Element ID, includes either binary data, one or more other EBML Elements, or both.

# Variable-Size Integer

The Element ID and Element Data Size are both encoded as a Variable-Size
Integer. The Variable-Size Integer is composed of a VINT\_WIDTH, VINT\_MARKER,
and VINT\_DATA, in that order. Variable-Size Integers **MUST**
left-pad the VINT\_DATA value with zero bits so that the whole Variable-Size
Integer is octet aligned. The Variable-Size Integer will be referred to as
"VINT" for shortness.

## VINT_WIDTH

Each Variable-Size Integer starts with a VINT\_WIDTH followed by a
VINT\_MARKER. VINT\_WIDTH is a sequence of zero or more bits of value `0`
and is terminated by the VINT\_MARKER, which is a single bit of value
`1`. The total length in bits of both VINT\_WIDTH and VINT\_MARKER is the
total length in octets of the Variable-Size Integer.

The single bit `1` starts a Variable-Size Integer with a length of
one octet. The sequence of bits `01` starts a Variable-Size Integer
with a length of two octets. `001` starts a Variable-Size Integer with
a length of three octets, and so on, with each additional `0` bit adding one
octet to the length of the Variable-Size Integer.
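
As a rough illustration of this rule, the following Python sketch derives the octet length of a Variable-Size Integer from its first octet. The function name `vint_length` is hypothetical and not part of this document, and the sketch assumes lengths of at most eight octets, so the VINT\_MARKER always falls within the first octet.

```python
def vint_length(first_octet):
    """Return the octet length (1 to 8) of a VINT from its first octet."""
    if not 0 <= first_octet <= 0xFF:
        raise ValueError("first_octet must be in the range 0..255")
    if first_octet == 0:
        # Assumption of this sketch: VINTs longer than 8 octets are not handled.
        raise ValueError("no VINT_MARKER found in the first octet")
    # The leading zero bits form VINT_WIDTH; the first 1 bit is VINT_MARKER.
    leading_zero_bits = 8 - first_octet.bit_length()
    return leading_zero_bits + 1
```

Under these assumptions, an octet starting with the bits `01` reports a length of two octets, and one starting with `001` reports three.
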

## VINT_MARKER

The VINT\_MARKER serves as a separator between the VINT\_WIDTH and VINT\_DATA. Each Variable-Size Integer **MUST** contain exactly one VINT\_MARKER. The VINT\_MARKER is one bit in length and contains a bit with a value of one. The first bit with a value of one within the Variable-Size Integer is the VINT\_MARKER.

## VINT_DATA

The VINT\_DATA portion of the Variable-Size Integer includes all data following
(but not including) the VINT\_MARKER until the end of the Variable-Size
Integer, whose length is derived from the VINT\_WIDTH. The bits required for the
VINT\_WIDTH and the VINT\_MARKER use one out of every eight bits of the total
length of the Variable-Size Integer. Thus, a Variable-Size Integer of 1-octet
length supplies 7 bits for VINT\_DATA, a 2-octet length supplies 14 bits for
VINT\_DATA, and a 3-octet length supplies 21 bits for VINT\_DATA. If the number
of bits required for VINT\_DATA is less than the bit size of VINT\_DATA, then
VINT\_DATA **MUST** be zero-padded to the left to a size that
fits. The VINT\_DATA value **MUST** be expressed as a big-endian
unsigned integer.

## VINT Examples

[@tableUsableBits] shows examples of Variable-Size
Integers with lengths from 1 to 5 octets. The "Usable Bits" column refers to the
number of bits that can be used in the VINT\_DATA. The "Representation" column
depicts a binary expression of Variable-Size Integers where VINT\_WIDTH is
depicted by `0`, the VINT\_MARKER as `1`, and the VINT\_DATA as
`x`.

Octet Length | Usable Bits | Representation
-------------|-------------|:-------------------------------------------------
1            | 7           | 1xxx xxxx
2            | 14          | 01xx xxxx xxxx xxxx
3            | 21          | 001x xxxx xxxx xxxx xxxx xxxx
4            | 28          | 0001 xxxx xxxx xxxx xxxx xxxx xxxx xxxx
5            | 35          | 0000 1xxx xxxx xxxx xxxx xxxx xxxx xxxx xxxx xxxx
Table: VINT examples depicting usable bits{#tableUsableBits}

A Variable-Size Integer may be rendered at octet lengths larger
than needed to store the data in order to facilitate overwriting it at a later
date -- e.g., when its final size isn't known in advance. In
[@tableVariousSizes], an integer `2` (with a
corresponding binary value of 0b10) is shown encoded as different Variable-Size
Integers with lengths from one octet to four octets. All four encoded
examples have identical semantic meaning, though the VINT\_WIDTH and the padding
of the VINT\_DATA vary.

Integer | Octet Length | As Represented in VINT (binary)         | As Represented in VINT (hexadecimal)
--------|--------------|----------------------------------------:|------------------------------------:
2       | 1            |                               1000 0010 |                                 0x82
2       | 2            |                     0100 0000 0000 0010 |                               0x4002
2       | 3            |           0010 0000 0000 0000 0000 0010 |                             0x200002
2       | 4            | 0001 0000 0000 0000 0000 0000 0000 0010 |                           0x10000002
Table: VINT examples depicting the same
integer value rendered at different VINT lengths{#tableVariousSizes}
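
The padding behavior shown in the table can be reproduced with a short Python sketch. The helper names `encode_vint` and `decode_vint` are illustrative only and are not defined by this document; the sketch assumes lengths of one to eight octets.

```python
def encode_vint(value, length):
    """Encode `value` as a VINT of exactly `length` octets (1 to 8)."""
    if not 1 <= length <= 8:
        raise ValueError("length must be between 1 and 8 octets")
    data_bits = 7 * length  # each octet contributes 7 bits of VINT_DATA
    if not 0 <= value < (1 << data_bits):
        raise ValueError("value does not fit in the requested length")
    marker = 1 << data_bits  # VINT_MARKER sits just above VINT_DATA
    return (marker | value).to_bytes(length, "big")


def decode_vint(data):
    """Decode a VINT from the start of `data`; return (value, length)."""
    if not data or data[0] == 0:
        raise ValueError("expected a VINT of one to eight octets")
    length = 8 - data[0].bit_length() + 1
    if len(data) < length:
        raise ValueError("truncated VINT")
    raw = int.from_bytes(data[:length], "big")
    value = raw & ((1 << (7 * length)) - 1)  # strip VINT_WIDTH and VINT_MARKER
    return value, length
```

Under these assumptions, `encode_vint(2, 1)` yields the single octet `0x82` and `encode_vint(2, 4)` yields `0x10000002`, matching the rows of the table, and all four encodings decode back to the integer 2.
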

# Element ID

An Element ID is a Variable-Size Integer. By default, Element IDs are from
one octet to four octets in length, although Element IDs of greater lengths
**MAY** be used if the EBMLMaxIDLength Element of the EBML Header
is set to a value greater than four (see
(#ebmlmaxidlength-element)). The bits of the VINT\_DATA component
of the Element ID **MUST NOT** be all
`0` values or all `1` values. The VINT\_DATA component of the Element ID
**MUST** be encoded at the shortest valid length. For example, an
Element ID with binary encoding of `1011 1111` is valid, whereas an
Element ID with binary encoding of `0100 0000 0011 1111` stores a
semantically equal VINT\_DATA but is invalid, because a shorter VINT encoding is
possible. Additionally, an Element ID with binary encoding of `1111 1111`
is invalid, since the VINT\_DATA section is set to all one values,
whereas an Element ID with binary encoding of `0100 0000 0111 1111`
stores a semantically equal VINT\_DATA and is the shortest-possible VINT
encoding.

[@tableElementIDValidity] details these specific examples further:

VINT_WIDTH | VINT_MARKER | VINT_DATA | Element ID Status
-----------:|-------------:|---------------:|:-----------------
| | 1 | 0000000 | Invalid: VINT_DATA **MUST NOT** be set to all 0
0 | 1 | 00000000000000 | Invalid: VINT_DATA **MUST NOT** be set to all 0
| | 1 | 0000001 | Valid
0 | 1 | 00000000000001 | Invalid: A shorter VINT_DATA encoding is available.
| | 1 | 0111111 | Valid
0 | 1 | 00000000111111 | Invalid: A shorter VINT_DATA encoding is available.
| | 1 | 1111111 | Invalid: VINT_DATA **MUST NOT** be set to all 1
0 | 1 | 00000001111111 | Valid
Table: Examples of valid and invalid Element IDs{#tableElementIDValidity}

The range and count of possible Element IDs are determined by their
octet length. Examples of this are provided in
[@tableElementIDRanges].

Element ID Octet Length | Range of Valid Element IDs | Number of Valid Element IDs
:----------------------:|:--------------------------:|----------------------------:
1                       | 0x81 - 0xFE                | 126
2                       | 0x407F - 0x7FFE            | 16,256
3                       | 0x203FFF - 0x3FFFFE        | 2,080,768
4                       | 0x101FFFFF - 0x1FFFFFFE    | 266,338,304
Table: Examples of count and range for
Element IDs at various octet lengths{#tableElementIDRanges}
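
The validity rules for Element IDs can be checked mechanically. The Python sketch below is one possible reading of those rules, not a normative algorithm; the function name `element_id_is_valid` and the `max_id_length` parameter (standing in for the value of EBMLMaxIDLength) are assumptions made for illustration, and the sketch only handles IDs of up to eight octets.

```python
def element_id_is_valid(encoded, max_id_length=4):
    """Check an encoded Element ID against the rules described above."""
    if not encoded or encoded[0] == 0:
        return False
    length = 8 - encoded[0].bit_length() + 1
    if length != len(encoded) or length > max_id_length:
        return False
    data_bits = 7 * length
    all_ones = (1 << data_bits) - 1
    vint_data = int.from_bytes(encoded, "big") & all_ones
    # VINT_DATA must be neither all zeros nor all ones.
    if vint_data == 0 or vint_data == all_ones:
        return False
    # Shortest-length rule: the value must not fit in (length - 1) octets.
    # The all-ones pattern of the shorter length does not count as fitting,
    # because it would itself be an invalid Element ID.
    if length > 1 and vint_data < (1 << (7 * (length - 1))) - 1:
        return False
    return True
```

With the default limit of four octets, this sketch accepts `0xBF` and `0x407F` and rejects `0x403F`, `0xFF`, and `0x80`, in line with the examples in the tables.
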

# Element Data Size

## Data Size Format

The Element Data Size expresses the length in octets of Element Data. The
Element Data Size itself is encoded as a Variable-Size Integer. By default,
Element Data Sizes can be encoded in lengths from one octet to eight octets,
although Element Data Sizes of greater lengths **MAY** be used if
the octet length of the longest Element Data Size of the EBML Document is
declared in the EBMLMaxSizeLength Element of the EBML Header (see
(#ebmlmaxsizelength-element)). Unlike the VINT\_DATA of the
Element ID, the VINT\_DATA component of the Element Data Size is not mandated
to be encoded at the shortest valid length. For example, an Element Data Size
with binary encoding of 1011 1111 or a binary encoding of 0100 0000 0011 1111
are both valid Element Data Sizes and both store a semantically equal value
(both 0b00000000111111 and 0b0111111, the VINT\_DATA sections of the examples,
represent the integer 63).

Although an Element ID with all VINT\_DATA bits set to zero is invalid, an
Element Data Size with all VINT\_DATA bits set to zero is allowed for EBML
Element Types that do not mandate a nonzero length (see
(#ebml-element-types)). An Element Data Size with all VINT\_DATA
bits set to zero indicates that the Element Data is zero octets in
length. Such an EBML Element is referred to as an Empty Element. If an Empty
Element has a default value declared, then the EBML Reader **MUST**
interpret the value of the Empty Element as the default value. If an Empty
Element has no default value declared, then the EBML Reader **MUST**
use the value of the Empty Element for the corresponding EBML Element Type of
the Element ID, 0 for numbers and an empty string for strings.

## Unknown Data Size

An Element Data Size with all VINT\_DATA bits set to one is reserved as an
indicator that the size of the EBML Element is unknown. The only reserved
value for the VINT\_DATA of Element Data Size is all bits set to one. An EBML
Element with an unknown Element Data Size is referred to as an Unknown-Sized
Element. Only a Master Element is allowed to be of unknown size, and it can
only be so if the `unknownsizeallowed` attribute of its EBML Schema is
set to true (see (#unknownsizeallowed)).
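
A reader therefore has to distinguish three cases when decoding an Element Data Size: a zero size, an ordinary size, and the reserved unknown-size value. The following Python sketch shows one way to do this; the function name `decode_data_size` and the use of `None` to signal an unknown size are choices made for this illustration, and the sketch assumes sizes encoded in at most eight octets.

```python
def decode_data_size(data):
    """Decode an Element Data Size from the start of `data`.

    Returns (size, length). `size` is None for the reserved unknown-size
    value; `length` is the number of octets the encoded size occupies.
    """
    if not data or data[0] == 0:
        raise ValueError("expected a VINT of one to eight octets")
    length = 8 - data[0].bit_length() + 1
    if len(data) < length:
        raise ValueError("truncated Element Data Size")
    all_ones = (1 << (7 * length)) - 1
    value = int.from_bytes(data[:length], "big") & all_ones
    if value == all_ones:
        return None, length  # all VINT_DATA bits set to one: unknown size
    return value, length
```

Under these assumptions, both `0xBF` and `0x403F` decode to a size of 63 octets, while `0xFF` and `0x7FFF` are reported as unknown.
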

The use of Unknown-Sized Elements allows an EBML Element to be written and read before the size of the EBML Element is known. Unknown-Sized Elements **MUST** only be used if the Element Data Size is not known before the Element Data is written, such as in some cases of datastreaming. The end of an Unknown-Sized Element is determined by whichever comes first:

- Any EBML Element that is a valid Parent Element of the Unknown-Sized Element according to the EBML Schema, Global Elements excluded.
- Any valid EBML Element according to the EBML Schema, Global Elements
excluded, that is not a Descendant Element of the Unknown-Sized Element but
shares a common direct parent, such as a Top-Level Element.
- Any EBML Element that is a valid Root Element according to the EBML Schema, Global Elements excluded.
- The end of the Parent Element with a known size has been reached.
- The end of the EBML Document, either when reaching the end of the file or because a new EBML Header started.

Consider an Unknown-Sized Element whose EBML path is
`\root\level1\level2\elt`. When reading a new Element ID, and assuming the
EBML Path of that new Element is valid, the following table shows which
new Elements do and do not end `elt`:

EBML Path of new element           | Status
:----------------------------------|:-----------------------------
`\root\level1\level2`              | Ends the Unknown-Sized Element, as it is a new Parent Element
`\root\level1`                     | Ends the Unknown-Sized Element, as it is a new Parent Element
`\root`                            | Ends the Unknown-Sized Element, as it is a new Root Element
`\root2`                           | Ends the Unknown-Sized Element, as it is a new Root Element
`\root\level1\level2\other`        | Ends the Unknown-Sized Element, as they share the same parent
`\root\level1\level2\elt`          | Ends the Unknown-Sized Element, as they share the same parent
`\root\level1\level2\elt\inside`   | Doesn't end the Unknown-Sized Element; it's a child of `elt`
`\root\level1\level2\elt\<global>` | Global Element is valid; it's a child of `elt`
`\root\level1\level2\<global>`     | Global Element cannot be interpreted with this path; while parsing `elt`, a Global Element can only be a child of `elt`
Table: Examples of determining the end of an Unknown-Sized Element
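
The path-based rows of this table reduce to a prefix test, which the Python sketch below illustrates. It is a simplification made under explicit assumptions: the function name `ends_unknown_sized_element` is hypothetical, the reader is assumed to have already resolved a valid EBML Path for the new Element, Global Elements are assumed to have been resolved as children of the element being parsed, and the end-of-parent and end-of-document conditions are not covered.

```python
def ends_unknown_sized_element(unknown_path, new_path):
    """Return True if an element at `new_path` ends the Unknown-Sized
    Element located at `unknown_path`.

    A new element stays inside the Unknown-Sized Element only when its
    path extends the Unknown-Sized Element's own path by at least one
    further level.
    """
    return not new_path.startswith(unknown_path + "\\")
```

For the element `\root\level1\level2\elt`, this sketch reports that a sibling such as `\root\level1\level2\other`, or a second occurrence of `elt` itself, ends the element, while `\root\level1\level2\elt\inside` does not.
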

## Data Size Values

For Element Data Sizes encoded at octet lengths from one to eight,
[@tableVintRangePerLength] depicts the range of possible values
that can be encoded as an Element Data Size. An Element Data Size with an
octet length of 8 is able to express a size of 2^56^-2 or
72,057,594,037,927,934 octets (or about 72 petabytes). The maximum possible
value that can be stored as Element Data Size is referred to as VINTMAX.

Octet Length | Possible Value Range
-------------|---------------------
1            | 0 to 2^7^ - 2
2            | 0 to 2^14^ - 2
3            | 0 to 2^21^ - 2
4            | 0 to 2^28^ - 2
5            | 0 to 2^35^ - 2
6            | 0 to 2^42^ - 2
7            | 0 to 2^49^ - 2
8            | 0 to 2^56^ - 2
Table: Possible range of values that
can be stored in VINTs, by octet length{#tableVintRangePerLength}

If the length of Element Data equals 2^n\*7^-1, then the octet
length of the Element Data Size **MUST** be at least n+1. This rule
prevents an Element Data Size from being expressed as the unknown-size
value. [@tableVintReservation] clarifies this rule by
showing a valid and invalid expression of an Element Data Size with a
VINT\_DATA of 127 (which is equal to 2^1\*7^-1) and 16,383 (which is equal to
2^2\*7^-1).

VINT_WIDTH | VINT_MARKER | VINT_DATA | Element Data Size Status
-----------:|-------------:|----------------------:|---------------------------
| | 1 | 1111111 | Reserved (meaning Unknown)
0 | 1 | 00000001111111 | Valid (meaning 127 octets)
00 | 1 | 000000000000001111111 | Valid (meaning 127 octets)
0 | 1 | 11111111111111 | Reserved (meaning Unknown)
00 | 1 | 000000011111111111111 | Valid (16,383 octets)
Table: Demonstration of VINT_DATA
reservation for VINTs of unknown size{#tableVintReservation}
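
A writer can apply this rule by choosing the smallest octet length whose largest non-reserved value still accommodates the size. The Python sketch below illustrates that choice; the function name `min_size_length` is hypothetical, and the sketch assumes the default maximum of eight octets.

```python
def min_size_length(size):
    """Return the smallest octet length able to encode `size` as an
    Element Data Size without colliding with the unknown-size value."""
    if size < 0:
        raise ValueError("size must not be negative")
    for length in range(1, 9):
        # The all-ones pattern (2**(7*length) - 1) is reserved, so the
        # largest usable value at this length is 2**(7*length) - 2.
        if size <= (1 << (7 * length)) - 2:
            return length
    raise ValueError("size exceeds what eight octets can express")
```

Under these assumptions, a size of 126 fits in one octet, whereas 127 needs two octets and 16,383 needs three, as in the table.
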

# EBML Element Types

EBML Elements are defined by an EBML Schema (see
(#ebml-schema)), which **MUST** declare one of the
following EBML Element Types for each EBML Element. An EBML Element Type
defines a concept of storing data within an EBML Element that describes such
characteristics as length, endianness, and definition.

EBML Elements that are defined as a Signed Integer Element, Unsigned
Integer Element, Float Element, or Date Element use big-endian storage.

## Signed Integer Element

A Signed Integer Element **MUST** declare a length from zero to eight octets. If the EBML Element is not defined to have a default value, then a Signed Integer Element with a zero-octet length represents an integer value of zero.

A Signed Integer Element stores an integer (meaning that it can be written
without a fractional component) that could be negative, positive, or
zero. Signed Integers are stored with two's complement notation with the
leftmost bit being the sign bit. Because EBML limits Signed Integers to 8
octets in length, a Signed Integer Element stores a number from
-9,223,372,036,854,775,808 to +9,223,372,036,854,775,807.

## Unsigned Integer Element

An Unsigned Integer Element **MUST** declare a length from zero to eight octets. If the EBML Element is not defined to have a default value, then an Unsigned Integer Element with a zero-octet length represents an integer value of zero.

An Unsigned Integer Element stores an integer (meaning that it can be
written without a fractional component) that could be positive or
zero. Because EBML limits Unsigned Integers to 8 octets in length, an Unsigned
Integer Element stores a number from 0 to 18,446,744,073,709,551,615.
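
Both integer types map directly onto big-endian decoding. The following Python sketch is a minimal illustration; the function names are hypothetical, and the handling of declared default values for Empty Elements is deliberately left to the caller.

```python
def decode_unsigned_integer(element_data):
    """Decode the Element Data of an Unsigned Integer Element."""
    if len(element_data) > 8:
        raise ValueError("integer elements are limited to eight octets")
    # An empty input decodes to 0, matching the zero-length rule when no
    # default value is declared.
    return int.from_bytes(element_data, "big", signed=False)


def decode_signed_integer(element_data):
    """Decode the Element Data of a Signed Integer Element
    (big-endian two's complement)."""
    if len(element_data) > 8:
        raise ValueError("integer elements are limited to eight octets")
    return int.from_bytes(element_data, "big", signed=True)
```

Under these assumptions, the single octet `0xFF` decodes to 255 as an Unsigned Integer and to -1 as a Signed Integer.
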

## Float Element

A Float Element **MUST** declare a length of either zero octets
(0 bits), four octets (32 bits), or eight octets (64 bits). If the EBML Element
is not defined to have a default value, then a Float Element with a zero-octet
length represents a numerical value of zero.

A Float Element stores a floating-point number in the 32-bit or 64-bit
binary interchange format, as defined in [@!IEEE.754].
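
As an illustration, the three permitted lengths can be handled with Python's standard `struct` module. The function name `decode_float` is hypothetical, and the sketch assumes that no default value is declared for the element.

```python
import struct


def decode_float(element_data):
    """Decode the Element Data of a Float Element."""
    if len(element_data) == 0:
        return 0.0  # zero-length Float Element without a declared default
    if len(element_data) == 4:
        return struct.unpack(">f", element_data)[0]  # big-endian binary32
    if len(element_data) == 8:
        return struct.unpack(">d", element_data)[0]  # big-endian binary64
    raise ValueError("a Float Element must be 0, 4, or 8 octets long")
```
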

## String Element

A String Element **MUST** declare a length in octets from zero to VINTMAX. If the EBML Element is not defined to have a default value, then a String Element with a zero-octet length represents an empty string.

A String Element **MUST** either be empty (zero-length) or contain printable ASCII characters [@!RFC0020] in the range of 0x20 to 0x7E, with an exception made for termination (see (#terminating-elements)).

## UTF-8 Element

A UTF-8 Element **MUST** declare a length in octets from zero to VINTMAX. If the EBML Element is not defined to have a default value, then a UTF-8 Element with a zero-octet length represents an empty string.

A UTF-8 Element contains only a valid Unicode string as defined in [@!RFC3629], with an exception made for termination (see (#terminating-elements)).

## Date Element

A Date Element **MUST** declare a length of either zero octets or eight octets. If the EBML Element is not defined to have a default value, then a Date Element with a zero-octet length represents a timestamp of 2001-01-01T00:00:00.000000000 UTC [@!RFC3339].

The Date Element stores an integer in the same format as the Signed Integer Element that expresses a point in time referenced in nanoseconds from the precise beginning of the third millennium of the Gregorian Calendar in Coordinated Universal Time (also known as 2001-01-01T00:00:00.000000000 UTC). This provides a possible expression of time from 1708-09-11T00:12:44.854775808 UTC to 2293-04-11T11:47:16.854775807 UTC.
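
The conversion from a Date Element to a calendar timestamp can be sketched in Python as follows. The names `EBML_EPOCH` and `decode_date` are hypothetical. Note one limitation of this sketch: Python's `datetime` only has microsecond resolution, so the nanosecond value is rounded down to the preceding microsecond.

```python
from datetime import datetime, timedelta, timezone

# Reference point of the Date Element: 2001-01-01T00:00:00 UTC.
EBML_EPOCH = datetime(2001, 1, 1, tzinfo=timezone.utc)


def decode_date(element_data):
    """Decode the Element Data of a Date Element into a UTC datetime."""
    if len(element_data) not in (0, 8):
        raise ValueError("a Date Element must be 0 or 8 octets long")
    # Same storage format as a Signed Integer Element; empty data gives 0.
    nanoseconds = int.from_bytes(element_data, "big", signed=True)
    # Floor division drops sub-microsecond precision.
    return EBML_EPOCH + timedelta(microseconds=nanoseconds // 1000)
```

Under these assumptions, an eight-octet value of 1,000,000,000 decodes to one second past the reference point, and a zero-length Date Element decodes to the reference point itself.
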

## Master Element

A Master Element **MUST** declare a length in octets from zero to VINTMAX or be of unknown length. See (#element-data-size) for rules that apply to elements of unknown length.

The Master Element contains zero or more other elements. EBML Elements contained within a Master Element **MUST** have the EBMLParentPath of their Element Path equal to the EBMLFullPath of the Master Element's Element Path (see (#path)). Element Data stored within Master Elements **SHOULD** only consist of EBML Elements and **SHOULD NOT** contain any data that is not part of an EBML Element. The EBML Schema identifies what Element IDs are valid within the Master Elements for that version of the EBML Document Type. Any data contained within a Master Element that is not part of a Child Element **MUST** be ignored.

## Binary Element

A Binary Element **MUST** declare a length in octets from zero to VINTMAX.

The contents of a Binary Element should not be interpreted by the EBML Reader.

# EBML Document

An EBML Document is composed of only two components, an EBML Header and an EBML Body. An EBML Document **MUST** start with an EBML Header that declares significant characteristics of the entire EBML Body. An EBML Document consists of EBML Elements and **MUST NOT** contain any data that is not part of an EBML Element.

## EBML Header

The EBML Header is a declaration that provides processing instructions and identification of the EBML Body. The EBML Header of an EBML Document is analogous to the XML Declaration of an XML Document.

The EBML Header documents the EBML Schema (also known as the EBML DocType)
that is used to semantically interpret the structure and meaning of the EBML
Document. Additionally, the EBML Header documents the versions of both EBML
and the EBML Schema that were used to write the EBML Document and the versions
required to read the EBML Document.

The EBML Header **MUST** contain a single Master Element with an
Element Name of `EBML` and Element ID of `0x1A45DFA3` (see
(#ebml-element)); the Master Element may have any number of
additional EBML Elements within it. The EBML Header of an EBML Document that
uses an EBMLVersion of 1 **MUST** only contain EBML Elements that
are defined as part of this document.

Elements within an EBML Header can be at most 4 octets long, except for the
EBML Element with Element Name `EBML` and Element ID
`0x1A45DFA3` (see (#ebml-element)); this Element can be up
to 8 octets long.

## EBML Body

All data of an EBML Document following the EBML Header is the EBML Body. The end of the EBML Body, as well as the end of the EBML Document that contains the EBML Body, is reached at whichever comes first: the beginning of a new EBML Header at the Root Level or the end of the file. This document defines precisely which EBML Elements are to be used within the EBML Header but does not name or define which EBML Elements are to be used within the EBML Body. The EBML Elements to be used within the EBML Body are defined by an EBML Schema.

Within the EBML Body, the maximum octet length allowed for any Element ID
is set by the EBMLMaxIDLength Element of the EBML Header, and the maximum octet
length allowed for any Element Data Size is set by the EBMLMaxSizeLength
Element of the EBML Header.

# EBML Stream

An EBML Stream is a file that consists of one or more EBML Documents that are concatenated together. An occurrence of an EBML Header at the Root Level marks the beginning of an EBML Document.

# EBML Versioning

An EBML Document handles two different versions: the version of the EBML Header and the version of the EBML Body. Both versions are meant to be backward compatible.

## EBML Header Version

The version of the EBML Header is found in EBMLVersion. An EBML parser can read an EBML Header if it can read either the EBMLVersion version or a version equal to or higher than the one found in EBMLReadVersion.

## EBML Document Version

The version of the EBML Body is found in DocTypeVersion. A parser for the particular DocType format can read the EBML Document if it can read either the DocTypeVersion version of that format or a version equal to or higher than the one found in DocTypeReadVersion.
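
Both versioning rules follow the same pattern, which can be sketched in Python as below. This is one possible reading rather than a normative test: the function name `can_read` is hypothetical, and the sketch assumes that a parser is characterized by the highest version it implements and, because versions are meant to be backward compatible, that it can also read every lower version.

```python
def can_read(supported_version, written_version, read_version):
    """Decide whether a parser can read a header or body.

    supported_version: highest version the parser implements (assumption).
    written_version:   EBMLVersion or DocTypeVersion from the EBML Header.
    read_version:      EBMLReadVersion or DocTypeReadVersion from the header.
    """
    return supported_version >= written_version or supported_version >= read_version
```

For example, under these assumptions a parser that implements version 2 of a DocType can read a document written with DocTypeVersion 4 and DocTypeReadVersion 2, whereas a parser that only implements version 1 cannot.
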

# Elements semantics

## EBML Schema

An EBML Schema is a well-formed XML Document
[@!XML] that defines the properties,
arrangement, and usage of EBML Elements that compose a specific EBML Document
Type. The relationship of an EBML Schema to an EBML Document is analogous to
the relationship of an XML Schema [@!XML-SCHEMA] to an XML
Document [@!XML]. An EBML Schema
**MUST** be clearly associated with one or more EBML Document
Types. An EBML Document Type is identified by a string stored within the EBML
Header in the DocType Element -- for example, Matroska or WebM (see
(#doctype-element)). The DocType value for an EBML Document Type
**MUST** be unique, persistent, and described in the IANA registry
(see (#ebml-doctypes-registry)).

An EBML Schema **MUST** declare exactly one EBML Element at Root Level (referred to as the Root Element) that occurs exactly once within an EBML Document. The Void Element **MAY** also occur at Root Level but is not a Root Element (see (#void-element)).

The EBML Schema **MUST** document all Elements of the EBML
Body. The EBML Schema does not document Global Elements that are defined by
this document (namely, the Void Element and the CRC-32 Element).

The EBML Schema **MUST NOT** use the Element ID
`0x1A45DFA3`, which is reserved for the EBML Header for the purpose of
resynchronization.

An EBML Schema **MAY** constrain the use of EBML Header Elements
(see (#ebml-header-elements)) by adding or constraining
that Element's `range` attribute. For example, an EBML Schema
**MAY** constrain the EBMLMaxSizeLength to a maximum value of
`8` or **MAY** constrain the EBMLVersion to only support a
value of `1`. If an EBML Schema adopts the EBML Header Element as is,
then it is not required to document that Element within the EBML Schema. If an
EBML Schema constrains the range of an EBML Header Element, then that Element
**MUST** be documented within an `<element>` node of
the EBML Schema. This document provides an example of an EBML Schema; see
(#ebml-schema-example).

### EBML Schema Example

<{{ebml_schema_example.xml}}

### `<EBMLSchema>` Element

Within an EBML Schema, the XPath [@?XPath] of the
`<EBMLSchema>` element is `/EBMLSchema`.

When used as an XML Document, the EBML Schema **MUST** use
`<EBMLSchema>` as the top-level element. The
`<EBMLSchema>` element can contain `<element>`
subelements.

### `<EBMLSchema>` Namespace

The namespace URI for elements of the EBML Schema is a URN as defined by
[@!RFC8141] that uses the namespace identifier 'ietf' defined by [@!RFC2648]
and extended by [@!RFC3688]. This URN is `urn:ietf:rfc:8794`.

### `<EBMLSchema>` Attributes

Within an EBML Schema, the `<EBMLSchema>` element uses the
following attributes to define an EBML Element:

#### docType

Within an EBML Schema, the XPath of the `@docType` attribute is
`/EBMLSchema/@docType`.

The docType lists the official name of the EBML Document Type that is
defined by the EBML Schema; for example,
`<EBMLSchema docType="matroska">`.

The `docType` attribute is **REQUIRED** within the
`<EBMLSchema>` Element.

#### version

Within an EBML Schema, the XPath of the `@version` attribute is
`/EBMLSchema/@version`.

The version lists a nonnegative integer that specifies the version of the
docType documented by the EBML Schema. Unlike XML Schemas, an EBML Schema
documents all versions of a docType's definition rather than using separate
EBML Schemas for each version of a docType. EBML Elements may be introduced
and deprecated by using the `minver` and `maxver` attributes of
`<element>`.

The `version` attribute is **REQUIRED** within the
`<EBMLSchema>` Element.

#### ebml

Within an EBML Schema, the XPath of the `@ebml` attribute is `/EBMLSchema/@ebml`.

The `ebml` attribute is a positive integer that specifies the
version of the EBML Header (see (#ebmlversion-element))
used by the EBML Schema. If the attribute is omitted, the EBML Header version
is 1.

### `<element>` Element

Within an EBML Schema, the XPath of the `<element>` element is
`/EBMLSchema/element`.

Each `<element>` defines one EBML Element through the use of
several attributes that are defined in
(#element-attributes). EBML Schemas **MAY** contain
additional attributes to extend the semantics but **MUST NOT**
conflict with the definitions of the `<element>` attributes
defined within this document.

The `<element>` nodes contain a description of the meaning and use of the EBML Element stored within one or more `<documentation>` subelements, followed by optional `<implementation_note>` subelements, followed by zero or one `<restriction>` subelement, followed by optional `<extension>` subelements. All `<element>` nodes **MUST** be subelements of the `<EBMLSchema>`.

### `<element>` Attributes

Within an EBML Schema, the `<element>` uses the following
attributes to define an EBML Element:

#### name

Within an EBML Schema, the XPath of the `@name` attribute is
`/EBMLSchema/element/@name`.

The name provides the human-readable name of the EBML Element. The value of
the name **MUST** be in the form of characters "A" to
"Z", "a" to "z", "0" to "9",
"-", and ".". The first character of the name
**MUST** be in the form of an "A" to "Z",
"a" to "z", or "0" to "9" character.

The `name` attribute is **REQUIRED**.

Within an EBML Schema, the XPath of the `@path` attribute is

The path defines the allowed storage locations of the EBML Element within
an EBML Document. This path **MUST** be defined with the full
hierarchy of EBML Elements separated with a `\`. The top EBML Element
in the path hierarchy is the first in the value. The syntax of the
`path` attribute is defined using this Augmented Backus-Naur Form
(ABNF) [@!RFC5234] with the case-sensitive update

The `path` attribute is **REQUIRED**.

~~~ abnf
EBMLFullPath = EBMLParentPath EBMLElement

EBMLParentPath = PathDelimiter [EBMLParents]

EBMLParents = 0*IntermediatePathAtom EBMLLastParent
IntermediatePathAtom = EBMLPathAtom / GlobalPlaceholder
EBMLLastParent = EBMLPathAtom / GlobalPlaceholder

EBMLPathAtom = [IsRecursive] EBMLAtomName PathDelimiter
EBMLElement = [IsRecursive] EBMLAtomName

PathDelimiter = "\"
IsRecursive = "+"
EBMLAtomName = (ALPHA / DIGIT) 0*EBMLNameChar
EBMLNameChar = ALPHA / DIGIT / "-" / "."

GlobalPlaceholder = "(" GlobalParentOccurrence "\)"
GlobalParentOccurrence = [PathMinOccurrence] "-" [PathMaxOccurrence]
PathMinOccurrence = 1*DIGIT ; no upper limit
PathMaxOccurrence = 1*DIGIT ; no upper limit
~~~

The `*`, `(`, and `)` symbols are interpreted as defined in [@!RFC5234].
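
As an illustration, the grammar above can be approximated with a regular expression. The following Python sketch is not part of this specification: the names `EBML_PATH_RE` and `is_valid_ebml_path` are assumptions of this example, and the check is purely syntactic (it does not verify the constraints on PathMinOccurrence and PathMaxOccurrence values).

```python
import re

# [IsRecursive] EBMLAtomName: optional "+", one ALPHA/DIGIT, then name characters.
_ATOM = r"\+?[A-Za-z0-9][A-Za-z0-9.\-]*"
# GlobalPlaceholder: "(" [PathMinOccurrence] "-" [PathMaxOccurrence] "\)"
_GLOBAL = r"\(\d*-\d*\\\)"

# EBMLFullPath: a leading "\", any number of parent parts, then the element name.
EBML_PATH_RE = re.compile(r"\\(?:" + _ATOM + r"\\|" + _GLOBAL + r")*" + _ATOM)


def is_valid_ebml_path(path: str) -> bool:
    """Return True if `path` has the syntax of an EBMLFullPath."""
    return EBML_PATH_RE.fullmatch(path) is not None


print(is_valid_ebml_path("\\EBML\\DocType"))   # path taken from this document
print(is_valid_ebml_path("EBML\\DocType"))     # lacks the starting PathDelimiter
```

A schema tool would likely run such a check before interpreting the individual parts of a `@path` value.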

The EBMLAtomName of the EBMLElement part **MUST** be equal to the `@name` attribute of the EBML Schema.
If the EBMLElement part contains an IsRecursive part, the EBML Element can occur within itself recursively (see (#recursive)).

The starting PathDelimiter of EBMLParentPath corresponds to the root of the EBML Document.

The `@path` value **MUST** be unique within the EBML Schema. The `@id` value corresponding to this `@path` **MUST NOT** be defined for use within another EBML Element with the same EBMLParentPath as this `@path`.

A path with a GlobalPlaceholder as the EBMLLastParent defines a Global Element; see (#global-elements).
If the element has no EBMLLastParent part, or the EBMLLastParent part is not a
GlobalPlaceholder, then the Element is not a Global Element.

The GlobalParentOccurrence part is interpreted as the number of valid
EBMLPathAtom parts that can replace the GlobalPlaceholder in the path.
PathMinOccurrence represents the minimum number of EBMLPathAtoms required to
replace the GlobalPlaceholder. PathMaxOccurrence represents the maximum number
of EBMLPathAtoms possible to replace the GlobalPlaceholder.

If PathMinOccurrence is not present, then that GlobalParentOccurrence has a

If PathMaxOccurrence is not present, then there is no upper bound for the
permitted number of EBMLPathAtoms possible to replace the GlobalPlaceholder.
PathMaxOccurrence **MUST NOT** have the value 0, as it would mean
no EBMLPathAtom can replace the GlobalPlaceholder, and the EBMLFullPath would
be the same without that GlobalPlaceholder part.
PathMaxOccurrence **MUST** be bigger than, or equal to,

For example, in `\a\(0-1\)global`, the Element path
`\a\x\global` corresponds to an EBMLPathAtom occurrence of 1. The
Element `\a\x\y\global` corresponds to an EBMLPathAtom occurrence of 2,
etc. In those cases, `\a\x` or `\a\x\y` **MUST** be valid
paths to be able to contain the element `global`.

Consider another EBML Path, `\a\(1-\)global`. There has to be at
least one EBMLPathAtom between the `\a\` part and `global`.
So the `global` EBML Element cannot be found inside the `\a`
EBML Element, as it means the resulting path `\a\global` has no
EBMLPathAtom between the `\a\` and `global`. However, the
`global` EBML Element can be found inside the `\a\b` EBML
Element, because the resulting path, `\a\b\global`, has one EBMLPathAtom
between the `\a\` and `global`. Alternatively, it can be found
inside the `\a\b\c` EBML Element (two EBMLPathAtoms), or inside the
`\a\b\c\d` EBML Element (three EBMLPathAtoms), etc.
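
The counting rule in these examples can be sketched in Python. This is a minimal illustration under stated assumptions, not a normative algorithm: the function name `global_element_allowed_in` is hypothetical, the schema path is assumed to contain exactly one GlobalPlaceholder, and an absent PathMinOccurrence is treated as 0.

```python
import re

# GlobalPlaceholder: "(" [PathMinOccurrence] "-" [PathMaxOccurrence] "\)"
_PLACEHOLDER = re.compile(r"\((\d*)-(\d*)\\\)")


def global_element_allowed_in(schema_path: str, parent: str) -> bool:
    r"""Return True if the Global Element may be stored inside `parent`.

    schema_path: a path with one GlobalPlaceholder, such as \a\(1-\)global
    parent: concrete path of the candidate Parent Element, such as \a\b
    """
    match = _PLACEHOLDER.search(schema_path)
    if match is None:
        raise ValueError("schema_path contains no GlobalPlaceholder")
    prefix = schema_path[: match.start()]  # fixed part before the placeholder
    # Assumption of this sketch: an absent PathMinOccurrence counts as 0.
    min_occ = int(match.group(1) or 0)
    max_occ = int(match.group(2)) if match.group(2) else None  # None: no limit

    candidate = parent if parent.endswith("\\") else parent + "\\"
    if not candidate.startswith(prefix):
        return False
    # Each remaining delimiter closes one EBMLPathAtom replacing the placeholder.
    extra = candidate[len(prefix):].count("\\")
    return extra >= min_occ and (max_occ is None or extra <= max_occ)


print(global_element_allowed_in(r"\a\(1-\)global", r"\a"))    # no atom in between
print(global_element_allowed_in(r"\a\(1-\)global", r"\a\b"))  # one atom in between
```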

Consider another EBML Path, `\a\(0-1\)global`. There has to be at
most one EBMLPathAtom between the `\a\` part and `global`.
So the `global` EBML Element can be found inside either the `\a`
EBML Element (zero EBMLPathAtoms replacing the GlobalPlaceholder) or the
`\a\b` EBML Element (one replacement EBMLPathAtom).
But it cannot be found inside the `\a\b\c` EBML Element, because the
resulting path, `\a\b\c\global`, has two EBMLPathAtoms between

Within an EBML Schema, the XPath of the `@id` attribute is

The Element ID is encoded as a Variable-Size Integer. It is read and stored in big-endian
order. In the EBML Schema, it is expressed in
hexadecimal notation prefixed by a 0x. To reduce the risk of false positives while parsing EBML Streams, the
Element IDs of the Root Element and Top-Level Elements **SHOULD**
be at least 4 octets in length. Element IDs defined for use at Root Level or
directly under the Root Level **MAY** use shorter octet lengths to
facilitate padding and optimize edits to EBML Documents; for instance, the
Void Element uses an Element ID with a length of one octet to allow its usage
in more writing and editing scenarios.

The Element ID of any Element found within an EBML Document **MUST** only match a single `@path` value of its corresponding EBML Schema, but a separate instance of that Element ID value defined by the EBML Schema **MAY** occur within a different `@path`. If more than one Element is defined to use the same `@id` value, then the `@path` values of those Elements **MUST NOT** share the same EBMLParentPath. Elements **MUST NOT** be defined to use the same `@id` value if one of their common Parent Elements could be an Unknown-Sized Element.

The `id` attribute is **REQUIRED**.
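
A schema checker could detect Elements that reuse an `@id` under the same EBMLParentPath roughly as follows. This Python sketch is illustrative only; the helper names and the `(path, id)` tuple layout are assumptions of this example, and it does not cover the additional rule about Unknown-Sized Parent Elements.

```python
from collections import defaultdict


def parent_path(path: str) -> str:
    """EBMLParentPath: everything up to and including the last delimiter."""
    return path[: path.rfind("\\") + 1]


def find_id_conflicts(elements):
    """Return sorted (id, parent path) pairs that more than one path claims.

    `elements` is an iterable of (path, id) string pairs read from a schema,
    with the id written in hexadecimal, for example "0x4025".
    """
    seen = defaultdict(set)
    for path, element_id in elements:
        seen[(int(element_id, 16), parent_path(path))].add(path)
    return sorted(key for key, paths in seen.items() if len(paths) > 1)


schema = [
    (r"\Items", "0x4025"),
    (r"\Items\Item", "0x4026"),
    (r"\Items\Item\Cost", "0x4024"),
    (r"\Items\Item\Price", "0x4024"),  # hypothetical element reusing Cost's id
]
print(find_id_conflicts(schema))
```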

#### minOccurs

Within an EBML Schema, the XPath of the `@minOccurs` attribute is
`/EBMLSchema/element/@minOccurs`.

`minOccurs` is a nonnegative integer expressing the minimum permitted number
of occurrences of this EBML Element within its Parent Element.

Each instance of the Parent Element **MUST** contain at least this many instances of this EBML Element.
If the EBML Element has an empty EBMLParentPath, then `minOccurs` refers to
constraints on the occurrence of the EBML Element within the EBML Document.
EBML Elements with `minOccurs` set to "1" that also have a default
value (see (#default)) declared are not
**REQUIRED** to be stored but are **REQUIRED** to be
interpreted; see
(#note-on-the-use-of-default-attributes-to-define-mandatory-ebml-elements).

An EBML Element defined with a `minOccurs` value greater than zero is called

The `minOccurs` attribute is **OPTIONAL**. If the `minOccurs`
attribute is not present, then that EBML Element has a `minOccurs` value of

The semantic meaning of `minOccurs` within an EBML Schema is analogous to the meaning of `minOccurs` within an XML Schema.

#### maxOccurs

Within an EBML Schema, the XPath of the `@maxOccurs` attribute is
`/EBMLSchema/element/@maxOccurs`.

`maxOccurs` is a nonnegative integer expressing the maximum permitted number
of occurrences of this EBML Element within its Parent Element.

Each instance of the Parent Element **MUST** contain at most
this many instances of this EBML Element, including the unwritten mandatory
element with a default value; see
(#note-on-the-use-of-default-attributes-to-define-mandatory-ebml-elements).
If the EBML Element has an empty EBMLParentPath, then `maxOccurs` refers to
constraints on the occurrence of the EBML Element within the EBML
Document.

The `maxOccurs` attribute is **OPTIONAL**. If the `maxOccurs`
attribute is not present, then there is no upper bound for the permitted
number of occurrences of this EBML Element within its Parent Element or within
the EBML Document, depending on whether or not the EBMLParentPath of the EBML Element
is empty.

The semantic meaning of `maxOccurs` within an EBML Schema is analogous to the
meaning of `maxOccurs` within an XML Schema; when it is not present, the
effect is similar to `maxOccurs="unbounded"` in an XML Schema.

#### range

Within an EBML Schema, the XPath of the `@range` attribute is

The `range` attribute specifies a numerical range for EBML Elements that are
of numerical types (Unsigned Integer, Signed Integer, Float, and Date). If
specified, the value of the EBML Element **MUST** be within the defined range.
See (#expression-of-range) for rules applied to expression of range
values.

The `range` attribute is **OPTIONAL**. If the `range` attribute is
not present, then any value legal for the `type` attribute is valid.

##### Expression of range

The `range` attribute **MUST** only be used with EBML Elements
that are either signed integer, unsigned integer, float, or date. The
expression defines the upper, lower, exact, or excluded value of the EBML
Element and optionally an upper boundary value combined with a lower
boundary. The range expression may contain whitespace (using the ASCII 0x20
character) for readability, but whitespace within a range expression

To set a fixed value for the range, the value is used as the attribute
value. For example, `1234` means the EBML element always has the value
1234. The value can be prefixed with `not ` to indicate that the fixed
value **MUST NOT** be used for that Element. For example,
`not 1234` means the Element can use all values of its type except 1234.

The `>` sign is used for an exclusive lower boundary, and the
`>=` sign is used for an inclusive lower boundary. For example,
`>3` means the Element value **MUST** be greater than 3,
and `>=0x1p+0` means the Element value **MUST** be
greater than or equal to the floating value 1.0; see
(#textual-expression-of-floats).

The `<` sign is used for an exclusive upper boundary, and the
`<=` sign is used for an inclusive upper boundary. For example,
`<-2` means the Element value **MUST** be less than -2,
and `<=10` means the Element value **MUST** be less than
or equal to 10.

The lower and upper bounds can be combined into an expression to form a
closed boundary. The lower boundary comes first, followed by the upper
boundary, separated by a comma. For example, `>3,<= 20` means the
Element value **MUST** be greater than 3 and less than or equal to
20.

A special form of lower and upper bounds using the `-` separator is
possible, meaning the Element value **MUST** be greater than, or equal to,
the first value and **MUST** be less than or equal to the
second value. For example, `1-10` is equivalent to
`>=1,<=10`. If the upper boundary is negative, the `range` attribute
**MUST** only use the latter form.
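
The forms described above can be turned into a predicate with a few lines of Python. This is a hedged sketch for decimal integer ranges only; the names `parse_int_range` and `_bound` are assumptions of this example, float ranges are not handled, and malformed expressions are only minimally rejected.

```python
import operator

# Two-character operators come first so ">=" is not mistaken for ">".
_OPS = ((">=", operator.ge), ("<=", operator.le),
        (">", operator.gt), ("<", operator.lt))


def _bound(token):
    """Build a check for one boundary such as ">3" or "<=20"."""
    for symbol, compare in _OPS:
        if token.startswith(symbol):
            limit = int(token[len(symbol):])
            return lambda value: compare(value, limit)
    raise ValueError(f"not a boundary: {token!r}")


def parse_int_range(expr: str):
    """Return a function telling whether an integer satisfies `expr`."""
    text = expr.replace(" ", "")  # spaces are only there for readability
    if text.startswith("not"):
        excluded = int(text[3:])
        return lambda value: value != excluded
    if text[0] in "<>":
        checks = [_bound(part) for part in text.split(",")]
        return lambda value: all(check(value) for check in checks)
    separator = text.find("-", 1)  # a hyphen at index 0 is a minus sign
    if separator != -1:
        low, high = int(text[:separator]), int(text[separator + 1:])
        return lambda value: low <= value <= high
    exact = int(text)
    return lambda value: value == exact


in_range = parse_int_range(">3,<= 20")
print(in_range(4), in_range(20), in_range(21))
```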

Within an EBML Schema, the XPath of the `@length` attribute is

The `length` attribute is a value to express the valid length of the Element
Data as written, measured in octets. The length provides a constraint in
addition to the Length value of the definition of the corresponding EBML
Element Type. This length **MUST** be expressed as either a
nonnegative integer or a range (see
(#expression-of-range)) that consists of only nonnegative

The `length` attribute is **OPTIONAL**. If the `length` attribute is
not present for that EBML Element, then that EBML Element is only limited in
length by the definition of the associated EBML Element Type.

#### default

Within an EBML Schema, the XPath of the `@default` attribute is

If an Element is mandatory (has a `minOccurs` value greater than zero) but not written within its Parent Element or stored as an Empty Element, then the EBML Reader of the EBML Document **MUST** semantically interpret the EBML Element as present with this specified default value for the EBML Element.
An unwritten mandatory Element with a declared default value is semantically equivalent to that Element if written with the default value stored as the Element Data.
EBML Elements that are Master Elements **MUST NOT** declare a default value.
EBML Elements with a `minOccurs` value greater than 1 **MUST NOT** declare a default value.

The `default` attribute is **OPTIONAL**.

#### type

Within an EBML Schema, the XPath of the `@type` attribute is

The type **MUST** be set to one of the following values:
`integer` (signed integer), `uinteger` (unsigned integer),
`float`, `string`, `date`, `utf-8`,
`master`, or `binary`. The content of each type is defined
in (#ebml-element-types).

The `type` attribute is **REQUIRED**.

#### unknownsizeallowed

Within an EBML Schema, the XPath of the `@unknownsizeallowed`
attribute is `/EBMLSchema/element/@unknownsizeallowed`.

This attribute is a boolean to express whether an EBML Element is permitted to
be an Unknown-Sized Element (having all VINT\_DATA bits of Element Data Size set
to 1). EBML Elements that are not Master Elements **MUST NOT** set
`unknownsizeallowed` to true. An EBML Element that is defined with an
`unknownsizeallowed` attribute set to 1 **MUST** also have the
`unknownsizeallowed` attribute of its Parent Element set to 1.

An EBML Element with the `unknownsizeallowed` attribute set to 1
**MUST NOT** have its `recursive` attribute set to 1.

The `unknownsizeallowed` attribute is **OPTIONAL**. If the
`unknownsizeallowed` attribute is not used, then that EBML Element is not
allowed to use an unknown Element Data Size.

#### recursive

Within an EBML Schema, the XPath of the `@recursive` attribute is
`/EBMLSchema/element/@recursive`.

This attribute is a boolean to express whether an EBML Element is permitted to
be stored recursively. If it is allowed, the EBML Element **MAY** be
stored within another EBML Element that has the same Element ID, which itself
can be stored in an EBML Element that has the same Element ID, and so on. EBML
Elements that are not Master Elements **MUST NOT** set recursive to

If the EBMLElement part of the `@path` contains an IsRecursive part,
then the `recursive` value **MUST** be true; otherwise, it

An EBML Element with the `recursive` attribute set to 1 **MUST NOT**
have its `unknownsizeallowed` attribute set to 1.

The `recursive` attribute is **OPTIONAL**. If the `recursive`
attribute is not present, then the EBML Element **MUST NOT** be

#### recurring

Within an EBML Schema, the XPath of the `@recurring` attribute is
`/EBMLSchema/element/@recurring`.

This attribute is a boolean to express whether or not an EBML Element is defined as an
Identically Recurring Element; see
(#identically-recurring-elements).

The `recurring` attribute is **OPTIONAL**. If the `recurring`
attribute is not present, then the EBML Element is not an Identically

#### minver

Within an EBML Schema, the XPath of the `@minver` attribute is

The `minver` (minimum version) attribute stores a nonnegative integer that
represents the first version of the docType to support the EBML Element.

The `minver` attribute is **OPTIONAL**. If the `minver`
attribute is not present, then the EBML Element has a minimum version of
"1".

#### maxver

Within an EBML Schema, the XPath of the `@maxver` attribute is

The `maxver` (maximum version) attribute stores a nonnegative integer that
represents the last or most recent version of the docType to support the
element. `maxver` **MUST** be greater than or equal to `minver`.

The `maxver` attribute is **OPTIONAL**. If the `maxver` attribute is
not present, then the EBML Element has a maximum version equal to the value
stored in the `version` attribute of `<EBMLSchema>`.
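
The defaulting rules for `minver` and `maxver` can be summarized in a small Python sketch. The function name `element_supported` and its parameters are assumptions of this example; `schema_version` stands for the `version` attribute of `<EBMLSchema>`.

```python
def element_supported(doc_version, schema_version, minver=None, maxver=None):
    """Return True if an Element exists in version `doc_version` of a docType."""
    low = 1 if minver is None else minver                # absent minver means 1
    high = schema_version if maxver is None else maxver  # absent maxver means schema version
    if maxver is not None and maxver < low:
        raise ValueError("maxver MUST be greater than or equal to minver")
    return low <= doc_version <= high


# An Element introduced in version 2 and deprecated after version 3.
print(element_supported(1, 4, minver=2, maxver=3))
print(element_supported(3, 4, minver=2, maxver=3))
```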

Within an EBML Schema, the XPaths of the `<documentation>`
elements are `/EBMLSchema/element/documentation` and `/EBMLSchema/element/restriction/enum/documentation`.

The `<documentation>` element provides additional information
about EBML Elements or enumeration values. Within the `<documentation>` element, the
following XHTML [@!XHTML] elements **MAY** be
used: `<a>`, `<br>`, and `<strong>`.

### `<documentation>` Attributes

#### lang

Within an EBML Schema, the XPath of the `@lang` attribute is
`/EBMLSchema/element/documentation/@lang`.

The `lang` attribute is set to the value from [@!RFC5646] of
the language of the element's documentation.

The `lang` attribute is **OPTIONAL**.

Within an EBML Schema, the XPath of the `@purpose` attribute is
`/EBMLSchema/element/documentation/@purpose`.

A `purpose` attribute distinguishes the meaning of the documentation. Values
for the `<documentation>` subelement's `purpose` attribute
**MUST** include one of the values listed in

| value of `purpose` attribute | definition |
|:---------------------------|:----------------------------------------|
| definition | A "definition" is recommended for every defined EBML Element. This documentation explains the semantic meaning of the EBML Element. |
| rationale | An explanation about the reason or catalyst for the definition of the Element. |
| usage notes | Recommended practices or guidelines for reading, writing, or interpreting the Element. |
| references | Informational references to support the contextualization and understanding of the value of the Element. |
Table: Definitions of the permitted
values for the `purpose` attribute of the documentation Element{#tablePurposeDefinitions}

The `purpose` attribute is **REQUIRED**.

### `<implementation_note>` Element

Within an EBML Schema, the XPath of the `<implementation_note>`
element is `/EBMLSchema/element/implementation_note`.

In some cases within an EBML Document Type, the attributes of the
`<element>` element are not sufficient to clearly communicate how
the defined EBML Element is intended to be implemented.
For instance, one EBML Element might only be mandatory if another EBML Element
is present. As another example, the default value of an EBML Element might
be derived from a related Element's content. In these cases where the Element's
definition is conditional or advanced implementation notes are needed, one or more
`<implementation_note>` elements can be used to store that
information.
The `<implementation_note>` refers to a specific attribute of the
parent `<element>` as expressed by the `note_attribute`
attribute (see (#note-attribute)).

### `<implementation_note>` Attributes

Within an EBML Schema, the XPath of the `@note_attribute` attribute
is `/EBMLSchema/element/implementation_note/@note_attribute`.

The `note_attribute` attribute references which of the attributes of the
`<element>` the `<implementation_note>` relates to.
The `note_attribute` attribute **MUST** be set to one of the
following values (corresponding to that attribute of the parent
`<element>`): `minOccurs`, `maxOccurs`,
`range`, `length`, `default`, `minver`, or
`maxver`. The `<implementation_note>` **SHALL**
supersede the parent `<element>`'s attribute that is named in the
`note_attribute` attribute.
An `<element>` **SHALL NOT** have more than one `<implementation_note>` of the same `note_attribute`.

The `note_attribute` attribute is **REQUIRED**.

#### `<implementation_note>` Example

The following fragment of an EBML Schema demonstrates how an
`<implementation_note>` is used. In this case, an EBML Schema
documents a list of items that are described with an optional cost. The
Currency Element uses an `<implementation_note>` to say that the
Currency Element is **REQUIRED** if the Cost Element is set,

```xml
<element name="Items" path="\Items" id="0x4025" type="master"
  minOccurs="1" maxOccurs="1">
  <documentation lang="en" purpose="definition">
    A set of items.
  </documentation>
</element>
<element name="Item" path="\Items\Item" id="0x4026"
  type="master">
  <documentation lang="en" purpose="definition">
    An item.
  </documentation>
</element>
<element name="Cost" path="\Items\Item\Cost" id="0x4024"
  type="float" maxOccurs="1">
  <documentation lang="en" purpose="definition">
    The cost of the item, if any.
  </documentation>
</element>
<element name="Currency" path="\Items\Item\Currency" id="0x403F"
  type="string" maxOccurs="1">
  <documentation lang="en" purpose="definition">
    The currency of the item's cost.
  </documentation>
  <implementation_note note_attribute="minOccurs">
    Currency MUST be set (minOccurs=1) if the associated Item stores
    a Cost, else Currency MAY be unset (minOccurs=0).
  </implementation_note>
</element>
```

Within an EBML Schema, the XPath of the `<restriction>`
element is `/EBMLSchema/element/restriction`.

The `<restriction>` element provides information about
restrictions to the allowable values for the EBML Element, which are listed in

Within an EBML Schema, the XPath of the `<enum>` element is
`/EBMLSchema/element/restriction/enum`.

The `<enum>` element stores a list of values allowed for
storage in the EBML Element. The values **MUST** match the type of
the EBML Element (for example, `<enum value="Yes">`
cannot be a valid value for an EBML Element that is defined as an unsigned
integer). An `<enum>` element **MAY** also store
`<documentation>` elements to further describe the
`<enum>`.

#### label

Within an EBML Schema, the XPath of the `@label` attribute is
`/EBMLSchema/element/restriction/enum/@label`.

The label provides a concise expression for human consumption that
describes what the value of `<enum>` represents.

The `label` attribute is **OPTIONAL**.

#### value

Within an EBML Schema, the XPath of the `@value` attribute is
`/EBMLSchema/element/restriction/enum/@value`.

The value represents data that **MAY** be stored within the EBML Element.

The `value` attribute is **REQUIRED**.

Within an EBML Schema, the XPath of the `<extension>`
element is `/EBMLSchema/element/extension`.

The `<extension>` element provides an unconstrained element to
contain information about the associated EBML `<element>`, which
is undefined by this document but **MAY** be defined by the
associated EBML Document Type. The `<extension>` element
**MUST** contain a `type` attribute and also
**MAY** contain any other attribute or subelement as long as the
EBML Schema remains as a well-formed XML Document. All
`<extension>` elements **MUST** be subelements of the

#### type

Within an EBML Schema, the XPath of the `@type` attribute is
`/EBMLSchema/element/extension/@type`.

The `type` attribute should reference a name or identifier of the
project or authority associated with the contents of the
`<extension>` element.

The `type` attribute is **REQUIRED**.

### XML Schema for EBML Schema

The following provides an XML Schema [@!XML-SCHEMA] for
facilitating verification of an EBML Schema described in
(#ebml-schema).

<{{EBMLSchema.xsd}}

### Identically Recurring Elements

An Identically Recurring Element is an EBML Element that **MAY**
occur within its Parent Element more than once, but each recurrence of it
within that Parent Element **MUST** be identical both in storage
and semantics. Identically Recurring Elements are permitted to be stored
multiple times within the same Parent Element in order to increase data
resilience and optimize the use of EBML in transmission. For instance, a
pertinent Top-Level Element could be periodically resent within a datastream
so that an EBML Reader that starts reading the stream from the middle could
better interpret the contents. Identically Recurring Elements
**SHOULD** include a CRC-32 Element as a Child Element; this is
especially recommended when EBML is used for long-term storage or
transmission. If a Parent Element contains more than one copy of an
Identically Recurring Element that includes a CRC-32 Element as a Child
Element, then the first instance of the Identically Recurring Element with a
valid CRC-32 value should be used for interpretation. If a Parent Element
contains more than one copy of an Identically Recurring Element that does not
contain a CRC-32 Element, or if CRC-32 Elements are present but none are valid,
then the first instance of the Identically Recurring Element should be used
for interpretation.
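
The selection rule for multiple copies can be sketched as follows. This Python fragment is illustrative only: the function name and the `(payload, crc_ok)` representation are assumptions of this example, where `crc_ok` is `True` for a valid CRC-32, `False` for an invalid one, and `None` when the copy has no CRC-32 Child Element.

```python
def select_recurring_instance(instances):
    """Pick the copy of an Identically Recurring Element to interpret.

    `instances` lists (payload, crc_ok) pairs in storage order.
    """
    if not instances:
        return None
    for payload, crc_ok in instances:
        if crc_ok is True:
            return payload  # first copy whose CRC-32 verified
    # No copy has a valid CRC-32 (absent or all invalid): use the first copy.
    return instances[0][0]


copies = [("copy-1", False), ("copy-2", True), ("copy-3", True)]
print(select_recurring_instance(copies))
```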

### Textual expression of floats

When a float value is represented textually in an EBML Schema, such as
within a default or range value, the float values **MUST** be
expressed as Hexadecimal Floating-Point Constants as defined in the C11
standard [@!ISO9899] (see Section 6.4.4.2 on Floating
Constants). [@tableFloatExamples] provides examples of
expressions of float ranges.

| as decimal | as Hexadecimal Floating-Point Constants |
|:------------------|:----------------------------------------|
| 0.0 | `0x0p+1` |
| 0.0-1.0 | `0x0p+1-0x1p+0` |
| 1.0-256.0 | `0x1p+0-0x1p+8` |
| 0.857421875 | `0x1.b7p-1` |
| -1.0--0.857421875 | `-0x1p+0--0x1.b7p-1` |
Table: Example of Floating-Point values and
ranges as decimal and Hexadecimal Floating-Point Constants{#tableFloatExamples}

Within an expression of a float range, as in an integer range, the `-` (hyphen)
character is the separator between the minimum and maximum values
permitted by the range. Hexadecimal Floating-Point Constants also use a `-`
(hyphen) when indicating a negative binary power. Within a float range, when a
`-` (hyphen) is immediately preceded by a letter p, then the `-` (hyphen) is a
part of the Hexadecimal Floating-Point Constant that notes negative binary
power. Within a float range, when a `-` (hyphen) is not immediately preceded by
a letter p, then the `-` (hyphen) represents the separator between the minimum
and maximum values permitted by the range.
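
The hyphen rule above maps directly onto a short parser. The following Python sketch relies on the standard `float.fromhex` method; the function name `parse_float_range` is an assumption of this example, and a hyphen in the first position is treated as the sign of the minimum value.

```python
def parse_float_range(expr: str):
    """Parse a hexadecimal float value or "min-max" range into (low, high)."""
    text = expr.replace(" ", "")
    # Start at index 1: a hyphen in first position is the sign of the minimum.
    for index in range(1, len(text)):
        if text[index] == "-" and text[index - 1] not in "pP":
            # Not preceded by "p": this hyphen separates minimum and maximum.
            return (float.fromhex(text[:index]),
                    float.fromhex(text[index + 1:]))
    value = float.fromhex(text)  # a single value, not a range
    return value, value


print(parse_float_range("0x1.b7p-1"))
print(parse_float_range("-0x1p+0--0x1.b7p-1"))
```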

### Note on the use of default attributes to define Mandatory EBML Elements

If a Mandatory EBML Element has a default value declared by an EBML Schema
and the value of the EBML Element is equal to the declared default value, then
that EBML Element is not required to be present within the EBML Document if
its Parent Element is present. In this case, the default value of the
Mandatory EBML Element **MUST** be read by the EBML Reader,
although the EBML Element is not present within its Parent Element.

If a Mandatory EBML Element has no default value declared by an EBML Schema
and its Parent Element is present, then the EBML Element **MUST**
be present, as well. If a Mandatory EBML Element has a default value declared
by an EBML Schema, and its Parent Element is present, and the value of the EBML
Element is NOT equal to the declared default value, then the EBML Element

[@tableVintRequirements] clarifies whether a Mandatory
EBML Element **MUST** be written, according to whether the default value
is declared, the value of the EBML Element is equal to the declared default
value, and/or the Parent Element is used.

| Is the default value declared? | Is the value equal to default? | Is the Parent Element present? | Then is storing the EBML Element **REQUIRED**? |
|:-----------------:|:-----------------------:|:--------------------:|:------------------------------------------:|
| Yes | Yes | Yes | No |
| Yes | Yes | No | No |
| Yes | No | Yes | Yes |
| Yes | No | No | No |
| No | n/a | Yes | Yes |
| No | n/a | No | No |
Table: Demonstration of the conditional
requirements of VINT Storage{#tableVintRequirements}
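
The table reduces to a three-input decision, shown here as a Python sketch. The function name `storing_required` is an assumption of this example; `equals_default` is ignored when no default value is declared, matching the "n/a" rows.

```python
def storing_required(default_declared: bool, equals_default: bool,
                     parent_present: bool) -> bool:
    """Tell whether a Mandatory EBML Element has to be written."""
    if not parent_present:
        return False  # nothing is stored without the Parent Element
    if not default_declared:
        return True   # no default to fall back on
    return not equals_default  # the default may be left unwritten


# Rows of the table, in order: (declared, equal, parent present).
rows = [(True, True, True), (True, True, False), (True, False, True),
        (True, False, False), (False, False, True), (False, False, False)]
print([storing_required(*row) for row in rows])
```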

## EBML Header Elements

This document contains definitions of all EBML Elements of the EBML Header.

### EBML Element

type:
: Master Element

description:
: Set the EBML characteristics of the data to follow. Each EBML Document has

### EBMLVersion Element

path:
: `\EBML\EBMLVersion`

type:
: Unsigned Integer

description:
: The version of EBML specifications used to create the EBML Document. The
version of EBML defined in this document is 1, so EBMLVersion

### EBMLReadVersion Element

name:
: EBMLReadVersion

path:
: `\EBML\EBMLReadVersion`

type:
: Unsigned Integer

description:
: The minimum EBML version an EBML Reader has to support to read this EBML
Document. The EBMLReadVersion Element **MUST** be less than or equal to EBMLVersion.

### EBMLMaxIDLength Element

name:
: EBMLMaxIDLength

path:
: `\EBML\EBMLMaxIDLength`

type:
: Unsigned Integer

description:
: The EBMLMaxIDLength Element stores the maximum permitted length in octets
of the Element IDs to be found within the EBML Body. An EBMLMaxIDLength Element value of four
is **RECOMMENDED**, though larger values are allowed.

### EBMLMaxSizeLength Element

name:
: EBMLMaxSizeLength

path:
: `\EBML\EBMLMaxSizeLength`

type:
: Unsigned Integer

description:
: The EBMLMaxSizeLength Element stores the maximum permitted length in
octets of the expressions of all Element Data Sizes to be found within the EBML Body. The
EBMLMaxSizeLength Element documents an upper bound for the `length` of
all Element Data Size expressions within the EBML Body and not an upper bound
for the `value` of all Element Data Size expressions
within the EBML Body. EBML Elements that have an Element Data Size expression that is larger in octets
than what is expressed by the EBMLMaxSizeLength Element are invalid.
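
A reader could enforce this limit while parsing. The Python sketch below assumes the Variable-Size Integer layout defined elsewhere in this document, in which the number of leading zero bits of the first octet plus one gives the length in octets. The function names are hypothetical, the default limit of 8 is an assumption of this example, and lengths above 8 octets are not handled.

```python
def vint_length(first_octet: int) -> int:
    """Length in octets of a Variable-Size Integer, from its first octet."""
    if not 0 < first_octet <= 0xFF:
        raise ValueError("lengths above 8 octets are not handled in this sketch")
    # bit_length() locates the marker bit: 0b1xxxxxxx -> 1 octet, 0b00000001 -> 8.
    return 9 - first_octet.bit_length()


def size_length_allowed(first_octet: int, max_size_length: int = 8) -> bool:
    """Check an Element Data Size expression against EBMLMaxSizeLength."""
    return vint_length(first_octet) <= max_size_length


print(vint_length(0x81), vint_length(0x40), vint_length(0x01))
print(size_length_allowed(0x01, max_size_length=4))
```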

### DocType Element

path:
: `\EBML\DocType`

length:
: >0

description:
: A string that describes and identifies the content of the EBML Body that

### DocTypeVersion Element

name:
: DocTypeVersion

path:
: `\EBML\DocTypeVersion`

type:
: Unsigned Integer

description:
: The version of DocType interpreter used to create the EBML Document.

### DocTypeReadVersion Element

name:
: DocTypeReadVersion

path:
: `\EBML\DocTypeReadVersion`

type:
: Unsigned Integer

description:
: The minimum DocType version an EBML Reader has to support to read this
EBML Document. The value of the DocTypeReadVersion Element **MUST**
be less than or equal to the value of the DocTypeVersion Element.
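
The version constraints stated for the EBML Header can be checked together. In this Python sketch, the header is represented as a plain dictionary keyed by Element name; that representation and both function names are assumptions of this example.

```python
def header_version_problems(header: dict) -> list:
    """List violations of the read-version constraints in an EBML Header."""
    problems = []
    for read_key, version_key in (("EBMLReadVersion", "EBMLVersion"),
                                  ("DocTypeReadVersion", "DocTypeVersion")):
        if header[read_key] > header[version_key]:
            problems.append(f"{read_key} exceeds {version_key}")
    return problems


def reader_can_open(header: dict, ebml_supported: int,
                    doctype_supported: int) -> bool:
    """True if a reader supporting the given versions can read the document."""
    return (header["EBMLReadVersion"] <= ebml_supported
            and header["DocTypeReadVersion"] <= doctype_supported)


header = {"EBMLVersion": 1, "EBMLReadVersion": 1,
          "DocTypeVersion": 4, "DocTypeReadVersion": 2}
print(header_version_problems(header))
print(reader_can_open(header, ebml_supported=1, doctype_supported=2))
```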
1519
### DocTypeExtension Element
1520
1522
: DocTypeExtension
1525
: `\EBML\DocTypeExtension`
1534
: Master Element
1537
: A DocTypeExtension adds extra Elements to the main DocType+DocTypeVersion
1538
tuple it's attached to. An EBML Reader **MAY** know these extra Elements and how
1539
to use them. A DocTypeExtension **MAY** be used to iterate between
1540
experimental Elements before they are integrated into a regular
1541
DocTypeVersion. Reading one DocTypeExtension version of a
1542
DocType+DocTypeVersion tuple doesn't imply one should be able to read upper
1543
versions of this DocTypeExtension.
1544
1545
### DocTypeExtensionName Element

name:
: DocTypeExtensionName

path:
: `\EBML\DocTypeExtension\DocTypeExtensionName`

length:
: >0

description:
: The name of the DocTypeExtension to differentiate it from other
DocTypeExtensions of the same DocType+DocTypeVersion tuple. A DocTypeExtensionName value
**MUST** be unique within the EBML Header.

### DocTypeExtensionVersion Element

name:
: DocTypeExtensionVersion

path:
: `\EBML\DocTypeExtension\DocTypeExtensionVersion`

type:
: Unsigned Integer

description:
: The version of the DocTypeExtension. Different DocTypeExtensionVersion
values of the same DocType + DocTypeVersion + DocTypeExtensionName tuple
**MAY** contain completely different sets of extra Elements. An
EBML Reader **MAY** support multiple versions
of the same tuple, only one version of the tuple, or not support the tuple at all.

## Global Elements

EBML allows some special Elements to be found within more than one parent
in an EBML Document or optionally at the Root Level of an EBML Body. These
Elements are called Global Elements. There are two Global Elements that can be
found in any EBML Document: the CRC-32 Element and the Void Element. An EBML
Schema **MAY** add other Global Elements to the format it
defines. These extra elements apply only to the EBML Body, not the EBML
Header.

Global Elements are EBML Elements whose EBMLLastParent part of the path has
a GlobalPlaceholder. Because it is the last Parent part of the path, a Global
Element might also have EBMLParentPath parts in its path. In this case, the
Global Element can only be found within this EBMLParentPath path -- i.e., it's

A Global Element can be found in many Parent Elements, allowing the same number of occurrences in each Parent where this Element is found.

### CRC-32 Element

path:
: `\(1-\)CRC-32`

length:
: 4

description:
: The CRC-32 Element contains a 32-bit Cyclic Redundancy Check value of all
the Element Data of the Parent Element as stored except for the CRC-32 Element
itself. When the CRC-32 Element is present, the CRC-32 Element
**MUST** be the first ordered EBML Element within its Parent
Element for easier reading. All Top-Level Elements of an EBML Document that
are Master Elements **SHOULD** include a CRC-32 Element as a Child
Element. The CRC in use is the IEEE-CRC-32 algorithm as used in the
[@!ISO3309] standard and in Section 8.1.1.6.2 of
[@!ITU.V42], with initial value of 0xFFFFFFFF. The CRC value
**MUST** be computed on a little-endian bytestream and
**MUST** use little-endian storage.

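The following Python sketch computes and serializes a CRC-32 Element. It assumes that the CRC-32 implemented by `zlib.crc32` (initial value 0xFFFFFFFF, with a final bit inversion) is the variant intended here; that equivalence and the helper name are assumptions of this sketch. The Element ID 0xBF is the one this document assigns to the CRC-32 Element, and 0x84 is a one-octet Element Data Size for the four-octet value.

```python
import struct
import zlib

CRC32_ID = b"\xBF"    # Element ID of the CRC-32 Element
CRC32_SIZE = b"\x84"  # one-octet Element Data Size with value 4


def build_crc32_element(sibling_data: bytes) -> bytes:
    """Serialize a CRC-32 Element covering the Parent's other Element Data."""
    crc = zlib.crc32(sibling_data) & 0xFFFFFFFF
    # "<I" stores the 32-bit value in little-endian order.
    return CRC32_ID + CRC32_SIZE + struct.pack("<I", crc)
```

The returned octets are intended to be written as the first Child Element of the Parent, ahead of the data they cover.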
### Void Element

description:
: Used to void data or to avoid unexpected behaviors when using damaged
data. The content is discarded. Also used to reserve space in a subelement for

# Considerations for Reading EBML Data

The following scenarios describe events to consider when reading EBML
Documents, as well as the recommended design of an EBML Reader.

If a Master Element contains a CRC-32 Element that doesn't validate, then
the EBML Reader **MAY** ignore all contained data except for
Descendant Elements that contain their own valid CRC-32 Element.

In the following XML representation of a simple, hypothetical EBML
fragment, a Master Element called CONTACT contains two Child Elements, NAME
and ADDRESS. In this example, some data within the NAME Element has been
altered so that the CRC-32 of the NAME Element does not validate, and thus any
Ancestor Element with a CRC-32 would also no longer
validate. However, even though the CONTACT Element has a CRC-32 that does not
validate (because of the changed data within the NAME Element), the CRC-32 of
the ADDRESS Element does validate, and thus the contents and semantics of the
ADDRESS Element **MAY** be used.

```xml
<CONTACT>
  <CRC-32>c119a69b</CRC-32><!-- does not validate -->
  <NAME>
    <CRC-32>1f59ee2b</CRC-32><!-- does not validate -->
    <FIRST-NAME>invalid data</FIRST-NAME>
    <LAST-NAME>invalid data</LAST-NAME>
  </NAME>
  <ADDRESS>
    <CRC-32>df941cc9</CRC-32><!-- validates -->
    <STREET>valid data</STREET>
    <CITY>valid data</CITY>
  </ADDRESS>
</CONTACT>
```

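The salvage rule illustrated by the CONTACT example can be sketched in Python as follows. The `Element` class and `usable_leaves` function are hypothetical and model only names and CRC-32 results, not real EBML parsing; the sketch also assumes a lenient reader that chooses to ignore data under a failed CRC-32, which the text above permits but does not require.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Element:
    name: str
    crc_valid: Optional[bool] = None  # None means no CRC-32 Child Element
    children: List["Element"] = field(default_factory=list)


def usable_leaves(element: Element, trusted: bool = True) -> List[str]:
    """List the leaf Elements that a lenient reader may still use."""
    if element.crc_valid is not None:
        # An Element's own CRC-32 result overrides that of its Ancestors.
        trusted = element.crc_valid
    if not element.children:
        return [element.name] if trusted else []
    names: List[str] = []
    for child in element.children:
        names.extend(usable_leaves(child, trusted))
    return names


contact = Element("CONTACT", False, [
    Element("NAME", False, [Element("FIRST-NAME"), Element("LAST-NAME")]),
    Element("ADDRESS", True, [Element("STREET"), Element("CITY")]),
])
print(usable_leaves(contact))  # ['STREET', 'CITY']
```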
If a Master Element contains more occurrences of a Child Master Element
than permitted according to the `maxOccurs` and `recurring`
definition of that Element, then the occurrences in addition to `maxOccurs`

If a Master Element contains more occurrences of a Child Element than
permitted according to the `maxOccurs` attribute of the definition of that
Element, then all instances of that Element after the first `maxOccurs`
occurrences from the beginning of its Parent Element **SHOULD** be

# Terminating Elements

Null Octets, which are octets with all bits set to zero, **MAY** follow the value of a String Element or UTF-8 Element to serve as a terminator.
An EBML Writer **MAY** terminate a String Element or UTF-8 Element
with Null Octets in order to overwrite a stored value with a new value of
lesser length while maintaining the same Element Data Size; this can prevent
the need to rewrite large portions of an EBML Document. Otherwise, the use of
Null Octets within a String Element or UTF-8 Element is **NOT RECOMMENDED**.
The Element Data of a UTF-8 Element **MUST**
be a valid UTF-8 string up to whichever comes first: the end of the Element or
the first occurring Null Octet. Within the Element Data of a String or UTF-8 Element,
any Null Octet itself and any following data within that Element
**SHOULD** be ignored. A string value and a copy of that string
value terminated by one or more Null Octets are semantically equal.

[@tableNullOctetSemantics] shows examples of semantics
and validation for the use of Null Octets. Stored Values
and their Semantic Meaning are represented as hexadecimal values.

Stored Value        | Semantic Meaning
:-------------------|:-------------------
0x65 0x62 0x6D 0x6C | 0x65 0x62 0x6D 0x6C
0x65 0x62 0x00 0x6C | 0x65 0x62
0x65 0x62 0x00 0x00 | 0x65 0x62
0x65 0x62           | 0x65 0x62
Table: Examples of semantics for Null
Octets in VINT_DATA{#tableNullOctetSemantics}

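The semantics in the table above can be sketched in Python as follows. The function names are hypothetical; the sketch follows the **SHOULD** above by ignoring the first Null Octet and everything after it.

```python
def semantic_value(stored: bytes) -> bytes:
    """Drop the first Null Octet and any data that follows it."""
    return stored.split(b"\x00", 1)[0]


def read_utf8_element(stored: bytes) -> str:
    """Decode a UTF-8 Element; only the octets before the first Null Octet
    need to form a valid UTF-8 string."""
    return semantic_value(stored).decode("utf-8")
```

Applied to the second row of the table, `semantic_value(bytes.fromhex("6562006C"))` yields the two octets 0x65 0x62, the same result as for the unterminated value in the last row.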
# Guidelines for Updating Elements

An EBML Document can be updated without requiring that the entire EBML
Document be rewritten. These recommendations describe strategies for changing
the Element Data of a written EBML Element with minimal disruption to the rest
of the EBML Document.

## Reducing Element Data in Size

There are three methods to reduce the size of Element Data of a written EBML Element.

### Adding a Void Element

When an EBML Element is changed to reduce its total length by more than one octet, an EBML Writer **SHOULD** fill the freed space with a Void Element.

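A minimal Python sketch of building a Void Element that fills a gap of a given size is shown below. The function name is hypothetical. The sketch uses the Void Element ID 0xEC assigned by this document, fills the Element Data with zero octets (an arbitrary choice, since the content is discarded), and assumes that the all-ones VINT_DATA pattern is reserved and therefore must not be used as a size.

```python
VOID_ID = b"\xEC"  # Element ID of the Void Element


def make_void(total_octets: int) -> bytes:
    """Build a Void Element occupying exactly total_octets octets."""
    for size_length in range(1, 9):
        data_length = total_octets - len(VOID_ID) - size_length
        # The largest value of a VINT of this length, all-ones excluded.
        largest = (1 << (7 * size_length)) - 2
        if 0 <= data_length <= largest:
            marker = 1 << (7 * size_length)  # VINT_MARKER for this length
            size = (marker | data_length).to_bytes(size_length, "big")
            return VOID_ID + size + bytes(data_length)
    raise ValueError("no Void Element can fill a gap of that size")
```

The smallest possible result is the two-octet Element 0xEC80, which is why a reduction of a single octet cannot be covered by this method.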
### Extending the Element Data Size

The same value for Element Data Size **MAY** be written in various lengths, so for minor reductions of the Element Data, the Element Data Size **MAY** be written to a longer octet length to fill the freed space.

For example, the first row of
[@tableShortenVintOneOctet] depicts a String Element that stores
an Element ID (3 octets), Element Data Size (1 octet), and Element Data (4
octets). If the Element Data is changed to reduce the length by one octet, and
if the current length of the Element Data Size is less than its maximum
permitted length, then the Element Data Size of that Element
**MAY** be rewritten to increase its length by one octet. Thus,
before and after the change, the EBML Element maintains the same length of 8
octets, and data around the Element does not need to be moved.

| Status      | Element ID | Element Data Size | Element Data       |
|-------------|------------|-------------------|--------------------|
| Before edit | 0x3B4040   | 0x84              | 0x65626D6C         |
| After edit  | 0x3B4040   | 0x4003            | 0x6D6B76           |
Table: Example of editing a VINT to
reduce VINT_DATA length by one octet{#tableShortenVintOneOctet}

This method is **RECOMMENDED** when the Element Data is
reduced by a single octet; for reductions by two or more octets, it is
**RECOMMENDED** to fill the freed space with a Void Element.

Note that if the Element Data needs to be shortened by
one octet and the Element Data Size could be rewritten as a shorter VINT, then
it is **RECOMMENDED** to rewrite the Element Data Size as one octet
shorter, shorten the Element Data by one octet, and follow that Element with a
Void Element. For example,
[@tableShortenVintMoreThanOneOctet] depicts a String Element
that stores an Element ID (3 octets), Element Data Size (2 octets, but could
be rewritten in one octet), and Element Data (3 octets). If the Element Data
is to be rewritten to a two-octet length, then another octet can be taken from
Element Data Size so that there is enough space to add a two-octet Void
Element.

Status | Element ID | Element Data Size | Element Data | Void Element
-------|------------|-------------------|--------------|-------------
Before | 0x3B4040   | 0x4003            | 0x6D6B76     |
After  | 0x3B4040   | 0x82              | 0x6869       | 0xEC80
Table: Example of editing a
VINT to reduce VINT_DATA length by more than one octet{#tableShortenVintMoreThanOneOctet}

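The size rewriting used in both tables of this section can be sketched in Python as follows. The helper `encode_size` is hypothetical; it writes a given value as a VINT of a chosen octet length and assumes that the all-ones VINT_DATA pattern is reserved.

```python
def encode_size(value: int, length: int) -> bytes:
    """Encode an Element Data Size as a VINT of exactly `length` octets."""
    if not 1 <= length <= 8:
        raise ValueError("VINT length must be between 1 and 8 octets")
    if not 0 <= value <= (1 << (7 * length)) - 2:
        raise ValueError("value does not fit in a VINT of that length")
    marker = 1 << (7 * length)  # VINT_MARKER bit for this length
    return (marker | value).to_bytes(length, "big")


ELEMENT_ID = bytes.fromhex("3B4040")

# First table: the data shrinks by one octet, the size grows by one octet.
before = ELEMENT_ID + encode_size(4, 1) + bytes.fromhex("65626D6C")
after = ELEMENT_ID + encode_size(3, 2) + bytes.fromhex("6D6B76")

# Second table: the size shrinks to one octet, leaving room for a Void Element.
with_void = ELEMENT_ID + encode_size(2, 1) + bytes.fromhex("6869") + bytes.fromhex("EC80")

assert len(before) == len(after) == len(with_void) == 8
```

In all three layouts the total stays at 8 octets, so the data surrounding the Element does not move.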
### Terminating Element Data

For String Elements and UTF-8 Elements, the length of Element Data could be
reduced by adding Null Octets to terminate the Element Data (see
(#terminating-elements)).

In [@tableExampleNullPadding], Element Data four octets
long is changed to a value three octets long, followed by a Null Octet; the
Element Data Size includes any Null Octets used to terminate Element Data and therefore
remains unchanged.

| Status      | Element ID | Element Data Size | Element Data       |
|-------------|------------|-------------------|--------------------|
| Before edit | 0x3B4040   | 0x84              | 0x65626D6C         |
| After edit  | 0x3B4040   | 0x84              | 0x6D6B7600         |
Table: Example of terminating VINT_DATA
with a Null Octet when reducing VINT length during an edit{#tableExampleNullPadding}

Note that this method is **NOT RECOMMENDED**. For
reductions of one octet, the method for Extending the Element Data Size
**SHOULD** be used. For reduction by more than one octet, the
method for Adding a Void Element **SHOULD** be used.

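For completeness, the Null Octet padding shown in the table can be sketched in Python as follows, keeping in mind that this method is **NOT RECOMMENDED**. The function name is hypothetical, and rejecting new values that already contain a Null Octet is a design choice of this sketch.

```python
def overwrite_with_terminator(old_data: bytes, new_value: bytes) -> bytes:
    """Return Element Data of unchanged length holding a shorter value."""
    if b"\x00" in new_value:
        raise ValueError("the new value must not contain Null Octets")
    if len(new_value) > len(old_data):
        raise ValueError("the new value does not fit in the existing space")
    # Pad with Null Octets so that the Element Data Size stays the same.
    return new_value + b"\x00" * (len(old_data) - len(new_value))
```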
## Considerations when Updating Elements with Cyclic Redundancy Check (CRC)

If the Element to be changed is a Descendant Element of any Master Element
that contains a CRC-32 Element (see (#crc-32-element)),
then the CRC-32 Element **MUST** be verified before permitting the
change. Additionally, the CRC-32 Element value **MUST** be
subsequently updated to reflect the changed data.

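The verify-then-update order described above can be sketched in Python for a single Master Element. The function name is hypothetical, and the sketch assumes that `zlib.crc32` implements the intended CRC-32 variant and that the stored value is four octets in little-endian order. A real writer would repeat this for every Ancestor Element that carries a CRC-32 Element.

```python
import struct
import zlib


def updated_crc32(stored_crc: bytes, old_data: bytes, new_data: bytes) -> bytes:
    """Verify the stored CRC-32 against old_data, then return the
    CRC-32 Element Data that matches new_data."""
    (expected,) = struct.unpack("<I", stored_crc)
    if zlib.crc32(old_data) & 0xFFFFFFFF != expected:
        raise ValueError("CRC-32 does not validate; change not permitted")
    return struct.pack("<I", zlib.crc32(new_data) & 0xFFFFFFFF)
```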
# Backward and Forward Compatibility

Elements of an EBML format **SHOULD** be designed with backward and forward compatibility in mind.

## Backward Compatibility

Backward compatibility of new EBML Elements can be achieved by using default values for mandatory elements. The default value **MUST** represent the state that was assumed for previous versions of the EBML Schema, without this new EBML Element. If such a state doesn't make sense for previous versions, then the new EBML Element **SHOULD NOT** be mandatory.

Non-mandatory EBML Elements can be added in a new DocTypeVersion. Since
they are not mandatory, they won't be found in older versions of the
DocTypeVersion, just as they might not be found in newer versions. This

## Forward Compatibility

EBML Elements **MAY** be marked as deprecated in a new
DocTypeVersion using the `maxver` attribute of the EBML Schema. If such an
Element is found in an EBML Document with a newer version of the
DocTypeVersion, it **SHOULD** be discarded.
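
The discard rule can be sketched in Python as shown below. The function name is hypothetical, and the sketch assumes that "newer" means a DocTypeVersion greater than the Element's `maxver`, with `None` standing for an Element that has no `maxver` attribute.

```python
from typing import Optional


def should_discard(maxver: Optional[int], doc_type_version: int) -> bool:
    """Return True if a deprecated Element appears in a newer document."""
    return maxver is not None and doc_type_version > maxver
```

Under this reading, an Element with a `maxver` of 2 is kept in a document with DocTypeVersion 2 and discarded in one with DocTypeVersion 3.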

# Security Considerations

EBML itself does not offer any kind of security and does not provide confidentiality. EBML does not provide any kind of authorization. EBML only offers marginally useful and effective data integrity options, such as CRC elements.

Even if the semantic layer offers any kind of encryption, EBML itself could
leak information at both the semantic layer (as declared via the DocType
Element) and within the EBML structure (the presence of EBML Elements can be
derived even with an unknown semantic layer using a heuristic approach -- not
without errors, of course, but with a certain degree of confidence).

An EBML Document that has the following issues may still be handled by the
EBML Reader and the data accepted as such, depending on how strict the EBML
Reader wants to be:

- Invalid Element IDs that are longer than the limit stated in the EBMLMaxIDLength Element of the EBML Header.
- Invalid Element IDs that are not encoded in the shortest-possible way.
- Invalid Element Data Size values that are longer than the limit stated in the EBMLMaxSizeLength Element of the EBML Header.

Element IDs that are unknown to the EBML Reader **MAY** be
accepted as valid EBML IDs in order to skip such elements.

EBML Elements with a string type may contain extra data after the first
0x00. These data **MUST** be discarded according to the

An EBML Reader may discard some or all data if the following errors are found in the EBML Document:

- Invalid Element Data Size values (e.g., extending the length of the EBML Element beyond the scope of the Parent Element, possibly triggering access-out-of-bounds issues).
- Very high lengths in order to force out-of-memory situations resulting in a denial of service, access-out-of-bounds issues, etc.
- Missing EBML Elements that are mandatory in a Master Element and have no declared default value, making the semantic invalid at that Master Element level.
- Usage of invalid UTF-8 encoding in EBML Elements of UTF-8 type (e.g., in order to trigger access-out-of-bounds or buffer-overflow issues).
- Usage of invalid data in EBML Elements with a date type, triggering bogus date accesses.
- The CRC-32 Element (see (#crc-32-element)) of a Master Element doesn't match the rest of the content of that Master Element.

Side-channel attacks could exploit:

- The semantic equivalence of the same string stored in a String Element or UTF-8 Element with and without zero-bit padding, making comparison at the semantic level invalid.
- The semantic equivalence of VINT\_DATA within Element Data Size with two different lengths due to left-padding zero bits, making comparison at the semantic level invalid.
- Data contained within a Master Element that is not itself part of a Child Element, which can trigger incorrect parsing behavior in EBML Readers.
- Extraneous copies of Identically Recurring Elements, making parsing unnecessarily slow to the point of not being usable.
- Copies of Identically Recurring Elements within a Parent Element that contain invalid CRC-32 Elements. EBML Readers not checking the CRC-32 might use the version of the element with mismatching CRC-32s.
- Use of Void Elements that could be used to hide content or create bogus resynchronization points seen by some EBML Readers and not others.

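As an illustration of the first two error cases listed above (an Element Data Size that reaches beyond the Parent Element, and very high lengths), a defensive check might look like the following Python sketch. The function name, the offset-based parameters, and the 16 MiB default limit are hypothetical; the limit only makes sense for a reader that buffers whole Elements in memory.

```python
def check_data_size(data_start: int, data_size: int, parent_end: int,
                    max_buffer: int = 16 * 1024 * 1024) -> None:
    """Reject Element Data that overruns its Parent or the reader's limit.

    data_start and parent_end are octet offsets into the EBML Document.
    """
    if data_start + data_size > parent_end:
        raise ValueError("Element Data extends beyond its Parent Element")
    if data_size > max_buffer:
        raise ValueError("Element Data Size exceeds the reader's buffer limit")
```

Running such a check before allocating or seeking keeps a malformed size from turning into an out-of-bounds access or an oversized allocation.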
# IANA Considerations

## EBML Element IDs Registry

This document creates a new IANA registry called the
"EBML Element IDs" registry.

Element IDs are described in (#element-id). Element
IDs are encoded using the VINT mechanism described in
(#variable-size-integer) and can be between one and five
octets long. Five-octet-long Element IDs are possible only if declared

This IANA registry only applies to Elements that can be contained
in the EBML Header, thus including Global Elements. Elements only
found in the EBML Body have their own set of independent Element IDs
and are not part of this IANA registry.

One-octet Element IDs **MUST** be between 0x80 and
0xFE. These items are valuable because they are short, and they need
to be used for commonly repeated elements. Element IDs are to be
allocated within this range according to the "RFC Required"
policy [@!RFC8126].

The following one-octet Element ID is RESERVED: 0xFF.

Values in the one-octet range of 0x00 to 0x7F are not valid for use
as an Element ID.

Two-octet Element IDs **MUST** be between 0x407F and
0x7FFE. Element IDs are to be allocated within this range according to
the "Specification Required" policy
[@!RFC8126].

The following two-octet Element ID is RESERVED: 0x7FFF.

Values in the two-octet ranges of 0x0000 to 0x4000 and 0x8000 to 0xFFFF are
not valid for use as an Element ID.

Three-octet Element IDs **MUST** be between 0x203FFF and 0x3FFFFE. Element IDs are to be allocated within this range according to the "First Come First Served" policy [@!RFC8126].

The following three-octet Element ID is RESERVED: 0x3FFFFF.

Values in the three-octet ranges of 0x000000 to 0x200000 and
0x400000 to 0xFFFFFF are not valid for use as an Element ID.

Four-octet Element IDs **MUST** be between 0x101FFFFF
and 0x1FFFFFFE. Four-octet Element IDs are somewhat special in that
they are useful for resynchronizing to major structures in the event
of data corruption or loss. As such, four-octet Element IDs are split
into two categories. Four-octet Element IDs whose lower three octets
(as encoded) would make printable 7-bit ASCII values (0x20 to 0x7E,
inclusive) **MUST** be allocated by the "Specification
Required" policy. Sequential allocation of values is not
required: specifications **SHOULD** include a specific
request and are encouraged to do early allocations.

To be clear about the above category: four-octet Element IDs always start
with hex 0x10 to 0x1F, and that octet may be chosen so that the entire VINT
has some desirable property, such as a specific CRC. The other three octets,
when ALL having values between 0x20 (32, ASCII Space) and 0x7E (126, ASCII
"~"), fall into this category.

Other four-octet Element IDs may be allocated by the "First Come First Served" policy.

The following four-octet Element ID is RESERVED: 0x1FFFFFFF.

Values in the four-octet ranges of 0x00000000 to 0x10000000 and 0x20000000
to 0xFFFFFFFF are not valid for use as an Element ID.

Five-octet Element IDs (values from 0x080FFFFFFF to 0x0FFFFFFFFE) are RESERVED according to the "Experimental Use" policy [@!RFC8126]: they may be used by anyone at any time, but there is no coordination.

ID Values found in this document are assigned as initial values as follows:

Element ID | Element Name            | Reference
----------:|:------------------------|:-------------------------------------------
0x1A45DFA3 | EBML                    | Described in (#ebml-element)
0x4286     | EBMLVersion             | Described in (#ebmlversion-element)
0x42F7     | EBMLReadVersion         | Described in (#ebmlreadversion-element)
0x42F2     | EBMLMaxIDLength         | Described in (#ebmlmaxidlength-element)
0x42F3     | EBMLMaxSizeLength       | Described in (#ebmlmaxsizelength-element)
0x4282     | DocType                 | Described in (#doctype-element)
0x4287     | DocTypeVersion          | Described in (#doctypeversion-element)
0x4285     | DocTypeReadVersion      | Described in (#doctypereadversion-element)
0x4281     | DocTypeExtension        | Described in (#doctypeextension-element)
0x4283     | DocTypeExtensionName    | Described in (#doctypeextensionname-element)
0x4284     | DocTypeExtensionVersion | Described in (#doctypeextensionversion-element)
0xBF       | CRC-32                  | Described in (#crc-32-element)
0xEC       | Void                    | Described in (#void-element)
Table: IDs and Names for EBML Elements assigned by this document

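The ranges and allocation policies of this registry can be summarized in the following Python sketch. The names `ID_RANGES`, `POLICIES`, and `registration_policy` are hypothetical, and the octet length is derived from the numeric value of the encoded Element ID, which is an assumption that holds only for IDs within the valid ranges.

```python
ID_RANGES = {
    1: (0x80, 0xFE),
    2: (0x407F, 0x7FFE),
    3: (0x203FFF, 0x3FFFFE),
    4: (0x101FFFFF, 0x1FFFFFFE),
    5: (0x080FFFFFFF, 0x0FFFFFFFFE),
}
POLICIES = {
    1: "RFC Required",
    2: "Specification Required",
    3: "First Come First Served",
    5: "Experimental Use",
}


def registration_policy(element_id: int) -> str:
    """Return the allocation policy for an encoded Element ID."""
    octets = max(1, (element_id.bit_length() + 7) // 8)
    if octets not in ID_RANGES:
        raise ValueError("Element IDs are one to five octets long")
    low, high = ID_RANGES[octets]
    if not low <= element_id <= high:
        raise ValueError("value is reserved or not valid as an Element ID")
    if octets == 4:
        # Printable lower three octets fall under the stricter policy.
        lower_three = element_id.to_bytes(4, "big")[1:]
        printable = all(0x20 <= octet <= 0x7E for octet in lower_three)
        return "Specification Required" if printable else "First Come First Served"
    return POLICIES[octets]
```

For instance, an ID such as 0x4286 falls in the two-octet range governed by "Specification Required", while the reserved value 0xFF is rejected.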
## EBML DocTypes Registry

This document creates a new IANA registry called the "EBML DocTypes" registry.

To register a new DocType in this registry, one needs a DocType name, a Description of the DocType, a Change Controller (IESG or email of registrant), and an optional Reference to a document describing the DocType.

DocType values are described in (#doctype). DocTypes
are ASCII strings, defined in (#string-element), which
label the official name of the EBML Document Type. The strings may be
allocated according to the "First Come First Served" policy.

The use of ASCII corresponds to the types and code already in use; the
value is not meant to be visible to the user.

DocType string values of "matroska" and "webm" are RESERVED to the IETF for future use.
These can be assigned via the "IESG Approval" or "RFC Required" policies [@!RFC8126].