Lysann Schlegel edited this page Nov 27, 2018 · 2 revisions

Note: This document is a work in progress.


Note: This is not a specification of the format. Rather, it collects what we have learned by reverse-engineering the format.

TODO: summary

All numbers are stored in little-endian byte order, unless otherwise specified.
Some numbers are marked as compressed integers. These use the 7-bit encoded integer format written by .NET's BinaryWriter.Write7BitEncodedInt, described here: https://docs.microsoft.com/en-us/dotnet/api/system.io.binarywriter.write7bitencodedint?view=netframework-4.7.2
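To illustrate the compressed format, here is a minimal Python sketch of a decoder. It assumes the usual 7-bit scheme: each byte carries 7 bits of the value, least significant group first, and a set high bit means another byte follows. The function name read_compressed_int is our own and does not come from the format.

```python
from typing import Tuple


def read_compressed_int(data: bytes, pos: int = 0) -> Tuple[int, int]:
    """Decode a 7-bit encoded integer starting at data[pos].

    Returns (value, position of the first byte after the integer).
    """
    value = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        value |= (byte & 0x7F) << shift  # low 7 bits carry the payload
        if byte & 0x80 == 0:             # high bit clear: last byte
            return value, pos
        shift += 7


# A value below 128 fits in one byte; 300 needs two (0xAC 0x02).
print(read_compressed_int(b"\x04"))
print(read_compressed_int(b"\xac\x02"))
```
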

Abstract Format

The file consists of a content section and a tags section.
The last 4 bytes of the file specify the offset of the tags section.

Offset  Field                         Size
0       content section               variable
X       tags section                  variable
-4      offset of tags section (X)    4
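The layout above can be sketched in Python as follows. This is only an illustration: it assumes the offset is an unsigned 32-bit value counted from the start of the file, and the function name split_sections is hypothetical.

```python
import struct
from typing import Tuple


def split_sections(data: bytes) -> Tuple[bytes, bytes]:
    """Split a file image into (content section, tags section)."""
    if len(data) < 4:
        raise ValueError("file too short to contain the tags offset")
    # Assumption: unsigned 32-bit little-endian offset from start of file.
    (tags_offset,) = struct.unpack_from("<I", data, len(data) - 4)
    if tags_offset > len(data) - 4:
        raise ValueError("tags offset points past the end of the file")
    content = data[:tags_offset]
    tags = data[tags_offset:len(data) - 4]
    return content, tags


# Synthetic example: 4 content bytes, 2 tag bytes, then the offset (4).
demo = b"AAAA" + b"TT" + struct.pack("<I", 4)
content, tags = split_sections(demo)
```
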

Content Section

The content section is a hierarchical data structure. Each node can consist of any number of sub-nodes and attributes.
(TODO node structure)
The root node has an additional 0 at the end before the tags section.

Attributes consist of length (compressed integer) and contents (variable length).
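As a rough sketch, a single attribute could be read like this in Python. Note the assumptions: the 2-byte tag ID in front of the length is inferred from the example in this section, not from a confirmed layout, and the names read_compressed_int and read_attribute are our own.

```python
import struct
from typing import Tuple


def read_compressed_int(data: bytes, pos: int) -> Tuple[int, int]:
    """Decode a 7-bit encoded integer; return (value, next position)."""
    value = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        value |= (byte & 0x7F) << shift
        if byte & 0x80 == 0:
            return value, pos
        shift += 7


def read_attribute(data: bytes, pos: int = 0) -> Tuple[int, bytes, int]:
    """Read tag ID, length and contents; return (tag_id, contents, next position)."""
    (tag_id,) = struct.unpack_from("<H", data, pos)  # 16-bit little-endian ID
    pos += 2
    length, pos = read_compressed_int(data, pos)
    contents = data[pos:pos + length]
    if len(contents) != length:
        raise ValueError("attribute contents are truncated")
    return tag_id, contents, pos + length


# Bytes taken from the example: attribute 0x8000, length 8, 16-bit characters.
raw = bytes.fromhex("00 80 08 64 00 61 00 74 00 61 00")
tag_id, contents, end = read_attribute(raw)
```

The contents are left as raw bytes because their meaning is up to the application; decoding them as UTF-16 here is only a guess based on how the example looks.
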

Example

04 00                          beginning of node: 0x0004
   03 00                          beginning of node: 0x0003
      02 80                          attribute: 0x8002
         04                             length: 4
         02 00 00 00                    contents: (interpreted by application)
   00                             end of node
   00 80                          attribute: 0x8000
      08                             length: 8
      64 00 61 00 74 00 61 00        contents: (interpreted by application; looks like "data" as a 16-bit character string)
   02 80                          attribute: 0x8002
      04                             length: 4
      01 00 00 00                    contents: (interpreted by application)
00                             end of node

Tag IDs

Tag IDs are 16-bit integers.
Tag IDs for structure tags always have their most significant bit unset, for example 0x0005.
Tag IDs for attribute tags always have their most significant bit set, for example 0x8005.
The special tag ID 0 specifies the end of a node.
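A small Python sketch of this classification, assuming attribute IDs have the most significant bit set (as the example ID 0x8005 suggests). The function name and the returned labels are hypothetical.

```python
def classify_tag_id(tag_id: int) -> str:
    """Classify a 16-bit tag ID as 'end', 'attribute' or 'structure'."""
    if not 0 <= tag_id <= 0xFFFF:
        raise ValueError("tag IDs are 16-bit integers")
    if tag_id == 0:
        return "end"        # special ID 0 closes the current node
    if tag_id & 0x8000:
        return "attribute"  # most significant bit set
    return "structure"      # most significant bit unset


for example_id in (0x0005, 0x8005, 0x0000):
    print(hex(example_id), classify_tag_id(example_id))
```
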

Tags Section

There are two lists in the tags section: first structures, then attributes.

Field                           Notes
number of structure tags (N)    compressed integer
structure tag 1
...
structure tag N
number of attribute tags (M)    compressed integer
attribute tag 1
...
attribute tag M

Tag Declaration

Field     Size        Notes
Name      variable    0-terminated ASCII string
Tag ID    2
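Putting the two lists and the tag declaration together, the tags section could be read roughly as follows. This is a sketch under the assumptions already stated on this page (7-bit encoded counts, little-endian 16-bit IDs). All function names are our own, and the tag names in the sample data are made up for illustration.

```python
import struct
from typing import Dict, Tuple


def read_compressed_int(data: bytes, pos: int) -> Tuple[int, int]:
    """Decode a 7-bit encoded integer; return (value, next position)."""
    value = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        value |= (byte & 0x7F) << shift
        if byte & 0x80 == 0:
            return value, pos
        shift += 7


def read_tag_list(data: bytes, pos: int) -> Tuple[Dict[int, str], int]:
    """Read a count followed by that many tag declarations."""
    count, pos = read_compressed_int(data, pos)
    tags = {}
    for _ in range(count):
        end = data.index(b"\x00", pos)        # name is 0-terminated
        name = data[pos:end].decode("ascii")
        pos = end + 1
        (tag_id,) = struct.unpack_from("<H", data, pos)
        pos += 2
        tags[tag_id] = name
    return tags, pos


def parse_tags_section(data: bytes) -> Tuple[Dict[int, str], Dict[int, str]]:
    """Return (structure tags, attribute tags), each mapping ID to name."""
    structures, pos = read_tag_list(data, 0)
    attributes, pos = read_tag_list(data, pos)
    return structures, attributes


# Synthetic section with invented names: one structure tag, two attribute tags.
sample = (b"\x01" + b"Root\x00\x04\x00"
          + b"\x02" + b"Name\x00\x00\x80" + b"Value\x00\x02\x80")
structures, attributes = parse_tags_section(sample)
```

Mapping IDs to names (rather than names to IDs) is a design choice for this sketch: it makes it easy to look up the name of each tag ID met while walking the content section.
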