Format: CsPack Archive

Robert Jordan edited this page Oct 4, 2021 · 2 revisions

The asset archive format used by the CatSystem (1) engine. CsPack archives come in two versions, differentiated by their file signature. The newer "CsPack2" archives use the .dat extension, while the earlier "CsPack1" archives use the file extension to label their contents.

File Structure

Note: The '2' in "CsPack2" has no relation to the newer CatSystem2 engine; it only indicates the file version. It could be considered a multiplier for the length of each File Entry in bytes (12 * V), but internally the engine hardcodes support for the "CsPack1" and "CsPack2" versions.

| Data Type | Value | Description |
| --- | --- | --- |
| char[8] | "CsPack1" or "CsPack2" | File Signature and Version |
| uint32 | DataOffset | Absolute offset to file data, used to calculate EntryCount |
| Entry[EntryCount] | Entries | Entry table |
| ... | DataRegion | Entry file data region |

The number of entries is calculated by subtracting the header size (12 bytes) from the data offset and dividing the result by the entry size.

EntrySize = (Signature=="CsPack2" ? 24 : 12);

// Where 12 == sizeof(Header)
EntryCount = (DataOffset - 12) / EntrySize;
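The calculation above can be sketched in Python as follows. This is a minimal sketch, not the engine's code: it assumes little-endian fields and a signature that is NUL-padded to 8 bytes, neither of which is stated on this page. The names `read_header`, `HEADER_SIZE` and `ENTRY_SIZES` are made up for illustration.

```python
import struct

HEADER_SIZE = 12  # char[8] signature + uint32 DataOffset
ENTRY_SIZES = {b"CsPack1": 12, b"CsPack2": 24}

def read_header(data: bytes):
    """Return (signature, data_offset, entry_count) for a CsPack archive."""
    if len(data) < HEADER_SIZE:
        raise ValueError("too short to hold a CsPack header")
    # Assumption: little-endian layout, signature NUL-padded to 8 bytes.
    raw_sig, data_offset = struct.unpack_from("<8sI", data, 0)
    signature = raw_sig.rstrip(b"\x00")
    if signature not in ENTRY_SIZES:
        raise ValueError(f"unknown signature: {raw_sig!r}")
    entry_size = ENTRY_SIZES[signature]
    table_size = data_offset - HEADER_SIZE
    # The entry table must hold a whole number of entries.
    if table_size < 0 or table_size % entry_size:
        raise ValueError("DataOffset does not match a whole number of entries")
    return signature, data_offset, table_size // entry_size
```

For example, a "CsPack2" header with DataOffset 84 yields (84 - 12) / 24 = 3 entries.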

Version differences

Hardcoded differences in each CsPack version are as follows. The Filename row is explained in Entry filename encoding.

| Difference | CsPack1 | CsPack2 |
| --- | --- | --- |
| Signature | "CsPack1" | "CsPack2" |
| EntrySize | 12 | 24 |
| NameSize | 12 | 30 |
| Filename | 8.3 | 16.3 |
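The EntrySize and NameSize values are consistent with the entry layout described in the Entry section: each entry holds 2 or 5 uint32 name blocks plus one uint32 OffsetNext block, and the encoding diagrams show 6 characters per name block. The sketch below derives the table values from those two numbers. The class and constant names are hypothetical, chosen only for this illustration.

```python
from dataclasses import dataclass

BLOCK_BYTES = 4  # each block is a uint32
BLOCK_CHARS = 6  # characters per name block, as drawn in the encoding diagrams

@dataclass(frozen=True)
class CsPackVersion:
    signature: str
    name_blocks: int  # number of FileNameBlocks per entry

    @property
    def entry_size(self) -> int:
        # Name blocks plus the single OffsetNext block.
        return (self.name_blocks + 1) * BLOCK_BYTES

    @property
    def name_size(self) -> int:
        # Maximum decoded characters, null terminator included.
        return self.name_blocks * BLOCK_CHARS

CSPACK1 = CsPackVersion("CsPack1", 2)
CSPACK2 = CsPackVersion("CsPack2", 5)
```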

Entry

N = Header.EntryCount
offset = Header.DataOffset

| Data Type | Value | Description |
| --- | --- | --- |
| uint32[2] (CsPack1) or uint32[5] (CsPack2) | FileNameBlocks | Compressed name of the entry's file, stored as an 8.3 / 16.3 filename in RADIX 40 |
| uint32 | OffsetNext | XOR-encrypted DataRegion offset to the next entry's file data, used to calculate Length |

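The OffsetNext field, as stated, is relative to the DataRegion: a (decrypted) OffsetNext of 143 would put the next entry's file data at absolute offset DataOffset + 143. The first entry's data is always at absolute offset DataOffset, so an entry's Length is its own OffsetNext minus the OffsetNext of the entry before it (or minus 0 for the first entry).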

// Decrypt/Encrypt OffsetNext

// NOTE: Only the FIRST TWO filename blocks are ever used for the XOR encryption, regardless of file version.
OffsetNext ^= FileNameBlocks[0] ^ FileNameBlocks[1];
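As a rough illustration, the following Python sketch walks the entry table, decrypts each OffsetNext, and turns the running offsets into absolute start positions and lengths. It assumes little-endian uint32 fields, and it assumes that an entry's length is the difference between its own OffsetNext and the previous one. Both are readings of the description above rather than confirmed engine behaviour, and `read_entry_spans` is a hypothetical helper name.

```python
import struct

def read_entry_spans(data: bytes, data_offset: int, name_blocks: int):
    """Return a list of (name_blocks, absolute_start, length), one per entry."""
    entry_size = (name_blocks + 1) * 4       # name blocks + OffsetNext
    count = (data_offset - 12) // entry_size  # 12 == sizeof(Header)
    start = 0  # the first entry's data sits at the start of the DataRegion
    spans = []
    for i in range(count):
        # Assumption: every field is a little-endian uint32.
        fields = struct.unpack_from(f"<{name_blocks + 1}I", data, 12 + i * entry_size)
        blocks, encrypted = fields[:-1], fields[-1]
        # Only the first two name blocks take part in the XOR.
        offset_next = encrypted ^ blocks[0] ^ blocks[1]
        spans.append((blocks, data_offset + start, offset_next - start))
        start = offset_next
    return spans
```

Because XOR is its own inverse, the same expression both encrypts and decrypts the field.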

Entry filename encoding

Theoretically, up to a 26.3 filename could be stored in a CsPack2 entry, but due to a possible bug in the code (a logic error in positioning the extension), the extension is placed at index 16. Any characters past the extension are still written, even though they are made inaccessible.

The encoding method is almost identical to how MS-DOS / FAT12 8.3 filenames are stored internally. The biggest differences are the use of RADIX 40 (instead of RADIX 50) and the two length options, one for each CsPack version.
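To make the packing concrete: six base-40 digits fit in one uint32 because 40^6 = 4,096,000,000 is below 2^32 = 4,294,967,296. The sketch below packs and unpacks one block. Two things in it are assumptions, since this page specifies neither: the 40-character alphabet (the one used here is purely hypothetical) and the digit order (first character most significant).

```python
# HYPOTHETICAL alphabet: the real character table is not given on this page.
# Index 0 stands for the null terminator / padding character.
ALPHABET = "\0" + "abcdefghijklmnopqrstuvwxyz" + "0123456789" + "_-$"
RADIX = 40
BLOCK_CHARS = 6

assert len(ALPHABET) == RADIX
assert RADIX ** BLOCK_CHARS <= 2 ** 32  # six digits fit in a uint32

def encode_block(chars: str) -> int:
    """Pack exactly six characters into one uint32 value."""
    if len(chars) != BLOCK_CHARS:
        raise ValueError("a block holds exactly six characters")
    value = 0
    for ch in chars:
        # Assumption: the first character is the most significant digit.
        value = value * RADIX + ALPHABET.index(ch)
    return value

def decode_block(value: int) -> str:
    """Unpack one uint32 value into its six characters."""
    digits = []
    for _ in range(BLOCK_CHARS):
        value, digit = divmod(value, RADIX)
        digits.append(ALPHABET[digit])
    return "".join(reversed(digits))
```

With this layout an all-padding block encodes to 0, which fits the zeroed buffer described in the encoder notes.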

Encoding visuals

Legend
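Blocks

| Symbol | Meaning |
| --- | --- |
| O | OffsetNext field block |
| F | FileName field block |
| ^ | XOR with O block to decrypt |

Characters

| Symbol | Meaning |
| --- | --- |
| n | Decoded Name char |
| x | Decoded Extension char |
| 0 | Null terminator char |
| ? | Invalid/unused char |
| . | Detect extension at this char |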

"CsPack1" 8.3 filename (12 bytes)

             ^ ^
    <Entry:  F F O >
         ___/  |
        /     /
     nnnnnn nnxxx0
              .

"CsPack2" 16.3 filename (24 bytes)

             ^ ^
    <Entry:  F F F F F O >
       _____/ /  |  \ \______
      /      /    \  \       \
 nnnnnn nnnnnn nnnnxx x????? ?????0
                   .

Notes

Multiple bugs exist in the "CsPack2" implementation in CatSystem, the most glaring being the unexpected lengths of the 16.3 filename.

16.3 filename encoding bug

CsPack2 entries have room to store up to 30 characters (including the null terminator). Like CsPack1, 3 of these characters are reserved for the extension, so the theoretical limit would have been 26.3 filenames (26 + 3 + 1 null terminator = 30).

See Internal Details

CsPackReader class

Depending on CsPack1 or CsPack2, class fields are assigned to denote the length of each entry in bytes, and max length of decoded entry names.

| Field | CsPack1 | CsPack2 |
| --- | --- | --- |
| EntrySize | 12 | 24 |
| NameSize | 12 | 30 |

The issue arises in the EntryNameEncoder and EntryNameDecoder functions, which are the same function for both versions.

EntryNameEncoder

These steps fill a buffer with raw text before encoding into blocks.

  1. Read 'name' into buffer: Break at a '.' character, at the null terminator, or at index NameSize - 4. (good so far)
  2. Check for file 'extension': If the name index is on '.', increment the name index, set the buffer index to EntrySize - 8 (!!), and continue to step 3. For CsPack2 this is 24 - 8 = 16, not the NameSize - 4 = 26 that step 1 allows for.
  3. Read 'extension' into buffer: Break after the null terminator, or after 3 'extension' characters.
  4. After reading, no null termination is performed, as the buffer is zeroed beforehand. (TODO: confirm)
  5. Encode buffer into blocks: Break at index NameSize - 1(?), or only at the null terminator? (TODO: confirm)
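Steps 1 to 4 can be sketched as below. This is an illustrative reconstruction from the list above, not the engine's code. The function name `fill_name_buffer` is hypothetical, and the extension position is passed in as `ext_index` so that both the observed CsPack2 value (16) and the theoretically intended one (26) can be tried. Step 5 is left out because its details are unconfirmed.

```python
def fill_name_buffer(filename: str, name_size: int, ext_index: int) -> str:
    """Lay out name and extension in a zeroed buffer of name_size characters."""
    buf = ["\0"] * name_size  # zeroed beforehand, so no explicit terminator
    i = 0
    # Step 1: copy the name until '.', end of input, or index name_size - 4.
    while i < len(filename) and filename[i] != "." and i < name_size - 4:
        buf[i] = filename[i]
        i += 1
    # Step 2: if stopped on '.', skip it and jump the write position to ext_index.
    if i < len(filename) and filename[i] == ".":
        i += 1
        # Step 3: copy at most three extension characters.
        for k, ch in enumerate(filename[i:i + 3]):
            buf[ext_index + k] = ch
    return "".join(buf)
```

With `name_size=30` and `ext_index=16`, a 20-character name followed by ".png" ends up with "png" written over name characters 16 to 18 while character 19 stays in the buffer. That matches the remark that characters past the extension are still written but made inaccessible.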

Any.3 filename decoding bug

Note: It's highly possible CatSystem does not use decoded names during lookup, but instead encodes the lookup name and compares the name blocks this way.

In addition to the bugs in filename encoding, in both the CsPack1 and CsPack2 formats the decoded extension is checked for in yet another wrong place, resulting in CsPack1 4.3 / 8.0 (?) filenames and CsPack2 16.0 / 20.3 / 26.0 (?) filenames.
