Library for interfacing with SYSTEMAX's Easy Paint Tool Sai.

Cracking PaintTool Sai documents

This document represents about a year and a half of off-and-on hobby research into reverse engineering the digitizing raster/vector art program PaintTool Sai. This write-up focuses on the technical specification of the user-created .sai file format used to archive a user's artwork, and on the layers of abstraction implemented by SYSTEMAX that stand in the way of extracting this data outside of the original software. It is aimed at anyone who wants to implement their own library to read or interface with .sai files, or who just wants a comprehensive understanding of the decisions SYSTEMAX made for their file format. If you find anything in this document to be misleading, incomplete, or flat-out incorrect, feel free to shoot me an email at Wunkolo (at). Previous work includes my now-abandoned run-time exploitation framework SaiPal and the more recent Windows Explorer thumbnail extension SaiThumbs. This document assumes you have some knowledge of C and C++ syntax, as the data structures and algorithms here are presented in the form of C and C++ structures and subroutines.

PaintTool SAI Ver.1

PaintTool SAI is high quality and lightweight painting software, fully digitizer support, amazing anti-aliased paintings, provide easy and stable operation, this software make digital art more enjoyable and comfortable.

SYSTEMAX Software Development


  • Fully digitizer support with pressure.
  • Amazing anti-aliased drawings.
  • Highly accurate composition with 16bit ARGB channels.
  • Simple but powerful user interface, easy to learn.
  • Fully support Intel MMX Technology.
  • Data protection function to avoid abnormal termination such as bugs.

Copyright 1996-2016 SYSTEMAX Software Development


Sai uses the file type .sai as its document format for storing both raster and vector layers as well as other canvas-related meta-data. The .sai file, along with other files such as thumbnails and the sai.ssd file, is an archive containing a file-system-like structure once decrypted. Each layer, mask, and piece of related meta-data is stored in an individual pseudo-file, which is also subject to block-level encryption. The file itself is encrypted in ECB blocks, in which any randomly accessed block can be decrypted by also decrypting the appropriate Table-Block and accessing the 32-bit key found within. Some preliminary files such as thumbnails and the archive responsible for swatches/palettes have been found to use a different decryption key, block size, and Table-Block location. This document mostly covers the method used for Sai's user-created .sai documents and only partially shows related information for the other files.

An individual block in a .sai file is 4096 bytes of data. Every block index that is a multiple of 512(0, 512, 1024, etc) is a Table-Block containing meta-data about the block itself and the 511 blocks after it. Every other block that is not a Table-Block is a Data-Block:

// Gets the Table-Block index appropriate for the current block index
std::size_t NearestTable(std::size_t BlockIndex)
{
	return BlockIndex & ~(0x1FF);
}
// Demonstrating how to quickly determine if a block Index is a data-block or a table-block
bool IsTableBlock(std::size_t BlockIndex)
{
	return (BlockIndex & 0x1FF) ? false:true;
}
bool IsDataBlock(std::size_t BlockIndex)
{
	return (BlockIndex & 0x1FF) ? true:false;
}

All blocks are encrypted and decrypted symmetrically using a simple exclusive-or-based encryption which refers to a static table of 256 32-bit integers, which can be found at the end of this text. Different files related to Sai use different static keys. The key table used for the .sai file will be referred to as the UserKey, since it is the only symmetric key used to decrypt and encrypt files generated by the end-user. Table-Blocks and Data-Blocks are encrypted differently using the same UserKey.

Table-Blocks can be decrypted by random access using only their multiple-of-512 block index and the UserKey. The first block of a .sai file (block index 0) is a Table-Block storing related data for the 511 blocks after it. When decrypting a Table-Block, four of the 256 keys within UserKey are indexed by the four bytes of the 32-bit block index and then summed together. This sum is exclusive-ored with the current 4-byte cipher-word and the block index, and the result is rotated left by 16 bits. Every following word repeats the process with the previous cipher-word taking the place of the block index.

When decrypting a Data-Block, an initial decryption vector is given. The individual bytes of the 32-bit vector select four integers from UserKey, and their sum is exclusive-ored with the vector itself. This value is subtracted from the cipher-word to get the plaintext, and the cipher-word is then passed on as the vector for the next round. The initial vector is the checksum integer found in the Table-Block.

// Ensure BlockIndex is a valid Table-Block index
void DecryptTable(std::uint32_t BlockIndex, std::uint32_t* Data)
{
	// see "IsTableBlock" above on making sure BlockIndex
	// is a table or use:
	// BlockIndex &= (~0x1FF);
	for( std::size_t i = 0; i < 1024; i++ )
	{
		std::uint32_t CurCipher = Data[i];
		std::uint32_t X = BlockIndex ^ CurCipher ^ (
			UserKey[(BlockIndex >> 24) & 0xFF]
			+ UserKey[(BlockIndex >> 16) & 0xFF]
			+ UserKey[(BlockIndex >> 8) & 0xFF]
			+ UserKey[BlockIndex & 0xFF]);

		Data[i] = static_cast<std::uint32_t>((X << 16) | (X >> 16));

		BlockIndex = CurCipher;
	}
}

void DecryptData(std::uint32_t Vector, std::uint32_t* Data)
{
	for( std::size_t i = 0; i < 1024; i++ )
	{
		std::uint32_t CurCipher = Data[i];
		// Plaintext is the cipher-word minus the keyed vector value
		Data[i] =
			CurCipher
			- (Vector ^ (
				UserKey[Vector & 0xFF]
				+ UserKey[(Vector >> 8) & 0xFF]
				+ UserKey[(Vector >> 16) & 0xFF]
				+ UserKey[(Vector >> 24) & 0xFF]));
		Vector = CurCipher;
	}
}
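As a sanity check of the Data-Block scheme, the inverse (encryption) can be derived from the decryption routine: add the keyed value instead of subtracting it, and chain on the cipher-word that was just produced. The sketch below is only an illustration of that derivation. It is self-contained, so it uses a stand-in key table filled with arbitrary values rather than the real UserKey, and the names `KeyTable`, `MakeDemoKey`, `KeyStream`, `DecryptDataDemo`, and `EncryptDataDemo` are made up for this example.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

using KeyTable = std::array<std::uint32_t, 256>;

// Stand-in for UserKey: arbitrary values, NOT the real key table
inline KeyTable MakeDemoKey()
{
	KeyTable Key{};
	std::uint32_t State = 0x12345678u;
	for( std::size_t i = 0; i < Key.size(); i++ )
	{
		State = State * 1664525u + 1013904223u; // filler values only
		Key[i] = State;
	}
	return Key;
}

// Value combined with each word: Vector xor the sum of four key entries
inline std::uint32_t KeyStream(const KeyTable& Key, std::uint32_t Vector)
{
	return Vector ^ (
		Key[Vector & 0xFF]
		+ Key[(Vector >> 8) & 0xFF]
		+ Key[(Vector >> 16) & 0xFF]
		+ Key[(Vector >> 24) & 0xFF]);
}

// Mirrors DecryptData: plain = cipher - keystream, chained on the cipher-word
inline void DecryptDataDemo(
	const KeyTable& Key, std::uint32_t Vector,
	std::uint32_t* Data, std::size_t Count)
{
	for( std::size_t i = 0; i < Count; i++ )
	{
		const std::uint32_t CurCipher = Data[i];
		Data[i] = CurCipher - KeyStream(Key, Vector);
		Vector = CurCipher;
	}
}

// Derived inverse: cipher = plain + keystream, chained on the cipher-word
inline void EncryptDataDemo(
	const KeyTable& Key, std::uint32_t Vector,
	std::uint32_t* Data, std::size_t Count)
{
	for( std::size_t i = 0; i < Count; i++ )
	{
		Data[i] = Data[i] + KeyStream(Key, Vector);
		Vector = Data[i];
	}
}
```

Encrypting a buffer and then decrypting it with the same starting vector should give back the original words, since unsigned addition and subtraction wrap around modulo 2^32.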

Table-Blocks contain 512 8-byte structures, each containing a 32-bit checksum and a 32-bit integer used to store an index to the next block (similar to a singly linked list). Entry N of the table corresponds to the block N positions after the Table-Block. The first checksum entry found within the Table-Block is a checksum of the table itself, excluding the first 32-bit integer. Setting the first checksum to 0 and calculating the checksum of the entire table produces the same result as if the first entry was skipped. A table entry with a checksum of 0 is considered to be an unallocated/unused block.

struct TableEntry
{
	std::uint32_t Checksum;
	std::uint32_t NextBlock;
} TableEntries[512];
                                                   ~         ~
                             Table-Block           |         |
                     0 |0xChecksum|0xPrelimin|     |XXXX|XXXX| Block 512
Checksum used to+--> 1 |0xChecksum|0xPrelimin|     |XXXX|XXXX| 0x200200
decrypt block 513    2 |0xChecksum|0xPrelimin|     |XXXX|XXXX|
                     3 |0xChecksum|0xPrelimin|     +---------+
                     4 |0xChecksum|0xPrelimin|    /|         | Block 513
        512 entries  5 |0xChecksum|0xPrelimin|   / |         | 0x200400
                     6 |0xChecksum|0xPrelimi.|  /  |         |
                     7 |0xChecksum|0xPrelim..| /   +---------+
                     8 |0xChecksum|0xPreli...|<    |         | Block 514
                     9 |0xChecksum|0xPrel....|     |         | 0x200600
                    10 |0xChecksu.|          |     |         |
                       ~          ~          ~     +---------+
                                                   |         |
                                                   ~         ~

The checksum for Data-Blocks and Table-Blocks is a simple exclusive-or and bit-rotate which interprets all 4096 bytes of the block as 1024 32-bit integers, with the exception that the checksum for Table-Blocks does not include the first four bytes(the checksum integer of the block itself). All 1024 integers are exclusive-ored with an initial checksum of zero, which is rotated left 1 bit before the exclusive-or operation. Finally the lowest bit is set, making all checksums an odd number.

The NextBlock integer is a block index used to point to the next block that should be read if one is trying to read a serial stream of data. Ex: A large file that spans multiple blocks will be broken up into multiple blocks, and the table-block will use the "NextBlock" flag to point to the next block that should be read, with "0" being the last block.

// If your block number is a multiple of 512, set `Table` to true.
std::uint32_t Checksum(bool Table, std::uint32_t* Data)
{
	std::uint32_t Sum = 0;
	for( std::size_t i = (Table ? 1 : 0); i < 1024; i++ )
	{
		Sum = ( ( Sum << 1 ) | (Sum >> 31)) ^ Data[i];
	}
	return Sum | 1;
}

// Generic version for both Table-Blocks and Data-Blocks
// Works on tables if you set the first 32-bit integer to 0 before running.
std::uint32_t Checksum(std::uint32_t* Data)
{
	std::uint32_t Sum = 0;
	for( std::size_t i = 0; i < 1024; i++ )
	{
		Sum = ( ( Sum << 1 ) | (Sum >> 31)) ^ Data[i];
	}
	return Sum | 1;
}

A block-level corruption can be detected by a checksum mismatch. If the Data-Block's generated checksum does not match the checksum found at the appropriate table entry within the Table-Block then the Data-Block is considered corrupted.
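A corruption check along these lines might look like the sketch below. It assumes the Table-Block and the block under test have already been decrypted, and it looks up the table entry with the low 9 bits of the block index. `BlockChecksum` and `IsBlockIntact` are names invented for this example.

```cpp
#include <cstddef>
#include <cstdint>

struct TableEntry
{
	std::uint32_t Checksum;
	std::uint32_t NextBlock;
};

// Rotate-and-xor checksum over the 1024 words of a decrypted block.
// Table-Blocks skip their first word (their own checksum).
inline std::uint32_t BlockChecksum(const std::uint32_t* Data, bool Table)
{
	std::uint32_t Sum = 0;
	for( std::size_t i = (Table ? 1 : 0); i < 1024; i++ )
	{
		Sum = ((Sum << 1) | (Sum >> 31)) ^ Data[i];
	}
	return Sum | 1;
}

// Table:      the 512 entries of the decrypted Table-Block governing BlockIndex
// BlockIndex: absolute index of the block being verified
// Data:       the decrypted contents of that block
inline bool IsBlockIntact(
	const TableEntry* Table, std::size_t BlockIndex, const std::uint32_t* Data)
{
	const std::size_t Slot = BlockIndex & 0x1FF; // entry within the table
	return Table[Slot].Checksum == BlockChecksum(Data, Slot == 0);
}
```

Because the lowest bit is always set, a block of all zeros has a checksum of 1, which also keeps it distinct from the value 0 that marks an unused block.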


Sai internally uses a direct-mapped cache table to speed up the random access and decryption of a file by caching both Table-Blocks and Data-Blocks. An arbitrary block number has its cache entry looked up by first shifting the BlockNumber integer right by 14 bits and comparing the upper 18 bits of the block ID to the lower 31 bits of the cache entry found within the internally mounted file object. Should these two numbers match, a cache hit has occurred. Otherwise the block is fully loaded and decrypted into the cache. The mounted file context object (I've called it VFSObject in IDA Pro) has exactly 32 cache lines for Table-Blocks. The highest bit of a cache line is the dirty bit, which notes whether the block is due for a write-back before a new block overwrites the entry. Cache size seems to generally be the block size divided by 8, and so differs depending on the file being handled.

This cache is Sai's mechanism for minimizing constant file IO stalls at run-time and for efficient file writing and flushing. Changes are fully "flushed" simply by writing any remaining cache lines that have the dirty bit set to the file (and adjusting the checksums within the appropriate Table-Blocks if needed). If you plan to implement a library that reads from .sai files, you should probably follow the same cache routine as Sai to speed up your file access. Table-Blocks should at the very least be cached, as almost every random access of a .sai file requires you to read the appropriate Table-Block before being able to decrypt the Data-Block.
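For a reader library, even a much simpler cache than Sai's goes a long way. The sketch below is my own simplification and does not reproduce Sai's tag comparison or dirty-bit layout: it keeps 32 direct-mapped lines for decrypted Table-Blocks and is read-only. `TableCache` and the loader callback (which is expected to read and decrypt the requested Table-Block) are hypothetical names.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <utility>
#include <vector>

class TableCache
{
public:
	using Block = std::vector<std::uint32_t>; // 1024 decrypted words
	using Loader = std::function<Block(std::uint32_t)>;

	explicit TableCache(Loader LoadFn) : Load(std::move(LoadFn)) {}

	// Returns the decrypted Table-Block that governs BlockIndex
	const Block& Get(std::uint32_t BlockIndex)
	{
		const std::uint32_t TableIndex = BlockIndex & ~0x1FFu;
		// Direct mapping: each table can only live in one line
		Line& Slot = Lines[(TableIndex >> 9) % Lines.size()];
		if( !Slot.Valid || Slot.TableIndex != TableIndex )
		{
			// Cache miss: read and decrypt, replacing the old occupant
			Slot.Data = Load(TableIndex);
			Slot.TableIndex = TableIndex;
			Slot.Valid = true;
		}
		return Slot.Data;
	}

private:
	struct Line
	{
		bool Valid = false;
		std::uint32_t TableIndex = 0;
		Block Data;
	};
	std::array<Line, 32> Lines;
	Loader Load;
};
```

With this in place, decrypting a run of neighbouring Data-Blocks only costs one Table-Block read for every 511 blocks, as long as the run does not bounce between tables that share a cache line.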

File System

Now that the cipher can be randomly accessed and decrypted, the virtual file system stored inside it can be deciphered. The file system found after decrypting will be described as a Virtual File System or VFS (internally Sai refers to it as a VFS, along with terminology such as "mounting", within its error messages). Individual files are described by a File Allocation Table that records the name, timestamp, starting block index, and size (in bytes) of the data. A Data-Block can contain a maximum of 64 FATEntries. Folders are described by having their Type variable set to Folder, in which case the starting Block variable instead points to another Data-Block of 64 FATEntries depicting the contents of the folder.

enum class EntryType : std::uint8_t
{
	Folder = 0x10,
	File = 0x80
};

struct FATEntry
{
	std::uint32_t Flags;
	char Name[32];
	std::uint8_t Pad1;
	std::uint8_t Pad2;
	std::uint8_t Type; // EntryType enum
	std::uint8_t Pad4;
	std::uint32_t Block;
	std::uint32_t Size;
	// Windows FILETIME structure
	std::uint64_t TimeStamp;
	std::uint64_t UnknownB;
};

struct FATBlock
{
	FATEntry Entries[64];
};

Note: When reading the file data of a FATEntry, files are not stored contiguously.

Table-Blocks may interrupt the file stream and must be skipped. When reading file data you must abstract Table-Blocks away: skip over them as if they did not exist, so that the file appears continuous.

So byte ranges such as:

[0,4096],[2097152,2101248],[4194304,4198400],...,[TableIndex * 4096,TableIndex * 4096 + 4096]

must be skipped over.
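One way to picture this is to list which physical blocks a file touches. The sketch below assumes, as described above, that a file's data runs through consecutive blocks starting at its FATEntry's Block value, stepping over any Table-Block in the way. `FileBlocks` is a name made up for this example.

```cpp
#include <cstddef>
#include <vector>

// Lists the physical block indices holding Size bytes of file data
// that start at StartBlock, stepping over Table-Blocks.
inline std::vector<std::size_t> FileBlocks(std::size_t StartBlock, std::size_t Size)
{
	const std::size_t BlockSize = 4096;
	std::vector<std::size_t> Blocks;
	std::size_t Block = StartBlock;
	for( std::size_t Read = 0; Read < Size; Read += BlockSize )
	{
		if( (Block & 0x1FF) == 0 )
		{
			Block++; // multiple of 512: a Table-Block, not file data
		}
		Blocks.push_back(Block++);
	}
	return Blocks;
}
```

For example, a three-block file starting at block 510 would occupy blocks 510, 511, and 513, because block 512 is a Table-Block.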

Some info on TimeStamp: To convert this 64-bit integer to the more standardized time_t variable, simply divide the 64-bit integer by 10000000ULL and subtract 11644473600ULL. FILETIME is the number of 100-nanosecond intervals since January 1, 1601, while time_t is the number of 1-second intervals since January 1, 1970. If you're writing a multi-platform library it's best to use the more standardized time_t format when available, as most functions converting timestamps into strings use the time_t format.

time_t filetime_to_time_t(std::uint64_t Time)
{
	return Time / 10000000ULL - 11644473600ULL;
}
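As a usage sketch, the converted value can be handed straight to the standard C time functions. `FileTimeToUnix` repeats the conversion above so the example is self-contained, and `FormatUtc` is a helper invented here that prints a timestamp in the same style as the file listings in this document, in UTC.

```cpp
#include <cstdint>
#include <ctime>
#include <string>

// FILETIME (100ns ticks since 1601) to seconds since 1970
inline std::time_t FileTimeToUnix(std::uint64_t FileTime)
{
	return static_cast<std::time_t>(FileTime / 10000000ULL - 11644473600ULL);
}

// Formats a time_t as "YYYY/MM/DD HH:MM:SS" in UTC
inline std::string FormatUtc(std::time_t Time)
{
	char Buffer[32] = {};
	const std::tm* Utc = std::gmtime(&Time);
	if( Utc == nullptr )
	{
		return std::string(); // value not representable
	}
	std::strftime(Buffer, sizeof(Buffer), "%Y/%m/%d %H:%M:%S", Utc);
	return Buffer;
}
```

The FILETIME value 116444736000000000 is exactly the Unix epoch, which makes it a handy test value.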

The root directory of the VFS always starts at block index 2. This is always the position of the first FATBlock, containing the 64 FATEntries of the root folder. If the Flags variable of a FATEntry structure is 0, the entry is considered to be unused. The full hierarchy of files can be traversed simply by iterating through the 64 entries of the FATBlock at block index 2 and stopping at the first entry whose Flags variable is 0. If any of the FATEntries is a folder, recursively iterate over the 64 FATEntries found at its Block variable. If the entry is a file, go to the starting block index and read Size bytes consecutively, decrypting the appropriate Data-Blocks along the way should Size be larger than one block (0x1000 bytes). Padding bytes within a block are always 0.
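The traversal described above can be sketched as a small recursive walk. This example does no file IO or decryption itself: it assumes a caller-supplied callback (the hypothetical `ReadFATBlock`) that returns the 64 already-decrypted entries of a given block. `ListTree` is likewise a name made up for this sketch.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <string>
#include <vector>

struct FATEntry
{
	std::uint32_t Flags;
	char Name[32];
	std::uint8_t Pad1, Pad2, Type, Pad4;
	std::uint32_t Block;
	std::uint32_t Size;
	std::uint64_t TimeStamp;
	std::uint64_t UnknownB;
};
static_assert(sizeof(FATEntry) == 64, "64 entries must fill one 4096-byte block");

// Hypothetical callback: returns the 64 decrypted entries stored in a block
using ReadFATBlock = std::function<const FATEntry*(std::uint32_t)>;

// Appends the full path of every file and folder reachable from Block
inline void ListTree(
	const ReadFATBlock& Read, std::uint32_t Block,
	const std::string& Prefix, std::vector<std::string>& Out)
{
	const FATEntry* Entries = Read(Block);
	// Stop at the first unused entry (Flags == 0)
	for( std::size_t i = 0; i < 64 && Entries[i].Flags != 0; i++ )
	{
		// A name may fill all 32 bytes, so do not rely on a terminator
		std::size_t Length = 0;
		while( Length < 32 && Entries[i].Name[Length] != '\0' )
		{
			Length++;
		}
		const std::string Path = Prefix + "/" + std::string(Entries[i].Name, Length);
		if( Entries[i].Type == 0x10 ) // Folder
		{
			Out.push_back(Path + "/");
			ListTree(Read, Entries[i].Block, Path, Out);
		}
		else
		{
			Out.push_back(Path);
		}
	}
}
```

Calling `ListTree(Read, 2, "", Out)` starts at the root FATBlock and produces paths in the same shape as the listing in the next section.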

From this point on it is assumed you are capable of decrypting the file for random access and can interpret the internal file system format. Now we will look at the actual files and the structure in which they are placed within this virtual file system.

Folder structure

The actual file/folder structure found within .sai files describes information on the canvas, layers, a thumbnail image, and other meta-data. Here is a sample file structure of a .sai file created in October.

/.a1541b366925e034 |     32 bytes | 2016/10/12 03:53:53
/canvas            |     56 bytes | 2016/10/12 03:53:53
/laytbl            |     60 bytes | 2016/10/12 03:53:53
/layers/           |          --- | 2016/10/12 03:53:53
     /0000000a     | 464007 bytes | 2016/10/12 03:53:53
     /00000010     |    452 bytes | 2016/10/12 03:53:53
     /0000000e     |    361 bytes | 2016/10/12 03:53:53
     /00000011     |    373 bytes | 2016/10/12 03:53:53
     /00000012     |    373 bytes | 2016/10/12 03:53:53
     /0000000f     |    538 bytes | 2016/10/12 03:53:53
     /0000000b     |  82454 bytes | 2016/10/12 03:53:53
/subtbl            |     12 bytes | 2016/10/12 03:53:53
/sublayers/        |          --- | 2016/10/12 03:53:53
     /0000000d     |  87213 bytes | 2016/10/12 03:53:53
/thumbnail         |  90012 bytes | 2016/10/12 03:53:53

The first entry, .a1541b366925e034, will vary in name but will always be the first entry. See .xxxxxxxxxxxxxxxx for more info on this file.

Serialization Streams

Before going into the file formats a specific format of serialization needs to be explained that is found across the internal files. Sai.exe internally uses a specially formatted array of 32 bit integers that describe how serialized data is to be read and written to a file. A size of 0 delimits the end of the table.

Format of the Serial-Table found within Sai.exe for the reso identifier.

  Serialization Table for `reso` identifier 

0-0x00000004 Serial Entry+-----+------------------------+
  0x0000014C <-----------+     |                        |
1-0x00000002 Serial Entry \    |  Size in Bytes         |
  0x00000150 <-----------+ \   +------------------------+
2-0x00000002 Serial Entry   \  |                        |
  0x00000152 <-----------+   \ |  Runtime Offset        |
  0x00000000 End               +------------------------+

Runtime Offset is the offset within the runtime object where Size amount of data gets written to in memory after reading from the file. In C++ code this would be the offsetof and sizeof macro of specific fields of an object being stored in an array. One could trace what an unknown serial entry does by finding what runtime object gets written to and finding out when that specific field gets used again.

SYSTEMAX Source code, probably

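struct ResData
{
	std::uint32_t DPI; // 0x14C bytes within some class/struct/etc
	std::uint16_t Unknown150;
	std::uint16_t Unknown152;
};

// Pairs of (Size in Bytes, Runtime Offset), as in the table above
std::uint32_t ResDataStream[] =
{
	sizeof(ResData::DPI),        offsetof(ResData, DPI),
	sizeof(ResData::Unknown150), offsetof(ResData, Unknown150),
	sizeof(ResData::Unknown152), offsetof(ResData, Unknown152),
	0 // A size of 0 delimits the end of the table
};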

Output written by the Serial-Table for some arbitrary runtime ResData object

6F 73 65 72 08 00 00 00 00 00 48 00 00 00 00 00
^         ^ ^         ^ ^         ^ ^   ^ ^   ^
+---------+ +---------+ +---------+ +---+ +---+
  `oser`       Size       Serial    Ser.  Ser.
                          Data      Data  Data
                            0        1     2

oser is the little endian storage of reso. In code the identifier oser is actually defined as something along the lines of:

const std::uint32_t ResDataMagic = 'reso';

Size is simply the sum of all Size integers for each Serial Entry. This integer gets written so that entire streams of unneeded data may be skipped. If two streams reso and lyid were next to each other, one could skip to the lyid stream by reading 32-bit identifier reso to see that it does not match up with lyid and use the next 32-bit Size integer to know the amount of bytes to skip to get to the next stream. A tag identifier of 0 delimits the end of a Serial Stream.

Sample code for reading a serial stream.

std::uint32_t CurTag;
std::uint32_t CurTagSize;
while( File.Read<std::uint32_t>(CurTag) && CurTag )
{
	// Every identifier is followed by the size of its data in bytes
	File.Read<std::uint32_t>(CurTagSize);
	switch( CurTag )
	{
		case 'reso':
			//Handle 'reso' data
			break;
		case 'lyid':
		case 'layr':
		default:
			// for any streams that we do not handle,
			// we just skip forward in the stream
			File.Seek(File.Tell() + CurTagSize);
			break;
	}
}
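To make the byte layout concrete, here is a sketch that walks a serial stream held in memory and pulls out the reso fields, using the sample bytes shown earlier as input. It reads integers byte by byte so it does not depend on host endianness, and it builds the tag value by hand because multi-character constants like 'reso' are implementation-defined. `MakeTag`, `ReadLE`, `Resolution`, and `FindResolution` are names invented for this example.

```cpp
#include <cstddef>
#include <cstdint>

// Tag value as seen when the four file bytes are read as a little-endian u32
constexpr std::uint32_t MakeTag(char A, char B, char C, char D)
{
	return (std::uint32_t(std::uint8_t(A)) << 24)
		| (std::uint32_t(std::uint8_t(B)) << 16)
		| (std::uint32_t(std::uint8_t(C)) << 8)
		| std::uint32_t(std::uint8_t(D));
}

// Reads a little-endian integer of Bytes bytes (at most 4)
inline std::uint32_t ReadLE(const std::uint8_t* Data, std::size_t Bytes)
{
	std::uint32_t Value = 0;
	for( std::size_t i = 0; i < Bytes; i++ )
	{
		Value |= std::uint32_t(Data[i]) << (8 * i);
	}
	return Value;
}

struct Resolution
{
	double DotsPerInch;
	std::uint16_t SizeUnits;
	std::uint16_t ResolutionUnits;
};

// Returns true and fills Out if a 'reso' stream is found
inline bool FindResolution(const std::uint8_t* Data, std::size_t Length, Resolution& Out)
{
	std::size_t Pos = 0;
	while( Pos + 8 <= Length )
	{
		const std::uint32_t CurTag = ReadLE(Data + Pos, 4);
		if( CurTag == 0 ) break; // a tag of 0 ends the serial stream
		const std::uint32_t CurTagSize = ReadLE(Data + Pos + 4, 4);
		Pos += 8;
		if( CurTagSize > Length - Pos ) break; // truncated input
		if( CurTag == MakeTag('r', 'e', 's', 'o') && CurTagSize >= 8 )
		{
			Out.DotsPerInch = ReadLE(Data + Pos, 4) / 65536.0; // 16.16 fixed point
			Out.SizeUnits = static_cast<std::uint16_t>(ReadLE(Data + Pos + 4, 2));
			Out.ResolutionUnits = static_cast<std::uint16_t>(ReadLE(Data + Pos + 6, 2));
			return true;
		}
		Pos += CurTagSize; // skip streams we do not handle
	}
	return false;
}
```

Fed the 16 sample bytes from the dump above, this reads the 32-bit value 0x00480000, which is 72.0 when treated as 16.16 fixed point.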

Serial streams from here on out will be depicted as an enumeration of the four-byte identifier and the formatted data that it contains.

".xxxxxxxxxxxxxxxx"



This file name is procedurally generated based on the system that wrote the file. It is a 64 bit hash integer generated from a string involving the information of the motherboard formatted into a %s/%s/%s string.

Three strings are queried from Windows Management Instrumentation(WMI) first with the query

SELECT * FROM Win32_BaseBoard

and then taking the Manufacturer, Product, and SerialNumber table entries (making sure to convert the UTF16 into UTF8) and formatting them together into a string identifying the user's chipset(formatted %s/%s/%s). An example chipset:

ASUSTeK COMPUTER INC./Z87-DELUXE/130410781704124

The machine-identifying hash is then calculated from this string. Within the hash function this null-terminated string is repeated continuously until it fills a 256-byte span.

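ASUSTeK COMPUTER INC./Z87-DELUXE
/130410781704124\0ASUSTeK COMPUTE
R INC./Z87-DELUXE/13041078170412
4\0ASUSTeK COMPUTER INC./Z87-DELU
XE/130410781704124\0ASUSTeK COMPU
TER INC./Z87-DELUXE/130410781704
124\0ASUSTeK COMPUTER INC./Z87-DE
LUXE/130410781704124\0ASUSTeK COM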

This 256 byte array of characters is then interpreted as 64 32-bit integers for a chained rotate-and-xor hashing function, generating a 64 bit hash.

std::uint64_t MachineHash(const char* MachineIdentifier)
{
    std::uint32_t StringBlock[64];
    const char* ReadPoint = MachineIdentifier;
    // Fill all 256 bytes by repeating the string, terminator included
    for(std::size_t i = 0; i < 256; i++)
    {
        reinterpret_cast<std::uint8_t*>(StringBlock)[i] = *ReadPoint;
        ReadPoint = *ReadPoint ? ReadPoint + 1 : MachineIdentifier;
    }
    std::uint32_t UpperHash = 0;
    std::uint32_t LowerHash = 0;
    std::uint32_t Temp1 = 0;
    for(std::size_t i = 0; i < 64; i++)
    {
        std::uint32_t CurUpper = UpperHash + StringBlock[i % 64];
        std::uint32_t CurLower = LowerHash + StringBlock[(i + 1) % 64];
        for( std::size_t j = 0; j < 4; j++ )
        {
            // NOTE: shift counts of 32 or more are undefined in standard C++;
            // these rotates rely on x86 using only the low 5 bits of the count
            CurUpper = CurLower + ((CurUpper << CurLower) | (CurUpper >> (32 - CurLower)));
            CurLower = CurUpper + ((CurLower << CurUpper) | (CurLower >> (32 - CurUpper)));
        }
        LowerHash = CurLower ^ Temp1;
        UpperHash ^= CurUpper;
        Temp1 ^= CurLower;
    }
    return (static_cast<std::uint64_t>(UpperHash) << 32) | LowerHash;
}

The resulting hash for the above formatted string is a1541b366925e034 which would make the filename .a1541b366925e034 using the internal format /%s.%016I64x. The first string seems to always be null leaving the hash to simply have a period character prepended to it.
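The `%016I64x` specifier is specific to Microsoft's C runtime. If you want to rebuild the file name on other platforms, a portable equivalent is the `PRIx64` macro from `<cinttypes>`. A small sketch, with `MachineFileName` being a name made up for this example and the leading (empty) string of the internal format left out:

```cpp
#include <cinttypes>
#include <cstdint>
#include <cstdio>
#include <string>

// Formats a machine hash as the leading-dot, 16-digit lowercase hex file name
inline std::string MachineFileName(std::uint64_t Hash)
{
	char Buffer[32];
	std::snprintf(Buffer, sizeof(Buffer), ".%016" PRIx64, Hash);
	return Buffer;
}
```

Formatting the hash value 0xa1541b366925e034 this way gives back the file name seen in the sample listing.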

The file itself is only 32 bytes long.

struct AuthorSystemInfo
{
	std::uint32_t BitFlag; // always 0x08000000
	std::uint32_t Unknown4;
	std::uint64_t DateCreated; // Date Created
	std::uint64_t DateModified; // Date Modified
	std::uint64_t MachineHash; // Calculated using the above routine
};

Timestamps are 64 bit integer counts of seconds since January 1, 1601. This value is calculated using GetSystemTimeAsFileTime and then dividing the 64-bit result by 10000000 to convert from 100-nanosecond-intervals into seconds.

"canvas"


This file contains metadata involving the dimensions of the canvas. The first three integers are a static structure:

struct CanvasInfo
{
	std::uint32_t Unknown0; // Always 0x10(16), possibly bpc or alignment
	std::uint32_t Width;
	std::uint32_t Height;
};

After this, a Serial Stream:

  • reso
// 16.16 fixed point integer
std::uint32_t DotsPerInch;
// 0 = pixels, 1 = inch, 2 = cm, 3 = mm
std::uint16_t SizeUnits;
// 0 = pixel/inch, 1 = pixel/cm
std::uint16_t ResolutionUnits;
  • wsrc Layer marked as the selection source
std::uint32_t SelectionSourceID;
  • layr
std::uint32_t SelectedLayerID;
  • lyid Seems to be a duplication of layr
std::uint32_t SelectedLayerID;

"laytbl" "subtbl"

These files contain a description of all layers that make up an image, stored from "lowest" layer to "highest". subtbl contains preliminary layers such as masks. Both laytbl and subtbl have the same format and describe the contents within their respective layers and sublayers folder.

The first integer of either file is a 32-bit integer for the number of layers, followed by that many LayerTableEntries. Layers are identified by 32-bit integers, with their file found in the layers or sublayers folder under an 8-digit lowercase hexadecimal file name. The full path for any given layer or sublayer identifier can be generated from the identifying integer and the printf format /layers/%08x or /sublayers/%08x.
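A tiny sketch of that path generation, with `LayerPath` being a helper name invented here:

```cpp
#include <cstdint>
#include <cstdio>
#include <string>

// Builds the VFS path of a layer (or sublayer) file from its identifier
inline std::string LayerPath(std::uint32_t Identifier, bool SubLayer = false)
{
	char Buffer[32];
	std::snprintf(
		Buffer, sizeof(Buffer),
		SubLayer ? "/sublayers/%08x" : "/layers/%08x",
		static_cast<unsigned int>(Identifier));
	return Buffer;
}
```

For the sample file shown earlier, identifier 0x0a maps to /layers/0000000a and sublayer identifier 0x0d maps to /sublayers/0000000d.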

enum class LayerType : std::uint16_t
{
	Null = 0x00,
	Layer = 0x03, // Regular Layer
	Unknown4 = 0x4, // Unknown
	Linework = 0x05, // Vector Linework Layer
	Mask = 0x06, // Masks applied to any layer object
	Unknown7 = 0x07, // Unknown
	Set = 0x08 // Layer Folder
};

struct LayerTableEntry
{
	std::uint32_t Identifier;
	std::uint16_t Type; // LayerType enum
	std::uint16_t Unknown6; // Gets sent as windows message 0x80CA for some reason
};

Sample routine:

// First integer is number of layer entries
std::uint32_t LayerCount = File.Read<std::uint32_t>();
while( LayerCount-- ) // Read each layer entry
{
	// Read current layer entry into above structure
	LayerTableEntry CurrentLayer = File.Read<LayerTableEntry>();
	// Do something with this layer
}

"/layers" "/sublayers"

The individual layer files within these folders match the hexadecimal identifiers found in laytbl or subtbl. These files contain the actual raster or vector data (or none) of the specified layer entry. The header of the file is a static structure identifying the layer's opacity, size, blending mode, etc.

enum BlendingModes : std::uint32_t
{
	PassThrough = 'pass',
	Normal = 'norm',
	Multiply = 'mul\0',
	Screen = 'scrn',
	Overlay = 'over',
	Luminosity = 'add\0',
	Shade = 'sub\0',
	LumiShade = 'adsb',
	Binary = 'cbin'
};

// Rectangular bounds
// Can be off-canvas or larger than canvas if the user moves
// the layer outside of the "canvas window" without cropping,
// similar to Photoshop
// 0,0 is top-left corner of image
struct LayerBounds
{
	// Can be negative, rounded to nearest multiple of 32
	std::int32_t X;
	std::int32_t Y;
	std::uint32_t Width;
	std::uint32_t Height;
};

struct LayerHeader
{
	std::uint32_t Type; // LayerType enum
	std::uint32_t Identifier;
	LayerBounds Bounds;
	std::uint32_t Unknown18;
	std::uint8_t Opacity;
	std::uint8_t Visible;
	std::uint8_t PreserveOpacity;
	std::uint8_t Clipping;
	std::uint8_t Unknown1C;
	std::uint32_t Blending; // BlendingModes enum
};

Immediately after the LayerHeader is a Serial Stream.

Note: Not all streams might be present, depending on the type of layer the file is referencing. Streams such as texp and peff may not exist if the layer is a linework layer or folder.

  • lorg
std::uint32_t Unknown0;
std::uint32_t Unknown4;
  • name

Zero terminated string of the layer's name.

std::uint8_t LayerName[256];
  • pfid

Parent Set ID. If this layer is a child of a folder this will be a layer identifier of the parent container layer.

std::uint32_t ParentSetID;
  • plid

Parent Layer ID. If this layer is a child of another layer(ex, a mask-layer) this will be a layer identifier of the parent container layer.

std::uint32_t ParentLayerID;
  • lmfl

Only appears in mask layers

// 0b01 = Nonzero blending mode?
// 0b10 = Opacity is greater than 0
std::uint32_t Unknown0; // Bitmask, only the bottom two bits are used
  • fopn

Present only in a layer that is a Set/Folder. A single bool variable for if the folder is expanded within the layers panel or not

std::uint8_t Open;
  • texn

Name of the overlay-texture assigned to a layer. Ex: Watercolor A. Only appears in layers that have an overlay enabled.

std::uint8_t TextureName[64]; // UTF16 string
  • texp

Options related to the overlay-texture

std::uint16_t TextureScale;
std::uint8_t TextureOpacity;
  • peff

Options related to the watercolor fringe assigned to a layer

std::uint8_t Enabled; // bool
std::uint8_t Opacity; // 100
std::uint8_t Width;   // 1 - 15
  • vmrk
std::uint8_t Unknown0;

Immediately after the stream may be the contents of the layer. If the layer is a folder or set, there is no additional data. If the layer is a raster layer of pixels then specially formatted raster data follows. If the layer is a linework layer, specifically formatted linework data follows.

Sample layer file reading procedure

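// Read header
LayerHeader CurHeader = LayerFile.Read<LayerHeader>();

// Read Serial Stream
std::uint32_t CurTag, CurTagSize;
CurTag = CurTagSize = 0;

char Name[256];

while( LayerFile.Read<std::uint32_t>(CurTag) && CurTag )
{
	// Every identifier is followed by the size of its data in bytes
	LayerFile.Read<std::uint32_t>(CurTagSize);
	switch( CurTag )
	{
	case 'name':
		LayerFile.Read(Name, 256);
		break;
	// any other cases you care for
	case 'pfid': // Parent folder ID
		// ...
	default:
		// skip any stream we do not handle
		LayerFile.Seek(LayerFile.Tell() + CurTagSize);
		break;
	}
}

// Type is stored as a 32-bit integer, so cast before comparing
if( static_cast<LayerType>(CurHeader.Type) == LayerType::Layer )
{
	// Read Raster data
}
else if( static_cast<LayerType>(CurHeader.Type) == LayerType::Linework )
{
	// Read Linework data
}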

Raster Layers

Raster data is stored in a tiled format immediately after the layer's Serial Stream. There is an array of (LayerWidth / 32) * (LayerHeight / 32) 8-bit boolean integer values stored before the compressed channel pixel data. Each boolean value within this BlockMap determines if the appropriately positioned 32x32 tile of bitmap data contains pixel data that varies from pure black transparency. If a tile is active(1), its pixel data is stored as four or more streams of Run-Length-Encoding compressed data, one for each color channel of that 32x32 tile. If a tile is not active(0), the tile is to be filled with a 32x32 fully transparent block of pixels (0x00000000 for all pixels). If more than four streams exist, the extra streams may be safely ignored and skipped. Note that the RLE routine is the very same algorithm that Photoshop uses when compressing layer data, and the same as the PackBits algorithm that Apple uses.

RLE streams are prefixed with a 16-bit size integer for the number of RLE stream bytes that follow. Compressed channel data will be at most 0x800 bytes. Decompressed data will be at most 0x1000 bytes. Use these as your buffer sizes when reading and decompressing in-place. Color data is stored with premultiplied alpha and should be converted to straight alpha as soon as it is needed. It is highly recommended to use the SIMD intrinsics featured in C headers such as emmintrin.h and tmmintrin.h to speed up conversions and arithmetic on pixel data. Internally Sai uses MMX for all of its SIMD speedups, so many structures already lend themselves to more modern SIMD instruction sets (SSE, AVX, etc). Pixel data is stored in BGRA order.
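For readers who have not met PackBits before, here is a minimal contiguous decoder, written from the general description of the algorithm rather than taken from Sai. Unlike the strided routine later in this section it writes into a plain byte vector and bounds-checks its input. `PackBitsDecode` is a name made up for this sketch.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Decodes one PackBits stream into a contiguous byte vector
inline std::vector<std::uint8_t> PackBitsDecode(
	const std::uint8_t* Source, std::size_t SourceSize)
{
	std::vector<std::uint8_t> Out;
	std::size_t Pos = 0;
	while( Pos < SourceSize )
	{
		const std::uint8_t Control = Source[Pos++];
		if( Control == 128 )
		{
			continue; // no-op
		}
		if( Control < 128 )
		{
			// Literal run: copy the next Control + 1 bytes
			const std::size_t Count = std::size_t(Control) + 1;
			if( Count > SourceSize - Pos ) break; // truncated input
			Out.insert(Out.end(), Source + Pos, Source + Pos + Count);
			Pos += Count;
		}
		else
		{
			// Repeat run: the next byte is repeated 257 - Control times
			if( Pos >= SourceSize ) break; // truncated input
			const std::size_t Count = 257 - std::size_t(Control);
			Out.insert(Out.end(), Count, Source[Pos]);
			Pos++;
		}
	}
	return Out;
}
```

As an example of how well flat areas compress, a single channel of a uniformly colored 32x32 tile (1024 bytes) can be stored as eight repeat runs of 128 bytes each, which is only 16 bytes of RLE data.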

  1. First, load the array of (LayerWidth / 32) * (LayerHeight / 32) bytes immediately following the layer's Serial Stream as BlockMap
  2. Iterate both Y and X dimensions by LayerHeight / 32 and LayerWidth / 32 times respectively
  • Be sure to iterate the Y dimension first, then the X to ensure a row-by-row iteration.
    • Access the boolean at index (LayerWidth/32) * Y + X from BlockMap
    • If the boolean is true(1)
      • Read a 16 bit integer
      • If nonzero, read this many bytes, decompress them, and put the data into the correct B, G, R, or A channel, in that order, of however you're formatting your pixel data. Then read another 16-bit integer and test for nonzero again to get the next channel.
        • If there are more than 4 streams(channels) you can safely skip each extra RLE stream by seeking forward by its 16 bit size in bytes.
        • I have yet to find out what the extra channels are, but it is possibly "mip-map-like" data for different zoom levels to speed up certain calculations
      • If zero, there are no more streams to read. Move on to the next tile in step 2.
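Before reaching for SIMD, it can help to have a plain scalar version of the premultiplied-to-straight conversion to test against. The sketch below is my own reference version, not Sai's code. It assumes 8-bit interleaved pixels with alpha as the fourth byte of each pixel, and rounds to the nearest integer. `UnpremultiplyAlpha` is a made-up name.

```cpp
#include <cstddef>
#include <cstdint>

// Converts interleaved 8-bit pixels from premultiplied to straight alpha.
// Alpha is assumed to be the fourth byte of every pixel.
inline void UnpremultiplyAlpha(std::uint8_t* Pixels, std::size_t PixelCount)
{
	for( std::size_t i = 0; i < PixelCount; i++ )
	{
		std::uint8_t* Pixel = Pixels + i * 4;
		const std::uint32_t Alpha = Pixel[3];
		if( Alpha == 0 || Alpha == 255 )
		{
			continue; // nothing to scale, and avoids dividing by zero
		}
		for( std::size_t c = 0; c < 3; c++ )
		{
			// straight = premultiplied * 255 / alpha, rounded to nearest
			const std::uint32_t Straight = (Pixel[c] * 255u + Alpha / 2) / Alpha;
			Pixel[c] = static_cast<std::uint8_t>(Straight > 255 ? 255 : Straight);
		}
	}
}
```

For example, a premultiplied channel value of 64 at alpha 128 becomes 128 in straight alpha. The clamp is only there to guard against malformed input where a color channel exceeds its alpha.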

Here is a sample scratch-implementation I made using SIMD to shuffle channels into RGBA format and convert from premultiplied alpha to straight alpha, as well as a routine for decompressing an RLE stream and placing the resulting data into the appropriate interleaved 32bpp 8bpc channel index.

void RLEDecompress32(void* Destination, const std::uint8_t *Source, std::size_t SourceSize, std::size_t IntCount, std::size_t Channel)
{
	std::uint8_t *Write = reinterpret_cast<std::uint8_t*>(Destination) + Channel;
	std::size_t WriteCount = 0;

	while( WriteCount < IntCount )
	{
		std::uint8_t Length = *Source++;
		if( Length == 128 ) // No-op
		{
		}
		else if( Length < 128 ) // Copy
		{
			// Copy the next Length+1 bytes
			Length++;
			WriteCount += Length;
			while( Length )
			{
				*Write = *Source++;
				Write += 4;
				Length--;
			}
		}
		else if( Length > 128 ) // Repeating byte
		{
			// Repeat next byte exactly "-Length + 1" times
			Length ^= 0xFF;
			Length += 2;
			WriteCount += Length;
			std::uint8_t Value = *Source++;
			while( Length )
			{
				*Write = Value;
				Write += 4;
				Length--;
			}
		}
	}
}

// Read BlockMap
// Do not use a vector<bool> as this is commonly implemented as a specialized vector type that does not implement individual bool values as bytes but rather as packed bits within a word
std::vector<std::uint8_t> BlockMap;
BlockMap.resize((LayerHead.Bounds.Width / 32) * (LayerHead.Bounds.Height / 32));

// Read Block Map
LayerFile.Read(BlockMap.data(), (LayerHead.Bounds.Width / 32) * (LayerHead.Bounds.Height / 32));

// the resulting raster image data for this layer, RGBA 32bpp interleaved
// Use a vector to ensure that tiles with no data are still initialized
// to #00000000
// Also note that the claim that SystemMax has made involving 16bit color depth
// may actually only be true at run-time. All raster data found in files are stored at
// 8bpc while only some run-time color arithmetic converts to 16-bit
std::vector<std::uint8_t> LayerImage;
LayerImage.resize(LayerHead.Bounds.Width * LayerHead.Bounds.Height * 4);

// iterate 32x32 tile chunks row by row
for( std::size_t y = 0; y < (LayerHead.Bounds.Height / 32); y++ )
{
	for( std::size_t x = 0; x < (LayerHead.Bounds.Width / 32); x++ )
	{
		if( BlockMap[(LayerHead.Bounds.Width / 32) * y + x] ) // if tile is active
		{
			// Decompress Tile
			std::array<std::uint8_t, 0x800> CompressedTile;

			// Aligned memory for simd
			alignas(sizeof(__m128i)) std::array<std::uint8_t, 0x1000> DecompressedTile;

			std::uint8_t Channel = 0;
			std::uint16_t Size = 0;
			while( LayerFile.Read<std::uint16_t>(Size) ) // Get Current RLE stream size
			{
				LayerFile.Read(CompressedTile.data(), Size);
				// decompress and place into the appropriate interleaved channel
				RLEDecompress32(
					DecompressedTile.data(), CompressedTile.data(),
					Size, 32 * 32, Channel);
				Channel++; // Move on to next channel
				if( Channel >= 4 ) // skip all other channels besides the RGBA ones we care about
				{
					for( std::size_t i = 0; i < 4; i++ )
					{
						std::uint16_t SkipSize = LayerFile.Read<std::uint16_t>();
						LayerFile.Seek(LayerFile.Tell() + SkipSize);
					}
					break; // this tile is done
				}
			}

			// Current 32x32 tile within final image
			std::uint32_t *ImageBlock = reinterpret_cast<std::uint32_t*>(LayerImage.data()) + (x * 32) + ((y * LayerHead.Bounds.Width) * 32);

			for( std::size_t i = 0; i < (32 * 32) / 4; i++ ) // Process 4 pixels at a time
			{
				__m128i QuadPixel = _mm_load_si128(
					reinterpret_cast<__m128i*>(DecompressedTile.data()) + i);

				// ABGR to ARGB, if you want.
				// Do your swizzling here
				QuadPixel = _mm_shuffle_epi8(
					QuadPixel,
					_mm_set_epi8(
						15, 12, 13, 14,
						11, 8, 9, 10,
						7, 4, 5, 6,
						3, 0, 1, 2));

				/// Alpha is pre-multiplied, convert to straight
				// Get Alpha into [0.0,1.0] range
				__m128 Scale = _mm_div_ps(
					_mm_cvtepi32_ps(
						_mm_shuffle_epi8(
							QuadPixel,
							_mm_set_epi8(
								-1, -1, -1, 15,
								-1, -1, -1, 11,
								-1, -1, -1, 7,
								-1, -1, -1, 3))
					), _mm_set1_ps(255.0f));

				// Normalize each channel into straight color
				for( std::uint8_t i = 0; i < 3; i++ )
				{
					__m128i CurChannel = _mm_srli_epi32(QuadPixel, i * 8);
					CurChannel = _mm_and_si128(CurChannel, _mm_set1_epi32(0xFF));
					__m128 ChannelFloat = _mm_cvtepi32_ps(CurChannel);

					ChannelFloat = _mm_div_ps(ChannelFloat, _mm_set1_ps(255.0));// [0,255] to [0,1]
					ChannelFloat = _mm_div_ps(ChannelFloat, Scale);
					ChannelFloat = _mm_mul_ps(ChannelFloat, _mm_set1_ps(255.0));// [0,1] to [0,255]

					CurChannel = _mm_cvtps_epi32(ChannelFloat);
					CurChannel = _mm_and_si128(CurChannel, _mm_set1_epi32(0xff));
					CurChannel = _mm_slli_epi32(CurChannel, i * 8);

					QuadPixel = _mm_andnot_si128(_mm_set1_epi32(0xFF << (i * 8)), QuadPixel);
					QuadPixel = _mm_or_si128(QuadPixel, CurChannel);
				}

				// Write directly to final image
				_mm_store_si128(
					reinterpret_cast<__m128i*>(ImageBlock) + (i % 8) + ((i / 8) * (LayerHead.Bounds.Width / 4)),
					QuadPixel);
			}
		}
	}
}

Linework Layers


Decryption Keys

UserKey


This is the key that we care for. Used to encrypt/decrypt all user-created files. Decrypts .sai files.

const std::uint32_t UserKey[256] =

NotRemoveMeKey


Seems to only be used for the Notremoveme.ssd file located in "C:\ProgramData\SYSTEMAX Software Development\SAI"

Appears to contain log data similar to sai.ssd

const std::uint32_t NotRemoveMeKey[256] =

LocalStateKey


Used for thumbnail files located in "C:\ProgramData\SYSTEMAX Software Development\SAI\thumbnail"

Thumbnail filenames use printf pattern "%08x.ssd". Named LocalState as it describes an active user context.

const std::uint32_t LocalStateKey[256] =

SystemKey


Used only for sai.ssd. Handled the same as user files, but with a different block size of 1024 and Table-Block indexes at every multiple of 128.

sai.ssd seems to have multiple log files stored with symbolic headers:

  • "++FSIF logfile++"
    • Seems to be related to file-security and encryption
  • "++VFS logfile++"
    • Everything related to the virtual file system
  • "++SCDF logfile++"
    • Unknown
const std::uint32_t SystemKey[256] =