Skip to content

Latest commit

 

History

History
501 lines (376 loc) · 19.6 KB

binary_executable_format.md

File metadata and controls

501 lines (376 loc) · 19.6 KB

Binary Executable Format

This document describes the goals and rationale for the in-memory representation the executor uses to run host kernels, known as the Binary Executable Format "BEF".

We have the following goals:

  • We want this to be simple and relatively stable - though it is versioned just in case (tm). We don't want this to contain domain specific abstractions that will require changing it frequently over time.

  • This format is designed to be usable as an immutable region in memory, allowing it to be built offline and conveniently mmap'd in (or stored in the .rodata segment of an executable, etc).

  • This is aligned with the abstractions used by the host executor and includes key abstractions that it can directly use to make it speedy and memory efficient. The design puts more burden on the encoder of a BEF file (e.g. requiring multiple passes) to make the decoder fast.

  • We expect this to be produced online (e.g. when a TF_Function is built on demand) e.g. in training or interactive use cases.

  • We need to represent location information in order to reflect runtime errors (e.g. shape errors) back to the user source program.

  • We should follow normal best practices for binary encodings, e.g. having tools to round trip them to a textual form (MLIR in our case), support arbitrary sized files (not limited to 4GB), be efficient to read, make reasonable attempts to be dense, etc.

In addition to describing the format, this document describes some of the rationale for the design decisions as well as TODO items that need to be completed before this work is done. This file format shouldn't be declared stable until all the TODOs are resolved.

Fundamental Concepts

Grammar

  BYTE                   ::= `0x00`...`0xFF`
  INTEGER                ::= (`0x80`...`0xFF`)* (`0x00`...`0x7F`)
  FIXED32                ::= BYTE BYTE BYTE BYTE
  NULL_TERMINATED_STRING ::= (`0x01`...`0xFF`)* `0x00`
  OFFSET                 ::= INTEGER
  INDEX                  ::= INTEGER

A BEF file is formed as a byte stream whose top-level structure is a list of "sections" that hold various sorts of data. It uses a few fundamental concepts:

  • Integers are encoded in a "Variable Byte Rate" (VBR) encoding, allowing small integers (less than 2^7) to be stored in one byte, integers up to 2^14 to be stored in two bytes, etc. This is done by using the high bit of each byte in the stream to indicate "more bytes are coming".

  • Fixed Integers are unsigned integers with fixed bit width. The values are stored in little-endian byte order.

  • Offsets are Integers that demarcate a byte offset in the stream from a fixed position (typically the start of a section). These are useful when the section contains a bunch of variable length things, e.g. when referring to a string in the string table. Because variable sized integers are used pervasively in BEF files, offsets are very common.

  • Indexes are Integers that demarcate an entry in a table, e.g. a register or kernel in a function. These are typically used when indexing into a table where the structure is expected to be decoded and held in memory by the reader, e.g. as is the case with the kernels and types tables.

BEF files can be created by an arbitrary producer, but the standard ways is to use the MLIRToBEF translation library (or standalone tool), which turns an MLIR representation of a host program into a BEF file.

A BEFToMLIR translation library performs the reverse translation from BEF file to MLIR representation of a host program. It is useful for writing BEF encoder testcases and for dealing with that "random file someone sent you".

Sections

Grammar

  SECTION                ::= SECTION_HEADER SECTION_DATA
  SECTION_HEADER         ::= SECTION_ID INTEGER<LENGTH_AND_ALIGNMENT> SECTION_BODY_ALIGNMENT?
  SECTION_ID             ::= BYTE
  LENGTH_AND_ALIGNMENT   ::= (SECTION_LENGTH << 1) | (SECTION_ALIGNMENT_FLAG)
  SECTION_BODY_ALIGNMENT ::= BYTE<"Alignment"> BYTE<"AlignmentPadding">*

Sections are the top level entities in the file (after the header). Each section contains a Section ID, a length (allowing the section to be skipped over entirely by the reader) and an optional alignment for the contents of the section.

Section IDs and other fundamental constants are defined in bef_encoding.h, and utilities for decoding the basic file structures like VBR integers are defined in bef_reader.h.

The LENGTH_AND_ALIGNMENT contains SECTION_LENGTH and SECTION_ALIGNMENT_FLAG. The SECTION_LENGTH contains one bit shifted value of the section length and the SECTION_ALIGNMENT_FLAG (0 bit) indicates if SECTION_BODY_ALIGNMENT exists or not; 0 means the section body starts immediately and 1 means SECTION_BODY_ALIGNMENT follows.

Top Level Structure

Grammar

  BEF_FILE     ::= `0x0B` `0xEF` FORMAT_VERSION_NUMBER SECTION*

  FORMAT_VERSION_NUMBER ::= `0x00`

  SECTION_DATA ::= STRINGS_SECTION
  SECTION_DATA ::= ATTRIBUTES_SECTION
  SECTION_DATA ::= KERNELS_SECTION
  SECTION_DATA ::= TYPES_SECTION
  SECTION_DATA ::= FUNCTION_INDEX_SECTION
  SECTION_DATA ::= FUNCTIONS_SECTION
  SECTION_DATA ::= LOCATION_STRINGS_SECTION
  SECTION_DATA ::= LOCATIONS_SECTION
  SECTION_DATA ::= ATTRIBUTE_TYPES_SECTION
  SECTION_DATA ::= ATTRIBUTE_NAMES_SECTION
  SECTION_DATA ::= REGISTER_TYPES_SECTION

  // Unknown section.
  SECTION_DATA ::= BYTE*

The top level structure of the file is a two-byte "magic number" of 0x0BEF followed by one byte sized FORMAT_VERSION_NUMBER and a list of sections.

The current FORMAT_VERSION_NUMBER is 0, and will be increased when BEF format is changed.

The reader skips over unknown sections, which could be useful for future evolution of the format, e.g. if we want to store extra metadata in the BEF format for some purpose.

Strings Section

Grammar

  STRINGS_SECTION ::= NULL_TERMINATED_STRING*

The Strings section contains a list of NULL terminated strings used by the program (e.g. for type and kernel names). Entries in this section are referenced by an Offset from the start of the section.

Note: This format doesn't support embedded NULL strings, which is sufficient for the existing use cases. We could switch to modified Pascal strings if embeded NULL characters become important for something (at a space/complexity cost).

Attributes Section

Grammar

  ATTRIBUTES_SECTION ::= BYTE*

The Attributes section contains the value of attributes used by kernels in the BEF program. They are stored on the natural alignment boundary of the type, and the address of the attribute is directly passed into the kernel implementation function as an const pointer.

BEF files only support a subset of MLIR attributes, currently including:

  • booleans, stored as 1-byte integers.
  • i1 integers, stored as 1-byte integers.
  • i32 integers, stored as 4-byte little endian integers.
  • i64 integers, stored as 8-byte little endian integers.
  • f32 floats, stored as IEEE single precision floats.
  • f64 floats, stored as IEEE double precision floats.
  • type enums, stored as 1-byte integers. Currently supported type enums are i1, i32, i64, f32 and f64.
  • strings, stored as arrays of bytes, not necessarily NULL terminated.
  • dense elements, stored as shape dtype, shape rank, elements count, followed by shape elements and elements themselves. Each element can be any of the integer and float format above.
  • arrays, all elements of which are of the same type and fixed in width (eg. i32, f32, type).
  • aggregates, stored as array of i32 integers, which are offsets to other constants in Attributes Section. These nested constants can be of any supported attribute type including aggregates. Unlike arrays, an aggregate can contain a mix of different attribute types.

TODO: Support 8/16-bit integers and floating point constants.

Rationale

The host executor needs to refer to attribute values, and since the BEF file is designed to be memory mapped in, we can directly use the encoding in memory. This means that kernels will need to bswap attributes for big-endian systems, but they generally have bswap'ing loads anyway so this shouldn't impose a performance penalty.

Kernels Section

Grammar

  KERNELS_SECTION ::= INTEGER<"KernelCount"> OFFSET<"KernelNameOffset">*

The Kernels section defines a table of kernel names, directly corresponding to the names in the MLIR host executor program. This allows references from the Functions section to use dense indexes into the table.

The format of this section is an Integer count of the number of kernels in the table, followed by 'count' Integer values which are offsets into the string table for the kernel name.

Rationale

We want dense indexes from kernels in the Functions section, and the Host Executor wants to resolve names to kernel implementations at startup time anyway, as such, it makes sense for the host executor to build its own mutable array in memory for kernels.

Types Section

Grammar

  TYPES_SECTION ::= INTEGER<"TypeCount"> OFFSET<"TypeNameOffset">*

The Types section defines a table of type names used by the host program, and is laid out exactly the same as the Kernels section. The Functions section uses Indexes into this section to specify types of registers.

FunctionIndex Section

Grammar

  FUNCTION_INDEX_SECTION ::= INTEGER<"NumFunctions"> FUNCTION_ENTRY*
  FUNCTION_ENTRY         ::= BYTE<"FunctionKind"> OFFSET<"Function"> \
                             OFFSET<"Name"> INTEGER<"NumArguments"> \
                             INDEX<"Type">* INTEGER<"NumResults"> INDEX<"Type">*

The FunctionIndex section defines a table of functions in the BEF file, one for each function, including the kind of this function (defined in bef_encoding.h), an Offset into the Functions Section, a name (an Offset into the Strings section, which may be an empty string), and a list of argument and result types.

Rationale

This defines a symbol table for the BEF file, allowing clients to look up functions by name. While we could intersperse this information into the Functions section itself, doing so would require the reader to make a pass over all of the functions ahead of time. We'd prefer to have a quick index that the reader can scan at load time, deferring processing of any individual function until it is needed.

Functions Section

Grammar

  FUNCTIONS_SECTION ::= FUNCTION*

The Functions section is a list of Function records emitted to the byte stream and then addressed by an Offset indicated by the FunctionIndex section. Functions section is 4-byte aligned.

Rationale

Regions in MLIR are a generalization of a unit of computation. The BEF format and BEF executor support a subset of MLIR region features that are core to the abstractions we need to model - in particular, while MLIR regions can have multiple basic blocks in them, BEF Functions only support a single block - this guarantees that they are always be a DAG of computation.

Functions are used for top-level functions, which BEF files have direct support for (allowing lookup of functions by name) as well as in nested positions for control flow and other concepts that occur with MLIR regions.

Function Definition

Grammar

  FUNCTION       ::= OFFSET<"Location"> REGISTER_TABLE KERNEL_TABLE \
                     RESULT_REGS BYTE<"AlignmentPadding">* KERNEL+

  REGISTER_TABLE ::= INTEGER<"NumRegs"> REGISTER_ENTRY*
  REGISTER_ENTRY ::= INTEGER<"NumUses">

  KERNEL_TABLE   ::= INTEGER<"NumKernels"> KERNEL_ENTRY*
  KERNEL_ENTRY   ::= OFFSET<"KernelOffset"> INTEGER<"NumOperands"> \
                     INTEGER<"StreamId">

  RESULT_REGS    ::= INDEX<"Register">*

Each function is defined by a location (an offset into the LocationPositions section), a register table, a kernel table, a list of result registers, a list of kernels with 4-byte alignment, and ends with a fixed32 integer of value zero.

The Register Table is a count of registers, and an entry for each register - indicating the number of kernels in this section that use the register.

The Kernel Table for a function is a count of kernels, an offset (from the end of the Kernel Table) of the start of the kernel, the number of operands that the kernel has, and a stream id that is used to help runtime scheduling decisions, e.g. successive kernels with the same stream id can be executed in the same thread.

The kernel list that is following the Kernel Table contains all the kernels used in this function. Note that every function has a pseudo kernel that is the single entry point to the rest of the kernels. Specifially, a pseudo kernel defines registers for function arguments and a pseudo register that is conceptually used by kernels that takes no arguments.

The result registers specify the register values to return, and must align with the function result types from the FunctionIndex section.

Rationale

These two tables are key to allowing the BEF executor to efficiently dispatch and destroy kernels in a lock-free way. At startup time, the executor inflates these two tables into arrays, resolving the type descriptors for the types, and building a table of ready counts for the kernels. Having a table of kernels, allow the use of kernel indexes.

Kernel Definition

Grammar

  KERNEL             ::= KERNEL_HEADER KERNEL_RESULT_TABLE KERNEL_BODY

  KERNEL_HEADER      ::= FIXED32<"KernelCode"> FIXED32<"KernelLocation"> \
                         FIXED32<"NumArguments"> FIXED32<"NumAttributes"> \
                         FIXED32<"NumFunctions"> FIXED32<"NumResults">

  KERNEL_RESULT_TABLE::= FIXED32<"NumUsedBys">*

  KERNEL_BODY        ::= FIXED32<KernelArgument>* FIXED32<KernelAttribute>* \
                         FIXED32<KernelFunction>* FIXED32<KernelResult>* \
                         FIXED32<KernelUsedBy>*

Each instance of a kernel includes a kernel header, a result table and a kernel body. The kernel header consists of a opcode (an index into the Kernels table, defined by the Kernels section), a location (an offset into the LocationPositions section) the numbers of arguments, attributes, functions and results in the kernel body.

The result table contains NumResults fixed32 integers, indicating the number of users for each corresponding result. The kernel body consists of zero or more inputs (indexes into the Register Table for this function), zero or more constants (offsets into the Attributes section), zero or more functions (indexes into the FunctionIndex section), zero or more results (indexes into the Register table for this function), and zero or more 'used by' records (which are indexes into the Kernel Table for this function). 'used by' records for the same result are grouped together and these groups are in the same order of 'result' records. For example, if there are two results, A and B, and A has one user (a0) and B has two users (b0, b1), then FIXED32<"NumResults"> will be 0x00000002, followed by two FIXED32<"NumUsedBys">, 0x00000001 0x00000002; and in the kernel body there will be three "used by" records, a0, b0 and b1, consecutively.

Rationale

This record allows us to efficiently form the array of arguments and results that get passed to a kernel implementation function. When the kernel completes, the UsedBy records allow us to efficiently trigger execution of data dependent kernels if they are ready, by decrementing the "NumOperands" field for the kernel.

LocationStrings Section

Grammar

  LOCATION_STRINGS_SECTION ::= NULL_TERMINATED_STRING*

This section contains a list of NULL terminated strings used by locations section. A filename field of FileLineCol location and name field of Name location are stored in this section.

Rationale

We choose to store these separately from the string table, because these locations are only ever decoded in the case of an error. There is no reason to dirty data cache lines with them, and we expect no reuse with other general strings.

Locations Section

Grammar

  LOCATIONS_SECTION ::= LOCATION*

  LOCATION = <UNKNOWN_LOC | FILELINECOL_LOC | NAME_LOC | CALLSITE_LOC
              | FUSED_LOC>

  UNKNOWN_LOC ::= `0x00`

  FILELINECOL_LOC ::= `0x01` OFFSET<"Filename"> INTEGER<"LineNum"> \
                      INTEGER<"ColumnNum">

  NAME_LOC ::= `0x02` OFFSET<"Name"> LOCATION<"Child">

  CALLSITE_LOC ::= `0x03` LOCATION<"Callee"> LOCATION<"Caller">

  FUSED_LOC ::= `0x04` INTEGER<"NumLocations"> LOCATION*

This section contains a list of "locations", which can have five different types: Unknown, FileLineCol, Name, CallSite location, and Fused locations. Strings used in FileLineCol and Name locations are added to the LocationStrings section.

AttributeTypes Section

Grammar

  ATTRIBUTE_TYPES_SECTION   ::= INTEGER<"NumAttributes"> ATTRIBUTE_TYPE_ENTRY*
  ATTRIBUTE_TYPE_ENTRY      ::= OFFSET<"Attributes"> BYTE<"AttributeType">

The AttributeTypes section is an optional section which is not needed for BEF execution, but this section is necessary for bef-to-mlir conversion. This section describes the type information for each attribute (specified by an Offset into Attributes section). An attribute type is an one-byte enum described in Attributes section and defined in bef_encoding.h.

AttributeNames Section

Grammar

  ATTRIBUTE_NAMES_SECTION ::= INTEGER<"NumFunctions"> KERNEL_TABLE*
  KERNEL_TABLE            ::= INTEGER<"NumKernels"> OFFSET<"AttributeName">*

The AttributeNames sections is an optional section which is not needed for BEF execution, but this section is necessary for bef-to-mlir conversion. This section describes the attribute names used by each kernel. There is a 1:1 mapping between Kernel Table in this section and Function Entries in FunctionIndex section. And there is a 1:1 mapping between kernel entries in this section and kernel entries in Functions section's kernel table. Each kernel entry contains any number of AttributeNames. AttributeName is an offset to Strings section that specifies the name of the attribute used by this kernel.

RegisterTypes Section

Grammar

  REGISTER_TYPES_SECTION ::= INTEGER<"NumFunctions"> REGISTER_TYPE_TABLE
  REGISTER_TYPE_TABLE    ::= INTEGER<"NumRegs"> INDEX<"Types">*

The RegisterTypes section is an optional section which is not needed for BEF execution, but this section is necessary for bef-to-mlir conversion. This section describes the type information for registers in each function. The RegisterType Table is a count of registers, and an entry for each register indicating its index into Types section. There is a 1:1 mapping between RegisterType Tables and Function Entries in FunctionIndex section. And it is a 1:1 mapping between types in a RegisterType Table and registers in the RegisterTable of the corresponding Function.