Intermediate Representation

Martin Thompson edited this page Aug 18, 2018 · 76 revisions

SBE uses an Intermediate Representation (IR) that sits between data layout mechanisms on one side and code generators and on-the-fly decoders on the other. SBE Tool can generate serialized IR via the sbe.generate.ir option (false by default). This produces a file, conventionally with the .sbeir extension, that can serve as input to SBE Tool instead of XML for code generation. In addition, the On-The-Fly (OTF) decoder requires serialized IR for decoding. This section explores how to generate and use IR, as well as how IR works.

NOTE: it is our hope that serialized IR can form the foundation for a tool chain approach.

Generating Serialized IR

Generating serialized IR is straightforward with the SBE Tool. Simply set sbe.generate.ir to true. The SBE Tool will write the serialized IR file using the same name as the input file with the .sbeir suffix. Here is an example.

$ java -Dsbe.generate.ir=true -jar sbe-all-1.8.7.jar <message-declarations-file.xml>
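
The naming rule described above can be sketched in Python. This is only an illustration and not SBE Tool's own code: the helper name ir_file_name is hypothetical, and it assumes the .sbeir suffix replaces the input file's extension rather than being appended to it.

```python
from pathlib import Path


def ir_file_name(input_file: str) -> str:
    """Return the serialized IR file name expected for an input file.

    Assumption: the .sbeir suffix replaces the input's last extension.
    """
    return Path(input_file).with_suffix(".sbeir").name


print(ir_file_name("schemas/example-schema.xml"))
```

Only the file name is derived here; the directory the tool writes into is not covered by this sketch.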

Using Serialized IR as Input to SBE Tool

Serialized IR can act as input to the SBE Tool for code generation. To do this, pass a file with the .sbeir extension instead of .xml. Here is an example.

$ java -jar sbe-all-1.8.7.jar <message-declarations-file.sbeir>

Serialized IR can therefore be used as a means to pass around "compiled" representations.
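
Selecting the input kind from the file extension might look like the following Python sketch. It only illustrates the idea and is not how SBE Tool is implemented; the function name input_kind and the returned labels are hypothetical.

```python
def input_kind(file_name: str) -> str:
    """Classify an input file by extension (illustrative only)."""
    if file_name.endswith(".xml"):
        return "xml"  # message declarations written as XML
    if file_name.endswith(".sbeir"):
        return "ir"  # previously generated serialized IR
    raise ValueError(f"unsupported input file: {file_name}")


for name in ("example-schema.xml", "example-schema.sbeir"):
    print(name, "->", input_kind(name))
```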

Structure of Serialized IR

IR is serialized via SBE itself. The schema for SBE IR is here.

The format of a serialized IR file is a simple sequence, or list, of "Tokens". Each Token is one of the SerializedToken messages in the schema above. The first sequence of Tokens in the file describes the message header for the messages. This is a composite, usually named messageHeader, that holds the following three encodings: blockLength, templateId, and version. The header is followed by one or more messages, each represented by a sequence of Tokens. At a high level, a serialized IR file looks like this:

Header (List of Tokens)
Message(s) (List of Tokens)

Each Token has several fields, one of which is the Signal. See here and here for more detail on the Token elements. Expanded, the header looks something like this:

BEGIN_COMPOSITE
    name="messageHeader"
ENCODING
    name="blockLength"
    type=uint16
    size=2
    offset=0
ENCODING
    name="templateId"
    type=uint16
    size=2
    offset=2
ENCODING
    name="version"
    type=uint8
    size=1
    offset=4
END_COMPOSITE
    name="messageHeader"
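
To make the listing above concrete, here is a minimal Python model of the header Tokens. It is a sketch under stated assumptions, not the SBE API: the Token class below is hypothetical and carries only the fields shown in the listing, whereas real Tokens hold more. The check that each encoding starts where the previous one ended is an assumption that happens to hold for this example.

```python
from dataclasses import dataclass
from enum import Enum, auto


class Signal(Enum):
    BEGIN_COMPOSITE = auto()
    ENCODING = auto()
    END_COMPOSITE = auto()


@dataclass(frozen=True)
class Token:
    """Hypothetical, simplified Token with only the fields listed above."""
    signal: Signal
    name: str
    type: str = ""
    size: int = 0
    offset: int = 0


HEADER = [
    Token(Signal.BEGIN_COMPOSITE, "messageHeader"),
    Token(Signal.ENCODING, "blockLength", "uint16", 2, 0),
    Token(Signal.ENCODING, "templateId", "uint16", 2, 2),
    Token(Signal.ENCODING, "version", "uint8", 1, 4),
    Token(Signal.END_COMPOSITE, "messageHeader"),
]


def encoded_length(tokens):
    """Verify encodings are laid out back to back; return their total size."""
    expected_offset = 0
    for token in tokens:
        if token.signal is Signal.ENCODING:
            if token.offset != expected_offset:
                raise ValueError(f"{token.name}: unexpected offset {token.offset}")
            expected_offset += token.size
    return expected_offset


print(encoded_length(HEADER))
```

For the header shown, the three encodings occupy 2 + 2 + 1 = 5 bytes.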

Each message is structured similarly: it starts with a BEGIN_MESSAGE Token and ends with an END_MESSAGE Token.

Careful readers may realise that it is simple to add more messages to an IR file by concatenation. Just concatenate a list of Tokens to the end of the file. This is intentional and provides a simple mechanism for composition.
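
The effect of concatenation can be illustrated with a small Python sketch that treats an IR file as a flat list of Signal names. This is a simplified model rather than the real binary format: the function split_messages is hypothetical, and the message bodies are abbreviated to a single ENCODING each.

```python
def split_messages(signals):
    """Group a flat list of signal names into one list per message."""
    messages, current = [], None
    for signal in signals:
        if signal == "BEGIN_MESSAGE":
            current = [signal]  # a new message starts here
        elif current is not None:
            current.append(signal)
            if signal == "END_MESSAGE":
                messages.append(current)
                current = None
    return messages


header = ["BEGIN_COMPOSITE", "ENCODING", "ENCODING", "ENCODING", "END_COMPOSITE"]
first_message = ["BEGIN_MESSAGE", "ENCODING", "END_MESSAGE"]
second_message = ["BEGIN_MESSAGE", "ENCODING", "END_MESSAGE"]

ir = header + first_message
print(len(split_messages(ir)))

ir += second_message  # adding a message is just appending its Tokens
print(len(split_messages(ir)))
```

Because each message is delimited by its own BEGIN_MESSAGE and END_MESSAGE Tokens, a reader can find message boundaries without any index or count stored elsewhere in the file.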
