BFAST

This repository has been archived; the code is now being maintained within the vim-format repository.

BFAST stands for the Binary Format for Array Serialization and Transmission.

BFAST is a data format for simple, efficient, and reliable serialization and deserialization of collections of binary data with optional names as a single block of data. It is designed so that efficient and correct readers and writers of the format can be quickly written in different languages.

BFAST is intended to be fast enough to use as a purely in-memory, low-level data format for arbitrary data such as meshes, point clouds, and image data, and to scale to data that must be processed out of core. One design goal was to ensure that the format can be decoded easily and efficiently, with very little JavaScript code, in most modern web browsers.

BFAST is maintained by VIMaec LLC and is licensed under the terms of the MIT License.

Use Case

You would use the BFAST structure if you have binary data to serialize that is mostly in the form of long arrays, for example a set of files that you want to bundle together without bringing in the overhead of a compression library or re-implementing TAR. We use BFAST to encode mesh data and as a container for other data.

Features

Very small implementation overhead
Easy to implement efficient and conformant encoders and decoders in different languages
Fast random access to any point in the data format with a minimum of disk accesses
Format and endianness easily identified through a magic number at the front of the file
Data arrays are 64-byte aligned to facilitate casting to SIMD data types (e.g. AVX-512)
Array offsets are encoded using 64-bit integers to support large data sets
Positions of data buffers are encoded in the beginning of the file
Quick and easy to validate that a block is a valid BFAST encoding of data
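As a rough illustration of the magic-number feature, the Python sketch below probes the first eight bytes of a block in both byte orders. It assumes that the magic value 0xBFA5 is written as a 64-bit integer in the writer's native byte order; `detect_byte_order` is a hypothetical helper name, not part of the reference implementation.

```python
import struct

MAGIC = 0xBFA5  # magic value listed in the Header layout of this README


def detect_byte_order(first_eight_bytes: bytes) -> str:
    """Return '<' (little-endian) or '>' (big-endian) for a BFAST block.

    Assumption: the magic is stored as a 64-bit integer in the writer's
    byte order, so it reads back as 0xBFA5 in exactly one of the two orders.
    """
    if len(first_eight_bytes) < 8:
        raise ValueError("need at least 8 bytes to read the magic number")
    for order in ("<", ">"):
        (value,) = struct.unpack(order + "q", first_eight_bytes[:8])
        if value == MAGIC:
            return order
    raise ValueError("not a BFAST block: magic number not found")
```

A reader could call this once and then reuse the returned prefix in every later `struct` format string.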

Rationale

Encoding containers of binary data is a deceptively simple problem that is easy to solve in ways that are overly complex, inefficient, or dependent on a particular platform. We are proposing a standardized solution to the problem in the form of a specification and sample implementation that can allow software to easily encode low level binary data in a manner that is both efficient and cross-platform.

Related Libraries

The following is a partial list of commonly used binary data serialization formats:

For a more comprehensive list see:

Specification

The file format consists of three sections:

Header - Fixed size descriptor (32 bytes) describing the file contents
Ranges - An array of offset pairs indicating the begin and end of each buffer (relative to file begin)
Data - 64-byte aligned data buffers

Header Section

The header is a 32-byte struct with the following layout:

    using System.Runtime.InteropServices;

    [StructLayout(LayoutKind.Explicit, Pack = 8, Size = 32)]
    public struct Header
    {
        [FieldOffset(0)]    public long Magic;         // 0xBFA5
        [FieldOffset(8)]    public long DataStart;     // <= file size and >= 32 + sizeof(Range) * NumArrays
        [FieldOffset(16)]   public long DataEnd;       // >= DataStart and <= file size
        [FieldOffset(24)]   public long NumArrays;     // Number of all buffers, including the names buffer
    }
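To show how little code a reader in another language needs, here is a minimal Python sketch that packs and validates the 32-byte header using the constraints given in the struct comments. It assumes little-endian encoding, and the names `pack_header` and `read_header` are hypothetical. The check that NumArrays is not negative is an added sanity check.

```python
import struct

MAGIC = 0xBFA5
HEADER_FORMAT = "<4q"  # assumption: little-endian; four signed 64-bit fields
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)  # 32 bytes
RANGE_SIZE = 16  # each range is two 64-bit offsets


def pack_header(data_start, data_end, num_arrays):
    """Serialize Magic, DataStart, DataEnd and NumArrays in field order."""
    return struct.pack(HEADER_FORMAT, MAGIC, data_start, data_end, num_arrays)


def read_header(block):
    """Return (data_start, data_end, num_arrays) or raise ValueError."""
    if len(block) < HEADER_SIZE:
        raise ValueError("block is shorter than the 32-byte header")
    magic, data_start, data_end, num_arrays = struct.unpack_from(
        HEADER_FORMAT, block, 0
    )
    if magic != MAGIC:
        raise ValueError("bad magic number")
    if num_arrays < 0:
        raise ValueError("NumArrays must not be negative")
    # DataStart must lie past the range table and inside the block.
    if not HEADER_SIZE + RANGE_SIZE * num_arrays <= data_start <= len(block):
        raise ValueError("DataStart is out of bounds")
    # DataEnd must lie between DataStart and the end of the block.
    if not data_start <= data_end <= len(block):
        raise ValueError("DataEnd is out of bounds")
    return data_start, data_end, num_arrays
```

Because every check is a comparison against the block length, validation needs only the first 32 bytes plus the total size.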

Ranges Section

The ranges start at byte 32. There are NumArrays of them, each with the format shown below. NumArrays is the total count of all buffers, including the first buffer, which contains the names, so it should always be greater than or equal to one. Each Begin and End value is a byte offset relative to the beginning of the file.

    [StructLayout(LayoutKind.Explicit, Pack = 8, Size = 16)]
    public struct Range
    {
        [FieldOffset(0)] public long Begin;
        [FieldOffset(8)] public long End;
    }
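The range table can be read with a short loop. The Python sketch below assumes little-endian encoding and treats End as an exclusive offset, which the text above does not state explicitly; `read_ranges` and `buffer_view` are hypothetical names.

```python
import struct

HEADER_SIZE = 32
RANGE_SIZE = 16


def read_ranges(block, num_arrays):
    """Return a list of (begin, end) byte offsets, one per buffer."""
    if HEADER_SIZE + RANGE_SIZE * num_arrays > len(block):
        raise ValueError("block is too short to hold the range table")
    ranges = []
    for i in range(num_arrays):
        offset = HEADER_SIZE + RANGE_SIZE * i
        begin, end = struct.unpack_from("<2q", block, offset)
        # Assumption: offsets are file-relative and End is exclusive.
        if not 0 <= begin <= end <= len(block):
            raise ValueError(f"range {i} is out of bounds")
        ranges.append((begin, end))
    return ranges


def buffer_view(block, rng):
    """Return a zero-copy view of one buffer."""
    begin, end = rng
    return memoryview(block)[begin:end]
```

Because all offsets sit at the front of the block, a reader can locate any buffer without scanning the data section.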

Data Section

The data section starts at the first 64-byte aligned address following the last Range value. This offset is also stored in the header as DataStart for validation purposes.
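The alignment arithmetic can be sketched in a few lines of Python. This assumes that the range table may end exactly on a 64-byte boundary, in which case the data section starts right there, and that each subsequent buffer begins at the next 64-byte boundary after the previous one. The function names are hypothetical.

```python
HEADER_SIZE = 32
RANGE_SIZE = 16
ALIGNMENT = 64


def align_up(offset, alignment=ALIGNMENT):
    """Round offset up to the nearest multiple of alignment."""
    return (offset + alignment - 1) // alignment * alignment


def compute_data_start(num_arrays):
    """Offset of the first buffer: end of the range table, aligned to 64."""
    return align_up(HEADER_SIZE + RANGE_SIZE * num_arrays)


def layout_ranges(sizes):
    """Assign a 64-byte aligned (begin, end) pair to each buffer size."""
    ranges = []
    position = compute_data_start(len(sizes))
    for size in sizes:
        begin = align_up(position)
        ranges.append((begin, begin + size))
        position = begin + size
    return ranges
```

For example, two buffers of 3 and 100 bytes would be placed at offsets 64 and 128 under these assumptions.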

Names Buffer

The first data buffer contains the names of the subsequent buffers as a concatenated list of UTF-8 encoded strings separated by null characters. Names may be zero-length and are not guaranteed to be unique. A name may contain any UTF-8 encoded character except the null character.

There must be N-1 names, where N is the number of ranges (i.e. the NumArrays value in the header).
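A minimal Python sketch of the names buffer follows. It assumes that names are strictly separated by null bytes, and the decoder additionally tolerates a single trailing null in case a writer terminates the last name. Because empty names are allowed, the decoder takes the expected count (NumArrays - 1) as an argument. `pack_names` and `unpack_names` are hypothetical names.

```python
def pack_names(names):
    """Join UTF-8 encoded names with null separators."""
    for name in names:
        if "\x00" in name:
            raise ValueError("names must not contain the null character")
    return b"\x00".join(name.encode("utf-8") for name in names)


def unpack_names(buffer, expected_count):
    """Split the names buffer into expected_count strings."""
    if expected_count == 0:
        return []
    parts = bytes(buffer).split(b"\x00")
    # Assumption: accept one trailing null written as a terminator.
    if len(parts) == expected_count + 1 and parts[-1] == b"":
        parts.pop()
    if len(parts) != expected_count:
        raise ValueError(
            f"expected {expected_count} names, found {len(parts)}"
        )
    return [part.decode("utf-8") for part in parts]
```

Passing the count explicitly is what lets the decoder distinguish a trailing empty name from a terminator.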

Implementations

The official reference implementation of BFAST is written in C# and targets .NET Standard 2.0. The C# test suite uses NUnit and targets .NET Core 2.1. At VIM AEC we are using BFAST in production code that targets Unity 2019.1 and .NET Framework 4.7.1.

A C++ encoder and a JavaScript decoder are currently under development, but they are not yet tested or supported.
