Library for reading and writing NBT files, as used by Minecraft.
Latest commit 3ff77d7 Oct 8, 2011 @mental refactor a little



NBTFile is a low-level library for reading and writing NBT-format files, as used by the popular game Minecraft.

Official Specification

The official (if somewhat confusing) specification for the NBT file format may be found at

Data Model

The NBT data model has ten different data types:

  • 8-bit signed integers (Tag_Byte)

  • 16-bit signed integers (Tag_Short)

  • 32-bit signed integers (Tag_Int)

  • 64-bit signed integers (Tag_Long)

  • 32-bit floating-point numbers (Tag_Float)

  • 64-bit floating-point numbers (Tag_Double)

  • UTF-8 strings (Tag_String)

  • raw byte sequences (Tag_Byte_Array)

  • homogenous lists (Tag_List)

  • compound structures (Tag_Compound)

Compound structures (Tag_Compound) are unordered collections where each item (“tag”) has a name associated with it. Compound structures are heterogenous; the elements of a compound structure may be any mixture of types.

Lists (Tag_List) are ordered collections of unnamed items. Lists are homogenous; every element of a particular list must have the same type. Note that all lists have the same type (Tag_List) regardless of the type of their elements.

Items of an eleventh “type” (Tag_End) serve to terminate compound structures and lists (of any type). The wording of the official specification could permit lists of Tag_End, but this is not supported in practice.

The top level of an NBT file must be a single-element compound structure.


Low-level API

The low-level API deals directly in NBT tokens, which are represented by eleven types:

  • NBTFile::Tokens::TAG_End

  • NBTFile::Tokens::TAG_Byte

  • NBTFile::Tokens::TAG_Short

  • NBTFile::Tokens::TAG_Long

  • NBTFile::Tokens::TAG_Float

  • NBTFile::Tokens::TAG_Double

  • NBTFile::Tokens::TAG_Byte_Array

  • NBTFile::Tokens::TAG_String

  • NBTFile::Tokens::TAG_List

  • NBTFile::Tokens::TAG_Compound

Their constructors each take two arguments: name and value, which are also exposed by similarly-named accessors. Most of these tokens represent simple values, but TAG_List and TAG_Compound introduce the beginning of a list and a compound structure, and TAG_End terminates either one.

The name field supplies the field name when the token is part of a compound structure, and is ignored when it is part of a list.

For most token types, value represents the value of the token. For TAG_List it represents the type of the list. It is ignored for compound structures.



High-level API

Much of the high-level API deals in higher-level data types from the NBTFile::Types module.

There are eight “primitive” types with value fields which reflect their native Ruby values:

  • NBTFile::Types::Byte

  • NBTFile::Types::Short

  • NBTFile::Types::Int

  • NBTFile::Types::Long

  • NBTFile::Types::Float

  • NBTFile::Types::Double

  • NBTFile::Types::ByteArray

  • NBTFile::Types::String

There are also two “complex” types which reflect NBT lists and compound structures:

  • NBTFile::Types::List - an Array-like class corresponding to an NBT list

  • NBTFile::Types::Compound - a Hash-like class correspodning to an NBT compound structure





NBT files are gzip-compressed; the structure of the uncompressed data can be described by the following ABNF (see RFC 5234).

nbt-data = tag-compound

tag-compound = TAG-COMPOUND name compound-body

compound-body = *tag TAG-END

tag = TAG-BYTE name byte /
      TAG-SHORT name short /
      TAG-INT name int /
      TAG-LONG name long /
      TAG-FLOAT name float /
      TAG-DOUBLE name double /
      TAG-STRING name string /
      TAG-BYTE-ARRAY name byte-array /
      TAG-LIST name list /

name = string

list = TAG-BYTE list-length *byte /
       TAG-SHORT list-length *short /
       TAG-INT list-length *int /
       TAG-LONG list-length *long /
       TAG-FLOAT list-length *float /
       TAG-DOUBLE list-length *double /
       TAG-STRING list-length *string /
       TAG-BYTE-ARRAY list-length *byte-array /
       TAG-LIST list-length *list /
       TAG-COMPOUND list-length *compound-body

list-length = int

; see RFC 3629 for definition of UTF8-octets
string = string-length UTF8-octets

string-length = short

byte-array = byte-array-length *byte

byte-array-length = int

byte = OCTET ; 8-bit signed integer

short = 2OCTET ; 16-bit signed integer, big-endian

int = 4OCTET ; 32-bit signed integer, big-endian

long = 8OCTET ; 64-bit signed integer, big-endian

float = 4OCTET ; 32-bit float, big-endian, IEEE 754-2008

double = 8OCTET ; 64-bit float, big-endian, IEEE 754-2008

TAG-END = %x00
TAG-BYTE = %x01
TAG-SHORT = %x02
TAG-INT = %x03
TAG-LONG = %x04
TAG-FLOAT = %x05
TAG-LIST = %x09

Note on Patches/Pull Requests

  • Fork the project.

  • Make your feature addition or bug fix.

  • Add tests for it. This is important so I don't break it in a future version unintentionally.

  • Commit, do not mess with rakefile, version, or history. (if you want to have your own version, that is fine but bump version in a commit by itself I can ignore when I pull)

  • Send me a pull request. Bonus points for topic branches.


Copyright © 2010-2011 MenTaLguY. See LICENSE for details.