libbson

libbson is a library providing useful routines for building, parsing, and iterating BSON documents. It is a useful base for those wanting to write high-performance C extensions to higher-level languages such as Python, Ruby, or Perl.

Building

From Git

$ sudo yum install automake autoconf libtool make gcc
$ ./autogen.sh --enable-silent-rules
$ make
$ sudo make install

You can run the unit tests with

$ make test

From Tarball

$ ./configure --enable-silent-rules
$ make
$ sudo make install

Developing using libbson

In your source code:

#include <bson.h>

To get the include path and libraries appropriate for your system, use pkg-config when compiling:

gcc my_program.c $(pkg-config --cflags --libs libbson-1.0)

Overview

Types

The following list details the various types in libbson and their use. See their individual headers for documentation on the available functions and macros.

bson_t

The bson_t structure encapsulates a BSON document buffer. It manages growing the buffer using power of 2 allocations as new fields are appended to the BSON document.

Functions working upon a bson_t that do not mutate state are marked as "const bson_t *". These functions are safe to use with an inline sequence of BSON documents as such might be found in a MongoDB wire-protocol packet.

bson_iter_t

Iterating upon a bson_t is performed using a stack allocated bson_iter_t. These structures do not need to be cleaned up after and therefore can be discarded at any time (meaning there is no bson_iter_destroy() function).

Various functions are provided to access fields of different types. You can get the field name with bson_iter_key() and the field type with bson_iter_type(). Additionally, the BSON_ITER_HOLDS_*() macros are a convenient way to check a field's type.
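To make the iteration model concrete, here is a minimal sketch in plain C of a stack-allocated iterator over a raw BSON buffer. It is not libbson's implementation: every sketch_* name is hypothetical, and only a handful of element types are handled. It relies only on the BSON layout (a little-endian int32 length, then elements made of a type byte, a NUL-terminated key, and a value, then a trailing 0x00).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical names throughout; this is not how libbson defines bson_iter_t. */
typedef struct {
    const uint8_t *data;   /* start of the document */
    size_t         len;    /* document length taken from its header */
    size_t         off;    /* offset of the next element to read */
    uint8_t        type;   /* type byte of the current element */
    const char    *key;    /* key of the current element */
    const uint8_t *value;  /* first byte of the current value */
} sketch_iter_t;

static uint32_t sketch_read_u32le(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Returns 1 on success, 0 if the buffer cannot hold a well-formed document. */
static int sketch_iter_init(sketch_iter_t *it, const uint8_t *data, size_t size)
{
    uint32_t len;

    assert(it != NULL && data != NULL);
    if (size < 5) return 0;
    len = sketch_read_u32le(data);
    if (len < 5 || len > size || data[len - 1] != 0) return 0;
    it->data = data;
    it->len = len;
    it->off = 4;            /* first element follows the length header */
    it->type = 0;
    it->key = NULL;
    it->value = NULL;
    return 1;
}

/* Advances to the next element. Returns 0 at the end of the document, on
 * malformed input, or on an element type this sketch does not handle. */
static int sketch_iter_next(sketch_iter_t *it)
{
    size_t end = it->len - 1;   /* offset of the trailing 0x00 */
    size_t voff, vlen;
    uint32_t slen;
    uint8_t type;
    const uint8_t *key, *nul;

    if (it->off >= end) return 0;
    type = it->data[it->off];
    key = it->data + it->off + 1;
    nul = memchr(key, 0, end - (it->off + 1));
    if (nul == NULL) return 0;
    voff = (size_t)(nul - it->data) + 1;

    switch (type) {
    case 0x10: vlen = 4; break;              /* int32 */
    case 0x01: case 0x12: vlen = 8; break;   /* double, int64 */
    case 0x08: vlen = 1; break;              /* boolean */
    case 0x0A: vlen = 0; break;              /* null */
    case 0x02:                               /* UTF-8: int32 length + bytes */
        if (end - voff < 4) return 0;
        slen = sketch_read_u32le(it->data + voff);
        if (slen > end - voff - 4) return 0;
        vlen = 4 + (size_t)slen;
        break;
    default: return 0;
    }
    if (vlen > end - voff) return 0;

    it->type = type;
    it->key = (const char *)key;
    it->value = it->data + voff;
    it->off = voff + vlen;
    return 1;
}
```

The iterator only holds pointers into the caller's buffer, which is why nothing needs to be released when it goes out of scope.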

bson_visitor_t

If you would like to iterate over all of the fields of a BSON document, you may be interested in bson_visitor_t. It provides a callback-style visitor that calls a function for each field found. This is typically useful when building a document in a higher-level language binding such as Python, Ruby, or Perl.
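The callback style can be sketched in plain C as a table of function pointers plus a walker that dispatches on each field's type byte. This is an illustration only: the sketch_* names and the two-callback table are hypothetical, not the layout of bson_visitor_t, and the walker handles only int32 and UTF-8 fields.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical callback table; libbson's bson_visitor_t has its own layout. */
typedef struct {
    void (*visit_int32)(const char *key, int32_t value, void *user);
    void (*visit_utf8)(const char *key, const char *str, void *user);
} sketch_visitor_t;

static uint32_t sketch_u32le(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Calls the matching callback for every field of a flat document.
 * Returns the number of fields walked, or -1 on malformed input or an
 * element type this sketch does not handle. */
static int sketch_visit_all(const uint8_t *doc, size_t size,
                            const sketch_visitor_t *v, void *user)
{
    size_t len, end, off = 4;
    int count = 0;

    assert(doc != NULL && v != NULL);
    if (size < 5) return -1;
    len = sketch_u32le(doc);
    if (len < 5 || len > size || doc[len - 1] != 0) return -1;
    end = len - 1;                       /* offset of the trailing 0x00 */

    while (off < end) {
        uint8_t type = doc[off];
        const char *key = (const char *)doc + off + 1;
        const uint8_t *nul = memchr(key, 0, end - (off + 1));
        size_t voff;
        uint32_t word;

        if (nul == NULL) return -1;
        voff = (size_t)(nul - doc) + 1;
        if (end - voff < 4) return -1;   /* both handled types start with 4 bytes */
        word = sketch_u32le(doc + voff);

        if (type == 0x10) {              /* int32 */
            if (v->visit_int32) v->visit_int32(key, (int32_t)word, user);
            off = voff + 4;
        } else if (type == 0x02) {       /* UTF-8: length includes the NUL */
            if (word == 0 || word > end - voff - 4) return -1;
            if (doc[voff + 4 + word - 1] != 0) return -1;
            if (v->visit_utf8)
                v->visit_utf8(key, (const char *)doc + voff + 4, user);
            off = voff + 4 + word;
        } else {
            return -1;
        }
        count++;
    }
    return count;
}

/* Example callback: adds every int32 field into an int64_t accumulator. */
static void sketch_sum_int32(const char *key, int32_t value, void *user)
{
    (void)key;
    *(int64_t *)user += value;
}
```

A language binding would typically pass its own dictionary object as the user pointer and insert each key/value pair from inside the callbacks.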

bson_context_t

To aid in performance critical functions, a bson_context_t may be required. Think of this structure as a "library" handle. It allows fine tuning of configuration so that various optimizations may occur. This is particularly useful with OID generation so that shared state may be avoided. Optimizations that avoid mutexes or atomic increments can be performed here.

Some systems cannot tell when fork() has been called underneath them or when the hostname has changed. Checking for these events often has serious performance implications, so the checks are opt-in. See bson_context_flags_t for more information.

bson_oid_t

This structure contains a 12-byte BSON ObjectId. Various routines are provided to manipulate ObjectIds as well as to convert them to and from strings.
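As an illustration of the string form, the sketch below converts 12 raw bytes to and from the conventional 24-character hexadecimal representation in plain C. The sketch_* names are hypothetical and are not libbson functions; see the bson_oid_t header for the routines the library actually provides.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a 12-byte ObjectId. */
typedef struct {
    uint8_t bytes[12];
} sketch_oid_t;

/* Writes 24 lowercase hex digits plus a terminating NUL into out. */
static void sketch_oid_to_string(const sketch_oid_t *oid, char out[25])
{
    static const char digits[] = "0123456789abcdef";
    size_t i;

    assert(oid != NULL && out != NULL);
    for (i = 0; i < 12; i++) {
        out[2 * i]     = digits[oid->bytes[i] >> 4];
        out[2 * i + 1] = digits[oid->bytes[i] & 0x0F];
    }
    out[24] = '\0';
}

/* Returns the value of one hex digit, or -1 if c is not a hex digit. */
static int sketch_hex_value(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/* Returns 1 and fills *oid if str is exactly 24 hex digits, else 0.
 * *oid is left untouched on failure. */
static int sketch_oid_from_string(sketch_oid_t *oid, const char *str)
{
    sketch_oid_t tmp;
    size_t i;

    assert(oid != NULL && str != NULL);
    for (i = 0; i < 12; i++) {
        int hi, lo;

        hi = sketch_hex_value(str[2 * i]);
        if (hi < 0) return 0;            /* also stops at an early NUL */
        lo = sketch_hex_value(str[2 * i + 1]);
        if (lo < 0) return 0;
        tmp.bytes[i] = (uint8_t)((hi << 4) | lo);
    }
    if (str[24] != '\0') return 0;
    *oid = tmp;
    return 1;
}
```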

bson_reader_t

In various drivers you may need to parse a sequential stream of BSON documents. Reducing the number of allocations in this process has positive implications for speed of parsing. bson_reader_t helps abstract the parsing of a sequential list of BSON documents.

Additionally, you can parse a stream of BSON documents from a file using bson_reader_init_from_fd(). This is handy if you are processing backups from the mongodump command.
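The idea of walking a sequential list of documents without copying can be sketched in plain C: each BSON document starts with its own little-endian int32 length, so a reader only has to hand out a pointer and advance by that length. The sketch_* names are hypothetical, and the sketch reads from a memory buffer only. It is not the bson_reader_t implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical reader over a buffer of back-to-back BSON documents. */
typedef struct {
    const uint8_t *data;  /* start of the buffer */
    size_t         size;  /* total bytes available */
    size_t         off;   /* offset of the next unread document */
} sketch_reader_t;

static void sketch_reader_init(sketch_reader_t *r, const uint8_t *data,
                               size_t size)
{
    assert(r != NULL && data != NULL);
    r->data = data;
    r->size = size;
    r->off = 0;
}

/* Returns a pointer to the next document inside the caller's buffer and
 * stores its length in *len. Returns NULL at the end of the buffer or when
 * the next length header is implausible. No bytes are copied. */
static const uint8_t *sketch_reader_next(sketch_reader_t *r, size_t *len)
{
    size_t remaining = r->size - r->off;
    const uint8_t *p = r->data + r->off;
    uint32_t n;

    if (remaining < 5) return NULL;      /* smallest document is 5 bytes */
    n = (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
        ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
    if (n < 5 || n > remaining || p[n - 1] != 0) return NULL;

    r->off += n;
    *len = n;
    return p;
}
```

Because every returned pointer aliases the input buffer, the documents stay valid only as long as that buffer does.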

bson_writer_t

When creating wire-protocol packets, it is useful to serialize documents directly into the packet buffer. bson_writer_t achieves this by allowing the caller to provide a malloc()d buffer and a realloc() function to resize the buffer. Additionally, you may set an offset in the buffer to start from. This is useful since you can start encoding to the buffer directly after the message header structure. Buffers are grown in powers of two.
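The arrangement described above (a caller-owned buffer, a caller-supplied realloc function, and a starting offset) can be sketched in plain C as follows. The sketch_* names and the struct layout are hypothetical and do not reflect the bson_writer_t API. The sketch assumes the initial capacity is already a power of two, so that doubling keeps it one.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

typedef void *(*sketch_realloc_fn)(void *ptr, size_t size);

/* Hypothetical writer that appends into a buffer owned by the caller. */
typedef struct {
    uint8_t         **buf;        /* caller's buffer pointer, updated on growth */
    size_t           *buflen;     /* caller's capacity, updated on growth */
    size_t            offset;     /* next write position */
    sketch_realloc_fn realloc_fn; /* how to resize the caller's buffer */
} sketch_writer_t;

static void sketch_writer_init(sketch_writer_t *w, uint8_t **buf,
                               size_t *buflen, size_t offset,
                               sketch_realloc_fn realloc_fn)
{
    assert(w != NULL && buf != NULL && buflen != NULL && realloc_fn != NULL);
    assert(offset <= *buflen);
    w->buf = buf;
    w->buflen = buflen;
    w->offset = offset;           /* e.g. just past a message header */
    w->realloc_fn = realloc_fn;
}

/* Appends n bytes at the current offset, doubling the capacity until they
 * fit. Returns 1 on success, 0 if the buffer could not be grown. */
static int sketch_writer_append(sketch_writer_t *w, const void *src, size_t n)
{
    size_t need = w->offset + n;

    if (need > *w->buflen) {
        size_t cap = *w->buflen ? *w->buflen : 1;
        uint8_t *grown;

        while (cap < need) {
            if (cap > ((size_t)-1) / 2) return 0;   /* would overflow */
            cap *= 2;
        }
        grown = w->realloc_fn(*w->buf, cap);
        if (grown == NULL) return 0;
        *w->buf = grown;
        *w->buflen = cap;
    }
    memcpy(*w->buf + w->offset, src, n);
    w->offset += n;
    return 1;
}
```

Keeping pointers to the caller's buffer and capacity, rather than copies, means the caller always sees the current allocation after a resize.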

Performance Tricks

Serialize Documents into Output Buffer

You can use bson_writer_t to serialize your dictionary/hash types to the target packet buffer. It supports providing a realloc function to resize the packet buffer as you go. This saves you many small allocations and instead has one large growing allocation. This should help with memory fragmentation greatly.

Build sub-documents with bson_append_document_begin()

When building a sub-document, you have two options. You can build the bson_t structure on its own, using its own allocations. Or, you can use bson_append_document_begin() and bson_append_document_end() to build the document inside of the parent document. This allows you to use one allocation for both rather than multiple.
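The in-place option can be illustrated with a plain C sketch of the general begin/end technique: reserve the child's 4-byte length slot inside the parent's buffer, append the child's fields, then back-patch the length once the child is closed. The sketch_* names and the fixed 256-byte buffer are hypothetical simplifications, and this is not the code behind bson_append_document_begin().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical fixed-size buffer shared by the parent and its children. */
typedef struct {
    uint8_t buf[256];
    size_t  len;
} sketch_buf_t;

static void sketch_put(sketch_buf_t *b, const void *src, size_t n)
{
    assert(b->len + n <= sizeof b->buf);
    memcpy(b->buf + b->len, src, n);
    b->len += n;
}

static void sketch_put_u32le(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)(v & 0xFF);
    p[1] = (uint8_t)((v >> 8) & 0xFF);
    p[2] = (uint8_t)((v >> 16) & 0xFF);
    p[3] = (uint8_t)((v >> 24) & 0xFF);
}

/* Opens a document at the current position by reserving its length slot.
 * Returns the offset of that slot so it can be patched later. */
static size_t sketch_doc_begin(sketch_buf_t *b)
{
    static const uint8_t zero[4] = {0, 0, 0, 0};
    size_t start = b->len;

    sketch_put(b, zero, 4);
    return start;
}

/* Closes a document: writes the trailing 0x00 and back-patches its length. */
static void sketch_doc_end(sketch_buf_t *b, size_t start)
{
    const uint8_t nul = 0;

    sketch_put(b, &nul, 1);
    sketch_put_u32le(b->buf + start, (uint32_t)(b->len - start));
}

/* Writes an element header: type byte followed by the NUL-terminated key. */
static void sketch_key(sketch_buf_t *b, uint8_t type, const char *key)
{
    sketch_put(b, &type, 1);
    sketch_put(b, key, strlen(key) + 1);
}

static void sketch_append_int32(sketch_buf_t *b, const char *key, int32_t v)
{
    uint8_t le[4];

    sketch_key(b, 0x10, key);
    sketch_put_u32le(le, (uint32_t)v);
    sketch_put(b, le, 4);
}

/* Starts an embedded document (type 0x03) inside the same buffer. */
static size_t sketch_append_document_begin(sketch_buf_t *b, const char *key)
{
    sketch_key(b, 0x03, key);
    return sketch_doc_begin(b);
}
```

Building {"x": {"a": 1}} then takes sketch_doc_begin() for the parent, sketch_append_document_begin() for "x", sketch_append_int32() for "a", and one sketch_doc_end() per open document, innermost first. The child never owns a separate allocation.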

Keep small documents on the stack

bson_t will use an inline buffer for small documents. Any document under 56 bytes can be built inline on the stack. bson_t will automatically switch to a malloc()d buffer when overflowing that internal buffer.

Remember that even if you are building a bson_t on the stack, you are required to call bson_destroy().
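A plain C sketch of the inline-buffer idea shows why the destroy call still matters. The structure starts out writing into its own embedded storage and silently moves to the heap once that storage overflows, so only the destroy function knows whether there is anything to free. The sketch_* names are hypothetical. The 56-byte capacity mirrors the figure quoted above, and the 64-byte first heap allocation is an assumption chosen to keep capacities at powers of two. This is not the bson_t layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define SKETCH_INLINE_CAP 56

/* Hypothetical document buffer with small-size inline storage. */
typedef struct {
    uint8_t  inline_buf[SKETCH_INLINE_CAP];
    uint8_t *heap;  /* NULL while the data still lives in inline_buf */
    size_t   len;   /* bytes used */
    size_t   cap;   /* bytes available in the active storage */
} sketch_doc_t;

static void sketch_doc_init(sketch_doc_t *d)
{
    assert(d != NULL);
    d->heap = NULL;
    d->len = 0;
    d->cap = SKETCH_INLINE_CAP;
}

static uint8_t *sketch_doc_data(sketch_doc_t *d)
{
    return d->heap ? d->heap : d->inline_buf;
}

/* Appends n bytes, moving to a heap buffer the first time the inline
 * storage overflows. Returns 1 on success, 0 on allocation failure. */
static int sketch_doc_append(sketch_doc_t *d, const void *src, size_t n)
{
    if (n > d->cap - d->len) {
        size_t cap = d->heap ? d->cap : 64;   /* assumed first heap size */
        uint8_t *grown;

        while (cap - d->len < n) cap *= 2;
        if (d->heap) {
            grown = realloc(d->heap, cap);
        } else {
            grown = malloc(cap);
            if (grown) memcpy(grown, d->inline_buf, d->len);
        }
        if (grown == NULL) return 0;
        d->heap = grown;
        d->cap = cap;
    }
    memcpy(sketch_doc_data(d) + d->len, src, n);
    d->len += n;
    return 1;
}

/* Safe to call whether or not the document ever left the stack. */
static void sketch_doc_destroy(sketch_doc_t *d)
{
    free(d->heap);              /* free(NULL) is a no-op */
    d->heap = NULL;
    d->len = 0;
    d->cap = SKETCH_INLINE_CAP;
}
```

The caller cannot tell from the outside which storage is active, so always pairing init with destroy is the only way to avoid a leak.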
