libeblob is a low-level IO library which stores data in large blob files.

The following features are supported:

    * fast append-only updates which do not require disk seeks
    * compact index to populate lookup information from disk
    * multi-threaded index reading during startup
    * O(1) data location lookup time
    * ability to lock in-memory lookup index (hash table) to eliminate memory swap
    * readahead games with data and index blobs for maximum performance
    * multiple blob files support (tested with blob-file-as-block-device too)
    * optional sha256 on-disk checksumming
    * 2-stage write: prepare (which reserves the space) and commit
    	(which calculates the checksum and updates in-memory and on-disk indexes).
    	One can (re)write data using pwrite() in between without locks
    * usual 1-stage write interface
    * flexible configuration of hash table size, flags, alignment
    * defragmentation tool: deleted entries are only marked as removed;
    	the eblob_check tool can actually remove those blocks from blob files
    	by creating a copy and optionally renaming it to the original file
    * off-line blob consistency checker: a checksum is stored in the data blob,
    	and eblob_check can verify that the data matches it
    * run-time sync support - library can sync files in background on a timed basis

All exported functions are declared in the include/eblob/blob.h header file;
the main ones are outlined below.

-------------------------------------------------------------------------------------
Somewhat obsolete developer documentation follows
-------------------------------------------------------------------------------------

Initialization/cleanup:
struct eblob_backend *eblob_init(struct eblob_config *c);
void eblob_cleanup(struct eblob_backend *b);
----------------------------------------------------------------------------------

Configuration structure:
struct eblob_config {
	/* hash table size in entries */
	unsigned int		hash_size;

	/* hash table flags above */
	unsigned int		hash_flags;

	/* sync interval in seconds */
	int			sync;

	/* alignment block size */
	unsigned int		bsize;

	/* logger */
	struct eblob_log	*log;

	/* copy of the base blob file name
	 * library will add .0 .1 and so on
	 * to this name when new files are created
	 *
	 * it will add .index to store on-disk index
	 */
	char			*file;

	/* number of threads which will iterate over
	 * each blob file at startup
	 */
	int			iterate_threads;

	/* maximum blob size (supported modifiers: G, M, K)
	 * when blob file size becomes bigger than this value
	 * library will create new file
	 */
	uint64_t		blob_size;
};
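
The blob_size field is a plain uint64_t, so an application that reads the limit
as text (for example "10G") has to convert it first. Below is a minimal sketch of
such a conversion. The helper name parse_blob_size() is hypothetical and not part
of libeblob, and binary multipliers (K = 1024) are an assumption.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical helper, not part of libeblob: converts strings such as
 * "512M" or "10G" into a byte count suitable for eblob_config.blob_size.
 * Assumes binary multipliers (K = 1024).
 * Returns 0 on success, -1 on malformed input or overflow.
 */
static int parse_blob_size(const char *str, uint64_t *out)
{
	char *end;
	unsigned long long value;
	uint64_t mult = 1;

	/* the string must start with a digit: no sign, no leading spaces */
	if (!str || !out || *str < '0' || *str > '9')
		return -1;

	errno = 0;
	value = strtoull(str, &end, 10);
	if (errno != 0 || end == str)
		return -1;

	switch (*end) {
	case 'K': mult = 1ULL << 10; end++; break;
	case 'M': mult = 1ULL << 20; end++; break;
	case 'G': mult = 1ULL << 30; end++; break;
	case '\0': break;
	default: return -1;
	}

	/* reject trailing garbage and multiplication overflow */
	if (*end != '\0' || value > UINT64_MAX / mult)
		return -1;

	*out = (uint64_t)value * mult;
	return 0;
}
```

The result can then be assigned to the blob_size field before calling eblob_init().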
--------------------------------------------------------------------------

The following directory structure may be created after a while:
$ ls -l /tmp/blob/
-rw-r--r-- 1 zbr zbr 45056 2010-07-26 22:09 data.0
-rw-r--r-- 1 zbr zbr    52 2010-07-26 22:09 data.0.index
-rw-r--r-- 1 zbr zbr 45056 2010-07-26 22:10 data.1
-rw-r--r-- 1 zbr zbr    52 2010-07-26 22:10 data.1.index
-rw-r--r-- 1 zbr zbr 45056 2010-07-27 19:37 data.2
-rw-r--r-- 1 zbr zbr    52 2010-07-26 22:10 data.2.index
-rw-r--r-- 1 zbr zbr     0 2010-07-26 22:10 data.3
-rw-r--r-- 1 zbr zbr     0 2010-07-26 22:10 data.3.index

Those were created with the 'file' parameter set to '/tmp/blob/data'.
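
The naming scheme in the listing can be reproduced with a few lines of C. This is
only a sketch: blob_file_name() is a hypothetical helper, not a libeblob function,
and it assumes the layout shown here ("<base>.<num>" for data, with ".index"
appended for the on-disk index).

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper, not part of libeblob: formats the data file name
 * "<base>.<num>" or, when is_index is non-zero, the index file name
 * "<base>.<num>.index".
 * Returns 0 on success, -1 if the buffer is too small.
 */
static int blob_file_name(char *buf, size_t len, const char *base,
		int num, int is_index)
{
	int n = snprintf(buf, len, "%s.%d%s", base, num,
			is_index ? ".index" : "");

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

With base set to '/tmp/blob/data', num 2 and is_index 1, the buffer receives
'/tmp/blob/data.2.index'.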

There are two ways to iterate over files: using eblob_backend structure
and manually.

/*
 * Iterate over single blob file (struct eblob_backend_io) from offset @off
 * total of @size bytes. If @size is 0, function will iterate over whole blob file.
 * @check_index specifies whether @io->fd or @io->index file descriptor will be used
 * (check_index being true means @io->index).
 *
 * Callback will receive the private data provided as @priv, the per-entry disk control
 * structure, the file index (i.e. $num in blob pathname '/tmp/blob/data.$num') and a data
 * pointer (obtained by iterating over a mapped area, so it must not be modified)
 */
int eblob_iterate(struct eblob_backend_io *io, off_t off, size_t size,
		struct eblob_log *l, int check_index,
		int (* callback)(struct eblob_disk_control *dc, int file_index,
			void *data, off_t position, void *priv),
		void *priv);

/* Iterate over all blob files */
int eblob_blob_iterate(struct eblob_backend *b, int check_index,
	int (* iterator)(struct eblob_disk_control *dc, int file_index, void *data,
		off_t position, void *priv),
	void *priv);

Library will try to find all $file.$num where $num runs from 0.
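
A callback matching the iterator signature might look like the sketch below. It
only counts entries and records the highest file index seen. struct iter_stats
and count_entries() are hypothetical names, and treating a zero return value as
"continue iterating" is an assumption that should be checked against the header.

```c
#include <stdint.h>
#include <sys/types.h>

/* Opaque here; the real definition lives in include/eblob/blob.h. */
struct eblob_disk_control;

/* Hypothetical per-run statistics passed through @priv. */
struct iter_stats {
	uint64_t	entries;	/* callbacks received */
	int		max_file_index;	/* highest $num seen, -1 if none */
};

/*
 * Callback with the signature expected by eblob_blob_iterate().
 * It never touches @data, which points into a mapped area and
 * must not be modified.
 */
static int count_entries(struct eblob_disk_control *dc, int file_index,
		void *data, off_t position, void *priv)
{
	struct iter_stats *st = priv;

	(void)dc;
	(void)data;
	(void)position;

	st->entries++;
	if (file_index > st->max_file_index)
		st->max_file_index = file_index;

	return 0; /* assumption: zero means "continue iterating" */
}
```

It would be passed as the iterator argument of eblob_blob_iterate(), with a
pointer to a struct iter_stats initialised to { 0, -1 } as @priv.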
-------------------------------------------------------------------------------------

Object removal.
/* Remove entry by given key.
 * Entry is marked as deleted and defragmentation tool can later drop it.
 */
int eblob_remove(struct eblob_backend *b, unsigned char *key, unsigned int ksize);
--------------------------------------------------------------------------------------

Data reading.
/* Read data by given key.
 * @fd is a file descriptor to read data from. The caller must not close it.
 * @offset and @size will be filled with the entry's metadata: its offset
 * and its data size.
 */
int eblob_read(struct eblob_backend *b, unsigned char *key, unsigned int ksize,
		int *fd, uint64_t *offset, uint64_t *size);
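
Since eblob_read() only reports where the data lives, the caller still has to
fetch the bytes. One possible way is pread(), which does not move the file
position of the shared descriptor. The sketch below is an illustration only:
read_entry() is a hypothetical helper, not part of libeblob.

```c
#define _XOPEN_SOURCE 700 /* for pread() */
#include <errno.h>
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper, not part of libeblob: copies @size bytes starting
 * at @offset from @fd into @buf. The descriptor is never closed here.
 * Returns 0 on success, -1 on error or unexpected end of file.
 */
static int read_entry(int fd, uint64_t offset, uint64_t size, void *buf)
{
	unsigned char *p = buf;

	while (size > 0) {
		ssize_t n = pread(fd, p, size, (off_t)offset);

		if (n < 0) {
			if (errno == EINTR)
				continue; /* interrupted, retry */
			return -1;
		}
		if (n == 0)
			return -1; /* entry extends past end of file */

		p += n;
		offset += (uint64_t)n;
		size -= (uint64_t)n;
	}
	return 0;
}
```

The @fd, @offset and @size arguments would be the values filled in by a
successful eblob_read() call, and @buf must hold at least @size bytes.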
---------------------------------------------------------------------------------------

Data writing.
There are two write types: sync, where the data is written by the library, and async,
where the prepare phase reserves the requested size on disk and returns a file descriptor
and an offset to (re)write data to. The commit phase then writes the data footer and
optionally calculates the checksum.

/*
 * Sync write: we will put data into some blob and index it by provided @key.
 * Flags can specify whether entry is removed and whether library will perform
 * data checksumming.
 * Flags are BLOB_DISK_CTL_* constants above.
 */
int eblob_write_data(struct eblob_backend *b, unsigned char *key, unsigned int ksize,
		void *data, uint64_t size, uint64_t flags);

/* Async write.
 *
 * There are two stages: prepare and commit.
 *
 * Prepare stage receives @eblob_write_control structure and requires
 * @size and @flags to be set. The former is used to reserve enough space
 * in blob file, the latter will be put into entry flags and determines
 * whether the given entry is removed and whether checksumming is performed on commit.
 *
 * @eblob_write_prepare() will fill the rest of the parameters.
 * @fd specifies file descriptor to (re)write data to.
 * @io_index is a blob file index.
 * @offset specifies the position at which the client may write no more than @size bytes.
 *
 * @ctl_offset is start of the control data on disk for given entry.
 * @total_size is equal to the aligned sum of the user-specified @size and the sizes of
 * the header/footer structures.
 */
struct eblob_write_control {
	uint64_t			size;
	uint64_t			flags;

	int				fd;
	int				io_index;

	uint64_t			offset;

	uint64_t			ctl_offset;
	uint64_t			total_size;
};
int eblob_write_prepare(struct eblob_backend *b, unsigned char *key, unsigned int ksize,
		struct eblob_write_control *wc);

/* Client may provide the checksum itself; otherwise it will be calculated on commit
 * (unless checksumming was disabled in the control flags) */
int eblob_write_commit(struct eblob_backend *b, unsigned char *key, unsigned int ksize,
		unsigned char *csum, unsigned int csize,
		struct eblob_write_control *wc);
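
Between prepare and commit the client writes into the reserved region on its own.
The sketch below shows one way to do that with pwrite() while staying inside
[@offset, @offset + @size). write_reserved() is a hypothetical helper, not part of
libeblob, and the structure is copied from the description above only to keep the
example self-contained.

```c
#define _XOPEN_SOURCE 700 /* for pwrite() */
#include <errno.h>
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

/* Copy of the structure shown above; normally provided by the eblob header. */
struct eblob_write_control {
	uint64_t	size;
	uint64_t	flags;
	int		fd;
	int		io_index;
	uint64_t	offset;
	uint64_t	ctl_offset;
	uint64_t	total_size;
};

/*
 * Hypothetical helper, not part of libeblob: writes @len bytes of @buf at
 * byte @pos inside the region reserved by the prepare stage. Refuses to
 * write outside [wc->offset, wc->offset + wc->size).
 * Returns 0 on success, -1 on error.
 */
static int write_reserved(const struct eblob_write_control *wc,
		const void *buf, uint64_t len, uint64_t pos)
{
	const unsigned char *p = buf;

	if (pos > wc->size || len > wc->size - pos)
		return -1;

	while (len > 0) {
		ssize_t n = pwrite(wc->fd, p, len, (off_t)(wc->offset + pos));

		if (n <= 0) {
			if (n < 0 && errno == EINTR)
				continue; /* interrupted, retry */
			return -1;
		}
		p += n;
		pos += (uint64_t)n;
		len -= (uint64_t)n;
	}
	return 0;
}
```

The intended order would be eblob_write_prepare(), one or more write_reserved()
calls on the filled-in control structure, then eblob_write_commit().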

struct eblob_disk_footer {
	unsigned char			csum[EBLOB_ID_SIZE];
	uint64_t			offset;
} __attribute__ ((packed));
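
The @total_size rule (aligned sum of the data size and the header/footer sizes)
can be illustrated as follows. This is a sketch under assumptions: align_up() and
entry_total_size() are hypothetical helpers, the alignment unit is taken to be the
bsize configuration value, a bsize of 0 is treated as "no alignment", and the
header and footer sizes are passed in because they depend on the library's
structure definitions.

```c
#include <stdint.h>

/*
 * Hypothetical helper, not part of libeblob: rounds @value up to the next
 * multiple of @bsize. A @bsize of 0 is treated as "no alignment".
 */
static uint64_t align_up(uint64_t value, uint64_t bsize)
{
	uint64_t rem;

	if (bsize == 0)
		return value;

	rem = value % bsize;
	return rem ? value + (bsize - rem) : value;
}

/*
 * Estimated on-disk size of one entry: header + data + footer,
 * rounded up to a multiple of @bsize.
 */
static uint64_t entry_total_size(uint64_t data_size, uint64_t header_size,
		uint64_t footer_size, uint64_t bsize)
{
	return align_up(header_size + data_size + footer_size, bsize);
}
```

In real code the header and footer sizes would come from sizeof() applied to the
library's on-disk structures, such as struct eblob_disk_footer.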