Format
This is not fully implemented just yet, working on it!
Numbers are in MSB-first or big-endian byte order.
Every 4096th byte of the file is a zero, unless a header immediately follows.
All data "chunks" are written to the file prefixed with their length as a 32-bit big endian number, followed by the CRC32 hash of the data. The CRC32 is not included in the length.
Prefixed with a 32 bit length and a hash, similarly to other data chunks, but the length does include the length of the hash.
Values in the file header
- 8 bits -- File format version (Currently 10)
- 48 bits -- Update sequence number. This is the sequence number new updates should start at.
- 48 bits -- Purge sequence number.
- 48 bits -- Purged documents pointer
- 16 bits -- Size of
by_seq
B-tree root - 16 bits -- Size of
by_id
B-tree root - 16 bits -- Size of local documents B-tree root
- The B-tree roots, in the order of the sizes, are B-tree node pointers as described in the "B-trees" section.
- First byte -- 1 if a leaf node or K/V node, 0 if a pointer node or K/P Node
- A list of key-value or key-pointer pairs, each of which is:
- 12 bits -- Key size
- 28 bits -- Value size
- Followed by the Key data,
- then the Value data
In pointer nodes the Value parts of these pairs are pointers to another B-tree node, where keys less than or equal to that pair's Key will be.
- 48 bits -- Node position in file
- 48 bits -- Sub tree size, or disk size of B-tree data below this node. For pointers to leaf nodes this is the size on disk of the pointed to node, otherwise it is the sum of the size of the pointed to node and all the values of this field on the pointers in that node.
- 16 bits -- Reduce value size (NOT present in the root pointers embedded in the file header, as it can be inferred from the lengths in the header)
- Reduce value -- Used for memoizing the intermediate values of a reduce function.
The keys in this B-tree are 48-bit numbers representing the sequence number of a change to the database. This number is strictly increasing.
The values are
- 12 bits -- Size of the document ID
- 28 bits -- Size of the document data
- 1 bit -- Deleted flag (this item in the by-sequence index represents a deletion action if this is set)
- 47 bits -- Position of the document content on disk
- 8 bits -- Content type code
- 32 bits -- Document revision number
- Document ID
- Revision ID
The reduce value in this B-tree is a 40-bit number, indicating the amount of records.
This B-tree is an index of documents by their IDs, the keys are the document IDs.
The keys are ordered as if sorted by memcmp
The values are
- 48 bits -- The sequence number this document was last modified at. If this document is modified, this sequence number should be removed from the by-sequence index.
- 32 bits -- Size of the document data
- 1 bit -- Deleted flag. If this is set this document should be ignored in indexing.
- 47 bits -- Position of the doucment content on disk
- 8 bits -- Content type code
- 32 bits -- Document revision umber
- Revision ID
The reduce value in this B-tree is
- 40 bits -- Number of documents that are not flagged as deleted.
- 40 bits -- Number of documents that are flagged as deleted
- 48 bits -- Total disk size of document content
The high bit of this byte is used to indicate whether the document is compressed on disk with the google snappy compression library, the rest is used for identifying the type of the document data.
- 0 -- JSON data
- 1 -- BLOB data (Attempted to parse as JSON but failed)
- 2 -- BLOB data (Attempted to parse as JSON, JSON contained reserved keys)
- 3 -- BLOB data (No attempt to parse as JSON)
This is identifying information about a document revision that should be small enough to be kept in the indexes. This value is not handled by Couchstore.
It's currently used by Couchbase as
- 64 bits -- Memcached CAS value (unique cookie used by memcache to allow atomic writes)
- 32 bits -- Expiration time of a value
- 32 bits -- User flags as specified by Memcache protocol