==================
Farm Store
==================
Liu Yuan <namei.unix@gmail.com> Taobao Inc.
1. OVERVIEW
Farm is a per-node object store for Sheepdog. It consists of a backend
store, which caches the snapshot objects, and a working directory, which stores
the objects that Sheepdog currently operates on. Since Sheepdog's I/O goes to
the working directory, the I/O performance for VM guests is practically the
same as with Simple Store. [*]
[*] Simple Store is an older storage backend which has been removed from
the tree.
Snapshots are triggered either by the system recovery code or by users, and
Farm is supposed to restore all object states to what they were at the time the
user snapshot was taken. 'Snapshot object' in this context means both meta
objects and data objects.
2. DESIGN
Simply put, Farm closely resembles git (at both the code and the idea level).
There are three object types, named 'data', 'trunk' and 'snapshot' [*], which
are similar to git's 'blob', 'tree' and 'commit'.
[*] shortened to 'snap' below.
A 'data' object is just one of Sheepdog's I/O objects, named only by the sha1
of its content. Data objects with the same content are therefore mapped to a
single sha1 file, which achieves node-wide data sharing.
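The content-based naming described above can be sketched as follows. This is
only an illustration under stated assumptions, not Farm's actual code:
sha1_digest() is a compact textbook SHA-1 over an in-memory buffer, and
object_name() is a hypothetical helper that renders the digest as a hex name.
Compression and file I/O are left out.

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define SHA1_LEN 20	/* assumption: SHA-1 digest size in bytes */

static uint32_t rol32(uint32_t x, int n)
{
	return (x << n) | (x >> (32 - n));
}

/* Textbook SHA-1 of a whole buffer; returns 0 on success, -1 if out of memory. */
int sha1_digest(const void *buf, size_t len, unsigned char out[SHA1_LEN])
{
	uint32_t h[5] = { 0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476, 0xC3D2E1F0 };
	size_t total = ((len + 8) / 64 + 1) * 64;	/* data + 0x80 + 64-bit length */
	uint64_t bits = (uint64_t)len * 8;
	unsigned char *msg = calloc(total, 1);

	if (!msg)
		return -1;
	memcpy(msg, buf, len);
	msg[len] = 0x80;
	for (int i = 0; i < 8; i++)	/* big-endian bit length at the end */
		msg[total - 1 - i] = (unsigned char)(bits >> (8 * i));

	for (size_t off = 0; off < total; off += 64) {
		uint32_t w[80], a = h[0], b = h[1], c = h[2], d = h[3], e = h[4];

		for (int i = 0; i < 16; i++)
			w[i] = (uint32_t)msg[off + 4 * i] << 24 |
			       (uint32_t)msg[off + 4 * i + 1] << 16 |
			       (uint32_t)msg[off + 4 * i + 2] << 8 |
			       (uint32_t)msg[off + 4 * i + 3];
		for (int i = 16; i < 80; i++)
			w[i] = rol32(w[i - 3] ^ w[i - 8] ^ w[i - 14] ^ w[i - 16], 1);
		for (int i = 0; i < 80; i++) {
			uint32_t f, k, t;

			if (i < 20) {
				f = (b & c) | (~b & d);
				k = 0x5A827999;
			} else if (i < 40) {
				f = b ^ c ^ d;
				k = 0x6ED9EBA1;
			} else if (i < 60) {
				f = (b & c) | (b & d) | (c & d);
				k = 0x8F1BBCDC;
			} else {
				f = b ^ c ^ d;
				k = 0xCA62C1D6;
			}
			t = rol32(a, 5) + f + e + k + w[i];
			e = d; d = c; c = rol32(b, 30); b = a; a = t;
		}
		h[0] += a; h[1] += b; h[2] += c; h[3] += d; h[4] += e;
	}
	free(msg);
	for (int i = 0; i < 5; i++) {	/* emit the digest big-endian */
		out[4 * i] = (unsigned char)(h[i] >> 24);
		out[4 * i + 1] = (unsigned char)(h[i] >> 16);
		out[4 * i + 2] = (unsigned char)(h[i] >> 8);
		out[4 * i + 3] = (unsigned char)h[i];
	}
	return 0;
}

/* Hypothetical helper: derive the hex name a data object would be stored under. */
int object_name(const void *buf, size_t len, char name[2 * SHA1_LEN + 1])
{
	unsigned char sha1[SHA1_LEN];

	if (sha1_digest(buf, len, sha1) < 0)
		return -1;
	for (int i = 0; i < SHA1_LEN; i++)
		snprintf(name + 2 * i, 3, "%02x", sha1[i]);
	return 0;
}
```

Two buffers with identical content yield the same name, so they would land in
the same file; that is the sharing effect the text refers to.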
A 'trunk' object ties data objects together into a flat directory structure at
the time the snapshot is taken. The trunk object provides a means to find old
data objects in the store.
A 'snap' object describes the snapshot, either initiated by users or triggered
by the recovery code. The snap object refers to one of the trunk objects. The
two snap log files provide a means to name the desired snap object.
All the objects are depicted in the context of snapshotting or retrieving old
data from the snapshotted objects, that is, those objects are 'cached' into
Farm store by performing snapshot operations.
2. OBJECT LAYOUT
All the objects (snap, trunk, data) in Farm are built on the sha1_file
operations. sha1_file gives us compressed, consistency-aware storage that is
independent of the content or the type of the object.
The object successfully inflates to a stream of bytes that forms a sequence of
        <sha1_file_hdr> + <binary object data>
               |                   |
             header             payload
The payload of the data object is the compressed content of Sheepdog's I/O object.
For a trunk object, the compressed content is
        <array of the struct trunk_entry>
        struct trunk_entry {
                uint64_t oid;
                unsigned char sha1[SHA1_LEN];
        };
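As a rough sketch of how a trunk payload lets us find an old data object, the
function below scans an already inflated array of trunk_entry for a given
object ID. The name trunk_find(), the linear scan and the value of SHA1_LEN are
assumptions made for illustration; Farm's own lookup may differ.

```c
#include <stddef.h>
#include <stdint.h>

#define SHA1_LEN 20	/* assumption: SHA-1 digest size in bytes */

struct trunk_entry {
	uint64_t oid;
	unsigned char sha1[SHA1_LEN];
};

/*
 * Hypothetical helper: return the entry recorded for 'oid' in this trunk,
 * or NULL when the snapshot does not contain that object.
 */
const struct trunk_entry *trunk_find(const struct trunk_entry *entries,
				     size_t nr, uint64_t oid)
{
	for (size_t i = 0; i < nr; i++)
		if (entries[i].oid == oid)
			return &entries[i];
	return NULL;
}
```

The sha1 field of the returned entry names the data object that held the
content of that oid when the snapshot was taken.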
For a snap object, the compressed content is
        <trunk_sha1> + <array of the struct sd_node>
As for snap operations, besides the snap object, Farm has two log files with
the structure below:
        struct snap_log {
                uint32_t epoch;
                uint64_t time;
                unsigned char sha1[SHA1_LEN];
        };
This provides an internal naming mechanism and helps us find snap objects by
epoch.
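To illustrate the lookup by epoch, here is a minimal sketch that searches an
in-memory copy of one log for a given epoch and returns the sha1 of the
matching snap object. snap_log_find() is a hypothetical name, SHA1_LEN is
assumed to be 20, and the sketch assumes at most one record per epoch; reading
the log file itself is not shown.

```c
#include <stddef.h>
#include <stdint.h>

#define SHA1_LEN 20	/* assumption: SHA-1 digest size in bytes */

struct snap_log {
	uint32_t epoch;
	uint64_t time;
	unsigned char sha1[SHA1_LEN];
};

/*
 * Hypothetical helper: return the sha1 of the snap object logged for
 * 'epoch', or NULL if the log has no record for that epoch.
 */
const unsigned char *snap_log_find(const struct snap_log *log, size_t nr,
				   uint32_t epoch)
{
	for (size_t i = 0; i < nr; i++)
		if (log[i].epoch == epoch)
			return log[i].sha1;
	return NULL;
}
```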
3. STALE OBJECT
When storing an object into the backend store at snapshot time, either
a) the content has not changed, so we point to the same old sha1_file (no
   stale object), or
b) the content has been updated, so we point to a new object with a new sha1.
We need to remove the stale object in case b), but only on the assumption that
it was generated by the recovery code. [*]
When we try to store a new snapshot object into the backend store, it is a safe
and convenient time to remove the old object with the same object ID.
User snapshot objects do not need to be removed until the snapshot is deleted.
[*] Here I assume we don't need to restore to the 'sys epoch' state.
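The rule above can be condensed into a small predicate. This is a sketch of the
decision only, with invented names: enum snap_kind and should_remove_old() do
not come from Farm, and SHA1_LEN is assumed to be 20.

```c
#include <stdbool.h>
#include <string.h>

#define SHA1_LEN 20	/* assumption: SHA-1 digest size in bytes */

/* Hypothetical tag for who produced the previously stored object. */
enum snap_kind { SNAP_SYS, SNAP_USER };

/*
 * Decide whether the old object with the same object ID may be removed
 * while the new snapshot object is being stored.
 */
bool should_remove_old(const unsigned char old_sha1[SHA1_LEN],
		       const unsigned char new_sha1[SHA1_LEN],
		       enum snap_kind old_kind)
{
	/* case a): same content, same sha1_file, nothing is stale */
	if (memcmp(old_sha1, new_sha1, SHA1_LEN) == 0)
		return false;
	/* case b): only objects generated by the recovery code are dropped */
	return old_kind == SNAP_SYS;
}
```

Objects belonging to a user snapshot always yield false here, matching the
statement that they stay until the snapshot itself is deleted.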
4. FLOW FIGURE
              sys_snap, user_snap           snapshot requests
                       |                            |
                       |put/get snap_sha1           | trigger
                       v                            |
+----------+        +------+        +--------+      v     +----------+
|          |<------>| snap |<++++++>|        | <========> |          |
|          |        +------+        |        |            |   Farm   |
|          |                        | trunk  |            | Working  |   I/O   +-------+
|          |<---------------------->|        |            | Directory| <~~~~~~>| sheep |
|   Farm   |                        +--------+            |          |         +-------+
| Backend  |                                              |          |
|  Store   |                                              |          |
|          |<-------------------------------------------->|          |
|          |                                              |          |
+----------+                                              +----------+
<-----> put/get objects to/from Farm Store
<+++++> put/get trunk_sha1 to/from snap object
<=====> put/get oid/oid_sha1 pairs to/from trunk object