Skip to content

HTTPS clone URL

Subversion checkout URL

You can clone with HTTPS or Subversion.

Download ZIP
Distributed Storage System for QEMU

Revert "sheep: forbid revival of orphan objects"

The reverted patch actually couldn't pass our 061, which means that we will lose
data in the following case:

0 A cluster with 3 copies
1 3 nodes of the cluster failed at the same time
2 after sometime, the cluster is recovered. As we can expect, some poor objects
  that happened to have all its copies dwell on 3 failed nodes, will not be
  recovered. Instead, all the copies are in those failed nodes. We DON'T lose
  them yet.
3 we start those 3 nodes to expect recover the previously unrecovered objects.
  Unfortunately, the reverted patch will mark all 3 nodes as excluded and not
  try to recover the objects at all. So we LOSE the objects permanentally after
  a new epoch is reached, because a cleanup operation across the cluster will be
  triggered and those good objects are cleaned up.

P.S., Merge the patch not passing our test framework is dangerouse!

Signed-off-by: Liu Yuan <liuyuan@cmss.chinamobile.com>
Signed-off-by: Hitoshi Mitake <mitake.hitoshi@lab.ntt.co.jp>
latest commit ffa26f6030
Liu Yuan authored mitake committed
Failed to load latest commit information.
debian fix control file for missing yasm binary
doc sheep: cleanup generational reference counting code and document
dog dog: fix invalid argument for ec_decode_buffer
include sheep: replace public functions from plain_store
lib sheep: replace public functions from plain_store
lttng lttng: add a script based on babeltrace for analyzing request latency
man man: add script/gen_man.pl to generate manuals
sbd Revert "sbd: use kstrtoul() instead of strict_strtoul()"
script Set maximum number of open file descriptors to 32768
sheep Revert "sheep: forbid revival of orphan objects"
sheepfs sheepfs/http: clear performance tips generated by cppcheck
shepherd sheepdog:show more detail of the crash source
tests tests:fix vnode strategy output
tools tools/zk_control: fix typo of QUEUE_ZNODE
.gitignore tests/earthquake: testing framework and first test case
.gitmodules tests/earthquake: add tgt as a submodule
CHANGELOG.md Open up 1.0 development
COPYING add a COPYING file
INSTALL Update version for 0.7.0 release
MAINTAINERS update MAINTAINERS file for earthquake testing stuff
Makefile.am deb: create deb package from raw source archive
README update README
autogen.sh configure: fix a stale sheep version output
configure.ac configure: enable LTTng with --enable-lttng-ust option
sheepdog.spec.in rpm: fix build error

README

Sheepdog: Distributed Storage System for KVM
============================================

Overview
--------
Sheepdog is a distributed storage system for QEMU. It provides
highly available block level storage volumes to virtual machines. 
Sheepdog supports advanced volume management features such as snapshot,
cloning, and thin provisioning.

Sheepdog is an Open Source software, released under the terms of the
GNU General Public License version 2. 

For the latest information about Sheepdog, please visit our website at:

   http://sheepdog.github.io/sheepdog/

And (recommend for new comers) wiki at:
   https://github.com/sheepdog/sheepdog/wiki/

Requirements
------------
* Three or more x86-64 machines
* Corosync cluster engine

Install
-------
Please read the INSTALL file distributed with this package for detailed
instructions on installing or compiling from source.

Usage
-----

* Cluster Management Backends

   Sheepdog uses a cluster management backend to manage membership and broadcast
   messages to the cluster nodes.

   For now, sheepdog can use local driver (for development on a single box),
   corosync (the default), zookeeper and Accord.

* Local Driver

   This driver just makes use of UNIX IPC mechanism to manage the membership
   on a single box, where we start multiple 'sheep' processes to simulate the
   cluster. It is very easy and fast setup and especially useful to test
   functionality without involving any other software.

   To set up a 3 node cluster using local driver in one liner bash:

      $ mkdir /path/to/store
      $ for i in 0 1 2; do sheep -c local /path/to/store/$i -z $i -p 700$i;done

* Configure corosync.

   Nearly every modern Linux distribution has x86_64 corosync binaries pre-built
   available via their repositories. We recommend you use these packages if they
   are available on your distribution.

   For debian package based systems:

      $ sudo aptitude install corosync libcorosync-dev

   For RPM package based systems:

      $ sudo yum install corosynclib-devel

   Reference our wiki, the corosync(8) and corosync.conf(5) man page for further
   details.

* Setup Sheepdog
   1. Launch sheepdog on each machines of the cluster.

      $ sheep /store_dir

      Notes:
        /store_dir is a directory to store objects. The directory must
        be on the filesystem with an xattr support. In case of ext3, you
        need to add 'user_xattr' to the mount options.

        $ sudo mount -o remount,user_xattr /store_device

   2. Make fs

      $ dog cluster format --copies=3

      --copies specifies the number of default data redundancy. In this case,
      the replicated data is stored on three machines.

   3. Check cluster state

      Following list shows that Sheepdog is running on 32 nodes.

      $ dog node list
        Idx	Node id (FNV-1a) - Host:Port
      ------------------------------------------------
        0	0308164db75cff7e - 10.68.13.15:7000
      * 1	03104d8b4315c8e4 - 10.68.13.1:7000
        2	0ab18c565bc14aea - 10.68.13.3:7000
        3	0c0d27f0ac395f5d - 10.68.13.16:7000
        4	127ee4802991f308 - 10.68.13.13:7000
        5	135ff2beab2a9809 - 10.68.14.5:7000
        6	17bd6240eab65870 - 10.68.14.4:7000
        7	1cf35757cbf47d7b - 10.68.13.10:7000
        8	1df9580b8960a992 - 10.68.13.11:7000
        9	29307d3fa5a04f78 - 10.68.14.12:7000
        10	29dcb3474e31d4f3 - 10.68.14.15:7000
        11	29e089c98dd2a144 - 10.68.14.16:7000
        12	2a118b7e2738f479 - 10.68.13.4:7000
        13	3d6aea26ba79d75f - 10.68.13.6:7000
        14	42f9444ead801767 - 10.68.14.11:7000
        15	562c6f38283d09fe - 10.68.14.2:7000
        16	5dd5e540cca1556a - 10.68.14.6:7000
        17	6c12a5d10f10e291 - 10.68.14.13:7000
        18	6dae1d955ca72d96 - 10.68.13.7:7000
        19	711db0f5fa40b412 - 10.68.14.14:7000
        20	7c6b95212ee7c085 - 10.68.14.9:7000
        21	7d010c31bf11df73 - 10.68.13.2:7000
        22	82c43e908b1f3f01 - 10.68.13.12:7000
        23	931d2de0aaf61cf5 - 10.68.13.8:7000
        24	961d9d391e6021e7 - 10.68.13.14:7000
        25	9a3ef6fa1081026c - 10.68.13.9:7000
        26	b0b3d300fed8bc26 - 10.68.14.10:7000
        27	b0f08fb98c8f5edc - 10.68.14.8:7000
        28	b9cc316dc5aba880 - 10.68.13.5:7000
        29	d9eda1ec29c2eeeb - 10.68.14.7:7000
        30	e53cebb2617c86fd - 10.68.14.1:7000
        31	ea46913c4999ccdf - 10.68.14.3:7000

* Create a virtual machine image
   1. Create a 256 GB virtual machine image of Alice.

      $ qemu-img create sheepdog:Alice 256G

   2. You can also convert from existing KVM images to Sheepdog ones.

      $ qemu-img convert ~/amd64.raw sheepdog:Bob

   3. See Sheepdog images by the following command.

      $ dog vdi list
        name        id    size    used  shared    creation time  object id
      --------------------------------------------------------------------
        Bob          0  2.0 GB  1.6 GB  0.0 MB 2010-03-23 16:16      80000
        Alice        0  256 GB  0.0 MB  0.0 MB 2010-03-23 16:16      40000

* Boot the virtual machine
   1. Boot the virtual machine.

      $ qemu-system-x86_64 -hda sheepdog:Alice

   2. Following command checks used images.

      $ dog vm list
      Name            |Vdi size |Allocated| Shared  | Status
      ----------------+---------+---------+---------+------------
      Bob             |   2.0 GB|   1.6 GB|   0.0 MB| running on xx.xx.xx.xx
      Alice           |   256 GB|   0.0 MB|   0.0 MB| not running

* Snapshot
   1. Snapshot

      $ qemu-img snapshot -c name sheepdog:Alice

      -c flag is meaningless currently

   2. After getting snapshot, a new virtual machine images are added as a not-
      current image.

      $ dog vdi list
        name        id    size    used  shared    creation time  object id
      --------------------------------------------------------------------
        Bob          0  2.0 GB  1.6 GB  0.0 MB 2010-03-23 16:16      80000
        Alice        0  256 GB  0.0 MB  0.0 MB 2010-03-23 16:21      c0000
      s Alice        1  256 GB  0.0 MB  0.0 MB 2010-03-23 16:16      40000

   3. You can boot from the snapshot image by spcifing tag id

      $ qemu-system-x86_64 -hda sheepdog:Alice:1

* Cloning from the snapshot
   1. Create a Charlie image as a clone of Alice's image.

      $ qemu-img create -b sheepdog:Alice:1 sheepdog:Charlie

   2. Charlie's image is added to the virtual machine list.

      $ dog vdi list
        name        id    size    used  shared    creation time  object id
      --------------------------------------------------------------------
        Bob          0  2.0 GB  1.6 GB  0.0 MB 2010-03-23 16:16      80000
        Alice        0  256 GB  0.0 MB  0.0 MB 2010-03-23 16:21      c0000
      s Alice        1  256 GB  0.0 MB  0.0 MB 2010-03-23 16:16      40000
        Charlie      0  256 GB  0.0 MB  0.0 MB 2010-03-23 16:23     100000

Test Environment
----------------
    - Debian squeeze amd64
    - Debian lenny amd64

===============================================================================
Copyright (C) 2009-2011, Nippon Telegraph and Telephone Corporation.
Something went wrong with that request. Please try again.