kumofs is a scalable and highly available distributed key-value store.
- Data is replicated over multiple servers.
- Data is partitioned over multiple servers.
- Extremely fast single-node performance, comparable to memcached.
- Both read and write performance improve as servers are added.
- Servers can be added without stopping the system.
- Servers can be added without changing the client applications.
- The system does not stop even if one or two servers crash.
- The system does not need to stop to recover crashed servers.
- Scalable from 2 to 60 servers. (more than 60 servers has not been tested yet)
- Optimized for storing a large amount of small data.
- memcached protocol support.
- supported commands are get (+get_multi), set, delete, gets and cas.
- specify the -F option to kumo-gateway to save flags.
- specify the -E option to kumo-gateway to save expiration time.
kumofs supports the following four operations:
Set(key, value) Stores the key-value pair. Each key-value pair is copied to three servers. If the Set operation fails (because of network trouble, etc.), the associated value of the key becomes indefinite. Retry setting the value, delete the key, or avoid getting the key.
value = Get(key) Retrieve the associated value of the key.
Delete(key) Delete the key and its associated value.
CAS(key, value, compare) Compare-and-Swap the key and its associated value. The semantics of the CAS operation are that "the swapping always fails if the comparison fails". This means that the swapping may not succeed even if the comparison succeeds. This restriction arises when some servers are detached or attached. You are required to retry the operation if the swapping fails.
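The four operations and the CAS retry pattern described above can be sketched as follows. Note that this is a hypothetical in-memory model used only to illustrate the semantics, not the kumofs client API: CAS swaps the value only when the stored value equals the compare value, and callers must retry when the swap fails.

```python
# Minimal in-memory model of kumofs' four operations (NOT the real
# client API). It only illustrates the CAS semantics: the swap
# succeeds only when the stored value equals `compare`, and callers
# retry when it fails.

class MockStore:
    def __init__(self):
        self._data = {}

    def set(self, key, value):
        self._data[key] = value

    def get(self, key):
        return self._data.get(key)

    def delete(self, key):
        self._data.pop(key, None)

    def cas(self, key, value, compare):
        # Swap only if the current value equals `compare`.
        if self._data.get(key) != compare:
            return False
        self._data[key] = value
        return True

def increment(store, key, retries=10):
    """CAS retry loop: re-read and re-attempt until the swap succeeds."""
    for _ in range(retries):
        current = store.get(key)
        if store.cas(key, current + 1, compare=current):
            return current + 1
    raise RuntimeError("CAS kept failing; back off and retry later")
```

With a real kumofs cluster the same retry loop applies, except that CAS may also fail transiently while servers are being attached or detached.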
The following environment is required to build kumofs:
- linux >= 2.6.18
- g++ >= 4.1
- ruby >= 1.8.6
- Tokyo Cabinet >= 1.4.10
- MessagePack for C++ >= 0.3.1
- MessagePack for Ruby >= 0.3.1
- libcrypto (openssl)
- zlib
Configure and install in the usual way:
$ ./bootstrap # if needed
$ ./configure
$ make
$ sudo make install
To install Tokyo Cabinet, get the latest source package from the web site and run ./configure && make && make install.
To install MessagePack, get the latest source package from the web site and run ./configure && make && make install.
The administration tools are implemented in Ruby. Since they require the "msgpack" package, install it using RubyGems.
$ sudo gem install msgpack
The latest release of kumofs is available at Downloads. Get it and run ./configure && make && make install.
$ ./bootstrap # if needed
$ ./configure
$ make
$ sudo make install
Depending on the install directory of MessagePack or Tokyo Cabinet, the --with-msgpack or --with-tokyocabinet option may have to be added to ./configure.
$ ./bootstrap
$ ./configure --with-msgpack=/usr/local --with-tokyocabinet=/opt/local
$ make
$ sudo make install
A kumofs system consists of the following three daemons:
kumo-server this node stores data. Run this daemon on at least one host. You can add kumo-servers after building the cluster.
kumo-manager this node manages the kumo-servers. Run this daemon on one or two hosts.
kumo-gateway this daemon receives requests from applications and relays them to the kumo-servers. Run this daemon on each host that runs applications that use kumofs.
This tutorial explains how to build a kumofs cluster using 3 hosts named svr1, svr2 and svr3. Run kumo-manager on svr1 and svr2, and run kumo-server on svr1, svr2 and svr3. Then use the kumofs cluster from a client host named cli1.
First, run kumo-manager on svr1 and svr2. kumo-manager requires the IP address of its host and the IP address of the other kumo-manager.
[on svr1]$ kumo-manager -v -l svr1 -p svr2
[on svr2]$ kumo-manager -v -l svr2 -p svr1
kumo-manager listens on 19700/tcp by default.
Second, run kumo-server on svr1, svr2 and svr3. kumo-server requires the IP address of its host, the IP addresses of the kumo-managers and the path to the database file:
[on svr1]$ kumo-server -v -l svr1 -m svr1 -p svr2 -s /var/kumodb.tch
[on svr2]$ kumo-server -v -l svr2 -m svr1 -p svr2 -s /var/kumodb.tch
[on svr3]$ kumo-server -v -l svr3 -m svr1 -p svr2 -s /var/kumodb.tch
Third, attach the kumo-servers into the cluster using the kumoctl command. First, confirm that the kumo-servers are recognized by the kumo-manager:
[ ]$ kumoctl svr1 status
hash space timestamp:
Wed Dec 03 22:15:55 +0900 2008 clock 62
attached node:
not attached node:
192.168.0.101:19800
192.168.0.102:19800
192.168.0.103:19800
Recognized kumo-servers are listed under "not attached node". Then, attach them as follows:
[ ]$ kumoctl svr1 attach
Attached kumo-servers are listed under "attached node". Confirm it as follows:
[ ]$ kumoctl svr1 status
hash space timestamp:
Wed Dec 03 22:16:00 +0900 2008 clock 72
attached node:
192.168.0.101:19800 (active)
192.168.0.102:19800 (active)
192.168.0.103:19800 (active)
not attached node:
Finally, run kumo-gateway on each host that runs applications that use kumofs. kumo-gateway requires the IP addresses of the kumo-managers and the port number that accepts memcached protocol clients.
[on cli1]$ kumo-gateway -v -m svr1 -p svr2 -t 11211
The kumofs cluster is now ready to use. Connect to localhost:11211 using a memcached client on cli1.
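Because kumo-gateway speaks the memcached text protocol, any memcached client library works. As a dependency-free sketch, the helpers below build raw memcached protocol requests by hand; send_command() is one minimal way to talk to the gateway over a socket (localhost:11211 is assumed to be a running kumo-gateway). Remember that the gateway discards flags and expiration time unless it was started with -F and -E.

```python
# Hand-rolled memcached text protocol requests for kumo-gateway.
# This is an illustrative sketch, not an official client.

import socket

def build_set(key, value, flags=0, exptime=0):
    # "set <key> <flags> <exptime> <bytes>\r\n<data>\r\n"
    data = value.encode()
    return b"set %b %d %d %d\r\n%b\r\n" % (
        key.encode(), flags, exptime, len(data), data)

def build_get(key):
    # "get <key>\r\n"
    return b"get %b\r\n" % key.encode()

def send_command(request, host="localhost", port=11211):
    # Send one request to the gateway and return the raw reply.
    with socket.create_connection((host, port)) as sock:
        sock.sendall(request)
        return sock.recv(4096)
```

For example, `send_command(build_set("foo", "bar"))` stores the pair, and `send_command(build_get("foo"))` retrieves it, assuming a gateway is listening on localhost:11211.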
To add a server to the system built in this tutorial, run kumo-server on another host, and then attach it using the kumoctl command.
[on svr4]$ kumo-server -v -l svr4 -m svr1 -p svr2 -s /var/kumodb.tch
[ ]$ kumoctl svr1 attach
To try kumofs on a single host, run all daemons on localhost with distinct ports:
[localhost]$ kumo-manager -v -l localhost
[localhost]$ kumo-server -v -m localhost -l localhost:19801 -L 19901 -s ./database1.tch
[localhost]$ kumo-server -v -m localhost -l localhost:19802 -L 19902 -s ./database2.tch
[localhost]$ kumo-server -v -m localhost -l localhost:19803 -L 19903 -s ./database3.tch
[localhost]$ kumo-gateway -v -m localhost -t 11211
You can get the status of the cluster from the kumo-manager using the kumoctl command. Specify the IP address of the kumo-manager as the first argument and 'status' as the second argument.
[ ]$ kumoctl svr1 status
hash space timestamp:
Wed Dec 03 22:15:45 +0900 2008 clock 58
attached node:
192.168.0.101:19800 (active)
192.168.0.102:19800 (active)
192.168.0.103:19800 (active)
192.168.0.104:19800 (fault)
not attached node:
192.168.0.104:19800
hash space timestamp The time at which the list of attached kumo-servers was last updated. It is updated when a new kumo-server is added or an existing kumo-server goes down.
attached node The list of attached kumo-servers. (active) indicates a normal node; (fault) indicates a failed node, or a recovered node that has not been re-attached yet.
not attached node The list of recognized but not-attached nodes.
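For scripting, the status output can be parsed into these three parts. The helper below is a convenience sketch, not an official tool; it assumes the exact output format shown above, with the "attached node:" and "not attached node:" section headers and the "(active)" / "(fault)" suffixes.

```python
# Classify nodes from `kumoctl <manager> status` output.
# Assumes the output format shown in this document.

def parse_status(text):
    sections = {"attached": [], "not_attached": []}
    current = None
    for line in text.splitlines():
        line = line.strip()
        if line == "attached node:":
            current = "attached"
        elif line == "not attached node:":
            current = "not_attached"
        elif current and line and ":" in line:
            # Node lines look like "192.168.0.101:19800 (active)"
            sections[current].append(line)
    faults = [n for n in sections["attached"] if "(fault)" in n]
    return sections, faults
```

A monitoring script could run `kumoctl svr1 status`, pass the output through parse_status(), and alert when the faults list is non-empty.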
kumo-server is the node that stores data. You can get load information, such as the number of items stored on a node, using the kumostat command.
$ kumostat svr3 items
$ kumostat svr3 cmd_get
uptime uptime of the kumo-server
version version of the kumo-server
cmd_get total number of processed get requests
cmd_set total number of processed set requests
cmd_delete total number of processed delete requests
items number of stored items
The kumotop command shows the status of the servers, similar to the top command.
$ kumotop -m svr1
kumofs distributes key-value pairs across multiple servers. The kumohash command looks up which nodes store a given key.
$ kumohash -m svr1 assign "the-key"
First, confirm which of the kumo-servers are down:
[ ]$ kumoctl m1 status # m1 is the address of the kumo-manager
hash space timestamp:
Wed Dec 03 22:15:35 +0900 2008 clock 50
attached node:
192.168.0.101:19800 (active)
192.168.0.102:19800 (active)
192.168.0.103:19800 (active)
192.168.0.104:19800 (fault)
not attached node:
(fault) nodes are down.
Second, reboot the host and run kumo-server. Before running kumo-server, remove the database file; otherwise, deleted keys may be restored. The status of the cluster will then be as follows:
[ ]$ kumoctl m1 status
hash space timestamp:
Wed Dec 03 22:15:45 +0900 2008 clock 58
attached node:
192.168.0.101:19800 (active)
192.168.0.102:19800 (active)
192.168.0.103:19800 (active)
192.168.0.104:19800 (fault)
not attached node:
192.168.0.104:19800
Finally, attach the kumo-server:
[ ]$ kumoctl m1 attach
[ ]$ kumoctl m1 status
hash space timestamp:
Wed Dec 03 22:15:55 +0900 2008 clock 62
attached node:
192.168.0.101:19800 (active)
192.168.0.102:19800 (active)
192.168.0.103:19800 (active)
192.168.0.104:19800 (active)
not attached node:
To get the best performance, create the database file using the tchmgr command before starting kumo-server. See the Tokyo Cabinet documentation for details.
Example:
[on svr1]$ tchmgr create /var/kumodb.tch 1310710
[on svr1]$ kumo-server -v -l svr1 -m mgr1 -p mgr2 -s /var/kumodb.tch
kumo means "cloud" in Japanese. Clouds don't fall down ;-) fs means "Fast Storage".