Clustering Guide

natacado edited this page Sep 12, 2010 · 3 revisions

Dynomite was written from the ground up to be used as a distributed system. It’s easy to get started and add nodes to a running cluster.

Let’s say you started up a node on a single machine called host A.

[user@hosta]$ ./bin/dynomite start -c config.json -l /var/log/dynomite -d

Since N is set to 3, you should add at least 2 other nodes to the network in order to achieve your replication requirements. On host B you should start with a slightly different command line.

[user@hostb]$ ./bin/dynomite start -c config.json -j dynomite@hosta -l /var/log/dynomite -d

This tells the dynomite instance on host B that it should try to contact host A and take over some of its partitions. After this operation is done, half of the partitions will be mastered by host B. After host B comes up you can add host C to the mix.

[user@hostc]$ ./bin/dynomite start -c config.json -j dynomite@hosta -l /var/log/dynomite -d

Congratulations, you now have a 3 node dynomite network. Clients can connect to any of these nodes on either the thrift port (9200) or the ascii port (11222). In ideal conditions each node in the network will present a consistent view of the entire data store. In the case of network partitions and other errors consistency and eventually durability will degrade with the level of failure.

Joining nodes to an already running cluster

To join a node to an already running network one need only start a new node on a host and join it to one of the nodes already running in a network. To extend the example above, one would issue the following command on host D.

[user@hostd]$ ./bin/dynomite start -c config.json -j dynomite@hosta -l /var/log/dynomite -d

The difference is that if the dynomite network has been running for some time, there may be a significant amount of data to transfer over to host D. This is currently accomplished via the synchronization mechanism, which isn’t the most efficient for bulk transfer of data. So the one caveat with adding nodes to a running network is that it will cause load spike on some of the nodes in the network as the newer empty node is brought up to speed.

Currently there are no provisions made for quickly growing a dynomite network. For instance, if one wanted to double the number of nodes in a network, it may dilute the partition map enough to cause data loss. This will be addressed in the near future with a more controlled bootstrapping procedure for new nodes.