DHT Setup Guide

Thilina Buddhika edited this page Jun 12, 2020 · 5 revisions

The main steps for setting up a Sustain-DHT cluster are listed below.

  1. Setting up a Zookeeper cluster
  2. Configuring a storage node
  3. Launching the DHT
  4. Launching the proxy nodes
  5. Terminating a cluster

1. Setting up a Zookeeper cluster

Synopsis DHT uses Apache Zookeeper for membership management. We have tested our implementation with v3.4.6 of Zookeeper.

It is recommended to run a Zookeeper ensemble to avoid a single point of failure. Please follow the official guide on setting up a Zookeeper cluster. In our experience, an ensemble of 3 nodes is sufficient to manage a cluster of a few hundred physical nodes.
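
The official guide remains the authoritative reference. As a rough sketch only: in a replicated setup, the zoo.cfg of each server lists every ensemble member as a server.N=host:port:port entry, and each server keeps its own N in a myid file inside its data directory. The helper below prints those entries for a list of hosts. The function name zk_server_lines and the host names are hypothetical, and ports 2888 and 3888 are simply the ones used in the Zookeeper documentation examples; adjust them to your environment.

```shell
# Hypothetical helper: prints one "server.N=host:2888:3888" line
# for each host name passed as an argument, numbering from 1.
zk_server_lines() {
    id=1
    for host in "$@"; do
        echo "server.${id}=${host}:2888:3888"
        id=$((id + 1))
    done
}

# Example (host names are placeholders):
#   zk_server_lines zk-1 zk-2 zk-3 >> conf/zoo.cfg
```

The same N used in a server.N entry must be written to the myid file on the corresponding machine.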

2. Configuring a storage node

Extract the binary archive. Let's refer to the resulting directory as synopsis-dht-0.x. The configuration file, named dht-node-config.yaml, is available in the conf directory of synopsis-dht-0.x.

The following key properties need to be configured.

| Property | Description |
| --- | --- |
| ingestionServicePort | Port where all gRPC services will be running |
| zkEnsemble | List of Zookeeper servers |
| storageDirs | Storage directories and the maximum space allowed to be used in GB, provided as a YAML dictionary |
| rootJournalLoc | Location used for storing the log of the Node Manager |
| memTableSize | Maximum allowed size of a memory table |
| blockSize | Maximum allowed size of a block within a SSTable |
| metadataStoreDir | Directory used for storing the logs of the Entity Stores |
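
Before launching a node, it can help to confirm that none of these properties was accidentally dropped from the configuration file. The following is a minimal sketch of such a check, not part of the distribution. It assumes each property appears as a top-level YAML key at the start of a line; the function name missing_props is hypothetical.

```shell
# Key properties listed in the table above.
REQUIRED_PROPS="ingestionServicePort zkEnsemble storageDirs rootJournalLoc memTableSize blockSize metadataStoreDir"

# Hypothetical helper: prints every required property that has no
# "name:" entry at the start of a line in the file given as $1.
# Assumption: the properties are top-level (unindented) YAML keys.
missing_props() {
    for prop in $REQUIRED_PROPS; do
        grep -q "^${prop}:" "$1" || echo "$prop"
    done
}

# Example:
#   missing_props synopsis-dht-0.x/conf/dht-node-config.yaml
```

An empty result means every key in the table is present. The check does not validate the values themselves.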

3. Launching the DHT

To launch a single node, use the following commands.

cd synopsis-dht-0.x
sh bin/node_starter.sh

To launch multiple nodes at once, you can use the dssh utility script, which is included in the bin directory. The list of machines can be provided in one of several input formats supported by the script. In the following example, the list is provided as a file in which machine names are separated by newlines. An example list of machines is given below.

lattice-1
lattice-2
lattice-3
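
When machine names follow a numbered pattern like the one above, the file can be generated rather than typed by hand. This is a small sketch under that assumption; the function name gen_hosts is hypothetical.

```shell
# Hypothetical helper: prints PREFIX-1 ... PREFIX-COUNT, one name per line.
#   $1 = name prefix (e.g. lattice), $2 = number of machines
gen_hosts() {
    i=1
    while [ "$i" -le "$2" ]; do
        echo "$1-$i"
        i=$((i + 1))
    done
}

# Example: write the three-machine list shown above to a file.
#   gen_hosts lattice 3 > dht_nodes
```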

Assuming this list is saved in a file named dht_nodes, run the following commands.

cd synopsis-dht-0.x
sh bin/dssh -cap -f dht_nodes 'cd /path/to/synopsis-dht-0.1/bin;nohup sh node_starter.sh > /tmp/$HOSTNAME-dht-node.log &'

This launches each node with nohup, so the node ignores the hangup signal and keeps running after the ssh session is terminated. The logs are written to a file in the /tmp directory of each machine. To read the log files, we can use the same utility.

sh bin/dssh -cap -f dht_nodes 'tail -f /tmp/$HOSTNAME-dht-node.log'
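
The launch command shown earlier redirects only standard output to the log file. A possible variant, sketched below, also captures standard error and detaches standard input, which is a common way to keep a remote shell from holding the ssh session open. The helper name remote_launch_cmd is hypothetical, and the extra redirections are a suggestion rather than something the guide prescribes.

```shell
# Hypothetical helper: prints the remote command string to pass to dssh.
#   $1 = bin directory of the installation on the remote machines
#   $2 = log file suffix, e.g. dht-node
# $HOSTNAME is kept literal (single-quoted format string) so that each
# remote shell expands it to its own host name.
remote_launch_cmd() {
    printf 'cd %s;nohup sh node_starter.sh > /tmp/$HOSTNAME-%s.log 2>&1 < /dev/null &\n' "$1" "$2"
}

# Example:
#   sh bin/dssh -cap -f dht_nodes "$(remote_launch_cmd /path/to/synopsis-dht-0.1/bin dht-node)"
```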

4. Launching the proxy nodes

Extract the archive. Let's refer to the extracted directory as synopsis-proxy-0.x. Proxy nodes share the same configuration file as the DHT nodes, but the only property that takes effect is ingestionServicePort. Change it if required.

To launch a single proxy node:

cd synopsis-proxy-0.x
sh bin/node_starter.sh proxy

To launch multiple proxy nodes at once:

sh bin/dssh -cap -f proxy_nodes 'cd /path/to/synopsis-proxy-0.1/bin;nohup sh node_starter.sh proxy > /tmp/$HOSTNAME-proxy-node.log &'

5. Terminating a cluster

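Use the kill_node.sh script to terminate a node. To terminate the whole cluster, run the script on every machine through dssh, where host_names is a file listing the machines.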

sh bin/dssh -cap -f host_names 'cd /path/to/synopsis-dht-0.1/bin; sh kill_node.sh'