hack

sit edited this page Mar 11, 2013 · 2 revisions
Clone this wiki locally

Hacking with Chord

Hacking with Chord

This document will introduce you writing applications that make use of various functionality in the MIT Chord distribution. This process is not for the faint-of-heart: as of April 2005, it is not well documented and has only been done by graduate students that are very familiar with the internals of the Chord code. The difficulty is futher compounded by the fact that the implementation makes use of the somewhat under-documented SFS libraries. Please send us suggestions and comments.

Architecture

The Chord/DHash daemon, lsd, is structured as a single monolithic process that provides two important functions:

  • Routing (via the Chord protocol and its many variants)
  • Storage (using DHash)

lsd communicates with peer nodes to handle these functions, using ONC RPC protocols. In addition, lsd listens on a separate TCP port and local unix domain socket as a dhashgateway to provide applications using the DHT with a simple interface for inserting and retrieving blocks. (Note that DHash is designed primarily to provide immutable storage; for example, there is no method for deletion of data.) We have two programming interfaces that you can use to interact with DHash and Chord.

C++ interface

The C++ interface to dhashgateway is using libdhashclient which is built in the dhash/ directory of the source tree. This is the only library that needs to be linked into an application program in order to call out to an external lsd process. The header for this library is dhashclient.h.

This interface has actual stub functions that take care of marshalling and calling the RPCs for you.
Compared to the Python interface, you will need to learn how to program in the SFS callback style, but this interface can operate much faster than the Python one. Also, only an asynchronous interface is provided; there are currently no synchronous stubs (though it would not be too difficult to produce wrappers around the current functions that simply block calling acheck()).

For an example of using this interface, see the source code for dhash_benchmark.C in the devel/ sub-directory and associated rules in devel/Makefile.am. A very simplified skeleton program might look something like:

#include <async.h>
#include <string.h>
#include <dhash_prot.h>
#include <dhashclient.h>

void cb (dhash_stat status, ptr<insert_info> i) {
  exit (!(status == DHASH_OK));
}
int main (int argc, char *argv[]) {
  ptr<dhashclient> dhash = New refcounted<dhashclient> ("/tmp/chord-sock");
  dhash->insert ("hello", strlen ("hello"), wrap (&cb));
  amain ();
}

There is no stub wrapper support for calling the various Chord RPCs though you can also do that directly (see devel/rpclib.C and devel/findroute.C).

Python interface

The Python interface to dhashgateway is less well tested and relies on the Python stubs generated by the rpcc compiler included in SFS-0.8pre and later. (As of April 2005, you must use the CVS version of SFS to use this feature; a release of SFS-0.8 is not anticipated prior to June 2005.) If you have a recent CVS snapshot of Chord and SFS, the relevant .py files will be automatically generated for you in your svc build directory.

There are also several other files that are needed in order to write Python clients --- svc/bigint.py, devel/RPCProto.py and devel/RPC.py. The easiest way to get all these files in the right place is to gmake install and then add ${prefix}/share/chord-0.1 to your PYTHONPATH. However, you can manually add the various directories to your PYTHONPATH as well, if you prefer running directly from your build tree or source tree.

RPC.py provides an asynchronous (using asyncore) and synchronous implementation of RPC calling. This is roughly equivalent to libarpc from SFS --- there is not yet a Python equivalent of libdhashclient. Further, this implementation only supports connecting to the TCP socket of lsd and not the local unix domain socket: lsd listens for TCP connections on the port above the port specified on the command-line by the -p flag. For example, if lsd is run with -p 10000, you would need to connect to port 10001.

A very simplified skeleton program might look like:

import RPC
import sha
import dhashgateway_prot
client = RPC.SClient (dhashgateway_prot,
    dhashgateway_prot.DHASHGATEWAY_PROGRAM, 1, "yourhost", 10001)
sock = client.start_connect ()
print sock
arg = dhashgateway_prot.dhash_insert_arg ()
arg.block   = make_block (data_size)
sobj = sha.sha(arg.block)
arg.blockID = bigint(sobj.digest())
print "Inserting", arg.blockID
arg.ctype   = dhash_types.DHASH_CONTENTHASH
arg.len     = data_size
arg.options = 0
arg.guess   = bigint(0)

res = client (dhashgateway_prot.DHASHPROC_INSERT, arg)
print res

Debugging Information

The lsd daemon will print out useful debugging information when signaled. SIGUSR1 prints statistics about the node including its routing tables and RPC timer history. SIGUSR2 will stop the stabilization process. This is useful if you wish to test how Chord behaves after a failure occurs and before it can be repaired. SIGINT will exit lsd cleanly, useful if you are using dmalloc and want a memory usage report. You can also use the lsdctl command to get some information from from a running lsd process.

Because it uses the SFS RPC library, setting the environment variables ACLNT_TRACE and/or ASRV_TRACE will cause our applications to pretty-print each RPC it sends or recieves. Try setting these to 2 for basic information and up to 10 to see arguments in more detail.

Layout

The bulk of the Chord protocol is implemented in the 'chord` directory. It contains:

  • chord container object and vnode interface: chord.h
  • Basic Chord vnode implementation: chord_client.C, server.C, chord.C.
  • Basic Chord data structures:
    • Stabilizable superclass for managing periodic updates
    • Successor list: succ_list
    • Finger table: finger_table
    • Predecessor list, so we can estimate key range we might replicate in DHash: pred_list
    • Proximity neighbor selection based fingers for performance: finger_table_pns
    • Accordion routing table: accordion_table
  • RPC transport subsystem: comm.h, comm.C, stp_manager.C
  • Routing:
    • Main route iterator structure route.h
    • Debruijn routing
    • Standard Chord routing (iterative for various finger types)
    • Recursive Chord routing (recursive for various finger types)
    • Accordion routing

The various different configurations (e.g. SOSP2001 vs NSDI2004 vs ...) are created as different implementations of the vnode interface with various routing implementations and supporting data structures instantiated. An produce_vnode method is used (see table in lsd/lsd.C) to select appropriate instantiations.

Protocols are all specified in the svc directory. Notably relevant:

  • All RPCs are encapsulated in a Chord specific RPC wrapper defined by transport_prot.x. This is needed mostly for Vivaldi coordinates (SIGCOMM).
  • chord_types.x specifies base types used in many places. This is separate from the main protocol definition in order to reduce dependencies between files elsewhere in the system and the Ch ord protocol.
  • chord_prot.x specifies basic RPCs used by all variants of the Chord protocol, mainly fetching successors and predecessors with varying degrees of detail.
  • fingers_prot.x defines the RPCs used to get fingers (in both the SOSP2001 and NSDI2004 PNS based variants).
  • accordion_prot.x is used for the Accordion algorithm (NSDI2005).
  • debruijn_prot.x is probably broken and used by Debruijn routing (IPTPS).
  • Recursive routing state is handled in recroute_prot.x.