Adhd is an asynchronous, distributed hard drive. Actually we're not sure if it's really asynchronous or even what asynchronicity would mean in the context of a hard drive, but it is definitely distributed. Adhd is essentially a management layer (written using Ruby and eventmachine) which controls clusters of CouchDB databases to replicate files across disparate machines. Unlike most clustering storage solutions, adhd assumes that machines in the cluster may have different capabilities and is designed to work both inside and outside the data centre.
Someone told us recently, “if you're not ashamed of your code, you're not releasing early enough.” The shame we feel hopefully indicates that the decision to start coding on Github from day 1 was ideologically correct in an agile world.
So, adhd is aggressively alpha. The docs suck, the latest code from Github may or may not work at all, we're still feeling our way forward on how to write tests for a distributed system, and not all features work. On the other hand, it is possible to install a ruby gem, start several nodes, add files to the system, and see them replicate across nodes. Performance optimizations and some of the more advanced sharding features are the main missing things; the software does sometimes work at present.
The big recent triumph is that we have acheived “jellyfish” status, one of our design goals. We uploaded a bunch of 60MB files, watched replication happen, then cut the connection between several parts of the cluster and uploaded more files. When the connection between the pieces of the system was restored, content replicated properly throughout the nodes and the system recombined into a single cluster once again.
How it works
The user issues an HTTP PUT request to a ADHD url. This request can go to any machine in the ADHD cluster. A file is attached to the body of the request, and ADHD stores the file on a number of different servers. At a later time, the user can issue a GET request to the same url and get back their file.
Under the Hood
Multiple server nodes are part of an ADHD network. Each node has a full copy of two CouchDB databases: the node management database, and the shard range database.
The node management database keeps a record of all existing nodes, their location (IP), and current status. This database is replicated across all nodes in the cluster; any node is in theory capable of becoming a management node if the current management nodes become unavailable.
The shard range database is also replicated everywhere. It allows any node to find out which storage nodes are responsible for holding a particular piece of content. Management nodes are in charge of maintaining the shard database and pushing changes to non-management nodes.
Storage nodes hold one or more shards in addition to the management and shard range databases. These shards contain the file content and some metadata about the stored file. Physically, a shard consists of a CouchDB database which holds Couch documents - one attached file per document.
A PUT request reaches any ADHD node. The first line of the request is parsed, and the ADHD node figures out the name, MIME type and content length of the file based on the incoming HTTP request headers. Based on the MD5 hash of the filename, we extract a 20-byte internal ID string for the file and use this string to figure out which shard range the file will be assigned to.
Once we have a shard range, we ask the shard range database which node(s) are in charge of storage for the given shard and are currently available. One of these nodes is chosen (starting with a master node and falling back to other nodes as necessary). The file metadata is written as a CouchDB document to the chosen node, followed by the file as an attachment.
The GET request is basically the same as the PUT request, except that the file is retrieved from the CouchDB shard instead of being created on a shard.
Replication happens differently for different databases. The node management database and the shard range database are periodically synced to and from the management nodes. The content shards are replicated when they get updated (i.e. when a file is added to a node which is responsible for a given shard, the file will be replicated to all other nodes which are responsible for the same shard).
We can have multiple redundant nodes storing the same files, getting rid of the need for backups. Not every node needs to store every file, so we can scale up storage capacity by adding more nodes. The design also provides load balancing as it means that we can serve files from multiple nodes.
Because all admin information is shared by every node, the unavailability of any node does not jeopardize the operation of the system. As soon as a node becomes unavailable, any other node can perform its functions (although we will lose storage capacity throughout the cluster if we lose multiple large storage nodes).
Some vapourware objectives
The design also allows for different capabilities in storage nodes. Shards can be assigned based on specific properties (available bandwidth, available processing power, etc) so that we can store files in the most efficient possible way. It might be desirable, for example, to put all video files on machines with fast processors for doing video encoding, whereas audio and photos wouldn't need any post-processing and could go somewhere else. Or we could store new and popular content in shard ranges which are on fast servers with high throughput, while putting less popular content on servers with less bandwidth (and lower storage costs). Ideally we would like to get to a point where low-demand files can be stored on relatively low-bandwidth home or office network connections and still be available in the cluster.
Fictional use cases (no one has ever used this software in real life)
Wikipedia: photos of Cheryl Cole and Robbie Williams are unfortunately wildly popular at present and will be requested often. Photos of the millions of singers with more talent and less marketing budgets will not be requested very often, but it's still good if they are in the cluster. If by some stroke of good luck Kevin Quain suddenly gets the recognition he deserves, his photo will be shifted from the “dial-up” node class to something with more stomp.
Archive.org: those sex-ed videos and Bert the turtle from the Prelinger Archive are awesome and everybody wants to watch them, so they should be on beefy video-storage nodes with high bandwidth. Documentaries on the yellow-bellied sapsucker may be requested only once in a while.
BBC News: material going up on the site right now will need a lot of bandwidth, but by next week most of this media will be out of the news cycle and can be consigned to a set of servers with much less bandwidth. By next year, it is unlikely that this media will be viewed even a few times a day, so it could be smart to trade storage costs for speed and put old media on cheaper, lower-bandwidth boxes with huge hard drives.
Don't use this software yet. It is experimental and may eat your mother.
Having said that,
sudo apt-get install couchdb; sudo gem install adhd
may do the job.
You'll need a node_config.yml file that looks something like this:
--- node_name: my_adhd_node node_url: "0.0.0.0" couchdb_server_port: 5984 http_server_port: 5985 buddy_server_url: "http://other.couch.server:5984" buddy_server_node_name: another_adhd_node
The config parameters are:
node_name: a unique prefix name that will be used as an identifier for your nodes in the global node database.
node_url: the hostname or IP which the node should bind to.
couchdb_server_port: the port that CouchDB is running on, usually 5984.
http_server_port: the port which the adhd node should run on.
buddy_server_url: the URL (including the CouchDB port) of any other adhd node which is currently running. The node will connect to the buddy_server_url and replicate the cluster's management database information from it. This can be blank for the first node or if you want to make an entirely new cluster.
buddy_server_node_name: the node_name of the buddy node as set in its config file.
You can start adhd once it's installed by invoking it like this:
adhd -c /path/to/node_config.yml
Note that by default, CouchDB binds only to the IP 127.0.0.1, so if you want to be able to replicate in a cluster across machines, you will need to modify your CouchDB instances so that they bind to 0.0.0.0 (all available interfaces). Don't do this on a public-facing server unless you work out a security vpn, are really promiscuous or feeling insane.
If it worked, you should see a bunch of messages in your console telling you what's replicating, how many shards there are, etc. If you want to test out multiple nodes on the same machine, you can simply have multiple config files which use different http_server_port and node_name values.
Once a node has started, you can do something like this to add content to it:
curl -i -0 -T /path/to/somefile.txt http://192.168.1.93:5985/adhd/somefile.txt
That command should PUT somefile.txt into the adhd cluster. If there is more than one node running, somefile.txt will be automatically replicated to the proper shard.
Note on Patches/Pull Requests
Fork the project.
Make your feature addition or bug fix.
Add tests for it. This is important so I don't break it in a future version unintentionally.
Commit, do not mess with rakefile, version, or history (if you want to have your own version, that is fine but bump version in a commit by itself I can ignore when I pull).
Send me a pull request. Bonus points for topic branches.
Copyright © 2009 dave@netbook. See LICENSE for details.