A very simple peer-to-peer implementation using consistent hashing, the Thrift protocol and its basic RPC capabilities. It is not expected to be generically usable by arbitrary projects: thrifty-p2p implements a flat network, which is justifiable for the author's purposes (distributing processing amongst dozens of data-rich nodes), but will encounter trouble scaling.
python location.py [-h peer_node] [-p port_num] [--help]
Initiates and/or joins a simple peer-to-peer network. Default port_num is 9900. Absent a peer_node (which is the peer initially contacted for joining the network), the command initiates a network.
Example usage, in different terminal windows:
python location.py
python location.py --host localhost:9900 --port 9901
python location.py -h localhost:9901 -p 9902
... etc. ...
Dependencies & Requirements
The basic server uses consistent hashing via hash_ring.py, developed by Amir Salihefendic. For better and for worse, I modified the code to allow for easier node set manipulations (as if it were a list) and different node resolution behavior so that the internal keys are more directly related to md5 hashes.
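As a rough illustration of the idea, here is a minimal consistent-hash ring with list-like node manipulation and md5-derived positions. It is a sketch only: the class name `SimpleHashRing`, its methods, and the `replicas` default are assumptions made for this example and do not reproduce the API of hash_ring.py or of the modified version shipped here.

```python
import bisect
import hashlib


class SimpleHashRing:
    """Toy consistent-hash ring (hypothetical; not the hash_ring.py API)."""

    def __init__(self, nodes=(), replicas=3):
        self.replicas = replicas   # virtual points per node (assumed value)
        self._points = []          # sorted md5-derived ring positions
        self._owner = {}           # ring position -> node name
        for node in nodes:
            self.append(node)

    @staticmethod
    def _position(text):
        # Ring positions are the integer value of the md5 digest.
        return int(hashlib.md5(text.encode("utf-8")).hexdigest(), 16)

    def append(self, node):
        # List-like add: place `replicas` points on the ring for this node.
        for i in range(self.replicas):
            point = self._position("%s:%d" % (node, i))
            if point in self._owner:
                continue           # point already placed, e.g. node added twice
            self._owner[point] = node
            bisect.insort(self._points, point)

    def remove(self, node):
        # List-like removal: drop every point owned by this node.
        self._points = [p for p in self._points if self._owner[p] != node]
        self._owner = {p: n for p, n in self._owner.items() if n != node}

    def get_node(self, key):
        # The first point clockwise from the key's position owns the key.
        if not self._points:
            return None
        idx = bisect.bisect(self._points, self._position(key)) % len(self._points)
        return self._owner[self._points[idx]]


ring = SimpleHashRing(["localhost:9900", "localhost:9901"])
ring.append("localhost:9902")
print(ring.get_node("apple"))
```

Every node that builds the ring from the same node list resolves a key to the same owner, which is what lets a flat network skip a separate lookup service.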
The library requires the Apache Thrift Python libraries to be installed, but -- because the thrift compiler has been run already -- does not currently require the Thrift compiler or any other language libraries.
I'm using Thrift as much for its simple RPC underpinnings as the message format. I wanted to consistently distribute some processing and bother as little as possible with node lookup. This is the simplest thing that I came up with that sort-of works: a flat peer-to-peer network with each node keeping up with the state of all the nodes as well as it can. This necessarily limits the ring's ability to scale beyond dozens of nodes.
Diststore is an example of an application that can be layered atop the thrifty-p2p location service. It's a toy version of a distributed key-value store of the sort that all the cool kids are doing these days. The underlying peer-to-peer implementation makes no attempt to optimize for large scales, so don't get too excited about deploying this in production. Rather, it's provided for instruction and testing.
Although diststore.thrift uses locator.thrift for a lot of its interfaces, the implementation overrides some of the base methods as an illustration of how this simple library might be used in other projects.
Example usage, run this in a couple terminal windows:
Note how the second and following servers auto-discover the other servers you've initiated, join them as peers on the network, and find an open port. Then in another window start populating the store:
python storeput.py a apple
python storeput.py b banana
python storeput.py c crayon
python storeput.py d dinosaur
python storeprimer.py
You can then query the store:
python storeget.py c
python storetest.py
or start a new server and see how only a minimum of the existing keys redistribute themselves, an example of consistent hashing at work:
or even stop a server with ctrl-C or a kill -INT signal and see how the server hands its items off gracefully. However, the system is not robust to any more aggressive termination: keys will go missing.
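The claim that only a minimum of keys move when a server joins can be checked with a small standalone experiment. This is a sketch under assumptions: the helper names (`md5_int`, `build_ring`, `owner`, `moved_keys`), the 50 virtual points per node, and the synthetic key names are all made up for the illustration and are not part of diststore.

```python
import bisect
import hashlib


def md5_int(text):
    # Integer ring position derived from the md5 digest.
    return int(hashlib.md5(text.encode("utf-8")).hexdigest(), 16)


def build_ring(nodes, replicas=50):
    # Sorted list of (position, node) points; `replicas` is an assumed value.
    return sorted((md5_int("%s:%d" % (n, i)), n)
                  for n in nodes for i in range(replicas))


def owner(ring, key):
    # The first point clockwise from the key's position owns the key.
    positions = [position for position, _ in ring]
    idx = bisect.bisect(positions, md5_int(key)) % len(ring)
    return ring[idx][1]


def moved_keys(keys, old_nodes, new_nodes):
    # Keys whose owner differs between the two node sets.
    old_ring, new_ring = build_ring(old_nodes), build_ring(new_nodes)
    return [k for k in keys if owner(old_ring, k) != owner(new_ring, k)]


keys = ["key-%d" % i for i in range(1000)]
before = ["localhost:9900", "localhost:9901", "localhost:9902"]
after = before + ["localhost:9903"]

moved = moved_keys(keys, before, after)
new_ring = build_ring(after)
print("%d of %d keys moved" % (len(moved), len(keys)))
```

Every key that changes owner ends up on the newly added node; keys never shuffle between the servers that were already running.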
What's happening here? Because every node has a full model of the network, it knows which node to forward a get() request to, or where to hand off its items when it leaves the network. Key methods here are overridden from
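The forwarding and hand-off behavior described above can be sketched with an in-memory model. Everything here is hypothetical: `ToyNode` and its methods are invented for the illustration, direct method calls stand in for the Thrift RPC, a shared dict stands in for each node's model of the network, and each node gets a single ring point for brevity.

```python
import hashlib


def md5_int(text):
    # Integer ring position derived from the md5 digest.
    return int(hashlib.md5(text.encode("utf-8")).hexdigest(), 16)


class ToyNode:
    """In-memory stand-in for a store node (not the diststore handler)."""

    def __init__(self, name, network):
        self.name = name
        self.network = network     # shared dict: the "full model" of all peers
        self.items = {}
        network[name] = self

    def owner(self, key):
        # One ring point per node: the first node at or after the key wins.
        ring = sorted((md5_int(name), name) for name in self.network)
        for position, name in ring:
            if position >= md5_int(key):
                return self.network[name]
        return self.network[ring[0][1]]    # wrap around the ring

    def put(self, key, value):
        target = self.owner(key)
        if target is self:
            self.items[key] = value
        else:
            target.put(key, value)         # forward to the responsible peer

    def get(self, key):
        target = self.owner(key)
        if target is self:
            return self.items.get(key)
        return target.get(key)             # forward to the responsible peer

    def leave(self):
        # Graceful exit: drop out of the model, then hand every item off.
        del self.network[self.name]
        handoff, self.items = self.items, {}
        for key, value in handoff.items():
            if self.network:
                self.owner(key).put(key, value)


network = {}
nodes = [ToyNode("localhost:%d" % port, network) for port in (9900, 9901, 9902)]
nodes[0].put("c", "crayon")
print(nodes[2].get("c"))

nodes[0].owner("c").leave()                # the owner of "c" exits gracefully
print(next(iter(network.values())).get("c"))
```

A node that is killed without running something like `leave()` never hands its items over, which is why harsher termination loses keys.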
There are two primary thrift interfaces exposed in locator.thrift:
Locator.Base supports primal methods that don't rely on any of the locator services.
Locator.Locator implements basic node location and management: propagating and dropping services on the network.
When creating your own thrift definition, your service should import locator.thrift and extend one of the two service classes.
When creating a thrift handler implementation, the Python class should inherit from location.BaseHandler or location.LocatorHandler, and also from the Iface stub that is generated from your own thrift file.
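The inheritance pattern might look roughly like the following. So that the sketch runs without Thrift installed, `LocatorHandler` and `Iface` are empty stand-ins defined inline; in a real project they would be location.LocatorHandler and the stub generated from your thrift file. The service method `shout` and the class name `MyServiceHandler` are hypothetical.

```python
class LocatorHandler(object):
    """Stand-in for location.LocatorHandler (real methods not reproduced)."""


class Iface(object):
    """Stand-in for the Iface stub generated from a hypothetical thrift file."""

    def shout(self, text):
        raise NotImplementedError


class MyServiceHandler(LocatorHandler, Iface):
    """Inherits node management from the locator handler and implements
    the service-specific methods declared by the generated Iface."""

    def shout(self, text):
        # Hypothetical service method, implemented by the handler.
        return text.upper()


handler = MyServiceHandler()
print(handler.shout("hello"))
```

Listing the locator handler first means its implementations take precedence over the abstract stub methods in Python's method resolution order.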