-
Notifications
You must be signed in to change notification settings - Fork 0
a tool for sampling peers uniformly at random from the Gnutella network
License
DanielStutzbach/ion-sampler
Folders and files
Name | Name | Last commit message | Last commit date | |
---|---|---|---|---|
Repository files navigation
Copyright (C) 2006--2009 Daniel Stutzbach README for ion-sampler 1.0.1 ion-sampler is a tool for selecting peers approximately uniformly at random from the active peers in a peer-to-peer network. The tool's algorithm is described in detail in the paper: Daniel Stutzbach, Reza Rejaie, Nick Duffield, Subhabrata Sen, Walter Willinger, ``On unbiased sampling for unstructured peer-to-peer networks'', IEEE/ACM Trans. Netw., vol. 17, iss. 2, pp. 377-390, 2009. http://www.barsoom.org/papers/ton-2007-sampling.pdf Reading the paper before using the tool is highly recommended. The tool is intended for research use. It depends on protocol hooks generously provided by P2P network developers. If the tool is abused, it is likely these hooks will be removed. Use with care. ------------------------------------------------------------------------ Requirements: 1. An initial list of ultrapeers 2. gcc 3. GNU make 4. A real-time library (librt) with the clock_gettime() and clock_getres() functions. 5. Python 2.4, 2.5, or 2.6 Place the list of ultrapeers in the "gnutella.in" file. The ion-sampler distribution comes with a list of around 1,000 ultrapeers that were up in October 2009. If ion-sampler takes a long time before printing any output when run, probably none of the ultrapeers are still online and it is diligently trying all of them. In that case, you will need to acquire a new list. It's not important that all of the ultrapeers listed in gnutella.in are online, as long as a decent percentage of them are. Tested on Red Hat Debian and Linux. Porting to other Linux systems should be a piece of cake. Porting to non-Linux systems shouldn't be too hard either, as long as they have gcc and Python. The use of the real-time library is to avoid problems with clocks running backwards. Ordinary time-related system calls simply return the current wall time, which might change and might even change often (such as on systems running NTP). Clocks running backward can wreck havoc when trying to keep accurate timers for timing out network traffic, so we use functions from the real-time library that provide clocks that measure elapsed time. ------------------------------------------------------------------------ Build instructions: Run "make" ------------------------------------------------------------------------ Execution instructions: "./ion-sampler gnutella -n 50" will select 50 ultrapeers from the Gnutella network and print out their IP:port pairs on standard output. Adding the -d option will print out the degree (how many neighbors) the selected ultrapeers have. ion-sampler typically takes a few minutes to run. Don't be alarmed that it doesn't output anything immediately. ------------------------------------------------------------------------ Hacking: ion-sampler is really two programs. One of them is called "gnutella" and deals with the protocol details of asking a Gnutella peer for a list of its neighbors. The other program is called "ion-sampler" and performs the random walk to select peers uniformly at random. The mandatory argument "gnutella" to ion-sampler tells it run the "gnutella" program and use "gnutella.in" as an initial source of addresses. ion-sampler can be extended to other peer-to-peer networks by writing an appropriate program, similar to the provided "gnutella" program. Hereinafter, such programs are called "plug-ins". The plug-in reads addresses from standard input. Each address is separated by a newline character ('\n'). The plug-in must contact that address via the network and either discover the peer's neighbors' addresses or fail (due to a connection refused, timeout, etc). When the plug-in retrieves the list of neighbors, it must print out a line in the following format: R: IP:port(optional) peertype neighbors, leafs where: IP:port -> the peer's IP address and port, as earlier read from stdin optional -> any extra data you need for debugging peertype -> "Ultrapeer" or "Leaf" neighbors -> a space separated list of IP:port pairs leafs -> a space separated list of IP:port pairs This format is regrettably a little Gnutella-specific. In the future, a more flexible system may be used. In the even of a failure, the output format is: R: IP:port() failure message The failure message may be any string as long as it does not start with the words "Ultrapeer", "Leaf", or "Peer". The plug-in *must* output either a list of neighbors or an error for every address read from standard input. If it loses track of an address or waits indefinitely for a response, ion-sampler will also wait indefinitely. For correct operation, the plug-in must continue to read addresses from standard input even while processing earlier addresses. In other words, the program must multiplex. It cannot just read one address, get data from the network, print out the results, then read the next address. ------------------------------------------------------------------------ Please send all questions, comments, or patches to: "Daniel Stutzbach" <daniel@stutzbachenterprises.com>
About
a tool for sampling peers uniformly at random from the Gnutella network
Resources
License
Stars
Watchers
Forks
Packages 0
No packages published