This tracker is one part of our deployment pipeline, which consists of several moving parts:
- Jenkins build server(s) that build our source code
- Custom scripting for starting deploys
- An rtorrent client running on every node participating in a deploy
- A servermanagement database that holds metadata on nodes in our environment
- A tracker for bittorrent
The tracker uses the knowledge we have about our network topology to build two-tier swarms of bittorrent clients. It will return peers in a global swarm to the first two clients in a rack requesting tracker information. Any additional clients from the SAME rack requesting peer information from the tracker will only get peers in the same rack, thus building a second tier swarm that spans a single rack.
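To make the two-tier idea concrete, here is a minimal sketch in Python. It is not the tracker's actual code: the class name `TwoTierSwarm`, the in-memory dictionaries (the real tracker keeps its state in redis) and the peer/rack strings are all made up for illustration, and it assumes a representative is only handed the representatives of all racks.

```python
from collections import defaultdict

MAX_REPR_RACK = 2  # representatives per rack, as in the setup described above


class TwoTierSwarm:
    """Hypothetical in-memory stand-in for the tracker's peer bookkeeping."""

    def __init__(self, max_repr_rack=MAX_REPR_RACK):
        self.max_repr_rack = max_repr_rack
        self.rack_peers = defaultdict(list)  # rack -> every peer seen in that rack
        self.rack_reprs = defaultdict(list)  # rack -> peers in the global swarm

    def announce(self, peer, rack):
        """Register a peer and return the peers it should be told about."""
        if peer not in self.rack_peers[rack]:
            self.rack_peers[rack].append(peer)
        reprs = self.rack_reprs[rack]
        # The first max_repr_rack peers of a rack become its representatives.
        if peer not in reprs and len(reprs) < self.max_repr_rack:
            reprs.append(peer)
        if peer in reprs:
            # First tier: representatives see the representatives of all racks.
            candidates = [p for rack_list in self.rack_reprs.values() for p in rack_list]
        else:
            # Second tier: everyone else only sees peers in its own rack.
            candidates = self.rack_peers[rack]
        return [p for p in candidates if p != peer]
```

In this sketch the third client announcing from a rack gets only rack-local peers back, so its traffic never has to cross the rack uplink.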
This setup was chosen because the uplink bandwidth in a rack is a critical resource for us. If many clients in a rack start downloading pieces from other peers randomly distributed across our network, they can saturate the rack uplink, causing serious starvation issues and failing requests to production services.
By limiting the number of peers in a single rack that participate in the global bittorrent swarm to 2 and capping each bittorrent client at ~100 Mbit/s, we can guarantee that bittorrent traffic uses at most about 20% of the rack uplink.
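The arithmetic behind that figure can be sketched as follows. The uplink capacity is not stated in this README; 1 Gbit/s is an assumption, chosen because it is the value for which 2 clients at ~100 Mbit/s come out at 20%. The function name `uplink_share` is made up for this example.

```python
def uplink_share(clients_per_rack, client_cap_mbit, uplink_mbit):
    """Worst-case fraction of the rack uplink used by bittorrent traffic."""
    return clients_per_rack * client_cap_mbit / uplink_mbit


# 2 representatives capped at ~100 Mbit/s on an ASSUMED 1 Gbit/s rack uplink.
share = uplink_share(2, 100, 1000)
print(f"{share:.0%}")  # prints 20%
```

Plug in your own uplink size and client cap to see what MAX_REPR_RACK value keeps you within your budget.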
You could probably do something similar with QoS on your rack switches, but we deploy dumb switches that do not easily allow QoS. Since we deploy a lot of them, this approach also saves us the overhead of building a complex configuration management system for rack switches.
Disclaimer for general use
This tracker works in our environment with our setup and specifically exploits our knowledge of our network topology.
YOUR MILEAGE WILL VARY !!
Your network topology is most certainly different than ours, your nodes will be different and the servermanagement metadata REST service that we operate is not (yet) open source.
HYVES, THE AUTHOR OR ANY CONTRIBUTORS WILL NOT ACCEPT ANY RESPONSIBILITY OR LIABILITY, NOR CLAIM ANY GUARANTEE THAT THIS SOFTWARE WILL WORK FOR YOU. IT MAY EAT YOUR CAT, LUNCH OR ENTIRE DATACENTER BANDWIDTH WITHOUT ANY PRIOR NOTICE OR WARNING.
Or in legalese:
The Software is provided "AS IS", without warranty of any kind, express or implied, including but not limited to the warranties of merchantability, fitness for a particular purpose and noninfringement. In no event shall the authors or copyright holders be liable for any claim, damages or other liability, whether in an action of contract, tort or otherwise, arising from, out of or in connection with the Software or the use or other dealings in the Software.
The call to our servermanagement metadata service is extremely simple and can easily be replaced by other logic such as:
- DNS lookups for TXT records or LOC records
- Subnet logic (swarms grouped by subnet)
- Any form of key logic that groups clients by a key derived from their IP address
- Any form of external service that returns a group key
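As an illustration of the subnet option from the list above, here is a minimal sketch using only Python's standard `ipaddress` module. The function name `group_key_by_subnet` and the default /24 prefix are assumptions made for this example; this is not how the tracker itself looks up racks.

```python
import ipaddress


def group_key_by_subnet(ip, prefix_len=24):
    """Return a group key for a client: the network its address falls in.

    prefix_len=24 is an arbitrary example; pick whatever matches the way
    addresses map to racks in your own network.
    """
    network = ipaddress.ip_network(f"{ip}/{prefix_len}", strict=False)
    return str(network)
```

Two clients whose addresses yield the same key would end up in the same second-tier swarm, for example `group_key_by_subnet("10.0.1.17")` and `group_key_by_subnet("10.0.1.200")`.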
There is NO support at the moment for multiple concurrent transfers, as the groups are organized by transfer hash, which is different per transfer.
There is no code to generate torrent files. The torrent files we use are generated by running mktorrent from http://mktorrent.sourceforge.net
We run Gentoo and you can find the two relevant ebuilds for the rtorrent client and the libtorrent library in the client directory. These ebuilds are lightly modified ebuilds from the mainline of the gentoo project: http://www.gentoo.org
The tracker should work with any bittorrent client. I'd be interested in reports of other (non)working clients. Please file issues on github for non-working clients.
We use rtorrent which is a high performance C++ based bittorrent client based on libtorrent. Both are developed here: http://libtorrent.rakshasa.no/
I've included the patch we apply to rtorrent/libtorrent to speed it up. This patch removes features that you'd probably want if you're deploying this client in a hostile environment such as the internet. In addition it removes the minimal timeout for tracker requests which will get you (rightfully) blocked on any public tracker.
The tracker configfile has the following recognized values:
- HOST: hostname the tracker is running on
- PORT: port the tracker should respond on
- REDISHOST: where to contact the redis backing store
- REDISPORT: which port to connect to redis on
- SMDB_URL: where to contact the REST servermanagement metadata service
- MAX_REPR_RACK: how many peers from a single rack can participate in a global swarm
- ACTIVE_INTERVAL: how often to contact the tracker if a transfer is active
- PASSIVE_INTERVAL: how often to contact the tracker if a transfer is passive
- MAXPEERS: how many peers to return to nodes
- PROXYPASS: whether the tracker is behind a proxy and should fix up client variables
- DEBUG: whether the tracker should log debug statements and run debug code
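For orientation, the recognized values could be collected like this. Everything below is a sketch under assumptions: the on-disk format of the configfile is not described here, so a plain Python dict is used instead, every value is a placeholder rather than a default, the intervals are assumed to be seconds, and `check_config` is a hypothetical helper that is not part of the tracker.

```python
RECOGNIZED_KEYS = (
    "HOST", "PORT", "REDISHOST", "REDISPORT", "SMDB_URL", "MAX_REPR_RACK",
    "ACTIVE_INTERVAL", "PASSIVE_INTERVAL", "MAXPEERS", "PROXYPASS", "DEBUG",
)

# Placeholder values only; none of these are defaults shipped with the tracker.
example_config = {
    "HOST": "tracker.example.invalid",
    "PORT": 8080,
    "REDISHOST": "localhost",
    "REDISPORT": 6379,
    "SMDB_URL": "http://smdb.example.invalid/",
    "MAX_REPR_RACK": 2,
    "ACTIVE_INTERVAL": 10,     # assumed to be seconds
    "PASSIVE_INTERVAL": 300,   # assumed to be seconds
    "MAXPEERS": 50,
    "PROXYPASS": False,
    "DEBUG": False,
}


def check_config(config):
    """Return (missing, unknown) key names relative to RECOGNIZED_KEYS."""
    missing = sorted(set(RECOGNIZED_KEYS) - set(config))
    unknown = sorted(set(config) - set(RECOGNIZED_KEYS))
    return missing, unknown
```

A check like this catches a misspelled key early instead of letting it be silently ignored.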
In addition to the tracker configfile there is a sample rtorrent configfile included in the etc directory, which is a puppet template that we use to deploy the client in our environment. The variables should be fairly self-evident.
Redis storage model
Transfer information is stored in redis; this requires at least redis 2.0.x and redis-py 2.0.x. The redis model is described below.

- racks = all racks we've seen
- rack:rackname = all hosts we've ever seen in this rack

- transfers = all seen info_hashes
- active_transfers = all active info_hashes

- hash:peers:N = all seen peers for the hash
- hash:peers:R = all representatives for the hash
- hash:peers:S = all seeders for the hash
- hash:peers:L = all leechers for the hash

- hash:rack:rackname:N = all peers for the hash in a rack
- hash:rack:rackname:R = all representatives for the hash in a rack
- hash:rack:rackname:S = all seeders for the hash in a rack

- hash:peer:peeripaddress:compact = True/False
- hash:peer:peeripaddress:port = port the client is operating on
- hash:peer:peeripaddress:peer_id = peer id from client
- hash:peer:peeripaddress:key = peer key
- hash:peer:peeripaddress:last_event = last seen event
- hash:peer:peeripaddress:event: = datetime event was seen
- hash:peer:peeripaddress:seeder = True/False
- hash:peer:peeripaddress:downloaded = bytes downloaded
- hash:peer:peeripaddress:left = bytes left to download
- hash:peer:peeripaddress:uploaded = bytes uploaded to other clients
- hash:peer:peeripaddress:rack = rack where the peer is located
- hash:peer:peeripaddress:hostname = hostname reported for ipaddress

- hash:length = length of the torrent payload
- hash:name = name of the torrent
- hash:registered = datetime transfer was activated
- hash:deregistered = datetime transfer was deactivated
- hash:first_started = datetime first peer started downloading
- hash:last_started = datetime last peer started downloading
- hash:first_completed = datetime first peer completed downloading
- hash:last_completed = datetime last peer completed downloading

In the keys above:
- peer = ipaddress:port of the peer
- rack = rackname
- hash = uppercase hash for the torrent
ALL VALUES ARE STRINGS !!!!!!
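The key layout and the strings-only rule can be sketched with a few small helpers. The function names (`peer_field_key`, `swarm_set_key`, `peer_records`) are hypothetical and not taken from the tracker's code; the sketch only builds key names and string values and does not talk to redis.

```python
def peer_field_key(info_hash, peer_ip, field):
    """Key for one per-peer field, e.g. HASH:peer:10.0.1.1:port."""
    return f"{info_hash.upper()}:peer:{peer_ip}:{field}"


def swarm_set_key(info_hash, kind, rack=None):
    """Key for a peer set of the given kind (N, R, S or L), optionally per rack."""
    if rack is None:
        return f"{info_hash.upper()}:peers:{kind}"
    return f"{info_hash.upper()}:rack:{rack}:{kind}"


def peer_records(info_hash, peer_ip, fields):
    """Map a peer's fields to {redis key: value}, with every value as a string."""
    return {
        peer_field_key(info_hash, peer_ip, name): str(value)
        for name, value in fields.items()
    }
```

For example, `peer_records("abc123", "10.0.1.1", {"port": 6881, "seeder": False})` yields the keys `ABC123:peer:10.0.1.1:port` and `ABC123:peer:10.0.1.1:seeder` with the string values `"6881"` and `"False"`.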
Deactivation renames all keys that start with hash to datetime:hash, where datetime is the datetime of deactivation.
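The renaming step can be sketched as a pure function that computes the new names. The ISO 8601 timestamp format and the name `deactivated_names` are assumptions made for this example; the format the tracker actually uses is not documented here.

```python
from datetime import datetime


def deactivated_names(keys, info_hash, when):
    """Return {old key: new key} for every key that belongs to the transfer.

    ASSUMPTION: the datetime prefix is rendered with isoformat(); the
    tracker may use a different representation.
    """
    prefix = info_hash.upper() + ":"
    stamp = when.isoformat()
    return {key: f"{stamp}:{key}" for key in keys if key.startswith(prefix)}


# With redis-py, the key list could come from keys() with a pattern and each
# pair could then be applied with rename(old, new); that part is left out so
# the sketch runs without a redis server.
```

Keys that do not start with the hash, such as `racks` or `transfers`, are left alone.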
We love patches, bug reports, and anything related to trying to get this to work in an environment different from ours.
Please use GitHub's excellent issue system for bug reports. Please use GitHub's even more awesome pull-request system to contribute.
Contact me: ramon at hyves dot nl with any feedback
You can also find me on IRC, I usually hang out on Freenode in one of the gentoo-* channels and/or #vagrant #pocoo #fabric and #openstack