Containing Node Communication In Seattle
Please see the attached paper for an overview and a detailed explanation of containment in the Seattle Testbed.
Building and Configuring
Start by creating a directory using preparetest.py. We'll call this directory testdir.
In testdir, copy all CNC Core Modules, as listed below. If you are planning to run experiments, also copy the CNC experiment modules (listed below) to testdir. Also, copy the 'seattlegeni_xmlrpc.py' file from trunk seattlegeni xmlrpc_clients to the testdir directory. Finally, copy the 'send_gmail.py' file from trunk integrationtests common to the testdir directory.
A config file must be created that will describe the topology of the containment server farm (Seattle Node Directory Service). It must be named 'cnc_server_list.txt', and must list the information of every server in the farm. Each line describes a different server, and each line must follow this exact format: <server_ip>:<server_port>
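As a rough illustration of the 'cnc_server_list.txt' format described above, the following Python sketch parses lines of the form <server_ip>:<server_port>. It is not the parser in cncFileParser.repy. The function name parse_server_list, the tolerance for blank lines, and the example addresses are assumptions made for this sketch only.

```python
def parse_server_list(text):
    """Parse '<server_ip>:<server_port>' lines into (ip, port) tuples.

    Hypothetical helper for illustration; not part of the CNC code base.
    """
    servers = []
    for lineno, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if not line:
            # Assumption: blank lines are skipped. The real parser may be stricter.
            continue
        ip, sep, port = line.rpartition(":")
        if not sep or not ip or not port.isdigit():
            raise ValueError(
                "line %d is not <server_ip>:<server_port>: %r" % (lineno, raw))
        servers.append((ip, int(port)))
    return servers


if __name__ == "__main__":
    # Example addresses are placeholders from the documentation range 192.0.2.0/24.
    example = "192.0.2.10:50000\n192.0.2.11:50000\n"
    for ip, port in parse_server_list(example):
        print(ip, port)
```

Running a check like this on the file before copying testdir to the farm can catch a malformed line early.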
A config file must be created that will describe the backup configuration of the containment server farm (Seattle Node Directory Service). The file must be named 'cnc_backup_config.txt'. An empty file indicates that the server farm should use no backup configuration, which is fine for most experiments. See the comments on cncFileParser_read_backup_config in cncFileParser.repy for information on how to specify a backup configuration.
Copy 'restrictions.cnc' into the test directory if you are planning to run the Seattle Node Directory Service (cncStandaloneServer.repy is used to run this service).
If you are planning to run the Seattle Node Directory Service, run:
python repypp.py cncStandaloneServer.repy cncStandaloneServer.py
If you are planning to run experiments from the directory, run:
python repypp.py cncallpairsping.repy cncallpairsping.py
python repypp.py deploycncexperiment.mix deploycncexperiment.py
If you are planning to run the Seattle Node Directory Service, copy testdir to every machine in your farm.
Running the Seattle Node Directory Service
Before starting this section, complete the steps in the 'Building and Configuring' section. On every machine in your farm, run:
python repy.py restrictions.cnc cncStandaloneServer.py <local available port to use> <user key range lower> <user key range upper> <server public key> <server private key>
Please see the attached paper for more information on keyranges. The combined keyranges of all servers in your farm should cover the entire keyspace from 0 to 999, endpoints inclusive. The keyrange parameters passed to a server are also inclusive. For example, specifying lower 0 and upper 333 makes that server cover keys 0 through 333, including both 0 and 333. Keyranges of different servers must not overlap. For example, it would be incorrect to give one server keyrange 0-333 and another server keyrange 333-666, since both ranges include the value 333.
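The keyrange rules above (inclusive endpoints, full coverage of 0 to 999, no overlap) can be checked mechanically before starting the servers. The following Python sketch is one way to do that. The function name check_keyranges is hypothetical and is not part of the CNC code.

```python
KEYSPACE_MIN = 0
KEYSPACE_MAX = 999


def check_keyranges(ranges):
    """Return a list of problems found in a farm's keyrange assignment.

    `ranges` is a list of (lower, upper) pairs with inclusive endpoints.
    An empty result means the ranges cover 0..999 exactly once.
    Hypothetical helper for illustration only.
    """
    problems = []
    expected = KEYSPACE_MIN  # smallest key not yet covered
    for lower, upper in sorted(ranges):
        if lower > upper or lower < KEYSPACE_MIN or upper > KEYSPACE_MAX:
            problems.append("range %d-%d is invalid" % (lower, upper))
            continue
        if lower < expected:
            problems.append("range %d-%d overlaps an earlier range" % (lower, upper))
        elif lower > expected:
            problems.append("keys %d-%d are not covered" % (expected, lower - 1))
        expected = max(expected, upper + 1)
    if expected <= KEYSPACE_MAX:
        problems.append("keys %d-%d are not covered" % (expected, KEYSPACE_MAX))
    return problems


if __name__ == "__main__":
    print(check_keyranges([(0, 333), (334, 666), (667, 999)]))  # valid split
    print(check_keyranges([(0, 333), (333, 666), (667, 999)]))  # 333 is shared
```

The second call reports the overlap at 333 described in the example above.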
Running an Experiment
Prior to this section, please complete the steps in the 'Building and Configuring' and 'Running the Seattle Node Directory Service' sections.
Run a command in the following format to start the experiment:
python deploycncexperiment.py wan <number clients> <experiment run duration in seconds> <number of client groups>
The number of client groups argument indicates the number of unique keys. The script sets up each client with exactly one userkey. So for 2 client groups, half of the VMs will share one key, and the other half will share a completely different key.
For example, the following command will run an experiment with 100 clients for 20 minutes, with all clients sharing the same single userkey.
python deploycncexperiment.py wan 100 1200 1
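To make the client group argument concrete, the sketch below splits a number of clients evenly across groups, with one key per group. The round-robin assignment and the function name assign_client_groups are assumptions for illustration. The actual logic in deploycncexperiment.mix may divide clients differently.

```python
from collections import Counter


def assign_client_groups(num_clients, num_groups):
    """Return a group index (0-based) for each client.

    Assumption: clients are spread round-robin so group sizes differ by at
    most one. Each group index stands for one shared userkey.
    """
    if num_groups < 1:
        raise ValueError("at least one client group is required")
    return [client % num_groups for client in range(num_clients)]


if __name__ == "__main__":
    # 100 clients in 2 groups: each key ends up shared by half of the clients.
    sizes = Counter(assign_client_groups(100, 2))
    for group, size in sorted(sizes.items()):
        print("group", group, "has", size, "clients")
```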
Wait for the experiment to complete. This may take a few hours if the number of clients is greater than 100, due to the high cost of setting up the experiment and downloading the log data. You can track the progress of the experiment in the log file generated by the script, named 'cnc_experiment_out'.
Once the experiment is completed, there will be a subdirectory in testdir (on the machine the experiment script was run from) generated with a collection of logs from all the clients in the experiment. We will refer to this subdirectory as 'the result directory'. Stop all Seattle Node Directory Servers in your farm. From each machine in your farm, look inside the testdir directory and find the logfiles with the prefix 'cncserverlog'. Copy these files to the result directory.
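The log collection step above can be scripted. The sketch below copies every file whose name starts with 'cncserverlog' from one server's testdir into the result directory. It assumes that the server's testdir is already reachable as a local path (for example, after fetching it from the remote machine), and the function name collect_server_logs is hypothetical. File names are left unchanged, and the function refuses to overwrite an existing file rather than guess how colliding names should be resolved.

```python
import glob
import os
import shutil

LOG_PREFIX = "cncserverlog"  # prefix named in the instructions above


def collect_server_logs(server_testdir, result_dir):
    """Copy server log files from server_testdir into result_dir.

    Returns the list of destination paths that were written.
    Hypothetical helper for illustration only.
    """
    copied = []
    pattern = os.path.join(server_testdir, LOG_PREFIX + "*")
    for src in sorted(glob.glob(pattern)):
        if not os.path.isfile(src):
            continue  # skip directories that happen to match the prefix
        dst = os.path.join(result_dir, os.path.basename(src))
        if os.path.exists(dst):
            raise FileExistsError("refusing to overwrite %s" % dst)
        shutil.copy2(src, dst)
        copied.append(dst)
    return copied
```

Call it once per server, for example collect_server_logs("server1_testdir", "testdir/result_dir"), where both paths are placeholders for your own directories.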
On the machine the experiment script was run on, browse to the parent directory of testdir. Copy analyze_logs.py to this directory (analyze_logs.py is in the cnc code in the cncSystemPerformance directory). Then, making sure you are in the parent directory of testdir (not in testdir itself), run the following command to analyze your results.
python analyze_logs.py testdir [name of result dir] > temp.txt
In place of [name of result dir], enter the actual name of the result directory. This may take a few minutes to run, depending on the number and size of the logs.
The report will be stored in the file temp.txt.
CNC Core Modules
The following modules are used in the Seattle Node Containment system. Please note that unless specified otherwise, the term userkey means cncuserkey, not the userkeys Seattle uses to assign VMs. Cncuserkeys are essentially hashes of Seattle userkeys.
keyrangelib.repy - Library that provides utility and helper functions for key range related tasks. These include tasks such as converting from Seattle userkeys to cncuserkeys, or looking up the address of the supporting server for a particular userkey.
cncSignData.repy - Handles tasks related to packet signing. This is essentially a wrapper around signeddata.repy that makes it easier to use with the containment system.
cncperformance.repy - Includes functions whose purpose is to improve the performance of the cnc system. Currently it only provides functionality for compressing and restoring ip,port information, but more may be added to this module later as optimizations are made.
cncmultifilelog.repy - provides a simple logging interface to the cnc system. It allows for two types of logging, single-file and multi-file. Multi-file logging splits logs across many files, which is helpful for downloading experiment data from VMs as it avoids timeouts associated with retrieving what would otherwise be massive log files.
cncFileParser.repy - used to read configuration files for both client and server.
cncperfutils.repy - This module tracks the order in which packets arrive, for diagnostic/debugging purposes. All calls to this module should be removed from production code (debug and experiments only).
cncclient.repy - provides client interface to the containment system. It handles registration, update processing and other cnc related tasks transparently. To use this library, the caller (which will be the dedicated VM on each node hosting the client service) must first call cncclient_initialize, specifying a free port that can be used by the system. cnc_sendmess and cnc_openconn are written to handle the restricting of sending traffic and opening connections. Similar methods need to be written for receiving traffic and waiting for connections.
cncStandaloneServer.repy - Server module for the containment system. Handles registration, update dissemination, and query requests for a specified keyrange.
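The keyrangelib.repy entry above mentions two tasks: converting a Seattle userkey to a cncuserkey, and finding the server that supports a given userkey. The Python sketch below illustrates the general idea only. It does not reproduce the keyrangelib.repy API. The use of SHA-1, the modulo reduction into the 0 to 999 keyspace, the function names, and the example addresses are all assumptions.

```python
import hashlib

KEYSPACE_SIZE = 1000  # cncuserkeys fall in 0..999, per the keyrange section


def to_cncuserkey(seattle_userkey):
    """Map a Seattle userkey string to an integer in 0..999.

    Assumption: SHA-1 followed by a modulo; the real conversion may differ.
    """
    digest = hashlib.sha1(seattle_userkey.encode("utf-8")).hexdigest()
    return int(digest, 16) % KEYSPACE_SIZE


def lookup_server(cncuserkey, keyrange_table):
    """Return the address of the server whose inclusive keyrange holds the key.

    `keyrange_table` is a list of ((lower, upper), (ip, port)) entries.
    """
    for (lower, upper), address in keyrange_table:
        if lower <= cncuserkey <= upper:
            return address
    raise LookupError("no server covers cncuserkey %d" % cncuserkey)


if __name__ == "__main__":
    # Placeholder farm of two servers splitting the keyspace in half.
    table = [((0, 499), ("192.0.2.10", 50000)),
             ((500, 999), ("192.0.2.11", 50000))]
    key = to_cncuserkey("example-seattle-userkey")
    print(key, lookup_server(key, table))
```

Because the lookup depends only on the key and the keyrange table, every client holding the same userkey resolves to the same server.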
These modules are used only for running experiments, collecting data, and analyzing data.
cncallpairsping.repy - all-pairs-ping program that has containment restrictions on sending traffic. It detects which addresses to ping using the Seattle centralized advertise server (looking up the specific experiment name; each client only pings other clients with which it shares at least one key). For debugging and test purposes, it provides callers the option of specifying a set of userkeys to use (in place of Seattle userkeys).
cncdeploylib.mix - provides numerous low-level utilities for management of VMs. These include uploading files, collecting log information (which is split across multiple files as it has been collected with multifilelog), and resetting/stopping VMs.
deployallpairspingtest.mix - provides methods used to automatically deploy the all-pairs-ping program on VMs. Not standalone, used to provide helper methods to deploycncexperiment.mix.
deploycncservers.mix - provides methods used to automatically deploy the cnc servers on VMs. Note, this method of testing the cnc system is not recommended. Not standalone, used to provide helper methods to
deploycncexperiment.mix - support for automatically acquiring arbitrary numbers of VMs and running containment experiments. The recommended way to use this script is to first manually configure and start the server, then use this script to acquire and start clients.
analyze_logs.py - a result analysis script that will parse log files from the Seattle Node Directory service and from its clients to determine information such as cache accuracy, load on clients, and load on servers.