Top-level simulation code:

  • Path simulator code. Needs Tor's stem library, consensuses, and descriptors
  • Path simulator code for congestion-aware Tor (CAT) variant
  • Path simulator code for SAFEST (i.e. virtual-coordinate system) variant

Top-level analysis scripts:

  • Turns simulator output into statistics.
  • Turns simulator statistics into plots.

Useful shell scripts:

  • Runs a simple simulation with the input parameters
  • Runs parallel CAT simulations
  • Runs parallel simulations where adversary enters after start.
  • Runs parallel simulations where guard/exit bandwidths are varied
  • Runs parallel simulations where total bandwidth is varied
  • Runs parallel simulations where user models are varied
  • Moves simulation files around, runs analysis scripts on them, runs plot scripts on the output, archives the output.


  • ext: Code for SAFEST extension
  • util: Code for various useful intermediate operations

For an example of how TorPS can be used, see

Users Get Routed: Traffic Correlation on Tor by Realistic Adversaries
by Aaron Johnson, Chris Wacek, Rob Jansen, Micah Sherr, and Paul Syverson
To appear in Proceedings of the 20th ACM Conference on Computer and Communications Security (CCS 2013).

The BibTeX citation for this paper is

      author = {Aaron Johnson and Chris Wacek and Rob Jansen and Micah Sherr and Paul Syverson},
      title = {Users Get Routed: Traffic Correlation on Tor by Realistic Adversaries},
      booktitle = {Proceedings of the 20th ACM Conference on Computer and Communications Security (CCS 2013)},
      year = {2013},
      publisher = {ACM}

Path Simulation HOWTO

Basic path simulation can be done entirely with the top-level path simulator code. It requires Stem. Simulation is a two-step process:

  1. Process Tor consensuses and descriptors into a faster and more compact format for later path simulation. This is done with the following command:
python process [args]

Replace [args] with "-h" for argument details. An example of this command is:

python process --start_year 2013 --start_month 8 --end_year 2014 --end_month 7
  --in_dir in --out_dir out --initial_descriptor_dir in/server-descriptors-2013-07

TorPS expects to find all consensuses and descriptors for a given month in the format and organization of the consensus archives. Extract the consensus archive for a month into a directory named "[in_dir]/consensuses-[year]-[month]", where [year] is in YYYY format and [month] is in MM format. Similarly, extract the archive of descriptors for a given month into the directory "[in_dir]/server-descriptors-[year]-[month]".

The processing command will go through each month from [start_year]/[start_month] to [end_year]/[end_month]. It will output the processed "network state files" for a given month into the directory "[out_dir]/network-state-[year]-[month]", which will be created if it doesn't exist.
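The directory naming described above can be sketched in a few lines of Python. This is only an illustration of the naming scheme, not code from TorPS; the helper names `month_range` and `expected_dirs` are hypothetical.

```python
def month_range(start_year, start_month, end_year, end_month):
    """Yield (year, month) pairs from the start month through the end month, inclusive."""
    year, month = start_year, start_month
    while (year, month) <= (end_year, end_month):
        yield year, month
        month += 1
        if month > 12:
            year, month = year + 1, 1


def expected_dirs(in_dir, out_dir, start_year, start_month, end_year, end_month):
    """Return one (consensus_dir, descriptor_dir, network_state_dir) tuple per month.

    Hypothetical helper: it only builds the names described in this HOWTO
    (YYYY-MM suffixes) and does not touch the file system.
    """
    dirs = []
    for year, month in month_range(start_year, start_month, end_year, end_month):
        suffix = "{:04d}-{:02d}".format(year, month)
        dirs.append((
            "{}/consensuses-{}".format(in_dir, suffix),
            "{}/server-descriptors-{}".format(in_dir, suffix),
            "{}/network-state-{}".format(out_dir, suffix),
        ))
    return dirs


if __name__ == "__main__":
    for consensus_dir, descriptor_dir, state_dir in expected_dirs(
            "in", "out", 2013, 8, 2014, 7):
        print(consensus_dir, descriptor_dir, state_dir)
```

Checking that each listed input directory exists before running the process command is a cheap way to catch a misnamed archive directory.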

If --fat is provided, then the network state files will contain all data from the Tor consensuses and descriptors. However, the resulting "fat" network state files *cannot* be used by TorPS for simulation. They may be useful to inspect more fully the network states of a given simulation.

If the consensuses being processed start at the very beginning of a month, which is true assuming you just extract some monthly consensus archives as provided by Tor Metrics, then the --initial_descriptor_dir argument should be included with a directory containing the descriptors from the month before the first consensus month. If this argument is omitted, then the first ~18 hours of network state files of the first month of the period being processed will incorrectly contain many fewer relays than actually existed in the Tor network at that time. This is because a relay is only included if its descriptor has been found in a descriptor archive, but a relay only publishes a new descriptor after ~18 hours. Thus, for the initial hours, the needed descriptors are in the descriptor archive of the month before the period being processed.

You can see how many relays are included in each network state file by looking at the output lines of the process command. For example, the relevant lines should look something like:

Processing consensus file 2013-09-01-00-00-00-consensus
  Wrote descriptors for 2 relays.
  Did not find descriptors for 4277 relays

Notice in this example that nearly all relays are missing descriptors here (and thus would not exist in the network state file), which occurred in this case because the consensuses to process started 2013-09-01-00-00-00 and --initial_descriptor_dir was omitted. Output from the second day of this example shows that indeed there are no missing descriptors after at most 24 hours of consensuses:

Processing consensus file 2013-09-02-00-00-00-consensus
  Wrote descriptors for 4261 relays.
  Did not find descriptors for 0 relays
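The output lines shown above can also be checked programmatically. The following is a minimal sketch, assuming the output format is exactly as in the two examples above; it is not the script shipped in util/, and the function name `summarize_process_output` is hypothetical.

```python
import re


def summarize_process_output(lines):
    """Return a list of (consensus_name, num_found, num_missing) tuples.

    Assumes each consensus produces three lines in the order shown in the
    examples: "Processing consensus file ...", "Wrote descriptors for N
    relays.", and "Did not find descriptors for M relays".
    """
    results = []
    current, found = None, None
    for line in lines:
        line = line.strip()
        match = re.match(r"Processing consensus file (\S+)", line)
        if match:
            current, found = match.group(1), None
            continue
        match = re.match(r"Wrote descriptors for (\d+) relays", line)
        if match:
            found = int(match.group(1))
            continue
        match = re.match(r"Did not find descriptors for (\d+) relays", line)
        if match and current is not None and found is not None:
            results.append((current, found, int(match.group(1))))
            current, found = None, None
    return results


if __name__ == "__main__":
    sample = [
        "Processing consensus file 2013-09-01-00-00-00-consensus",
        "  Wrote descriptors for 2 relays.",
        "  Did not find descriptors for 4277 relays",
    ]
    for name, num_found, num_missing in summarize_process_output(sample):
        print(name, num_found, num_missing)
```

A large number of missing descriptors in the first consensuses of the period is the symptom of an omitted --initial_descriptor_dir described above.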

A script in util/ can be fed the output of the process command to provide convenient statistics about the relays and descriptors produced in each network state file.

  2. Run simulations over a given period. This is done with the following command:

python simulate [args]

Replace [args] with "-h" for argument details. An example of the command for a 5000-sample simulation in which the client makes a connection to Google every 10 minutes (i.e. 600 seconds) is:

python simulate --nsf_dir out/ns-2013-08--2014-07 --num_samples 5000 
  --user_model simple=600 --format normal tor
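To make the "simple=600" user model concrete, here is a minimal sketch of the connection schedule it describes: one connection every 600 seconds over the simulated period. The function name and the choice to start at the first second of the period are assumptions for illustration, not a description of TorPS internals.

```python
def simple_stream_times(start_time, end_time, interval=600):
    """Return connection times in seconds, one every `interval` seconds.

    Assumption: the first connection happens at start_time, and end_time
    itself is excluded.
    """
    return list(range(start_time, end_time, interval))


if __name__ == "__main__":
    # One simulated hour starting at time 0 yields six connections.
    print(simple_stream_times(0, 3600))
```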

Following is another example of the simulate command. This example executes a simulation in which the user has "typical" behavior as given in the included trace file, a malicious guard relay is added with consensus bandwidth 15000, a malicious exit relay is added with consensus bandwidth 10000, the output indicates only if a malicious guard and/or exit is selected, the number of client guards is adjusted to 2, and guard expiration occurs randomly between 270 and 300 days after initial selection:

python simulate --nsf_dir out/ns-2013-08--2014-07 --num_samples 5000
  --trace_file in/users2-processed.traces.pickle --user_model typical --format relay-adv
  --adv_guard_cons_bw 15000 --adv_exit_cons_bw 10000 --adv_time 0 --num_adv_guards 1
  --num_adv_exits 1 --num_guards 2 --guard_expiration 270 --loglevel INFO tor

The included trace file (in/users2-processed.traces.pickle) includes six 20-minute traces recorded from a volunteer using Tor for the following activities: Facebook, Gmail / Google Chat (now Hangouts), Google Calendar / Google Docs, Web search, IRC, and BitTorrent. These are repeated on a weekly schedule to create user models that fill the simulated time period. Also, a "typical" model is provided including all of the first four traces (i.e. Facebook, Gmail/GChat, GCal/GDocs, Web search) in the schedule, and "best" and "worst" models are provided by replacing the TCP ports in the typical model with ports 443 and 6523, respectively. See the paper "Users Get Routed: Traffic Correlation on Tor by Realistic Adversaries" cited above for details on these traces and models.

Plotting Simulation Data

TorPS includes some basic functions to quickly analyze and view the results of your simulations. Note that one of the included shell scripts gives an example of how to use this functionality.

  1. The analysis script will process a number of log files in parallel and store the result for each one as a file containing pickled objects. It has two command options: "simulation-set" and "simulation-top". simulation-set will compute statistics for the case that the adversary controls a set of relays. simulation-top will compute statistics as if the adversary controls a varying number of the "top" relays. See the script output for command options.
  2. The plot script requires numpy and matplotlib. It takes the files output by the analysis script and produces a set of graphs showing the CDFs of compromise time and rate for the guard/exit/guard&exit of user circuits. See the script output for command options.
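For readers unfamiliar with the CDF plots mentioned above, the following sketch shows how an empirical CDF of time to first compromise could be computed from per-sample results. The input data and the function name `empirical_cdf` are hypothetical; the TorPS scripts use their own data formats and may compute this differently.

```python
def empirical_cdf(values, num_samples):
    """Return sorted (x, fraction) points of an empirical CDF.

    `fraction` is the share of all `num_samples` samples whose value is
    <= x. Samples that were never compromised are left out of `values`,
    so the curve may end below 1.0.
    """
    points = []
    for count, x in enumerate(sorted(values), start=1):
        fraction = count / num_samples
        if points and points[-1][0] == x:
            # Tied values share one point with the larger fraction.
            points[-1] = (x, fraction)
        else:
            points.append((x, fraction))
    return points


if __name__ == "__main__":
    # Hypothetical days until first compromise; None means never compromised.
    times = [3.0, 10.5, None, 3.0, None]
    compromised = [t for t in times if t is not None]
    for x, fraction in empirical_cdf(compromised, len(times)):
        print(x, fraction)
```

Dividing by the total number of samples, rather than by the number of compromised samples, keeps the curve interpretable as the probability of compromise within a given time.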


The latest version of TorPS (tag "tor-") simulates path selection as performed by Tor stable release The TorPS version at tag "tor-" simulates path selection as performed by Tor stable release
