
For benchmarking NetLogo's engine speed, we have a suite of benchmark models in models/test/benchmarks.

We also have a framework that runs each of our benchmark models repeatedly, discards the initial results during JIT warmup, and then keeps a running average of the remaining runs, stopping once sufficient data has been gathered for the results to be reliable within a percent or two. We've been using this for many years now to work on speedups and catch performance regressions.

In order to make a model usable with the benchmarking framework, add globals [result] and a to benchmark procedure that calls random-seed at the beginning (so there's no random variation in the results) and sets result at the end. Then put the model in models/test/benchmarks, with a name ending in Benchmark.nlogo.
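
For example, a minimal benchmark procedure might look like this. It's only a sketch: the seed value, the repeat count, and the choice of the timer as the result are illustrative, and setup and go stand in for the model's own procedures:

globals [result]

to benchmark
  random-seed 362        ; fixed seed, so there's no random variation between runs
  reset-timer
  setup                  ; the model's normal setup
  repeat 2000 [ go ]     ; enough ticks to give a measurable run time
  set result timer       ; the framework reads result when the run finishes
end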

The benchmarks are run headless, from the command line. They're for measuring engine speed, not graphics speed.

Here's how to run the benchmarks:

% ./sbt
...
> netlogo/runMain org.nlogo.headless.HeadlessBenchmarker Bureaucrats
[info] Running org.nlogo.headless.HeadlessBenchmarker Bureaucrats
[info] @@@@@@ benchmarking NetLogo 5.1.0-M1 (INTERIM DEVEL BUILD)
[info] @@@@@@ warmup 60 seconds, min 60 seconds, max 300 seconds
[info] (Bureaucrats Benchmark)
[info] 1/2 (mean=4.156, stddev=0.000)
[info] 4/340 (mean=4.040, stddev=0.096)

Here "4/340" means that NetLogo is predicting that 340 runs will be necessary to get reliable data. (Quitting other apps, disconnecting the network, etc., helps keep these numbers down so you get good results faster.)

You can add more arguments to runMain to specify a time-spent-per-model window different from the default of min 60 seconds, max 300 seconds (5 minutes). The benchmarks will stop before the time limit if the results so far are statistically computed to be 98% likely to be within 0.5% of the "true" answer, so even with the 5 minute default, with any luck it'll move on to the next model in less than the full 5 minutes. The minimum number is also used as the warmup time, so the 60 second default minimum means each model will take at least 120 seconds: 60 to warm up, 60 to gather data. If you want reliable data, I don't recommend using a warmup time of less than 60 seconds. If all you need is rough ballpark figures, 5 5 works OK.
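
For example, here's the quick ballpark invocation from the sbt prompt (assuming the two extra arguments are the min and max times in seconds, in that order):

> netlogo/runMain org.nlogo.headless.HeadlessBenchmarker Bureaucrats 5 5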

A little progress report is printed every 10 seconds, so you have something to look at and can stop the benchmark if you notice something going wrong. For example, if there is a sudden big jump in the standard deviation, your computer probably decided to perform some unrelated task in the middle of the benchmark, and the results are now unreliable. (This makes the statistics code think a very large number of runs is needed, so you'll hit your max time limit; in the results, that model will have "hit time limit" printed next to the final number, a sign that the result is probably not accurate.)

If you want to run the whole suite, use the script in bin/benches.scala. It runs the entire benchmark suite repeatedly, so it's suitable for leaving running overnight. The script saves just the results, without the progress reports, to tmp/bench.txt for easy retrieval.
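
For example, assuming the script is executable and you're at the repository root:

% bin/benches.scala
...
% cat tmp/bench.txt

The cat at the end is just to read back the accumulated results.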

Accurate benchmarking is difficult and frustrating. Your results will be more reliable if you quit all other open apps and turn off or unplug your computer's network connection. On Linux or Mac OS X, it's better still to run the benchmarks from the console with no GUI processes active on the machine at all. On Linux, kill X11. On Mac OS X, kill the window server by logging out; then, from the login screen, press option-down-arrow followed by option-return and type >console for your user name; the GUI will quit and you should get a plain-text login prompt.

Extensions

Writing benchmarks for an extension works the same as it does for NetLogo. You can either put the benchmark model in models/test/benchmarks, or put it in your extension folder and run it with:

> headless/runMain org.nlogo.headless.HeadlessBenchmarker ../../../extension/my-extension/BenchmarkName