Hacked up implementation to compare cpython to pypy rather than different django versions.
fennb committed Sep 1, 2011
1 parent 9e04e3a commit f674e1b
Showing 2 changed files with 31 additions and 121 deletions.
80 changes: 5 additions & 75 deletions README.rst
@@ -1,86 +1,16 @@
Djangobench
===========

A harness and a set of benchmarks for measuring Django's performance over
time.
A hacked-up version of Djangobench that compares the performance of Django running under CPython vs. PyPy.

Running the benchmarks
----------------------

Setup a virtualenv (or other environment) that has both cpython and pypy available.

Here's the short version::

mkvirtualenv --no-site-packages djangobench
pip install -e git://github.com/jacobian/djangobench.git#egg=djangobench
svn co http://code.djangoproject.com/svn/django/tags/releases/1.2/ django-control
svn co http://code.djangoproject.com/svn/django/trunk django-experiment
pip install -e git://github.com/fennb/djangobench.git#egg=djangobench
svn co http://code.djangoproject.com/svn/django/tags/releases/1.3/ django-control
djangobench

Okay, so what the heck's going on here?

First, ``djangobench`` doesn't test a single Django version in isolation --
that wouldn't be very useful. Instead, it benchmarks an "experiment" Django
against a "control", reporting on the difference between the two and
measuring for statistical significance.

So to run this, you'll need two complete Django source trees. By default
``djangobench`` looks for directories named ``django-control`` and
``django-experiment`` in the current working directory, but you can change
that by using the ``--control`` or ``--experiment`` options.
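
For example, if your Django checkout lives somewhere other than the current
directory, an invocation might look like this (the path is purely
illustrative; the upstream harness accepts ``--experiment`` the same way,
while this fork keeps only ``--control`` and hard-codes the two
interpreters)::

    djangobench --control=/path/to/django-1.3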

Now, because you need two Django source trees, you can't exactly install
them: ``djangobench`` works its magic by mucking with ``PYTHONPATH``.
However, the benchmarks themselves need access to the ``djangobench``
module, so you'll need to install it.
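
Concretely, the harness builds a small subshell environment and runs each
benchmark script inside it, roughly along these lines (a simplified sketch
rather than the actual implementation, which lives in ``djangobench/main.py``
below; the paths and settings module name are illustrative)::

    import os
    import subprocess

    # Point PYTHONPATH at the Django tree under test plus the benchmark
    # directory, then run the benchmark script in a subshell.
    env = {
        'PATH': os.environ['PATH'],  # inherited so the interpreter can be found
        'PYTHONPATH': '/path/to/django-control:/path/to/benchmarks',
        'DJANGO_SETTINGS_MODULE': 'query_delete.settings',
    }
    subprocess.call(['python2.6', '/path/to/benchmarks/query_delete/benchmark.py',
                     '-t', '50'], env=env)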

If you're feeling fancy, you can use one of them there newfangled DVCSes instead
and test against a single repository containing branches::

git clone git://github.com/django/django.git
djangobench --vcs=git --control=1.2 --experiment=master

Git's the only supported VCS right now, but patches are welcome.

At the time of this writing Django's trunk hasn't significantly diverged
from Django 1.2, so you should expect to see not-statistically-significant
results::

Running 'startup' benchmark ...
Min: 0.138701 -> 0.138900: 1.0014x slower
Avg: 0.139009 -> 0.139378: 1.0027x slower
Not significant
Stddev: 0.00044 -> 0.00046: 1.0382x larger

Writing new benchmarks
----------------------

Benchmarks are very simple: they're a Django app, along with a settings
file, and an executable ``benchmarks.py`` that gets run by the harness. The
benchmark script needs to honor a simple contract:

* It's an executable Python script, run as ``__main__`` (e.g. ``python
path/to/benchmark.py``). The subshell environment will have
``PYTHONPATH`` set up to point to the correct Django; it'll also have
``DJANGO_SETTINGS_MODULE`` set to ``<benchmark_dir>.settings``.

* The benchmark script needs to accept a ``--trials`` argument giving
the number of trials to run.

* The output should be simple RFC 822-ish text -- a set of headers,
followed by data points::

Title: some benchmark
Description: whatever the benchmark does

1.002
1.003
...

The list of headers is TBD.

There are a couple of utility functions in ``djangobench.utils`` that assist
with honoring this contract; see those functions' docstrings for details.

The existing benchmarks should be pretty easy to read for inspiration. The
``query_delete`` benchmark is probably a good place to start.
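
As a rough sketch only (this is not a file from the repository, and it skips
the ``djangobench.utils`` helpers), a minimal ``benchmark.py`` honoring the
contract above might look like::

    #!/usr/bin/env python
    import optparse
    import time

    def run_once():
        # Whatever Django code the benchmark exercises would go here; a
        # trivial loop keeps this sketch self-contained.
        for i in xrange(1000):
            str(i)

    def main():
        parser = optparse.OptionParser()
        parser.add_option('-t', '--trials', type='int', default=10,
                          help='Number of trials to run.')
        options, args = parser.parse_args()

        # RFC 822-ish output: headers, a blank line, then one data point
        # per line; the harness parses this with the ``email`` module.
        print 'Title: example benchmark'
        print 'Description: times a trivial loop'
        print
        for _ in xrange(options.trials):
            start = time.time()
            run_once()
            print time.time() - start

    if __name__ == '__main__':
        main()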

**Please write new benchmarks and send me pull requests on Github!**
72 changes: 26 additions & 46 deletions djangobench/main.py
@@ -9,14 +9,19 @@
import email
import simplejson
import sys
import os
from djangobench import perf
from unipath import DIRS, FSPath as Path

__version__ = '0.9'

# Define environment system commands that will invoke the right interpreter
cpython = 'python2.6'
pypy = 'pypy'

DEFAULT_BENCMARK_DIR = Path(__file__).parent.child('benchmarks').absolute()

def run_benchmarks(control, experiment, benchmark_dir, benchmarks, trials, vcs=None, record_dir=None, profile_dir=None):
def run_benchmarks(control, benchmark_dir, benchmarks, trials, record_dir=None, profile_dir=None):
if benchmarks:
print "Running benchmarks: %s" % " ".join(benchmarks)
else:
@@ -28,38 +33,27 @@ def run_benchmarks(control, experiment, benchmark_dir, benchmarks, trials, vcs=N
raise ValueError('Recording directory "%s" does not exist' % record_dir)
print "Recording data to '%s'" % record_dir

control_label = get_django_version(control, vcs=vcs)
experiment_label = get_django_version(experiment, vcs=vcs)
branch_info = "%s branch " % vcs if vcs else ""
print "Control: Django %s (in %s%s)" % (control_label, branch_info, control)
print "Experiment: Django %s (in %s%s)" % (experiment_label, branch_info, experiment)
control_label = get_django_version(control, vcs=None)
branch_info = ""
print "Benchmarking: Django %s (in %s%s)" % (control_label, branch_info, control)
print " Control: %s" % cpython
print " Experiment: %s" % pypy
print

# Calculate the subshell envs that we'll use to execute the
# benchmarks in.
if vcs:
control_env = {
'PYTHONPATH': '%s:%s' % (Path.cwd().absolute(), Path(benchmark_dir)),
}
experiment_env = control_env.copy()
else:
control_env = {'PYTHONPATH': '%s:%s' % (Path(control).absolute(), Path(benchmark_dir))}
experiment_env = {'PYTHONPATH': '%s:%s' % (Path(experiment).absolute(), Path(benchmark_dir))}


control_env = {'PYTHONPATH': '.:%s:%s' % (Path(control).absolute(), Path(benchmark_dir))}

for benchmark in discover_benchmarks(benchmark_dir):
if not benchmarks or benchmark.name in benchmarks:
print "Running '%s' benchmark ..." % benchmark.name
settings_mod = '%s.settings' % benchmark.name
control_env['DJANGO_SETTINGS_MODULE'] = settings_mod
experiment_env['DJANGO_SETTINGS_MODULE'] = settings_mod
experiment_env = control_env.copy()
if profile_dir is not None:
control_env['DJANGOBENCH_PROFILE_FILE'] = Path(profile_dir, "con-%s" % benchmark.name)
experiment_env['DJANGOBENCH_PROFILE_FILE'] = Path(profile_dir, "exp-%s" % benchmark.name)
control_env['DJANGOBENCH_PROFILE_FILE'] = Path(profile_dir, "cpython-%s" % benchmark.name)
experiment_env['DJANGOBENCH_PROFILE_FILE'] = Path(profile_dir, "pypy-%s" % benchmark.name)
try:
if vcs: switch_to_branch(vcs, control)
control_data = run_benchmark(benchmark, trials, control_env)
if vcs: switch_to_branch(vcs, experiment)
experiment_data = run_benchmark(benchmark, trials, experiment_env)
control_data = run_benchmark(benchmark, trials, control_env, cpython)
experiment_data = run_benchmark(benchmark, trials, experiment_env, pypy)
except SkipBenchmark, reason:
print "Skipped: %s\n" % reason
continue
@@ -69,17 +63,15 @@ def run_benchmarks(control, experiment, benchmark_dir, benchmarks, trials, vcs=N
diff_instrumentation = False,
benchmark_name = benchmark.name,
disable_timelines = True,
control_label = control_label,
experiment_label = experiment_label,
)
result = perf.CompareBenchmarkData(control_data, experiment_data, options)
if record_dir:
record_benchmark_results(
dest = record_dir.child('%s.json' % benchmark.name),
name = benchmark.name,
result = result,
control = control_label,
experiment = experiment_label,
control = 'cpython',
experiment = 'pypy',
control_data = control_data,
experiment_data = experiment_data,
)
@@ -94,21 +86,22 @@ def discover_benchmarks(benchmark_dir):
class SkipBenchmark(Exception):
pass

def run_benchmark(benchmark, trials, env):
def run_benchmark(benchmark, trials, env, interpreter):
"""
Similar to perf.MeasureGeneric, but modified a bit for our purposes.
"""
# Remove Pycs, then call the command once to prime the pump and
# re-generate fresh ones. This makes sure we're measuring as little of
# Python's startup time as possible.
perf.RemovePycs()
command = [sys.executable, '%s/benchmark.py' % benchmark]
out, _, _ = perf.CallAndCaptureOutput(command + ['-t', 1], env, track_memory=False, inherit_env=[])
command = [interpreter, '%s/benchmark.py' % benchmark]
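# Run the benchmark under the requested interpreter ('python2.6' or 'pypy'); PATH is inherited below so the subshell can find it.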
#print " - Running command: %s" % command
out, _, _ = perf.CallAndCaptureOutput(command + ['-t', 1], env, track_memory=False, inherit_env=['PATH'])
if out.startswith('SKIP:'):
raise SkipBenchmark(out.replace('SKIP:', '').strip())

# Now do the actual measurements.
output = perf.CallAndCaptureOutput(command + ['-t', str(trials)], env, track_memory=False, inherit_env=[])
output = perf.CallAndCaptureOutput(command + ['-t', str(trials)], env, track_memory=False, inherit_env=['PATH'])
stdout, stderr, mem_usage = output
message = email.message_from_string(stdout)
data_points = [float(line) for line in message.get_payload().splitlines()]
@@ -237,17 +230,6 @@ def main():
default = 'django-control',
help = "Path to the Django code tree to use as control."
)
parser.add_argument(
'--experiment',
metavar = 'PATH',
default = 'django-experiment',
help = "Path to the Django version to use as experiment."
)
parser.add_argument(
'--vcs',
choices = ['git'],
help = 'Use a VCS for control/experiment. Makes --control/--experiment specify branches, not paths.'
)
parser.add_argument(
'-t', '--trials',
type = int,
@@ -288,11 +270,9 @@ def main():
args = parser.parse_args()
run_benchmarks(
control = args.control,
experiment = args.experiment,
benchmark_dir = args.benchmark_dir,
benchmarks = args.benchmarks,
trials = args.trials,
vcs = args.vcs,
record_dir = args.record,
profile_dir = args.profile_dir
)
