Hacked up implementation to compare cpython to pypy rather than different django versions.
fennb committed Sep 1, 2011
1 parent 9e04e3a commit f674e1b
Showing 2 changed files with 31 additions and 121 deletions.
80 changes: 5 additions & 75 deletions README.rst
@@ -1,86 +1,16 @@
Djangobench
===========

A harness and a set of benchmarks for measuring Django's performance over
time.
A hacked-up version of Djangobench that compares the performance of Django running under CPython vs. PyPy.

Running the benchmarks
----------------------

Setup a virtualenv (or other environment) that has both cpython and pypy available.

Here's the short version::

mkvirtualenv --no-site-packages djangobench
pip install -e git://github.com/jacobian/djangobench.git#egg=djangobench
svn co http://code.djangoproject.com/svn/django/tags/releases/1.2/ django-control
svn co http://code.djangoproject.com/svn/django/trunk django-experiment
pip install -e git://github.com/fennb/djangobench.git#egg=djangobench
svn co http://code.djangoproject.com/svn/django/tags/releases/1.3/ django-control
djangobench

Okay, so what the heck's going on here?

First, ``djangobench`` doesn't test a single Django version in isolation --
that wouldn't be very useful. Instead, it benchmarks an "experiment" Django
against a "control", reporting on the difference between the two and
measuring for statistical significance.

So to run this, you'll need two complete Django source trees. By default
``djangobench`` looks for directories named ``django-control`` and
``django-experiment`` in the current working directory, but you can change
that by using the ``--control`` or ``--experiment`` options.
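
For example, if your Django checkout lives somewhere other than the current
directory, an invocation might look like this (the path is purely
illustrative; the upstream harness accepts ``--experiment`` the same way,
while this fork keeps only ``--control`` and hard-codes the two
interpreters)::

    djangobench --control=/path/to/django-1.3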

Now, because you need two Django source trees, you can't exactly install
them: ``djangobench`` works its magic by mucking with ``PYTHONPATH``.
However, the benchmarks themselves need access to the ``djangobench``
module, so you'll need to install it.
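
Concretely, the harness builds a small subshell environment and runs each
benchmark script inside it, roughly along these lines (a simplified sketch
rather than the actual implementation, which lives in ``djangobench/main.py``
below; the paths and settings module name are illustrative)::

    import os
    import subprocess

    # Point PYTHONPATH at the Django tree under test plus the benchmark
    # directory, then run the benchmark script in a subshell.
    env = {
        'PATH': os.environ['PATH'],  # inherited so the interpreter can be found
        'PYTHONPATH': '/path/to/django-control:/path/to/benchmarks',
        'DJANGO_SETTINGS_MODULE': 'query_delete.settings',
    }
    subprocess.call(['python2.6', '/path/to/benchmarks/query_delete/benchmark.py',
                     '-t', '50'], env=env)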

If you're feeling fancy, you can use one of them there newfangled DVCSes instead
and test against a single repository containing branches::

git clone git://github.com/django/django.git
djangobench --vcs=git --control=1.2 --experiment=master

Git's the only supported VCS right now, but patches are welcome.

At the time of this writing Django's trunk hasn't significantly diverged
from Django 1.2, so you should expect to see not-statistically-significant
results::

Running 'startup' benchmark ...
Min: 0.138701 -> 0.138900: 1.0014x slower
Avg: 0.139009 -> 0.139378: 1.0027x slower
Not significant
Stddev: 0.00044 -> 0.00046: 1.0382x larger

Writing new benchmarks
----------------------

Benchmarks are very simple: they're a Django app, along with a settings
file, and an executable ``benchmarks.py`` that gets run by the harness. The
benchmark script needs to honor a simple contract:

* It's an executable Python script, run as ``__main__`` (e.g. ``python
path/to/benchmark.py``). The subshell environment will have
``PYTHONPATH`` set up to point to the correct Django; it'll also have
``DJANGO_SETTINGS_MODULE`` set to ``<benchmark_dir>.settings``.

* The benchmark script needs to accept a ``--trials`` argument giving
the number of trials to run.

* The output should be simple RFC 822-ish text -- a set of headers,
followed by data points::

Title: some benchmark
Description: whatever the benchmark does

1.002
1.003
...

The list of headers is TBD.

There are a couple of utility functions in ``djangobench.utils`` that assist
with honoring this contract; see those functions' docstrings for details.

The existing benchmarks should be pretty easy to read for inspiration. The
``query_delete`` benchmark is probably a good place to start.
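
As a rough sketch only (this is not a file from the repository, and it skips
the ``djangobench.utils`` helpers), a minimal ``benchmark.py`` honoring the
contract above might look like::

    #!/usr/bin/env python
    import optparse
    import time

    def run_once():
        # Whatever Django code the benchmark exercises would go here; a
        # trivial loop keeps this sketch self-contained.
        for i in xrange(1000):
            str(i)

    def main():
        parser = optparse.OptionParser()
        parser.add_option('-t', '--trials', type='int', default=10,
                          help='Number of trials to run.')
        options, args = parser.parse_args()

        # RFC 822-ish output: headers, a blank line, then one data point
        # per line; the harness parses this with the ``email`` module.
        print 'Title: example benchmark'
        print 'Description: times a trivial loop'
        print
        for _ in xrange(options.trials):
            start = time.time()
            run_once()
            print time.time() - start

    if __name__ == '__main__':
        main()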

**Please write new benchmarks and send me pull requests on Github!**
72 changes: 26 additions & 46 deletions djangobench/main.py
@@ -9,14 +9,19 @@
import email
import simplejson
import sys
import os
from djangobench import perf
from unipath import DIRS, FSPath as Path

__version__ = '0.9'

# Define environment system commands that will invoke the right interpreter
cpython = 'python2.6'
pypy = 'pypy'

DEFAULT_BENCMARK_DIR = Path(__file__).parent.child('benchmarks').absolute()

def run_benchmarks(control, experiment, benchmark_dir, benchmarks, trials, vcs=None, record_dir=None, profile_dir=None):
def run_benchmarks(control, benchmark_dir, benchmarks, trials, record_dir=None, profile_dir=None):
if benchmarks:
print "Running benchmarks: %s" % " ".join(benchmarks)
else:
@@ -28,38 +33,27 @@ def run_benchmarks(control, experiment, benchmark_dir, benchmarks, trials, vcs=N
raise ValueError('Recording directory "%s" does not exist' % record_dir)
print "Recording data to '%s'" % record_dir

control_label = get_django_version(control, vcs=vcs)
experiment_label = get_django_version(experiment, vcs=vcs)
branch_info = "%s branch " % vcs if vcs else ""
print "Control: Django %s (in %s%s)" % (control_label, branch_info, control)
print "Experiment: Django %s (in %s%s)" % (experiment_label, branch_info, experiment)
control_label = get_django_version(control, vcs=None)
branch_info = ""
print "Benchmarking: Django %s (in %s%s)" % (control_label, branch_info, control)
print " Control: %s" % cpython
print " Experiment: %s" % pypy
print

# Calculate the subshell envs that we'll use to execute the
# benchmarks in.
if vcs:
control_env = {
'PYTHONPATH': '%s:%s' % (Path.cwd().absolute(), Path(benchmark_dir)),
}
experiment_env = control_env.copy()
else:
control_env = {'PYTHONPATH': '%s:%s' % (Path(control).absolute(), Path(benchmark_dir))}
experiment_env = {'PYTHONPATH': '%s:%s' % (Path(experiment).absolute(), Path(benchmark_dir))}


control_env = {'PYTHONPATH': '.:%s:%s' % (Path(control).absolute(), Path(benchmark_dir))}

for benchmark in discover_benchmarks(benchmark_dir):
if not benchmarks or benchmark.name in benchmarks:
print "Running '%s' benchmark ..." % benchmark.name
settings_mod = '%s.settings' % benchmark.name
control_env['DJANGO_SETTINGS_MODULE'] = settings_mod
experiment_env['DJANGO_SETTINGS_MODULE'] = settings_mod
experiment_env = control_env.copy()
if profile_dir is not None:
control_env['DJANGOBENCH_PROFILE_FILE'] = Path(profile_dir, "con-%s" % benchmark.name)
experiment_env['DJANGOBENCH_PROFILE_FILE'] = Path(profile_dir, "exp-%s" % benchmark.name)
control_env['DJANGOBENCH_PROFILE_FILE'] = Path(profile_dir, "cpython-%s" % benchmark.name)
experiment_env['DJANGOBENCH_PROFILE_FILE'] = Path(profile_dir, "pypy-%s" % benchmark.name)
try:
if vcs: switch_to_branch(vcs, control)
control_data = run_benchmark(benchmark, trials, control_env)
if vcs: switch_to_branch(vcs, experiment)
experiment_data = run_benchmark(benchmark, trials, experiment_env)
control_data = run_benchmark(benchmark, trials, control_env, cpython)
experiment_data = run_benchmark(benchmark, trials, experiment_env, pypy)
except SkipBenchmark, reason:
print "Skipped: %s\n" % reason
continue
@@ -69,17 +63,15 @@ def run_benchmarks(control, experiment, benchmark_dir, benchmarks, trials, vcs=N
diff_instrumentation = False,
benchmark_name = benchmark.name,
disable_timelines = True,
control_label = control_label,
experiment_label = experiment_label,
)
result = perf.CompareBenchmarkData(control_data, experiment_data, options)
if record_dir:
record_benchmark_results(
dest = record_dir.child('%s.json' % benchmark.name),
name = benchmark.name,
result = result,
control = control_label,
experiment = experiment_label,
control = 'cpython',
experiment = 'pypy',
control_data = control_data,
experiment_data = experiment_data,
)
@@ -94,21 +86,22 @@ def discover_benchmarks(benchmark_dir):
class SkipBenchmark(Exception):
pass

def run_benchmark(benchmark, trials, env):
def run_benchmark(benchmark, trials, env, interpreter):
"""
Similar to perf.MeasureGeneric, but modified a bit for our purposes.
"""
# Remove Pycs, then call the command once to prime the pump and
# re-generate fresh ones. This makes sure we're measuring as little of
# Python's startup time as possible.
perf.RemovePycs()
command = [sys.executable, '%s/benchmark.py' % benchmark]
out, _, _ = perf.CallAndCaptureOutput(command + ['-t', 1], env, track_memory=False, inherit_env=[])
command = [interpreter, '%s/benchmark.py' % benchmark]
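# Run the benchmark under the requested interpreter ('python2.6' or 'pypy'); PATH is inherited below so the subshell can find it.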
#print " - Running command: %s" % command
out, _, _ = perf.CallAndCaptureOutput(command + ['-t', 1], env, track_memory=False, inherit_env=['PATH'])
if out.startswith('SKIP:'):
raise SkipBenchmark(out.replace('SKIP:', '').strip())

# Now do the actual measurements.
output = perf.CallAndCaptureOutput(command + ['-t', str(trials)], env, track_memory=False, inherit_env=[])
output = perf.CallAndCaptureOutput(command + ['-t', str(trials)], env, track_memory=False, inherit_env=['PATH'])
stdout, stderr, mem_usage = output
message = email.message_from_string(stdout)
data_points = [float(line) for line in message.get_payload().splitlines()]
@@ -237,17 +230,6 @@ def main():
default = 'django-control',
help = "Path to the Django code tree to use as control."
)
parser.add_argument(
'--experiment',
metavar = 'PATH',
default = 'django-experiment',
help = "Path to the Django version to use as experiment."
)
parser.add_argument(
'--vcs',
choices = ['git'],
help = 'Use a VCS for control/experiment. Makes --control/--experiment specify branches, not paths.'
)
parser.add_argument(
'-t', '--trials',
type = int,
@@ -288,11 +270,9 @@ def main():
args = parser.parse_args()
run_benchmarks(
control = args.control,
experiment = args.experiment,
benchmark_dir = args.benchmark_dir,
benchmarks = args.benchmarks,
trials = args.trials,
vcs = args.vcs,
record_dir = args.record,
profile_dir = args.profile_dir
)
