Merge PR #66: Document ReBench and fix consistency issues
smarr committed Jul 14, 2018
2 parents 113211e + 4780da2 commit 2b5a6a8
Showing 38 changed files with 1,274 additions and 172 deletions.
23 changes: 14 additions & 9 deletions CHANGELOG.md
@@ -4,19 +4,23 @@
 
 -
 
+## [0.10.1] - 2018-06-08
+
+- fix experiment filters and reporting on codespeed submission errors (#77)
+
 ## [0.10.0] - 2018-06-08
 
-- Restructure command-line options in help, and use argparse (#73)
-- Add support for Python 3 and PyPy (#65)
-- Add support for extra criteria (things beside run time) (#64)
-- Add support for path names in ReBenchLog benchmark names
+- restructure command-line options in help, and use argparse (#73)
+- add support for Python 3 and PyPy (#65)
+- add support for extra criteria (things beside run time) (#64)
+- add support for path names in ReBenchLog benchmark names
 
 ## [0.9.1] - 2017-12-21
 
-- Fix time-left reporting of invalid times (#60)
-- Take the number of data points per run into account for estimated time left (#62)
-- Obtain process output on timeout to enable results of partial runs
-- Fix incompatibility with latest setuptools
+- fix time-left reporting of invalid times (#60)
+- take the number of data points per run into account for estimated time left (#62)
+- obtain process output on timeout to enable results of partial runs
+- fix incompatibility with latest setuptools
 
 ## [0.9.0] - 2017-04-23

@@ -56,7 +60,8 @@
 - [0.6.0] - 2014-05-19
 - [0.5.0] - 2014-03-25
 
-[Unreleased]: https://github.com/smarr/ReBench/compare/v0.10.0...HEAD
+[Unreleased]: https://github.com/smarr/ReBench/compare/v0.10.1...HEAD
+[0.10.1]: https://github.com/smarr/ReBench/compare/v0.10.0...v0.10.1
 [0.10.0]: https://github.com/smarr/ReBench/compare/v0.9.1...v0.10.0
 [0.9.1]: https://github.com/smarr/ReBench/compare/v0.9.0...v0.9.1
 [0.9.0]: https://github.com/smarr/ReBench/compare/v0.8.0...v0.9.0
5 changes: 0 additions & 5 deletions INSTALL

This file was deleted.

19 changes: 19 additions & 0 deletions LICENSE
@@ -0,0 +1,19 @@
Copyright (c) 2009-2018 Stefan Marr <git@stefan-marr.de>

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in
all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
THE SOFTWARE.
128 changes: 128 additions & 0 deletions README.md
@@ -0,0 +1,128 @@
# ReBench: Execute and Document Benchmarks Reproducibly

[![Build Status](https://travis-ci.org/smarr/ReBench.svg?branch=master)](https://travis-ci.org/smarr/ReBench)
[![Documentation](https://readthedocs.org/projects/rebench/badge/?version=latest)](https://rebench.readthedocs.io/)
[![Codacy Quality](https://api.codacy.com/project/badge/Grade/2f7210b65b414100be03f64fe6702d66)](https://www.codacy.com/app/smarr/ReBench)

ReBench is a tool to run and document benchmark experiments.
Currently, it is mostly used for benchmarking language implementations,
but it can be used to monitor the performance of all
kinds of other applications and programs, too.

The ReBench [configuration format][docs] is a text format based on [YAML](http://yaml.org/).
A configuration file defines how to build and execute a set of *experiments*,
i.e., benchmarks.
It describes which binary is used, which parameters are given
to the benchmarks, and the number of iterations to be used to obtain
statistically reliable results.

With this approach, the configuration contains all benchmark-specific
information to reproduce a benchmark run. However, it does not capture
the whole system.

The data of all benchmark runs is recorded in a data file for later analysis.
This is important for long-running experiments: benchmarks can be
aborted and continued at a later time.
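Where that data file lives is itself part of the configuration; the
minimal example shown below uses the `default_data_file` key for this.
A one-key sketch, with the continuation behavior restated as a comment:

```yaml
# all measurements are recorded in this file; rerunning rebench with the
# same configuration continues an aborted experiment rather than restarting
default_data_file: 'example.data'
```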

ReBench focuses on the execution aspect and does not provide advanced
analysis facilities itself. Instead, it is used in combination with,
for instance, R scripts to process the results, or with [Codespeed][1]
for continuous performance tracking.

The documentation is hosted at [http://rebench.readthedocs.io/][docs].

## Goals and Features

ReBench is designed to

- enable the reproduction of experiments
- document all benchmark parameters
- provide a flexible execution model,
  with support for interrupting and continuing benchmarking
- support the definition of complex sets of comparisons
  and execute them flexibly
- report results to continuous performance monitoring systems,
  e.g., [Codespeed][1] (a configuration sketch follows this list)
- provide basic support to build/compile benchmarks/experiments on demand
- provide extensible support to read the output of benchmark harnesses
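Reporting to [Codespeed][1] is configured in a `reporting` section of
the configuration file. The sketch below is illustrative only: the key
names follow ReBench's Codespeed support as we understand it, and the
URL and project name are placeholders to adapt.

```yaml
# illustrative sketch of a Codespeed reporting section; treat the key
# names as assumptions and check the documentation -- the values below
# are placeholders
reporting:
    codespeed:
        url: https://example.org/result/add/json/  # Codespeed endpoint
        project: Example
```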

## Non-Goals

ReBench isn't

- a framework for microbenchmarks.
Instead, it relies on existing harnesses and can be extended to parse their
output.
- a performance analysis tool. It is meant to execute experiments and
record the corresponding measurements.
- a data analysis tool. It provides only a bare minimum of statistics,
but has an easily readable data format that can be processed, e.g., with R.

## Installation and Usage

<a id="install"></a>

ReBench is implemented in Python and can be installed via pip:

```bash
pip install rebench
```

A minimal configuration file looks like:

```yaml
# this run definition will be chosen if no parameters are given to rebench
default_experiment: all
default_data_file: 'example.data'

# a set of suites with different benchmarks and possibly different settings
benchmark_suites:
    ExampleSuite:
        gauge_adapter: RebenchLog
        command: Harness %(benchmark)s %(input)s %(variable)s
        input_sizes: [2, 10]
        variable_values:
            - val1
        benchmarks:
            - Bench1
            - Bench2

# a set of binaries used for the benchmark execution
virtual_machines:
    MyBin1:
        path: bin
        binary: test-vm1.py %(cores)s
        cores: [1]
    MyBin2:
        path: bin
        binary: test-vm2.py

# combining benchmark suites and virtual machines into experiments
experiments:
    Example:
        suites:
            - ExampleSuite
        executions:
            - MyBin1
            - MyBin2
```

Saved as `test.conf`, it could be executed with ReBench as follows:

```bash
rebench test.conf
```
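Since each variable dimension is varied independently, even this minimal
configuration should enumerate the full cross product of its settings:
2 benchmarks × 2 input sizes × 1 variable value × 2 virtual machines,
i.e., 8 distinct runs.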

See the documentation for details: [http://rebench.readthedocs.io/][docs].

## Support and Contributions

In case you encounter issues,
please feel free to [open an issue](https://github.com/smarr/rebench/issues/new)
so that we can help.

For contributions, we use the [normal GitHub flow](https://guides.github.com/introduction/flow/)
of pull requests, discussion, and revisions. For larger contributions,
it is likely useful to discuss them in an issue first.


[1]: https://github.com/tobami/codespeed/
[docs]: http://rebench.readthedocs.io/
75 changes: 0 additions & 75 deletions README.rst

This file was deleted.

1 change: 1 addition & 0 deletions docs/CHANGELOG.md
1 change: 1 addition & 0 deletions docs/LICENSE.md
85 changes: 85 additions & 0 deletions docs/concepts.md
@@ -0,0 +1,85 @@
# Basic Concepts

Some of the terminology used here may be unfamiliar. To avoid confusion,
the following defines the basic concepts:

<dl>
<dt id="#experiment">experiment</dt>
<dd>
<p>A combination of <a href="#suite">benchmark suites</a> and
<a href="#vm">virtual machines</a>.</p>
<p>ReBench executes experiments to collect the desired measurements.</p>
</dd>

<dt id="suite">benchmark suite</dt>
<dd>
A set of <a href="#benchmark">benchmarks</a>
which is used to define <a href="#experiment">experiments</a>.
</dd>

<dt id="vm">virtual machine</dt>
<dd>
<p>A named set of settings for the executor of a
<a href="#suite">benchmark suite</a>.</p>
<p>Typically, this is one specific virtual machine with a set of
startup parameters. It refers to an executable that will execute
<a href="#benchmark">benchmarks</a> from a <a href="#suite">suite</a>.
Thus, the virtual machine is the executor.</p>
</dd>

<dt id="benchmark">benchmark</dt>
<dd>
<p>A program to be executed by a <a href="#vm">virtual machine</a>.</p>
<p>A benchmark can define a number of different <a href="#variable">variables</a>
that can be varied, for instance, to change the input data set,
the number of cores to be used, etc.</p>
</dd>

<dt id="variable">variable</dt>
<dd>
<p>A dimension of the <a href="#benchmark">benchmark</a>
that can be varied to influence execution characteristics.</p>
<p>Currently, we have the notion of input sizes, cores, and other
variable values. Each of them is varied independently and can potentially
be used to enumerate a large number of <a href="#run">runs</a>.</p>
</dd>

<dt id="run">run</dt>
<dd>
<p>A concrete execution of a <a href="#benchmark">benchmark</a> by
a specific <a href="#vm">virtual machine</a>.</p>
<p>A run is a specific combination of variables.
It can be executed multiple times. Each time is referred to as an
<a href="#invocation">invocation</a>.
A run itself can also execute a benchmark multiple times, which
we refer to as <a href="#iteration">iterations</a>.</p>
<p>One run can generate multiple <a href="#data-point">data points</a>.</p>
</dd>

<dt id="invocation">invocation</dt>
<dd>
The execution of a <a href="#run">run</a>. It may execute itself multiple
<a href="#iteration">iterations</a> of a <a href="#benchmark">benchmark</a>.
</dd>

<dt id="iteration">iteration</dt>
<dd>
The execution of a benchmark within a virtual machine
<a href="#invocation">invocation</a>.
An iteration is expected to generate one
<a href="#data-point">data point</a>, possibly including
multiple <a href="#measurement">measurements</a>.
</dd>

<dt id="data-point">data point</dt>
<dd>
A set of <a href="#measurement">measurements</a> belonging together.
They are generated by an <a href="#iteration">iteration</a>.
</dd>

<dt id="measurement">measurement</dt>
<dd>One value for one specific criterion.</dd>
</dl>
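To tie these terms back to the configuration format, the following
sketch annotates the minimal example from the README with the concepts
defined above. The keys and names are taken from that example; only the
comments are new:

```yaml
benchmark_suites:
    ExampleSuite:                 # a benchmark suite
        gauge_adapter: RebenchLog
        command: Harness %(benchmark)s %(input)s %(variable)s
        input_sizes: [2, 10]      # a variable, varied independently
        benchmarks:
            - Bench1              # a benchmark
            - Bench2

virtual_machines:
    MyBin1:                       # a virtual machine, i.e., the executor
        path: bin
        binary: test-vm1.py %(cores)s
        cores: [1]                # another variable dimension

experiments:
    Example:                      # an experiment: suites x executors
        suites:
            - ExampleSuite
        executions:
            - MyBin1
```

Each combination of benchmark, input size, variable value, and virtual
machine is one run; executing that run once is an invocation, within
which the harness may perform several iterations, each yielding a data
point.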
8 changes: 8 additions & 0 deletions docs/conf.py
@@ -0,0 +1,8 @@
# Sphinx configuration: parse Markdown sources with recommonmark
# and render the documentation with the Read the Docs theme.
from recommonmark.parser import CommonMarkParser

source_parsers = {
    '.md': CommonMarkParser,
}

source_suffix = ['.md']
html_theme = 'sphinx_rtd_theme'