MMTests is a configurable test suite that runs performance tests against arbitrary workloads. This is not the only test framework but care is taken to make sure the test configurations are accurate, representative and reproducible. Reporting and analysis is common across all benchmarks. Support exists for gathering additional telemetry while tests are running and hooks exist for more detailed tracing using ftrace or perf.
The top-level directory has a single driver script called
run-mmtests.sh which reads a config file that describes how the benchmarks should be
configured and executed. In some cases, the same benchmarking tool may
be used with different configurations, each stressing a different scenario.
A test run can have any name. A common use case is simply to compare kernel versions, but it can be anything: a different compiler, a different userspace package, a different benchmark configuration, etc.
Monitors can be optionally configured, but care should be taken as there is a possibility that they introduce overhead of their own. Hence, for some performance sensitive tests it is preferable to have no monitoring.
Many of the tests download external benchmarks. An attempt will be made
to download from a mirror if it exists. To get an idea where the mirror
should be located, grep for
A basic invocation of the suite is
$ ./bin/autogen-configs
$ ./run-mmtests.sh --no-monitor --config configs/config-pagealloc-performance 5.8-vanilla
$ ./run-mmtests.sh --no-monitor --config configs/config-pagealloc-performance 5.9-vanilla
$ cd work/log
$ ../../compare-kernels.sh
$ mkdir /tmp/html/
$ ../../compare-kernels.sh --format html --output-dir /tmp/html > /tmp/html/index.html
The first step is optional. Some configurations are auto-generated from a template, particularly the filesystem-specific ones.
and maybe even
need to be installed from CPAN for the reporting to
R should be installed if
attempting to highlight whether performance differences are statistically
A "tutorial" with some more details and the full output of each step is available here:
Running Benchmarks with MMTests
All available configurations are stored in
config-pagealloc-performance can be used to run tests that
may be able to identify performance regressions or gains in the page allocator.
Similarly there are network, disk and scheduler configs.
The config file can take many options, in the form of
variables. There is an example (functional) config file available in
Some options are universal, others are specific to the test. Some of the universal ones are:
MMTESTS: A list of what tests will be run.
AUTO_PACKAGE_INSTALL: Whether packages necessary for building or running benchmarks should be automatically installed, without asking any confirmation (takes a
no and creating a
/.mmtests-auto-package-install would be equivalent to setting this to
numactl should be used for deciding (typically, for restricting) on what CPUs and/or NUMA nodes the benchmark will run. It accepts several values.
interleave, are the simplest, but the following ones can also be used:
cpubind_largest_nonnode0_memory, in which case,
MMTESTS_NODE_ID should also be defined
cpubind_node_nrcpus, in which case
MMTESTS_NUMA_NODE_NRCPUS should also be defined. If
none is used or the option is not present, nothing is done in terms of NUMA pinning of the benchmarks.
MMTESTS_TUNED_PROFILE: Whether or not the tuned tool should be used and, if yes, with which profile. The option takes the name of the desired profile (which should be present in the system). If this is defined,
tuned is started and stopped around the execution of the benchmarks.
SWAP_SWAPFILE_SIZEMB: It's possible to use a different swap configuration than what is provided by default.
TESTDISK_RAID_TYPE: If the target machine has partitions suitable for configuring RAID, they can be specified here. This RAID partition is then used for all the tests.
TESTDISK_PARTITION: Use this partition for all tests.
TESTDISK_MOUNT_ARGS: The filesystem,
mkfs parameters and mount arguments for the test partitions.
TESTDISK_DIR: A directory passed to the test. If not set, defaults to
SHELLPACK_TEMP. The directory is supposed to contain a precreated environment (e.g., a specifically created filesystem mounted with the desired mount options).
STORAGE_BACKING_DEVICE: It's also possible to use storage caching.
STORAGE_CACHE_TYPE is either "dm-cache" or "bcache". The devices specified with
STORAGE_BACKING_DEVICE are used to create the cache device, which is then used for all the tests.
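As a rough illustration, a few of the universal options above could be set in a config file as follows. This is only a sketch: it assumes the config is read as a shell script with exported variables, and all the values (the stream benchmark, the tuned profile name, the device path) are placeholders rather than recommendations.

```shell
# Hypothetical config fragment; every value below is an example only.

# Run a single benchmark (stream is the benchmark used in the
# reporting examples of this document).
export MMTESTS="stream"

# Install missing packages without asking for confirmation.
export AUTO_PACKAGE_INSTALL="yes"

# Start tuned with this profile around the benchmark run. The profile
# name is an assumption and must exist on the target system.
export MMTESTS_TUNED_PROFILE="throughput-performance"

# Placeholder partition: anything on it may be destroyed by the tests,
# so it is left commented out here.
# export TESTDISK_PARTITION=/dev/sdX1
```

Options that are left out keep whatever default the framework applies.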
Platform Specific Configuration
It is possible to retrieve information about the characteristics of the system where the benchmarks will be running, and use them inside a config file.
MEMTOTAL_BYTES: Tells how much memory there is in the system.
NUMCPUS: Tells how many CPUs are present in the system.
It is possible to add the following to the config file:
. $SHELLPACK_INCLUDE/include-sizes.sh
get_numa_details
This will give access to more information about the system topology, such as:
NUMLLCS: Number of Last Level Caches present in the system.
NUMNODES: Number of NUMA nodes.
Taking advantage of this knowledge about the characteristics of the platform, the configuration of the benchmarks can be refined.
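For instance, benchmark parameters can be derived from NUMCPUS and MEMTOTAL_BYTES instead of being hard-coded. The fragment below is a sketch: EXAMPLE_MAX_THREADS and EXAMPLE_WORKSET_BYTES are made-up option names, not real MMTests options, and the fallback values only exist so that the fragment can be tried outside of MMTests.

```shell
# Fallbacks for trying this outside MMTests (placeholders: 8 CPUs, 16 GiB).
# Inside a real config, NUMCPUS and MEMTOTAL_BYTES are already available.
NUMCPUS=${NUMCPUS:-8}
MEMTOTAL_BYTES=${MEMTOTAL_BYTES:-17179869184}

# Hypothetical option: scale the thread count up to twice the CPU count.
export EXAMPLE_MAX_THREADS=$((NUMCPUS * 2))

# Hypothetical option: size the working set to half of the system memory.
export EXAMPLE_WORKSET_BYTES=$((MEMTOTAL_BYTES / 2))

echo "threads=$EXAMPLE_MAX_THREADS workset=$EXAMPLE_WORKSET_BYTES"
```

The same config then adapts to small and large machines without being edited.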
The entry point to running benchmarks is
run-mmtests.sh. If run with
--help the available options are shown:
run-mmtests [-mnpb] [-c config-file] test-name
Options:
 -m|--run-monitors  Run with monitors enabled as specified by the configuration
 -n|--no-monitor    Only execute the benchmark, do not monitor it
 -p|--performance   Set the performance cpufreq governor before starting
 -c|--config        Configuration file to read (default: config)
 -b|--build-only    Only build the benchmark, do not execute it
If no config file is specified, the one in
./config is used.
After a run, the benchmark results as well as any data that can be useful for
a report will be available in the
work/log directory (more specifically, in
work/log/TEST_RUN/iter-0, for a run called TEST_RUN).
Note that often a configuration will run more than just one single benchmark
(this depends on the value of the
MMTESTS option in the config itself),
resulting in some subdirectories being present in the results directory.
Running MMTests as
Configuring the system for running a benchmark may include doing some changes
to the system itself that can only be done with
For instance, the config-workload-thpfioscale-defrag config does:
echo always > /sys/kernel/mm/transparent_hugepage/defrag
Running run-mmtests.sh as a "regular user", operations like that
will fail. Benchmarks should still complete (most likely with some warnings)
but the results will likely not be the ones expected.
In fact, MMTests is intended to be run as
root. For most of the changes that
it applies to the system, the framework is careful to (try to) undo them. It
is however fair to say that MMTests is best used on machines that can be
redeployed and reset to a clean known state both before and after running a
A full list of available monitors is in
The following options, to be defined in the config file, can be used to control monitoring:
RUN_MONITOR: A yes/no switch for deciding whether monitoring should happen or not during the execution of the benchmarks. If set to
no, even if monitors are defined, they will be ignored (but see
MONITORS_ALWAYS below). It can be overridden by the --run-monitors and
--no-monitor command line parameters. I.e.,
--run-monitors means we will always run monitors, even if we have
RUN_MONITOR=no in the config file (and vice versa, for --no-monitor).
MONITORS_ALWAYS: Basically, another override. Monitors defined here will be started even if monitoring is otherwise disabled.
MONITORS_GZIP: A list of monitors to be used during the benchmarks. Their output will be saved in compressed (with gzip) log files.
MONITORS_WITH_LATENCY: A list of monitors to be used during the benchmarks with their output augmented with some additional timestamping.
MONITOR_UPDATE_FREQUENCY: How frequently, in seconds, the various defined monitors should produce and log a sample.
MONITOR_FTRACE_EVENTS: respectively, options to set and tracing events to enable for
ftrace, if the "ftrace" monitor is enabled.
MONITOR_PERF_EVENTS: A list of
perf events to stat or record, when any of the "perf-foo" monitors is enabled (see below).
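The precedence between RUN_MONITOR and the command line parameters can be summarised with a small shell function. This is an illustrative sketch, not code taken from MMTests: the function name monitors_enabled is made up, and treating any RUN_MONITOR value other than yes as "no" is an assumption. MONITORS_ALWAYS is not modelled here.

```shell
# monitors_enabled <RUN_MONITOR value> [command line flag]
# Prints "yes" or "no". A command line flag, when given, wins over
# the value found in the config file.
monitors_enabled() {
    config_value=$1
    cli_flag=${2:-}
    case "$cli_flag" in
        --run-monitors) echo yes ;;
        --no-monitor)   echo no ;;
        *)
            # No override on the command line: follow the config file.
            if [ "$config_value" = "yes" ]; then
                echo yes
            else
                echo no
            fi
            ;;
    esac
}

monitors_enabled no --run-monitors   # command line wins over the config
monitors_enabled yes --no-monitor    # and vice versa
monitors_enabled yes                 # no flag: the config decides
```

The three calls print yes, no and yes, in that order.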
The files in
monitors/ all follow the same naming scheme, which is
watch-foo.[sh|pl]. For instance, we have
monitoring the output of
mpstat and the content of
/proc/vmstat during the execution of a benchmark, include this option in
the config file:
MONITORS_GZIP="proc-vmstat mpstat proc-interrupts"
Similarly, to monitor the output of
iostat, and also add
some timestamps to the output, define this option:
In order to record the output of, for instance, the
tracepoint, make sure to have
ftrace in the list of monitors defined in
MONITORS_GZIP and then add
for a more advanced example.)
To use perf "as a monitor", a list of events should be defined, e.g.
MONITOR_PERF_EVENTS=node-load-misses,node-store-misses. Also, the monitor
should be enabled either by adding
perf-time-stat to the list of
MONITORS_GZIP, or adding
perf-event-stat to the
For reporting, there is a basic
Despite the name, it can compare an arbitrary number of benchmarking runs. The name has historical reasons, from the time when the only use case was comparing kernel versions, but nowadays anything can be compared --machines, userspace packages, benchmark versions, tuning parameters etc.
It is optionally possible to specify a different baseline and comparison points, while by default the results are organised by the time the test was executed.
NAME
    compare-kernels.sh - Compare results between benchmarking runs
SYNOPSIS
    compare-kernels.sh [options]
Options:
    --baseline <testname>      Baseline test name, default is time ordered
    --compare "<test> <test>"  Comparison test names, space separated
    --exclude "<test> <test>"  Exclude test names
    --auto-detect              Attempt to automatically highlight significant differences
    --sort-version             Assume kernel versions for test names and attempt to sort
    --format html              Generate a HTML format of the report
    --output-dir               Output directory for HTML report
It must be run from within an MMTests results directory. So, even if the
benchmarks have been run on a different machine, it is enough to capture
work/log and run
compare-kernels.sh from there.
In the table(s) produced, it is usually most interesting to look at the average values, computed over the individual results of multiple repetitions of the benchmarks. Note that some benchmarks use the harmonic mean (Hmean) and some use the arithmetic mean (Amean), depending on the nature of the results.
compare-kernels.sh can generate an HTML report, with both tables and graphs.
For doing that, both the format and the output directory need to be
specified. The HTML page will then come directly out of the standard output
of the tool. Therefore, invoking it like this is recommended:
$ cd work/log
$ mkdir /tmp/report
$ ../../compare-kernels.sh --format html --output-dir /tmp/report > /tmp/report/index.html
An example of the HTML reporting is available
This comes from two simple runs of the default
config (i.e., of the
STREAM benchmark) when the system was idle (
TEST_RUN) and busy with
something else (
It is possible to obtain a report using a different tool. It is the script
compare-kernels.sh calls internally and it is located at
The output is the same table(s) produced by
A possible invocation could look like this:
./bin/compare-mmtests.pl --directory work/log --benchmark stream --names TEST_RUN,TEST_RUN_BUSY
                           TEST_RUN         TEST_RUN_BUSY
MB/sec copy     19059.86 (   0.00%)    15234.88 ( -20.07%)
MB/sec scale    14078.10 (   0.00%)    11258.38 ( -20.03%)
MB/sec add      14740.32 (   0.00%)    11749.84 ( -20.29%)
MB/sec triad    14504.22 (   0.00%)    11317.26 ( -21.97%)
If the benchmark does multiple operations --like STREAM above that checks the
memory throughput of four different operations-- there will be one result for
each. In these cases,
compare-mmtests.pl can be used to produce an overall
comparison between the benchmarks.
This is done by taking the geometric mean (Gmean) of the results. The geometric mean is chosen because it has the nice property that the geometric mean of ratios is equal to the ratio of the geometric means, so we do not get different results depending on the order of the operations.
Looking at the Gmean offers a concise and hence rather useful overview of the overall performance, especially when complex benchmarks are used.
./bin/compare-mmtests.pl --directory work/log/ --benchmark stream --names TEST_RUN,TEST_RUN_BUSY --print-ratio
                            TEST_RUN            TEST_RUN_BUSY
Ratio copy      1.00 (0.00%) (NaNs)    0.80 (-20.07%) (NaNs)
Ratio scale     1.00 (0.00%) (NaNs)    0.80 (-20.03%) (NaNs)
Ratio add       1.00 (0.00%) (NaNs)    0.80 (-20.29%) (NaNs)
Ratio triad     1.00 (0.00%) (NaNs)    0.78 (-21.97%) (NaNs)
Gmean Higher    1.00                   0.79
Of course, the Gmean for the benchmark chosen as the baseline will always
be 1.00. Additionally, the Higher or
Lower "tag" tells us whether
it is the higher or lower values that represent better performance.
In the example above,
TEST_RUN_BUSY reaches only 79% of TEST_RUN's
performance, which means that it is 21% slower.
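The Gmean figure can be checked by hand from the STREAM numbers shown earlier. The snippet below is a sketch that assumes awk is available; the helper names gmean and ratio are made up for this example and are not part of MMTests.

```shell
# gmean: print the geometric mean of its arguments, computed as
# exp(mean(log(x))) and rounded to two decimals.
gmean() {
    printf '%s\n' "$@" |
        awk '{ s += log($1); n++ } END { if (n) printf "%.2f\n", exp(s / n) }'
}

# ratio: print a / b with six decimals.
ratio() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.6f\n", a / b }'
}

# Per-operation ratios, TEST_RUN_BUSY over TEST_RUN.
r_copy=$(ratio 15234.88 19059.86)
r_scale=$(ratio 11258.38 14078.10)
r_add=$(ratio 11749.84 14740.32)
r_triad=$(ratio 11317.26 14504.22)

# Geometric mean of the four ratios.
gmean "$r_copy" "$r_scale" "$r_add" "$r_triad"

# Ratio of the geometric means of the raw MB/sec values.
ratio "$(gmean 15234.88 11258.38 11749.84 11317.26)" \
      "$(gmean 19059.86 14078.10 14740.32 14504.22)"
```

The first command prints 0.79, matching the Gmean line of the report. The second one gives the same value up to rounding, which is the property of the geometric mean mentioned above.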
Further info about reporting:
MMTests Internal Structure & Development
Benchmarks & Shellpacks
The install and test scripts are automatically generated from "shellpacks".
A shellpack is a pair of benchmark and install scripts that are stored in
Actual shellpacks are automatically generated from template files stored
in shellpack_src/src/. Some have a build suffix, indicating that they only
build a supporting tool, like a library that a benchmark requires. Do not
modify the generated test scripts in the
shellpacks/ directory as they will
simply be overwritten.
/shellpacks/shellpack-bench-pgbench --which will be
automatically generated from
contains all the individual test steps.
Each test is driven by
bin/run-single-test.sh script which reads
drivers/driver-<testname>.sh script (e.g.,
Downloading Benchmarks & Mirrors
MMTests needs to download the various benchmarks from their official location, i.e., from the Internet. That might be problematic because the Internet can (should!) be considered untrusted, or simply because the official version may have been updated to a newer one that is not yet compatible with the current version of the MMTests shellpack for that particular benchmark. If this happens, the run will likely fail.
Other potential problems are that the download may fail due to temporary networking issues, that it consumes bandwidth and that it adds delays and makes testing longer.
It is therefore possible to create a local mirror. The location of such mirror
can be configured in
kernbench tries to download
If this is not available, it is downloaded from the internet.
This can add delays in testing and consumes bandwidth so is worth configuring.
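The "mirror first, official location second" behaviour can be pictured with the following sketch. It is not the code MMTests uses: the function names are made up, and curl is assumed to be installed.

```shell
# fetch <dest> <url>: download a URL to a file.
# -f makes curl return an error on HTTP failures, -L follows redirects.
fetch() {
    curl -fsSL -o "$1" "$2"
}

# fetch_with_fallback <dest> <mirror-url> <upstream-url>
# Try the local mirror first, then the official location. Prints which
# source was used and returns non-zero if both downloads fail.
fetch_with_fallback() {
    if fetch "$1" "$2"; then
        echo "mirror"
    elif fetch "$1" "$3"; then
        echo "upstream"
    else
        echo "download failed: $3" >&2
        return 1
    fi
}
```

A call such as fetch_with_fallback bench.tar.gz http://mirror.example/bench.tar.gz https://upstream.example/bench.tar.gz (both URLs are placeholders) only reaches the Internet when the mirror does not have the file.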
Contributing and Bug Reporting
(Pseudo-)Random links to places where MMTests got mentioned on the Internet:
- MMTests being used to benchmark patches to the task wake-up path inside the Linux scheduler, on LKML here and here.
- MMTests used to reproduce a bug in the accounting code inside the Linux scheduler, on LKML.
- MMTests used to benchmark some early version of the Core Scheduling patches, highlighting their impact on both baremetal and virtualization workloads, on LKML (check the replies for seeing all the benchmark results).
- In addition to the above examples, many more reports of MMTests being used for Linux kernel development can be found just by searching for 'MMTests' in an LKML archive.
- Giovanni Gherdovich explaining running MMTests and reading the reporting on LKML.
Talks and presentations about or related to MMTests:
- Scheduler benchmarking with MMTests is a report of a talk about MMTests given at the 2020 OSPM conference (slides).
- FOSDEM 2020 talk about MMTests, focusing on using it for running benchmarks inside virtual machines: Automated Performance Testing for Virtualization with MMTests
- Mel Gorman's talk at SUSE Labs Conference 2018, Marvin: Automated assistant for development and CI. It's about Marvin, but mentions MMTests as well.
- Davidlohr's talk at LinuxCon NA 2015 Performance Monitoring in the Linux Kernel
Some historic references: