The benchmarking tool requires the following flags:

 - `--study-name`: a name to identify a run and provide a label during analysis,
 - `--function`: the name of the function under test.

It also provides optional flags:

 - `--num-trials`: repeats the benchmark several times; the analysis tool can
   take this into account and give confidence intervals.
 - `--output`: specifies a file to write the report to, or standard output if
   not set.
 - `--aligned-access`: the alignment to use when accessing the buffers; the
   default is unaligned, and 0 disables address randomization.

> Note: `--function` takes a generic function name like `memcpy` or `memset`,
> but the actual function being tested is the llvm-libc implementation (e.g.
> `__llvm_libc::memcpy`).
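For illustration, here is a minimal sketch combining the required flags with
`--output`. The binary path matches the examples below; depending on the mode,
additional flags such as `--size-distribution-name` may also be mandatory, as
detailed later:

```shell
# Minimal sketch: required flags plus --output. The remaining optional
# flags keep their defaults (unaligned buffer accesses).
/tmp/build/bin/libc-benchmark-main \
--study-name="my memcpy study" \
--function=memcpy \
--output=/tmp/report.json
```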
## Benchmarking targets
The benchmarking process occurs in two steps:

1. Benchmark the functions and produce a `json` file
2. Display (or render) the `json` file

Targets are of the form `<action>-libc-<function>-benchmark-<configuration>`:

 - `action` is one of:
   - `run`, runs the benchmark and writes the `json` file
   - `display`, displays the graph on screen
   - `render`, renders the graph on disk as a `png` file
 - `function` is one of: `memcpy`, `memcmp`, `memset`
 - `configuration` is one of: `small`, `big`

The `display` target will attempt to open a window on the machine where you're
running the benchmark. If this may not work for you, you may want `render` or
`run` instead, as detailed below.

### Stochastic mode

This is the preferred mode to use. The function parameters are randomized and
the branch predictor is less likely to kick in.

```shell
/tmp/build/bin/libc-benchmark-main \
--study-name="new memcpy" \
--function=memcpy \
--size-distribution-name="memcpy Google A" \
--num-trials=30 \
--output=/tmp/benchmark_result.json
```

The `--size-distribution-name` flag is mandatory and points to one of the
[predefined distributions](libc/benchmarks/MemorySizeDistributions.h).
> Note: These distributions are gathered from several important binaries at
> Google (servers, databases, realtime and batch jobs) and reflect the
> importance of focusing on small sizes.

### Benchmarking regimes

Using a profiler to observe size distributions for calls into libc functions,
it was found that most operations act on a small number of bytes.
The `big` configuration:

 - Exercises sizes up to `32MiB` to test large operations
 - Caching effects can show up here, which prevents comparing different hosts

_For `memcmp`, the size refers to the size of the buffers to compare, not the
number of bytes until the first difference._
### Sweep mode

This mode is used to measure call latency per size over a certain range of
sizes. Because it exercises the same size over and over again, the branch
predictor can kick in. It can still be useful to compare the strengths and
weaknesses of particular implementations.
```shell
/tmp/build/bin/libc-benchmark-main \
--study-name="new memcpy" \
--function=memcpy \
--sweep-mode \
--sweep-max-size=128 \
--output=/tmp/benchmark_result.json
```
## Analysis tool
It is possible to **merge** several `json` files into a single graph. This is
useful to **compare** implementations.
### Setup
Make sure to have `matplotlib`, `pandas` and `seaborn` set up correctly:
```shell
apt-get install python3-pip
pip3 install matplotlib pandas seaborn
```
You may need the `python3-gtk` package or a similar one to display the graphs.
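To quickly check that the plotting stack is importable (a simple sanity check,
not a step from the upstream instructions):

```shell
python3 -c "import matplotlib, pandas, seaborn; print('plotting stack OK')"
```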
### Usage

In the following example we superpose the curves for `memcpy`, `memset` and
`memcmp`:
```shell
make -C /tmp/build run-libc-memcpy-benchmark-small run-libc-memcmp-benchmark-small run-libc-memset-benchmark-small
```
parser.add_argument("--mode", choices=["time", "cycles", "bytespercycle"], default="time", help="Use to display either 'time', 'cycles' or 'bytes/cycle'.")
parser.add_argument("files", nargs="+", help="The json files to read from.")