bench

Tiny benchmarking library for Zig.

Features

  • CPU Counters: Measures CPU cycles, instructions, IPC, and cache misses directly from the kernel (Linux only).
  • Argument Support: Pass pre-calculated data to your functions to separate setup overhead from the benchmark loop.
  • Baseline Comparison: Easily compare multiple implementations against a reference function to see relative speedups or regressions.
  • Flexible Reporting: Access raw metric data programmatically to generate custom reports (JSON, CSV) or assert performance limits in CI.
  • Easy Throughput Metrics: Automatically calculates operations per second and data throughput (MB/s, GB/s) when payload size is provided.
  • Robust Statistics: Uses median and standard deviation to provide reliable metrics despite system noise.
  • Zero Dependencies: Implemented in pure Zig using only the standard library.

Installation

Fetch the latest version:

zig fetch --save=bench https://github.com/pyk/bench/archive/main.tar.gz

Then wire the bench module into your build.zig (a sketch follows the snippet below).

If you are using it only for tests/benchmarks, it is recommended to mark the dependency as lazy in build.zig.zon:

.dependencies = .{
    .bench = .{
        .url = "...",
        .hash = "...",
        .lazy = true, // here
    },
}

Usage

Basic Run

To benchmark a single function, pass the allocator, a name, and the function pointer to run.

const res = try bench.run(allocator, "My Function", myFn, .{});
try bench.report(.{ .metrics = &.{res} });
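
For context, a self-contained sketch of the same call inside a test block, assuming the module is imported as bench and that run accepts a plain zero-argument function:

const std = @import("std");
const bench = @import("bench");

fn myFn() void {
    // The work under measurement.
    var sum: u64 = 0;
    for (0..1_000) |i| sum += i;
    std.mem.doNotOptimizeAway(sum);
}

test "bench myFn" {
    const allocator = std.testing.allocator;
    const res = try bench.run(allocator, "My Function", myFn, .{});
    try bench.report(.{ .metrics = &.{res} });
}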

Run with Arguments

You can generate test data before the benchmark starts and pass it via a tuple. This ensures the setup cost doesn't pollute your measurements.

// Setup data outside the benchmark
const input = try generateLargeString(allocator, 10_000);

// Pass input as a tuple
const res = try bench.run(allocator, "Parser", parseFn, .{input}, .{});
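
The two helpers above are not part of the library; hypothetical definitions matching that call could look like this:

fn generateLargeString(allocator: std.mem.Allocator, len: usize) ![]u8 {
    const buf = try allocator.alloc(u8, len);
    @memset(buf, 'a');
    return buf;
}

fn parseFn(input: []const u8) void {
    // Trivial "parser": count a byte so the loop has observable work.
    var count: usize = 0;
    for (input) |c| {
        if (c == 'a') count += 1;
    }
    std.mem.doNotOptimizeAway(count);
}

Remember to free the generated input after the benchmark (e.g. defer allocator.free(input)).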

Comparing Implementations

You can run multiple benchmarks and compare them against a baseline. The baseline_index determines which result is used as the reference (1.00x).

const a = try bench.run(allocator, "Implementation A", implA, .{});
const b = try bench.run(allocator, "Implementation B", implB, .{});

try bench.report(.{
    .metrics = &.{ a, b },
    // Use the first metric (Implementation A) as the baseline
    .baseline_index = 0,
});

Measuring Throughput

If your function processes data (like copying memory or parsing strings), provide bytes_per_op to get throughput metrics (MB/s or GB/s).

const size = 1024 * 1024;
const res = try bench.run(allocator, "Memcpy 1MB", copyFn, .{
    .bytes_per_op = size,
});

// Report will now show GB/s instead of just Ops/s
try bench.report(.{ .metrics = &.{res} });
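
copyFn is not defined above; a hypothetical version could keep the buffers at container scope so that only the copy itself is measured:

var src = [_]u8{0xAA} ** (1024 * 1024);
var dst = [_]u8{0} ** (1024 * 1024);

fn copyFn() void {
    @memcpy(&dst, &src);
    std.mem.doNotOptimizeAway(&dst);
}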

Configuration

You can tune the benchmark behavior by modifying the Options struct.

const res = try bench.run(allocator, "Heavy Task", heavyFn, .{
    .warmup_iters = 10,     // Default: 100
    .sample_size = 50,      // Default: 1000
});

Built-in Reporter

The default bench.report prints a human-readable table to stdout. It handles units (ns, us, ms, s) and coloring automatically.

Benchmark Summary: 3 benchmarks run
├─ NoOp        60ns      16.80M/s   [baseline]
│  └─ cycles: 14        instructions: 36        ipc: 2.51       miss: 0
├─ Sleep     1.06ms         944/s   17648.20x slower
│  └─ cycles: 4.1k      instructions: 2.9k      ipc: 0.72       miss: 17
└─ Busy     32.38us      30.78K/s   539.68x slower
   └─ cycles: 150.1k    instructions: 700.1k    ipc: 4.67       miss: 0
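
The slowdown column is each benchmark's reported time divided by the baseline's (see Speedup (x) in the table below): Sleep's 17648.20x corresponds to its roughly 1.06 ms median against the roughly 60 ns baseline. The printed values are rounded, so the displayed numbers will not divide out exactly.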

Custom Reporter

The run function returns a Metrics struct containing all raw data (min, max, median, variance, cycles, etc.). You can use this to generate JSON, CSV, or assert performance limits in CI.

const metrics = try bench.run(allocator, "MyFn", myFn, .{});

// Access raw fields directly
std.debug.print("Median: {d}ns, Max: {d}ns\n", .{
    metrics.median_ns,
    metrics.max_ns
});
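
For example, building on the metrics value above, a minimal CI guard could fail the build when the median exceeds a budget; the 500 µs threshold here is purely illustrative:

// Fail the test/build if the median exceeds the budget (500 µs is an arbitrary example).
if (metrics.median_ns > 500 * std.time.ns_per_us) {
    std.debug.print("regression: median {d}ns over budget\n", .{metrics.median_ns});
    return error.PerformanceRegression;
}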

Supported Metrics

This tiny benchmarking library supports (✅) the following metrics:

Category    | Metric                       | Description
Time        | ✅ Mean / Average            | Arithmetic average of all runs
Time        | ✅ Median                    | The middle value (less sensitive to outliers)
Time        | ✅ Min / Max                 | The absolute fastest and slowest runs
Time        | CPU vs Wall Time             | CPU time (active processing) vs wall time (real world)
Throughput  | ✅ Ops/sec                   | Operations per second
Throughput  | ✅ Bytes/sec                 | Data throughput (MB/s, GB/s)
Throughput  | Items/sec                    | Discrete items processed per second
Latency     | Percentiles                  | p75, p99, p99.9 (e.g. "99% of requests were faster than X")
Latency     | ✅ Std Dev / Variance        | How much the results deviate from the average
Latency     | Outliers                     | Detecting and reporting anomalous runs
Latency     | Confidence / Margin of Error | e.g. "± 2.5%"
Latency     | Histogram                    | Visual distribution of all runs
Memory      | Bytes Allocated              | Total heap memory requested per iteration
Memory      | Allocation Count             | Number of allocation calls
CPU         | ✅ Cycles                    | CPU clock cycles used
CPU         | ✅ Instructions              | Total CPU instructions executed
CPU         | ✅ IPC                       | Instructions per cycle (efficiency)
CPU         | ✅ Cache Misses              | L1/L2 cache misses
Comparative | ✅ Speedup (x)               | "12.5x faster" (Current / Baseline)
Comparative | Relative Diff (%)            | "+ 50%" or "- 10%"
Comparative | Big O                        | Complexity analysis (O(n), O(log n))
Comparative | R² (Goodness of Fit)         | How well the data fits a linear model

Other metrics will be added as needed. Feel free to send a pull request.

Notes

  • This library is designed to show you "what", not "why". To answer "why", I recommend a proper profiling tool such as perf on Linux together with the Firefox Profiler.

  • doNotOptimizeAway is your friend. For example, if you are benchmarking a scanner/tokenizer:

      while (true) {
          const token = try scanner.next();
          if (token == .end) break;
          total_ops += 1;
          std.mem.doNotOptimizeAway(token); // CRITICAL
      }
  • To get the cycles, instructions, ipc (instructions per cycle), and cache_misses metrics on Linux, you may need to lower the kernel.perf_event_paranoid sysctl so that hardware performance counters are readable (see Development below).

Prior Art

Development

Install the Zig toolchain via mise (optional):

mise trust
mise install

Run tests:

zig build test --summary all

Build library:

zig build

Restrict or allow access to CPU performance counters (kernel.perf_event_paranoid) for debugging:

# Restrict: unprivileged processes cannot read perf counters
sudo sysctl -w kernel.perf_event_paranoid=3

# Allow: all events available, including to unprivileged processes
sudo sysctl -w kernel.perf_event_paranoid=-1

License

MIT. Use it for whatever.
