feat: built-in benchmark#78

Merged
kacy merged 1 commit into main from feat/cli-benchmark
Feb 10, 2026

Conversation

kacy (Owner) commented Feb 10, 2026

summary

adds an `ember-cli benchmark` subcommand with pipelining support, concurrent clients, and latency percentile reporting. depends on #77 (cluster subcommands), which should be merged first.

usage:

ember-cli benchmark -n 100000 -c 50 -P 16 -d 64 -t set,get

options:

  • -n requests (default: 100,000)
  • -c concurrent clients (default: 50)
  • -P pipeline depth (default: 1)
  • -d value size in bytes (default: 64)
  • -t workloads to run: set, get, ping (default: set,get)
  • --keyspace unique keys (default: 100,000)
  • -q quiet mode
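The defaults above map to something like the following struct (a sketch only; the real `BenchmarkArgs` is covered by the unit tests below, and the field names here are assumptions):

```rust
// Sketch of the benchmark defaults listed above. Field names are
// assumptions; the real struct is parsed from the CLI flags.
#[derive(Debug, PartialEq)]
struct BenchmarkArgs {
    requests: u64,    // -n
    clients: u32,     // -c
    pipeline: u32,    // -P
    data_size: usize, // -d, value size in bytes
    keyspace: u64,    // --keyspace
}

impl Default for BenchmarkArgs {
    fn default() -> Self {
        Self {
            requests: 100_000,
            clients: 50,
            pipeline: 1,
            data_size: 64,
            keyspace: 100_000,
        }
    }
}
```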

output:

=== ember benchmark ===
server:     127.0.0.1:6379
requests:   100,000
clients:    50
pipeline:   16
data size:  64 bytes

SET: 523,809 rps    p50: 120us  p99: 410us  p99.9: 1.23ms  max: 4.56ms
GET: 612,345 rps    p50: 100us  p99: 380us  p99.9: 1.01ms  max: 3.21ms

new files:

  • bench_conn.rs — lightweight pipelined TCP connection that pre-serializes a command once and writes it N times per batch
  • benchmark.rs — workload runner with barrier-synchronized clients, per-batch latency collection, and percentile computation
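The pre-serialization trick in bench_conn.rs can be sketched as: encode the RESP frame for a command once, then repeat those bytes per pipeline batch. A simplified sketch (the real code uses `Bytes` and a buffered TCP stream; these helper names are illustrative):

```rust
// Encode one command as a RESP array frame:
// "*<argc>\r\n" followed by "$<len>\r\n<arg>\r\n" per argument.
fn encode_resp(args: &[&[u8]]) -> Vec<u8> {
    let mut buf = Vec::new();
    buf.extend_from_slice(format!("*{}\r\n", args.len()).as_bytes());
    for a in args {
        buf.extend_from_slice(format!("${}\r\n", a.len()).as_bytes());
        buf.extend_from_slice(a);
        buf.extend_from_slice(b"\r\n");
    }
    buf
}

// Serialize once, then write the same bytes N times per pipeline batch,
// avoiding per-request re-serialization overhead.
fn pipeline_batch(frame: &[u8], depth: usize) -> Vec<u8> {
    let mut batch = Vec::with_capacity(frame.len() * depth);
    for _ in 0..depth {
        batch.extend_from_slice(frame);
    }
    batch
}
```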

what was tested

  • 14 unit tests covering: BenchmarkArgs defaults, percentile computation (empty/single/multiple), format helpers, workload frame generation (correct RESP3 structure), bench connection command building
  • all 80 CLI tests pass (cargo test -p emberkv-cli)
  • full workspace: cargo clippy --workspace -- -D warnings, cargo fmt --all --check, cargo test --workspace
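The percentile cases covered above (empty/single/multiple) can be sketched as nearest-rank percentile over a sorted sample — an assumption about the exact method; the real implementation lives in benchmark.rs:

```rust
// Nearest-rank percentile over latency samples (e.g. microseconds).
// Empty input yields None; a single sample is every percentile.
fn percentile(samples: &mut Vec<u64>, p: f64) -> Option<u64> {
    if samples.is_empty() {
        return None;
    }
    samples.sort_unstable();
    // rank = ceil(p/100 * n), clamped into [1, n], then 0-indexed
    let rank = ((p / 100.0) * samples.len() as f64).ceil() as usize;
    Some(samples[rank.saturating_sub(1).min(samples.len() - 1)])
}
```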

design considerations

  • each benchmark client gets its own BenchConnection with a 256 KiB read buffer for throughput
  • commands are pre-serialized into Bytes once, then written N times per pipeline batch to avoid re-serialization overhead
  • tokio::sync::Barrier synchronizes client start to minimize warmup skew
  • uses StdRng::from_os_rng() instead of ThreadRng for Send compatibility with tokio::spawn
  • GET workload pre-populates keys via a silent SET pass so reads hit actual data
  • latency is measured per-batch then divided by pipeline size for per-command estimates
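The barrier-synchronized start can be illustrated with the std-thread analogue of `tokio::sync::Barrier` (the real runner is async; this only shows the release-together pattern):

```rust
use std::sync::{Arc, Barrier};
use std::thread;

// Spawn n "clients" that all block at the barrier and are released at
// once, so no client starts its measured work while others are still
// connecting (minimizes warmup skew in the first batches).
fn run_clients(n: usize) -> Vec<usize> {
    let barrier = Arc::new(Barrier::new(n));
    let handles: Vec<_> = (0..n)
        .map(|id| {
            let b = Arc::clone(&barrier);
            thread::spawn(move || {
                b.wait(); // all clients released together
                id // stand-in for the client's workload loop
            })
        })
        .collect();
    handles.into_iter().map(|h| h.join().unwrap()).collect()
}
```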

adds `ember-cli benchmark` with pipelining support, concurrent
clients, and latency percentile reporting. workloads: SET, GET,
PING with configurable request count, client count, pipeline depth,
data size, and keyspace.

bench_conn provides a lightweight pipelined TCP connection that
pre-serializes commands once and writes them N times per batch.
GET workload pre-populates keys before measuring reads.

output shows ops/sec and p50/p99/p99.9/max latency per workload.
kacy force-pushed the feat/cli-benchmark branch from 9fabf6a to 5acac69 on February 10, 2026 02:32
kacy merged commit 4fa8eb2 into main on Feb 10, 2026 (7 checks passed)
kacy deleted the feat/cli-benchmark branch on February 10, 2026 02:32
kacy added a commit that referenced this pull request Feb 11, 2026