Fair, reproducible benchmarks for LLM proxy gateways using VidaiMock as a consistent backend.
⚠️ Note: K6 is run with--out json. The JSON files generated can be as large as 3.5GB each on certain tests. Ensure you have enough disk space!
# 1️⃣ Setup
./scripts/setup.sh
# 2️⃣ Configure environment (With VidaiServer Binaries)
# cp configs/env/enterprise.env.template configs/env/enterprise.env
# cp configs/env/admin.env.template configs/env/admin.env
# source configs/env/enterprise.env && source configs/env/admin.env
# 3️⃣ Start services
./scripts/start_services.sh mac competitors # or 'linux' on Linux
# 4️⃣ Validate VidaiMock configuration
./scripts/validate_vidaimock.sh all
# 5️⃣ Run benchmarks
./run_benchmark.sh all # Full detailed (~2 hrs)
./run_benchmark.sh sanity # Quick validation (~5 min)
./run_benchmark.sh throughput # Full throughput test (~60 min)| Mode | Duration | Purpose |
|---|---|---|
sanity |
~5 min | Quick smoke test - validates all dimensions |
throughput |
~60 min | Latency & errors at each RPS level (200-12K) |
chaos |
~30 min | Resilience under error injection |
quick |
~15 min | Sanity + limited throughput |
all |
~2 hrs | Complete benchmark suite |
| Target | Description |
|---|---|
| Baseline | VidaiMock direct (no proxy) |
| VidaiServer L1 | SQLite, routing only |
| VidaiServer L2 | + Guardrails |
| VidaiServer L3 | + PostgreSQL telemetry |
| Bifrost | Go-based proxy |
| LiteLLM | Python proxy |
| Portkey | Node.js proxy |
VidaiMock supports per-request configuration via HTTP headers - no restart needed:
| Header | Example | Description |
|---|---|---|
X-Response-Size |
small, medium, large |
Response size (~1KB, ~11KB, ~42KB) |
X-Vidai-Latency |
200 |
Inject latency (ms) |
X-Vidai-Jitter |
0.2 |
Latency jitter (0-1) |
X-Vidai-Chaos-Drop |
10 |
Error injection (0-100%) |
This enables mixed-traffic tests without profile switching.
| Metric | Description |
|---|---|
| Max Stable RPS | Highest RPS with <5% errors |
| Proxy Overhead | Latency added vs baseline |
| Error Amplification | Gateway errors vs backend errors |
| Resilience Grade | How gateway handles backend failures |
Results saved to results/<timestamp>/:
sanity_*.json- Quick validation resultsthroughput_*.json- Per-step metrics and breaking pointschaos_*.json- Resilience analysis with gradescharts/- PNG visualizationscharts/SUMMARY.md- Markdown report
real_benchmark_v2/
├── run_benchmark.sh # Main entry point
├── configs/
│ ├── vidaimock/ # Mock templates
│ │ └── templates/openai/ # Response templates
│ ├── vidai/ # VidaiServer layer configs
│ ├── bifrost/ # Bifrost config (caching disabled)
│ └── litellm/ # LiteLLM config
├── k6/
│ ├── lib/ # Shared config & request helpers
│ ├── scenarios/
│ │ ├── sanity.js # Quick validation
│ │ ├── throughput/ # Stepwise RPS tests
│ │ └── chaos/ # Resilience tests
│ └── deprecated/ # Old scenarios
├── scripts/
│ ├── validate_vidaimock.sh # VidaiMock validation
│ └── analyze_results.py # Chart generation
├── docs/
│ └── VIDAIMOCK_API_REFERENCE.md # VidaiMock curl examples
├── bin/ # Binaries (VidaiServer, VidaiMock)
└── results/ # Output (timestamped)
- Docker
- k6 (
brew install k6orapt install k6) - Python 3 with matplotlib (for charts)
- Free ports: 3000, 4000, 5433, 8080, 8100, 8787
VIDAI_LICENSE_KEYandVIDAI_ADMIN_KEYenvironment variables
| Variable | Default | Description |
|---|---|---|
MAX_RPS |
12000 | Max RPS for throughput tests |
CHAOS_RATE |
10 | Error injection % for chaos tests |
CHAOS_RPS |
500 | RPS for chaos tests |
CHAOS_DURATION |
60s | Duration for chaos tests |
# Check services
curl -s http://localhost:8100/health # VidaiMock
curl -s http://localhost:3000/health # VidaiServer
# Validate VidaiMock features
./scripts/validate_vidaimock.sh all
# Clean restart
./scripts/stop_services.sh
docker compose down -v
./scripts/start_services.sh mac competitorsChaos tests distinguish expected errors from unexpected errors:
- Expected: HTTP 500 with "Simulated" in body (chaos injection)
- Unexpected: Any other error (real problem with the gateway)
This ensures accurate resilience grading.
- METHODOLOGY.md - Detailed test methodology
- docs/VIDAIMOCK_API_REFERENCE.md - VidaiMock curl examples