Skip to content

vidaiUK/Benchmarking

Repository files navigation

Benchmark Badge Version Badge

🚀 VidaiServer Benchmark Suite v2

Fair, reproducible benchmarks for LLM proxy gateways using VidaiMock as a consistent backend.

⚠️ Note: K6 is run with --out json. The JSON files generated can be as large as 3.5GB each on certain tests. Ensure you have enough disk space!


🏁 Quick Start

# 1️⃣ Setup
./scripts/setup.sh

# 2️⃣ Configure environment (With VidaiServer Binaries)
# cp configs/env/enterprise.env.template configs/env/enterprise.env
# cp configs/env/admin.env.template configs/env/admin.env
# source configs/env/enterprise.env && source configs/env/admin.env

# 3️⃣ Start services
./scripts/start_services.sh mac competitors   # or 'linux' on Linux

# 4️⃣ Validate VidaiMock configuration
./scripts/validate_vidaimock.sh all

# 5️⃣ Run benchmarks
./run_benchmark.sh all         # Full detailed (~2 hrs)
./run_benchmark.sh sanity      # Quick validation (~5 min)
./run_benchmark.sh throughput  # Full throughput test (~60 min)

🧪 Benchmark Modes

Mode Duration Purpose
sanity ~5 min Quick smoke test - validates all dimensions
throughput ~60 min Latency & errors at each RPS level (200-12K)
chaos ~30 min Resilience under error injection
quick ~15 min Sanity + limited throughput
all ~2 hrs Complete benchmark suite

🎯 Targets Tested

Target Description
Baseline VidaiMock direct (no proxy)
VidaiServer L1 SQLite, routing only
VidaiServer L2 + Guardrails
VidaiServer L3 + PostgreSQL telemetry
Bifrost Go-based proxy
LiteLLM Python proxy
Portkey Node.js proxy

🛠️ VidaiMock Per-Request Control

VidaiMock supports per-request configuration via HTTP headers - no restart needed:

Header Example Description
X-Response-Size small, medium, large Response size (~1KB, ~11KB, ~42KB)
X-Vidai-Latency 200 Inject latency (ms)
X-Vidai-Jitter 0.2 Latency jitter (0-1)
X-Vidai-Chaos-Drop 10 Error injection (0-100%)

This enables mixed-traffic tests without profile switching.


📊 Key Metrics

Metric Description
Max Stable RPS Highest RPS with <5% errors
Proxy Overhead Latency added vs baseline
Error Amplification Gateway errors vs backend errors
Resilience Grade How gateway handles backend failures

📁 Results

Results saved to results/<timestamp>/:

  • sanity_*.json - Quick validation results
  • throughput_*.json - Per-step metrics and breaking points
  • chaos_*.json - Resilience analysis with grades
  • charts/ - PNG visualizations
  • charts/SUMMARY.md - Markdown report

🗂️ Directory Structure

real_benchmark_v2/
├── run_benchmark.sh           # Main entry point
├── configs/
│   ├── vidaimock/             # Mock templates
│   │   └── templates/openai/  # Response templates
│   ├── vidai/                 # VidaiServer layer configs
│   ├── bifrost/               # Bifrost config (caching disabled)
│   └── litellm/               # LiteLLM config
├── k6/
│   ├── lib/                   # Shared config & request helpers
│   ├── scenarios/
│   │   ├── sanity.js          # Quick validation
│   │   ├── throughput/        # Stepwise RPS tests
│   │   └── chaos/             # Resilience tests
│   └── deprecated/            # Old scenarios
├── scripts/
│   ├── validate_vidaimock.sh  # VidaiMock validation
│   └── analyze_results.py     # Chart generation
├── docs/
│   └── VIDAIMOCK_API_REFERENCE.md  # VidaiMock curl examples
├── bin/                       # Binaries (VidaiServer, VidaiMock)
└── results/                   # Output (timestamped)

⚙️ Prerequisites

  • Docker
  • k6 (brew install k6 or apt install k6)
  • Python 3 with matplotlib (for charts)
  • Free ports: 3000, 4000, 5433, 8080, 8100, 8787
  • VIDAI_LICENSE_KEY and VIDAI_ADMIN_KEY environment variables

🌱 Environment Variables

Variable Default Description
MAX_RPS 12000 Max RPS for throughput tests
CHAOS_RATE 10 Error injection % for chaos tests
CHAOS_RPS 500 RPS for chaos tests
CHAOS_DURATION 60s Duration for chaos tests

🩺 Troubleshooting

# Check services
curl -s http://localhost:8100/health   # VidaiMock
curl -s http://localhost:3000/health   # VidaiServer

# Validate VidaiMock features
./scripts/validate_vidaimock.sh all

# Clean restart
./scripts/stop_services.sh
docker compose down -v
./scripts/start_services.sh mac competitors

🚦 Error Classification

Chaos tests distinguish expected errors from unexpected errors:

  • Expected: HTTP 500 with "Simulated" in body (chaos injection)
  • Unexpected: Any other error (real problem with the gateway)

This ensures accurate resilience grading.


📚 See Also

About

Benchmarking LLM gateways: Vidai(Rust) vs Bifrost (Go) vs Litellm (Python) vs Portkey (NodeJS)

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors