🚀 VidaiServer Benchmark Suite v2

Fair, reproducible benchmarks for LLM proxy gateways using VidaiMock as a consistent backend.

⚠️ Note: k6 is run with --out json. The resulting JSON files can reach 3.5 GB each on some tests, so make sure you have enough free disk space before starting.

🏁 Quick Start

# 1️⃣ Setup
./scripts/setup.sh

# 2️⃣ Configure environment (With VidaiServer Binaries)
# cp configs/env/enterprise.env.template configs/env/enterprise.env
# cp configs/env/admin.env.template configs/env/admin.env
# source configs/env/enterprise.env && source configs/env/admin.env

# 3️⃣ Start services
./scripts/start_services.sh mac competitors   # or 'linux' on Linux

# 4️⃣ Validate VidaiMock configuration
./scripts/validate_vidaimock.sh all

# 5️⃣ Run benchmarks
./run_benchmark.sh all         # Full detailed (~2 hrs)
./run_benchmark.sh sanity      # Quick validation (~5 min)
./run_benchmark.sh throughput  # Full throughput test (~60 min)

🧪 Benchmark Modes

Mode	Duration	Purpose
`sanity`	~5 min	Quick smoke test - validates all dimensions
`throughput`	~60 min	Latency and errors at each RPS level (200 to 12,000 RPS)
`chaos`	~30 min	Resilience under error injection
`quick`	~15 min	Sanity + limited throughput
`all`	~2 hrs	Complete benchmark suite

🎯 Targets Tested

Target	Description
Baseline	VidaiMock direct (no proxy)
VidaiServer L1	SQLite, routing only
VidaiServer L2	+ Guardrails
VidaiServer L3	+ PostgreSQL telemetry
Bifrost	Go-based proxy
LiteLLM	Python proxy
Portkey	Node.js proxy

🛠️ VidaiMock Per-Request Control

VidaiMock supports per-request configuration via HTTP headers, so no restart is needed between test variations:

Header	Example	Description
`X-Response-Size`	`small`, `medium`, `large`	Response size (~1KB, ~11KB, ~42KB)
`X-Vidai-Latency`	`200`	Inject latency (ms)
`X-Vidai-Jitter`	`0.2`	Latency jitter (0-1)
`X-Vidai-Chaos-Drop`	`10`	Error injection (0-100%)

This enables mixed-traffic tests without profile switching.
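As a rough illustration of the header table above, the following Python sketch builds the control headers for a single request. Only the four header names and their value ranges come from this README; the function name `vidaimock_headers` and its parameters are hypothetical, and how the headers are attached to a request depends on your HTTP client.

```python
from typing import Dict, Optional

# Allowed values for X-Response-Size, as listed in the header table.
RESPONSE_SIZES = {"small", "medium", "large"}


def vidaimock_headers(
    size: Optional[str] = None,
    latency_ms: Optional[int] = None,
    jitter: Optional[float] = None,
    chaos_drop_pct: Optional[float] = None,
) -> Dict[str, str]:
    """Build per-request control headers, skipping any option left unset."""
    headers: Dict[str, str] = {}
    if size is not None:
        if size not in RESPONSE_SIZES:
            raise ValueError(f"size must be one of {sorted(RESPONSE_SIZES)}")
        headers["X-Response-Size"] = size
    if latency_ms is not None:
        if latency_ms < 0:
            raise ValueError("latency_ms must be non-negative")
        headers["X-Vidai-Latency"] = str(latency_ms)
    if jitter is not None:
        if not 0 <= jitter <= 1:
            raise ValueError("jitter must be between 0 and 1")
        headers["X-Vidai-Jitter"] = str(jitter)
    if chaos_drop_pct is not None:
        if not 0 <= chaos_drop_pct <= 100:
            raise ValueError("chaos_drop_pct must be between 0 and 100")
        headers["X-Vidai-Chaos-Drop"] = str(chaos_drop_pct)
    return headers


# Two differently shaped requests in the same run, with no profile switch.
fast_small = vidaimock_headers(size="small")
slow_flaky = vidaimock_headers(size="large", latency_ms=200, jitter=0.2, chaos_drop_pct=10)
print(fast_small)
print(slow_flaky)
```

Validating the ranges client-side is a design choice of this sketch, not documented VidaiMock behavior; see docs/VIDAIMOCK_API_REFERENCE.md for the authoritative curl examples.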

📊 Key Metrics

Metric	Description
Max Stable RPS	Highest RPS sustained with an error rate below 5%
Proxy Overhead	Latency added vs baseline
Error Amplification	Gateway errors vs backend errors
Resilience Grade	How gateway handles backend failures
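To make the first two metrics concrete, here is a minimal Python sketch of how they could be derived from per-step results. It is not the suite's actual implementation (that lives in scripts/analyze_results.py). The record fields (`rps`, `requests`, `errors`), the function names, and the sample numbers are all invented for illustration, and the sketch assumes "stable" means every lower step also stayed under the threshold.

```python
from typing import Dict, List, Optional

ERROR_THRESHOLD = 0.05  # the "<5% errors" rule from the metrics table


def max_stable_rps(steps: List[Dict[str, float]]) -> Optional[float]:
    """Return the highest RPS step reached before any step breaches the threshold.

    Assumption: the walk stops at the first failing step, so a later step
    that happens to pass does not count as stable.
    """
    best = None
    for step in sorted(steps, key=lambda s: s["rps"]):
        # A step with no completed requests is treated as fully failed.
        error_rate = step["errors"] / step["requests"] if step["requests"] else 1.0
        if error_rate >= ERROR_THRESHOLD:
            break
        best = step["rps"]
    return best


def proxy_overhead_ms(gateway_latency_ms: float, baseline_latency_ms: float) -> float:
    """Latency the gateway adds on top of the direct VidaiMock baseline."""
    return gateway_latency_ms - baseline_latency_ms


# Made-up sample data, not real benchmark results.
sample_steps = [
    {"rps": 200, "requests": 12000, "errors": 0},
    {"rps": 1000, "requests": 60000, "errors": 600},     # 1% errors
    {"rps": 4000, "requests": 240000, "errors": 24000},  # 10% errors, breaking point
    {"rps": 8000, "requests": 480000, "errors": 0},
]
print(max_stable_rps(sample_steps))
print(proxy_overhead_ms(12.5, 10.0))
```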

📁 Results

Results are saved to results/<timestamp>/:

sanity_*.json - Quick validation results
throughput_*.json - Per-step metrics and breaking points
chaos_*.json - Resilience analysis with grades
charts/ - PNG visualizations
charts/SUMMARY.md - Markdown report
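For quickly locating the output of the most recent run, a small Python sketch like the one below may help. It assumes the timestamped directory names sort chronologically when sorted as strings (for example a YYYYMMDD_HHMMSS pattern); the helper names `latest_run` and `list_outputs` are hypothetical and not part of the suite.

```python
from pathlib import Path
from typing import Dict, List, Optional

# File patterns taken from the results list above.
PATTERNS = ("sanity_*.json", "throughput_*.json", "chaos_*.json")


def latest_run(results_root: Path) -> Optional[Path]:
    """Return the newest run directory, assuming names sort chronologically."""
    runs = sorted(p for p in results_root.iterdir() if p.is_dir())
    return runs[-1] if runs else None


def list_outputs(run_dir: Path) -> Dict[str, List[str]]:
    """Group the JSON result files in a run directory by pattern."""
    return {
        pattern: sorted(p.name for p in run_dir.glob(pattern))
        for pattern in PATTERNS
    }


if __name__ == "__main__":
    root = Path("results")
    run = latest_run(root) if root.is_dir() else None
    print(list_outputs(run) if run else "no runs found")
```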

🗂️ Directory Structure

real_benchmark_v2/
├── run_benchmark.sh           # Main entry point
├── configs/
│   ├── vidaimock/             # Mock templates
│   │   └── templates/openai/  # Response templates
│   ├── vidai/                 # VidaiServer layer configs
│   ├── bifrost/               # Bifrost config (caching disabled)
│   └── litellm/               # LiteLLM config
├── k6/
│   ├── lib/                   # Shared config & request helpers
│   ├── scenarios/
│   │   ├── sanity.js          # Quick validation
│   │   ├── throughput/        # Stepwise RPS tests
│   │   └── chaos/             # Resilience tests
│   └── deprecated/            # Old scenarios
├── scripts/
│   ├── validate_vidaimock.sh  # VidaiMock validation
│   └── analyze_results.py     # Chart generation
├── docs/
│   └── VIDAIMOCK_API_REFERENCE.md  # VidaiMock curl examples
├── bin/                       # Binaries (VidaiServer, VidaiMock)
└── results/                   # Output (timestamped)

⚙️ Prerequisites

Docker
k6 (brew install k6 on macOS; apt install k6 on Debian/Ubuntu, after adding the k6 package repository)
Python 3 with matplotlib (for charts)
Free ports: 3000, 4000, 5433, 8080, 8100, 8787
VIDAI_LICENSE_KEY and VIDAI_ADMIN_KEY environment variables
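The free-port requirement above can be checked before starting the services. The sketch below is one possible pre-flight check in Python, not part of the suite: it tries to bind each port on the loopback interface and reports the ones that fail. The function name `busy_ports` is hypothetical, and a service bound only to a different network interface may not be detected.

```python
import socket
from typing import Iterable, List

# Ports listed in the prerequisites.
REQUIRED_PORTS = (3000, 4000, 5433, 8080, 8100, 8787)


def busy_ports(ports: Iterable[int], host: str = "127.0.0.1") -> List[int]:
    """Return the ports that cannot be bound on `host`, i.e. are already in use."""
    busy = []
    for port in ports:
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            try:
                sock.bind((host, port))
            except OSError:
                # Binding failed, most likely because something holds the port.
                busy.append(port)
    return busy


if __name__ == "__main__":
    taken = busy_ports(REQUIRED_PORTS)
    print("all required ports are free" if not taken else f"ports in use: {taken}")
```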

🌱 Environment Variables

Variable	Default	Description
`MAX_RPS`	12000	Max RPS for throughput tests
`CHAOS_RATE`	10	Error injection % for chaos tests
`CHAOS_RPS`	500	RPS for chaos tests
`CHAOS_DURATION`	60s	Duration for chaos tests
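The defaults in the table can be mirrored in tooling that wraps the benchmark. The following Python sketch shows one way to resolve the four variables with those defaults; it is an assumption about how a wrapper might read them, not a description of how run_benchmark.sh or the k6 scripts actually consume them. The name `load_settings` and the choice of numeric types are hypothetical.

```python
import os
from typing import Dict, Mapping, Union

# Defaults copied from the environment variable table.
DEFAULTS = {
    "MAX_RPS": "12000",
    "CHAOS_RATE": "10",
    "CHAOS_RPS": "500",
    "CHAOS_DURATION": "60s",
}


def load_settings(env: Mapping[str, str] = os.environ) -> Dict[str, Union[int, float, str]]:
    """Resolve benchmark settings, falling back to the documented defaults."""
    raw = {name: env.get(name, default) for name, default in DEFAULTS.items()}
    settings: Dict[str, Union[int, float, str]] = {
        "MAX_RPS": int(raw["MAX_RPS"]),
        "CHAOS_RATE": float(raw["CHAOS_RATE"]),   # percentage, 0-100
        "CHAOS_RPS": int(raw["CHAOS_RPS"]),
        "CHAOS_DURATION": raw["CHAOS_DURATION"],  # kept as a duration string
    }
    if not 0 <= settings["CHAOS_RATE"] <= 100:
        raise ValueError("CHAOS_RATE must be a percentage between 0 and 100")
    return settings


if __name__ == "__main__":
    print(load_settings())
```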

🩺 Troubleshooting

# Check services
curl -s http://localhost:8100/health   # VidaiMock
curl -s http://localhost:3000/health   # VidaiServer

# Validate VidaiMock features
./scripts/validate_vidaimock.sh all

# Clean restart
./scripts/stop_services.sh
docker compose down -v
./scripts/start_services.sh mac competitors   # use 'linux' instead of 'mac' on Linux

🚦 Error Classification

Chaos tests distinguish expected errors from unexpected errors:

Expected: HTTP 500 with "Simulated" in body (chaos injection)
Unexpected: Any other error (real problem with the gateway)

This ensures accurate resilience grading.
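The classification rule above can be expressed in a few lines. This Python sketch is illustrative only (the suite's own checks live in the k6 scenarios); the function name `classify_response` is hypothetical, and treating status 0 (no response received) and any status of 400 or above as an error is an assumption of this sketch.

```python
def classify_response(status: int, body: str) -> str:
    """Classify a response seen during a chaos test.

    Returns "expected" for injected failures, "unexpected" for any other
    error, and "ok" for everything else.
    """
    # Injected chaos: HTTP 500 whose body contains the "Simulated" marker.
    if status == 500 and "Simulated" in body:
        return "expected"
    # Assumption: status 0 stands for "no response" (e.g. a dropped connection).
    if status == 0 or status >= 400:
        return "unexpected"
    return "ok"


print(classify_response(500, "Simulated failure"))
print(classify_response(502, "Bad Gateway"))
print(classify_response(200, "{}"))
```

Keeping the two error classes separate means that a gateway is only penalized for failures it introduces itself, not for the failures the test deliberately injects.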

📚 See Also

METHODOLOGY.md - Detailed test methodology
docs/VIDAIMOCK_API_REFERENCE.md - VidaiMock curl examples
