Update vLLM-SR RouterArena submission by Xunzhuo · Pull Request #131 · RouteWorks/RouterArena

Xunzhuo · 2026-06-04T06:46:47Z

Summary

Update the vllm-sr RouterArena submission artifacts for the vLLM Semantic Router.

This submission updates:

router_inference/config/vllm-sr.json
router_inference/predictions/vllm-sr.json
router_inference/predictions/vllm-sr-robustness.json

Notes

The router is not trained, fit, or tuned on RouterArena data.
The routing policy is a general vLLM Semantic Router recipe using deterministic signals/projections; it does not encode RouterArena sample IDs, gold answers, or generated-result lookup tables.
Full prediction generated_result fields are populated for all 8,400 regular entries with success=true.
Robustness predictions include 420 entries; per README, no robustness generated_result fields are required.
The vLLM Semantic Router service used for generation was served on AMD
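The entry counts stated in these notes could be checked with a short script. The following is a minimal sketch, not the RouterArena validator: the field names `success` and `generated_result` come from the notes above, while the top-level JSON array layout and the helper names `check_predictions` and `load_entries` are assumptions made for illustration.

```python
import json
from pathlib import Path


def check_predictions(entries, expected_count, require_generated=True):
    """Return a list of human-readable problems found in `entries`.

    Assumes `entries` is a list of dicts (an assumption about the file layout).
    """
    problems = []
    if len(entries) != expected_count:
        problems.append(f"expected {expected_count} entries, found {len(entries)}")
    if require_generated:
        for i, entry in enumerate(entries):
            if entry.get("success") is not True:
                problems.append(f"entry {i}: success is not true")
            if not entry.get("generated_result"):
                problems.append(f"entry {i}: generated_result is empty")
    return problems


def load_entries(path):
    """Load a predictions file, assuming a top-level JSON array."""
    return json.loads(Path(path).read_text(encoding="utf-8"))
```

Under these assumptions, the regular file would be checked with `check_predictions(load_entries("router_inference/predictions/vllm-sr.json"), 8400)` and the robustness file with `expected_count=420` and `require_generated=False`, since no robustness `generated_result` fields are required.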

Xunzhuo · 2026-06-04T06:47:02Z

/evaluate

Copilot

Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

Updates the vLLM Semantic Router configuration to use a new RouterArena “recipe” with a refreshed model list and adds an explicit description, while removing endpoint and category-mapping fields.

Changes:

Replaces the previous model set with updated provider/model identifiers and changes the default model.
Removes router_endpoint, base_url, and category_model_mapping from the config.
Adds a descriptive description field clarifying the recipe and data-embedding constraints.
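The config changes listed above can be expressed as a small check. This is a hedged sketch only: it assumes the three removed fields and `description` are top-level keys of the JSON object in router_inference/config/vllm-sr.json, which the review summary does not confirm, and `review_config` is a hypothetical helper name.

```python
# Keys the review summary says were removed from the config.
REMOVED_KEYS = ("router_endpoint", "base_url", "category_model_mapping")


def review_config(config):
    """Return findings for a parsed config dict (top-level keys assumed)."""
    findings = []
    for key in REMOVED_KEYS:
        if key in config:
            findings.append(f"unexpected key still present: {key}")
    description = config.get("description")
    if not isinstance(description, str) or not description.strip():
        findings.append("missing or empty description")
    return findings
```

An empty list means the parsed config matches the summary; the function is called on the result of `json.load` for the config file.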

github-actions · 2026-06-04T07:10:10Z

Router Evaluation Results

Router: vllm-sr
Dataset Split: full

RouterArena Metrics

Metric	Value
RouterArena Score	0.7420
Accuracy	75.07%
Total Cost	$1.340293
Avg Cost per Query	$0.000160
Avg Cost per 1K Queries	$0.1596
Number of Queries	8400
Abnormal Entries	0
Robustness Score	0.7690
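The two average-cost rows follow from the total cost and the query count. A minimal sketch of that arithmetic, assuming the averages are simple means over all 8400 queries (the function name `average_costs` is illustrative, not part of RouterArena):

```python
def average_costs(total_cost_usd, num_queries):
    """Return (cost per query, cost per 1,000 queries) in USD."""
    per_query = total_cost_usd / num_queries
    return per_query, per_query * 1000


per_query, per_1k = average_costs(1.340293, 8400)
print(f"${per_query:.6f} per query, ${per_1k:.4f} per 1K queries")
# prints: $0.000160 per query, $0.1596 per 1K queries
```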

Optimality Metrics

Metric	Value
Opt.Sel (Optimal Selection)	0.1809
Opt.Cost (Cost Efficiency)	0.2407
Opt.Acc (Accuracy vs Optimal)	0.8969

Evaluation completed by RouterArena automated workflow

Signed-off-by: xunzhuo <xunzhuo@vllm-semantic-router.ai>

Xunzhuo · 2026-06-04T11:09:17Z

/evaluate

github-actions · 2026-06-04T11:35:32Z

Router Evaluation Results

Router: vllm-sr
Dataset Split: full

RouterArena Metrics

Metric	Value
RouterArena Score	0.7538
Accuracy	75.97%
Total Cost	$0.921463
Avg Cost per Query	$0.000110
Avg Cost per 1K Queries	$0.1097
Number of Queries	8400
Abnormal Entries	0
Robustness Score	0.7310

Optimality Metrics

Metric	Value
Opt.Sel (Optimal Selection)	0.2012
Opt.Cost (Cost Efficiency)	0.2452
Opt.Acc (Accuracy vs Optimal)	0.8987
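Compared with the first /evaluate run, this run reports a higher score and accuracy and a lower total cost, but a lower robustness score. The sketch below only restates that comparison as arithmetic on the values from the two result tables; the dictionary keys are illustrative names.

```python
# Values copied from the two "Router Evaluation Results" comments.
FIRST_RUN = {"score": 0.7420, "accuracy_pct": 75.07,
             "total_cost_usd": 1.340293, "robustness": 0.7690}
SECOND_RUN = {"score": 0.7538, "accuracy_pct": 75.97,
              "total_cost_usd": 0.921463, "robustness": 0.7310}


def deltas(before, after):
    """Return after - before for every metric present in `before`."""
    return {name: after[name] - before[name] for name in before}


changes = deltas(FIRST_RUN, SECOND_RUN)
cost_change_pct = 100 * changes["total_cost_usd"] / FIRST_RUN["total_cost_usd"]

for name, value in changes.items():
    print(f"{name}: {value:+.4f}")
print(f"total cost change: {cost_change_pct:+.1f}%")
```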

Evaluation completed by RouterArena automated workflow

vLLM Semantic Router resubmission (#131) re-evaluated at RouterArena score 0.7538 (was 0.6723). Updated its row and re-sorted ranks 1-9:

Arena 67.23 -> 75.38
Accuracy 66.53 -> 75.97
Cost/1K $0.06 -> $0.11
Opt.Sel 84.66 -> 20.12
Opt.Cost 90.71 -> 24.52
Opt.Acc 89.24 -> 89.87
Robust 90.95 -> 73.10

At 75.38 vLLM-SR overtakes Sqwish (75.27) for #1; Sqwish, AgentForge, Nadir, Weave, OrcaRouter-Adaptive, Azure, R2-Router and Auto each shift down one rank. Ranks 10-20 unchanged.

Metrics taken from the final /evaluate run on the merged submission (verified byte-identical to main).

Co-authored-by: Louie Lu <yl231@datalab2.cs.rice.edu>
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
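The re-sort described in this commit message amounts to ordering rows by Arena score, highest first. A minimal sketch, assuming that ordering rule: only the two scores stated in the message are used, because the other routers' scores are not given here, so the "before" ranks in this toy table are not the real leaderboard ranks. The helper name `rank` is illustrative.

```python
def rank(rows):
    """Return (rank, name, score) tuples sorted by descending score."""
    ordered = sorted(rows.items(), key=lambda item: item[1], reverse=True)
    return [(i + 1, name, score) for i, (name, score) in enumerate(ordered)]


# Arena scores quoted in the commit message; all other routers omitted.
before = {"Sqwish": 75.27, "vLLM-SR": 67.23}
after = {**before, "vLLM-SR": 75.38}

for label, table in (("before", before), ("after", after)):
    print(label, rank(table))
```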

Copilot AI review requested due to automatic review settings June 4, 2026 06:46

Copilot AI reviewed Jun 4, 2026

Comment thread router_inference/config/vllm-sr.json Outdated

Comment thread router_inference/config/vllm-sr.json Outdated

Comment thread router_inference/config/vllm-sr.json

Update vLLM-SR RouterArena submission

6f63fea

Signed-off-by: xunzhuo <xunzhuo@vllm-semantic-router.ai>

Xunzhuo force-pushed the vllm/vllm-sr-v350-routerarena branch from 16c1395 to 6f63fea Compare June 4, 2026 11:09

yl231 approved these changes Jun 4, 2026

yl231 merged commit b7bd454 into RouteWorks:main Jun 4, 2026
6 checks passed

yl231 mentioned this pull request Jun 4, 2026

Leaderboard: update vLLM-SR to v350 metrics (#131), now #1 #133

Merged

yl231 merged 1 commit into RouteWorks:main from Xunzhuo:vllm/vllm-sr-v350-routerarena

3 participants