MAF-19399: docs(website): add multi-model serving operations guide by hhk7734 · Pull Request #88 · moreh-dev/mif

hhk7734 · 2026-03-12T12:37:43Z

Summary

- Add `website/docs/operations/multi-model.mdx`, documenting how to serve multiple models through a single gateway endpoint using BBR and Heimdall schedulers
- Guide covers: Gateway, Body-Based Router (BBR), per-model Heimdall instances with `gateway.bbr.models`, and InferenceService resources using `vllm-hf-hub-offline` templates
- Reorder sidebar positions in the operations section

Test plan

- Verified multi-model routing on p-cluster (hyeonki namespace) with Llama 3.2 1B and Qwen 3 1.7B on MI250
- Confirmed BBR body-based routing works (model name extracted from the request body)
- Confirmed direct routing via the `X-Gateway-Model-Name` header works without BBR
- Review that the doc formatting renders correctly in Docusaurus
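
The two routing paths in the test plan can be sketched as follows. This is a minimal illustration, not BBR's actual implementation: only the `X-Gateway-Model-Name` header name comes from this PR, while the `route_headers` function, the use of a top-level `model` field in a JSON body, and the model names `model-a` and `model-b` are assumptions made for the example.

```python
import json

# Header name taken from the test plan; everything else here is hypothetical.
MODEL_HEADER = "X-Gateway-Model-Name"


def route_headers(body: bytes, headers: dict[str, str]) -> dict[str, str]:
    """Return a copy of the headers with the model-name header set.

    If the client already sent the header, it is used as-is (direct routing).
    Otherwise the model name is read from the JSON request body, which is
    the body-based routing case. Header lookup is case-sensitive here for
    brevity; real HTTP header names are case-insensitive.
    """
    out = dict(headers)
    if MODEL_HEADER in out:
        return out  # direct header routing, body is not inspected

    payload = json.loads(body)
    model = payload.get("model") if isinstance(payload, dict) else None
    if not isinstance(model, str) or not model:
        raise ValueError("request body has no usable 'model' field")

    out[MODEL_HEADER] = model
    return out


# Body-based routing: the model name comes from the request body.
print(route_headers(b'{"model": "model-a", "prompt": "hi"}', {}))

# Direct routing: the header is already present, so the body is ignored.
print(route_headers(b"{}", {MODEL_HEADER: "model-b"}))
```

In this sketch an explicit header takes precedence over the body; whether the real gateway behaves the same way when both are present is not stated in this PR.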

🤖 Generated with Claude Code

Add documentation for serving multiple models through a single gateway endpoint using BBR and Heimdall schedulers. The guide covers deploying Gateway, Body-Based Router, per-model Heimdall instances, and InferenceService resources with vllm-hf-hub-offline templates. Also reorder sidebar positions in the operations section.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Copilot

Pull request overview

This PR adds a new operations guide for multi-model serving through a single gateway endpoint, using Body-Based Router (BBR) and Heimdall schedulers in the MoAI Inference Framework. It also reorders sidebar positions in the operations section.

Changes:

- Add `multi-model.mdx` documenting full setup of multi-model routing with BBR, Heimdall, and InferenceService resources across GPU types
- Reorder sidebar positions: `latest-release` moved to -999, `container-image-caching-with-harbor` moved to 30, new guide placed at 20

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| `website/docs/operations/multi-model.mdx` | New guide covering architecture, gateway, BBR, Heimdall, InferenceService deployment, request examples, and cleanup for multi-model serving |
| `website/docs/operations/latest-release.mdx` | Sidebar position changed from `1` to `-999` to pin it at the top |
| `website/docs/operations/container-image-caching-with-harbor/index.mdx` | Sidebar position changed from `4` to `30` to accommodate new ordering |
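
To see why these values produce the intended order, here is a small sketch. It assumes the sidebar is autogenerated and that Docusaurus orders sibling docs by ascending `sidebar_position`; the `order_sidebar` helper is hypothetical, and only the three positions listed in this PR are included (other docs in the section are omitted).

```python
def order_sidebar(positions: dict[str, int]) -> list[str]:
    """Return doc names sorted by ascending sidebar position."""
    return sorted(positions, key=lambda name: positions[name])


# Positions after this PR, as listed in the table above.
positions = {
    "container-image-caching-with-harbor": 30,
    "multi-model": 20,
    "latest-release": -999,
}

for name in order_sidebar(positions):
    print(positions[name], name)
```

The large negative value keeps `latest-release` first regardless of what is added later, and the gap between 20 and 30 leaves room for new guides without renumbering.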

hhk7734 · 2026-03-12T12:43:50Z

Code review

No issues found. Checked for bugs and CLAUDE.md compliance.

🤖 Generated with Claude Code

hhk7734 requested a review from a team as a code owner March 12, 2026 12:37

hhk7734 requested review from bongwoobak, Copilot and hyeongyun0916 March 12, 2026 12:37

gitgod-bot assigned hhk7734 Mar 12, 2026

Copilot started reviewing on behalf of hhk7734 March 12, 2026 12:38

Copilot AI reviewed Mar 12, 2026

hhk7734 merged commit 91b8cf0 into main Mar 12, 2026
8 checks passed

hhk7734 deleted the MAF-19399 branch March 12, 2026 12:45

hhk7734 merged 1 commit into main from MAF-19399