extension/llm/server: serving foundations (schemas, errors, templating, tools) by mergennachin · Pull Request #19993 · pytorch/executorch

mergennachin · 2026-06-03T22:32:41Z

Add the OpenAI server's standalone control-plane building blocks, independent of
the HTTP layer and of any model runtime: OpenAI request/response schemas
(protocol.py), structured errors (errors.py), HF chat templating
(chat_template.py), and tool-call parsers (tool_parsers/) for Hermes-style JSON
and Qwen XML. The wire contract is documented in spec/README.md. Unit tests
under tests/ cover tool parsing.
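
For readers unfamiliar with the Hermes-style format, here is a minimal sketch of what such a parser does. It assumes the common convention of a JSON object with `name` and `arguments` keys wrapped in `<tool_call>...</tool_call>` tags. The names `ParsedToolCall` and `parse_hermes_tool_calls` are hypothetical and are not taken from this PR's tool_parsers/ package, which may differ in structure and behavior.

```python
import json
import re
from dataclasses import dataclass

# Assumption: Hermes-style output wraps each call in <tool_call>...</tool_call>.
_TOOL_CALL_RE = re.compile(r"<tool_call>\s*(.*?)\s*</tool_call>", re.DOTALL)


@dataclass
class ParsedToolCall:
    """Hypothetical container; not the PR's actual type."""
    name: str
    arguments: str  # JSON-encoded string, mirroring OpenAI's function.arguments


def parse_hermes_tool_calls(text: str) -> tuple[str, list[ParsedToolCall]]:
    """Split raw model output into plain content and parsed tool calls."""
    calls: list[ParsedToolCall] = []
    for match in _TOOL_CALL_RE.finditer(text):
        try:
            payload = json.loads(match.group(1))
        except json.JSONDecodeError:
            continue  # malformed JSON: no call is emitted for this block
        if not isinstance(payload, dict) or "name" not in payload:
            continue  # not a call object: skip it
        calls.append(
            ParsedToolCall(
                name=str(payload["name"]),
                arguments=json.dumps(payload.get("arguments", {})),
            )
        )
    # Every tagged block (valid or not) is removed from the visible content.
    content = _TOOL_CALL_RE.sub("", text).strip()
    return content, calls
```

Keeping `arguments` as a JSON string rather than a dict matches how the OpenAI chat completions response represents `tool_calls[].function.arguments`, so a response schema could carry the value through unchanged.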

The HTTP server builds on these.
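
As an illustration of how an HTTP layer might build on a structured error, the sketch below maps an exception to the OpenAI-style error envelope (`{"error": {"message", "type", "param", "code"}}`). The class name `ServingError` and its fields are assumptions made for this example; they are not the API of this PR's errors.py.

```python
from typing import Any, Optional


class ServingError(Exception):
    """Hypothetical structured error carrying what an HTTP layer needs."""

    def __init__(
        self,
        message: str,
        *,
        status_code: int = 400,
        error_type: str = "invalid_request_error",
        param: Optional[str] = None,
        code: Optional[str] = None,
    ) -> None:
        super().__init__(message)
        self.message = message
        self.status_code = status_code
        self.error_type = error_type
        self.param = param
        self.code = code

    def to_body(self) -> dict[str, Any]:
        """Render the OpenAI-style JSON error envelope."""
        return {
            "error": {
                "message": self.message,
                "type": self.error_type,
                "param": self.param,
                "code": self.code,
            }
        }


def validate_max_tokens(value: int) -> int:
    """Example validator that raises the structured error on bad input."""
    if value <= 0:
        raise ServingError("max_tokens must be positive", param="max_tokens")
    return value
```

A server handler could catch `ServingError`, use `status_code` for the HTTP status, and send `to_body()` as the JSON payload, which keeps validation code free of any HTTP framework dependency.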

Part of #20001

[ghstack-poisoned]

mergennachin · 2026-06-03T22:32:42Z

Stack from ghstack (oldest at bottom):

pytorch-bot · 2026-06-03T22:32:45Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/19993

📄 Preview Python docs built from this PR

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit 1842447 with merge base eeb0646:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

[ghstack-poisoned]

mergennachin added 2 commits June 3, 2026 15:32

[INITIAL] Update

7298429

[ghstack-poisoned]

[INITIAL] Update

75a6aa3

[ghstack-poisoned]

mergennachin requested a review from larryliu0820 as a code owner June 3, 2026 22:32

This was referenced Jun 3, 2026

extension/llm/server: worker-based OpenAI-compatible HTTP server #19994

Open

extension/llm/runner: Engine/Session C++ core + token-step primitives #19991

Open

extension/llm/runner: Python bindings for the Engine/Session API #19992

Closed

meta-cla Bot added the CLA Signed label (managed by the Facebook bot; authors need to sign the CLA before a PR can be reviewed) Jun 3, 2026

mergennachin requested review from Gasoonjia, GregoryComer, digantdesai, kirklandsign and psiddh June 3, 2026 22:34

mergennachin added 2 commits June 3, 2026 15:52

[UPDATE] Update

3361800

[ghstack-poisoned]

[UPDATE] Update

a99bd67

[ghstack-poisoned]

mergennachin mentioned this pull request Jun 3, 2026

extension/llm/server: document pi integration #19999

Open

mergennachin added 2 commits June 4, 2026 11:48

[UPDATE] Update

b3225d7

[ghstack-poisoned]

[UPDATE] Update

ae2802f

[ghstack-poisoned]

mergennachin mentioned this pull request Jun 4, 2026

examples/models/qwen3_5_moe: CUDA Engine/Session adapter + OpenAI serving #20043

Open

mergennachin marked this pull request as draft June 4, 2026 18:51

[UPDATE] Update

c56e62a

[ghstack-poisoned]

mergennachin changed the base branch from gh/mergennachin/3/head to gh/mergennachin/2/head June 4, 2026 22:14

mergennachin added 2 commits June 4, 2026 15:38

[UPDATE] Update

e35d01a

[ghstack-poisoned]

[UPDATE] Update

79efb21

[ghstack-poisoned]

mergennachin marked this pull request as ready for review June 5, 2026 18:56

[UPDATE] Update

1842447

[ghstack-poisoned]

mergennachin mentioned this pull request Jun 8, 2026

Qwen3.5-MoE CUDA V2 foundation: one model, many isolated sessions #20117

Open

mergennachin changed the title ~~extension/llm/server: serving foundations (schemas, tool parser, prefix cache)~~ extension/llm/server: serving foundations (schemas, errors, templating, tools) Jun 8, 2026

mergennachin wants to merge 10 commits into gh/mergennachin/2/head from gh/mergennachin/4/head


1 participant
