
Fix router#69

Merged
tastelikefeet merged 9 commits into main from
fix/routing-0215
Feb 16, 2026

Conversation

@tastelikefeet
Collaborator

PR type

  • Bug Fix
  • New Feature
  • Document Updates
  • More Models or Datasets Support

PR information

Write the detailed information belonging to this PR.

Experiment results

Paste your experiment results here (if needed).

@gemini-code-assist
Contributor

Summary of Changes

Hello @tastelikefeet, I'm Gemini Code Assist[1]! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request refines the routing and request handling mechanisms within the server, specifically for multiplexed model serving. It standardizes the request identification header from X-Ray-Serve-Request-Id to serve_multiplexed_model_id across the client, server, and documentation. Additionally, it integrates new multiplexing logic for managing adapters and processors, and updates the list of explicitly supported models in the Tinker server. These changes aim to improve the robustness and clarity of the routing infrastructure.

Highlights

  • Header Renaming and Standardization: The request identification header X-Ray-Serve-Request-Id has been consistently renamed to serve_multiplexed_model_id across documentation, server-side validation, logging, and client-side header injection to standardize its usage.
  • Multiplexed Adapter/Processor Integration: New get_multiplexed_adapter methods, decorated with @serve.multiplexed, were introduced in tinker/model.py, twinkle/model.py, and twinkle/processor.py. These methods are now called during adapter and processor creation, integrating with a multiplexed serving framework.
  • Model Support List Update: The list of explicitly supported models in src/twinkle/server/tinker/server.py has been streamlined, removing several Qwen/Qwen2.5 models and retaining only Qwen/Qwen3-30B-A3B-Instruct-2507.
  • Asynchronous Function Updates: Several functions, including add_adapter_to_model in twinkle/model.py and create in twinkle/processor.py, have been updated to be asynchronous (async def) to accommodate the new multiplexing logic.
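To make the multiplexing pattern above concrete, here is a minimal, hedged sketch of what an `@serve.multiplexed`-style per-replica adapter cache does conceptually: keep a bounded number of adapters loaded and evict the least recently used one when a new adapter id arrives. This is a plain-Python analog, not the PR's actual code; the class and function names (`MultiplexedAdapterCache`, `load_adapter`) are illustrative, and Ray Serve's real decorator handles this bookkeeping internally.

```python
import asyncio
from collections import OrderedDict

class MultiplexedAdapterCache:
    """Plain-Python analog of Ray Serve's @serve.multiplexed behavior:
    keep at most `max_models` adapters loaded per replica, evicting the
    least recently used entry when the limit is exceeded."""

    def __init__(self, load_fn, max_models: int = 3):
        self._load_fn = load_fn          # async loader: adapter_id -> adapter
        self._cache = OrderedDict()      # adapter_id -> loaded adapter (LRU order)
        self._max_models = max_models

    async def get(self, adapter_id: str):
        if adapter_id in self._cache:
            self._cache.move_to_end(adapter_id)   # mark as recently used
            return self._cache[adapter_id]
        adapter = await self._load_fn(adapter_id)  # load on first request
        self._cache[adapter_id] = adapter
        if len(self._cache) > self._max_models:
            self._cache.popitem(last=False)        # evict least recently used
        return adapter

async def demo():
    async def load_adapter(adapter_id):
        # Stand-in for loading LoRA weights onto the replica.
        return f"adapter:{adapter_id}"

    cache = MultiplexedAdapterCache(load_adapter, max_models=2)
    await cache.get("lora-a")
    await cache.get("lora-b")
    await cache.get("lora-c")   # exceeds max_models; "lora-a" is evicted
    return list(cache._cache)

print(asyncio.run(demo()))      # -> ['lora-b', 'lora-c']
```

In the real implementation, the decorated `get_multiplexed_adapter` method plays the role of `load_fn`, and Ray Serve routes each request to a replica that already has the requested adapter loaded whenever possible.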


Changelog
  • docs/source_en/Usage Guide/Server and Client/Tinker-Compatible-Client.md
    • Updated the name of the request header from X-Ray-Serve-Request-Id to serve_multiplexed_model_id in the English documentation.
  • docs/source_zh/使用指引/服务端和客户端/Tinker兼容客户端.md
    • Updated the name of the request header from X-Ray-Serve-Request-Id to serve_multiplexed_model_id in the Chinese documentation.
  • src/twinkle/server/tinker/model.py
    • Added a new get_multiplexed_adapter async method decorated with @serve.multiplexed.
    • Called get_multiplexed_adapter within the _create_adapter logic.
  • src/twinkle/server/tinker/server.py
    • Removed several Qwen/Qwen2.5 models from the default supported_models list.
    • Updated logging to use serve_multiplexed_model_id instead of x-ray-serve-request-id.
  • src/twinkle/server/twinkle/model.py
    • Added a new get_multiplexed_adapter async method decorated with @serve.multiplexed.
    • Changed add_adapter_to_model to an async function.
    • Called get_multiplexed_adapter within add_adapter_to_model.
  • src/twinkle/server/twinkle/processor.py
    • Added a new get_multiplexed_adapter async method decorated with @serve.multiplexed.
    • Changed create to an async function.
    • Called get_multiplexed_adapter within the create method.
  • src/twinkle/server/utils/validation.py
    • Updated the expected header name from X-Ray-Serve-Request-Id to serve_multiplexed_model_id for request validation.
    • Modified error messages to reflect the new header name.
  • src/twinkle_client/init.py
    • Updated the default header name for init_tinker_compat_client from X-Ray-Serve-Request-Id to serve_multiplexed_model_id.
  • src/twinkle_client/http/http_utils.py
    • Updated the request ID header in HTTP utility functions from X-Ray-Serve-Request-Id to serve_multiplexed_model_id.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page; here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

| Feature | Command | Description |
| --- | --- | --- |
| Code Review | /gemini review | Performs a code review for the current pull request in its current state. |
| Pull Request Summary | /gemini summary | Provides a summary of the current pull request in its current state. |
| Comment | @gemini-code-assist | Responds in comments when explicitly tagged, both in pull request comments and review comments. |
| Help | /gemini help | Displays a list of available commands. |

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Contributor

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request primarily focuses on integrating Ray Serve's model multiplexing feature by replacing the X-Ray-Serve-Request-Id header with serve_multiplexed_model_id across the server and client code. This includes updating documentation, client-side header creation, and server-side validation and logging. Additionally, it introduces get_multiplexed_adapter methods in several server components, decorated with @serve.multiplexed, and calls them at appropriate points (e.g., adapter creation) to register models with Ray Serve's multiplexer. The changes appear to be consistent and correctly implement the intended feature. I have one suggestion regarding the modification of the default supported models list.

@tastelikefeet tastelikefeet merged commit 5cba3a1 into main Feb 16, 2026
3 of 4 checks passed
@tastelikefeet tastelikefeet deleted the fix/routing-0215 branch February 16, 2026 04:21
