[Feat] Allow tokenizer_name override by tianmu-li · Pull Request #321 · mlcommons/endpoints

tianmu-li · 2026-05-20T18:38:31Z

What does this PR do?

Allow specifying a tokenizer name that differs from the model name in the YAML file.
Example: OpenRouter keeps all model names lowercase, so Qwen/Qwen3.6-35B-A3B becomes qwen/qwen3.6-35b. SambaNova strips the org name, so it becomes Qwen3.6-35B-A3B. The tokenizer name override makes it possible to benchmark against those endpoints while still loading the correct tokenizer.
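
For illustration, the override described above could be resolved roughly as follows. This is a minimal sketch, not the code in this PR: the `ModelParams` dataclass here is a simplified stand-in for the project's schema, the `name` field and the `resolve_tokenizer_source` helper are hypothetical, and only the `tokenizer_name` field name comes from the PR itself.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ModelParams:
    """Simplified, hypothetical stand-in for the project's ModelParams schema."""

    name: str                             # model name as the endpoint expects it
    tokenizer_name: Optional[str] = None  # optional override; None means "use name"


def resolve_tokenizer_source(params: ModelParams) -> str:
    """Return the identifier the tokenizer should be loaded from (hypothetical helper)."""
    # Prefer the explicit override; otherwise fall back to the model name.
    return params.tokenizer_name or params.name


# OpenRouter-style lowercase model name paired with the original tokenizer repo.
openrouter = ModelParams(name="qwen/qwen3.6-35b", tokenizer_name="Qwen/Qwen3.6-35B-A3B")
# No override: the model name doubles as the tokenizer identifier.
default = ModelParams(name="Qwen/Qwen3.6-35B-A3B")

print(resolve_tokenizer_source(openrouter))
print(resolve_tokenizer_source(default))
```

The model name is still what gets sent to the endpoint; only the tokenizer lookup changes.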

Type of change

Bug fix
New feature
Documentation update
Refactor/cleanup

Related issues

Testing

Tests added/updated
All tests pass locally
Manual testing completed

Checklist

Code follows project style
Pre-commit hooks pass
Documentation updated (if needed)

Signed-off-by: Li, Tianmu <tianmu.li@intel.com>

github-actions · 2026-05-20T18:38:42Z

MLCommons CLA bot All contributors have signed the MLCommons CLA ✍️ ✅

gemini-code-assist

Code Review

This pull request introduces the ability to override the default tokenizer by adding a tokenizer_name field to the ModelParams configuration. The benchmark execution logic now checks for this override, and corresponding updates have been made to the configuration templates and unit tests. Feedback suggests adding a CLI alias for the new field to maintain consistency with other parameters and improve command-line usability.

Copilot

Pull request overview

Adds support for configuring a tokenizer identifier independently of the configured model name, enabling token-metrics/ISL token counting to use a HuggingFace tokenizer repo/path even when the endpoint’s model naming differs (e.g., OpenRouter/SambaNova naming conventions).

Changes:

Extend ModelParams schema with an optional tokenizer_name override.
Update benchmark setup to prefer tokenizer_name when probing for tokenizer availability.
Add unit coverage for the new schema field and update full YAML templates to include it.

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| `tests/unit/config/test_schema.py` | Verifies `tokenizer_name` default and explicit override behavior in `ModelParams`. |
| `src/inference_endpoint/config/templates/online_template_full.yaml` | Documents the new `model_params.tokenizer_name` config option in the online full template. |
| `src/inference_endpoint/config/templates/offline_template_full.yaml` | Documents the new `model_params.tokenizer_name` config option in the offline full template. |
| `src/inference_endpoint/config/templates/concurrency_template_full.yaml` | Documents the new `model_params.tokenizer_name` config option in the concurrency full template. |
| `src/inference_endpoint/config/schema.py` | Adds `tokenizer_name` to `ModelParams`. |
| `src/inference_endpoint/commands/benchmark/execute.py` | Uses the override (when set) as the source for tokenizer existence probing and subsequent token-metrics enablement. |

Copilot

Copilot was unable to review this pull request because the user who requested the review is ineligible. To be eligible to request a review, you need a paid Copilot license, or your organization must enable Copilot code review.

arekay-nv · 2026-06-08T18:06:56Z

Review Council — Multi-AI Code Review

Reviewed by: Claude | Depth: quick | Codex: unavailable (branch not on remote)

No new issues found. The one concern identified (silent discard of an invalid user-supplied tokenizer_name override) is already covered by an existing inline comment at execute.py:407.

arekay-nv · 2026-06-08T18:56:08Z

@tianmu-li can you update this so that if a tokenizer override is incorrect, it is a hard failure. Silently using the default tokenizer can be challenging if the assumption is that a different tokenizer was expected to be used.

Signed-off-by: Li, Tianmu <tianmu.li@intel.com>

Copilot

Pull request overview

Copilot reviewed 7 out of 7 changed files in this pull request and generated 2 comments.

tianmu-li · 2026-06-08T20:56:36Z

> @tianmu-li can you update this so that if a tokenizer override is incorrect, it is a hard failure. Silently using the default tokenizer can be challenging if the assumption is that a different tokenizer was expected to be used.

Added. Also added regression testing.
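
As a rough sketch of the requested behavior (not the actual change in `execute.py`): an explicit override that cannot be resolved raises, while a missing default tokenizer is assumed here to only leave token metrics disabled. `select_tokenizer`, `TokenizerOverrideError`, and the `tokenizer_exists` probe callback are all hypothetical names introduced for this example.

```python
from typing import Callable, Optional


class TokenizerOverrideError(ValueError):
    """Hypothetical error type for an override that cannot be resolved."""


def select_tokenizer(
    model_name: str,
    tokenizer_name: Optional[str],
    tokenizer_exists: Callable[[str], bool],
) -> Optional[str]:
    """Pick the tokenizer identifier, failing hard on a bad explicit override."""
    if tokenizer_name is not None:
        # The user asked for a specific tokenizer: never fall back silently.
        if not tokenizer_exists(tokenizer_name):
            raise TokenizerOverrideError(
                f"tokenizer_name override {tokenizer_name!r} could not be resolved"
            )
        return tokenizer_name
    # Assumption: without an override, a failed probe only disables token metrics.
    return model_name if tokenizer_exists(model_name) else None


# Stand-in probe for the example: pretend only this tokenizer repo exists.
KNOWN = {"Qwen/Qwen3.6-35B-A3B"}


def fake_probe(name: str) -> bool:
    return name in KNOWN


print(select_tokenizer("qwen/qwen3.6-35b", "Qwen/Qwen3.6-35B-A3B", fake_probe))
print(select_tokenizer("qwen/qwen3.6-35b", None, fake_probe))
try:
    select_tokenizer("qwen/qwen3.6-35b", "Qwen/typo", fake_probe)
except TokenizerOverrideError as exc:
    print(f"hard failure: {exc}")
```

Passing the probe in as a callback keeps the sketch testable without network access; a regression test can assert that the invalid-override case raises instead of returning the default.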

arekay-nv

Review Council — follow-up

Reviewing resolved status of prior comments. 2 items remain unresolved (posted inline). 1 item (smaller templates missing tokenizer_name) cannot be posted inline since those files aren't in this diff — see the existing comment on online_template_full.yaml for context.

Signed-off-by: Li, Tianmu <tianmu.li@intel.com>

arekay-nv

Thanks!

Allow tokenizer_name override

33efcce

Signed-off-by: Li, Tianmu <tianmu.li@intel.com>

tianmu-li requested review from a team and Copilot May 20, 2026 18:38

Merge branch 'main' into feat/separate_tokenizer_name

3242701

Copilot started reviewing on behalf of tianmu-li May 20, 2026 18:39

gemini-code-assist Bot reviewed May 20, 2026

Comment thread src/inference_endpoint/config/schema.py Outdated

Copilot AI reviewed May 20, 2026

Comment thread src/inference_endpoint/commands/benchmark/execute.py Outdated

Comment thread src/inference_endpoint/config/schema.py Outdated

tianmu-li and others added 2 commits May 21, 2026 11:33

Merge branch 'main' into feat/separate_tokenizer_name

b7338b0

Merge branch 'main' into feat/separate_tokenizer_name

7dcc221

Copilot AI review requested due to automatic review settings June 8, 2026 17:57

Copilot AI reviewed Jun 8, 2026

tianmu-li added 2 commits June 8, 2026 19:44

Hard fail for invalid tokenizer name override

d4e2837

Signed-off-by: Li, Tianmu <tianmu.li@intel.com>

Add tests

068d378

Signed-off-by: Li, Tianmu <tianmu.li@intel.com>

Copilot AI review requested due to automatic review settings June 8, 2026 20:48

Copilot started reviewing on behalf of tianmu-li June 8, 2026 20:48

Test converage

860ead2

Signed-off-by: Li, Tianmu <tianmu.li@intel.com>

Copilot AI reviewed Jun 8, 2026

Comment thread src/inference_endpoint/commands/benchmark/execute.py

Comment thread src/inference_endpoint/config/templates/online_template_full.yaml

arekay-nv reviewed Jun 8, 2026

Comment thread src/inference_endpoint/config/schema.py Outdated

Comment thread src/inference_endpoint/commands/benchmark/execute.py

Address review comments

83c947f

Signed-off-by: Li, Tianmu <tianmu.li@intel.com>

viraatc approved these changes Jun 8, 2026

arekay-nv approved these changes Jun 8, 2026

tianmu-li merged commit 53998c0 into mlcommons:main Jun 8, 2026
7 checks passed

github-actions Bot locked and limited conversation to collaborators Jun 8, 2026
