Skip to content

[Feat] Allow tokenizer_name override#321

Merged
tianmu-li merged 8 commits into
mlcommons:mainfrom
tianmu-li:feat/separate_tokenizer_name
Jun 8, 2026
Merged

[Feat] Allow tokenizer_name override#321
tianmu-li merged 8 commits into
mlcommons:mainfrom
tianmu-li:feat/separate_tokenizer_name

Conversation

@tianmu-li

Copy link
Copy Markdown
Collaborator

What does this PR do?

Allow specifying a tokenizer name that's different from model name in the yaml file.
Example: OpenRouter keeps all model names lower case, so Qwen/Qwen3.6-35B-A3B becomes qwen/qwen3.6-35b. SambaNova strips org name, so it becomes Qwen3.6-35B-A3B. tokenizer name override allows using those endpoints.

Type of change

  • Bug fix
  • New feature
  • Documentation update
  • Refactor/cleanup

Related issues

Testing

  • Tests added/updated
  • All tests pass locally
  • Manual testing completed

Checklist

  • Code follows project style
  • Pre-commit hooks pass
  • Documentation updated (if needed)

Signed-off-by: Li, Tianmu <tianmu.li@intel.com>
@tianmu-li tianmu-li requested review from a team and Copilot May 20, 2026 18:38
@github-actions

github-actions Bot commented May 20, 2026

Copy link
Copy Markdown

MLCommons CLA bot All contributors have signed the MLCommons CLA ✍️ ✅

@gemini-code-assist gemini-code-assist Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces the ability to override the default tokenizer by adding a tokenizer_name field to the ModelParams configuration. The benchmark execution logic now checks for this override, and corresponding updates have been made to the configuration templates and unit tests. Feedback suggests adding a CLI alias for the new field to maintain consistency with other parameters and improve command-line usability.

Comment thread src/inference_endpoint/config/schema.py Outdated

Copilot AI left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds support for configuring a tokenizer identifier independently of the configured model name, enabling token-metrics/ISL token counting to use a HuggingFace tokenizer repo/path even when the endpoint’s model naming differs (e.g., OpenRouter/SambaNova naming conventions).

Changes:

  • Extend ModelParams schema with an optional tokenizer_name override.
  • Update benchmark setup to prefer tokenizer_name when probing for tokenizer availability.
  • Add unit coverage for the new schema field and update full YAML templates to include it.

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
tests/unit/config/test_schema.py Verifies tokenizer_name default and explicit override behavior in ModelParams.
src/inference_endpoint/config/templates/online_template_full.yaml Documents the new model_params.tokenizer_name config option in the online full template.
src/inference_endpoint/config/templates/offline_template_full.yaml Documents the new model_params.tokenizer_name config option in the offline full template.
src/inference_endpoint/config/templates/concurrency_template_full.yaml Documents the new model_params.tokenizer_name config option in the concurrency full template.
src/inference_endpoint/config/schema.py Adds tokenizer_name to ModelParams.
src/inference_endpoint/commands/benchmark/execute.py Uses the override (when set) as the source for tokenizer existence probing and subsequent token-metrics enablement.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread src/inference_endpoint/commands/benchmark/execute.py Outdated
Comment thread src/inference_endpoint/config/schema.py Outdated
Copilot AI review requested due to automatic review settings June 8, 2026 17:57

Copilot AI left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot was unable to review this pull request because the user who requested the review is ineligible. To be eligible to request a review, you need a paid Copilot license, or your organization must enable Copilot code review.

@arekay-nv

Copy link
Copy Markdown
Collaborator

Review Council — Multi-AI Code Review

Reviewed by: Claude | Depth: quick | Codex: unavailable (branch not on remote)

No new issues found. The one concern identified (silent discard of an invalid user-supplied tokenizer_name override) is already covered by an existing inline comment at execute.py:407.

@arekay-nv

Copy link
Copy Markdown
Collaborator

@tianmu-li can you update this so that if a tokenizer override is incorrect, it is a hard failure. Silently using the default tokenizer can be challenging if the assumption is that a different tokenizer was expected to be used.

tianmu-li added 2 commits June 8, 2026 19:44
Signed-off-by: Li, Tianmu <tianmu.li@intel.com>
Signed-off-by: Li, Tianmu <tianmu.li@intel.com>
Copilot AI review requested due to automatic review settings June 8, 2026 20:48
Signed-off-by: Li, Tianmu <tianmu.li@intel.com>

Copilot AI left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 7 out of 7 changed files in this pull request and generated 2 comments.

Comment thread src/inference_endpoint/commands/benchmark/execute.py
Comment thread src/inference_endpoint/config/templates/online_template_full.yaml
@tianmu-li

Copy link
Copy Markdown
Collaborator Author

@tianmu-li can you update this so that if a tokenizer override is incorrect, it is a hard failure. Silently using the default tokenizer can be challenging if the assumption is that a different tokenizer was expected to be used.

Added. Also added regression testing.

@arekay-nv arekay-nv left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review Council — follow-up

Reviewing resolved status of prior comments. 2 items remain unresolved (posted inline). 1 item (smaller templates missing tokenizer_name) cannot be posted inline since those files aren't in this diff — see the existing comment on online_template_full.yaml for context.

Comment thread src/inference_endpoint/config/schema.py Outdated
Comment thread src/inference_endpoint/commands/benchmark/execute.py
Signed-off-by: Li, Tianmu <tianmu.li@intel.com>

@arekay-nv arekay-nv left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@tianmu-li tianmu-li merged commit 53998c0 into mlcommons:main Jun 8, 2026
7 checks passed
@github-actions github-actions Bot locked and limited conversation to collaborators Jun 8, 2026
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants