[model] support mistral 2506 #6624
Conversation
Summary of Changes

Hello @Jintao-Huang, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed: this pull request primarily focuses on extending the system's capabilities to support the new mistral_2506 model.
Code Review
This pull request adds support for the mistral_2506 model. The changes include adding the new model type and template, along with the necessary registration and loading functions. The implementation has a few issues that need to be addressed. There appears to be a copy-paste error in the model IDs used for registration, which could lead to loading the wrong model. More critically, the template class for mistral_2506 has unimplemented methods, which will cause errors during tokenization. Additionally, there's a hardcoded filename for the tokenizer which could be improved for better maintainability. Overall, the changes are in the right direction but require fixes to be functional.
/gemini review
Code Review
This pull request adds support for the mistral_2506 model. The changes, including the addition of model types, templates, registration, and tests, are well-implemented. However, I've identified a couple of areas for improvement. There's an unused import that should be removed for code cleanliness. More importantly, I've found an issue where a stale date is used in the system prompts for some existing Mistral models, which could lead to incorrect model behavior over time. I've provided detailed feedback on how to address this to ensure the system prompts are generated dynamically.
```python
today = datetime.now().strftime('%Y-%m-%d')

mistral_2501_system = (
    'You are Mistral Small 3, a Large Language Model (LLM) created by Mistral AI, a French startup '
    'headquartered in Paris.\n'
    f'Your knowledge base was last updated on 2023-10-01. The current date is {today}.\n\n'
```
The `today` variable is defined at the module level, which means it's only evaluated once when the module is imported. If the application runs for more than a day, the date in the system prompt will become stale. This can lead to incorrect behavior from the model. A similar issue was addressed for `Mistral2506Template` by dynamically generating the system prompt. Please apply a similar fix here for the `mistral_2501` and `mistral_2503` templates.

This would involve:

- Creating a `get_mistral_2501_system()` function that returns the system prompt with the current date.
- Creating a `Mistral2501Template(Template)` class with an `_swift_encode` method that calls `get_mistral_2501_system()` to set the system prompt if it's not provided.
- Updating the registration for `mistral_2501` to use this new template class with `default_system=None`.
- Changing `Mistral2503Template` to inherit from `Mistral2501Template`.
- Updating the registration for `mistral_2503` to set `default_system=None`.
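The core of the suggested fix can be sketched as follows. This is a minimal illustration, not the PR's actual code: the function name follows the reviewer's suggestion, while the ms-swift `Template` base class and `_swift_encode` hook are omitted so the snippet stays self-contained.

```python
from datetime import datetime


def get_mistral_2501_system() -> str:
    # Evaluate the date at call time, so a long-running process
    # never serves a prompt with a stale "current date".
    today = datetime.now().strftime('%Y-%m-%d')
    return (
        'You are Mistral Small 3, a Large Language Model (LLM) created by Mistral AI, a French startup '
        'headquartered in Paris.\n'
        f'Your knowledge base was last updated on 2023-10-01. The current date is {today}.\n\n'
    )
```

In the template class, `_swift_encode` would call `get_mistral_2501_system()` only when no system prompt was supplied, so each request picks up the date at encode time rather than at import time.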
```python
        model_kwargs: Dict[str, Any],
        load_model: bool = True,
        **kwargs):
    from mistral_common.tokens.tokenizers.mistral import MistralTokenizer
```
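On the hardcoded-filename concern raised in the review, one way to make the tokenizer loading more maintainable is to search for known tokenizer filenames instead of pinning a single one. The helper below is a hedged sketch: the function name and the candidate filename patterns are assumptions for illustration, not the repository's actual code.

```python
import os
from glob import glob


def find_mistral_tokenizer_file(model_dir: str) -> str:
    """Return the path to a Mistral tokenizer file in model_dir.

    Candidate patterns are assumptions based on filenames commonly
    shipped with Mistral checkpoints (e.g. tekken.json); adjust the
    list to match the checkpoints actually supported.
    """
    for pattern in ('tekken.json', 'tokenizer.model.v*', 'tokenizer.model'):
        matches = glob(os.path.join(model_dir, pattern))
        if matches:
            # Prefer the first match; patterns are ordered by priority.
            return sorted(matches)[0]
    raise FileNotFoundError(f'No Mistral tokenizer file found in {model_dir}')
```

The resulting path could then be passed to `MistralTokenizer` instead of a hardcoded filename, keeping the loading function working across checkpoints that ship differently named tokenizer files.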