
Conversation

Jintao-Huang (Collaborator)

No description provided.

gemini-code-assist bot (Contributor)

Summary of Changes

Hello @Jintao-Huang, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request adds support for the FP8 quantized variants of the Qwen3-Next 80B Instruct and Thinking models, enabling more memory-efficient deployment and inference. It also updates the supported-models list in the documentation, marking several Qwen3-Coder and Kimi-VL models as supported, and tightens a transformers version constraint for the Intern-S1 models to improve compatibility and stability.

Highlights

  • Qwen3-Next FP8 Model Support: Added explicit support for the FP8 quantized versions of the Qwen3-Next-80B-A3B-Instruct and Qwen3-Next-80B-A3B-Thinking models, enabling more efficient deployment and inference.
  • Documentation Updates: Updated both the English and Chinese supported-models documentation, reflecting the newly added Qwen3-Next FP8 models and marking several existing Qwen3-Coder and Kimi-VL models as fully supported.
  • Dependency Version Constraint Refinement: Narrowed the transformers version requirement for Intern-S1 models to the explicit range >=4.55.2,<4.56; a self-contained sketch of these changes follows this list.
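
To make the three bullets concrete, here is a minimal, self-contained sketch of what the changes amount to. It uses plain Python dictionaries with hypothetical keys rather than ms-swift's actual registration API; only the model IDs and version constraints are taken from this PR's description.

```python
# A minimal sketch (hypothetical structure, not ms-swift's actual API)
# of what this PR's registry changes amount to: new FP8 model IDs under
# qwen3_next, plus per-family transformers version constraints.
SUPPORTED_MODELS = {
    'qwen3_next': {
        'model_ids': [
            'Qwen/Qwen3-Next-80B-A3B-Instruct',
            'Qwen/Qwen3-Next-80B-A3B-Thinking',
            # Newly added FP8 variants:
            'Qwen/Qwen3-Next-80B-A3B-Instruct-FP8',
            'Qwen/Qwen3-Next-80B-A3B-Thinking-FP8',
        ],
        # qwen3_next currently needs a development build of transformers.
        'requires': ['transformers>=4.57.0.dev'],
    },
    'intern_s1': {
        # Constraint narrowed from '>=4.55.2' to an explicit range:
        'requires': ['transformers>=4.55.2,<4.56'],
    },
}
```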
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page; here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature               Command               Description
Code Review           /gemini review        Performs a code review for the current pull request in its current state.
Pull Request Summary  /gemini summary       Provides a summary of the current pull request in its current state.
Comment               @gemini-code-assist   Responds in comments when explicitly tagged, both in issue comments and review comments.
Help                  /gemini help          Displays a list of available commands.

Customization

To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code-review style guide (such as PEP-8 for Python) by adding files to a .gemini/ folder at the base of the repository. Detailed instructions can be found here.
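
As a concrete illustration (an assumption based on the linked documentation, not something present in this repository): adding a .gemini/styleguide.md file containing plain-English rules such as "Follow PEP-8" or "Prefer f-strings over %-formatting" lets subsequent Gemini code reviews take those rules into account.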

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

You can also get AI-powered code generation, chat, and code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double-check its output and use code with caution.


@gemini-code-assist bot (Contributor) left a comment


Code Review

This pull request adds support for FP8 variants of qwen3_next models. It also includes several other unrelated updates: it changes Megatron support flags for Qwen3-Coder and Kimi-VL models, and pins the transformers dependency for Intern-S1 models. While it's generally better to keep pull requests focused on a single concern for clarity and easier reviews, the changes appear correct. The core change for qwen3_next FP8 support is sound. However, a key concern is the new dependency on a development version of transformers (>=4.57.0.dev), which could introduce instability for users.
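
A practical note for readers, not part of the bot's review: a development build of transformers is normally installed from source, for example with pip install git+https://github.com/huggingface/transformers.git, until the pinned 4.57.0 release ships. This is a common workaround, not guidance taken from this PR.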

@Jintao-Huang Jintao-Huang merged commit 2efd7de into modelscope:main Sep 22, 2025
1 of 2 checks passed


3 participants