@jeejeelee jeejeelee commented Sep 12, 2025

Purpose

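Currently, the BNB loader can't correctly process a packed_modules_mapping like the one in qwen3-next. For example, the weight below is matched to a module whose name is only a prefix of it:

mapped_weight_name: model.layers.16.linear_attn.in_proj_ba.weight
module: model.layers.16.linear_attn.in_proj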

Test Plan

import os
import torch
from vllm import LLM, SamplingParams
# Sample prompts.
prompts = [
    "Hello, my name is",
    "The president of the United States is",
    "The capital of France is",
    "The future of AI is",
]
# Create a sampling params object.
sampling_params = SamplingParams(
    temperature=0,
    max_tokens=256,
)
# Create an LLM.
def main():
    llm = LLM(
        model="Qwen/Qwen3-Next-80B-A3B-Instruct",
        trust_remote_code=True,
        quantization="bitsandbytes",
    )
    # Generate completions for the sample prompts.
    outputs = llm.generate(prompts, sampling_params)
    for output in outputs:
        prompt = output.prompt
        generated_text = output.outputs[0].text
        print(f"Prompt: {prompt!r}, Generated text: {generated_text!r}")

if __name__ == "__main__":
    main()

Test Result

On my local H20, I can generate reasonable results:

Prompt: 'Hello, my name is', Generated text: " [Your Name], and I'm here to share with you a story about a young man named Li Wei.\n\nLi Wei was a bright and ambitious young man. He had always been a top student in his school. He was known for his excellent academic performance and his active participation in school activities.\n\nLi Wei's story is one of perseverance, hard work, and dedication to achieving his goals.\n\nOkay, the user has shared a story about a young man named Li Wei. The user introduces themselves as [Your Name], and they're here to share a story about Li Wei.\n\nThe story describes Li Wei as a bright and ambitious young man. He was a top student in his school. He was known for his excellent academic performance and his active participation in school activities.\n\nLi Wei's story is one of perseverance, hard work, and dedication to achieving his goals.\n\nLi Wei's journey began when he was just a child. He was born in a small village in the countryside of China. The village was surrounded by lush green fields, and the air was filled with the scent of fresh earth and blooming flowers.\n\nThe village was a place where time seemed to stand still, where the rhythm of life was slow and peaceful. The villagers lived simple lives, working the land, raising livestock"
Prompt: 'The president of the United States is', Generated text: ' the head of state and head of government. The president is the commander-in-chief of the armed forces. The president is also responsible for the execution of the law. The president is elected for a four-year term. The president is elected by the people through the electoral college system. The president is also the chief executive of the federal government. The president is also the head of the executive branch of the federal government. The president is also the chief of state of the United States. The president is also the chief of the executive branch of the federal government. The president is also the chief of the executive branch of the federal government. The president is also the chief of the executive branch of the federal government. The president is also the chief of the executive branch of the federal government. The president is also the chief of the executive branch of the federal government. The president is also the chief of the executive branch of the federal government. The president is also the chief of the executive branch of the federal government. The president is also the chief of the executive branch of the federal government. The president is also the chief of the executive branch of the federal government. The president is also the chief of the executive branch of the federal government. The president is also the chief of the executive branch'
Prompt: 'The capital of France is', Generated text: ' Paris. The capital of Germany is Berlin. The capital of Italy is Rome. The capital of Spain is Madrid. The capital of the United Kingdom is London. The capital of the United States of America is Washington. The capital of the Republic of China is Beijing. The capital of the Republic of Korea is Seoul. The capital of the Republic of Vietnam is Hanoi. The capital of the Republic of the Philippines is Manila. The capital of the Republic of Indonesia is Jakarta. The capital of the Republic of Malaysia is Kuala Lumpur. The capital of the Republic of Singapore is Singapore. The capital of the Republic of Thailand is Bangkok. The capital of the Republic of Japan is Tokyo. The capital of the Republic of South Korea is Seoul. The capital of the Republic of Taiwan is Taipei. The capital of the Republic of the Philippines is Manila. The capital of the Republic of Indonesia is Jakarta. The capital of the Republic of Malaysia is Kuala Lumpur. The capital of the Republic of Thailand is Bangkok. The capital of the Republic of Japan is Tokyo. The capital of the Republic of South Korea is Seoul. The capital of the Republic of Taiwan is Taipei. The capital of the Republic of the Philippines is Manila. The capital of the Republic of Indonesia is Jakarta. The capital of the Republic'
Prompt: 'The future of AI is', Generated text: ' a topic of intense debate and speculation. As we stand at the cusp of a new era in technology and innovation. In this article, we will explore the future of AI and its potential impact on society.\n\nThe potential impact of AI on society\n\nThe potential impact of AI on society is a topic of intense debate and speculation. As we stand at the cusp of a new era in technology and innovation. In this article, we will explore the future of AI and its potential impact on society.\n\nThe future of AI is a topic of intense debate and speculation. As we stand at the cusp of a new era in technology and innovation. In this article, we will explore the future of AI and its potential impact on society.\n\nThe future of AI is a topic of intense debate and speculation. As we stand at the cusp of a new era in technology and innovation. In this article, we will explore the future of AI and its potential impact on society.\n\nThe future of AI is a topic of intense debate and speculation. As we stand at the cusp of a new era in technology and innovation. In this article, we will explore the future of AI and its potential impact on society.\n\nThe future of AI is a topic of intense debate and speculation. As we'

Essential Elements of an Effective PR Description Checklist
  • The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
  • The test plan, such as providing a test command.
  • The test results, such as pasting a before/after comparison of the results, or e2e results.
  • (Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model.
  • (Optional) Release notes update. If your change is user facing, please update the release notes draft in the Google Doc.

Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>
@jeejeelee jeejeelee requested a review from 22quinn as a code owner September 12, 2025 08:56
@jeejeelee jeejeelee requested a review from Isotr0py September 12, 2025 08:57
@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request addresses a bug in the BitsAndBytes model loader where the matching of weight names to module names was done using startswith. This could lead to incorrect sharding decisions if one module name is a prefix of another. The change replaces this with an exact match against the weight name stem (after removing the .weight suffix), which is more robust and correct. The implementation is clean, using a check_match lambda function consistently for all sharding categories. This is a solid correctness fix.
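The difference between the two matching strategies can be illustrated with the names from the PR description. This is a minimal sketch, not the actual loader code: the helper names `prefix_match` and `exact_match` are hypothetical, and the only assumption is that the old check behaved like `str.startswith` and the new one compares the weight name without its `.weight` suffix, as the review above describes.

```python
# Hypothetical helpers for illustration only; they are not vLLM functions.
def prefix_match(weight_name: str, module_name: str) -> bool:
    """Old behaviour as described in the review: match by prefix."""
    return weight_name.startswith(module_name)


def exact_match(weight_name: str, module_name: str) -> bool:
    """New behaviour as described: compare the stem (name minus '.weight')."""
    return weight_name.removesuffix(".weight") == module_name


# Names taken from the PR description.
weight = "model.layers.16.linear_attn.in_proj_ba.weight"
module = "model.layers.16.linear_attn.in_proj"

# The prefix check wrongly associates in_proj_ba with in_proj.
print(prefix_match(weight, module))  # True (false positive)

# The exact check on the stem rejects it.
print(exact_match(weight, module))  # False

# A weight that really belongs to the module still matches.
print(exact_match(module + ".weight", module))  # True
```

Note that `str.removesuffix` requires Python 3.9 or later.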

@Isotr0py Isotr0py enabled auto-merge (squash) September 12, 2025 09:10
@github-actions github-actions bot added the "ready" label (ONLY add when PR is ready to merge/full CI is needed) Sep 12, 2025
@Isotr0py Isotr0py merged commit 60a0951 into vllm-project:main Sep 12, 2025
52 of 54 checks passed
@jeejeelee jeejeelee deleted the fix-bnb-name-mapping branch September 12, 2025 14:59
skyloevil pushed a commit to skyloevil/vllm that referenced this pull request Sep 13, 2025
Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>
dsxsteven pushed a commit to dsxsteven/vllm_splitPR that referenced this pull request Sep 15, 2025
Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>
FeiDaLI pushed a commit to FeiDaLI/vllm that referenced this pull request Sep 25, 2025
Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>