
Conversation


Copilot AI commented Dec 15, 2025

Thanks for asking me to work on this. I will get started on it and keep this PR's description up to date as I form a plan and make progress.

Original prompt

Background

The repository currently supports only models up to Qwen2.5. Support needs to be extended to the Qwen3 series, including both the text and vision-language (multimodal) variants.

Required changes

1. Add a Qwen3 text chat format to llama_cpp/llama_chat_format.py

Near the existing format_qwen function (around line 1038), add a new qwen3 chat format:

@register_chat_format("qwen3")
def format_qwen3(
    messages: List[llama_types.ChatCompletionRequestMessage],
    **kwargs: Any,
) -> ChatFormatterResponse:
    # Qwen3 keeps the ChatML wire format: <|im_start|>{role}\n{content}<|im_end|>
    _roles = dict(user="<|im_start|>user", assistant="<|im_start|>assistant")
    # Fall back to Qwen's default system prompt when the caller supplies none.
    system_message = _get_system_message(messages) or "You are Qwen, a helpful assistant."
    system_template = "<|im_start|>system\n{system_message}"
    system_message = system_template.format(system_message=system_message)
    _messages = _map_roles(messages, _roles)
    _messages.append((_roles["assistant"], None))  # open the assistant turn for generation
    _sep = "<|im_end|>"
    _prompt = _format_chatml(system_message, _messages, _sep)
    return ChatFormatterResponse(prompt=_prompt, stop=["<|im_end|>", "<|endoftext|>"])
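
For reference, a minimal usage sketch of the new format. The GGUF file name below is a placeholder, not something shipped with the repo:

from llama_cpp import Llama

# Hypothetical model path; any Qwen3 text model converted to GGUF should work.
llm = Llama(model_path="./Qwen3-8B-Q4_K_M.gguf", chat_format="qwen3")
out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Hello, who are you?"}],
)
print(out["choices"][0]["message"]["content"])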

2. Add a Qwen3VLChatHandler class to llama_cpp/llama_chat_format.py

After the existing Qwen25VLChatHandler class (around line 3520), add the new Qwen3VLChatHandler class:

class Qwen3VLChatHandler(Llava15ChatHandler):
    DEFAULT_SYSTEM_MESSAGE = "You are Qwen, a helpful assistant."

    CHAT_FORMAT = (
        "{% for message in messages %}"
        "{% if loop.first and message['role'] != 'system' %}"
        "<|im_start|>system\n"
        "You are Qwen, a helpful assistant.<|im_end|>\n"
        "{% endif %}"
        "<|im_start|>{{ message['role'] }}\n"
        "{% if message['content'] is string %}"
        "{{ message['content'] }}<|im_end|>\n"
        "{% else %}"
        "{% for content in message['content'] %}"
        "{% if content['type'] == 'image_url' %}"
        "{% if content.image_url is string %}"
        "{{ content.image_url }}"
        "{% else %}"
        "{{ content.image_url.url }}"
        "{% endif %}"
        "{% elif content['type'] == 'text' %}"
        "{{ content['text'] }}"
        "{% endif %}"
        "{% endfor %}"
        "<|im_end|>\n"
        "{% endif %}"
        "{% endfor %}"
        "<|im_start|>assistant\n"
    )

    def __call__(self, **kwargs):
        llama = kwargs['llama']

        # Clear state for multiple runs
        llama.reset()
        llama._ctx.kv_cache_clear()
        llama.n_tokens = 0

        if hasattr(llama, 'input_ids'):
            llama.input_ids.fill(0)

        # Clear any handler state
        if hasattr(self, '_last_image_embed'):
            self._last_image_embed = None
            self._last_image_hash = None

        if self.verbose:
            messages = kwargs.get('messages', [])
            image_count = len(self.get_image_urls(messages))
            print(f"Qwen3VL - Cleared state, processing {image_count} images", file=sys.stderr)

        # Use parent implementation
        return super().__call__(**kwargs)
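
For context, a sketch of how the handler could be used directly. File names are placeholders; substitute the real text-model and mmproj GGUFs:

from llama_cpp import Llama
from llama_cpp.llama_chat_format import Qwen3VLChatHandler

chat_handler = Qwen3VLChatHandler(clip_model_path="./mmproj-qwen3-vl.gguf")
llm = Llama(
    model_path="./Qwen3-VL-Q4_K_M.gguf",
    chat_handler=chat_handler,
    n_ctx=4096,  # leave room for image embeddings in the context window
)
out = llm.create_chat_completion(
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "image_url", "image_url": {"url": "file:///path/to/image.png"}},
                {"type": "text", "text": "Describe this image."},
            ],
        }
    ],
)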

3. Modify llama_cpp/server/model.py

In the load_llama_from_model_settings function, find the block that handles qwen2.5-vl (around lines 175-184) and add support for qwen3-vl after it:

        elif settings.chat_format == "qwen3-vl":
            assert settings.clip_model_path is not None, "clip model not found"
            if settings.hf_model_repo_id is not None:
                chat_handler = (
                    llama_cpp.llama_chat_format.Qwen3VLChatHandler.from_pretrained(
                        repo_id=settings.hf_model_repo_id,
                        filename=settings.clip_model_path,
                        verbose=settings.verbose,
                    )
                )
            else:
                chat_handler = llama_cpp.llama_chat_format.Qwen3VLChatHandler(
                    clip_model_path=settings.clip_model_path, verbose=settings.verbose
                )
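
The server could then be started along these lines (file names are placeholders):

python -m llama_cpp.server \
  --model ./Qwen3-VL-Q4_K_M.gguf \
  --chat_format qwen3-vl \
  --clip_model_path ./mmproj-qwen3-vl.gguf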

4. Update the README.md documentation

In the multimodal models table in README.md (around line 507), add a row for Qwen3-VL:

| qwen3-vl | Qwen3VLChatHandler | qwen3-vl |

Acceptance criteria

  1. A Qwen3 text model can be loaded with chat_format="qwen3" and chats normally
  2. A Qwen3-VL multimodal model can be loaded with chat_format="qwen3-vl" together with a clip model
  3. Server mode correctly recognizes and loads the qwen3-vl format (see the check sketched after this list)
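
A quick end-to-end check of criterion 3, assuming the openai Python client and the server's default port:

from openai import OpenAI

# Point the client at the local llama-cpp-python server started in step 3.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="sk-no-key-required")
resp = client.chat.completions.create(
    model="qwen3-vl",  # model name as configured on the server
    messages=[{"role": "user", "content": "Say hello."}],
)
print(resp.choices[0].message.content)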



Copilot AI self-assigned this Dec 15, 2025
@MrChenLearnSpace marked this pull request as ready for review December 15, 2025 18:37
Copilot AI review requested due to automatic review settings December 15, 2025 18:37

Copilot AI left a comment


Copilot wasn't able to review any files in this pull request.



@MrChenLearnSpace merged commit 8bb2105 into main Dec 15, 2025
1 check failed
