Model MTP support + Backend Issue #941

@emi-dm

Hi! I ran into this issue when using an MTP model with the vLLM backend in Docker Model Runner. While I'm here, I'd also like to ask about the vLLM Metal integration: is it activated by default on Mac when using the vLLM backend?

Another thing I noticed: I installed the vLLM backend, but the model appears to be loaded with the llama.cpp backend instead (see the log below). Is that a known issue? I followed this guide: https://www.docker.com/blog/docker-model-runner-integrates-vllm/

Thanks!

Error log:

docker model run hf.co/unsloth/Qwen3.6-27B-MTP-GGUF:UD-Q4_K_XL
Unable to find model 'hf.co/unsloth/Qwen3.6-27B-MTP-GGUF:UD-Q4_K_XL' locally. Pulling from the server.
9c68785fa64f: Pull complete [==================================================>]  17.91GB/17.91GB
00f45cd696de: Pull complete [==================================================>]  931.1MB/931.1MB
b33563055168: Pull complete [==================================================>]  25.41kB/25.41kB
053533475129: Pull complete [==================================================>]  931.1MB/931.1MB
4085665ee36d: Pull complete [==================================================>]  17.91GB/17.91GB
Model pulled successfully
> background model preload failed: preload failed: status=500 body=unable to load runner: error waiting for runner to be ready: llama.cpp terminated unexpectedly: llama.cpp failed: failed to load model

Verbose output:
llama_model_load: error loading model: missing tensor 'blk.64.ssm_conv1d.weight'
llama_model_load_from_file_impl: failed to load model
common_init_from_params: failed to load model '/Users/edm/.docker/models/bundles/sha256/ae07dd2945afaaf7034f795ec286ec4cf79e6843e23b75b7c0696e31b6d40244/model/model.gguf'
srv    load_model: failed to load model, '/Users/edm/.docker/models/bundles/sha256/ae07dd2945afaaf7034f795ec286ec4cf79e6843e23b75b7c0696e31b6d40244/model/model.gguf'
srv    operator(): operator(): cleaning up before exit...
main: exiting due to model loading error
print_info: EOG token             = 248044 '<|endoftext|>'
print_info: EOG token             = 248046 '<|im_end|>'
print_info: EOG token             = 248063 '<|fim_pad|>'
print_info: EOG token             = 248064 '<|repo_name|>'
print_info: EOG token             = 248065 '<|file_sep|>'
print_info: max token length      = 256
load_tensors: loading model tensors, this can take a while... (mmap = false, direct_io = false)
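
In case it helps triage, here is a minimal sketch for pulling the missing tensor name out of a verbose log like the one above. It assumes the error keeps the `missing tensor '<name>'` wording shown in the log; the helper names `find_missing_tensors` and `block_index` are hypothetical and not part of llama.cpp or Docker Model Runner.

```python
import re
from typing import List, Optional

# Matches the loader error wording seen in the log above (assumed stable).
MISSING_TENSOR = re.compile(r"missing tensor '(?P<name>[^']+)'")


def find_missing_tensors(log_text: str) -> List[str]:
    """Return every tensor name reported as missing, in order of appearance."""
    return [m.group("name") for m in MISSING_TENSOR.finditer(log_text)]


def block_index(tensor_name: str) -> Optional[int]:
    """Extract N from names shaped like 'blk.N.<rest>'; None for other names."""
    m = re.match(r"blk\.(\d+)\.", tensor_name)
    return int(m.group(1)) if m else None


if __name__ == "__main__":
    # Line copied from the verbose output in this report.
    sample = (
        "llama_model_load: error loading model: "
        "missing tensor 'blk.64.ssm_conv1d.weight'"
    )
    for name in find_missing_tensors(sample):
        print(name, block_index(name))
```

This only reports which tensor and block the loader complained about; it does not inspect the GGUF file itself.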
