Misc. bug: Using Vulkan device startup error in Docker #16617

@wszgrcy

Description

Name and Version

Same binary in both environments.

On Ubuntu (host):

load_backend: loaded RPC backend from /home/aaa/asr-backend/dist/data/llama/llama/llama-b6715-bin-ubuntu-vulkan-x64/build/bin/libggml-rpc.so
ggml_vulkan: Found 1 Vulkan devices:
ggml_vulkan: 0 = AMD Radeon Graphics (RADV PHOENIX) (radv) | uma: 1 | fp16: 1 | bf16: 0 | warp size: 64 | shared memory: 65536 | int dot: 1 | matrix cores: KHR_coopmat
load_backend: loaded Vulkan backend from /home/chen/asr-backend/dist/data/llama/llama/llama-b6715-bin-ubuntu-vulkan-x64/build/bin/libggml-vulkan.so
load_backend: loaded CPU backend from /home/chen/asr-backend/dist/data/llama/llama/llama-b6715-bin-ubuntu-vulkan-x64/build/bin/libggml-cpu-icelake.so
version: 6715 (12bbc3fa)
built with cc (Ubuntu 11.4.0-1ubuntu1~22.04.2) 11.4.0 for x86_64-linux-gnu

In Docker (the CPU backend is not loaded):

Files present in the bin directory:

LICENSE            libggml-base.so             libggml-cpu-sapphirerapids.so  libggml-vulkan.so    llama-bench       llama-llava-cli     llama-qwen2vl-cli  rpc-server
LICENSE-curl       libggml-cpu-alderlake.so    libggml-cpu-skylakex.so        libggml.so           llama-cli         llama-minicpmv-cli  llama-run
LICENSE-httplib    libggml-cpu-haswell.so      libggml-cpu-sse42.so           libllama.so          llama-gemma3-cli  llama-mtmd-cli      llama-server
LICENSE-jsonhpp    libggml-cpu-icelake.so      libggml-cpu-x64.so             libmtmd.so           llama-gguf-split  llama-perplexity    llama-tokenize
LICENSE-linenoise  libggml-cpu-sandybridge.so  libggml-rpc.so                 llama-batched-bench  llama-imatrix     llama-quantize      llama-tts

Log:

load_backend: loaded RPC backend from /app/data/llama/llama/llama-b6715-bin-ubuntu-vulkan-x64/build/bin/libggml-rpc.so
ggml_vulkan: Found 1 Vulkan devices:
ggml_vulkan: 0 = AMD Radeon 780M Graphics (RADV PHOENIX) (radv) | uma: 1 | fp16: 1 | bf16: 0 | warp size: 64 | shared memory: 65536 | int dot: 1 | matrix cores: KHR_coopmat
load_backend: loaded Vulkan backend from /app/data/llama/llama/llama-b6715-bin-ubuntu-vulkan-x64/build/bin/libggml-vulkan.so
version: 6715 (12bbc3fa)
built with cc (Ubuntu 11.4.0-1ubuntu1~22.04.2) 11.4.0 for x86_64-linux-gnu

Operating systems

Other? (Please let us know in description), Linux

Which llama.cpp modules do you know to be affected?

llama-server

Command line

/app/data/llama/llama/llama-b6715-bin-ubuntu-vulkan-x64/build/bin/llama-server --n-gpu-layers 0 --verbose --model /app/data/llama/models/wszgrcy-Hunyuan-MT-7B-Hunyuan-MT-7B-Q4_K_M.gguf

Problem description & steps to reproduce

My setup is somewhat unusual.
I am using an AMD 780M (1103).
Under Ubuntu (host), the Vulkan device starts normally and the model produces output.
In Docker it does not work: the Vulkan device is detected successfully, but loading ends with the error llama_model_load: error loading model: make_cpu_buft_list: no CPU backend found.
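
Note that the file listing above shows all the libggml-cpu-*.so variants are present inside the container, so the failure most likely happens when the libraries are loaded, not because they are missing. Since load_backend loads these libraries dynamically and skips any it cannot resolve, checking each CPU backend for unresolved shared-library dependencies inside the container should point at the cause. A minimal diagnostic sketch (BIN is assumed to match the path used above):

BIN=/app/data/llama/llama/llama-b6715-bin-ubuntu-vulkan-x64/build/bin
for f in "$BIN"/libggml-cpu-*.so; do
    echo "== $f"
    ldd "$f" | grep "not found"   # any line here names a dependency missing from the image
done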

Dockerfile:

FROM ubuntu:25.10
RUN apt-get update && \
    apt-get install -y --no-install-recommends \
    ca-certificates libvulkan1 mesa-vulkan-drivers curl vulkan-tools && \
    rm -rf /var/lib/apt/lists/*
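
If the check above reports a missing libgomp.so.1, that would fit: the prebuilt Ubuntu binaries are typically built with OpenMP, and the minimal ubuntu base image does not ship its runtime. An untested sketch of the Dockerfile with libgomp1 added:

FROM ubuntu:25.10
# libgomp1 added on the assumption that the prebuilt CPU backends need the OpenMP runtime
RUN apt-get update && \
    apt-get install -y --no-install-recommends \
    ca-certificates libvulkan1 mesa-vulkan-drivers curl vulkan-tools libgomp1 && \
    rm -rf /var/lib/apt/lists/*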

Docker Compose:

  backend:
    image: build:1.0.0
    devices:
      - /dev/kfd
      - /dev/dri
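
After rebuilding the image, the success criterion is the load_backend line that appears on the host but not in the container. A quick check, using the same command and paths as above and filtering for the backend messages:

/app/data/llama/llama/llama-b6715-bin-ubuntu-vulkan-x64/build/bin/llama-server --n-gpu-layers 0 --verbose --model /app/data/llama/models/wszgrcy-Hunyuan-MT-7B-Hunyuan-MT-7B-Q4_K_M.gguf 2>&1 | grep load_backend

If the fix works, a line like "load_backend: loaded CPU backend from .../libggml-cpu-icelake.so" should appear, as it does on the host.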

First Bad Commit

No response

Relevant log output

root@01e8c9dbbbd7:/app# /app/data/llama/llama/llama-b6715-bin-ubuntu-vulkan-x64/build/bin/llama-server --n-gpu-layers 0 --verbose --model /app/data/llama/models/wszgrcy-Hunyuan-MT-7B-Hunyuan-MT-7B-Q4_K_M.gguf
load_backend: loaded RPC backend from /app/data/llama/llama/llama-b6715-bin-ubuntu-vulkan-x64/build/bin/libggml-rpc.so
ggml_vulkan: Found 1 Vulkan devices:
ggml_vulkan: 0 = AMD Radeon 780M Graphics (RADV PHOENIX) (radv) | uma: 1 | fp16: 1 | bf16: 0 | warp size: 64 | shared memory: 65536 | int dot: 1 | matrix cores: KHR_coopmat
load_backend: loaded Vulkan backend from /app/data/llama/llama/llama-b6715-bin-ubuntu-vulkan-x64/build/bin/libggml-vulkan.so
build: 6715 (12bbc3fa) with cc (Ubuntu 11.4.0-1ubuntu1~22.04.2) 11.4.0 for x86_64-linux-gnu
system info: n_threads = 16, n_threads_batch = 16, total_threads = 16

system_info: n_threads = 16 (n_threads_batch = 16) / 16 | 

main: binding port with default address family
main: HTTP server is listening, hostname: 127.0.0.1, port: 8080, http threads: 15
main: loading model
srv    load_model: loading model '/app/data/llama/models/wszgrcy-Hunyuan-MT-7B-Hunyuan-MT-7B-Q4_K_M.gguf'
llama_model_load_from_file_impl: using device Vulkan0 (AMD Radeon 780M Graphics (RADV PHOENIX)) (0000:01:00.0) - 18201 MiB free
llama_model_loader: loaded meta data with 30 key-value pairs and 354 tensors from /app/data/llama/models/wszgrcy-Hunyuan-MT-7B-Hunyuan-MT-7B-Q4_K_M.gguf (version GGUF V3 (latest))
llama_model_loader: Dumping metadata keys/values. Note: KV overrides do not apply in this output.
llama_model_loader: - kv   0:                       general.architecture str              = hunyuan-dense
llama_model_loader: - kv   1:                               general.type str              = model
llama_model_loader: - kv   2:                               general.name str              = Model
llama_model_loader: - kv   3:                         general.size_label str              = 7.5B
llama_model_loader: - kv   4:                               general.tags arr[str,1]       = ["translation"]
llama_model_loader: - kv   5:                          general.languages arr[str,36]      = ["zh", "en", "fr", "pt", "es", "ja", ...
llama_model_loader: - kv   6:                  hunyuan-dense.block_count u32              = 32
llama_model_loader: - kv   7:               hunyuan-dense.context_length u32              = 262144
llama_model_loader: - kv   8:             hunyuan-dense.embedding_length u32              = 4096
llama_model_loader: - kv   9:          hunyuan-dense.feed_forward_length u32              = 14336
llama_model_loader: - kv  10:         hunyuan-dense.attention.head_count u32              = 32
llama_model_loader: - kv  11:      hunyuan-dense.attention.head_count_kv u32              = 8
llama_model_loader: - kv  12:               hunyuan-dense.rope.freq_base f32              = 1200508032.000000
llama_model_loader: - kv  13: hunyuan-dense.attention.layer_norm_rms_epsilon f32              = 0.000010
llama_model_loader: - kv  14:         hunyuan-dense.attention.key_length u32              = 128
llama_model_loader: - kv  15:       hunyuan-dense.attention.value_length u32              = 128
llama_model_loader: - kv  16:            hunyuan-dense.rope.scaling.type str              = none
llama_model_loader: - kv  17:          hunyuan-dense.rope.scaling.factor f32              = 1.000000
llama_model_loader: - kv  18: hunyuan-dense.rope.scaling.original_context_length u32              = 262144
llama_model_loader: - kv  19:                       tokenizer.ggml.model str              = gpt2
llama_model_loader: - kv  20:                         tokenizer.ggml.pre str              = hunyuan
llama_model_loader: - kv  21:                      tokenizer.ggml.tokens arr[str,128256]  = ["!", "\"", "#", "$", "%", "&", "'", ...
llama_model_loader: - kv  22:                  tokenizer.ggml.token_type arr[i32,128256]  = [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ...
llama_model_loader: - kv  23:                      tokenizer.ggml.merges arr[str,264306]  = ["Ġ Ġ", "Ġ ĠĠĠ", "ĠĠ ĠĠ", "...
llama_model_loader: - kv  24:                tokenizer.ggml.bos_token_id u32              = 127958
llama_model_loader: - kv  25:                tokenizer.ggml.eos_token_id u32              = 127960
llama_model_loader: - kv  26:            tokenizer.ggml.padding_token_id u32              = 127961
llama_model_loader: - kv  27:                    tokenizer.chat_template str              = {% set ns = namespace(has_head=true) ...
llama_model_loader: - kv  28:               general.quantization_version u32              = 2
llama_model_loader: - kv  29:                          general.file_type u32              = 15
llama_model_loader: - type  f32:  129 tensors
llama_model_loader: - type q4_K:  192 tensors
llama_model_loader: - type q6_K:   33 tensors
print_info: file format = GGUF V3 (latest)
print_info: file type   = Q4_K - Medium
print_info: file size   = 4.30 GiB (4.92 BPW) 
init_tokenizer: initializing tokenizer for type 2
load: control token: 128165 '<|extra_203|>' is not marked as EOG
load: control token: 128164 '<|extra_202|>' is not marked as EOG
load: control token: 128155 '<|extra_193|>' is not marked as EOG
load: control token: 128152 '<|extra_190|>' is not marked as EOG
load: control token: 128147 '<|extra_185|>' is not marked as EOG
load: control token: 128143 '<|extra_181|>' is not marked as EOG
load: control token: 128142 '<|extra_180|>' is not marked as EOG
load: control token: 128141 '<|extra_179|>' is not marked as EOG
load: control token: 128138 '<|extra_176|>' is not marked as EOG
load: control token: 128136 '<|extra_174|>' is not marked as EOG
load: control token: 128134 '<|extra_172|>' is not marked as EOG
load: control token: 128133 '<|extra_171|>' is not marked as EOG
load: control token: 128129 '<|extra_167|>' is not marked as EOG
load: control token: 128127 '<|extra_165|>' is not marked as EOG
load: control token: 128126 '<|extra_164|>' is not marked as EOG
load: control token: 128124 '<|extra_162|>' is not marked as EOG
load: control token: 128120 '<|extra_158|>' is not marked as EOG
load: control token: 128119 '<|extra_157|>' is not marked as EOG
load: control token: 128117 '<|extra_155|>' is not marked as EOG
load: control token: 128115 '<|extra_153|>' is not marked as EOG
load: control token: 128114 '<|extra_152|>' is not marked as EOG
load: control token: 128110 '<|extra_148|>' is not marked as EOG
load: control token: 128109 '<|extra_147|>' is not marked as EOG
load: control token: 128107 '<|extra_145|>' is not marked as EOG
load: control token: 128105 '<|extra_143|>' is not marked as EOG
load: control token: 128102 '<|extra_140|>' is not marked as EOG
load: control token: 128100 '<|extra_138|>' is not marked as EOG
load: control token: 128099 '<|extra_137|>' is not marked as EOG
load: control token: 128097 '<|extra_135|>' is not marked as EOG
load: control token: 128096 '<|extra_134|>' is not marked as EOG
load: control token: 128095 '<|extra_133|>' is not marked as EOG
load: control token: 128092 '<|extra_130|>' is not marked as EOG
load: control token: 128090 '<|extra_128|>' is not marked as EOG
load: control token: 128089 '<|extra_127|>' is not marked as EOG
load: control token: 128085 '<|extra_123|>' is not marked as EOG
load: control token: 128080 '<|extra_118|>' is not marked as EOG
load: control token: 128077 '<|extra_115|>' is not marked as EOG
load: control token: 128074 '<|extra_112|>' is not marked as EOG
load: control token: 128068 '<|extra_106|>' is not marked as EOG
load: control token: 128066 '<|extra_104|>' is not marked as EOG
load: control token: 128065 '<|extra_103|>' is not marked as EOG
load: control token: 128063 '<|extra_101|>' is not marked as EOG
load: control token: 128062 '<|extra_100|>' is not marked as EOG
load: control token: 128061 '<|extra_99|>' is not marked as EOG
load: control token: 128056 '<|extra_94|>' is not marked as EOG
load: control token: 128050 '<|extra_88|>' is not marked as EOG
load: control token: 128046 '<|extra_84|>' is not marked as EOG
load: control token: 128044 '<|extra_82|>' is not marked as EOG
load: control token: 128041 '<|extra_79|>' is not marked as EOG
load: control token: 128038 '<|extra_76|>' is not marked as EOG
load: control token: 128037 '<|extra_75|>' is not marked as EOG
load: control token: 128034 '<|extra_72|>' is not marked as EOG
load: control token: 128032 '<|extra_70|>' is not marked as EOG
load: control token: 128030 '<|extra_68|>' is not marked as EOG
load: control token: 128028 '<|extra_66|>' is not marked as EOG
load: control token: 128027 '<|extra_65|>' is not marked as EOG
load: control token: 128025 '<|extra_63|>' is not marked as EOG
load: control token: 128024 '<|extra_62|>' is not marked as EOG
load: control token: 128023 '<|extra_61|>' is not marked as EOG
load: control token: 128022 '<|extra_60|>' is not marked as EOG
load: control token: 128018 '<|extra_56|>' is not marked as EOG
load: control token: 128016 '<|extra_54|>' is not marked as EOG
load: control token: 128013 '<|extra_51|>' is not marked as EOG
load: control token: 128011 '<|extra_49|>' is not marked as EOG
load: control token: 128010 '<|extra_48|>' is not marked as EOG
load: control token: 128008 '<|extra_46|>' is not marked as EOG
load: control token: 128005 '<|extra_43|>' is not marked as EOG
load: control token: 128003 '<|extra_41|>' is not marked as EOG
load: control token: 128000 '<|extra_38|>' is not marked as EOG
load: control token: 127995 '<|extra_33|>' is not marked as EOG
load: control token: 127994 '<|extra_32|>' is not marked as EOG
load: control token: 127992 '<|extra_30|>' is not marked as EOG
load: control token: 127988 '<|extra_26|>' is not marked as EOG
load: control token: 127980 '<|extra_18|>' is not marked as EOG
load: control token: 127979 '<|extra_17|>' is not marked as EOG
load: control token: 127978 '<|extra_16|>' is not marked as EOG
load: control token: 127974 '<|extra_12|>' is not marked as EOG
load: control token: 127973 '<|extra_11|>' is not marked as EOG
load: control token: 127972 '<|extra_10|>' is not marked as EOG
load: control token: 127968 '<|extra_6|>' is not marked as EOG
load: control token: 127966 '<|extra_4|>' is not marked as EOG
load: control token: 127965 '<|extra_3|>' is not marked as EOG
load: control token: 127964 '<|extra_2|>' is not marked as EOG
load: control token: 127963 '<|extra_1|>' is not marked as EOG
load: control token: 127961 '<|pad|>' is not marked as EOG
load: control token: 127959 '<|bos|>' is not marked as EOG
load: control token: 128001 '<|extra_39|>' is not marked as EOG
load: control token: 127990 '<|extra_28|>' is not marked as EOG
load: control token: 128145 '<|extra_183|>' is not marked as EOG
load: control token: 128048 '<|extra_86|>' is not marked as EOG
load: control token: 128051 '<|extra_89|>' is not marked as EOG
load: control token: 127983 '<|extra_21|>' is not marked as EOG
load: control token: 128049 '<|extra_87|>' is not marked as EOG
load: control token: 128151 '<|extra_189|>' is not marked as EOG
load: control token: 127967 '<|extra_5|>' is not marked as EOG
load: control token: 128130 '<|extra_168|>' is not marked as EOG
load: control token: 128007 '<|extra_45|>' is not marked as EOG
load: control token: 128004 '<|extra_42|>' is not marked as EOG
load: control token: 127960 '<|eos|>' is not marked as EOG
load: control token: 128160 '<|extra_198|>' is not marked as EOG
load: control token: 128159 '<|extra_197|>' is not marked as EOG
load: control token: 127999 '<|extra_37|>' is not marked as EOG
load: control token: 128106 '<|extra_144|>' is not marked as EOG
load: control token: 128088 '<|extra_126|>' is not marked as EOG
load: control token: 127997 '<|extra_35|>' is not marked as EOG
load: control token: 128146 '<|extra_184|>' is not marked as EOG
load: control token: 128113 '<|extra_151|>' is not marked as EOG
load: control token: 128104 '<|extra_142|>' is not marked as EOG
load: control token: 128101 '<|extra_139|>' is not marked as EOG
load: control token: 128035 '<|extra_73|>' is not marked as EOG
load: control token: 128057 '<|extra_95|>' is not marked as EOG
load: control token: 128148 '<|extra_186|>' is not marked as EOG
load: control token: 128059 '<|extra_97|>' is not marked as EOG
load: control token: 128108 '<|extra_146|>' is not marked as EOG
load: control token: 127977 '<|extra_15|>' is not marked as EOG
load: control token: 128029 '<|extra_67|>' is not marked as EOG
load: control token: 128139 '<|extra_177|>' is not marked as EOG
load: control token: 128002 '<|extra_40|>' is not marked as EOG
load: control token: 128091 '<|extra_129|>' is not marked as EOG
load: control token: 128098 '<|extra_136|>' is not marked as EOG
load: control token: 128081 '<|extra_119|>' is not marked as EOG
load: control token: 128122 '<|extra_160|>' is not marked as EOG
load: control token: 128078 '<|extra_116|>' is not marked as EOG
load: control token: 128070 '<|extra_108|>' is not marked as EOG
load: control token: 128156 '<|extra_194|>' is not marked as EOG
load: control token: 128054 '<|extra_92|>' is not marked as EOG
load: control token: 128014 '<|extra_52|>' is not marked as EOG
load: control token: 128083 '<|extra_121|>' is not marked as EOG
load: control token: 127981 '<|extra_19|>' is not marked as EOG
load: control token: 128162 '<|extra_200|>' is not marked as EOG
load: control token: 128131 '<|extra_169|>' is not marked as EOG
load: control token: 128060 '<|extra_98|>' is not marked as EOG
load: control token: 127984 '<|extra_22|>' is not marked as EOG
load: control token: 128132 '<|extra_170|>' is not marked as EOG
load: control token: 128009 '<|extra_47|>' is not marked as EOG
load: control token: 128125 '<|extra_163|>' is not marked as EOG
load: control token: 128140 '<|extra_178|>' is not marked as EOG
load: control token: 128047 '<|extra_85|>' is not marked as EOG
load: control token: 127971 '<|extra_9|>' is not marked as EOG
load: control token: 128039 '<|extra_77|>' is not marked as EOG
load: control token: 128072 '<|extra_110|>' is not marked as EOG
load: control token: 128052 '<|extra_90|>' is not marked as EOG
load: control token: 128103 '<|extra_141|>' is not marked as EOG
load: control token: 127962 '<|extra_0|>' is not marked as EOG
load: control token: 128087 '<|extra_125|>' is not marked as EOG
load: control token: 127996 '<|extra_34|>' is not marked as EOG
load: control token: 128128 '<|extra_166|>' is not marked as EOG
load: control token: 127976 '<|extra_14|>' is not marked as EOG
load: control token: 128154 '<|extra_192|>' is not marked as EOG
load: control token: 128020 '<|extra_58|>' is not marked as EOG
load: control token: 128042 '<|extra_80|>' is not marked as EOG
load: control token: 128055 '<|extra_93|>' is not marked as EOG
load: control token: 128093 '<|extra_131|>' is not marked as EOG
load: control token: 127989 '<|extra_27|>' is not marked as EOG
load: control token: 128067 '<|extra_105|>' is not marked as EOG
load: control token: 128123 '<|extra_161|>' is not marked as EOG
load: control token: 128116 '<|extra_154|>' is not marked as EOG
load: control token: 128121 '<|extra_159|>' is not marked as EOG
load: control token: 128075 '<|extra_113|>' is not marked as EOG
load: control token: 128006 '<|extra_44|>' is not marked as EOG
load: control token: 128163 '<|extra_201|>' is not marked as EOG
load: control token: 128064 '<|extra_102|>' is not marked as EOG
load: control token: 128058 '<|extra_96|>' is not marked as EOG
load: control token: 128012 '<|extra_50|>' is not marked as EOG
load: control token: 128094 '<|extra_132|>' is not marked as EOG
load: control token: 128031 '<|extra_69|>' is not marked as EOG
load: control token: 127970 '<|extra_8|>' is not marked as EOG
load: control token: 128017 '<|extra_55|>' is not marked as EOG
load: control token: 128112 '<|extra_150|>' is not marked as EOG
load: control token: 128040 '<|extra_78|>' is not marked as EOG
load: control token: 128015 '<|extra_53|>' is not marked as EOG
load: control token: 128019 '<|extra_57|>' is not marked as EOG
load: control token: 127987 '<|extra_25|>' is not marked as EOG
load: control token: 128026 '<|extra_64|>' is not marked as EOG
load: control token: 127993 '<|extra_31|>' is not marked as EOG
load: control token: 128084 '<|extra_122|>' is not marked as EOG
load: control token: 127975 '<|extra_13|>' is not marked as EOG
load: control token: 128144 '<|extra_182|>' is not marked as EOG
load: control token: 128073 '<|extra_111|>' is not marked as EOG
load: control token: 128111 '<|extra_149|>' is not marked as EOG
load: control token: 128118 '<|extra_156|>' is not marked as EOG
load: control token: 128033 '<|extra_71|>' is not marked as EOG
load: control token: 128076 '<|extra_114|>' is not marked as EOG
load: control token: 128086 '<|extra_124|>' is not marked as EOG
load: control token: 128069 '<|extra_107|>' is not marked as EOG
load: control token: 128153 '<|extra_191|>' is not marked as EOG
load: control token: 128161 '<|extra_199|>' is not marked as EOG
load: control token: 127958 '<|startoftext|>' is not marked as EOG
load: control token: 128157 '<|extra_195|>' is not marked as EOG
load: control token: 127982 '<|extra_20|>' is not marked as EOG
load: control token: 127998 '<|extra_36|>' is not marked as EOG
load: control token: 128071 '<|extra_109|>' is not marked as EOG
load: control token: 128021 '<|extra_59|>' is not marked as EOG
load: control token: 127985 '<|extra_23|>' is not marked as EOG
load: control token: 128043 '<|extra_81|>' is not marked as EOG
load: control token: 128045 '<|extra_83|>' is not marked as EOG
load: control token: 128149 '<|extra_187|>' is not marked as EOG
load: control token: 128036 '<|extra_74|>' is not marked as EOG
load: control token: 127969 '<|extra_7|>' is not marked as EOG
load: control token: 128150 '<|extra_188|>' is not marked as EOG
load: control token: 128158 '<|extra_196|>' is not marked as EOG
load: control token: 128082 '<|extra_120|>' is not marked as EOG
load: control token: 127986 '<|extra_24|>' is not marked as EOG
load: control token: 128053 '<|extra_91|>' is not marked as EOG
load: control token: 127991 '<|extra_29|>' is not marked as EOG
load: control token: 128135 '<|extra_173|>' is not marked as EOG
load: control token: 128079 '<|extra_117|>' is not marked as EOG
load: control token: 128137 '<|extra_175|>' is not marked as EOG
load: special_eos_id is not in special_eog_ids - the tokenizer config may be incorrect
load: printing all EOG tokens:
load:   - 127957 ('<|endoftext|>')
load:   - 127960 ('<|eos|>')
load: special tokens cache size = 209
load: token to piece cache size = 0.7868 MB
print_info: arch             = hunyuan-dense
print_info: vocab_only       = 0
print_info: n_ctx_train      = 262144
print_info: n_embd           = 4096
print_info: n_layer          = 32
print_info: n_head           = 32
print_info: n_head_kv        = 8
print_info: n_rot            = 128
print_info: n_swa            = 0
print_info: is_swa_any       = 0
print_info: n_embd_head_k    = 128
print_info: n_embd_head_v    = 128
print_info: n_gqa            = 4
print_info: n_embd_k_gqa     = 1024
print_info: n_embd_v_gqa     = 1024
print_info: f_norm_eps       = 0.0e+00
print_info: f_norm_rms_eps   = 1.0e-05
print_info: f_clamp_kqv      = 0.0e+00
print_info: f_max_alibi_bias = 0.0e+00
print_info: f_logit_scale    = 0.0e+00
print_info: f_attn_scale     = 0.0e+00
print_info: n_ff             = 14336
print_info: n_expert         = 0
print_info: n_expert_used    = 0
print_info: causal attn      = 1
print_info: pooling type     = 0
print_info: rope type        = 2
print_info: rope scaling     = none
print_info: freq_base_train  = 1200508032.0
print_info: freq_scale_train = 1
print_info: n_ctx_orig_yarn  = 262144
print_info: rope_finetuned   = unknown
print_info: model type       = 7B
print_info: model params     = 7.50 B
print_info: general.name     = Model
print_info: vocab type       = BPE
print_info: n_vocab          = 128256
print_info: n_merges         = 264306
print_info: BOS token        = 127958 '<|startoftext|>'
print_info: EOS token        = 127960 '<|eos|>'
print_info: EOT token        = 127957 '<|endoftext|>'
print_info: PAD token        = 127961 '<|pad|>'
print_info: LF token         = 198 'Ċ'
print_info: EOG token        = 127957 '<|endoftext|>'
print_info: EOG token        = 127960 '<|eos|>'
print_info: max token length = 256
load_tensors: loading model tensors, this can take a while... (mmap = true)
llama_model_load: error loading model: make_cpu_buft_list: no CPU backend found
llama_model_load_from_file_impl: failed to load model
common_init_from_params: failed to load model '/app/data/llama/models/wszgrcy-Hunyuan-MT-7B-Hunyuan-MT-7B-Q4_K_M.gguf', try reducing --n-gpu-layers if you're running out of VRAM
srv    load_model: failed to load model, '/app/data/llama/models/wszgrcy-Hunyuan-MT-7B-Hunyuan-MT-7B-Q4_K_M.gguf'
srv    operator(): operator(): cleaning up before exit...
main: exiting due to model loading error
