
Misc. bug: Server: LFM2-VL crashes on checkpoint restoring with image input #16590

Description

Name and Version

docker image: ghcr.io/ggml-org/llama.cpp:server-b6755

Operating systems

Linux

Which llama.cpp modules do you know to be affected?

llama-server

Command line

llama-server -hf LiquidAI/LFM2-VL-1.6B

Problem description & steps to reproduce

1. Run llama-server -hf LiquidAI/LFM2-VL-1.6B
2. Open the webui at http://localhost:8080
3. Upload a photo and enter a prompt
4. Once the model is done generating text, click on re-generate; the server should crash (a webui-free reproduction over the API is sketched after the log below):

main: server is listening on http://0.0.0.0:8080 - starting the main loop
srv  update_slots: all slots are idle
srv  params_from_: Chat format: Content-only
slot get_availabl: id  0 | task -1 | selected slot by LRU, t_last = -1
slot launch_slot_: id  0 | task 0 | processing task
slot update_slots: id  0 | task 0 | new prompt, n_ctx_slot = 4096, n_keep = 0, n_prompt_tokens = 270
slot update_slots: id  0 | task 0 | n_past = 0, memory_seq_rm [0, end)
slot update_slots: id  0 | task 0 | prompt processing progress, n_past = 9, n_tokens = 9, progress = 0.033333
slot update_slots: id  0 | task 0 | n_past = 9, memory_seq_rm [9, end)
srv  process_chun: processing image...
srv  process_chun: image processed in 20362 ms
slot update_slots: id  0 | task 0 | prompt processing progress, n_past = 270, n_tokens = 5, progress = 1.000000
slot update_slots: id  0 | task 0 | prompt done, n_past = 270, n_tokens = 5
slot update_slots: id  0 | task 0 | created context checkpoint 1 of 8 (pos_min = 264, pos_max = 264, size = 0.156 MiB)
srv  log_server_r: request: GET /health 127.0.0.1 200
slot print_timing: id  0 | task 0 | 
prompt eval time =   20707.07 ms /   270 tokens (   76.69 ms per token,    13.04 tokens per second)
       eval time =    5239.30 ms /   142 tokens (   36.90 ms per token,    27.10 tokens per second)
      total time =   25946.36 ms /   412 tokens
slot      release: id  0 | task 0 | stop processing: n_past = 411, truncated = 0
srv  update_slots: all slots are idle
srv  log_server_r: request: POST /v1/chat/completions 192.168.20.1 200
srv  log_server_r: request: GET /health 127.0.0.1 200
srv  log_server_r: request: GET /health 127.0.0.1 200
srv  params_from_: Chat format: Content-only
slot get_availabl: id  0 | task -1 | selected slot by LCP similarity, sim_best = 1.000 (> 0.100 thold), f_keep = 0.657
slot launch_slot_: id  0 | task 144 | processing task
slot update_slots: id  0 | task 144 | new prompt, n_ctx_slot = 4096, n_keep = 0, n_prompt_tokens = 270
slot update_slots: id  0 | task 144 | old: ... 
<|im_start|>assistant
slot update_slots: id  0 | task 144 | new: ... 
<|im_start|>assistant
slot update_slots: id  0 | task 144 |      708       6   64015     708
slot update_slots: id  0 | task 144 |      708       6   64015     708
slot update_slots: id  0 | task 144 | n_past = 270, cache_tokens.size() = 411, seq_id = 0, pos_min = 410, n_swa = 1
slot update_slots: id  0 | task 144 | restored context checkpoint (pos_min = 264, pos_max = 264, size = 0.156 MiB)
slot update_slots: id  0 | task 144 | n_past = 265, memory_seq_rm [265, end)
libggml-base.so(+0x183cb)[0x772c4dada3cb]
libggml-base.so(ggml_print_backtrace+0x21f)[0x772c4dada82f]
libggml-base.so(+0x2b20f)[0x772c4daed20f]
/lib/x86_64-linux-gnu/libstdc++.so.6(+0xae20c)[0x772c4d94220c]
/lib/x86_64-linux-gnu/libstdc++.so.6(+0xae277)[0x772c4d942277]
/lib/x86_64-linux-gnu/libstdc++.so.6(+0xae4d8)[0x772c4d9424d8]
/app/llama-server(+0x8b8d9)[0x5dd3a4a818d9]
/app/llama-server(+0xf4fd6)[0x5dd3a4aeafd6]
/app/llama-server(+0x9550f)[0x5dd3a4a8b50f]
/app/llama-server(+0x552b3)[0x5dd3a4a4b2b3]
/lib/x86_64-linux-gnu/libc.so.6(+0x29d90)[0x772c4d58dd90]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0x80)[0x772c4d58de40]
/app/llama-server(+0x56d35)[0x5dd3a4a4cd35]
terminate called after throwing an instance of 'std::runtime_error'
  what():  Chunk not found
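
For completeness, a minimal sketch of the same two-request sequence over the server's OpenAI-compatible /v1/chat/completions endpoint, without the webui. The host/port, the photo.jpg file name, and the prompt text are assumptions for illustration, and it is assumed the server accepts image_url content parts carrying a base64 data URI; the second, identical request is what exercises the slot-reuse and checkpoint-restore path shown in the log above.

# Sketch: send the same image+text chat request twice to mimic "re-generate".
# Assumptions (not from the report): server at localhost:8080, local file photo.jpg,
# OpenAI-style image_url content parts with a base64 data URI are accepted.
import base64, json, urllib.request

with open("photo.jpg", "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode()

payload = {
    "messages": [{
        "role": "user",
        "content": [
            {"type": "text", "text": "Describe this photo."},
            {"type": "image_url",
             "image_url": {"url": "data:image/jpeg;base64," + image_b64}},
        ],
    }],
}

def chat():
    # POST the chat request and return the parsed JSON response
    req = urllib.request.Request(
        "http://localhost:8080/v1/chat/completions",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

chat()   # first generation completes normally
chat()   # identical "re-generate" request reuses the slot and should crash

Sending the identical payload a second time mimics clicking re-generate: the slot is reselected by LCP similarity and the context checkpoint is restored, which is where the "Chunk not found" exception is thrown.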

First Bad Commit

No response
