Description
Name and Version
version: 6717 (b260213)
built with Intel(R) oneAPI DPC++/C++ Compiler 2025.3.1 (2025.3.1.20251023) for x86_64-unknown-linux-gnu
Operating systems
Linux
Which llama.cpp modules do you know to be affected?
No response
Command line
./llama-server -m ${MODEL_FILE}
Problem description & steps to reproduce
Since the following change, I have had problems running GGUF models (every model I tested, mostly Qwen models) with the SYCL backend:
b260213#diff-3b2a242ed9135b23435e9ac1c2302c321017416eb545c294ca4108ad6dab7079L4389
I compiled the project using the script examples/sycl/build.sh.
I then load a model (e.g. Qwen2.5.1-Coder-7B-Instruct-Q8_0.gguf) and run a request.
Generally the output starts, but after a few tokens the generation either stops or garbage is appended, such as "import DebugCall09090909090..." with many repeated characters.
I did not see these problems before the change above.
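For completeness, the steps above can be sketched as a command sequence. The model path, server port, and prompt below are placeholders I chose for illustration, not part of the original report; the build script and the llama-server /completion endpoint are from llama.cpp itself.

```shell
# 1. Build with the SYCL backend using the script shipped in the repo
./examples/sycl/build.sh

# 2. Start the server with the model (as in the reported command line;
#    the model path is a placeholder)
MODEL_FILE=./models/Qwen2.5.1-Coder-7B-Instruct-Q8_0.gguf
./build/bin/llama-server -m "${MODEL_FILE}" &

# 3. Send a completion request; after a few tokens the output stops or
#    degenerates into repeated garbage such as "import DebugCall0909..."
curl -s http://127.0.0.1:8080/completion \
  -H "Content-Type: application/json" \
  -d '{"prompt": "Write a hello world in Python", "n_predict": 128}'
```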
SYCL Compiler version: 20250301