Doesn't CUDA support IQ4_XS? #6282
jiafjioajfijaiofnkan changed the title from "Does cuda not support IQ4_XS ?" to "Doesn't cuda support IQ4_XS ?" on Mar 24, 2024.
I can get it to run on my P40, though.
I checked again with the latest commit, and it still does not work. Some logs are here:
From https://github.com/ggerganov/llama.cpp/blob/master/ggml-cuda/dmmv.cu#L804 it seems the IQ* quants are not implemented.
This issue was closed because it has been inactive for 14 days since being marked as stale.
Using the latest commit on master, GPU: P100 on Kaggle.
Initialization succeeded, but when using completion on the server, an assertion occurred:
GGML_ASSERT: ggml-cuda.cu:9513: false
I found https://github.com/ggerganov/llama.cpp/blob/master/ggml-cuda.cu#L9513, where it seems IQ4_XS is not supported yet.