
cuda : fix RoPE after #2268 #3897

Merged 1 commit into ggerganov:master on Nov 2, 2023
Conversation

cebtenzzre (Collaborator)

Follow-up to #2268

ref: #2268 (comment)

I had meant to test this change, but I had evidently forgotten the `-ngl` parameter. Then CI passed and I thought I was in the clear to merge. Oops.

Integer division semantics had slipped my mind because I've been writing too much Python. And apparently `row` does not mean what I thought it did in this kernel. I don't think figuring out the details of `ne00 != n_dims` is worth the trouble if we don't have a model to actually test with, and I think the comment makes it sufficiently clear what is going on.
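For reference, the division semantics in question, as a minimal standalone sketch (plain C with hypothetical values, not the actual kernel code): C, C++, and CUDA truncate integer division toward zero, whereas Python's `/` always produces a float (Python's `//` is the truncating one).

```c
#include <stdio.h>

int main(void) {
    int i = 7;
    // Integer division truncates toward zero in C, so this prints 3, not 3.5.
    printf("%d\n", i / 2);
    // A common idiom that relies on truncation: round i down to the nearest
    // even number. Prints 6.
    printf("%d\n", i / 2 * 2);
    // In Python, 7 / 2 == 3.5; the equivalent of C's behavior is 7 // 2 == 3.
    return 0;
}
```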

ggerganov merged commit 2fffa0d into ggerganov:master on Nov 2, 2023
32 checks passed
ggerganov (Owner)

We have an assert for `ne00 == n_dims`, so if such a model appears, it will fire:

```c
GGML_ASSERT(ne00 == n_dims && "ne00 != n_dims is not implemented for CUDA yet");
```
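For context, a rough sketch of what supporting `ne00 != n_dims` would mean conceptually: rotate only the first `n_dims` elements of each `ne00`-wide row and pass the remainder through untouched. This is plain C following the ggml naming convention (`ne00`, `n_dims`), not the actual CUDA kernel; the pairing layout is illustrative only.

```c
#include <assert.h>
#include <math.h>

// Illustrative only: partial RoPE over a single row of ne00 floats.
// Only the first n_dims elements are rotated; the real ggml kernels also
// handle different pairing layouts (e.g. NeoX-style) and tensor strides.
static void rope_row_partial(float * row, int ne00, int n_dims,
                             int pos, float freq_base) {
    assert(n_dims <= ne00 && n_dims % 2 == 0);
    for (int i = 0; i < n_dims; i += 2) {
        // standard RoPE frequency for pair i/2: pos * base^(-i/n_dims)
        const float theta = (float) pos * powf(freq_base, -(float) i / n_dims);
        const float c = cosf(theta);
        const float s = sinf(theta);
        const float x0 = row[i + 0];
        const float x1 = row[i + 1];
        row[i + 0] = x0*c - x1*s;
        row[i + 1] = x0*s + x1*c;
    }
    // elements [n_dims, ne00) are intentionally left unrotated; this is
    // exactly the case the assert above rules out for the CUDA backend
}
```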

olexiyb pushed a commit to Sanctum-AI/llama.cpp that referenced this pull request on Nov 23, 2023