Conversation

@MollySophia (Collaborator)

Make sure to read the contributing guidelines before submitting a PR

It seems that some RWKV tensors end up as FP16 rather than FP32 after certain recent commits. However, ggml_cuda_op_bin_bcast requires src1->type == FP32, so newly converted RWKV models cannot run with CUDA; previously converted files are unaffected. This PR fixes that issue.
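A minimal sketch of the idea behind this fix, assuming a simplified standalone helper rather than the actual convert_hf_to_gguf.py hooks; the tensor-name suffixes below are illustrative and not the exact list touched by this PR:

```python
import numpy as np

# Sketch only: keep selected RWKV tensors in F32 during conversion so that the
# CUDA binary-broadcast path (which expects src1 in F32) still works.
# The suffixes are illustrative, not the exact list from this PR.
FORCE_F32_SUFFIXES = (".time_mix_lerp_x.weight", ".time_mix_first.weight")

def choose_dtype(new_name: str, data: np.ndarray, outtype: str) -> str:
    if new_name.endswith(FORCE_F32_SUFFIXES):
        return "F32"                       # force full precision for these tensors
    if data.ndim <= 1:
        return "F32"                       # 1-D tensors are kept in F32 anyway
    return "F16" if outtype == "f16" else "F32"

# A (1, 1, 2048) tensor that would otherwise be written as F16 stays F32:
print(choose_dtype("blk.0.time_mix_lerp_x.weight", np.zeros((1, 1, 2048), np.float32), "f16"))
```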

This PR also adds LLAMA_EXAMPLE_PERPLEXITY to the examples list of the --no-context-shift parameter, so that models without context-shift support can run llama-perplexity again.

Signed-off-by: Molly Sophia <mollysophia379@gmail.com>
@github-actions bot added the python (python script changes) label on Dec 20, 2024.
@ggerganov (Member) left a comment

This is likely caused by this change where I removed the squeeze() during conversion:

https://github.com/ggerganov/llama.cpp/blob/0bf2d10c5514ff61b99897a4a5054f846e384e1e/convert_hf_to_gguf.py#L298-L301
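For context, a rough illustration of why dropping the squeeze() changes the stored type, assuming the converter keeps 1-D tensors in F32 and may write higher-rank float tensors as F16 (simplified, not the actual script code):

```python
import torch

def is_kept_f32(data) -> bool:
    # Simplified stand-in for the conversion rule: 1-D tensors stay in F32.
    return data.ndim <= 1

t = torch.zeros(1, 1, 2048)              # an RWKV tensor with leading singleton dims
print(is_kept_f32(t.squeeze().numpy()))  # True  -> written as F32 (old behaviour)
print(is_kept_f32(t.numpy()))            # False -> may be written as F16
```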

@MollySophia (Collaborator, Author)

> This is likely caused by this change where I removed the squeeze() during conversion:
>
> https://github.com/ggerganov/llama.cpp/blob/0bf2d10c5514ff61b99897a4a5054f846e384e1e/convert_hf_to_gguf.py#L298-L301

I see. So there's another way to fix this: squeeze these tensors in rwkv6's modify_tensors() rather than adding them to the F32 list?
Both should solve the issue; I wonder which way is better.
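A rough sketch of that alternative, assuming a standalone function in place of the real Rwkv6Model.modify_tensors() and illustrative tensor-name suffixes (the merged change targets the specific RWKV6 tensors that are affected):

```python
import torch

def modify_tensors(data_torch: torch.Tensor, new_name: str) -> list[tuple[str, torch.Tensor]]:
    # Illustrative suffixes; squeeze() drops the singleton dims so these tensors
    # are 1-D again and therefore kept in F32 by the converter.
    if new_name.endswith((".time_mix_lerp_x.weight", ".time_mix_first.weight")):
        data_torch = data_torch.squeeze()
    return [(new_name, data_torch)]

# Example: a (1, 1, 2048) tensor becomes (2048,) after modify_tensors()
print(modify_tensors(torch.zeros(1, 1, 2048), "blk.0.time_mix_lerp_x.weight")[0][1].shape)
```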

@ggerganov (Member)

Don't think there is any significant advantage one way or the other. Maybe squeezing in modify_tensors is a bit more localized.

@MollySophia (Collaborator, Author)

> Don't think there is any significant advantage one way or the other. Maybe squeezing in modify_tensors is a bit more localized.

Yeah, that's what I meant. Let me change it to use the more localized approach, then.

Signed-off-by: Molly Sophia <mollysophia379@gmail.com>
@ggerganov (Member)

This is good to merge?

@MollySophia (Collaborator, Author)

> This is good to merge?

Yes. Thanks a lot for your time!

@ggerganov merged commit 0a11f8b into ggml-org:master on Dec 20, 2024
50 of 51 checks passed
arthw pushed a commit to arthw/llama.cpp that referenced this pull request Feb 26, 2025
* Enable --no-context-shift for llama-perplexity example

Signed-off-by: Molly Sophia <mollysophia379@gmail.com>

* RWKV 6: Fix error in ggml_cuda_op_bin_bcast

Signed-off-by: Molly Sophia <mollysophia379@gmail.com>

---------

Signed-off-by: Molly Sophia <mollysophia379@gmail.com>