Conversation

@ixgbe ixgbe (Contributor) commented Dec 3, 2025

This PR adds RISC-V Vector (RVV) extension support for the RWKV WKV6 operation, enabling vectorized computation on RISC-V platforms.

Signed-off-by: Wang Yang <yangwang@iscas.ac.cn>
@ixgbe ixgbe requested a review from ggerganov as a code owner December 3, 2025 06:17
@github-actions github-actions bot added the ggml changes relating to the ggml tensor library for machine learning label Dec 3, 2025
@CISC CISC (Collaborator) commented Dec 3, 2025

Please note that ubuntu-cpu-cmake(-rpc)-riscv64-native will fail test-tokenizers-ggml-vocabs until git-lfs is fixed on the runners; all other tests will run to completion, though.

@ggerganov ggerganov (Member) commented:
I'm thinking that we should deprecate these ops since they are very model-specific:

llama.cpp/ggml/include/ggml.h

Lines 2389 to 2417 in 37adc9c

GGML_API struct ggml_tensor * ggml_rwkv_wkv6(
        struct ggml_context * ctx,
        struct ggml_tensor  * k,
        struct ggml_tensor  * v,
        struct ggml_tensor  * r,
        struct ggml_tensor  * tf,
        struct ggml_tensor  * td,
        struct ggml_tensor  * state);

GGML_API struct ggml_tensor * ggml_gated_linear_attn(
        struct ggml_context * ctx,
        struct ggml_tensor  * k,
        struct ggml_tensor  * v,
        struct ggml_tensor  * q,
        struct ggml_tensor  * g,
        struct ggml_tensor  * state,
        float                 scale);

GGML_API struct ggml_tensor * ggml_rwkv_wkv7(
        struct ggml_context * ctx,
        struct ggml_tensor  * r,
        struct ggml_tensor  * w,
        struct ggml_tensor  * k,
        struct ggml_tensor  * v,
        struct ggml_tensor  * a,
        struct ggml_tensor  * b,
        struct ggml_tensor  * state);

Probably not worth investing much effort in optimizing. Rather, look to implement them as combination of other fundamental ops.

@ixgbe ixgbe (Contributor, Author) commented Dec 3, 2025

> I'm thinking that we should deprecate these ops since they are very model-specific: […]

I wanted to check in on this PR. Should I:

  • Wait for further discussion on the WKV op deprecation?
  • Make any changes to the current implementation?

Happy to follow whatever direction works best for the project. Thanks!

@ggerganov ggerganov (Member) commented:
The best way forward is to try to implement these operations with other ggml ops. If that works, we can replace the current implementations with the composite ones and make the deprecation process smoother. The main question is whether they can be expressed as composite ops in a meaningful way.
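For reference in that discussion, the per-token recurrence behind ggml_rwkv_wkv6 can be sketched in plain Python. This is a minimal scalar sketch of one head for one token, assuming the standard RWKV-6 WKV formulation (decay td and bonus tf applied per channel); the function name wkv6_step and the list-based layout are illustrative, not ggml API:

```python
def wkv6_step(r, k, v, tf, td, state):
    # One token of the (assumed) RWKV-6 WKV recurrence for a single head:
    #   kv[i][j]        = k[i] * v[j]                       (rank-1 update)
    #   y[j]           += r[i] * (tf[i] * kv[i][j] + state[i][j])
    #   state'[i][j]    = state[i][j] * td[i] + kv[i][j]    (per-channel decay)
    n = len(k)
    y = [0.0] * n
    new_state = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            kv = k[i] * v[j]
            y[j] += r[i] * (tf[i] * kv + state[i][j])
            new_state[i][j] = state[i][j] * td[i] + kv
    return y, new_state
```

Written this way, the candidate decomposition is visible: the update is an outer product, an elementwise multiply-add, and a matrix-vector product, which suggests it could in principle be expressed with fundamental ops (outer product / mul / add / mul_mat) at the cost of materializing the kv intermediate per token.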
