Hello,
Thank you very much for your great project. I'm trying to replace the multiplication of Q and K in your project with my own matrix multiplication process, but I've encountered some problems during this process.
In the vit_encode_image function, you use ggml_mul_mat for the multiplication of Q and K:
struct ggml_tensor *KQ = ggml_mul_mat(ctx0, K, Q);
I want to replace this call with my own matrix multiplication implementation. The goal is to check whether it is feasible to offload this computation to an FPGA.
However, I'm stuck. It seems that ggml has to define the whole computational graph before anything is executed, so at this point in the code I can't directly access the data of Q and K.
Q = ggml_view_3d(ctx0, cur, hidden_size, W * H, B, cur->nb[1], cur->nb[2], 0 * cur->nb[3]);
Q = ggml_reshape_4d(ctx0, Q, n_enc_head_dim, num_attention_heads, W * H, B);
Q = ggml_cont(ctx0, ggml_permute(ctx0, Q, 0, 2, 1, 3));
Q = ggml_reshape_3d(ctx0, Q, n_enc_head_dim, W * H, B * num_attention_heads);
K = ggml_view_3d(ctx0, cur, hidden_size, W * H, B, cur->nb[1], cur->nb[2], 1 * cur->nb[3]);
K = ggml_reshape_4d(ctx0, K, n_enc_head_dim, num_attention_heads, W * H, B);
K = ggml_cont(ctx0, ggml_permute(ctx0, K, 0, 2, 1, 3));
K = ggml_reshape_3d(ctx0, K, n_enc_head_dim, W * H, B * num_attention_heads);
V = ggml_view_3d(ctx0, cur, hidden_size, W * H, B, cur->nb[1], cur->nb[2], 2 * cur->nb[3]);
V = ggml_reshape_4d(ctx0, V, n_enc_head_dim, num_attention_heads, W * H, B);
V = ggml_cont(ctx0, ggml_permute(ctx0, V, 1, 2, 0, 3)); // transposed
V = ggml_reshape_3d(ctx0, V, W * H, n_enc_head_dim, B * num_attention_heads);
The code above is the relevant part that prepares Q, K, and V. Because these lines only build the graph, I can't simply write my own QK multiplication call right after them.
I tried to define a custom operation with ggml's ggml_map_custom2d function, but it seems to require the output tensor to have the same shape as the input tensor, which does not fit matrix multiplication. I don't know what to do next.
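For reference, here is a minimal plain-C sketch of the product that ggml_mul_mat(ctx0, K, Q) computes on the tensors above, to make the shape mismatch concrete. It assumes Q and K are contiguous float32 buffers laid out as [head_dim, n_tokens, n_batch] (fastest dimension first), where n_tokens stands for W * H and n_batch for B * num_attention_heads. The function name qk_matmul_ref and its parameter names are hypothetical and not part of ggml. The output has shape [n_tokens, n_tokens, n_batch], which differs from both inputs.

```c
#include <stddef.h>

/*
 * Hypothetical reference kernel (not a ggml function).
 * Assumes contiguous float32 data:
 *   K, Q : n_batch matrices of n_tokens rows, each row head_dim floats
 *   KQ   : n_batch matrices of n_tokens rows, each row n_tokens floats
 * Each output element is the dot product of one K row and one Q row:
 *   KQ[b][q][k] = sum_d K[b][k][d] * Q[b][q][d]
 */
static void qk_matmul_ref(const float *K, const float *Q, float *KQ,
                          size_t head_dim, size_t n_tokens, size_t n_batch)
{
    for (size_t b = 0; b < n_batch; ++b) {
        const float *Kb  = K  + b * n_tokens * head_dim;
        const float *Qb  = Q  + b * n_tokens * head_dim;
        float       *KQb = KQ + b * n_tokens * n_tokens;

        for (size_t q = 0; q < n_tokens; ++q) {
            for (size_t k = 0; k < n_tokens; ++k) {
                float acc = 0.0f;
                for (size_t d = 0; d < head_dim; ++d) {
                    acc += Kb[k * head_dim + d] * Qb[q * head_dim + d];
                }
                /* Row index is the Q token, column index is the K token. */
                KQb[q * n_tokens + k] = acc;
            }
        }
    }
}
```

A loop nest like this is what an FPGA kernel would have to reproduce. The open question is how to hook it into the graph so that it runs only when the data for Q and K is actually available.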
I'm new to ggml, could you please give me some advice? Thank you very much!