Fixes tensor addressing for ZOH and active mask in splitkv #53

LoserCheems · 2025-07-01T13:43:06Z

Refactors tensor creation to use pre-calculated offsets instead of local_tile operations, ensuring correct memory addressing for both block table and non-block table scenarios.
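A minimal sketch of what a pre-calculated offset can look like, written as plain host-side C++ so it can be checked outside the kernel. Only `active_mask_row_stride` and `active_mask_col_stride` appear in this PR's diff; the struct, the batch and head stride fields, and the function name are hypothetical and exist only for illustration. The sketch covers the contiguous (non-block table) case and does not model block table lookup.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the kernel parameters. The row/col stride names
// come from the diff; the batch/head stride fields are assumed.
struct MaskParams {
    int64_t active_mask_batch_stride;  // assumed field
    int64_t active_mask_head_stride;   // assumed field
    int64_t active_mask_row_stride;
    int64_t active_mask_col_stride;
};

// Element offset of the top-left corner of tile (m_block, n_block) for batch
// bidb and head bidh. The tile itself is then described by the shape
// (kBlockM, kBlockN) and the (row, col) strides, with no further tiling step.
inline int64_t tile_offset(const MaskParams& p, int bidb, int bidh,
                           int m_block, int n_block, int kBlockM, int kBlockN) {
    assert(bidb >= 0 && bidh >= 0 && m_block >= 0 && n_block >= 0);
    const int64_t first_row = int64_t(m_block) * kBlockM;
    const int64_t first_col = int64_t(n_block) * kBlockN;
    return bidb * p.active_mask_batch_stride
         + bidh * p.active_mask_head_stride
         + first_row * p.active_mask_row_stride
         + first_col * p.active_mask_col_stride;
}
```

Computing the offset once and building the tensor directly at `base + offset` keeps the addressing explicit, which is the property the refactor relies on.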

Updates pointer advancement logic to maintain proper synchronization between K, ZOH, and active mask tensors during block iteration.
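The synchronization requirement can be sketched as follows, assuming key blocks are visited from the last block toward the first and that each tensor steps back by `kBlockN` positions along its own key dimension. The struct names, `zoh_col_stride`, and both helpers are hypothetical; this is an illustration of the invariant, not the kernel's code.

```cpp
#include <cassert>
#include <cstdint>

// Element offsets of the current key block in each tensor.
struct BlockCursors {
    int64_t k;
    int64_t zoh;
    int64_t am;
};

// Stride of one key position in each tensor (names partly assumed).
struct Strides {
    int64_t k_row_stride;
    int64_t zoh_col_stride;          // assumed name
    int64_t active_mask_col_stride;
};

// Offsets of block n_block computed directly from its index.
inline BlockCursors cursors_at(const Strides& s, int n_block, int kBlockN) {
    assert(n_block >= 0 && kBlockN > 0);
    const int64_t first_key = int64_t(n_block) * kBlockN;
    return {first_key * s.k_row_stride,
            first_key * s.zoh_col_stride,
            first_key * s.active_mask_col_stride};
}

// Move all three cursors back by one block, each by its own stride.
inline void step_back(BlockCursors& c, const Strides& s, int kBlockN) {
    c.k   -= int64_t(kBlockN) * s.k_row_stride;
    c.zoh -= int64_t(kBlockN) * s.zoh_col_stride;
    c.am  -= int64_t(kBlockN) * s.active_mask_col_stride;
}
```

After any number of `step_back` calls, the cursors should equal `cursors_at` for the block now being processed; if one tensor is advanced with another tensor's stride, or skipped, the three views drift apart.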

Adds debug output to help diagnose memory addressing issues during development.


Copilot

Pull Request Overview

This PR fixes tensor addressing for ZOH and active mask in the splitkv kernel, refactoring tensor creation to use pre-calculated offsets and updating pointer advancement logic.

Introduces new offset calculations (col_offset_zoh and col_offset_am) to correctly compute tensor pointers.
Replaces local_tile calls with direct make_tensor constructs and updates pointer arithmetic for ZOH and active mask tensors.
Adds debug print statements for monitoring tensor pointer values during execution.

Copilot · 2025-07-01T13:43:45Z

csrc/src/flash_fwd_kernel.h

-        make_coord(m_block, _)
-    );  // (kBlockM, kBlockN, nblocksN)
+        make_stride(params.active_mask_row_stride, params.active_mask_col_stride)
+    );


Consider wrapping the debug print statements with a compile-time flag (e.g., #ifdef DEBUG) to avoid unintentional output in production builds.

Suggested change

-    );
+    );
+#ifdef DEBUG

Copilot · 2025-07-01T13:43:45Z

csrc/src/flash_fwd_kernel.h

                                       binfo.actual_seqlen_k - n_block * kBlockN);
+    if (tidx == 0 && bidh == 0 && bidb == 0) {
+        printf("Before copy_ZOH: n_block=%d, seqlen_q_offset=%d, seqlen_k_offset=%d\n",
+            n_block, 
+            binfo.actual_seqlen_q - m_block * kBlockM,
+            binfo.actual_seqlen_k - n_block * kBlockN);
+    }


Similarly, consider guarding these debug prints with a compile-time condition to prevent extra overhead in production runs.

Suggested change

-                                       binfo.actual_seqlen_k - n_block * kBlockN);
-    if (tidx == 0 && bidh == 0 && bidb == 0) {
-        printf("Before copy_ZOH: n_block=%d, seqlen_q_offset=%d, seqlen_k_offset=%d\n",
-            n_block,
-            binfo.actual_seqlen_q - m_block * kBlockM,
-            binfo.actual_seqlen_k - n_block * kBlockN);
-    }
+                                       binfo.actual_seqlen_k - n_block * kBlockN);
+#ifdef DEBUG
+    if (tidx == 0 && bidh == 0 && bidb == 0) {
+        printf("Before copy_ZOH: n_block=%d, seqlen_q_offset=%d, seqlen_k_offset=%d\n",
+            n_block,
+            binfo.actual_seqlen_q - m_block * kBlockM,
+            binfo.actual_seqlen_k - n_block * kBlockN);
+    }
+#endif
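One way to avoid repeating the `#ifdef` at every call site is to wrap the guard in a macro once. The sketch below assumes the `DEBUG` flag named in the review comments; the macro name `KERNEL_DEBUG_PRINTF` and the helper function are hypothetical and not part of this PR.

```cpp
#include <stdio.h>

// With DEBUG defined the macro forwards to printf; otherwise it expands to a
// no-op, so its arguments are not evaluated in production builds.
#ifdef DEBUG
#define KERNEL_DEBUG_PRINTF(...) printf(__VA_ARGS__)
#else
#define KERNEL_DEBUG_PRINTF(...) ((void)0)
#endif

// Illustrative helper: number of keys remaining from block n_block onward,
// the same quantity the debug print above reports as seqlen_k_offset.
inline int remaining_keys(int actual_seqlen_k, int n_block, int kBlockN) {
    const int remaining = actual_seqlen_k - n_block * kBlockN;
    KERNEL_DEBUG_PRINTF("n_block=%d, seqlen_k_offset=%d\n", n_block, remaining);
    return remaining;
}
```

Because the arguments disappear entirely when the flag is off, expressions passed to the macro should not have side effects the surrounding code depends on.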

LoserCheems requested review from Evanwu1125, SNHuan, Copilot and wubingheng111 and removed request for Copilot July 1, 2025 13:43

LoserCheems assigned LoserCheems, Copilot, Evanwu1125, SNHuan and wubingheng111 Jul 1, 2025

LoserCheems added the bug Something isn't working label Jul 1, 2025

Copilot AI reviewed Jul 1, 2025

LoserCheems merged commit d4ab537 into main Jul 1, 2025

LoserCheems deleted the Fix-bug branch October 27, 2025 08:56
