@LoserCheems
Collaborator

Updates tensor creation for ZOH and ActiveMask to include column stride parameters in offset calculations and stride configurations.

Enables more flexible memory layout patterns by allowing non-unit column strides in addition to existing batch and row strides.

Contributor

Copilot AI left a comment

Pull Request Overview

This PR enhances tensor memory layout flexibility by adding support for non-unit column strides, updating both offset calculations and stride configurations for ZOH and ActiveMask tensor creation.

  • Updated zoh tensor creation to include a new column stride parameter in the offset and stride settings.
  • Updated active mask tensor creation similarly to incorporate column stride support.
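To illustrate what a non-unit column stride buys, here is a minimal sketch of strided addressing. The `Strides` struct and the `element_offset`, `contiguous`, and `transposed` helpers are hypothetical names introduced for this example. They are not part of the PR, and they do not reproduce what `binfo.zoh_offset` or `binfo.active_mask_offset` compute internally.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stride bundle. The field names echo params.zoh_batch_stride,
// params.zoh_row_stride and params.zoh_col_stride, but this type is an
// assumption made for illustration only.
struct Strides {
    std::int64_t batch;
    std::int64_t row;
    std::int64_t col;
};

// Linear element offset of (b, r, c) in a strided 3-D view.
inline std::int64_t element_offset(const Strides& s, std::int64_t b,
                                   std::int64_t r, std::int64_t c) {
    assert(b >= 0 && r >= 0 && c >= 0);
    return b * s.batch + r * s.row + c * s.col;
}

// Strides of a densely packed row-major [batch, rows, cols] buffer.
// Here the column stride is the implicit 1 that the old code assumed.
inline Strides contiguous(std::int64_t rows, std::int64_t cols) {
    return Strides{rows * cols, cols, 1};
}

// A transposed view of the same buffer: swap the row and column strides.
// This layout is only expressible once the column stride is a parameter.
inline Strides transposed(const Strides& s) {
    return Strides{s.batch, s.col, s.row};
}
```

With these helpers, element `(r, c)` of the transposed view resolves to the same memory offset as element `(c, r)` of the contiguous view, without copying any data.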
Comments suppressed due to low confidence (2)

csrc/src/flash_fwd_kernel.h:183

  • Verify that the parameter ordering in the zoh_offset function correctly accommodates the new column stride parameter to maintain consistency with the stride configuration.
        make_gmem_ptr(reinterpret_cast<Element*>(params.zoh_ptr) + binfo.zoh_offset(params.zoh_batch_stride, params.zoh_row_stride, params.zoh_col_stride, bidb)),

csrc/src/flash_fwd_kernel.h:193

  • Ensure that the active_mask_offset function’s parameter ordering aligns with the new column stride support to avoid potential miscalculations in the tensor’s memory layout.
        make_gmem_ptr(reinterpret_cast<Element*>(params.active_mask_ptr) + binfo.active_mask_offset(params.active_mask_batch_stride, params.active_mask_row_stride, params.active_mask_col_stride, bidb)),
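One way to make the parameter-ordering concern above checkable at compile time is to give each stride its own wrapper type, so that swapped arguments fail to compile instead of silently producing a wrong offset. The sketch below assumes hypothetical `BatchStride`, `RowStride`, and `ColStride` wrappers and a hypothetical `tile_offset` function. None of these exist in the PR, and the arithmetic is a generic strided offset rather than the actual body of `zoh_offset` or `active_mask_offset`.

```cpp
#include <cstdint>

// Hypothetical single-field wrappers: three distinct types instead of
// three interchangeable integers.
struct BatchStride { std::int64_t value; };
struct RowStride   { std::int64_t value; };
struct ColStride   { std::int64_t value; };

// Hypothetical stand-in for an offset helper. Passing a RowStride where a
// ColStride is expected is a type error, so the call site cannot reorder
// the strides by accident.
inline std::int64_t tile_offset(BatchStride bs, RowStride rs, ColStride cs,
                                std::int64_t bidb, std::int64_t row,
                                std::int64_t col) {
    return bidb * bs.value + row * rs.value + col * cs.value;
}
```

A call such as `tile_offset(BatchStride{32}, RowStride{8}, ColStride{1}, bidb, row, col)` then documents the ordering at the call site, at the cost of slightly more verbose code.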

@LoserCheems LoserCheems merged commit 39bc17f into main Jun 30, 2025

Labels

bug Something isn't working

5 participants