@LoserCheems
Collaborator

Correct function names for clarity, fix typos, and unify memory copy alignment for mask and bias operations to enhance code readability and type safety.

Renames copy_Mask to copy_MN throughout the kernel implementation, which keeps function naming consistent and better indicates the function's role in copying matrix dimensions. Also fixes the spelling error "Golobal" to "Global" in a comment.
Unifies memory copy alignment to 128 bits for both mask and bias operations in the forward and backward kernel traits. This removes the inconsistent 64-bit alignment of the bias copy atom in the backward kernel, so all mask and bias copy operations now share the same 128-bit alignment.
Replaces incorrect SmemCopyAtomO with proper SmemCopyAtomMask and SmemCopyAtomBias types in tiled copy operations.

Ensures type safety and correct memory access patterns for mask and bias tensors in attention computation kernels.
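
To illustrate why using dedicated copy atom types matters, here is a minimal, self-contained sketch. It does not use the real CuTe/CUTLASS types from this PR: `CopyAtomSketch`, the tag structs, `KernelTraitsSketch`, and `is_mask_atom` are all hypothetical stand-ins, and only the alias names `SmemCopyAtomO`, `SmemCopyAtomMask`, and `SmemCopyAtomBias` and the value 128 come from the description above. In the actual kernel traits the aliases may resolve to the same underlying type, so treat this as one possible way to make the distinction checkable at compile time, not as the implementation.

```cpp
#include <cstddef>
#include <type_traits>

// Hypothetical stand-in for a copy atom: an assumed alignment plus a tag
// that records which tensor the atom is meant for.
template <std::size_t AssumedAlignment, typename Tag>
struct CopyAtomSketch {
    static constexpr std::size_t kAssumedAlignment = AssumedAlignment;
    using tag = Tag;
};

// Hypothetical tags, one per tensor kind.
struct OutputTag {};
struct MaskTag {};
struct BiasTag {};

// Hypothetical traits struct: a single alignment constant shared by the
// mask and bias atoms, so the two cannot drift apart again.
struct KernelTraitsSketch {
    static constexpr std::size_t kAssumedAlignment = 128;  // value from the PR text
    using SmemCopyAtomO    = CopyAtomSketch<kAssumedAlignment, OutputTag>;
    using SmemCopyAtomMask = CopyAtomSketch<kAssumedAlignment, MaskTag>;
    using SmemCopyAtomBias = CopyAtomSketch<kAssumedAlignment, BiasTag>;
};

// Compile-time check that a given atom is the one intended for the mask.
template <typename Atom>
constexpr bool is_mask_atom() {
    return std::is_same<typename Atom::tag, MaskTag>::value;
}
```

With tags like these, passing `SmemCopyAtomO` where a mask atom is expected can be rejected by a `static_assert(is_mask_atom<Atom>(), ...)` inside the copy routine instead of compiling silently.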
@LoserCheems requested review from Evanwu1125, SNHuan, Copilot and wubingheng111, and removed the request for Copilot, on July 14, 2025 14:22
Contributor

Copilot AI left a comment

Pull Request Overview

This PR improves code clarity and consistency in the attention kernel implementation by renaming a function for better semantic meaning, correcting a spelling error, and standardizing memory copy alignment parameters for mask and bias operations.

  • Renamed copy_Mask function to copy_MN to better reflect its purpose of copying matrix dimensions
  • Fixed spelling error from "Golobal" to "Global" in a comment
  • Unified memory copy alignment from mixed 64/128-bit to consistent 128-bit alignment for both mask and bias operations
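
As a rough illustration of what the alignment width means in practice, the sketch below computes how many elements fit in one 64-bit versus one 128-bit access. The helper `elems_per_access` is hypothetical and not part of this PR, and `std::uint16_t` is only an assumed stand-in for a 16-bit element type such as half or bfloat16.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper (not from the PR): number of elements of type T
// that fit in a single access of `Bits` bits.
template <typename T, std::size_t Bits>
constexpr std::size_t elems_per_access() {
    static_assert(Bits % 8 == 0, "access width must be a whole number of bytes");
    static_assert((Bits / 8) % sizeof(T) == 0,
                  "access width must be a multiple of the element size");
    return (Bits / 8) / sizeof(T);
}

// std::uint16_t is an assumed stand-in for a 16-bit element type.
constexpr std::size_t kElemsPer128 = elems_per_access<std::uint16_t, 128>();  // 16 bytes / 2 bytes
constexpr std::size_t kElemsPer64  = elems_per_access<std::uint16_t, 64>();   //  8 bytes / 2 bytes
```

Under these assumptions a 128-bit access moves twice as many 16-bit elements as a 64-bit one, which is why mixing the two widths for mask and bias leads to different copy granularities.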

Reviewed Changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| csrc/src/utils.h | Renamed function from copy_Mask to copy_MN for better semantic clarity |
| csrc/src/kernel_traits.h | Added consistent 128-bit aligned copy atoms for mask and bias operations, removed duplicate definition |
| csrc/src/flash_fwd_kernel.h | Updated function calls to use new name, fixed spelling error, and applied new copy atom types |

);

- // Golobal to Shared Memory operation
+ // Global to Shared Memory operation
Copilot AI Jul 14, 2025

Fixed spelling error from 'Golobal' to 'Global' in the comment.
