@LoserCheems
Collaborator

Introduce bias gradient computation in the backward kernel, including dedicated offset calculations and tensor configurations. Enhance tensor declarations and memory layouts to support improved backward pass functionality.

Introduces dedicated offset calculation and tensor configuration for bias gradient computation.

Adds row_offset_dbias calculation using dbias-specific stride parameters and creates gdBias tensor with proper memory layout.
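
As a rough illustration of the offset arithmetic described above, the sketch below computes an element offset for one tile of the bias gradient from batch, head, and row strides. It is a host-side approximation under stated assumptions, not the kernel's code: the `DBiasStrides` struct, its field names, the argument order, and the tile indexing are all hypothetical, and the real kernel may walk blocks in a different order or fold in variable-length sequence offsets.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stride bundle. Field names are illustrative only and are
// not taken from the kernel's parameter struct.
struct DBiasStrides {
    std::int64_t batch_stride;  // elements between consecutive batches
    std::int64_t head_stride;   // elements between consecutive heads
    std::int64_t row_stride;    // elements between consecutive query rows
};

// Element offset of the top-left corner of the (m_block, n_block) tile of a
// dBias tensor, assuming a [batch, head, seqlen_q, seqlen_k] layout whose
// last dimension is contiguous (column stride of 1).
constexpr std::int64_t row_offset_dbias(const DBiasStrides& s,
                                        int batch, int head,
                                        int m_block, int n_block,
                                        int kBlockM, int kBlockN) {
    return batch * s.batch_stride
         + head * s.head_stride
         + static_cast<std::int64_t>(m_block) * kBlockM * s.row_stride
         + static_cast<std::int64_t>(n_block) * kBlockN;
}
```

Using dbias-specific strides, rather than reusing the bias strides, lets the gradient buffer have a different memory layout from the bias it corresponds to.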

Reorganizes tensor declarations with improved formatting and adds shared memory tensors for mask, bias, and bias gradient operations to support the enhanced backward pass functionality.
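
To make the idea of extra shared memory buffers for mask, bias, and bias gradient more concrete, here is a minimal sketch of carving three equally sized tiles out of one shared memory arena. Everything in it is an assumption for illustration: the `SmemPlan` struct, the `plan_bias_buffers` helper, the buffer order, and the use of a single element size for all three buffers. The actual kernel describes its shared memory through layout types in `Kernel_traits`, which this sketch does not reproduce.

```cpp
#include <cassert>
#include <cstddef>

// Round `offset` up to the next multiple of `alignment` (alignment > 0).
constexpr std::size_t align_up(std::size_t offset, std::size_t alignment) {
    return (offset + alignment - 1) / alignment * alignment;
}

// Hypothetical byte offsets of three tiles placed back to back.
struct SmemPlan {
    std::size_t mask_offset;
    std::size_t bias_offset;
    std::size_t dbias_offset;
    std::size_t total_bytes;  // arena size needed up to the end of dBias
};

// Place mask, bias, and dBias tiles after `base` bytes of existing buffers,
// aligning the start of each tile to `alignment` bytes.
inline SmemPlan plan_bias_buffers(std::size_t base, std::size_t tile_elems,
                                  std::size_t elem_bytes, std::size_t alignment) {
    const std::size_t tile_bytes = tile_elems * elem_bytes;
    SmemPlan plan{};
    plan.mask_offset = align_up(base, alignment);
    plan.bias_offset = align_up(plan.mask_offset + tile_bytes, alignment);
    plan.dbias_offset = align_up(plan.bias_offset + tile_bytes, alignment);
    plan.total_bytes = plan.dbias_offset + tile_bytes;
    return plan;
}
```

A helper like this is mainly useful for checking that the enlarged footprint still fits within the per-block shared memory limit of the target GPU.
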
Contributor

Copilot AI left a comment

Pull Request Overview

Introduce bias gradient computation in the backward kernel by adding dedicated offset calculations and tensor configurations, while also refactoring tensor construction formatting.

  • Compute and apply row_offset_dbias for bias gradients
  • Declare gdBias tensor and extend shared memory layout with bias-related buffers
  • Reformat multi-line make_tensor calls for improved readability

typename Kernel_traits::SmemLayoutQdO{});
Tensor sQt = make_tensor(sQ.data(), typename Kernel_traits::SmemLayoutQdOtransposed{});
Tensor sQtNoSwizzle = make_tensor(sQ.data(), typename Kernel_traits::SmemLayoutQdOtransposedNoSwizzle{});
// Golobal memory tensor configuration

Copilot AI Jul 11, 2025

Correct the spelling of 'Golobal' to 'Global' in this comment.

Suggested change
- // Golobal memory tensor configuration
+ // Global memory tensor configuration

Corrects spelling error in comment describing global memory tensor configuration across multiple kernel files.
@LoserCheems LoserCheems merged commit f80f339 into main Jul 11, 2025

5 participants