
Conversation

@LoserCheems
Collaborator

Adds type ignore comment to suppress import warnings for CUDA extension

Renames parameter from attn_mask to zero_hold_states for better semantic clarity

Updates test case to use more realistic sequence length configuration

Copilot AI review requested due to automatic review settings · June 30, 2025 13:56
@LoserCheems added the bug (Something isn't working) label · Jun 30, 2025
Contributor

Copilot AI left a comment


Pull Request Overview

This PR improves code clarity by renaming a mask parameter, suppresses spurious import warnings for the CUDA extension, and updates a benchmark test to use more realistic sequence lengths.

  • Added a # type: ignore comment to the flash_dma_cuda import
  • Renamed attn_mask to zero_hold_states in dynamic_mask_attention_cuda signature
  • Adjusted the sequence-length tuple in test_forward_equivalence for a more realistic configuration
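
For orientation, here is a minimal sketch of what the renamed signature plausibly looks like after this change. Everything other than the zero_hold_states name (formerly attn_mask) is an assumption for illustration, not the repository's actual API:

```python
# Hypothetical sketch -- all parameters other than `zero_hold_states`
# (formerly `attn_mask`) are assumed for illustration.
def dynamic_mask_attention_cuda(
    query_states,      # assumed shape: [batch, num_heads, seqlen_q, head_dim]
    key_states,        # assumed shape: [batch, num_kv_heads, seqlen_k, head_dim]
    value_states,      # assumed shape: [batch, num_kv_heads, seqlen_k, head_dim]
    zero_hold_states,  # renamed from `attn_mask` for semantic clarity
    is_causal=True,
):
    ...
```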
Comments suppressed due to low confidence (2)

benchmarks/benchmark_forward_equivalence.py:239

  • Update the inline comment to match the new parameter name, e.g. replace zoh: with zero_hold_states: (and consider expanding the abbreviation for clarity).
        zero_hold_states,         # zoh: [batch, num_kv_heads, seqlen_q, seqlen_k] - processed attention mask
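
Applied, the suggestion would turn that call-site line into something like the following (the expanded abbreviation is the reviewer's own proposal):

```python
zero_hold_states,  # zero_hold_states: [batch, num_kv_heads, seqlen_q, seqlen_k] - processed attention mask
```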

benchmarks/benchmark_forward_equivalence.py:380

  • [nitpick] Consider adding boundary or edge-case tests (e.g., minimal or maximal valid sequence lengths) to ensure the attention implementation handles extremes correctly.
        (1, 2, 1, 511, 512, 128, True),
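
One hedged way to act on this nitpick, assuming the tuple fields end in (seqlen_q, seqlen_k, head_dim, is_causal) as the existing entry suggests; these exact values are illustrative, not taken from the repository:

```python
# Hypothetical boundary configurations appended to the existing list;
# field order assumed from the entry above: (..., seqlen_q, seqlen_k, head_dim, is_causal).
(1, 2, 1, 1, 1, 128, True),        # minimal sequence lengths
(1, 2, 1, 4096, 4096, 128, True),  # large sequence lengths as an upper-bound probe
```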

LoserCheems and others added 2 commits June 30, 2025 21:58
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Suppresses mypy import error for the flash_dma_cuda module to prevent type checking failures when the CUDA extension is not available or not properly configured in the development environment.
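
Concretely, per the review overview above, the suppression is a per-import directive; the exact import form used in the repository is assumed here:

```python
import flash_dma_cuda  # type: ignore
```

A scoped form such as # type: ignore[import] (or the newer import-not-found error code in recent mypy releases) would narrow the suppression to the import error alone while keeping other checks active on that line.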
@LoserCheems merged commit b399bb5 into main · Jun 30, 2025