@LoserCheems
Collaborator

Renames q/k/v parameters to query/key/value in flash attention functions for better readability and standardization.

Updates parameter documentation to reflect the new naming convention and fixes GQA condition description to use <= instead of <.

Removes outdated footer reference to integration docs.

Copilot AI review requested due to automatic review settings August 9, 2025 03:36
Contributor

Copilot AI left a comment

Pull Request Overview

This PR improves API consistency by renaming abbreviated parameter names in flash attention functions to their full descriptive names (q/k/v → query/key/value). This enhances code readability and follows standard naming conventions.

  • Renames abbreviated parameter names to full descriptive names in function signatures
  • Updates parameter documentation to reflect the new naming convention
  • Fixes GQA condition description and removes outdated footer reference
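To illustrate what the rename means for callers, here is a minimal sketch. `attention_stub` and `accepts_old_keywords` are hypothetical stand-ins invented for this example, not functions from the library; the only thing taken from the PR is the parameter naming (q/k/v becoming query/key/value). Positional calls are unaffected by such a rename, while keyword calls that still use the old names would fail.

```python
def attention_stub(query, key, value):
    # Hypothetical stand-in, not the library's function: it only records
    # which arguments arrived under the new parameter names.
    return {"query": query, "key": key, "value": value}


def accepts_old_keywords(fn):
    # Returns True only if fn can still be called with the old q/k/v keywords.
    try:
        fn(q=1, k=2, v=3)
    except TypeError:
        return False
    return True


# Positional and new-style keyword calls both work after the rename.
positional = attention_stub(1, 2, 3)
keyword = attention_stub(query=1, key=2, value=3)

# Old-style keyword calls do not.
old_style_ok = accepts_old_keywords(attention_stub)
```

If a codebase passes these tensors by keyword, the call sites would need the same rename; whether the library keeps any compatibility alias is not stated in this conversation.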

Comment on lines +86 to +87
- key: (B, K, H_kv, D). Same dtype/device as query; GQA when H_kv <= H
- value: (B, K, H_kv, D). Same dtype/device as query; GQA when H_kv <= H
Copilot AI Aug 9, 2025

The GQA condition should state what 'H' refers to. Either spell it out as 'H_kv <= num_heads' or say explicitly that H is the number of heads in the query tensor, so it is not mistaken for the per-head dimension D.

Suggested change
- key: (B, K, H_kv, D). Same dtype/device as query; GQA when H_kv <= H
- value: (B, K, H_kv, D). Same dtype/device as query; GQA when H_kv <= H
- key: (B, K, H_kv, D). Same dtype/device as query; GQA when H_kv <= H (number of query heads in the query tensor)
- value: (B, K, H_kv, D). Same dtype/device as query; GQA when H_kv <= H (number of query heads in the query tensor)
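The shape relationship under discussion can be sketched with a naive NumPy reference. This is an illustration only, not the library's kernel: `gqa_reference` is a hypothetical name, the query layout (B, Q, H, D) is inferred from the key/value shapes quoted above, and the requirement that H be a multiple of H_kv is the usual grouped-query convention, assumed here rather than taken from the PR.

```python
import numpy as np


def gqa_reference(query, key, value):
    """Naive grouped-query attention, for shape illustration only.

    query:      (B, Q, H, D)     H    = number of query heads
    key, value: (B, K, H_kv, D)  H_kv = number of key/value heads
    """
    B, Q, H, D = query.shape
    H_kv = key.shape[2]
    # Assumption: H_kv <= H and H is a multiple of H_kv.
    if H_kv > H or H % H_kv != 0:
        raise ValueError("expected H_kv <= H with H divisible by H_kv")
    group = H // H_kv
    # Each key/value head is shared by `group` consecutive query heads.
    key = np.repeat(key, group, axis=2)      # (B, K, H, D)
    value = np.repeat(value, group, axis=2)  # (B, K, H, D)
    scores = np.einsum("bqhd,bkhd->bhqk", query, key) / np.sqrt(D)
    scores = scores - scores.max(axis=-1, keepdims=True)  # stable softmax
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return np.einsum("bhqk,bkhd->bqhd", weights, value)   # (B, Q, H, D)


rng = np.random.default_rng(0)
q = rng.standard_normal((2, 5, 8, 16))   # H = 8 query heads
k = rng.standard_normal((2, 7, 2, 16))   # H_kv = 2 key/value heads
v = np.ones((2, 7, 2, 16))
out = gqa_reference(q, k, v)
```

With H_kv equal to H the group size is 1 and the sketch reduces to ordinary multi-head attention, which is the boundary case the `<=` wording admits.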

Comment on lines +86 to +87
- key: (B, K, H_kv, D). Same dtype/device as query; GQA when H_kv <= H
- value: (B, K, H_kv, D). Same dtype/device as query; GQA when H_kv <= H
Copilot AI Aug 9, 2025

The GQA condition should state what 'H' refers to. Either spell it out as 'H_kv <= num_heads' or say explicitly that H is the number of heads in the query tensor, so it is not mistaken for the per-head dimension D.

Suggested change
- key: (B, K, H_kv, D). Same dtype/device as query; GQA when H_kv <= H
- value: (B, K, H_kv, D). Same dtype/device as query; GQA when H_kv <= H
- key: (B, K, H_kv, D). Same dtype/device as query; GQA when H_kv <= H (number of query heads)
- value: (B, K, H_kv, D). Same dtype/device as query; GQA when H_kv <= H (number of query heads)

4 participants