Conversation

@LoserCheems
Collaborator

Uncomments code that forces num_splits to 1 and resets accumulator tensors to avoid extra memory overhead from Split-KV operations.

Updates error messages to reference "FlashDynamicMaskAttention" instead of "FlashAttention" for consistency with the library's actual implementation.

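For context, a minimal sketch of what forcing the non-Split-KV path typically looks like. The names (`num_splits`, `softmax_lse_accum`, `out_accum`) follow flash-attention conventions and are assumptions here, not this PR's exact diff:

```cpp
#include <ATen/ATen.h>

// Sketch only: force the single-split (non-Split-KV) code path and
// drop the per-split accumulators so their memory can be reclaimed.
// The parameter names below are assumed, following flash-attention
// conventions, and may differ from this repository's actual code.
void force_single_split(int &num_splits,
                        at::Tensor &softmax_lse_accum,
                        at::Tensor &out_accum) {
    num_splits = 1;              // one split => no Split-KV combine kernel runs
    softmax_lse_accum.reset();   // release per-split log-sum-exp accumulator
    out_accum.reset();           // release per-split partial-output accumulator
}
```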
Copilot AI review requested due to automatic review settings July 25, 2025 15:44
Contributor

Copilot AI left a comment

Pull Request Overview

This PR disables the Split-KV path by uncommenting code that forces the number of splits to 1 and resets the accumulator tensors, and it updates error messages for consistency with the FlashDynamicMaskAttention library implementation.

  • Uncomments code to force num_splits = 1 and reset accumulator tensors to avoid Split-KV memory overhead
  • Updates error messages to reference "FlashDynamicMaskAttention" instead of "FlashAttention"

```diff
     auto q_dtype = q.dtype();
-    TORCH_CHECK(q_dtype == torch::kFloat16 || q_dtype == torch::kBFloat16,
-                "FlashAttention only support fp16 and bf16 data type");
+    TORCH_CHECK(q_dtype == torch::kFloat16 || q_dtype == torch::kBFloat16, "FlashDynamicMaskAttention only support fp16 and bf16 data type");
```

Copilot AI Jul 25, 2025


Grammar error: "only support" should be "only supports" to maintain subject-verb agreement.

Suggested change

```diff
-TORCH_CHECK(q_dtype == torch::kFloat16 || q_dtype == torch::kBFloat16, "FlashDynamicMaskAttention only support fp16 and bf16 data type");
+TORCH_CHECK(q_dtype == torch::kFloat16 || q_dtype == torch::kBFloat16, "FlashDynamicMaskAttention only supports fp16 and bf16 data type");
```

@LoserCheems LoserCheems self-assigned this Jul 25, 2025
@LoserCheems LoserCheems merged commit 68e6034 into main Jul 25, 2025