Merged
8 changes: 8 additions & 0 deletions csrc/flash_api.cpp
@@ -255,6 +255,14 @@ std::tuple<at::Tensor, at::Tensor> set_params_splitkv(
TORCH_CHECK(params.num_splits <= 128, "num_splits > 128 not supported");
}

// Temporarily disable Split-KV, because some bugs are still being fixed.
Review comment (Copilot AI, Jul 1, 2025):

[nitpick] Add a TODO with the relevant issue or ticket reference and an expected removal timeline to ensure this temporary workaround is tracked and cleaned up once the bugs are resolved.

Suggested change:
- // Temporarily disable Split-KV, because some bugs are still being fixed.
+ // Temporarily disable Split-KV, because some bugs are still being fixed.
+ // TODO: Track resolution of Split-KV bugs in issue #12345. Expected removal: Q4 2025.

// See: https://github.com/SmallDoges/flash-dmattn/issues/47
// Regardless of how it is set externally, always set num_splits back to 1.
// This is to avoid the extra memory overhead of Split-KV.
Review comment (Copilot AI, Jul 1, 2025):

[nitpick] Consider emitting a runtime warning or log entry to inform users that Split-KV has been disabled, preventing silent changes in behavior.

Suggested change:
- // This is to avoid the extra memory overhead of Split-KV.
+ // This is to avoid the extra memory overhead of Split-KV.
+ TORCH_WARN("Split-KV has been temporarily disabled due to unresolved bugs. num_splits is set to 1.");

params.num_splits = 1;
softmax_lse_accum.reset();
out_accum.reset();

return std::make_tuple(softmax_lse_accum, out_accum);
}
