Skip to content

Conversation

@LoserCheems
Copy link
Collaborator

@LoserCheems LoserCheems commented Jul 1, 2025

#47
Forces num_splits to 1 and resets accumulator tensors to avoid memory overhead while bugs in the Split-KV implementation are being fixed.

This ensures stability by bypassing the problematic feature until issues are resolved.

Forces num_splits to 1 and resets accumulator tensors to avoid memory overhead while bugs in Split-KV implementation are being fixed.

This ensures stability by bypassing the problematic feature until issues are resolved.
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

Temporarily disables the Split-KV feature by forcing num_splits to 1 and clearing accumulators to avoid memory overhead while underlying bugs are fixed.

  • Overrides params.num_splits to 1 unconditionally
  • Resets softmax_lse_accum and out_accum before returning

Comment on lines 258 to 261
// Temporarily disable Split-KV, because some bugs are still being fixed.
// Regardless of how it is set externally, always set num_splits back to 1.
// This is to avoid the extra memory overhead of Split-KV.
params.num_splits = 1;
Copy link

Copilot AI Jul 1, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overriding params.num_splits after the accumulators are likely allocated does not prevent the extra memory overhead. Move this override before any allocation logic and skip creating large accumulators entirely when Split-KV is disabled.

Suggested change
// Temporarily disable Split-KV, because some bugs are still being fixed.
// Regardless of how it is set externally, always set num_splits back to 1.
// This is to avoid the extra memory overhead of Split-KV.
params.num_splits = 1;

Copilot uses AI. Check for mistakes.
TORCH_CHECK(params.num_splits <= 128, "num_splits > 128 not supported");
}

// Temporarily disable Split-KV, because some bugs are still being fixed.
Copy link

Copilot AI Jul 1, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[nitpick] Add a TODO with the relevant issue or ticket reference and an expected removal timeline to ensure this temporary workaround is tracked and cleaned up once bugs are resolved.

Suggested change
// Temporarily disable Split-KV, because some bugs are still being fixed.
// Temporarily disable Split-KV, because some bugs are still being fixed.
// TODO: Track resolution of Split-KV bugs in issue #12345. Expected removal: Q4 2025.

Copilot uses AI. Check for mistakes.

// Temporarily disable Split-KV, because some bugs are still being fixed.
// Regardless of how it is set externally, always set num_splits back to 1.
// This is to avoid the extra memory overhead of Split-KV.
Copy link

Copilot AI Jul 1, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[nitpick] Consider emitting a runtime warning or log entry to inform users that Split-KV has been disabled, preventing silent changes in behavior.

Suggested change
// This is to avoid the extra memory overhead of Split-KV.
// This is to avoid the extra memory overhead of Split-KV.
TORCH_WARN("Split-KV has been temporarily disabled due to unresolved bugs. num_splits is set to 1.");

Copilot uses AI. Check for mistakes.
Documents the GitHub issue tracking the Split-KV bugs that led to temporarily disabling the feature. This provides better context for future developers about why the functionality is disabled and where to find related discussion.
@LoserCheems LoserCheems merged commit 72172d6 into main Jul 1, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants