@danieldk
Member

@danieldk danieldk commented Nov 6, 2025

This PR:

  1. Refactors the code to improve performance.
  2. Adds support for PagedKV.
  3. Produces results in the transformers unit tests that are consistent with CUDA.

I have successfully built this PR locally using nix.

Original PR: #65

Collaborator

@MekkCyber MekkCyber left a comment

Niiice !

@danieldk
Member Author

danieldk commented Nov 6, 2025

The macOS error is unrelated. Let's fix that separately to avoid the expensive flash-attn2 build times.

@MekkCyber
Collaborator

Yes perfect!

@danieldk merged commit b760047 into main Nov 6, 2025
2 of 3 checks passed
@danieldk deleted the fa2-xpu-refactor-pagedkv branch November 6, 2025 13:33
4 participants