Remove V0 attention backends #25351
Code Review
This pull request removes the legacy V0 attention backends so that the codebase uses V1 backends exclusively. The changes are comprehensive: they delete numerous V0 backend files, update tests to align with V1 backends, and modify platform-specific code to raise errors when a removed backend is requested. This consolidation simplifies the attention backend infrastructure. The changes appear consistent and well-executed, and the added RuntimeError for V0 backend requests gives this breaking change a clear failure mode.
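A minimal sketch of the failure mode described in the review, assuming a selector function that receives the requested backend name. The function name, the backend names, and the message wording are hypothetical placeholders and are not taken from this PR's diff:

```python
from typing import Optional

# Hypothetical names for illustration only; they are NOT the identifiers
# used in the project this PR belongs to.
REMOVED_V0_BACKENDS = frozenset({"LEGACY_BACKEND_A", "LEGACY_BACKEND_B"})
SUPPORTED_V1_BACKENDS = frozenset({"BACKEND_X", "BACKEND_Y"})


def resolve_attention_backend(requested: Optional[str],
                              default: str = "BACKEND_X") -> str:
    """Return the backend to use, failing loudly for removed V0 backends."""
    if requested is None:
        # No explicit request: fall back to the platform default.
        return default
    if requested in REMOVED_V0_BACKENDS:
        # A removed backend gets a distinct, explicit error rather than a
        # silent fallback, so users notice the breaking change.
        raise RuntimeError(
            f"Attention backend {requested!r} was a V0 backend and has been "
            f"removed. Choose one of: {sorted(SUPPORTED_V1_BACKENDS)}.")
    if requested not in SUPPORTED_V1_BACKENDS:
        # A name that never existed is a different kind of mistake.
        raise ValueError(f"Unknown attention backend {requested!r}.")
    return requested
```

Raising instead of silently substituting a V1 backend keeps the behavior change visible: a configuration that still names a removed backend stops at startup with a message that says what to use instead.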
9e88a2e to bb26845
Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu> Signed-off-by: charlifu <charlifu@amd.com>
Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu> Signed-off-by: yewentao256 <zhyanwentao@126.com>
Summary
Testing
https://chatgpt.com/codex/tasks/task_b_68d02a781064832dacffd35e5f979636