-
-
Notifications
You must be signed in to change notification settings - Fork 10.7k
[unrevert] Add batch invariant kernel override for FlashInfer backend [2/n] #26373
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Signed-off-by: Bram Wasti <bwasti@meta.com>
297ecc2
to
fcfa198
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request reinstates support for batch invariant kernel overrides for the FlashInfer backend, which is a crucial feature for ensuring deterministic outputs regardless of batch composition. The changes primarily involve updating the FlashInfer dependency and plumbing through new parameters to enable a deterministic mode in the backend. Additionally, the batch invariance tests have been significantly improved to be more robust, configurable, and to cover the FlashInfer backend. My review focuses on the correctness and robustness of these changes. The implementation for enabling batch invariance in FlashInfer appears solid. I've identified a couple of minor issues in the updated test file where a hardcoded value should be replaced with a configurable variable to make the test more robust. Overall, the changes are well-implemented and the test enhancements are excellent.
Signed-off-by: Bram Wasti <bwasti@meta.com>
fcfa198
to
5045121
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please fix the pre-commit issue so that we can get this landed
This pull request has merge conflicts that must be resolved before it can be |
Signed-off-by: Bram Wasti <bwasti@fb.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks for the work!
… [2/n] (vllm-project#26373) Signed-off-by: Bram Wasti <bwasti@meta.com> Signed-off-by: Bram Wasti <bwasti@fb.com> Signed-off-by: 1994 <1994@users.noreply.github.com>
… [2/n] (vllm-project#26373) Signed-off-by: Bram Wasti <bwasti@meta.com> Signed-off-by: Bram Wasti <bwasti@fb.com> Signed-off-by: Dhruvil Bhatt <bhattdbh@amazon.com>
… [2/n] (vllm-project#26373) Signed-off-by: Bram Wasti <bwasti@meta.com> Signed-off-by: Bram Wasti <bwasti@fb.com> Signed-off-by: bbartels <benjamin@bartels.dev>
This change reinstates already approved + landed #25769 based on the latest bump to flashinfer:
#26326
It should not land before #26326