[CI/Build] Don't add FLASHINFER backend in test_cpu_offloading.py #29229
Signed-off-by: Randall Smith <ransmith@amd.com>
Code Review
The primary change correctly makes the FLASHINFER backend conditional on the CUDA platform, which resolves the described CI failure on ROCm. This is a good fix. However, a debug print statement has been introduced in one of the test files, which should be removed before this pull request is merged.
    params = SamplingParams(temperature=0, bad_words=[bad_words_1, bad_words_2])
    output = llm.generate(PROMPT, params)
    new_text = output[0].outputs[0].text
    print(f"new_text={new_text}")
…m-project#29229) Signed-off-by: Randall Smith <ransmith@amd.com> Co-authored-by: Randall Smith <ransmith@amd.com>
…m-project#29229) Signed-off-by: Randall Smith <ransmith@amd.com> Co-authored-by: Randall Smith <ransmith@amd.com> Signed-off-by: Runkai Tao <rt572@physics.rutgers.edu>
This fixes a test failure where tests/v1/kv_offload/test_cpu_offloading.py adds the FLASHINFER backend to the test, but the ROCm platform does not support the flashinfer library. Adding the backend only on CUDA lets the test succeed in AMD CI. The test runs to completion and the result is: 1 passed, 3 warnings
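
The platform-conditional selection described above could look roughly like the sketch below. It is not the actual diff from this PR: the helper name select_attn_backends and the baseline backend "FLASH_ATTN" are assumptions made for illustration, and the platform check is passed in as a plain boolean so the snippet runs without vLLM installed. In the real test, the flag would presumably come from vLLM's platform detection (for example current_platform.is_cuda()).

```python
def select_attn_backends(is_cuda: bool) -> list[str]:
    """Return the attention backends a test should be parametrized over.

    Hypothetical helper: the name and the baseline backend are
    illustrative assumptions, not taken from the vLLM test file.
    """
    # Assumed baseline backend that is available on every platform under test.
    backends = ["FLASH_ATTN"]

    # FLASHINFER depends on the flashinfer library, which ROCm does not
    # support, so only add it when running on CUDA.
    if is_cuda:
        backends.append("FLASHINFER")

    return backends


# On CUDA both backends are exercised; on ROCm FLASHINFER is skipped.
cuda_backends = select_attn_backends(is_cuda=True)
rocm_backends = select_attn_backends(is_cuda=False)
```

Gating the list at collection time, rather than skipping inside the test body, keeps the unsupported backend out of the parametrization entirely, so the ROCm run reports only the cases that can actually execute.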