tests: enable kv_unified for test-backend-sampler #20645
Merged
taronaeo merged 1 commit into ggml-org:master on Mar 18, 2026
danbev approved these changes on Mar 16, 2026
Member
This was just to use something other than 0, which I used in most other tests; I did not take this into consideration.
Member
Hm, it should work even with |
Member
We are actually not enabling the unified KV cache. This patch should fix it:

diff --git a/tests/test-backend-sampler.cpp b/tests/test-backend-sampler.cpp
index d4cd62c71..58361ae80 100644
--- a/tests/test-backend-sampler.cpp
+++ b/tests/test-backend-sampler.cpp
@@ -89,6 +89,7 @@ struct test_context {
         cparams.n_batch = 512;
         cparams.samplers = configs.data();
         cparams.n_samplers = configs.size();
+        cparams.kv_unified = true;
         // If n_seq_max is not specified, calculate it from configs
         if (n_seq_max < 0) {
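
The context comment in the patch says `n_seq_max` is calculated from the sampler configs when it is not specified. The sketch below shows one way such a calculation could look, and why a single config with `seq_id = 189` would size the context for 190 sequences rather than 1. It is an illustration only: `seq_config` and `n_seq_max_from_configs` are hypothetical names, and the "largest id plus one" rule is an assumption, not the code from `tests/test-backend-sampler.cpp`.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a per-sequence sampler config.
// Only the sequence id matters for this sketch.
struct seq_config {
    int32_t seq_id;
};

// Assumption: the context has to cover every sequence id in use,
// so the sequence count is the largest id plus one.
static int32_t n_seq_max_from_configs(const std::vector<seq_config> & configs) {
    int32_t max_id = -1; // an empty config list yields 0 sequences
    for (const auto & c : configs) {
        max_id = std::max(max_id, c.seq_id);
    }
    return max_id + 1;
}
```

Under that assumption, a test that only ever uses `seq_id = 0` asks for one sequence, while the dist sampling test with `seq_id = 189` asks for 190, which would fit the observation that the OOM only appeared in that test.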
Member
Author
Sorry for the delay. Thank you, that patch fixed it! I'll update this PR to reflect that change :)
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
Force-pushed from 9d385f3 to 86a7113
Member
Author
Will merge in a few hours if there are no further comments :)
Ethan-a2 pushed a commit to Ethan-a2/llama.cpp that referenced this pull request on Mar 20, 2026
…org#20645) Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
While running tests to ensure that my local server is ready to be onboarded as a GGML CI runner, I ran into an odd CUDA OOM error in the dist sampling test.

Comparing the dist sampling test against the other available tests, I noticed that seq_id = 0 is used everywhere except test_backend_dist_sampling. Changing seq_id from 189 to 0 seems to have solved the OOM error.

cc: @danbev; please let me know if seq_id = 189 was intentional or if there is another approach to solving this :)

Edit: PR has been updated to change only kv_unified = true, as suggested in #20645 (comment).