@ggerganov ggerganov commented Nov 30, 2025

cont #17276

Graph reallocations by the backend scheduler are expected in various cases where the graph topology differs from the one that was used initially, right after the scheduler was constructed. For example, the scheduler of llama_context will likely reallocate when:

The less expected cases are the ones similar to the case in #17143 where it's more difficult to predict that a reallocation would occur since the graph topology remains the same.

The GGML_SCHED_NO_REALLOC macro is now targeted towards detecting the "unexpected" reallocations.

Also, the reallocation debug behavior can now be overridden via a new environment variable, GGML_SCHED_DEBUG_REALLOC:

# abort only on unexpected reallocations (i.e. same as building with -DGGML_SCHED_NO_REALLOC=ON)
GGML_SCHED_DEBUG_REALLOC=1 ...

# abort on all reallocations
GGML_SCHED_DEBUG_REALLOC=2 ...
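
For illustration, here is a minimal sketch of how the two levels above could map to an abort decision. It is not the code from this PR: the enum, both function names, and the `topology_changed` flag are hypothetical, and the sketch assumes that an "unexpected" reallocation is one where the graph topology stayed the same, as described above. How ggml actually classifies a reallocation is not shown here.

```c
#include <stdbool.h>
#include <stdlib.h>

// Hypothetical names for the levels described above (not ggml identifiers).
enum realloc_debug_level {
    REALLOC_DEBUG_OFF        = 0, // never abort
    REALLOC_DEBUG_UNEXPECTED = 1, // abort only on unexpected reallocations
    REALLOC_DEBUG_ALL        = 2, // abort on all reallocations
};

// Parse the value of the environment variable. `value` may be NULL when the
// variable is unset, in which case the compile-time default is kept.
static int parse_realloc_debug_level(const char * value, int compiled_default) {
    if (value == NULL || value[0] == '\0') {
        return compiled_default;
    }
    int level = atoi(value);
    // Clamp out-of-range input to the known levels.
    if (level < REALLOC_DEBUG_OFF) {
        level = REALLOC_DEBUG_OFF;
    }
    if (level > REALLOC_DEBUG_ALL) {
        level = REALLOC_DEBUG_ALL;
    }
    return level;
}

// Decide whether a reallocation should abort. Assumption: a reallocation is
// "unexpected" when the graph topology did not change.
static bool should_abort_on_realloc(int level, bool topology_changed) {
    if (level >= REALLOC_DEBUG_ALL) {
        return true;
    }
    if (level == REALLOC_DEBUG_UNEXPECTED) {
        return !topology_changed;
    }
    return false;
}
```

A caller would typically read the variable once, e.g. `parse_realloc_debug_level(getenv("GGML_SCHED_DEBUG_REALLOC"), 0)`, and pass a default of 1 instead of 0 when the build was configured with -DGGML_SCHED_NO_REALLOC=ON.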

@ggerganov ggerganov force-pushed the gg/sched-debug-realloc branch from 37be189 to 43f7063 Compare November 30, 2025 11:23
@github-actions github-actions bot added the ggml changes relating to the ggml tensor library for machine learning label Nov 30, 2025
@ggerganov ggerganov merged commit 90c72a6 into master Dec 1, 2025
72 of 74 checks passed
@ggerganov ggerganov deleted the gg/sched-debug-realloc branch December 1, 2025 10:49