@JohannesGaessler
Collaborator

Fixes #16762. As Aman correctly pointed out, the problem is that the pointer to the buffer is overwritten on each iteration of the loop over the split files. As a consequence, the backend buffers are currently being leaked.

More generally, when mmap is combined with split files, more than one backend buffer can be associated with a single ggml context, which the type `vector<pair<ggml_context_ptr, ggml_buffer_ptr>>` could not represent correctly. I changed the type to `vector<pair<ggml_context_ptr, vector<ggml_buffer_ptr>>>` instead.
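To illustrate the difference between the two container shapes, here is a minimal, self-contained sketch. It does not use the real ggml API: `fake_context`, `fake_buffer`, `fake_buffer_alloc`, `live_buffers`, `load_old_shape`, and `load_new_shape` are all hypothetical stand-ins, and the fixed-size pool exists only so that "leaked" buffers can be counted without actually leaking heap memory. The loop in `load_old_shape` is an assumption about how a pointer can get overwritten, not a copy of the code in this PR.

```cpp
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-ins; none of these names come from ggml.
struct fake_context {};
struct fake_buffer { bool in_use = false; };

static fake_buffer g_pool[16]; // fixed pool so live buffers can be counted

// Hand out the first free pool slot (nullptr if the pool is exhausted).
fake_buffer * fake_buffer_alloc() {
    for (auto & b : g_pool) {
        if (!b.in_use) { b.in_use = true; return &b; }
    }
    return nullptr;
}

// Deleter that returns a slot to the pool instead of calling delete.
struct fake_buffer_deleter {
    void operator()(fake_buffer * b) const { b->in_use = false; }
};

int live_buffers() {
    int n = 0;
    for (const auto & b : g_pool) { n += b.in_use ? 1 : 0; }
    return n;
}

using context_ptr = std::unique_ptr<fake_context>;
using buffer_ptr  = std::unique_ptr<fake_buffer, fake_buffer_deleter>;

// Old shape: one buffer slot per context. The raw pointer is overwritten
// for every split file, so only the last buffer ends up owned.
// Returns how many buffers are still live after the container is destroyed.
int load_old_shape(int n_files) {
    const int before = live_buffers();
    {
        std::vector<std::pair<context_ptr, buffer_ptr>> ctxs_bufs;
        fake_buffer * buf = nullptr;
        for (int i = 0; i < n_files; ++i) {
            buf = fake_buffer_alloc(); // previous pointer is lost here
        }
        ctxs_bufs.emplace_back(std::make_unique<fake_context>(), buffer_ptr(buf));
    }
    return live_buffers() - before;
}

// New shape: a vector of buffers per context, so every buffer has an owner.
int load_new_shape(int n_files) {
    const int before = live_buffers();
    {
        std::vector<std::pair<context_ptr, std::vector<buffer_ptr>>> ctxs_bufs;
        std::vector<buffer_ptr> bufs;
        for (int i = 0; i < n_files; ++i) {
            bufs.emplace_back(fake_buffer_alloc());
        }
        ctxs_bufs.emplace_back(std::make_unique<fake_context>(), std::move(bufs));
    }
    return live_buffers() - before;
}
```

In this sketch, calling `load_old_shape(3)` leaves two pool slots marked as in use after the container goes out of scope, while `load_new_shape(3)` leaves none, because each buffer is held by its own owning pointer.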


Development

Successfully merging this pull request may close these issues.

Misc. bug: Metal_Mapped buffer size now incorrectly reporting total on split models (possible memory issues beyond reporting)

3 participants