Replies: 1 comment
-
I don't think we can add an option to take ownership of the buffer because we don't know what function needs to be used to free it. I think that a good option could be to use the |
-
I'm in the process of implementing gguf_init_from_buffer and need some guidance on the expected behavior.
Here are the options I'm considering:
Because models can be very large, I am particularly concerned about the efficiency of the current approach, where the buffer is duplicated. This temporarily doubles RAM usage, which is not ideal.
I'm looking for input on the most appropriate approach that aligns with the overall design principles of ggml.
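To make the trade-off concrete, here is a minimal sketch of three ways an init-from-buffer function could treat the caller's memory: copy it, borrow it, or take ownership together with a caller-supplied deleter. Every name here (`buf_ctx`, `buf_ctx_init`, `buf_mode`, `buf_free_fn`, `counting_free`) is hypothetical and invented for illustration; none of it is ggml API, and it is not meant as the design ggml should adopt. The deleter callback is one possible answer to the problem of not knowing which function frees the buffer, since the caller states it explicitly.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

// Hypothetical sketch, not ggml API: all names below are made up.
// Deleter signature: the caller says how its buffer must be released
// (free, munmap, a custom allocator, ...).
typedef void (*buf_free_fn)(void * data, void * userdata);

enum buf_mode {
    BUF_COPY,   // context makes and owns a private copy (temporarily 2x RAM)
    BUF_BORROW, // context only reads; caller keeps ownership and must outlive it
    BUF_OWN,    // context takes ownership and releases via the supplied deleter
};

struct buf_ctx {
    const uint8_t * data;
    size_t          size;
    buf_free_fn     free_fn;  // NULL when the buffer is borrowed
    void *          userdata; // passed back to free_fn unchanged
};

// Deleter used internally for BUF_COPY, where the copy came from malloc.
static void free_with_stdlib(void * data, void * userdata) {
    (void) userdata;
    free(data);
}

// Example caller-side deleter for BUF_OWN: frees with free() and, if userdata
// points to an int, increments it so the caller can observe the release.
static void counting_free(void * data, void * userdata) {
    free(data);
    if (userdata) {
        ++*(int *) userdata;
    }
}

static struct buf_ctx * buf_ctx_init(void * data, size_t size, enum buf_mode mode,
                                     buf_free_fn free_fn, void * userdata) {
    if (data == NULL || size == 0) {
        return NULL;
    }
    // Taking ownership without a deleter is rejected: we cannot guess how to free.
    if (mode == BUF_OWN && free_fn == NULL) {
        return NULL;
    }

    struct buf_ctx * ctx = calloc(1, sizeof(*ctx));
    if (ctx == NULL) {
        return NULL;
    }
    ctx->size = size;

    switch (mode) {
        case BUF_COPY: {
            void * copy = malloc(size);
            if (copy == NULL) {
                free(ctx);
                return NULL;
            }
            memcpy(copy, data, size);
            ctx->data    = copy;
            ctx->free_fn = free_with_stdlib;
            break;
        }
        case BUF_BORROW:
            ctx->data = data; // no deleter: the caller's buffer is never freed here
            break;
        case BUF_OWN:
            ctx->data     = data;
            ctx->free_fn  = free_fn;
            ctx->userdata = userdata;
            break;
    }
    return ctx;
}

static void buf_ctx_free(struct buf_ctx * ctx) {
    if (ctx == NULL) {
        return;
    }
    if (ctx->free_fn != NULL) {
        ctx->free_fn((void *) ctx->data, ctx->userdata);
    }
    free(ctx);
}
```

In this sketch only `BUF_COPY` pays the double-memory cost. `BUF_BORROW` avoids it but requires the caller to keep the buffer alive for the lifetime of the context, and `BUF_OWN` avoids it at the price of a slightly larger API surface.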