Support loading GGUF model from memory buffer (not just file) #17309

calebnwokocha · 2025-11-16T23:37:57Z

Problem: llama_model_load_from_file(...) currently requires a file path; llama.cpp then either mmaps the file or reads it from disk, so a model that already sits in memory cannot be loaded directly.

Proposal: Add a function like llama_model_load_from_buffer(const void* buf, size_t size, llama_model_params params), mirroring the parameters of llama_model_load_from_file(...), to allow loading a GGUF model from an in-memory buffer.

Use Cases:

- Self-contained executables that embed the GGUF model as data rather than shipping it as an external file.
- Dynamically obtained model blobs (e.g., downloaded into memory, then loaded).
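
To make the idea concrete, here is a minimal sketch in C of the first step a buffer-based loader would need: reading the fixed GGUF header from memory instead of from a file handle. It assumes the GGUF v2+ layout (4-byte magic "GGUF", 32-bit version, 64-bit tensor count, 64-bit metadata key/value count) stored little-endian. The names gguf_header_peek and gguf_peek_header are hypothetical and are not part of llama.cpp; this is not the proposed loader itself, only an illustration of parsing from a (pointer, size) pair.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Fixed-size fields at the start of a GGUF blob (assumed v2+ layout).
 * Hypothetical type, not part of the llama.cpp API. */
typedef struct {
    uint32_t version;
    uint64_t tensor_count;
    uint64_t metadata_kv_count;
} gguf_header_peek;

/* Read an n-byte little-endian unsigned integer (n <= 8). */
static uint64_t read_le(const unsigned char *p, size_t n) {
    uint64_t v = 0;
    for (size_t i = 0; i < n; i++) {
        v |= (uint64_t)p[i] << (8 * i);
    }
    return v;
}

/* Hypothetical helper: parse the GGUF header from an in-memory buffer.
 * Returns 0 on success, -1 if the buffer is NULL, shorter than the
 * 24-byte header, or does not start with the "GGUF" magic. */
int gguf_peek_header(const void *buf, size_t size, gguf_header_peek *out) {
    assert(out != NULL);
    const unsigned char *p = buf;

    if (p == NULL || size < 24) {
        return -1;
    }
    if (memcmp(p, "GGUF", 4) != 0) {
        return -1;
    }
    out->version           = (uint32_t)read_le(p + 4, 4);
    out->tensor_count      = read_le(p + 8, 8);
    out->metadata_kv_count = read_le(p + 16, 8);
    return 0;
}
```

An embedded or downloaded blob would be passed as gguf_peek_header(blob, blob_size, &hdr). A real loader would then continue through the metadata and tensor data from the same buffer, which is the part this proposal asks llama.cpp to support.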
