
Expose ggml_backend_load() and ggml_backend_load_all() to make use of builds with GGML_BACKEND_DL=ON and GGML_CPU_ALL_VARIANTS=ON #2069

@uwu-420

Description


I just tried compiling llama-cpp-python with GGML_BACKEND_DL=ON and GGML_CPU_ALL_VARIANTS=ON to make use of the nice dynamic-dispatch feature, where the backend is loaded as a shared library at runtime. This makes it possible to build llama.cpp once and still pick the best backend for the current CPU, e.g. on x86_64 choosing the backend for the best supported microarchitecture level depending on whether instructions like AVX2 or AVX-512 are available.
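For reference, this is roughly how I pass the flags through to the build. It is just a sketch assuming the usual CMAKE_ARGS mechanism that llama-cpp-python supports for forwarding options to CMake; a plain shell invocation with the same environment variable works equally well:

```python
# Sketch: build llama-cpp-python from source with dynamic backend loading.
# Assumes the CMAKE_ARGS environment variable is honored by the build backend.
import os
import subprocess
import sys

env = dict(os.environ)
env["CMAKE_ARGS"] = "-DGGML_BACKEND_DL=ON -DGGML_CPU_ALL_VARIANTS=ON"

# Force a source build so the CMake flags actually take effect.
subprocess.run(
    [
        sys.executable, "-m", "pip", "install",
        "--no-binary", ":all:", "--force-reinstall",
        "llama-cpp-python",
    ],
    env=env,
    check=True,
)
```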

Compiling worked for me on Ubuntu 24.04 LTS, and when inspecting the wheel I see the backend shared libraries such as bin/libggml-cpu-x64.so, libggml-cpu-sse42.so, libggml-cpu-haswell.so and so on. So that part is good already.

But when loading a model with llama-cpp-python I get this error:
llama_model_load_from_file_impl: no backends are loaded. hint: use ggml_backend_load() or ggml_backend_load_all() to load a backend before calling this function

These functions, however, are not exposed yet via the llama-cpp-python bindings.
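For what it's worth, a minimal ctypes sketch of what the exposed functions could look like is below. The library name and the manual CDLL call are assumptions on my part; the real bindings would presumably reuse the shared-library loader that llama_cpp already has:

```python
# Minimal sketch; "libggml.so" and the explicit CDLL call are assumptions,
# the actual bindings would reuse llama-cpp-python's existing library handle.
import ctypes

_ggml = ctypes.CDLL("libggml.so")

# void ggml_backend_load_all(void);
_ggml.ggml_backend_load_all.argtypes = []
_ggml.ggml_backend_load_all.restype = None

# ggml_backend_reg_t ggml_backend_load(const char * path);  (opaque pointer)
_ggml.ggml_backend_load.argtypes = [ctypes.c_char_p]
_ggml.ggml_backend_load.restype = ctypes.c_void_p


def ggml_backend_load_all() -> None:
    """Load all backends ggml can find (it picks the best CPU variant for the host)."""
    _ggml.ggml_backend_load_all()


def ggml_backend_load(path: str) -> int | None:
    """Load a single backend shared library, e.g. libggml-cpu-haswell.so."""
    return _ggml.ggml_backend_load(path.encode("utf-8"))
```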

I think exposing these functions would be a really great thing to add. It would make the CPU wheels for llama-cpp-python much better, because they would no longer be stuck with baseline x86_64 instructions and could thus be considerably more performant in cases where the wheel cannot be compiled at installation time.
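From the user's side, usage could then look something like this (hypothetical, assuming the functions are re-exported at the top level of the llama_cpp package):

```python
# Hypothetical usage once the bindings exist: load the backends before
# constructing the model so llama_model_load_from_file finds a CPU backend.
import llama_cpp

llama_cpp.ggml_backend_load_all()  # loads the best CPU variant for this machine
llm = llama_cpp.Llama(model_path="model.gguf")
```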
