🚀 The feature
llama.cpp is one of the most widely used LLM inference tools, with native support for more than 95% of GGUF models.
Although llama-server exposes an OpenAI-style API, it is hard to run a main LLM and an embedding model at the same time.
The Python library llama-cpp-python allows running both models simultaneously. It would be really useful if mem0 had native support for llama.cpp via the llama-cpp-python library.
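For illustration, here is a minimal sketch of running a chat model and an embedding model side by side with llama-cpp-python; the model paths are placeholders for whatever local GGUF files you have:

```python
from llama_cpp import Llama

# Placeholder paths: swap in any local chat + embedding GGUF models.
llm = Llama(model_path="models/llama-3-8b-instruct.Q4_K_M.gguf")
embedder = Llama(
    model_path="models/nomic-embed-text-v1.5.Q4_K_M.gguf",
    embedding=True,  # load in embedding mode instead of generation
)

# Chat completion with the main model.
reply = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Summarize what mem0 does."}]
)
print(reply["choices"][0]["message"]["content"])

# Embeddings from the second model, in the same Python process.
vec = embedder.create_embedding("memory is all you need")["data"][0]["embedding"]
print(len(vec))
```

Native mem0 support could wrap this pattern so both the LLM and embedder backends point at local llama.cpp models.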
Motivation, pitch
llama.cpp also supports embedding models, which makes it well suited for local and on-device (edge) AI. Extending support to llama.cpp will encourage local and edge AI use cases to adopt mem0's capabilities.