v1.1.3.3
- The underlying Llama.cpp API server process that drives the local LLM model is now monitored automatically. Replaced the existing process pool's IPC mechanism (Queue and Event) with a more flexible network-based communication method.
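As a rough illustration of the monitoring idea (not the project's actual implementation), a supervisor can poll the server subprocess and respawn it if it has exited. The `ensure_alive` helper and the stand-in command below are hypothetical:

```python
import subprocess
import sys

def ensure_alive(proc, cmd):
    """Respawn the server process if it has exited.

    Illustrative sketch only; the real monitor in the project may
    use different checks and backoff. Returns the (possibly new)
    process handle.
    """
    if proc.poll() is not None:  # non-None means the process exited
        proc = subprocess.Popen(cmd)
    return proc

# Hypothetical usage: a short-lived Python process stands in for the
# Llama.cpp API server; after it exits, the monitor restarts it.
cmd = [sys.executable, "-c", "pass"]
proc = subprocess.Popen(cmd)
proc.wait()                      # simulate a crash/exit
proc = ensure_alive(proc, cmd)   # detects the exit and respawns
proc.wait()
```

In a real server loop this check would run periodically (or on a watchdog thread) rather than once.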
- Local embedding via a Llama.cpp model or a Hugging Face embedding model. For the former, set the `embedding=True` option when defining `LlamaCppModel`. For the latter, additionally install `pytorch` and set a Hugging Face repository such as `intfloat/e5-large-v2` as the value of `LOCAL_EMBEDDING_MODEL` in the `.env` file.
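A minimal sketch of how the Hugging Face option might be wired up: the repository name is read from the environment as set via the `.env` file. The `local_embedding_repo` helper is hypothetical; only the `LOCAL_EMBEDDING_MODEL` variable and the `intfloat/e5-large-v2` example come from the notes above.

```python
import os

def local_embedding_repo():
    """Return the Hugging Face embedding repository configured in the
    environment, or None if unset. Sketch only; the project's actual
    config loading may differ (e.g. parsing .env at startup)."""
    return os.environ.get("LOCAL_EMBEDding_MODEL".upper())

# As if the .env file contained:
#   LOCAL_EMBEDDING_MODEL=intfloat/e5-large-v2
os.environ["LOCAL_EMBEDDING_MODEL"] = "intfloat/e5-large-v2"
print(local_embedding_repo())  # → intfloat/e5-large-v2
```

The actual model download and embedding call (via `pytorch` and the Hugging Face ecosystem) happen downstream of this lookup.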