Arff... never mind, it seems I just missed the RAG_EMBEDDING_MODEL envvars in the documentation...
Hi everyone,
I have a few questions regarding RAG and OpenWebUI in an OFFLINE context.
If I'm not mistaken, when we set the `RAG_EMBEDDING_MODEL` envvar to, let's say, `sentence-transformers/all-MiniLM-L6-v2`, OpenWebUI looks for the model under `/app/backend/data/cache/embedding/models/`.
To my understanding, this means OpenWebUI will run the model itself without leveraging any backend model engine, right?
Is there a way to let OpenWebUI delegate this model run to our backend (ollama currently)?
If so, how can I set OpenWebUI to do so? I mean, do I just need to load ollama with the model?
Currently our OpenWebUI container runs on a host that doesn't have any GPU, but it uses our ollama backend deployment, which itself IS hosted on GPU-based hosts.
Neither zone has any access to the internet, but we load models into ollama ourselves.
Right now, we have successfully loaded Qwen3:8B, Qwen3-embedding:8B and Qwen3-reranker:8B, but we would like to be sure OpenWebUI can indeed use the embedding and reranker models from the ollama instances.
Thanks everyone!
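
For anyone wanting to check the embedding part of this setup from the GPU-less host, here is a minimal sketch that calls ollama's `/api/embed` endpoint directly, using only the Python standard library. The host name `ollama.internal` and the helper names are hypothetical. The model name is the one mentioned above. It only confirms that the ollama instance serves embeddings for that model. It does not cover the reranker, and it does not prove that OpenWebUI itself is configured to use it.

```python
import json
import urllib.request

# Hypothetical address of the GPU-hosted ollama instance (assumption);
# 11434 is ollama's default port.
OLLAMA_BASE_URL = "http://ollama.internal:11434"


def build_embed_request(base_url: str, model: str, texts: list[str]) -> urllib.request.Request:
    """Build a POST request for ollama's /api/embed endpoint."""
    payload = json.dumps({"model": model, "input": texts}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api/embed",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def embedding_dimensions(response_body: bytes) -> list[int]:
    """Return the length of each vector found in an /api/embed response body."""
    return [len(vector) for vector in json.loads(response_body)["embeddings"]]


def check_embedding(base_url: str, model: str) -> list[int]:
    """Send one test sentence to ollama and report the vector size it returns."""
    request = build_embed_request(base_url, model, ["hello offline world"])
    with urllib.request.urlopen(request, timeout=30) as response:
        return embedding_dimensions(response.read())


# Example call, to be run from the OpenWebUI host:
# print(check_embedding(OLLAMA_BASE_URL, "Qwen3-embedding:8B"))
```

If the call returns a non-empty list of vector sizes, the model is reachable over the network. On the OpenWebUI side, the documentation describes `RAG_EMBEDDING_ENGINE` (set to `ollama`) and `RAG_OLLAMA_BASE_URL` alongside `RAG_EMBEDDING_MODEL`. The exact names and behaviour should be verified against the docs for the version deployed.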