This guide describes how the Factory AI DROID CLI uses the local LiteLLM gateway to reach both local Ollama models (via the LiteLLM proxy) and Ollama Cloud fallback models.
- Base URL: `http://127.0.0.1:4000/v1/`
- LiteLLM requires a bearer token. Direnv materialises the master key into `~/.config/lab-secrets/litellm.env` (`LITELLM_MASTER_KEY`).
- Ensure `LITELLM_MASTER_KEY` and `LITELLM_SALT_KEY` live inside `secrets/global.enc.env` and run `source .envrc` before starting DROID or the LiteLLM systemd unit.
The gateway exposes the catalogue described in
`ai/backend/ai-backend-unified/config/litellm-unified.yaml`, including fallback
chains across local, cloud, llama.cpp, and vLLM backends.
Add a custom model entry in `~/.factory/config.json` (or a project override):

```json
{
  "custom_models": [
    {
      "model_display_name": "qwen3-coder (LiteLLM)",
      "model": "qwen3-coder:480b-cloud",
      "base_url": "http://127.0.0.1:4000/v1/",
      "api_key": "env:LITELLM_MASTER_KEY",
      "provider": "generic-chat-completion-api",
      "max_tokens": 128000
    }
  ]
}
```

Use additional entries for other aliases (e.g. `qwen2.5-coder:7b`,
`llama3.1:latest`) pointing at the same base URL. The `env:` prefix instructs
DROID to pull the key from the environment and send it as `Authorization: Bearer <value>`.
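When several aliases share the gateway, the entries differ only in the model name, so they can be generated rather than copied by hand. The Python sketch below is illustrative only: the helper names `custom_model` and `build_config` are hypothetical, the display-name pattern is extrapolated from the single example above, and reusing a `max_tokens` of 128000 for every alias is an assumption that should be checked against each model's real limit.

```python
import json

BASE_URL = "http://127.0.0.1:4000/v1/"


def custom_model(alias: str, max_tokens: int = 128000) -> dict:
    """Build one custom_models entry that routes an alias through LiteLLM.

    max_tokens defaults to the value from the documented example; this is an
    assumption and may be too large for smaller local models.
    """
    family = alias.split(":")[0]  # "qwen3-coder:480b-cloud" -> "qwen3-coder"
    return {
        "model_display_name": f"{family} (LiteLLM)",
        "model": alias,
        "base_url": BASE_URL,
        # DROID resolves the env: prefix from the environment at run time,
        # so the key itself never lands in the config file.
        "api_key": "env:LITELLM_MASTER_KEY",
        "provider": "generic-chat-completion-api",
        "max_tokens": max_tokens,
    }


def build_config(aliases: list) -> dict:
    """Wrap the generated entries in the top-level custom_models key."""
    return {"custom_models": [custom_model(alias) for alias in aliases]}


example = build_config(["qwen3-coder:480b-cloud", "qwen2.5-coder:7b", "llama3.1:latest"])
print(json.dumps(example, indent=2))
```

The printed document has the same shape as the hand-written example, so it can be reviewed before being saved to `~/.factory/config.json` or a project override.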
- Start the service with `ai/services/openwebui/scripts/run_litellm.sh`; it writes a systemd unit that targets `ai/backend/ai-backend-unified/runtime`.
- Validate Redis and the backends using `lab/bin/check_ai_health.sh` and `ai/services/openwebui/scripts/validate_backends.sh` before invoking DROID.
- Unauthenticated requests to `/v1/*` should respond with HTTP 401. Example check with the master key, which should list the available model IDs:

  ```bash
  curl -s -H "Authorization: Bearer $LITELLM_MASTER_KEY" \
    http://127.0.0.1:4000/v1/models | jq '.data[].id'
  ```
- `scripts/bootstrap.sh`: copies the sample config into `~/.factory/config.d` and checks for the master key.
- `scripts/healthcheck.sh`: validates LiteLLM auth (401/200) and optionally Redis.
- `scripts/exec.sh`: wrapper around `droid exec` that ensures the master key is available.
- After a master-key rotation, rerun `scripts/bootstrap.sh` so `~/.factory/config.d/lab-litellm.json` and `~/.config/lab-secrets/litellm.env` pick up the new token.
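The 401/200 auth validation attributed to `scripts/healthcheck.sh` can be approximated with the Python standard library. This is a sketch of the idea, not the script's actual contents: `probe`, `auth_verdict`, and `check_gateway` are hypothetical names, and it assumes `/v1/models` answers 200 for a valid master key, as the curl check above implies.

```python
import os
import urllib.error
import urllib.request
from typing import Optional

MODELS_URL = "http://127.0.0.1:4000/v1/models"


def probe(url: str, token: Optional[str] = None, timeout: float = 5.0) -> int:
    """Return the HTTP status of a GET, sent with or without a bearer token."""
    request = urllib.request.Request(url)
    if token:
        request.add_header("Authorization", f"Bearer {token}")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urllib raises on 4xx/5xx; the status code is what we want here.
        return err.code


def auth_verdict(unauth_status: int, auth_status: int) -> str:
    """Compare observed statuses with the expected 401 (no key) / 200 (key)."""
    if unauth_status != 401:
        return f"FAIL: unauthenticated request returned {unauth_status}, expected 401"
    if auth_status != 200:
        return f"FAIL: authenticated request returned {auth_status}, expected 200"
    return "OK"


def check_gateway(url: str = MODELS_URL) -> str:
    """Run both probes against a live gateway and return the verdict."""
    key = os.environ.get("LITELLM_MASTER_KEY", "")
    if not key:
        return "FAIL: LITELLM_MASTER_KEY is not set; run `source .envrc` first"
    return auth_verdict(probe(url), probe(url, key))
```

Calling `check_gateway()` with the service running should return `"OK"`. A connection failure surfaces as `urllib.error.URLError`, which is deliberately left uncaught so that a stopped gateway is not mistaken for an auth problem.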
- Secrets remain in the encrypted `secrets/global.enc.env`; never commit the plaintext master key.
- To rotate: update `LITELLM_MASTER_KEY` with `sops --set`, run `source .envrc`, and restart the service (`systemctl --user restart litellm.service`). Notify clients (including DROID configs) to reload the key.
- CI jobs invoking `droid exec` should inject the master key through the pipeline secret store prior to execution.
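For CI jobs, or anywhere `droid exec` runs outside a direnv-managed shell, a small guard can fail fast when the key is missing. The sketch below is one possible shape for such a wrapper and is not the contents of `scripts/exec.sh`: the function names are hypothetical, and it assumes `~/.config/lab-secrets/litellm.env` holds plain `KEY=value` lines.

```python
import os
import subprocess
from pathlib import Path

KEY_NAME = "LITELLM_MASTER_KEY"
ENV_FILE = Path.home() / ".config" / "lab-secrets" / "litellm.env"


def parse_env_file(text: str) -> dict:
    """Parse simple KEY=value lines, skipping blanks and comments.

    Assumed file format; an optional leading `export ` and surrounding
    quotes are tolerated, nothing more elaborate.
    """
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        if line.startswith("export "):
            line = line[len("export "):]
        name, _, value = line.partition("=")
        values[name.strip()] = value.strip().strip("'\"")
    return values


def resolve_master_key(environ, env_file: Path = ENV_FILE):
    """Prefer the live environment, then fall back to the direnv-written file."""
    key = environ.get(KEY_NAME)
    if key:
        return key
    if env_file.is_file():
        return parse_env_file(env_file.read_text()).get(KEY_NAME) or None
    return None


def run_droid_exec(args) -> int:
    """Run `droid exec` with the key exported, or exit with a clear message."""
    key = resolve_master_key(os.environ)
    if key is None:
        raise SystemExit(
            f"{KEY_NAME} is unavailable; run `source .envrc` locally "
            "or inject it from the CI secret store"
        )
    env = dict(os.environ)
    env[KEY_NAME] = key
    return subprocess.run(["droid", "exec", *args], env=env).returncode
```

In a pipeline the secret store would normally set the variable directly, so the file fallback only matters on developer machines.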
- Consider enabling LiteLLM encrypted persistence (requires the salt key) for logs/metrics once automation expands.
- Keep the integration roadmap aligned with the hybrid `litellm-hybrid.yaml.applied` profile used by OpenWebUI.