Model Providers

Jacob Centner edited this page Apr 10, 2026 · 3 revisions

Sentinel uses a pluggable model provider system for LLM operations (judge, semantic-drift, test-coherence, synthesis). Three providers are built in.

Provider comparison

| Provider | Local? | Cost | Setup complexity | Best for |
|---|---|---|---|---|
| Ollama | Yes | Free (hardware only) | Low | Default, privacy-sensitive repos |
| OpenAI-compatible | No | Per-token | Medium | Higher capability analysis |
| Azure | No | Per-token | Medium | Enterprise environments |

Ollama (default)

Runs models locally via Ollama. No data leaves your machine.

[sentinel]
provider = "ollama"
model = "qwen3.5:4b"
ollama_url = "http://localhost:11434"

Setup

ollama pull qwen3.5:4b    # default model (~2.5 GB)
ollama serve               # start the server

Recommended models

| Model | Size | Capability tier | Use case |
|---|---|---|---|
| qwen3.5:4b | ~2.5 GB | basic | Default: judge + basic detectors |
| qwen3.5:8b | ~5 GB | basic–standard | Better judgment quality |
| qwen3:14b | ~9 GB | standard | Synthesis + enhanced analysis |

OpenAI-compatible

Works with the OpenAI API and any compatible endpoint (LiteLLM, vLLM, etc.).

[sentinel]
provider = "openai"
model = "gpt-4o-mini"
api_base = "https://api.openai.com/v1"
api_key_env = "OPENAI_API_KEY"
model_capability = "standard"

Set the API key in your environment:

export OPENAI_API_KEY=sk-...

Azure

Works with Azure AI / Azure OpenAI endpoints.

[sentinel]
provider = "azure"
model = "gpt-5.4-nano"
api_base = "https://your-resource.services.ai.azure.com"
api_key_env = "AZURE_API_KEY"
model_capability = "standard"

Running without an LLM

Sentinel works without any LLM provider. Use:

sentinel scan /repo --skip-judge --skip-llm

Or in config:

[sentinel]
skip_judge = true
skip_llm = true

This runs only deterministic and heuristic detectors. The two fully LLM-dependent detectors (semantic-drift, test-coherence) produce no findings. Docs-drift's deterministic sub-checks (stale refs, dep drift) still run; only its optional LLM doc-code comparison is skipped.

Health checks

Sentinel checks provider health before using it. If the provider is unreachable, the LLM-assisted detectors and the judge step are skipped gracefully.

Run sentinel doctor to verify provider connectivity.
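
For a quick manual check of a local Ollama server outside of Sentinel, a probe along the following lines may help. It is a minimal sketch, not Sentinel's own health check: `provider_is_reachable` is a hypothetical helper, and the choice of Ollama's `/api/tags` model-listing endpoint as the probe target is an assumption.

```python
import urllib.error
import urllib.request


def provider_is_reachable(base_url: str, timeout: float = 3.0,
                          opener=urllib.request.urlopen) -> bool:
    """Return True if the server answers the probe with a 2xx status."""
    # Assumption: probe Ollama's model-listing endpoint.
    probe_url = base_url.rstrip("/") + "/api/tags"
    try:
        with opener(probe_url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except OSError:
        # URLError, HTTPError, timeouts and refused connections all derive from OSError.
        return False
```

Calling `provider_is_reachable("http://localhost:11434")` returns False when nothing is listening on that port, which mirrors the "skip rather than fail" behaviour described above.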

Retry logic

Cloud providers (OpenAI, Azure) automatically retry on:

  • Rate limits (429)
  • Server errors (500, 502, 503, 504)
  • Timeouts and connection errors

Retries use exponential backoff, with up to 2 retries per request, and respect Retry-After headers.
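
The retry policy described above can be sketched roughly as follows. This is an illustration under stated assumptions, not Sentinel's implementation: `call_with_retry` and the `send` callable are hypothetical, the 1-second base delay is a guess, and only the seconds form of Retry-After is handled.

```python
import time

RETRYABLE_STATUS = {429, 500, 502, 503, 504}
MAX_RETRIES = 2  # retries after the initial attempt


def call_with_retry(send, sleep=time.sleep, base_delay=1.0):
    """Call send() and retry on transient failures (hypothetical helper).

    send() returns (status, headers, body) and may raise
    TimeoutError or ConnectionError.
    """
    for attempt in range(MAX_RETRIES + 1):
        last_attempt = attempt == MAX_RETRIES
        try:
            status, headers, body = send()
        except (TimeoutError, ConnectionError):
            if last_attempt:
                raise
            sleep(base_delay * 2 ** attempt)  # exponential backoff: 1s, 2s
            continue
        if status not in RETRYABLE_STATUS or last_attempt:
            return status, headers, body
        # Prefer the server's Retry-After (seconds form) over the computed delay.
        retry_after = headers.get("Retry-After", "")
        delay = float(retry_after) if retry_after.isdigit() else base_delay * 2 ** attempt
        sleep(delay)
```

With 2 retries the request is sent at most three times; if the last attempt still returns a retryable status, the sketch hands that response back to the caller instead of raising.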
