Model Providers

Sentinel uses a pluggable model provider system for LLM operations (judge, semantic-drift, test-coherence, synthesis). Three providers are built in. Your choice of model determines which Capability Tiers to configure.

Provider comparison

Provider	Local?	Cost	Setup complexity	Best for
Ollama	Yes	Free (hardware only)	Low	Default, privacy-sensitive repos
OpenAI-compatible	No	Per-token	Medium	Higher capability analysis
Azure	No	Per-token	Medium	Enterprise environments

Ollama (default)

Runs models locally via Ollama. No data leaves your machine.

[sentinel]
provider = "ollama"
model = "qwen3.5:4b"
ollama_url = "http://localhost:11434"

Setup

ollama serve               # start the server first (skip if Ollama already runs as a background service)
ollama pull qwen3.5:4b    # then, in another terminal, pull the default model (~2.5 GB)
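To confirm that the server is up and the model has been pulled, one option is to query Ollama's `/api/tags` endpoint, which lists locally available models. The sketch below is not part of Sentinel. The helper names `fetch_tags` and `model_is_available` are hypothetical, and the code assumes the response has the shape `{"models": [{"name": ...}, ...]}`.

```python
import json
import urllib.error
import urllib.request


def fetch_tags(base_url, timeout=5.0):
    """Return the parsed JSON from Ollama's /api/tags, or None if unreachable."""
    url = base_url.rstrip("/") + "/api/tags"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return json.load(resp)
    except (urllib.error.URLError, OSError, ValueError):
        # Connection refused, timeout, or a non-JSON body: treat as unreachable.
        return None


def model_is_available(tags, model):
    """Check whether `model` appears in a parsed /api/tags response."""
    if not tags:
        return False
    names = {entry.get("name") for entry in tags.get("models", [])}
    return model in names
```

For example, `model_is_available(fetch_tags("http://localhost:11434"), "qwen3.5:4b")` returns `False` both when the server is down and when the model has not been pulled yet, so check the result of `fetch_tags` separately if you need to tell the two cases apart.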

Recommended models

Model	Size	Capability tier	Use case
`qwen3.5:4b`	~2.5 GB	basic	Default — judge + basic detectors
`qwen3.5:8b`	~5 GB	basic–standard	Better judgment quality
`qwen3:14b`	~9 GB	standard	Synthesis + enhanced analysis

OpenAI-compatible

Works with OpenAI API and any compatible endpoint (LiteLLM, vLLM, etc.).

[sentinel]
provider = "openai"
model = "gpt-4o-mini"
api_base = "https://api.openai.com/v1"
api_key_env = "OPENAI_API_KEY"
model_capability = "standard"

Set the API key in your environment:

export OPENAI_API_KEY=sk-...
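The `api_key_env` setting names an environment variable rather than holding the key itself, which keeps the secret out of the config file. A minimal sketch of how such an indirection can be resolved, assuming the `[sentinel]` table has already been loaded into a dict. `resolve_api_key` is a hypothetical helper for illustration, not Sentinel's actual code.

```python
import os


def resolve_api_key(config, environ=None):
    """Look up the API key in the environment variable named by `api_key_env`."""
    environ = os.environ if environ is None else environ
    var_name = config.get("api_key_env")
    if not var_name:
        raise ValueError("config is missing 'api_key_env'")
    key = environ.get(var_name, "").strip()
    if not key:
        # Fail early with the variable name so the fix is obvious.
        raise RuntimeError(f"environment variable {var_name} is not set or empty")
    return key
```

Passing the environment as a parameter keeps the helper easy to test without touching the real process environment.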

Azure

Works with Azure AI / Azure OpenAI endpoints.

[sentinel]
provider = "azure"
model = "gpt-5.4-nano"
api_base = "https://your-resource.services.ai.azure.com"
api_key_env = "AZURE_API_KEY"
model_capability = "standard"

Running without an LLM

Sentinel works without any LLM provider. Use:

sentinel scan /repo --skip-judge --skip-llm

Or in config:

[sentinel]
skip_judge = true
skip_llm = true

This runs only deterministic and heuristic detectors. The two fully LLM-dependent detectors (semantic-drift, test-coherence) produce no findings. Docs-drift's deterministic sub-checks (stale refs, dep drift) still run — only its optional LLM doc-code comparison is skipped.
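The effect of the two flags can be pictured as a filter over the detector list. The sketch below is illustrative only. The `plan_run` function, the `"none"` / `"optional"` / `"required"` classification, and the example detector `todo-scanner` are hypothetical. Only semantic-drift, test-coherence, and docs-drift come from the description above.

```python
def plan_run(detectors, skip_llm=False, skip_judge=False):
    """Map each detector name to a run mode and report whether the judge runs.

    `detectors` maps a name to its LLM dependence:
    "none", "optional", or "required" (a hypothetical classification).
    """
    plan = {}
    for name, llm_use in detectors.items():
        if skip_llm and llm_use == "required":
            plan[name] = "no findings"          # fully LLM-dependent
        elif skip_llm and llm_use == "optional":
            plan[name] = "deterministic only"   # LLM sub-check is skipped
        else:
            plan[name] = "full"
    return plan, not skip_judge


DETECTORS = {
    "semantic-drift": "required",
    "test-coherence": "required",
    "docs-drift": "optional",
    "todo-scanner": "none",  # hypothetical deterministic detector
}
```

With `skip_llm=True`, the two fully LLM-dependent detectors report nothing, docs-drift keeps its deterministic sub-checks, and everything else is unaffected.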

Health checks

Sentinel checks provider health before using it. If the provider is unreachable, the LLM-assisted detectors and the judge step are skipped gracefully.

Run `sentinel doctor` to verify provider connectivity.
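The health-check behavior amounts to a probe followed by a graceful downgrade. A minimal sketch under stated assumptions: `probe` is any callable that returns a truthy value when the provider responds, and `effective_flags` is a hypothetical helper, not Sentinel's internals.

```python
def effective_flags(probe, skip_llm=False, skip_judge=False):
    """Downgrade to a no-LLM run when the provider health probe fails."""
    if skip_llm and skip_judge:
        # Nothing needs the provider, so there is no point probing it.
        return {"skip_llm": True, "skip_judge": True, "provider_ok": None}
    try:
        healthy = bool(probe())
    except Exception:
        # A probe that raises (refused connection, timeout) counts as unhealthy.
        healthy = False
    if not healthy:
        # Skip LLM-assisted detectors and the judge instead of failing the run.
        return {"skip_llm": True, "skip_judge": True, "provider_ok": False}
    return {"skip_llm": skip_llm, "skip_judge": skip_judge, "provider_ok": True}
```

The important design choice is that a failed probe changes the flags rather than raising, so the deterministic part of the scan can still complete.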

Retry logic

Cloud providers (OpenAI, Azure) automatically retry on:

Rate limits (429)
Server errors (500, 502, 503, 504)
Timeouts and connection errors

Retries use exponential backoff, with up to 2 retries per request, and respect Retry-After headers.
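The retry policy described above can be sketched as follows. This is an illustration, not Sentinel's implementation. The 1-second base delay, the doubling factor, the `send` callable returning `(status, headers, body)`, and the choice to let a numeric Retry-After value override the computed delay are all assumptions. The HTTP-date form of Retry-After is not handled.

```python
import time

RETRYABLE_STATUS = {429, 500, 502, 503, 504}


def call_with_retry(send, max_retries=2, base_delay=1.0, sleep=time.sleep):
    """Call `send()` and retry on retryable statuses or connection failures."""
    attempt = 0
    while True:
        try:
            status, headers, body = send()
        except (TimeoutError, ConnectionError):
            if attempt >= max_retries:
                raise  # out of retries: surface the original error
            delay = base_delay * (2 ** attempt)
        else:
            if status not in RETRYABLE_STATUS or attempt >= max_retries:
                return status, body
            delay = base_delay * (2 ** attempt)  # 1s, 2s, ...
            retry_after = headers.get("Retry-After")
            if retry_after is not None:
                try:
                    delay = float(retry_after)  # server hint wins (assumption)
                except ValueError:
                    pass  # HTTP-date form: keep the computed backoff
        sleep(delay)
        attempt += 1
```

With `max_retries=2`, a request is attempted at most three times in total. Injecting `sleep` makes the backoff schedule observable in tests without real waiting.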
