Capability Tiers

Jacob Centner edited this page Apr 10, 2026 · 2 revisions

Sentinel's capability tier system lets detectors adapt their behavior to the strength of your model. A small local model yields basic signals; a larger model, once you declare it with a higher tier, unlocks richer analysis with no further configuration.

The four tiers

| Tier | Config value | Typical models | What it unlocks |
|------|--------------|----------------|-----------------|
| None | "none" | | Deterministic detectors only. No LLM calls at all. |
| Basic | "basic" | Qwen3.5 4B, Phi-3.5 mini | LLM judge, semantic-drift, test-coherence (binary signals) |
| Standard | "standard" | Qwen3.5 8B+, GPT-5.4-nano | Enhanced analysis mode, finding cluster synthesis |
| Advanced | "advanced" | GPT-5.4-mini, Claude Haiku 4.5 | Deep semantic analysis (reserved for future detectors) |

The default is basic — designed for the target hardware (8 GB VRAM, Qwen3.5 4B).

How tiers affect the pipeline

Detector execution (advisory)

Each detector declares a required tier. The runner compares it against your configured tier at scan time:

  • If your tier meets or exceeds the requirement: the detector runs normally.
  • If your tier is below the requirement: the detector still runs but Sentinel logs a warning. Results may be lower quality.

This is intentional. The tier system is advisory, not a gate — it warns about mismatches rather than silently skipping detectors. The only way to disable detectors is explicitly via --skip-llm, --skip-judge, or disabled_detectors.
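
The advisory check described above can be sketched roughly as follows. This is a minimal illustration, not Sentinel's actual runner code: the names `TIER_RANK`, `meets_requirement`, and `run_detector`, the logger name, and the warning text are all assumptions made for the example.

```python
import logging

logger = logging.getLogger("sentinel.sketch")  # hypothetical logger name

# Rank of each tier, lowest to highest.
TIER_RANK = {"none": 0, "basic": 1, "standard": 2, "advanced": 3}


def meets_requirement(detector: str, required: str, configured: str) -> bool:
    """Return True when the configured tier meets or exceeds the requirement.

    Advisory only: a False result logs a warning, and the caller is
    expected to run the detector anyway.
    """
    ok = TIER_RANK[configured] >= TIER_RANK[required]
    if not ok:
        logger.warning(
            "%s declares tier '%s' but the configured tier is '%s'; "
            "results may be lower quality",
            detector, required, configured,
        )
    return ok


def run_detector(detector: str, required: str, configured: str) -> str:
    """Hypothetical stand-in for the runner: warn on mismatch, never skip."""
    meets_requirement(detector, required, configured)
    return f"{detector}: ran"
```

The point of the sketch is that the comparison result only drives the warning; the detector runs in both branches.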

Built-in detector requirements

| Detector | Required tier | Why |
|----------|---------------|-----|
| todo-scanner | none | Pure regex pattern matching |
| complexity | none | AST-based cyclomatic complexity |
| dead-code | none | Heuristic unreachable code detection |
| dep-audit | none | Package version/advisory checks |
| docs-drift | none | File modification timestamp comparison |
| git-hotspots | none | Git log frequency analysis |
| lint-runner | none | Wraps ruff/flake8 |
| eslint-runner | none | Wraps ESLint |
| go-linter | none | Wraps golangci-lint |
| rust-clippy | none | Wraps cargo clippy |
| stale-env | none | Config file staleness checks |
| unused-deps | none | Import/dependency cross-reference |
| semantic-drift | basic | LLM compares docs against code |
| test-coherence | basic | LLM compares tests against implementation |

LLM judge (no tier gate)

The judge runs whenever a provider is available and --skip-judge is not set. It has no capability tier check — it works with any model, though larger models produce better severity/confidence calibration.

Cluster synthesis (hard gate at standard)

This is the only hard gate in the system. When Sentinel groups related findings into clusters, the synthesis step (which asks the LLM to summarize cluster themes) is completely skipped if your tier is below standard:

  • Below standard: Findings are grouped by heuristic clustering but get no LLM synthesis summary. The cluster's description is the title of the highest-confidence finding.
  • Standard or above: The LLM generates a narrative summary explaining the cluster theme and connecting the related findings.
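
A rough sketch of the gate described in the two cases above, assuming a cluster is simply a list of findings with a title and a confidence. The `Finding` type, `describe_cluster`, and the `synthesize` callback are hypothetical names chosen for illustration; the real synthesis step calls an LLM.

```python
from dataclasses import dataclass
from typing import Callable, Sequence

TIER_RANK = {"none": 0, "basic": 1, "standard": 2, "advanced": 3}


@dataclass
class Finding:
    title: str
    confidence: float


def describe_cluster(
    findings: Sequence[Finding],
    tier: str,
    synthesize: Callable[[Sequence[Finding]], str],
) -> str:
    """Return the cluster description for the given tier.

    At standard or above, delegate to the (LLM-backed) synthesize callback.
    Below standard, use the title of the highest-confidence finding.
    """
    if TIER_RANK[tier] >= TIER_RANK["standard"]:
        return synthesize(findings)
    return max(findings, key=lambda f: f.confidence).title


# Usage with a stub in place of the LLM call:
cluster = [Finding("Stale README section", 0.6), Finding("Outdated API docs", 0.8)]
stub = lambda fs: f"summary of {len(fs)} findings"
basic_description = describe_cluster(cluster, "basic", stub)
standard_description = describe_cluster(cluster, "standard", stub)
```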

Enhanced vs basic mode

The two LLM-assisted detectors (semantic-drift and test-coherence) adapt their behavior based on the configured tier. This happens automatically — no additional configuration needed.

semantic-drift

| Aspect | Basic mode (basic tier) | Enhanced mode (standard+ tier) |
|--------|-------------------------|--------------------------------|
| Prompt | Binary: "does this doc match the code?" | Structured: severity, reason, and specific mismatches |
| Doc context | 800 characters max | 1,500 characters max |
| Code context | 2,000 characters max | 3,000 characters max |
| Output | needs_review (yes/no) + reason | needs_review + severity + reason + specifics[] |
| Base confidence | 0.60 | 0.75 |

test-coherence

| Aspect | Basic mode (basic tier) | Enhanced mode (standard+ tier) |
|--------|-------------------------|--------------------------------|
| Prompt | Binary: "do these tests cover this code?" | Structured: gaps, severity, and specific missing coverage |
| Test context | 1,500 characters max | 3,000 characters max |
| Implementation context | 1,500 characters max | 3,000 characters max |
| Output | needs_review (yes/no) + reason | needs_review + severity + reason + gaps[] |
| Base confidence | 0.60 | 0.75 |

Enhanced mode produces richer findings that cite specific mismatches or gaps rather than a general "needs review" signal. This makes the morning report more actionable without requiring follow-up investigation.
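
To make the mode switch concrete, here is a minimal sketch for semantic-drift using the character limits and base confidences documented above. The `DriftMode` type and the `select_mode` and `build_context` helpers are hypothetical; truncating by simple slicing is also an assumption, since the page does not say how context is trimmed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DriftMode:
    name: str
    doc_chars: int          # maximum doc context, in characters
    code_chars: int         # maximum code context, in characters
    base_confidence: float


BASIC_MODE = DriftMode("basic", doc_chars=800, code_chars=2000, base_confidence=0.60)
ENHANCED_MODE = DriftMode("enhanced", doc_chars=1500, code_chars=3000, base_confidence=0.75)


def select_mode(tier: str) -> DriftMode:
    """Enhanced mode at standard or advanced; basic mode otherwise."""
    return ENHANCED_MODE if tier in ("standard", "advanced") else BASIC_MODE


def build_context(doc: str, code: str, tier: str) -> tuple[str, str]:
    """Trim doc and code text to the limits of the selected mode."""
    mode = select_mode(tier)
    return doc[: mode.doc_chars], code[: mode.code_chars]
```

test-coherence would follow the same pattern with its own limits (1,500 characters per side in basic mode, 3,000 in enhanced mode).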

Setting your tier

Global setting

# sentinel.toml
[sentinel]
model_capability = "basic"  # or "none", "standard", "advanced"

CLI override

sentinel scan /repo --capability standard

Per-detector override

You can set a different tier for specific detectors if they use a different provider. This is useful when routing LLM detectors to a cloud model while keeping the judge on a local model:

[sentinel]
model_capability = "basic"  # global: local 4B model

[sentinel.detector_providers.semantic-drift]
provider = "openai"
model = "gpt-4o-mini"
api_base = "https://api.openai.com/v1"
api_key_env = "OPENAI_API_KEY"
model_capability = "standard"  # this detector gets enhanced mode

When a per-detector override includes model_capability, that value is used instead of the global setting — but only for that detector's adaptive behavior and the runner's tier warning. The synthesis gate always uses the global tier.
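
The resolution rule can be sketched as below. The dict mirrors the TOML example above as it might look after parsing; that shape, and the function names `effective_tier` and `synthesis_tier`, are assumptions for illustration rather than Sentinel's actual internals.

```python
# Plain-dict mirror of the TOML example (the parsed shape is an assumption).
CONFIG = {
    "sentinel": {
        "model_capability": "basic",
        "detector_providers": {
            "semantic-drift": {
                "provider": "openai",
                "model_capability": "standard",
            },
        },
    },
}


def effective_tier(config: dict, detector: str) -> str:
    """Per-detector override if present, otherwise the global tier."""
    sentinel = config["sentinel"]
    override = sentinel.get("detector_providers", {}).get(detector, {})
    return override.get("model_capability", sentinel["model_capability"])


def synthesis_tier(config: dict) -> str:
    """The synthesis gate ignores per-detector overrides."""
    return config["sentinel"]["model_capability"]
```

With this config, semantic-drift resolves to standard, test-coherence stays at basic, and the synthesis gate still sees basic, so cluster synthesis remains off.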

Choosing the right tier

| Your setup | Recommended tier | What you get |
|------------|------------------|--------------|
| No LLM / skip_llm = true | none | 12 deterministic detectors, no judge |
| Qwen3.5 4B (default) | basic | All 14 detectors + judge, binary LLM signals |
| Qwen3.5 8B or small cloud model | standard | Enhanced detector output + cluster synthesis |
| Frontier cloud model | advanced | Same as standard today (future detectors will use this) |

Setting a tier above what your model can handle gains you nothing: it only tells Sentinel to send larger prompts and expect structured output. If the model cannot cope with those prompts, you get parse failures that fall back to basic behavior. Choose the tier that matches your model's actual capability.
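
The fallback on parse failure might look roughly like this. Everything here is an assumption made for illustration: the page does not state that the structured response is JSON, and the `parse_enhanced` helper and the "starts with yes" heuristic in the fallback branch are hypothetical. Only the field names and the base confidences come from the tables above.

```python
import json


def parse_enhanced(raw: str) -> dict:
    """Parse a structured (enhanced-mode) response, or fall back to basic.

    Assumes, for this sketch only, that enhanced output is a JSON object
    with needs_review, severity, reason, and specifics.
    """
    try:
        data = json.loads(raw)
        return {
            "mode": "enhanced",
            "needs_review": bool(data["needs_review"]),
            "severity": str(data["severity"]),
            "reason": str(data["reason"]),
            "specifics": list(data["specifics"]),
            "confidence": 0.75,
        }
    except (ValueError, KeyError, TypeError):
        # Not valid JSON, or fields missing: treat as a binary basic signal.
        text = raw.strip()
        return {
            "mode": "basic",
            "needs_review": text.lower().startswith("yes"),  # hypothetical heuristic
            "reason": text,
            "confidence": 0.60,
        }
```

A model that answers in free text instead of the requested structure ends up on the basic path, which is why declaring a higher tier than the model supports buys nothing.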

How it works internally

  1. Config loading: sentinel.toml → SentinelConfig.model_capability (validated against none|basic|standard|advanced)
  2. Runner setup: Converts the string to a CapabilityTier enum value
  3. Per-detector loop: Compares detector.capability_tier against the effective tier (per-detector override or global)
  4. Adaptive detectors: Read model_capability from scan context and select basic or enhanced code path
  5. Post-scan: Runner checks that the global tier ≥ standard before calling synthesis

The tier comparison uses an internal ordering: NONE(0) < BASIC(1) < STANDARD(2) < ADVANCED(3).
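
The ordering and the string-to-enum conversion from steps 1 and 2 can be sketched with Python's `IntEnum`. The enum name `CapabilityTier` and its member values come from this page; the `parse_tier` helper and its error message are hypothetical.

```python
from enum import IntEnum


class CapabilityTier(IntEnum):
    NONE = 0
    BASIC = 1
    STANDARD = 2
    ADVANCED = 3


# Accepted config strings: "none", "basic", "standard", "advanced".
_BY_CONFIG_VALUE = {tier.name.lower(): tier for tier in CapabilityTier}


def parse_tier(value: str) -> CapabilityTier:
    """Convert a config string to a tier, rejecting unknown values."""
    try:
        return _BY_CONFIG_VALUE[value]
    except KeyError:
        allowed = "|".join(_BY_CONFIG_VALUE)
        raise ValueError(f"model_capability must be one of {allowed}, got {value!r}") from None


# IntEnum members compare as integers, so tier checks are plain comparisons.
synthesis_allowed = parse_tier("basic") >= CapabilityTier.STANDARD
```

Using an integer-backed enum keeps every tier check in the pipeline a single `>=` comparison.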
