Detector: Test Coherence

Compares test functions against their implementation counterparts using an LLM to detect stale or meaningless tests.

Property	Value
Name	`test-coherence`
Tier	LLM_ASSISTED
Languages	Python
External tool	None
LLM required	Yes — minimum `basic` capability tier
Confidence	0.60 (basic), 0.75 (enhanced)

What it detects

Tests that no longer meaningfully validate the code they claim to test:

- Test assertions that don't match the current function behavior
- Tests that exercise an outdated API signature
- Tests that trivially always pass
- Tests that duplicate other tests without adding coverage

How it works

1. Finds test files matching `test_*.py` or `*_test.py`
2. Pairs test files with implementation files via:
   - Naming convention: `test_config.py` → `config.py`
   - Import analysis: follows imports in the test file
3. Extracts matched (test function, implementation function) pairs via AST
4. Sends each pair to the LLM for coherence assessment
5. The LLM judges whether the test meaningfully validates the current implementation

Limits

Parameter	Value
Max pairs per file	5 (prevents runaway LLM costs)
Min function lines	3 (skips trivial functions)
Truncation (basic)	1500 chars
Truncation (enhanced)	3000 chars
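One way these limits could combine is sketched below. The constants mirror the table, but the function name and the order of operations are assumptions: here the line minimum is checked against the implementation function, truncation is applied to each source separately, and the cap is applied after filtering.

```python
MAX_PAIRS_PER_FILE = 5
MIN_FUNCTION_LINES = 3
TRUNCATION_CHARS = {"basic": 1500, "enhanced": 3000}


def select_pairs(pairs, mode="basic"):
    """Filter and truncate (test_source, impl_source) pairs for one file.

    Hypothetical helper: the real detector may order these steps differently.
    """
    limit = TRUNCATION_CHARS[mode]
    selected = []
    for test_src, impl_src in pairs:
        # Skip trivial implementation functions (fewer than 3 lines).
        if len(impl_src.splitlines()) < MIN_FUNCTION_LINES:
            continue
        # Truncate each source independently to the per-mode character limit.
        selected.append((test_src[:limit], impl_src[:limit]))
        # Stop once the per-file cap is reached.
        if len(selected) == MAX_PAIRS_PER_FILE:
            break
    return selected
```

With this ordering, skipped trivial functions do not count toward the cap of 5.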

Capability tiers

Tier	Behavior
`basic`	Binary coherent/stale judgment, smaller context
`standard`+	Nuanced assessment with severity levels, larger context

Example finding

[TEST-COHERENCE] tests/test_config.py:test_load_config → src/sentinel/config.py:load_config
  Test checks for 'model' field default of 'llama2' but the implementation
  default was changed to 'qwen3.5:4b'. Test is stale.
  Severity: MEDIUM, Confidence: 0.60
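For readers who post-process findings, the fields in this output could be modeled as below. The `Finding` class and its field names are hypothetical; only the rendered layout is taken from the example above.

```python
import textwrap
from dataclasses import dataclass


@dataclass
class Finding:
    """Hypothetical container for one test-coherence finding."""
    test_file: str
    test_function: str
    impl_file: str
    impl_function: str
    message: str
    severity: str
    confidence: float

    def render(self) -> str:
        # Header: test location, arrow, implementation location.
        header = (
            f"[TEST-COHERENCE] {self.test_file}:{self.test_function}"
            f" → {self.impl_file}:{self.impl_function}"
        )
        # Message lines are indented by two spaces, as in the example.
        body = textwrap.indent(self.message, "  ")
        footer = f"  Severity: {self.severity}, Confidence: {self.confidence:.2f}"
        return "\n".join([header, body, footer])


finding = Finding(
    test_file="tests/test_config.py",
    test_function="test_load_config",
    impl_file="src/sentinel/config.py",
    impl_function="load_config",
    message=(
        "Test checks for 'model' field default of 'llama2' but the implementation\n"
        "default was changed to 'qwen3.5:4b'. Test is stale."
    ),
    severity="MEDIUM",
    confidence=0.60,
)
print(finding.render())
```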

Configuration

[sentinel]
model_capability = "basic"     # minimum for this detector
skip_llm = false              # must be false

Known limitations

- Python only
- Requires a healthy LLM provider
- Result quality depends on model capability
- Pairs tests only via naming convention and imports, so it misses indirect test coverage
- Not yet validated at scale against real-world repos
- The cap of 5 pairs per file may miss issues in files with many functions

Local Repo Sentinel · MIT License
