
feat(audit): --quality and --ocr-text vision-based checks#82

Merged: gfargo merged 1 commit into main from feat/vision-audit on May 11, 2026

@gfargo (Owner) commented on May 11, 2026

Closes #81; see the issue for background. (--inconsistent is deferred: it is embedding-based and needs a separate design.)

Two new audit checks powered by Ollama vision:

- --quality   flags blurry / low-contrast / poorly-composed images
- --ocr-text  flags images that visually contain a supplied text string

Both run one Ollama call per item (~10s each), so neither is included
in the default "run all checks" behavior; each must be opted into
explicitly. If Ollama isn't reachable, the check is skipped with a
warning rather than failing the whole audit.
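
The opt-in and skip-on-unreachable behavior described above could be
sketched roughly as follows. This is a minimal illustration, not the
code from this PR: `planAudit`, `AuditPlan`, `otherChecks`, and the
`DEFAULT_CHECKS` placeholder names are hypothetical, and it assumes
that requesting a vision check runs only the explicitly requested
checks. Only the `quality` and `ocrText` option names mirror the
fields mentioned in this description.

```typescript
// Hypothetical sketch: names other than `quality` / `ocrText` are assumptions.
type VisionCheck = "quality" | "ocr-text";

interface AuditOptions {
  quality?: boolean;      // corresponds to --quality
  ocrText?: string;       // corresponds to --ocr-text <string>
  otherChecks?: string[]; // hypothetical: explicitly requested non-vision checks
}

interface AuditPlan {
  checks: string[];   // checks that will actually run
  warnings: string[]; // one entry per skipped vision check
}

// Placeholder for whatever the default "run all checks" set really contains.
const DEFAULT_CHECKS: readonly string[] = ["default-check-a", "default-check-b"];

function planAudit(opts: AuditOptions, ollamaReachable: boolean): AuditPlan {
  // Vision checks are collected only when explicitly requested.
  const vision: VisionCheck[] = [];
  if (opts.quality) vision.push("quality");
  if (opts.ocrText !== undefined && opts.ocrText !== "") vision.push("ocr-text");

  const explicit = opts.otherChecks ?? [];
  // Nothing requested at all: fall back to the defaults, which never
  // include the vision checks.
  const nothingRequested = explicit.length === 0 && vision.length === 0;
  const checks: string[] = nothingRequested ? [...DEFAULT_CHECKS] : [...explicit];

  const warnings: string[] = [];
  for (const check of vision) {
    if (ollamaReachable) {
      checks.push(check);
    } else {
      // Skip with a warning instead of failing the whole audit.
      warnings.push(`Skipping ${check}: Ollama is not reachable`);
    }
  }
  return { checks, warnings };
}
```

Returning warnings as data rather than throwing keeps the remaining
checks running when Ollama is down, which matches the "skip, don't
fail" behavior described above.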

--inconsistent (cross-library style outliers) is deferred; it requires
embedding-based clustering and is a separate design conversation.

- New AuditFinding types: 'quality' and 'ocr-match'
- New helpers detectQualityIssues / detectOcrMatches using
  generateCaption with strict YES/NO prompts
- audit MCP tool gains `quality` (bool) and `ocrText` (string) fields
- JSON output `summary` extended with `quality` + `ocrMatch` counts
- Text output grouping extended with new sections
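
As a rough illustration of the strict YES/NO approach and the extended
`summary` counts listed above, the reply handling might look something
like the sketch below. It is not the project's implementation: the
prompt wording, `buildOcrPrompt`, `parseYesNo`, `findingFromReply`,
`summarizeVision`, and the `itemId` / `detail` fields are all
hypothetical, and it assumes a YES reply means the image should be
flagged. The actual call to `generateCaption` is omitted because its
signature is not shown here.

```typescript
// Illustrative only: fields other than `type` are assumptions, not the
// project's actual AuditFinding shape.
type VisionFindingType = "quality" | "ocr-match";

interface VisionFinding {
  type: VisionFindingType;
  itemId: string; // hypothetical identifier for the audited image
  detail: string;
}

// Hypothetical prompt wording; the description only says prompts are strict YES/NO.
function buildOcrPrompt(needle: string): string {
  return `Does this image visibly contain the text "${needle}"? Answer with exactly YES or NO.`;
}

// Interpret a model reply strictly: anything that does not start with a
// clear YES or NO is treated as unknown (null).
function parseYesNo(reply: string): boolean | null {
  const words = reply.trim().toUpperCase().replace(/[^A-Z]+/g, " ").trim().split(" ");
  const first = words[0];
  if (first === "YES") return true;
  if (first === "NO") return false;
  return null;
}

// Assumption: a YES reply means the item should be flagged; NO or an
// unparseable reply produces no finding.
function findingFromReply(
  type: VisionFindingType,
  itemId: string,
  reply: string
): VisionFinding | null {
  if (parseYesNo(reply) !== true) return null;
  return { type, itemId, detail: `model answered YES for ${type}` };
}

// Count findings into the `quality` / `ocrMatch` keys mentioned for the JSON summary.
function summarizeVision(findings: VisionFinding[]): { quality: number; ocrMatch: number } {
  const summary = { quality: 0, ocrMatch: 0 };
  for (const f of findings) {
    if (f.type === "quality") summary.quality += 1;
    else summary.ocrMatch += 1;
  }
  return summary;
}
```

Treating ambiguous replies as "no finding" keeps false positives down
at the cost of occasionally missing an issue; a stricter variant could
surface them as warnings instead.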

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@gfargo merged commit 2eed1ea into main on May 11, 2026
4 checks passed
@gfargo deleted the feat/vision-audit branch on May 11, 2026 20:55
