Log and export quality and connection diagnostics that feed the evaluation harness.

## Acceptance criteria

- [ ] Logs and can export: categorization results, false positives, false negatives, and connector/connection diagnostics.
- [ ] Metrics feed the evaluation harness.

### Test acceptance criteria

- [ ] CI asserts false-positive rate <5% and false-negative rate <2% against the #1230 corpus; the eval harness fails when either is exceeded.

Consumed by the LLM-judge (#1279) and CI (#1112).
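
The threshold gate in the test acceptance criteria could look roughly like the sketch below. This is only an illustration under stated assumptions: the `Outcome` record, `error_rates`, and `check_thresholds` are hypothetical names, not part of the harness; each corpus item is assumed to reduce to a binary expected/predicted pair; and the rates are assumed to be FP / (all expected negatives) and FN / (all expected positives). The issue does not define the denominators, so adjust them to match the harness.

```python
from dataclasses import dataclass

# Limits taken from the test acceptance criteria.
FP_RATE_LIMIT = 0.05  # false-positive rate must stay below 5%
FN_RATE_LIMIT = 0.02  # false-negative rate must stay below 2%


@dataclass(frozen=True)
class Outcome:
    """Hypothetical record for one corpus item: ground truth vs. prediction."""
    expected: bool
    predicted: bool


def error_rates(outcomes):
    """Return (fp_rate, fn_rate) under the assumed definitions.

    fp_rate = false positives / expected negatives
    fn_rate = false negatives / expected positives
    An empty class yields a rate of 0.0 rather than dividing by zero.
    """
    negatives = sum(1 for o in outcomes if not o.expected)
    positives = sum(1 for o in outcomes if o.expected)
    fp = sum(1 for o in outcomes if o.predicted and not o.expected)
    fn = sum(1 for o in outcomes if not o.predicted and o.expected)
    fp_rate = fp / negatives if negatives else 0.0
    fn_rate = fn / positives if positives else 0.0
    return fp_rate, fn_rate


def check_thresholds(outcomes):
    """Return a list of failure messages; an empty list means the gate passes.

    Assumption: a rate equal to the limit counts as a failure, because the
    criterion is written with a strict '<'.
    """
    fp_rate, fn_rate = error_rates(outcomes)
    failures = []
    if fp_rate >= FP_RATE_LIMIT:
        failures.append(f"false-positive rate {fp_rate:.2%} is not below {FP_RATE_LIMIT:.0%}")
    if fn_rate >= FN_RATE_LIMIT:
        failures.append(f"false-negative rate {fn_rate:.2%} is not below {FN_RATE_LIMIT:.0%}")
    return failures
```

In CI, a test could load the labelled outcomes for the corpus, call `check_thresholds`, and assert that the returned list is empty, so the failure message names whichever rate broke the limit.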