Commit dea5600

jddunn and claude committed
feat(emergent): add EmergentJudge — LLM-as-judge for forged tool evaluation
Three evaluation modes scaled to risk level:

- reviewCreation(): full code audit + test validation via single LLM call
- validateReuse(): pure programmatic schema conformance (zero LLM calls)
- reviewPromotion(): dual-judge panel (safety + correctness), both must approve

Includes 26 passing tests covering all approval, rejection, malformed JSON, schema validation, and multi-judge scenarios.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
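
The two cheaper-to-illustrate modes can be sketched as follows. This is a minimal illustration written in Python for brevity, not the code from this commit: the `Verdict` type, the `Judge` callable signature, the flat `{field: type}` schema format, and the snake_case function names are all assumptions made for the example. It shows the "both must approve" rule of the dual-judge panel and a schema conformance check that makes no LLM calls.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Verdict:
    """Hypothetical result type: approval flag plus a human-readable reason."""
    approved: bool
    reason: str


# Hypothetical judge signature: takes tool source code and returns a Verdict.
# In a real system each judge would wrap an LLM call; here it is any callable.
Judge = Callable[[str], Verdict]


def review_promotion(tool_source: str, safety_judge: Judge,
                     correctness_judge: Judge) -> Verdict:
    """Dual-judge panel: promotion is approved only if BOTH judges approve."""
    safety = safety_judge(tool_source)
    correctness = correctness_judge(tool_source)
    if safety.approved and correctness.approved:
        return Verdict(True, "both judges approved")
    # Collect the reasons from whichever judges rejected.
    reasons = [v.reason for v in (safety, correctness) if not v.approved]
    return Verdict(False, "; ".join(reasons))


def validate_reuse(output: dict, schema: dict[str, type]) -> Verdict:
    """Purely programmatic check: every schema field is present with the expected type."""
    for key, expected in schema.items():
        if key not in output:
            return Verdict(False, f"missing field: {key}")
        if not isinstance(output[key], expected):
            return Verdict(False, f"wrong type for field: {key}")
    return Verdict(True, "output conforms to schema")
```

Stub judges (plain functions returning a fixed `Verdict`) are enough to exercise the approval and rejection paths, which is presumably how multi-judge scenarios can be tested without a live model.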
1 parent 3271b13 commit dea5600

3 files changed

Lines changed: 1142 additions & 0 deletions
