Adversarial clinical AI eval suite: physician-curated golden test cases that expose safety-critical failure modes in healthcare AI pipelines. Covers SOAP notes, LLM judges, and ambient scribes.
ai-safety clinical-ai healthcare-ai ai-agent medical-nlp evals soap-notes-writer llm-evaluation llm-judge clinical-documentation clinical-scribe
Updated Jul 2, 2026 - Python