v5.0.0 — Modular Runtime, HITL Approval & Evaluation Scoring

@aiqualitylab aiqualitylab released this 03 May 11:05
· 109 commits to main since this release

This release splits the runtime into focused modules (qa_config, qa_runtime, qa_workflow), adds a --approve flag for human-in-the-loop review of tests before they are saved, and introduces HTML replay utilities for investigating failures. Evaluation now includes ROUGE and similarity scoring in the NLP baseline, and the Ragas evaluation reports an overall quality score.