llmci 0.3.0
Post-0.2.0 follow-ups: deeper gate trust, RAG faithfulness, red-team mutation, and multimodal evals.
Highlights
- Composite judge caching — agent outcome/trajectory LLM calls share
.llmci/cache/judges/ - Calibration trend history —
--save-snapshotappends to a history log with trend table - Gate warnings —
llmci runwarns on missing baselines or significance misconfig - Per-claim faithfulness — RAG
decompose_claims: truefor atomic grounding checks - LLM attack mutation —
llmci redteam generate --mutatefor broader adversarial coverage - Multimodal targets —
images/audiofields on dataset rows for direct API evals - Example 18 — multimodal vision eval (
examples/18-multimodal-vision)
Install: pip install llmci==0.3.0
Full changelog: https://github.com/llmci-cli/llmci/blob/main/CHANGELOG.md#030---2026-06-06