llmci 0.3.0

@alexminnaar alexminnaar released this 06 Jun 20:45

Post-0.2.0 follow-ups: deeper gate trust, RAG faithfulness, red-team mutation, and multimodal evals.

Highlights

  • Composite judge caching — agent outcome/trajectory LLM calls share .llmci/cache/judges/
  • Calibration trend history — --save-snapshot appends to a history log with a trend table
  • Gate warnings — llmci run warns on missing baselines or significance misconfiguration
  • Per-claim faithfulness — RAG decompose_claims: true for atomic grounding checks
  • LLM attack mutation — llmci redteam generate --mutate for broader adversarial coverage
  • Multimodal targets — images / audio fields on dataset rows for direct API evals
  • Example 18 — multimodal vision eval (examples/18-multimodal-vision)
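
For readers unfamiliar with per-claim faithfulness, the idea behind atomic grounding checks can be sketched as follows. This is an illustrative stand-in only, not llmci's implementation or API: the function names (split_claims, claim_supported, faithfulness) and the 0.8 threshold are hypothetical, sentence splitting replaces LLM-based claim decomposition, and token overlap replaces an LLM judge.

```python
import re


def split_claims(answer: str) -> list[str]:
    """Naive stand-in for claim decomposition: treat each sentence as one claim."""
    parts = re.split(r"(?<=[.!?])\s+", answer.strip())
    return [p for p in parts if p]


def _tokens(text: str) -> set[str]:
    """Lowercased alphanumeric tokens, used for a crude overlap measure."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def claim_supported(claim: str, contexts: list[str], threshold: float = 0.8) -> bool:
    """A claim counts as grounded if enough of its tokens appear in one context.

    Hypothetical heuristic; a real judge would use an LLM or an NLI model.
    """
    claim_tokens = _tokens(claim)
    if not claim_tokens:
        return False
    return any(
        len(claim_tokens & _tokens(ctx)) / len(claim_tokens) >= threshold
        for ctx in contexts
    )


def faithfulness(answer: str, contexts: list[str], threshold: float = 0.8) -> float:
    """Fraction of claims in the answer that are grounded in the retrieved contexts."""
    claims = split_claims(answer)
    if not claims:
        return 0.0
    supported = sum(claim_supported(c, contexts, threshold) for c in claims)
    return supported / len(claims)


if __name__ == "__main__":
    contexts = ["Paris is the capital of France."]
    answer = "Paris is the capital of France. It has ten million residents."
    print(faithfulness(answer, contexts))
```

Scoring each claim separately means one unsupported sentence lowers the score in proportion, rather than being masked by an otherwise well-grounded answer.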

Install: pip install llmci==0.3.0

Full changelog: https://github.com/llmci-cli/llmci/blob/main/CHANGELOG.md#030---2026-06-06