Releases · llmci-cli/llmci

llmci 0.4.1 Latest

alexminnaar released this 07 Jun 18:49

v0.4.1

5715a14

Patch release: proxy cost pricing for direct targets.

Added

settings.price_overrides: per-model input_per_token / output_per_token USD rates, used when litellm cannot compute cost (for example, behind internal LLM proxies).
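
As a rough illustration of the arithmetic a per-token override implies, the sketch below multiplies token counts by USD rates. Only the input_per_token / output_per_token field names come from the note above; the function name override_cost, the table layout, the model name, and the rates are hypothetical, and llmci's own computation may differ.

```python
# Hypothetical override table: model name -> per-token USD rates.
# The model name and the rates are made up for illustration.
PRICE_OVERRIDES = {
    "internal-proxy-model": {
        "input_per_token": 0.000002,
        "output_per_token": 0.000008,
    },
}


def override_cost(model, input_tokens, output_tokens, overrides=PRICE_OVERRIDES):
    """Return the USD cost of one call, or None when no override is configured."""
    rates = overrides.get(model)
    if rates is None:
        return None
    return (
        input_tokens * rates["input_per_token"]
        + output_tokens * rates["output_per_token"]
    )
```

With these made-up rates, a call with 1000 input tokens and 500 output tokens costs about 0.006 USD.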

Install: pip install llmci==0.4.1

See CHANGELOG.

llmci 0.4.0

alexminnaar released this 06 Jun 21:08

v0.4.0

4db49b0

Cross-provider migration, few-shot strategy, and PII allow-list for safety gates.

Added

Cross-provider migration — llmci migrate accepts provider/model refs with per-side base URLs
Few-shot migration strategy — --strategy few_shot inlines train examples as demos
PII allow-list — pii_leakage criteria accept allow_list (literal or regex: entries)
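
One way to picture the allow-list item above: entries are either literal strings or patterns marked with a regex: prefix. The sketch below is an assumption about how such a list could be checked, not llmci's implementation. The helper name is_allowed is hypothetical, and whole-string matching (re.fullmatch) is a choice made here for illustration.

```python
import re
from typing import Sequence

REGEX_PREFIX = "regex:"


def is_allowed(candidate: str, allow_list: Sequence[str]) -> bool:
    """Return True if a detected PII string is covered by the allow-list.

    Assumption: entries starting with "regex:" are patterns that must match
    the whole candidate; every other entry is compared literally.
    """
    for entry in allow_list:
        if entry.startswith(REGEX_PREFIX):
            pattern = entry[len(REGEX_PREFIX):]
            if re.fullmatch(pattern, candidate):
                return True
        elif candidate == entry:
            return True
    return False
```

A safety gate built this way would skip any detected value for which is_allowed returns True, such as a public support address.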

See CHANGELOG for full details.

llmci 0.3.0

alexminnaar released this 06 Jun 20:45

v0.3.0

1d9e7b5

Post-0.2.0 follow-ups: deeper gate trust, RAG faithfulness, red-team mutation, and multimodal evals.

Highlights

Composite judge caching — agent outcome/trajectory LLM calls share .llmci/cache/judges/
Calibration trend history — --save-snapshot appends to a history log with trend table
Gate warnings — llmci run warns on missing baselines or significance misconfig
Per-claim faithfulness — RAG decompose_claims: true for atomic grounding checks
LLM attack mutation — llmci redteam generate --mutate for broader adversarial coverage
Multimodal targets — images / audio fields on dataset rows for direct API evals
Example 18 — multimodal vision eval (examples/18-multimodal-vision)

Install: pip install llmci==0.3.0

Full changelog: https://github.com/llmci-cli/llmci/blob/main/CHANGELOG.md#030---2026-06-06

llmci 0.2.0

alexminnaar released this 06 Jun 20:31

v0.2.0

b8354aa

Major release: CI gate trust, deeper eval quality, safety/red-team, plugin API, and seventeen runnable examples.

Highlights

CI gate hardening

Flake resistance (samples_per_example, significance gating)
Response caching for direct API targets
Cost/token metrics (cost_mean, tokens_*)
Portable reports: JUnit, SARIF, JSON, HTML
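
To make the flake-resistance idea concrete: with several samples per example, a gate can fail only when the score drop is unlikely to be noise. The release note does not say which statistical test llmci uses, so the sketch below swaps in an exact one-sided permutation test as a stand-in. The function names, the alpha default, and the scores are hypothetical.

```python
from itertools import combinations
from statistics import mean


def permutation_p_value(baseline, candidate):
    """Exact one-sided permutation test for a drop in mean score.

    Returns the fraction of relabelings whose mean difference
    (baseline minus candidate) is at least as large as the observed one.
    Intended for small samples: it enumerates every split.
    """
    observed = mean(baseline) - mean(candidate)
    pooled = list(baseline) + list(candidate)
    n_base = len(baseline)
    n_cand = len(pooled) - n_base
    total = sum(pooled)
    hits = count = 0
    for idx in combinations(range(len(pooled)), n_base):
        base_sum = sum(pooled[i] for i in idx)
        diff = base_sum / n_base - (total - base_sum) / n_cand
        count += 1
        if diff >= observed - 1e-12:  # tolerance for float rounding
            hits += 1
    return hits / count


def gate_fails(baseline, candidate, alpha=0.05):
    """Fail the gate only when the regression is significant at level alpha."""
    return permutation_p_value(baseline, candidate) < alpha
```

Under this sketch, a candidate whose scores merely wobble around the baseline passes, while a consistent drop across all samples fails.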

Eval quality

RAG judge (faithfulness, relevance, retrieval metrics)
Pairwise judge with position-swap bias control
Judge calibration & drift detection (per-criterion support)
Output diffs vs baseline in reports
Structured-output (JSON Schema) judge
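
Position-swap bias control for a pairwise judge can be sketched as follows: ask the judge twice with the two outputs in opposite order, and accept a winner only when both orderings agree. This is a minimal sketch under assumptions; the judge callable, its "first" / "second" / "tie" return values, and the function name pairwise_with_swap are all hypothetical rather than llmci's API.

```python
from typing import Callable

# Hypothetical judge signature: (prompt, first_output, second_output)
# returns "first", "second", or "tie".
Judge = Callable[[str, str, str], str]


def pairwise_with_swap(judge: Judge, prompt: str, a: str, b: str) -> str:
    """Return "a", "b", or "tie", counting a win only if both orderings agree."""
    forward = judge(prompt, a, b)   # a shown first
    backward = judge(prompt, b, a)  # b shown first
    forward_winner = {"first": "a", "second": "b"}.get(forward, "tie")
    backward_winner = {"first": "b", "second": "a"}.get(backward, "tie")
    if forward_winner == backward_winner:
        return forward_winner
    return "tie"  # disagreement suggests position bias
```

A judge that always prefers whichever output it sees first produces a tie here instead of a spurious win.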

Safety & plugins

Safety judge (PII, toxicity, jailbreak)
Red-team attack generator (llmci redteam generate)
Plugin API: custom judges, metrics, and report sinks

Examples

examples/11–17, including an integrated pre-merge gate with committed baselines

Install: pip install llmci==0.2.0

Full changelog: https://github.com/llmci-cli/llmci/blob/main/CHANGELOG.md#020---2026-06-06

llmci 0.1.9

alexminnaar released this 01 Jun 00:37

v0.1.9

9680172

Added

Release metadata consistency check for package version, action install version, and changelog links.
Manual real-LLM example workflow for API-key-dependent examples.
GitHub Action inputs for explicit config paths, discovered config runs, and baseline updates.

Fixed

Duplicate llmci PR comments from parallel matrix jobs are merged into one canonical comment and stale duplicates are cleaned up.

Install: pip install llmci==0.1.9

llmci 0.1.8

alexminnaar released this 31 May 21:29

v0.1.8

8a619ac

Added

--include and --exclude filters for llmci discover and llmci run --all.
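
For intuition, include/exclude filtering over discovered config paths might behave like the sketch below. The glob semantics, the precedence of exclude over include, the function name filter_configs, and the example paths are assumptions made for illustration; the release note does not specify how llmci matches patterns.

```python
from fnmatch import fnmatchcase


def filter_configs(paths, include=(), exclude=()):
    """Keep paths matching any include pattern (or all paths when include is
    empty), then drop paths matching any exclude pattern."""
    selected = []
    for path in paths:
        if include and not any(fnmatchcase(path, pat) for pat in include):
            continue
        if any(fnmatchcase(path, pat) for pat in exclude):
            continue
        selected.append(path)
    return selected


# Made-up paths; the config file name is hypothetical.
discovered = [
    "evals/rag/llmci.yaml",
    "evals/safety/llmci.yaml",
    "examples/01/llmci.yaml",
]
kept = filter_configs(discovered, include=["evals/*"], exclude=["*safety*"])
```

Here kept contains only "evals/rag/llmci.yaml". Note that fnmatch-style `*` also matches path separators, which may not mirror the real CLI.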

Changed

Dogfood matrix evals now use LLMCI_REPORT_SLICE so PR comments merge into one combined report.

llmci 0.1.7

alexminnaar released this 31 May 21:11

v0.1.7

3340b10

Added

llmci discover to list config files in a repository.
llmci run --all to run every discovered config.

llmci 0.1.6

alexminnaar released this 31 May 20:45

v0.1.6

0190049

Added

llmci run --config <path> to run evals from an alternate config file.

llmci 0.1.5

alexminnaar released this 25 May 23:28

v0.1.5

913f14f

Fixed

S3 dataset URI validation runs before the optional boto3 import.
PyPI publish workflow grants contents: read so checkout works alongside trusted publishing.

Install: pip install llmci==0.1.5

llmci 0.1.4

alexminnaar released this 25 May 23:24

v0.1.4

77272c9

Added

Remote eval datasets — dataset accepts s3:// and https:// URIs (string or {source, cache}). S3 requires pip install 'llmci[s3]'. Cached under .llmci/cache/datasets/ by default.

Changed

Repository and package metadata URLs updated for the llmci-cli GitHub organization.

Install: pip install llmci==0.1.4
