v0.3.0

@github-actions github-actions released this 28 May 17:26
· 7 commits to main since this release
8ff2cfd

Build

  • Drop path override on rig-memvid dev-dep so release-plz can resolve crates.io fallback

CI

  • Add release-plz workflow and config

Added

  • Bootstrap confidence intervals on MetricReport::mean: new MetricCi
    type plus MetricReport::bootstrap_ci(iterations, level, seed) and
    builder MetricReport::with_bootstrap_ci(...). Uses a deterministic
    SplitMix64 stream (no rand dependency) so the same seed reproduces
    the same interval. The optional MetricCi field is annotated with
    #[serde(default, skip_serializing_if = "Option::is_none")], so
    existing JSON reports stay schema-compatible. MetricDelta carries
    current_ci / baseline_ci alongside the existing mean delta.
  • Non-zero CI exit support on ReportDiff: is_clean(&gate) -> bool
    and exit_code(&gate) -> i32 (0 = pass, 1 = regression). Lets eval
    binaries gate CI with std::process::exit(diff.exit_code(&gate))
    without rebuilding the regression-walking logic at every call site.
  • observe feature: emit MultiReport and ReportDiff as
    rig-tap-compatible JSON envelopes on the rig_tap tracing target
    without depending on rig-tap. Introduces EvalEnvelope + EvalKind
    (eval.retrieval_report, eval.regression_diff),
    report_envelopes, diff_envelopes, and tracing helpers
    emit_report / emit_diff. Each event carries the full JSON envelope
    plus stable scalar rig_tap.kind / rig_tap.metric /
    rig_tap.regressed / rig_tap.conversation_id attributes so existing
    OpenTelemetry collectors route eval reports the same way they route
    prompt and tool events.
  • FreshnessReport and FreshnessQueryRollup for rolling
    StalenessReport / ConflictReport outputs into MultiReport.
    MultiReport::with_freshness attaches the dataset/per-query rollup, while
    with_freshness_metrics also appends score-like
    freshness.stale_free_rate@k and freshness.conflict_free_rate@k metric
    rows so freshness regressions trip the existing RegressionGate / diff
    path.
  • Opt-in MinHash-style chunk near-duplicate linting via
    NearDuplicateLintConfig, NearDuplicatePair,
    ChunkStats::near_duplicate_pairs, and
    ChunkLintWarning::NearDuplicateChunks. The default remains disabled so
    existing ingestion gates keep their current warning set until hosts opt in.
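
To illustrate the bootstrap confidence interval entry above, here is a minimal sketch of a seeded percentile bootstrap over a slice of scores. It is not the crate's implementation: SplitMix64 is the published algorithm, but the free function bootstrap_mean_ci, its return type, and the percentile method are assumptions made for illustration. Only the (iterations, level, seed) parameter order mirrors the release note.

```rust
/// SplitMix64: a small deterministic PRNG, so no `rand` dependency is needed.
struct SplitMix64 {
    state: u64,
}

impl SplitMix64 {
    fn new(seed: u64) -> Self {
        Self { state: seed }
    }

    fn next_u64(&mut self) -> u64 {
        self.state = self.state.wrapping_add(0x9E37_79B9_7F4A_7C15);
        let mut z = self.state;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
        z ^ (z >> 31)
    }

    /// Index in 0..n. The slight modulo bias is acceptable for a sketch.
    fn next_index(&mut self, n: usize) -> usize {
        (self.next_u64() % n as u64) as usize
    }
}

/// Hypothetical helper: percentile bootstrap interval for the mean of `scores`.
/// Returns None for empty input, zero iterations, or a level outside (0, 1).
fn bootstrap_mean_ci(
    scores: &[f64],
    iterations: usize,
    level: f64,
    seed: u64,
) -> Option<(f64, f64)> {
    if scores.is_empty() || iterations == 0 || !(level > 0.0 && level < 1.0) {
        return None;
    }
    let mut rng = SplitMix64::new(seed);
    let n = scores.len();
    let mut means = Vec::with_capacity(iterations);
    for _ in 0..iterations {
        // Resample n scores with replacement and record the resampled mean.
        let mut sum = 0.0;
        for _ in 0..n {
            sum += scores[rng.next_index(n)];
        }
        means.push(sum / n as f64);
    }
    means.sort_by(|a, b| a.total_cmp(b));
    // Read the lower and upper percentiles off the sorted resampled means.
    let alpha = (1.0 - level) / 2.0;
    let last = (iterations - 1) as f64;
    let lo = means[(alpha * last).floor() as usize];
    let hi = means[((1.0 - alpha) * last).ceil() as usize];
    Some((lo, hi))
}
```

Because the only source of randomness is the seeded stream, calling the helper twice with the same arguments yields the same interval, which is the reproducibility property the entry describes.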
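
The is_clean / exit_code pairing from the ReportDiff entry above can be sketched as follows. Gate, Delta, and Diff are hypothetical stand-ins, not the crate's RegressionGate or ReportDiff types, and the "maximum tolerated drop" rule is an assumed regression criterion. Only the 0 = pass, 1 = regression mapping comes from the release note.

```rust
/// Hypothetical gate: the largest drop in a metric mean that still passes.
struct Gate {
    max_drop: f64,
}

/// Hypothetical diff row: one metric's baseline and current mean.
struct Delta {
    baseline: f64,
    current: f64,
}

/// Hypothetical stand-in for a report diff.
struct Diff {
    deltas: Vec<Delta>,
}

impl Diff {
    /// True when no metric dropped by more than the gate allows.
    fn is_clean(&self, gate: &Gate) -> bool {
        self.deltas
            .iter()
            .all(|d| d.baseline - d.current <= gate.max_drop)
    }

    /// 0 = pass, 1 = regression, suitable for std::process::exit.
    fn exit_code(&self, gate: &Gate) -> i32 {
        if self.is_clean(gate) {
            0
        } else {
            1
        }
    }
}
```

An eval binary would then finish with std::process::exit(diff.exit_code(&gate)), so a regression fails the CI job without each call site walking the diff itself.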
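
For readers unfamiliar with the technique behind the near-duplicate lint entry above, here is a rough sketch of MinHash over word shingles. It shows the general idea only: the function names, the word-level shingling, and the use of the standard library's DefaultHasher are all assumptions, and the crate's NearDuplicateLintConfig may work quite differently.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical helper: MinHash signature over word shingles of `shingle_len`.
/// Returns an empty signature when the text is too short to shingle.
fn minhash_signature(text: &str, shingle_len: usize, num_hashes: usize) -> Vec<u64> {
    let words: Vec<&str> = text.split_whitespace().collect();
    if shingle_len == 0 || words.len() < shingle_len {
        return Vec::new();
    }
    let mut sig = vec![u64::MAX; num_hashes];
    for shingle in words.windows(shingle_len) {
        // Each slot uses a differently seeded hash and keeps the minimum seen.
        for (seed, slot) in sig.iter_mut().enumerate() {
            let mut h = DefaultHasher::new();
            seed.hash(&mut h);
            shingle.hash(&mut h);
            *slot = (*slot).min(h.finish());
        }
    }
    sig
}

/// Fraction of matching slots, which estimates the Jaccard similarity
/// of the two shingle sets. Empty or mismatched signatures score 0.0.
fn estimated_jaccard(a: &[u64], b: &[u64]) -> f64 {
    if a.is_empty() || a.len() != b.len() {
        return 0.0;
    }
    let same = a.iter().zip(b).filter(|(x, y)| x == y).count();
    same as f64 / a.len() as f64
}
```

A lint built on this idea would flag chunk pairs whose estimated similarity exceeds a configured threshold. Keeping such a check opt-in, as the entry states, avoids changing the warning set of existing ingestion gates.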