v0.0.2

@lucifertrj lucifertrj released this 10 May 22:27
· 18 commits to main since this release

Evret 0.0.2

Added

  • [new-metric] Added ERR@k metric for cascade-style graded relevance evaluation.
  • [new-metric] Added RBP@k metric with tunable persistence/user-patience weighting.
  • Structured logging utilities: get_logger, configure_logging, and JSON log formatting.
  • Added a tracing and monitoring notebook.
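
For readers unfamiliar with the two new metrics, here is a minimal sketch of the standard textbook definitions of ERR@k and RBP@k. It is not Evret's implementation: the function names err_at_k and rbp_at_k, their parameters, and the default persistence of 0.8 are assumptions made for illustration only.

```python
from typing import Sequence


def err_at_k(grades: Sequence[int], k: int, max_grade: int) -> float:
    """Expected Reciprocal Rank over the top-k results (cascade user model).

    Hypothetical helper, not part of Evret's API. `grades` are integer
    relevance grades in ranked order; `max_grade` is the top of the scale.
    """
    err = 0.0
    p_reach = 1.0  # probability the user gets as far as the current rank
    for rank, grade in enumerate(grades[:k], start=1):
        # Probability that a document with this grade satisfies the user.
        p_satisfied = (2 ** grade - 1) / (2 ** max_grade)
        err += p_reach * p_satisfied / rank
        p_reach *= 1.0 - p_satisfied
    return err


def rbp_at_k(relevances: Sequence[float], k: int, persistence: float = 0.8) -> float:
    """Rank-Biased Precision truncated at k.

    Hypothetical helper, not part of Evret's API. `persistence` is the
    probability that the user moves on to the next result; higher values
    model a more patient user. `relevances` are expected to lie in [0, 1].
    """
    discounted = sum(
        rel * persistence ** i for i, rel in enumerate(relevances[:k])
    )
    return (1.0 - persistence) * discounted


if __name__ == "__main__":
    ranking = [1, 0, 1]  # binary relevance of the top three results
    print(err_at_k(ranking, k=3, max_grade=1))
    print(rbp_at_k(ranking, k=3, persistence=0.5))
```

In this sketch ERR discounts each rank by the chance that an earlier result already satisfied the user, while RBP applies a fixed geometric discount controlled by persistence.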

Changed

  • Changed evaluation dataset semantics, moving from relevant_doc_ids toward expected_answers.
  • Improved TokenOverlapJudge matching logic, including negation handling and better overlap scoring.
  • Reworked quickstart, architecture, dataset-format, metrics, and judge docs.

Full Changelog: v0.0.1...v0.0.2