Release v0.3.1: add evaluation, diagnostics, and reembed mix tasks · nshkrdotcom/portfolio_manager

v0.3.1
272a2af
Verified

This commit was signed with the committer’s verified signature.

nshkrdotcom

SSH Key Fingerprint: 7E9kicni4Zs9x0ZdPw3mRTQmdFtF9t4LDAbO0Ve5vZA

nshkrdotcom tagged this 31 Dec 02:30

This release introduces four new CLI tasks for RAG system operations
and quality assurance workflows.

New Mix Tasks:

mix portfolio.eval.generate
  - Generate synthetic test cases from document chunks using an LLM
  - Supports sample-size, collection, and source-id filtering
  - Persists generated test cases to the database for evaluation runs

mix portfolio.eval.run
  - Execute retrieval evaluation against test cases
  - Supports semantic, fulltext, and hybrid search modes
  - Outputs Recall@K, Precision@K, MRR, and Hit Rate metrics
  - CI integration via a fail-under threshold flag
  - JSON and table output formats
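
The metrics listed above follow standard information-retrieval definitions. The sketch below is an illustration in Python, not the task's actual implementation: the function names, the shape of a test case (a ranked list of retrieved IDs plus a set of relevant IDs), and the choice to gate on a single metric are all assumptions. The release notes do not say which metric the fail-under threshold is compared against.

```python
from typing import Dict, List, Sequence, Set, Tuple

# Hypothetical test-case shape: (ranked retrieved IDs, set of relevant IDs).
Case = Tuple[Sequence[str], Set[str]]


def recall_at_k(retrieved: Sequence[str], relevant: Set[str], k: int) -> float:
    """Fraction of the relevant items that appear in the top k results."""
    if not relevant:
        return 0.0
    return len(set(retrieved[:k]) & relevant) / len(relevant)


def precision_at_k(retrieved: Sequence[str], relevant: Set[str], k: int) -> float:
    """Fraction of the top k results that are relevant (denominator is k)."""
    if k <= 0:
        return 0.0
    return sum(1 for item in retrieved[:k] if item in relevant) / k


def reciprocal_rank(retrieved: Sequence[str], relevant: Set[str]) -> float:
    """1 / rank of the first relevant result, or 0.0 if none is retrieved."""
    for rank, item in enumerate(retrieved, start=1):
        if item in relevant:
            return 1.0 / rank
    return 0.0


def hit_at_k(retrieved: Sequence[str], relevant: Set[str], k: int) -> float:
    """1.0 if any relevant item is in the top k results, else 0.0."""
    return 1.0 if any(item in relevant for item in retrieved[:k]) else 0.0


def evaluate(cases: List[Case], k: int) -> Dict[str, float]:
    """Average each metric over all test cases."""
    n = len(cases)
    if n == 0:
        return {"recall_at_k": 0.0, "precision_at_k": 0.0, "mrr": 0.0, "hit_rate": 0.0}
    return {
        "recall_at_k": sum(recall_at_k(r, rel, k) for r, rel in cases) / n,
        "precision_at_k": sum(precision_at_k(r, rel, k) for r, rel in cases) / n,
        # MRR here uses the full ranked list, not only the top k (an assumption).
        "mrr": sum(reciprocal_rank(r, rel) for r, rel in cases) / n,
        "hit_rate": sum(hit_at_k(r, rel, k) for r, rel in cases) / n,
    }


def exit_code(value: float, fail_under: float) -> int:
    """Return 1 (CI failure) when the metric falls below the threshold, else 0."""
    return 1 if value < fail_under else 0
```

In a CI job, a non-zero exit code from a check like `exit_code(metrics["recall_at_k"], fail_under=0.5)` would fail the build when retrieval quality regresses.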

mix portfolio.reembed
  - Batch re-embed chunks with the current embedding configuration
  - Collection filtering for targeted re-embedding operations
  - Progress reporting with the verbose flag
  - Dry-run mode for previewing operations
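
The general pattern behind a batched re-embed with a dry-run option can be sketched as follows. This is a Python illustration under stated assumptions, not the task's code: `reembed`, `embed_batch`, and `batch_size` are hypothetical names, and the real task's batching and progress output may differ.

```python
from typing import Callable, Dict, Iterator, List, Sequence, Tuple

Vector = List[float]


def batched(items: Sequence[str], size: int) -> Iterator[Sequence[str]]:
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def reembed(
    chunks: Sequence[str],
    embed_batch: Callable[[Sequence[str]], List[Vector]],  # hypothetical embedder
    batch_size: int = 2,
    dry_run: bool = False,
    verbose: bool = False,
) -> Tuple[int, Dict[str, Vector]]:
    """Return (chunks visited, new vectors); a dry run computes no vectors."""
    processed = 0
    vectors: Dict[str, Vector] = {}
    for batch in batched(chunks, batch_size):
        if not dry_run:
            # Only call the embedder when changes are actually wanted.
            for chunk, vector in zip(batch, embed_batch(batch)):
                vectors[chunk] = vector
        processed += len(batch)
        if verbose:
            print(f"{processed}/{len(chunks)} chunks")
    return processed, vectors
```

With `dry_run=True` the function still walks every batch and reports how many chunks would be touched, which is what makes a preview cheap: the embedding call is the only expensive step, and it is skipped.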

mix portfolio.diagnostics
  - Display collection, document, and chunk counts
  - Show embedding coverage statistics
  - Report failed document counts
  - Output configuration summary
  - JSON format support for tooling integration
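
As a rough illustration of what an embedding coverage statistic and a JSON report could look like, here is a small Python sketch. The field names and the definition of coverage (embedded chunks divided by total chunks) are assumptions; the release notes do not specify the task's output schema.

```python
import json
from typing import Dict, Union

Number = Union[int, float]


def embedding_coverage(total_chunks: int, embedded_chunks: int) -> float:
    """Share of chunks that have an embedding; 0.0 when there are no chunks."""
    if total_chunks == 0:
        return 0.0
    return embedded_chunks / total_chunks


def diagnostics_report(
    collections: int,
    documents: int,
    failed_documents: int,
    total_chunks: int,
    embedded_chunks: int,
) -> Dict[str, Number]:
    """Assemble counts into a flat dictionary (hypothetical field names)."""
    return {
        "collections": collections,
        "documents": documents,
        "failed_documents": failed_documents,
        "chunks": total_chunks,
        "embedded_chunks": embedded_chunks,
        "embedding_coverage": embedding_coverage(total_chunks, embedded_chunks),
    }


def to_json(report: Dict[str, Number]) -> str:
    """Serialize with sorted keys so output is stable for downstream tooling."""
    return json.dumps(report, sort_keys=True)
```

Stable, machine-readable output of this kind is what lets other tools, such as dashboards or CI checks, consume the diagnostics without parsing a human-oriented table.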

Additional Changes:
  - Comprehensive test coverage for all new tasks
  - Updated CHANGELOG and README documentation
  - Dependency updates for claude_agent_sdk, codex_sdk, gemini_ex
  - Switched to local path dependencies for development
