This release introduces four new CLI tasks for RAG system operations
and quality assurance workflows.
New Mix Tasks:
mix portfolio.eval.generate
- Generate synthetic test cases from document chunks using an LLM
- Supports sample-size, collection, and source-id filtering
- Persists generated test cases to the database for evaluation runs
mix portfolio.eval.run
- Execute retrieval evaluation against test cases
- Supports semantic, fulltext, and hybrid search modes
- Outputs Recall@K, Precision@K, MRR, and Hit Rate metrics
- Supports CI integration via a fail-under threshold flag
- JSON and table output formats
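
For reviewers unfamiliar with the reported metrics, the sketch below shows how
Recall@K, Precision@K, MRR, and Hit Rate are conventionally computed, plus a
fail-under style gate. It is an illustration in Python, not the task's Elixir
implementation: every function name, the (retrieved, relevant) test-case shape,
and the choice of gating metric are assumptions made for the example.

```python
from typing import Dict, List, Sequence, Set, Tuple

# Hypothetical test-case shape: (ranked retrieved chunk ids, relevant chunk ids).
Case = Tuple[Sequence[str], Set[str]]


def recall_at_k(retrieved: Sequence[str], relevant: Set[str], k: int) -> float:
    """Fraction of the relevant ids that appear in the top k results."""
    if not relevant:
        return 0.0
    return len(set(retrieved[:k]) & relevant) / len(relevant)


def precision_at_k(retrieved: Sequence[str], relevant: Set[str], k: int) -> float:
    """Fraction of the top k slots occupied by relevant ids."""
    if k <= 0:
        return 0.0
    return sum(1 for cid in retrieved[:k] if cid in relevant) / k


def reciprocal_rank(retrieved: Sequence[str], relevant: Set[str]) -> float:
    """1 / rank of the first relevant result, or 0.0 if none is retrieved."""
    for rank, cid in enumerate(retrieved, start=1):
        if cid in relevant:
            return 1.0 / rank
    return 0.0


def hit_at_k(retrieved: Sequence[str], relevant: Set[str], k: int) -> float:
    """1.0 if any relevant id appears in the top k results, else 0.0."""
    return 1.0 if any(cid in relevant for cid in retrieved[:k]) else 0.0


def evaluate(cases: List[Case], k: int) -> Dict[str, float]:
    """Average each per-case metric over all test cases."""
    n = len(cases)
    if n == 0:
        return {"recall_at_k": 0.0, "precision_at_k": 0.0, "mrr": 0.0, "hit_rate": 0.0}
    return {
        "recall_at_k": sum(recall_at_k(r, rel, k) for r, rel in cases) / n,
        "precision_at_k": sum(precision_at_k(r, rel, k) for r, rel in cases) / n,
        "mrr": sum(reciprocal_rank(r, rel) for r, rel in cases) / n,
        "hit_rate": sum(hit_at_k(r, rel, k) for r, rel in cases) / n,
    }


def passes_threshold(metrics: Dict[str, float], fail_under: float,
                     metric: str = "recall_at_k") -> bool:
    """Assumed gate: pass only when the chosen metric meets the threshold."""
    return metrics[metric] >= fail_under


if __name__ == "__main__":
    cases: List[Case] = [
        (["a", "b", "c"], {"b"}),
        (["x", "y", "z"], {"q", "x"}),
    ]
    metrics = evaluate(cases, k=2)
    print(metrics)
    # A CI job would typically exit non-zero when the gate fails.
    raise SystemExit(0 if passes_threshold(metrics, fail_under=0.7) else 1)
```

Which metric the real fail-under flag compares against is not stated here;
the sketch gates on Recall@K purely as an example.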
mix portfolio.reembed
- Batch re-embed chunks with the current embedding configuration
- Collection filtering for targeted re-embedding operations
- Progress reporting with verbose flag
- Dry-run mode for previewing operations
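
The general pattern behind a batched re-embed with dry-run and verbose
progress can be sketched as follows. This is a Python illustration under
stated assumptions, not the Elixir task itself: the (id, text) chunk shape,
the embed_fn callback, and the batch size are all hypothetical.

```python
from typing import Callable, Dict, Iterator, List, Sequence, Tuple

Chunk = Tuple[str, str]  # hypothetical shape: (chunk id, chunk text)
EmbedFn = Callable[[List[str]], List[List[float]]]


def batched(items: Sequence[Chunk], size: int) -> Iterator[Sequence[Chunk]]:
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def reembed(chunks: Sequence[Chunk], embed_fn: EmbedFn, batch_size: int = 2,
            dry_run: bool = False, verbose: bool = False
            ) -> Tuple[int, Dict[str, List[float]]]:
    """Re-embed chunks in batches; in dry-run mode nothing is embedded or stored."""
    processed = 0
    updated: Dict[str, List[float]] = {}
    for batch in batched(chunks, batch_size):
        if not dry_run:
            vectors = embed_fn([text for _, text in batch])
            for (chunk_id, _), vector in zip(batch, vectors):
                updated[chunk_id] = vector
        processed += len(batch)
        if verbose:
            label = "would re-embed" if dry_run else "re-embedded"
            print(f"{label} {processed}/{len(chunks)} chunks")
    return processed, updated


if __name__ == "__main__":
    # Stand-in embedder for the example: one dimension holding the text length.
    fake_embed: EmbedFn = lambda texts: [[float(len(t))] for t in texts]
    chunks = [("c1", "alpha"), ("c2", "be"), ("c3", "gamma!")]
    reembed(chunks, fake_embed, dry_run=True, verbose=True)
```

Counting chunks in dry-run mode without calling the embedder lets an operator
preview the size of the job before spending embedding calls.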
mix portfolio.diagnostics
- Display collection, document, and chunk counts
- Show embedding coverage statistics
- Report failed document counts
- Output configuration summary
- JSON format support for tooling integration
Additional Changes:
- Comprehensive test coverage for all new tasks
- Updated CHANGELOG and README documentation
- Dependency updates for claude_agent_sdk, codex_sdk, gemini_ex
- Switched to local path dependencies for development