v0.6.0-rc36

Pre-release

@buger buger released this 01 Aug 17:19
· 438 commits to main since this release
d0c1833
Add BERT reranking functionality with comprehensive examples (#90)

* Fix required terms enforcement in Elasticsearch-style queries

This commit fixes a critical bug where required terms (marked with +) were not
properly enforced in OR expressions. Previously, a query like "+github actions"
would incorrectly match documents that contain "actions" but not "github".

Changes:
- Add check_all_required_terms_present() to enforce Lucene semantics
- Update evaluate_with_has_required() to check required terms first
- Fix pattern generation to include excluded terms for proper filtering
- Update tests to match correct Lucene/Elasticsearch behavior

The fix ensures that ALL required terms must be present for a document to match,
regardless of query structure (including OR expressions). This brings the behavior
in line with standard Lucene/Elasticsearch semantics.

Fixes the issue where "+github actions" was returning documents without "github".
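
A minimal sketch of the required-term rule described above, not the project's actual implementation. The `Term` struct and the `parse_query` and `matches` functions are hypothetical names introduced for illustration. The sketch assumes whitespace-separated terms, case-insensitive whole-word matching, and that optional terms stop being mandatory once at least one required term is present.

```rust
use std::collections::HashSet;

/// Hypothetical query term; `required` corresponds to a leading `+`.
struct Term {
    text: String,
    required: bool,
}

/// Split a query on whitespace; a leading `+` marks a term as required.
fn parse_query(query: &str) -> Vec<Term> {
    query
        .split_whitespace()
        .map(|tok| match tok.strip_prefix('+') {
            Some(rest) => Term { text: rest.to_lowercase(), required: true },
            None => Term { text: tok.to_lowercase(), required: false },
        })
        .collect()
}

/// Return true if `doc` satisfies `query` under the assumptions stated above.
fn matches(query: &str, doc: &str) -> bool {
    // Lowercased set of alphanumeric words in the document.
    let words: HashSet<String> = doc
        .split(|c: char| !c.is_alphanumeric())
        .filter(|w| !w.is_empty())
        .map(|w| w.to_lowercase())
        .collect();
    let terms = parse_query(query);

    // Required terms are checked first: every one of them must be present.
    if !terms.iter().filter(|t| t.required).all(|t| words.contains(&t.text)) {
        return false;
    }

    // With at least one required term, optional terms no longer gate the match.
    // Without any, the query is a plain OR over its terms.
    let any_required = terms.iter().any(|t| t.required);
    any_required || terms.iter().any(|t| words.contains(&t.text))
}
```

With this sketch, `matches("+github actions", "actions only")` is rejected because "github" is missing, while the same document still matches the plain OR query `github actions`.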

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>

* Add BERT reranking functionality with comprehensive examples

- Add BERT reranker module using Candle framework for MS-MARCO models
- Implement cross-encoder architecture for improved document ranking
- Add comprehensive example with multiple model support and benchmarking
- Include Python reference implementation for comparison and validation
- Add detailed documentation covering models, performance, and usage
- Extend result ranking system to support BERT-based reranking
- Add optional dependencies with proper feature gating
- Include test suites and benchmark scripts for validation
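
The reranking step listed above can be sketched roughly as follows. This is an illustration only: the `CrossEncoder` trait, `OverlapScorer`, and `rerank` are hypothetical names, and the toy word-overlap scorer merely stands in for a BERT cross-encoder. It does not use Candle or reflect the module's real API. The point is the shape of the pipeline: score each (query, document) pair jointly, then sort the candidates by score.

```rust
/// Hypothetical interface: a cross-encoder scores a (query, document) pair jointly.
trait CrossEncoder {
    fn score(&self, query: &str, document: &str) -> f32;
}

/// Toy stand-in for a BERT model: fraction of query words found in the document.
struct OverlapScorer;

impl CrossEncoder for OverlapScorer {
    fn score(&self, query: &str, document: &str) -> f32 {
        let doc = document.to_lowercase();
        let words: Vec<String> = query
            .split_whitespace()
            .map(|w| w.to_lowercase())
            .collect();
        if words.is_empty() {
            return 0.0;
        }
        let hits = words.iter().filter(|w| doc.contains(w.as_str())).count();
        hits as f32 / words.len() as f32
    }
}

/// Score every candidate against the query and keep the `top_k` best, highest first.
fn rerank<'a, M: CrossEncoder>(
    model: &M,
    query: &str,
    candidates: &[&'a str],
    top_k: usize,
) -> Vec<(&'a str, f32)> {
    let mut scored: Vec<(&'a str, f32)> = candidates
        .iter()
        .map(|&doc| (doc, model.score(query, doc)))
        .collect();
    // Stable sort, descending by score; total_cmp gives a total order on f32.
    scored.sort_by(|a, b| b.1.total_cmp(&a.1));
    scored.truncate(top_k);
    scored
}
```

Keeping the scorer behind a trait means a real model could replace the toy scorer without changing `rerank`, which is one way a first-stage ranking can be extended with an optional reranking pass.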

* Remove large vocab.txt files and fix formatting

- Remove examples/reranker/models/*/vocab.txt files (30k+ lines each)
- Add vocab.txt to .gitignore to prevent future commits
- Apply cargo fmt fixes for consistent formatting

* Trigger pre-commit checks to verify formatting

Ensure all formatting and linting checks pass correctly.

* Verify pre-commit formatting checks work correctly

Fix formatting issue and confirm pre-commit hook properly validates code.
All formatting checks now pass as expected.

---------

Co-authored-by: Claude <noreply@anthropic.com>