v0.8.0

@jaepil jaepil released this 05 Mar 04:28
· 26 commits to main since this release
5da6427

What's New

  • normalize parameter for AttentionLogOddsWeights -- per-signal logit normalization is now a first-class model parameter. When normalize=True, the model applies per-column min-max normalization in logit space before the weighted sum, which equalizes signal scales (the same scaling used by balanced_log_odds_fusion).

    • __call__: normalizes logit columns across all candidates for a given query
    • fit: accepts an optional query_ids parameter to normalize within each query group; without query_ids, the whole batch is normalized as a single group
    • update: normalizes logit columns when input is 2D (assumes same query)
  • Broadcasting fix -- AttentionLogOddsWeights.__call__ now correctly broadcasts a single query's features across batched probability inputs

  • Benchmark simplification -- removed the external sigmoid trick from benchmarks/hybrid_beir.py; normalization is now handled internally via normalize=True with per-query query_ids for correct train/inference parity
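
The grouping behaviour described above (per-column min-max normalization in logit space, applied within each query group or across the whole batch when no query_ids are given) can be illustrated with a small standalone sketch. This is not the library's implementation: the helper names logit, minmax_columns, and normalize_logits are hypothetical, the target range [0, 1] is an assumption, and mapping constant columns to 0 is an arbitrary choice made here for illustration.

```python
import math
from collections import defaultdict


def logit(p, eps=1e-9):
    """Map a probability to log-odds, clamping away from 0 and 1."""
    p = min(max(p, eps), 1.0 - eps)
    return math.log(p / (1.0 - p))


def minmax_columns(rows):
    """Min-max scale each column of a list of rows to [0, 1] (assumed range)."""
    scaled_cols = []
    for col in zip(*rows):
        lo, hi = min(col), max(col)
        span = hi - lo
        # A constant column cannot be scaled; map it to 0.0 (illustrative choice).
        scaled_cols.append([(v - lo) / span if span > 0 else 0.0 for v in col])
    return [list(row) for row in zip(*scaled_cols)]


def normalize_logits(probs, query_ids=None):
    """Normalize each signal column in logit space, one query group at a time.

    probs:     list of rows, one row per candidate, one column per signal.
    query_ids: optional list parallel to probs; None treats the whole
               batch as a single group.
    """
    if query_ids is None:
        query_ids = [0] * len(probs)

    # Collect the row indices that belong to each query.
    groups = defaultdict(list)
    for i, qid in enumerate(query_ids):
        groups[qid].append(i)

    out = [None] * len(probs)
    for idx in groups.values():
        scaled = minmax_columns([[logit(p) for p in probs[i]] for i in idx])
        for i, row in zip(idx, scaled):
            out[i] = row
    return out


# Two signals, four candidates, two queries.
probs = [[0.9, 0.6], [0.5, 0.2], [0.8, 0.7], [0.1, 0.3]]
grouped = normalize_logits(probs, query_ids=["q1", "q1", "q2", "q2"])
pooled = normalize_logits(probs)
```

With query_ids, each query's best candidate per signal scales to 1 and its worst to 0, regardless of what other queries in the batch look like. Without query_ids, the extremes are taken over the whole batch, so a candidate's normalized value depends on unrelated queries. That difference is why per-query grouping at training time matters for parity with inference, where candidates are normalized for one query at a time.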

BEIR Benchmark (NDCG@10)

Method             ArguAna  FiQA   NFCorpus  SciDocs  SciFact  Average
BM25               36.13    25.31  31.82     15.63    68.02    35.38
Convex             40.01    37.10  35.60     19.67    73.37    41.15
Bayesian-Balanced  37.27    40.58  35.73     21.42    72.47    41.50
Attn-NR            37.22    40.53  35.42     21.91    73.24    41.67

Attn-NR achieves the highest zero-shot NDCG@10 average (+6.28 vs BM25).

Install

pip install bayesian-bm25==0.8.0

Full Changelog: https://github.com/cognica-io/bayesian-bm25/blob/main/HISTORY.md#080-2026-03-05