v0.5.0

@jaepil jaepil released this 26 Feb 02:46
08fafad

FusionDebugger for transparent pipeline inspection

FusionDebugger records every intermediate value through the full Bayesian BM25 probability pipeline (likelihood, prior, posterior, fusion) so that the final fused score can be fully explained.

New features

  • FusionDebugger class (bayesian_bm25.debug)
    • trace_bm25(), trace_vector(), trace_fusion(), trace_document(), trace_not()
    • compare() with dominant signal and crossover detection
    • format_trace(), format_summary(), format_comparison()
  • All four fusion methods selectable through the method parameter: log_odds, prob_and, prob_or, prob_not
    • prob_and: records log_probs and log_prob_sum intermediates
    • prob_or: records complements, log_complements, log_complement_sum intermediates
    • prob_not: computes prod(1 - p_i) -- probability that NONE of the signals indicate relevance
  • Hierarchical (nested) fusion -- trace_fusion() outputs compose into arbitrary trees such as AND(OR(title, body), vector, NOT(spam))
  • Weighted log-odds fusion via weights parameter
  • 92 new tests (413 total), 12-example script (examples/fusion_debugger.py)
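
The arithmetic behind the fusion methods listed above can be sketched in plain Python. This is an illustration only and does not use the bayesian_bm25 API: the function names mirror the method names, but their signatures, the returned trace dictionaries, and the example probabilities are assumptions made for this sketch. The weighted log-odds form shown here (a weighted sum of logits passed through a sigmoid) is likewise an assumption and may differ from how the library normalizes or scales the result. Inputs are assumed to lie strictly between 0 and 1.

```python
import math


def prob_and(probs):
    """Product of probabilities, computed in log space (hypothetical helper)."""
    log_probs = [math.log(p) for p in probs]
    log_prob_sum = sum(log_probs)
    trace = {"log_probs": log_probs, "log_prob_sum": log_prob_sum}
    return math.exp(log_prob_sum), trace


def prob_or(probs):
    """1 - prod(1 - p_i), computed via log complements (hypothetical helper)."""
    complements = [1.0 - p for p in probs]
    log_complements = [math.log(c) for c in complements]
    log_complement_sum = sum(log_complements)
    trace = {
        "complements": complements,
        "log_complements": log_complements,
        "log_complement_sum": log_complement_sum,
    }
    return 1.0 - math.exp(log_complement_sum), trace


def prob_not(probs):
    """prod(1 - p_i): probability that none of the signals indicate relevance."""
    result = 1.0
    for p in probs:
        result *= 1.0 - p
    return result


def log_odds(probs, weights=None):
    """Assumed form: sigmoid of the weighted sum of logits."""
    if weights is None:
        weights = [1.0] * len(probs)
    z = sum(w * math.log(p / (1.0 - p)) for w, p in zip(weights, probs))
    return 1.0 / (1.0 + math.exp(-z))


# Nested fusion: AND(OR(title, body), vector, NOT(spam)).
# The probabilities below are made-up example values.
title, body, vector, spam = 0.6, 0.3, 0.8, 0.1

or_score, or_trace = prob_or([title, body])   # 1 - 0.4 * 0.7 = 0.72
not_spam = prob_not([spam])                   # 1 - 0.1 = 0.9
fused, and_trace = prob_and([or_score, vector, not_spam])

print(round(fused, 4))  # 0.72 * 0.8 * 0.9 = 0.5184
```

Because each function returns an ordinary probability, the output of one fusion step can feed the next, which is what makes tree-shaped compositions like the one above possible. Working in log space for prob_and and prob_or keeps the intermediate sums available for inspection, matching the intermediates the release notes say are recorded.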

Install

pip install bayesian-bm25==0.5.0