v0.1.15 — extended thinking on FaithfulnessJudge

@silversurfer562 released this 15 May 11:04
· 163 commits to main since this release
7f2a86e

Adds opt-in extended thinking to FaithfulnessJudge. The parser now handles both tool_use and text block response shapes, because extended thinking and a forced tool_choice are incompatible on Claude 4.
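
To illustrate the two response shapes, here is a minimal sketch of a parser that accepts either one. It is not the package's actual parser: the function name `extract_verdict` and the `score` field are hypothetical, and content blocks are modelled as plain dicts rather than SDK objects so the snippet runs without the `anthropic` package.

```python
import json
from typing import Any


def extract_verdict(content: list[dict[str, Any]]) -> dict[str, Any]:
    """Return the judge payload from a list of response content blocks.

    Hypothetical helper: prefers a tool_use block, falls back to the
    outermost JSON object found in text blocks. Thinking blocks are ignored.
    """
    # Shape 1: the model called the tool, so the payload is the tool input.
    for block in content:
        if block.get("type") == "tool_use":
            return dict(block["input"])

    # Shape 2: no tool call, so look for a JSON object in the text blocks.
    text = "".join(b.get("text", "") for b in content if b.get("type") == "text")
    start, end = text.find("{"), text.rfind("}")
    if start == -1 or end <= start:
        raise ValueError("no tool_use block and no JSON object in text blocks")
    return json.loads(text[start : end + 1])


# Usage: both shapes yield the same kind of payload.
tool_shape = [
    {"type": "thinking", "thinking": "..."},
    {"type": "tool_use", "name": "report", "input": {"score": 0.9}},
]
text_shape = [
    {"type": "thinking", "thinking": "..."},
    {"type": "text", "text": 'Verdict: {"score": 0.5}'},
]
print(extract_verdict(tool_shape))
print(extract_verdict(text_shape))
```

Slicing from the first `{` to the last `}` is a deliberately simple fallback; a stricter parser might validate the payload against the tool's schema.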

Public surface:

  • FaithfulnessJudge.score(..., use_thinking=False, thinking_budget_tokens=32768)
  • FaithfulnessResult.thinking_used: bool
  • CLI: attune-rag-benchmark --with-faithfulness --thinking [--thinking-budget N]
  • Env: ATTUNE_RAG_FAITHFULNESS_THINKING, ATTUNE_RAG_FAITHFULNESS_THINKING_BUDGET

Back-compat: the default use_thinking=False produces a request byte-identical to the one v0.1.14 sends.
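
One way such a guarantee could be kept is to leave the request untouched unless thinking is requested. The sketch below assumes this approach; `build_request`, the base request contents, the tool name, and the 1024-token headroom are all hypothetical and not taken from the package. The `thinking` parameter shape and the switch to `tool_choice: auto` follow the Anthropic Messages API.

```python
import json
from typing import Any


def build_request(
    base: dict[str, Any],
    use_thinking: bool = False,
    thinking_budget_tokens: int = 32768,
) -> dict[str, Any]:
    """Hypothetical request builder: only touches the request when thinking is on."""
    request = dict(base)
    if not use_thinking:
        # Default path: return the same keys and values as before.
        return request

    # Extended thinking cannot be combined with a forced tool_choice,
    # so fall back to "auto" and let the parser handle either response shape.
    request["thinking"] = {"type": "enabled", "budget_tokens": thinking_budget_tokens}
    request["tool_choice"] = {"type": "auto"}
    # max_tokens must exceed the thinking budget; the headroom is an assumption.
    request["max_tokens"] = max(request.get("max_tokens", 0), thinking_budget_tokens + 1024)
    return request


# Hypothetical base request with a forced tool choice.
base = {
    "model": "placeholder-model-id",
    "max_tokens": 1024,
    "tool_choice": {"type": "tool", "name": "report_faithfulness"},
}

# Default path serializes to exactly the same bytes as the base request.
assert json.dumps(build_request(base)) == json.dumps(base)

with_thinking = build_request(base, use_thinking=True, thinking_budget_tokens=2048)
print(with_thinking["tool_choice"], with_thinking["max_tokens"])
```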

SDK: requires anthropic>=0.95,<1.0. Version 0.95 is the floor for stable thinking + tool use on Claude 4.

See PR #15 for details.