v0.1.15 — extended thinking on FaithfulnessJudge
Adds opt-in extended thinking to FaithfulnessJudge. The parser handles both tool_use and text block response shapes, because extended thinking is incompatible with a forced tool_choice on Claude 4.
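A minimal sketch of what handling both response shapes could look like, not the parser shipped in this release. It assumes content blocks are plain dicts with the Messages API `type` field (`tool_use`, `text`, `thinking`); the function name `extract_payload` and the "first `{` to last `}`" JSON extraction are hypothetical.

```python
import json
from typing import Any


def extract_payload(content: list[dict[str, Any]]) -> dict[str, Any]:
    """Return the structured payload from a list of response content blocks.

    Hypothetical helper: prefers a tool_use block, falls back to text blocks.
    """
    # Shape 1: a tool_use block, whose "input" is already a parsed dict.
    for block in content:
        if block.get("type") == "tool_use":
            return dict(block["input"])

    # Shape 2: no tool call. Join the text blocks, ignoring "thinking" blocks,
    # and parse the outermost JSON object found in the text (an assumption
    # about how the payload is embedded).
    text = "".join(b.get("text", "") for b in content if b.get("type") == "text")
    start, end = text.find("{"), text.rfind("}")
    if start == -1 or end <= start:
        raise ValueError("no JSON object found in text blocks")
    return json.loads(text[start : end + 1])
```

Checking for tool_use first keeps the non-thinking path unchanged, while the text fallback covers responses where the tool call is not forced.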
Public surface:
- FaithfulnessJudge.score(..., use_thinking=False, thinking_budget_tokens=32768)
- FaithfulnessResult.thinking_used: bool
- CLI: attune-rag-benchmark --with-faithfulness --thinking [--thinking-budget N]
- Env: ATTUNE_RAG_FAITHFULNESS_THINKING, ATTUNE_RAG_FAITHFULNESS_THINKING_BUDGET
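As an illustration of how the two environment variables might map onto the new keyword arguments, here is a hedged sketch. The accepted truthy strings, the fallback to 32768 when the budget is unset, and the helper name `thinking_options_from_env` are all assumptions; the release does not document how the values are parsed.

```python
import os
from collections.abc import Mapping

# Assumed set of values that enable thinking; the real parsing may differ.
_TRUTHY = {"1", "true", "yes", "on"}


def thinking_options_from_env(environ: Mapping[str, str] = os.environ) -> dict[str, object]:
    """Translate the env vars into keyword arguments (hypothetical helper)."""
    raw_flag = environ.get("ATTUNE_RAG_FAITHFULNESS_THINKING", "")
    raw_budget = environ.get("ATTUNE_RAG_FAITHFULNESS_THINKING_BUDGET", "")
    return {
        "use_thinking": raw_flag.strip().lower() in _TRUTHY,
        # Fall back to the documented default budget when the variable is unset.
        "thinking_budget_tokens": int(raw_budget) if raw_budget.strip() else 32768,
    }
```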
Back-compat: the default use_thinking=False produces a request byte-identical to v0.1.14.
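One way to get that guarantee is to touch the request only when thinking is enabled. The sketch below assumes the request is a dict of Messages API keyword arguments; `build_request`, the relaxation of `tool_choice` to `auto`, and the 1024-token headroom above the budget are hypothetical choices, not the code in this release.

```python
from typing import Any


def build_request(
    base: dict[str, Any],
    use_thinking: bool = False,
    thinking_budget_tokens: int = 32768,
) -> dict[str, Any]:
    """Return request kwargs, adding thinking fields only when opted in."""
    request = dict(base)  # shallow copy so the caller's dict is not mutated
    if not use_thinking:
        # Default path: nothing added or changed, so the payload is unchanged.
        return request

    request["thinking"] = {"type": "enabled", "budget_tokens": thinking_budget_tokens}
    # A forced tool_choice cannot be combined with thinking, so relax it.
    if "tool_choice" in request:
        request["tool_choice"] = {"type": "auto"}
    # budget_tokens must stay below max_tokens; the headroom is an assumed value.
    request["max_tokens"] = max(request.get("max_tokens", 0), thinking_budget_tokens + 1024)
    return request
```

Keeping the early return as the first branch makes the back-compat path easy to audit: with the default arguments no key is added, removed, or reordered.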
SDK: anthropic>=0.95,<1.0 floor for stable thinking + tool-use on Claude 4.
See PR #15 for details.