feat(tools): Think-Augmented Function Calling for improved parameter accuracy (TAFC, #1861) by bug-ops · Pull Request #2038 · bug-ops/zeph

bug-ops · 2026-03-20T12:36:25Z

Summary

Implement Think-Augmented Function Calling (TAFC) — a research-driven feature that improves LLM parameter accuracy by allowing the model to reason about parameter values before committing them.

Source: arXiv 2601.18282 — "Think-Augmented Function Calling: Improving LLM Parameter Accuracy Through Embedded Reasoning"

Results (ToolBench benchmark):

Average win rate: 69.6% vs 18.2% (standard function calling)
Claude Sonnet: +3.0 pp → 60.3% pass rate
Qwen2.5-72B: +4.4 pp → 49.0%
Largest relative gains at small model tier (7B–8B)

Implementation

Schema augmentation: Add optional _tafc_think field to tool definitions (complexity-based)
Client-side filter: Strip _tafc_think* keys before tool dispatch — full backward compatibility
Config: [tools.tafc] TOML sub-table with enabled (default false) and complexity_threshold (default 0.6)
Integration: CLI flag --tafc, TUI command /tafc, --init wizard step
Zero architectural changes: Schema-only augmentation, no provider-specific code needed

Architecture Decisions

Single injection point: tool_def_to_definition() in zeph-core (all tools get TAFC augmentation)
Complexity heuristic: Structural JSON analysis (depth, combinators, enum count, flat params) with configurable threshold
Subagent isolation: TAFC disabled for inline tool loops (intentional, private message scope per CRIT-02)
Ephemeral semantics: Think fields stripped from history before memory persistence — reasoning serves its purpose once params are generated

Validation Results

✓ Architect: Design approved with clarifications
✓ Critic (Architecture): Approved with conditions (subagent isolation, prefix consistency)
✓ Developer: Implementation complete, all fixes applied
✓ Tester: Coverage validated (13→18 unit tests, all edge cases)
✓ Impl-Critic: Code review approved (all CRIT/HIGH findings fixed)
✓ Reviewer: Final approval (all integration points present)

Key Fixes Applied

CRIT-01: Added tafc_failed tracking + [error] ToolOutput synthesis for empty params after stripping
CRIT-02: Documented inline loop bypass with design rationale
HIGH-01: Added bounds validation (0.0–1.0), NaN/Infinity clamping
HIGH-02: Extended complexity score with flat_params_score (0.2 weight)
HIGH-03: Documented audit trail ephemeral semantics
SEC-01: Case-insensitive prefix matching (handles _TAFC_THINK variants)

Test Coverage

Total tests: 5958 (5 new for TAFC fixes)
Coverage: schema augmentation, filter correctness, complexity detection, empty params, config validation
Edge cases: tau boundary (0.59, 0.60, 0.61), null parameters, case-insensitive variants, NaN/Infinity
Integration: tool dispatch with TAFC enabled/disabled, memory persistence, CLI/TUI flags

Files Changed

crates/zeph-core/src/agent/builder.rs — apply bounds validation
crates/zeph-core/src/agent/mod.rs — document inline loop bypass
crates/zeph-core/src/agent/tool_execution/mod.rs — flat params score, audit trail docs, case-insensitive key check
crates/zeph-core/src/agent/tool_execution/native.rs — tafc_failed tracking + error synthesis
crates/zeph-core/src/agent/tool_execution/tests.rs — 5 new tests
crates/zeph-tools/src/config.rs — TafcConfig with validated() bounds check

Pre-Commit Checks

✓ cargo +nightly fmt --check
✓ cargo clippy --workspace --features full -- -D warnings
✓ cargo nextest run --config-file .github/nextest.toml --workspace --features full --lib --bins (5958 tests)

Phase 2 (Future)

Per-parameter decomposition (_tafc_think_P for each complex param)
Semantic complexity detection (beyond structural analysis)
TAFC for subagents (currently isolated to avoid confusion)
Adaptive threshold based on model + task type
Telemetry: track think field usage and impact on success rate

Closes

#1861

Injects a `_tafc_think` reasoning field into tool schemas with structural complexity >= tau (default 0.6). Complexity is measured by schema depth, presence of anyOf/oneOf/allOf combinators, and enum variant count. Think fields are stripped before tool execution and before persisting to memory to avoid token inflation. - TafcConfig { enabled, complexity_threshold } in [tools.tafc] - schema_complexity() heuristic with structural JSON analysis - augment_with_tafc() and strip_tafc_fields() in tool_execution/mod.rs - TAFC augmentation in process_response_native_tools() (native tools path) - strip before ToolCall execution and before MessagePart persistence - WARN + Err when model produces only think fields (HIGH-02) - --tafc CLI flag; /tafc TUI command (tafc:status); --init wizard step - Subagent scope (run_inline_tool_loop) excluded (CRIT-01) - 13 unit tests covering schema augmentation, stripping, config, edges

…1, HIGH-02, HIGH-03, SEC-01) - CRIT-01: add tafc_failed tracking and [error] ToolOutput synthesis for empty params - CRIT-02: document inline loop bypass (intentional design) - HIGH-01: add TafcConfig::validated() with bounds [0.0, 1.0] and NaN/Inf clamping - HIGH-02: extend schema_complexity with flat_params_score for many-param patterns - HIGH-03: expand audit trail doc comment (ephemeral semantics) - SEC-01: implement case-insensitive is_tafc_key() helper Add 5 new unit tests for fixed areas. All 5958 tests pass.

…alling Resolved conflicts: - tool_orchestrator.rs: keep both TafcConfig (TAFC) and ToolResultCache (main) - CHANGELOG.md: keep TAFC entry - config.rs, lib.rs, native.rs: take upstream (structural crate extraction changes) Main branch has advanced significantly with: - zeph-config, zeph-vault, zeph-experiments, zeph-sanitizer, zeph-subagent, zeph-orchestration crate extractions - AnchoredSummary for structured context compaction - ToolResultCache for session-scoped tool result caching - Dynamic tool schema filtering TAFC feature (issue #1861) coexists cleanly with these changes.

- Replace tool_def_to_definition with tool_def_to_definition_with_tafc in process_response_native_tools, gating on TafcConfig from ToolOrchestrator - Strip _tafc_think fields from tool call inputs before execution in handle_native_tool_calls; skip tool calls that produce only think fields - Fix clippy: add #[must_use] to TafcConfig::validated(), rewrite condition to avoid unnecessary boolean not, add backticks to doc comment - Export TafcConfig from zeph-tools crate public API

bug-ops added 2 commits March 20, 2026 12:49

bug-ops added P1 High ROI, low complexity — do next sprint research Research-driven improvement labels Mar 20, 2026

github-actions bot added documentation Improvements or additions to documentation rust Rust code changes core zeph-core crate enhancement New feature or request size/XL Extra large PR (500+ lines) labels Mar 20, 2026

bug-ops enabled auto-merge (squash) March 20, 2026 12:44

bug-ops added 4 commits March 20, 2026 13:47

style: apply cargo +nightly fmt

c18ef65

merge: pull latest origin/main into PR branch

4c02aa8

github-actions bot removed the documentation Improvements or additions to documentation label Mar 20, 2026

bug-ops linked an issue Mar 20, 2026 that may be closed by this pull request

research(tools): Think-Augmented Function Calling for improved parameter accuracy (TAFC) #1861

Closed

bug-ops merged commit 836219f into main Mar 20, 2026
25 checks passed

bug-ops deleted the feat/m27/1861-think-augmented-function-calling branch March 20, 2026 13:28

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(tools): Think-Augmented Function Calling for improved parameter accuracy (TAFC, #1861)#2038

feat(tools): Think-Augmented Function Calling for improved parameter accuracy (TAFC, #1861)#2038
bug-ops merged 6 commits intomainfrom
feat/m27/1861-think-augmented-function-calling

bug-ops commented Mar 20, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

bug-ops commented Mar 20, 2026

Summary

Implementation

Architecture Decisions

Validation Results

Key Fixes Applied

Test Coverage

Files Changed

Pre-Commit Checks

Phase 2 (Future)

Closes

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant