Skip to content

feat(tools): Think-Augmented Function Calling for improved parameter accuracy (TAFC, #1861)#2038

Merged
bug-ops merged 6 commits intomainfrom
feat/m27/1861-think-augmented-function-calling
Mar 20, 2026
Merged

feat(tools): Think-Augmented Function Calling for improved parameter accuracy (TAFC, #1861)#2038
bug-ops merged 6 commits intomainfrom
feat/m27/1861-think-augmented-function-calling

Conversation

@bug-ops
Copy link
Owner

@bug-ops bug-ops commented Mar 20, 2026

Summary

Implement Think-Augmented Function Calling (TAFC) — a research-driven feature that improves LLM parameter accuracy by allowing the model to reason about parameter values before committing them.

Source: arXiv 2601.18282 — "Think-Augmented Function Calling: Improving LLM Parameter Accuracy Through Embedded Reasoning"

Results (ToolBench benchmark):

  • Average win rate: 69.6% vs 18.2% (standard function calling)
  • Claude Sonnet: +3.0 pp → 60.3% pass rate
  • Qwen2.5-72B: +4.4 pp → 49.0%
  • Largest relative gains at small model tier (7B–8B)

Implementation

  • Schema augmentation: Add optional _tafc_think field to tool definitions (complexity-based)
  • Client-side filter: Strip _tafc_think* keys before tool dispatch — full backward compatibility
  • Config: [tools.tafc] TOML sub-table with enabled (default false) and complexity_threshold (default 0.6)
  • Integration: CLI flag --tafc, TUI command /tafc, --init wizard step
  • Zero architectural changes: Schema-only augmentation, no provider-specific code needed

Architecture Decisions

  1. Single injection point: tool_def_to_definition() in zeph-core (all tools get TAFC augmentation)
  2. Complexity heuristic: Structural JSON analysis (depth, combinators, enum count, flat params) with configurable threshold
  3. Subagent isolation: TAFC disabled for inline tool loops (intentional, private message scope per CRIT-02)
  4. Ephemeral semantics: Think fields stripped from history before memory persistence — reasoning serves its purpose once params are generated

Validation Results

Architect: Design approved with clarifications
Critic (Architecture): Approved with conditions (subagent isolation, prefix consistency)
Developer: Implementation complete, all fixes applied
Tester: Coverage validated (13→18 unit tests, all edge cases)
Impl-Critic: Code review approved (all CRIT/HIGH findings fixed)
Reviewer: Final approval (all integration points present)

Key Fixes Applied

  • CRIT-01: Added tafc_failed tracking + [error] ToolOutput synthesis for empty params after stripping
  • CRIT-02: Documented inline loop bypass with design rationale
  • HIGH-01: Added bounds validation (0.0–1.0), NaN/Infinity clamping
  • HIGH-02: Extended complexity score with flat_params_score (0.2 weight)
  • HIGH-03: Documented audit trail ephemeral semantics
  • SEC-01: Case-insensitive prefix matching (handles _TAFC_THINK variants)

Test Coverage

  • Total tests: 5958 (5 new for TAFC fixes)
  • Coverage: schema augmentation, filter correctness, complexity detection, empty params, config validation
  • Edge cases: tau boundary (0.59, 0.60, 0.61), null parameters, case-insensitive variants, NaN/Infinity
  • Integration: tool dispatch with TAFC enabled/disabled, memory persistence, CLI/TUI flags

Files Changed

  • crates/zeph-core/src/agent/builder.rs — apply bounds validation
  • crates/zeph-core/src/agent/mod.rs — document inline loop bypass
  • crates/zeph-core/src/agent/tool_execution/mod.rs — flat params score, audit trail docs, case-insensitive key check
  • crates/zeph-core/src/agent/tool_execution/native.rs — tafc_failed tracking + error synthesis
  • crates/zeph-core/src/agent/tool_execution/tests.rs — 5 new tests
  • crates/zeph-tools/src/config.rs — TafcConfig with validated() bounds check

Pre-Commit Checks

cargo +nightly fmt --check
cargo clippy --workspace --features full -- -D warnings
cargo nextest run --config-file .github/nextest.toml --workspace --features full --lib --bins (5958 tests)

Phase 2 (Future)

  • Per-parameter decomposition (_tafc_think_P for each complex param)
  • Semantic complexity detection (beyond structural analysis)
  • TAFC for subagents (currently isolated to avoid confusion)
  • Adaptive threshold based on model + task type
  • Telemetry: track think field usage and impact on success rate

Closes

#1861

bug-ops added 2 commits March 20, 2026 12:49
Injects a `_tafc_think` reasoning field into tool schemas with
structural complexity >= tau (default 0.6). Complexity is measured
by schema depth, presence of anyOf/oneOf/allOf combinators, and enum
variant count. Think fields are stripped before tool execution and
before persisting to memory to avoid token inflation.

- TafcConfig { enabled, complexity_threshold } in [tools.tafc]
- schema_complexity() heuristic with structural JSON analysis
- augment_with_tafc() and strip_tafc_fields() in tool_execution/mod.rs
- TAFC augmentation in process_response_native_tools() (native tools path)
- strip before ToolCall execution and before MessagePart persistence
- WARN + Err when model produces only think fields (HIGH-02)
- --tafc CLI flag; /tafc TUI command (tafc:status); --init wizard step
- Subagent scope (run_inline_tool_loop) excluded (CRIT-01)
- 13 unit tests covering schema augmentation, stripping, config, edges
…1, HIGH-02, HIGH-03, SEC-01)

- CRIT-01: add tafc_failed tracking and [error] ToolOutput synthesis for empty params
- CRIT-02: document inline loop bypass (intentional design)
- HIGH-01: add TafcConfig::validated() with bounds [0.0, 1.0] and NaN/Inf clamping
- HIGH-02: extend schema_complexity with flat_params_score for many-param patterns
- HIGH-03: expand audit trail doc comment (ephemeral semantics)
- SEC-01: implement case-insensitive is_tafc_key() helper

Add 5 new unit tests for fixed areas. All 5958 tests pass.
@bug-ops bug-ops added P1 High ROI, low complexity — do next sprint research Research-driven improvement labels Mar 20, 2026
@github-actions github-actions bot added documentation Improvements or additions to documentation rust Rust code changes core zeph-core crate enhancement New feature or request size/XL Extra large PR (500+ lines) labels Mar 20, 2026
@bug-ops bug-ops enabled auto-merge (squash) March 20, 2026 12:44
bug-ops added 4 commits March 20, 2026 13:47
…alling

Resolved conflicts:
- tool_orchestrator.rs: keep both TafcConfig (TAFC) and ToolResultCache (main)
- CHANGELOG.md: keep TAFC entry
- config.rs, lib.rs, native.rs: take upstream (structural crate extraction changes)

Main branch has advanced significantly with:
- zeph-config, zeph-vault, zeph-experiments, zeph-sanitizer, zeph-subagent, zeph-orchestration crate extractions
- AnchoredSummary for structured context compaction
- ToolResultCache for session-scoped tool result caching
- Dynamic tool schema filtering

TAFC feature (issue #1861) coexists cleanly with these changes.
- Replace tool_def_to_definition with tool_def_to_definition_with_tafc
  in process_response_native_tools, gating on TafcConfig from ToolOrchestrator
- Strip _tafc_think fields from tool call inputs before execution in
  handle_native_tool_calls; skip tool calls that produce only think fields
- Fix clippy: add #[must_use] to TafcConfig::validated(), rewrite condition
  to avoid unnecessary boolean not, add backticks to doc comment
- Export TafcConfig from zeph-tools crate public API
@github-actions github-actions bot removed the documentation Improvements or additions to documentation label Mar 20, 2026
@bug-ops bug-ops merged commit 836219f into main Mar 20, 2026
25 checks passed
@bug-ops bug-ops deleted the feat/m27/1861-think-augmented-function-calling branch March 20, 2026 13:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

core zeph-core crate enhancement New feature or request P1 High ROI, low complexity — do next sprint research Research-driven improvement rust Rust code changes size/XL Extra large PR (500+ lines)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

research(tools): Think-Augmented Function Calling for improved parameter accuracy (TAFC)

1 participant