Skip to content

v0.7.1 — managed-tool duplicate replay

Choose a tag to compare

@mostlydev mostlydev released this 16 Jun 19:44
60949f1

Fixes

  • Replay cached result on duplicate managed tool calls — when a model repeats an identical managed tool call, cllama now replays the earlier model-facing result for the new tool_call_id instead of returning a data-less HTTP 409. Previously a model that never received the earlier data on the repeat could re-issue the same call until the managed-tool round budget was exhausted, never finalizing. Gated by CLLAMA_MANAGED_DUPLICATE_POLICY (replay, default, or reject for the legacy 409 path).
  • Consecutive-duplicate streak cutoff — after CLLAMA_MANAGED_DUPLICATE_STREAK_CUTOFF consecutive identical duplicate calls (default 3), cllama forces graceful finalization by disabling tools and injecting a final-answer instruction before MaxRounds is reached, covering models that loop even after receiving the replayed result.

Both OpenAI-compatible and Anthropic mediation paths are handled. Session-history tool traces now record duplicate_streak and policy, and a duplicate_managed_tool_call_finalization:<tool> intervention is logged when the cutoff fires.