Fix infinite tool calling loop with Meta Llama models #50
Conversation
…ction

This commit improves the original PR oracle#50 fix by replacing the overly restrictive single-tool limitation with intelligent multi-step support.

Changes:
- Add `max_sequential_tool_calls` parameter (default: 8) to OCIGenAIBase
- Implement intelligent loop detection algorithm that identifies when the same tool is called repeatedly with identical arguments
- Replace unconditional tool_choice="none" with conditional logic:
  * Allow tool_choice="auto" when within limits and no loop detected
  * Force tool_choice="none" only when limit exceeded or loop detected

Benefits:
- ✅ Prevents infinite loops (original PR oracle#50 goal)
- ✅ Enables multi-step tool orchestration (new capability)
- ✅ Fully backward compatible (default parameter, no breaking changes)
- ✅ Configurable per use case (users can adjust max limit)
- ✅ Domain-agnostic (works for any tool-calling application)

Example use cases now supported:
- Diagnostic workflows requiring 3-5 sequential tool calls
- Data analysis pipelines with multiple data fetching steps
- Complex reasoning tasks requiring iterative tool usage

🤖 Generated with Claude Code
Co-Authored-By: Claude <noreply@anthropic.com>
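For context, a minimal sketch of how the new parameter could be set on the chat model; the model ID, endpoint, and compartment below are placeholder assumptions, and only `max_sequential_tool_calls` comes from this change:

```python
from langchain_community.chat_models import ChatOCIGenAI  # adjust import to this package's layout

# Placeholder OCI settings; max_sequential_tool_calls is the new, optional knob (default: 8).
llm = ChatOCIGenAI(
    model_id="meta.llama-3.3-70b-instruct",
    service_endpoint="https://inference.generativeai.us-chicago-1.oci.oraclecloud.com",
    compartment_id="ocid1.compartment.oc1..example",
    max_sequential_tool_calls=8,
)
```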
Force-pushed from 83af651 to 668ff5e.
## Problem

Meta Llama and other models were stuck in infinite tool calling loops after receiving tool results. The previous fix set `tool_choice="none"` unconditionally after any tool result, which prevented legitimate multi-step tool orchestration patterns.

## Solution

Implemented intelligent tool_choice management that:

1. Allows models to continue calling tools for multi-step workflows
2. Prevents infinite loops via the `max_sequential_tool_calls` limit (default: 8)
3. Detects infinite loops by identifying repeated tool calls with identical arguments

## Changes

- Added `max_sequential_tool_calls` parameter to `OCIGenAIBase` (default: 8)
- Enhanced `GenericProvider.messages_to_oci_params()` with `_should_allow_more_tool_calls()`
- Loop detection checks for the same tool called with the same args in succession
- Safety limit prevents runaway tool calling beyond the configured maximum

## Backward Compatibility

✅ Fully backward compatible - no breaking changes

- New parameter is optional with a sensible default (8)
- Existing code continues to work without modifications
- Previous infinite loop fix remains active as a fallback

## Technical Details

The fix passes `max_sequential_tool_calls` from `ChatOCIGenAI` to the `Provider` via kwargs, allowing the provider to determine whether to set `tool_choice="none"` (force stop) or `tool_choice="auto"` (allow continuation).
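A rough sketch of that conditional decision; `_should_allow_more_tool_calls` is named in the PR, while the helper shape and the OCI SDK tool-choice objects shown here are simplifying assumptions:

```python
from oci.generative_ai_inference import models


def _resolve_tool_choice(messages, max_sequential_tool_calls):
    """Hypothetical helper: pick the tool_choice the provider places on the OCI request."""
    # _should_allow_more_tool_calls is the PR's loop/limit check.
    if _should_allow_more_tool_calls(messages, max_sequential_tool_calls):
        # Within the limit and no repeated call detected: let the model keep orchestrating.
        return models.ToolChoiceAuto()
    # Limit exceeded or identical call repeated: force the model to answer without tools.
    return models.ToolChoiceNone()
```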
## Test Coverage

### 1. Basic Tool Calling Tests (test_tool_calling_no_infinite_loop)

Tests 4 models to ensure basic tool calling works without infinite loops:

- meta.llama-4-scout-17b-16e-instruct
- meta.llama-3.3-70b-instruct
- cohere.command-a-03-2025
- cohere.command-r-plus-08-2024

Verifies:

- Tool is called when needed
- Model stops after receiving tool results
- No infinite loops occur

### 2. Model-Specific Tests

- test_meta_llama_tool_calling: Validates Meta Llama models specifically
- test_cohere_tool_calling: Validates Cohere models return expected content

### 3. Multi-Step Tool Orchestration Test (test_multi_step_tool_orchestration)

Simulates realistic diagnostic workflows with 6 tools (2 models tested):

- meta.llama-4-scout-17b-16e-instruct
- cohere.command-a-03-2025

Tools simulate monitoring scenarios:

- check_status: Current resource health
- get_events: Recent failure events
- get_metrics: Historical trends
- check_changes: Recent deployments
- create_alert: Incident creation
- take_action: Remediation actions

Verifies:

- Agent makes multiple tool calls (2-8)
- Respects max_sequential_tool_calls limit
- Eventually stops (no infinite loops)
- Handles OCI limitation (1 tool call at a time)

## Test Results

All 8 tests passing across 4 models:

- ✅ Basic tool calling (4 models × 1 test = 4 tests)
- ✅ Model-specific tests (2 tests)
- ✅ Multi-step orchestration (2 models × 1 test = 2 tests)

## Documentation

Added comprehensive test documentation including:

- Prerequisites (OCI auth, environment setup)
- Running instructions
- What each test verifies
- Model compatibility notes
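A condensed sketch of the basic tool-calling pattern these integration tests exercise; the weather tool, prompt, and round cap are illustrative, not the repository's actual test code:

```python
from langchain_core.messages import HumanMessage, ToolMessage
from langchain_core.tools import tool


@tool
def get_weather(city: str) -> str:
    """Return a canned weather report for a city."""
    return f"Sunny and 22C in {city}"


def run_until_final_answer(llm, question: str, max_rounds: int = 10):
    """Drive the tool-calling loop and fail if the model never stops calling tools."""
    llm_with_tools = llm.bind_tools([get_weather])
    messages = [HumanMessage(content=question)]
    for _ in range(max_rounds):
        ai_msg = llm_with_tools.invoke(messages)
        messages.append(ai_msg)
        if not ai_msg.tool_calls:  # final answer reached: no infinite loop
            return ai_msg
        for call in ai_msg.tool_calls:  # execute each requested tool and feed the result back
            result = get_weather.invoke(call["args"])
            messages.append(ToolMessage(content=result, tool_call_id=call["id"]))
    raise AssertionError("Model kept requesting tools; possible infinite loop")
```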
Force-pushed from 668ff5e to 8aa86ce.
🔄 PR Updated - Clean History & Comprehensive Testing

This PR has been reorganized into 2 clean commits with full test coverage and documentation.

📋 Summary

Enhances tool calling to support multi-step orchestration while preventing infinite loops through intelligent `tool_choice` management.

Before: Model gets stuck in an infinite loop calling the same tool repeatedly.

✅ Test Results

All Integration Tests Passing (8/8)

`pytest tests/integration_tests/chat_models/test_tool_calling.py -v`

Results:
Execution Time: 63.73s (1:03)

🔍 What Was Tested

1. Basic Tool Calling (4 models)
2. Multi-Step Tool Orchestration (2 models)

Simulates a realistic diagnostic workflow with 6 tools:

- check_status: Current resource health
- get_events: Recent failure events
- get_metrics: Historical trends
- check_changes: Recent deployments
- create_alert: Incident creation
- take_action: Remediation actions

Validates:

- Agent makes multiple tool calls (2-8)
- Respects the max_sequential_tool_calls limit
- Eventually stops (no infinite loops)
- Handles the OCI limitation of one tool call at a time
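For illustration, the simulated tools follow the usual `@tool` pattern; the stub bodies and signatures below are assumptions, not the test file's actual implementations:

```python
from langchain_core.tools import tool


@tool
def check_status(resource_id: str) -> str:
    """Report the current health of a monitored resource (stubbed)."""
    return f"{resource_id}: degraded, error rate above threshold"


@tool
def get_events(resource_id: str) -> str:
    """List recent failure events for a resource (stubbed)."""
    return f"{resource_id}: 3 restarts in the last hour"


# get_metrics, check_changes, create_alert, and take_action follow the same pattern;
# binding all the tools at once lets the model chain them across several rounds.
def build_diagnostic_agent(llm):
    return llm.bind_tools([check_status, get_events])
```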
🛡️ Backward Compatibility

✅ No breaking changes - Fully backward compatible
📊 Code Quality

✅ Ruff linting: All checks passed

🏗️ Implementation Details

Key Changes

- `max_sequential_tool_calls` parameter added to `OCIGenAIBase` (default: 8)
- `GenericProvider.messages_to_oci_params()` now uses `_should_allow_more_tool_calls()` to decide the tool_choice
Loop Detection Algorithm

```python
def _should_allow_more_tool_calls(messages, max_tool_calls):
    # Safety limit: count total tool calls
    if tool_call_count >= max_tool_calls:
        return False
    # Infinite loop detection: same tool + same args in succession
    if signature in recent_calls[-2:]:
        return False  # Loop detected!
    return True  # Allow continuation
```
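The abbreviated block above leaves `tool_call_count`, `recent_calls`, and `signature` implicit. One way to derive them, assuming LangChain-style histories where `AIMessage.tool_calls` carries each call's name and arguments (a sketch, not the PR's exact code):

```python
import json

from langchain_core.messages import AIMessage


def _should_allow_more_tool_calls(messages, max_tool_calls):
    # Build a (tool name, canonicalized args) signature for every tool call made so far.
    recent_calls = [
        (call["name"], json.dumps(call.get("args", {}), sort_keys=True))
        for msg in messages
        if isinstance(msg, AIMessage)
        for call in (msg.tool_calls or [])
    ]

    # Safety limit: cap the total number of sequential tool calls.
    if len(recent_calls) >= max_tool_calls:
        return False

    # Infinite loop detection: the latest call repeats one of the two calls before it.
    if recent_calls and recent_calls[-1] in recent_calls[-3:-1]:
        return False

    return True
```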
📝 Commits

Commit 1: Fix implementation

Clean, atomic commits ready for review.
Might go in the next release? @YouNeedCryDear
@fede-kamel (cc: @YouNeedCryDear) My package version is oci 2.162.0.
The test was failing after rebase because it used non-existent OCI SDK classes (models.Tool) and had incorrect expectations about when tool_choice is set to 'none'.

Changes:

1. Replace OCI SDK mock objects with a Python function (following the pattern from other tests in the file)
2. Update the test to trigger actual tool_choice=none behavior by exceeding the max_sequential_tool_calls limit (3 tool calls)
3. Fix the _prepare_request call signature (add stop parameter)
4. Pass bound model kwargs to _prepare_request (required for tools)
5. Update the docstring to accurately describe what's being tested

The test now correctly validates that tool_choice is set to ToolChoiceNone when the max_sequential_tool_calls limit is reached, preventing infinite tool calling loops.

Related to PR oracle#50 (infinite loop fix) and PR oracle#53 (tool call optimization).

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
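Schematically, the reworked unit test builds a history in which the limit is already exceeded and then checks the prepared request. The message construction below uses standard LangChain types; the commented `_prepare_request` call and request attributes mirror the description above and should be treated as assumptions:

```python
from langchain_core.messages import AIMessage, HumanMessage, ToolMessage


def add(a: int, b: int) -> int:
    """Plain Python function used as the bound tool, like other tests in the file."""
    return a + b


def history_with_tool_calls(n_calls: int):
    """Conversation in which the model has already issued n_calls tool calls."""
    messages = [HumanMessage(content="Add 1+2, then 2+3, then 3+4.")]
    for i in range(n_calls):
        call_id = f"call_{i}"
        messages.append(
            AIMessage(content="", tool_calls=[{"name": "add", "args": {"a": i, "b": i + 1}, "id": call_id}])
        )
        messages.append(ToolMessage(content=str(2 * i + 1), tool_call_id=call_id))
    return messages


# With max_sequential_tool_calls=3 already reached, the prepared OCI request should
# carry ToolChoiceNone so the model stops requesting tools, e.g. (assumed signature):
#   request = llm_with_tools._prepare_request(history_with_tool_calls(3), stop=None, **bound_kwargs)
#   assert isinstance(request.chat_request.tool_choice, models.ToolChoiceNone)
```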
✅ Test Fixed in PR #53

@paxiaatucsdedu The Issue: The test was using non-existent OCI SDK classes (models.Tool) and had incorrect expectations about when tool_choice is set to 'none'.

Fix: PR #53 updates the test to:

- Use a plain Python function as the tool (matching other tests in the file)
- Trigger the real tool_choice=none behavior by exceeding the max_sequential_tool_calls limit
- Call _prepare_request with the correct signature and bound model kwargs
See the fix here: #53

The test now passes successfully! Once PR #53 is merged, you'll be able to run the test without issues.

Test results after the fix:

`tests/unit_tests/chat_models/test_oci_generative_ai.py::test_tool_choice_none_after_tool_results PASSED ✅`
✅ Test Fix Available - PR #57

@paxiaatucsdedu New PR created: #57

This is a test-only fix that can be merged immediately to unblock the team. The PR includes the test fix described above (Python function tool, corrected _prepare_request call, and updated expectations around tool_choice=none).

The fix has been separated from the performance optimization PR #53 for expedited merge.
…#57)

The test was failing after rebase because it used non-existent OCI SDK classes (models.Tool) and had incorrect expectations about when tool_choice is set to 'none'.

Changes:

1. Replace OCI SDK mock objects with a Python function (following the pattern from other tests in the file)
2. Update the test to trigger actual tool_choice=none behavior by exceeding the max_sequential_tool_calls limit (3 tool calls)
3. Fix the _prepare_request call signature (add stop parameter)
4. Pass bound model kwargs to _prepare_request (required for tools)
5. Update the docstring to accurately describe what's being tested

The test now correctly validates that tool_choice is set to ToolChoiceNone when the max_sequential_tool_calls limit is reached, preventing infinite tool calling loops.

Related to PR #50 (infinite loop fix) and PR #53 (tool call optimization).

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-authored-by: Claude <noreply@anthropic.com>
Problem
Meta Llama and other models were stuck in infinite tool calling loops after receiving tool results. The previous fix unconditionally set `tool_choice="none"` after any tool result, which prevented legitimate multi-step tool orchestration patterns.

Example Issue: the model keeps calling the same tool with identical arguments after receiving its result and never produces a final answer.
Solution
Implemented intelligent tool_choice management that:
- Allows models to continue calling tools for multi-step workflows
- Prevents infinite loops via the `max_sequential_tool_calls` limit (default: 8)
- Detects loops by identifying repeated tool calls with identical arguments

Changes
Core Implementation
- Added the `max_sequential_tool_calls` parameter to `OCIGenAIBase` (default: 8)
- Enhanced `GenericProvider.messages_to_oci_params()` with `_should_allow_more_tool_calls()`

Technical Details
The fix passes `max_sequential_tool_calls` from `ChatOCIGenAI` to the `Provider` via kwargs, allowing the provider to determine whether to set:

- `tool_choice="none"` (force stop) when the limit is reached or a loop is detected
- `tool_choice="auto"` (allow continuation) for valid multi-step workflows

Backward Compatibility
✅ Fully backward compatible - no breaking changes
Testing
Comprehensive Test Coverage (8/8 tests passing)
1. Basic Tool Calling Tests (4 models)
- meta.llama-4-scout-17b-16e-instruct
- meta.llama-3.3-70b-instruct
- cohere.command-a-03-2025
- cohere.command-r-plus-08-2024

2. Multi-Step Orchestration Tests (2 models)
Simulates realistic diagnostic workflows with 6 tools:
- check_status: Current resource health
- get_events: Recent failure events
- get_metrics: Historical trends
- check_changes: Recent deployments
- create_alert: Incident creation
- take_action: Remediation actions

Validates:
- Agent makes multiple tool calls (2-8)
- Respects the `max_sequential_tool_calls` limit
- Eventually stops (no infinite loops)
- Handles the OCI limitation of one tool call at a time

Test Results
✅ 8 passed in 63.73s (1:03)
Code Quality
✅ Ruff linting: All checks passed
✅ Type safety: Proper type hints throughout
✅ Documentation: Comprehensive docstrings and test docs
✅ Clean commits: 2 atomic commits ready for review
Usage Example
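A minimal usage sketch; the endpoint, compartment, model choice, and example tool are illustrative assumptions, while `max_sequential_tool_calls` is the parameter this PR adds:

```python
from langchain_community.chat_models import ChatOCIGenAI  # adjust import to this package's layout
from langchain_core.messages import HumanMessage, ToolMessage
from langchain_core.tools import tool


@tool
def get_metrics(resource_id: str) -> str:
    """Return recent metrics for a resource (stubbed for illustration)."""
    return f"{resource_id}: CPU 92%, memory 81%"


llm = ChatOCIGenAI(
    model_id="meta.llama-3.3-70b-instruct",
    service_endpoint="https://inference.generativeai.us-chicago-1.oci.oraclecloud.com",
    compartment_id="ocid1.compartment.oc1..example",
    max_sequential_tool_calls=5,  # allow multi-step orchestration, but keep it bounded
).bind_tools([get_metrics])

messages = [HumanMessage(content="Is resource db-01 healthy?")]
response = llm.invoke(messages)
while response.tool_calls:  # the limit and loop detection ensure this terminates
    messages.append(response)
    for call in response.tool_calls:
        messages.append(ToolMessage(content=get_metrics.invoke(call["args"]), tool_call_id=call["id"]))
    response = llm.invoke(messages)
print(response.content)
```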
Commits