feat: add programmatic tool caller by mkmeral · Pull Request #387 · strands-agents/tools

mkmeral · 2026-02-05T05:35:14Z

Description

Adds programmatic_tool_caller tool that enables code-based tool invocation. Agents can write Python code that calls other tools as async functions, reducing API round-trips and enabling complex orchestration patterns like loops, parallel execution, and chaining.

Note: Does not work on Windows.

Key Features

Async-first design: Tools exposed as await tool_name(...) - code runs in async context automatically
Only print() output returned: Tool results stay in code execution context, don't enter agent's context window unless explicitly printed
Pluggable executor: Custom Executor implementations for sandboxed environments (Docker, Lambda, etc.)
Security controls: Code validation for dangerous patterns, user confirmation (bypassable via BYPASS_TOOL_CONSENT), configurable allowed tools
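
The async-first design and print-only output listed above can be sketched roughly as follows. This is a minimal illustration, not the PR's implementation: `run_user_code` and `fake_calculator` are hypothetical names, and the real tool adds consent prompts, error handling, and tool wrappers that are omitted here.

```python
import asyncio
import contextlib
import io
import textwrap


def run_user_code(code: str, namespace: dict) -> str:
    """Run `code` inside an async wrapper and return only what it printed."""
    # Indent the user code so it becomes the body of a coroutine function.
    wrapped = "async def __user_code__():\n" + textwrap.indent(code, "    ") + "\n"
    exec(compile(wrapped, "<user_code>", "exec"), namespace)

    buffer = io.StringIO()
    # redirect_stdout restores sys.stdout even if the user code raises.
    with contextlib.redirect_stdout(buffer):
        asyncio.run(namespace["__user_code__"]())
    return buffer.getvalue()


async def fake_calculator(expression: str) -> str:
    """Hypothetical stand-in for a real tool; evaluates simple arithmetic."""
    return str(eval(expression, {"__builtins__": {}}))


output = run_user_code(
    'value = await fake_calculator(expression="2 + 2")\nprint(f"Result: {value}")',
    {"asyncio": asyncio, "fake_calculator": fake_calculator},
)
print(output)  # Result: 4
```

The intermediate `value` never leaves the execution namespace; only the printed line is returned to the caller.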

Example Usage

from strands import Agent
from strands_tools import programmatic_tool_caller, calculator

agent = Agent(tools=[programmatic_tool_caller, calculator])

result = agent.tool.programmatic_tool_caller(
    code="""
# Simple tool call
result = await calculator(expression="2 + 2")
print(f"Result: {result}")

# Loop with tool calls
total = 0
for i in range(1, 6):
    square = await calculator(expression=f"{i} ** 2")
    total += int(square)
print(f"Sum of squares: {total}")

# Parallel execution
results = await asyncio.gather(
    calculator(expression="10 * 1"),
    calculator(expression="10 * 2"),
    calculator(expression="10 * 3"),
)
print(f"Parallel results: {results}")
"""
)

Environment Variables

Variable	Description
`BYPASS_TOOL_CONSENT`	Skip user confirmation if `"true"`
`PROGRAMMATIC_TOOL_CALLER_ALLOWED_TOOLS`	Comma-separated list of tools to expose (default: all except self)
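
One way the allow-list variable could be parsed is sketched below. The function name `get_allowed_tools` and its signature are assumptions for illustration; the PR mentions a `_get_allowed_tools` helper but does not show its body.

```python
import os


def get_allowed_tools(registered: set[str], self_name: str = "programmatic_tool_caller") -> set[str]:
    """Return the tool names to expose, honoring the allow-list env var."""
    available = registered - {self_name}  # the tool never exposes itself
    raw = os.environ.get("PROGRAMMATIC_TOOL_CALLER_ALLOWED_TOOLS", "")
    # Tolerate stray commas and whitespace: "a, ,b," -> {"a", "b"}
    requested = {name.strip() for name in raw.split(",") if name.strip()}
    if not requested:
        return available  # unset or empty: expose everything except self
    return available & requested


os.environ["PROGRAMMATIC_TOOL_CALLER_ALLOWED_TOOLS"] = " calculator, ,file_read,"
registered = {"calculator", "file_read", "shell", "programmatic_tool_caller"}
allowed = get_allowed_tools(registered)
print(sorted(allowed))  # ['calculator', 'file_read']
```

Intersecting with the registered set means a name listed in the variable but not registered is silently ignored; whether the real tool does the same is not stated in this PR.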

Custom Executors

from strands_tools.programmatic_tool_caller import programmatic_tool_caller, Executor

class DockerExecutor(Executor):
    def execute(self, code: str, namespace: dict) -> str:
        # Run in sandboxed container
        ...

programmatic_tool_caller.executor = DockerExecutor()

Related Issues

Type of Change

New Tool

Testing

Unit tests for executor, tool execution, validation, and allowed tools filtering
Integration tests with real Agent and tools
Tests cover async execution, loops, asyncio.gather, custom executors, user cancellation

Checklist

I have read the CONTRIBUTING document
I have added any necessary tests that prove my fix is effective or my feature works
I have updated the documentation accordingly (README.md)
I have added an appropriate example to the documentation to outline the feature
My changes generate no new warnings

This tool enables programmatic/code-based tool invocation for Strands Agents, inspired by Anthropic's Programmatic Tool Calling feature. It allows an agent to write Python code that calls other tools as functions.

Key features:
- Tools exposed as callable methods via 'tools.<tool_name>(**kwargs)'
- Supports complex orchestration with loops, conditionals, data processing
- Captures stdout/stderr from executed code
- Records all tool calls for transparency
- Validates code for potentially dangerous patterns
- User confirmation required unless BYPASS_TOOL_CONSENT is set

Example usage:

result = agent.tool.programmatic_tool_caller(
    code='''
result = tools.calculator(expression="2 + 2")
print(f"Result: {result}")
'''
)

The tool integrates with Strands' DecoratedFunctionTool pattern, calling tools directly with keyword arguments and handling both string and dict return values.

Includes comprehensive unit tests covering:
- ToolProxy functionality
- Code validation
- Tool execution
- Integration with real tools
- Edge cases and error handling

@tool

Changes:
- Use tool_context via @tool(context=True) instead of agent parameter
- Handle multiple content blocks in tool results (combine all text)
- Remove allowed_tools parameter (let agent decide which tools to use)
- Add comprehensive integration tests with real tools
- Fix test assertions and add more edge case coverage

Test coverage:
- 43 unit tests
- 10 integration tests
- All tests passing

- Add tool entry to the tools table
- Add usage example section with code sample
- Note that tool does not work on Windows (uses exec)

Major changes:
- Remove ToolProxy class, inject tools directly as functions
- Tools exposed as both async (tool_name) and sync (tool_name_sync)
- Only return print() output, not tool call summary or execution time
- Support async tool calls via asyncio

This aligns with Anthropic's design where:
- Tools are callable as async functions: await tool_name(...)
- Only print() output is captured and returned to agent
- Tool results stay in code execution context, don't enter agent messages

…oop()

- Remove sync functions, only expose async (await tool_name(...))
- Auto-wrap user code in async function, no boilerplate needed
- Support asyncio.gather() for parallel execution
- Simplified implementation and tests

- Add Executor abstract base class for custom execution environments
- LocalAsyncExecutor as default (local exec with asyncio)
- Custom executors can be set via: programmatic_tool_caller.executor = MyExecutor()
- Add PROGRAMMATIC_TOOL_CALLER_ALLOWED_TOOLS env var to control exposed tools
- Tests for executor swapping and env var filtering

This reverts commit ca41d0f.

Use agent.tool.<name>() instead of directly calling tool_impl() from registry. This properly handles all tool types including MCP tools which are not directly callable but work through the ToolExecutor._stream() mechanism.

- Changed _execute_tool to use getattr(agent.tool, tool_name)()
- Added record_direct_tool_call=False to prevent polluting message history
- Handle AttributeError for tool not found case
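
The dispatch change described in this commit might look roughly like the sketch below. The stub agent and `fake_calculator` are hypothetical stand-ins so the example runs without the SDK; only `getattr(agent.tool, tool_name)`, `record_direct_tool_call=False`, and the `AttributeError` handling come from the commit message, and the error text is an assumption.

```python
from types import SimpleNamespace


def execute_tool(agent, tool_name: str, **kwargs) -> dict:
    """Call a tool through agent.tool.<name>() rather than the registry."""
    try:
        caller = getattr(agent.tool, tool_name)
    except AttributeError:
        return {"status": "error", "content": [{"text": f"Tool not found: {tool_name}"}]}
    # record_direct_tool_call=False keeps the call out of the message history.
    return caller(record_direct_tool_call=False, **kwargs)


def fake_calculator(expression: str, record_direct_tool_call: bool = True) -> dict:
    """Hypothetical stand-in for a tool caller exposed on agent.tool."""
    value = eval(expression, {"__builtins__": {}})
    return {"status": "success", "content": [{"text": str(value)}]}


# Stub agent: only the attribute lookup on `agent.tool` matters for this sketch.
fake_agent = SimpleNamespace(tool=SimpleNamespace(calculator=fake_calculator))

ok = execute_tool(fake_agent, "calculator", expression="6 * 7")
missing = execute_tool(fake_agent, "no_such_tool")
print(ok["content"][0]["text"])  # 42
print(missing["status"])         # error
```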

mkmeral

Interrupt limitation with programmatic tool calls

Tools that use interrupts (human-in-the-loop) will raise a RuntimeError when invoked through the programmatic tool caller. The SDK's _ToolCaller has two guards:

If an interrupt is already active when a direct tool call is made → RuntimeError("cannot directly call tool during interrupt")
If a tool raises an interrupt during a direct call → RuntimeError("cannot raise interrupt in direct tool call")

This is a known SDK limitation — direct/programmatic tool calls have no mechanism to pause execution, collect human input, and resume. The error surfaces as a failed tool result back to the agent, so the agent can reason about it and try an alternative approach.

Add limitations note regarding interrupt-capable tools.

Aligns programmatic_tool_caller with the sandboxes design doc (strands-agents/docs#681) Phase 1 requirements:

- Remove Executor ABC and LocalAsyncExecutor classes
  The design doc separates Sandbox (SDK-level, where code runs) from the programmatic tool caller (tools-level, runs in host process). The Executor abstraction competed with the Sandbox design.
- Inline async execution logic directly in the tool function
  Phase 1 always runs orchestration code in-process. The ~15 lines of execution logic are now directly in programmatic_tool_caller().
- Use compile() for better error tracebacks
  Per the design doc: compile(code, '<programmatic_tool_caller>', 'exec') gives clearer tracebacks than raw exec().
- Remove custom executor documentation and examples
  The Custom Executors section in the module docstring is removed. The Sandbox + Tool Proxy design (Phase 2) replaces this concept.
- Remove executor-related tests
  TestExecutor class and test_custom_executor removed. Added test_stderr_captured and test_syntax_error_handled for coverage.

The core tool logic (tool wrappers, _execute_tool, _create_async_tool_function, _validate_code, _get_allowed_tools) is unchanged. The tool gets simpler, not more complex.

Refs: strands-agents/docs#681, strands-agents#387
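
The traceback benefit of `compile()` with an explicit filename can be seen in a small standalone example. The failing `source` string is made up for illustration; the filename argument is the one quoted in the commit message.

```python
import traceback

source = "x = 1\ny = x / 0\n"

try:
    # Compiling with an explicit filename labels the frames in the traceback.
    code_obj = compile(source, "<programmatic_tool_caller>", "exec")
    exec(code_obj, {"__name__": "__main__"})
except ZeroDivisionError:
    report = traceback.format_exc()

print(report)  # includes: File "<programmatic_tool_caller>", line 2, in <module>
```

Passing the raw string to `exec()` instead would label the same frame `<string>`, which is harder to attribute when several layers of generated code are involved.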

…configurable extras

Changes:
- Remove _validate_code(): python_repl has no code validation, neither should we
- Match base namespace to python_repl: {"__name__": "__main__"} instead of custom {__builtins__, asyncio, json, re, math}
- asyncio always injected (required for async tool wrappers)
- Add PROGRAMMATIC_TOOL_CALLER_EXTRA_MODULES env var for configurable namespace (comma-separated module names, e.g. "json,re,math,collections")
- Extract _build_namespace() helper for clarity and testability
- Add TestBuildNamespace test class (7 tests)
- Add test_extra_modules_available_in_code and test_code_can_import_modules
- Remove TestValidateCode test class and all _validate_code references
- Remove unused imports (json, math, re from top-level)
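
A namespace builder along the lines of this commit could be sketched as follows. The function name, its signature, and the use of `importlib.import_module` are assumptions; the PR names a `_build_namespace()` helper but does not show it. In this sketch `asyncio` is assigned last so that no same-named entry can replace it, which is a choice made here rather than something the commit states.

```python
import asyncio
import importlib
import os


def build_namespace(tool_functions: dict) -> dict:
    """Assemble the globals dict that user code runs in."""
    namespace = {"__name__": "__main__"}  # base namespace named in the commit
    raw = os.environ.get("PROGRAMMATIC_TOOL_CALLER_EXTRA_MODULES", "")
    for module_name in (part.strip() for part in raw.split(",")):
        if module_name:
            # Assumption: an unknown module name raises ImportError here.
            namespace[module_name] = importlib.import_module(module_name)
    namespace.update(tool_functions)
    namespace["asyncio"] = asyncio  # always injected; set last so nothing shadows it
    return namespace


os.environ["PROGRAMMATIC_TOOL_CALLER_EXTRA_MODULES"] = "json, math"
ns = build_namespace({})
exec("print(json.dumps({'pi': round(math.pi, 2)}))", ns)  # {"pi": 3.14}
```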

agent-of-mkmeral

🔴 Adversarial Testing Result: FAIL — 3 issues found

Scope: Full adversarial testing of programmatic_tool_caller — input boundaries, stdout/stderr restoration, async execution, namespace security, tool execution error paths, env var handling, contract verification, BaseException handling, indentation/wrapping, and concurrency.

Tests written: 47
Tests passing: 47 (all findings confirmed with reproducible artifacts)
Tests failing (findings): 3 bugs found and proven

Findings Summary

#	Category	Severity	Description
1	Bug	Critical	`SystemExit` and `KeyboardInterrupt` escape the tool — `sys.exit()` in user code kills the host process
2	Bug	Medium	Empty code and comment-only code cause `SyntaxError` due to async wrapping producing empty function body
3	Bug	Medium	Tool named `asyncio` shadows the required `asyncio` module in namespace, breaking all async functionality

Finding 1 — BaseException Escape (Critical)

Category: Bug
Severity: Critical

Reproduction:

# This kills the host process:
result = programmatic_tool_caller(code="import sys; sys.exit(42)", tool_context=ctx)
# ^^^ SystemExit propagates past except Exception — host dies

# This also escapes:
result = programmatic_tool_caller(code="raise KeyboardInterrupt()", tool_context=ctx)
# ^^^ KeyboardInterrupt propagates to caller

Root cause: The exception handlers are:

except SyntaxError:
    ...
except Exception:
    ...

SystemExit and KeyboardInterrupt inherit from BaseException, NOT Exception. They bypass both catch blocks.

Observed behavior: SystemExit propagates up the call stack, killing the host process. KeyboardInterrupt similarly escapes.

Expected behavior: User code should never be able to kill the host. The tool should catch these and return an error dict.

Fix:

except SyntaxError:
    ...
except (SystemExit, KeyboardInterrupt) as e:
    error_msg = f"Execution error: {type(e).__name__}: {e}"
    console.print(Panel(error_msg, title="[bold red]Error[/]", border_style="red"))
    return {"status": "error", "content": [{"text": error_msg}]}
except Exception:
    ...

Artifact: TestFinding2_BaseExceptionEscape::test_system_exit_escapes_tool, test_keyboard_interrupt_escapes_tool
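
The exception hierarchy behind this finding can be checked in isolation. The sketch below uses a hypothetical `run_guarded` helper, not the tool itself, and omits the console panel from the proposed fix.

```python
def run_guarded(code: str) -> dict:
    """Execute code and convert every failure, including exits, into an error dict."""
    try:
        exec(compile(code, "<user_code>", "exec"), {"__name__": "__main__"})
    except (SystemExit, KeyboardInterrupt) as exc:
        # These derive from BaseException, so `except Exception` would miss them.
        return {"status": "error", "content": [{"text": f"{type(exc).__name__}: {exc}"}]}
    except Exception as exc:
        return {"status": "error", "content": [{"text": f"{type(exc).__name__}: {exc}"}]}
    return {"status": "success", "content": [{"text": "(no output)"}]}


print(issubclass(SystemExit, Exception))      # False
print(issubclass(SystemExit, BaseException))  # True
print(run_guarded("import sys; sys.exit(42)")["content"][0]["text"])  # SystemExit: 42
```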

Finding 2 — Empty/Comment-Only Code SyntaxError (Medium)

Category: Unhandled Edge Case
Severity: Medium

Reproduction:

# Empty code → SyntaxError
result = programmatic_tool_caller(code="", tool_context=ctx)
# Returns: {"status": "error", "content": [{"text": "Syntax error: expected an indented block..."}]}

# Comment-only code → SyntaxError  
result = programmatic_tool_caller(code="# just a comment", tool_context=ctx)
# Same SyntaxError

Root cause: The async wrapping does:

indented_code = textwrap.indent(code, "    ")
wrapped_code = f"async def __user_code__():\n{indented_code}\n"

Empty code → async def __user_code__():\n\n → empty function body = SyntaxError.
Comment-only → async def __user_code__():\n # comment\n → comments aren't statements = SyntaxError.

Expected behavior: Empty/comment-only code should succeed with (no output), same as passing x = 42.

Fix: Substitute pass as the function body when the code is empty or contains only comments and blank lines:

# Blank lines count as empty too, so "# a\n\n# b" is also covered
if all(not line.strip() or line.strip().startswith('#') for line in code.splitlines()):
    indented_code = "    pass"
else:
    indented_code = textwrap.indent(code, "    ")

Artifact: TestFinding1_EmptyCodeWrapping::test_empty_code_string_fails, test_comment_only_code_fails
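
The wrapping failure and the `pass` fallback can be verified without running the tool by parsing the wrapped source. `wrap_user_code` is a hypothetical helper name used only for this sketch.

```python
import ast
import textwrap


def wrap_user_code(code: str) -> str:
    """Wrap code in an async function, falling back to `pass` for empty bodies."""
    lines = code.splitlines()
    # Blank lines and comment lines are not statements.
    if all(not line.strip() or line.strip().startswith("#") for line in lines):
        body = "    pass"
    else:
        body = textwrap.indent(code, "    ")
    return f"async def __user_code__():\n{body}\n"


for sample in ["", "# just a comment", "# a\n\n# b", "x = 42"]:
    ast.parse(wrap_user_code(sample))  # raises SyntaxError if the wrapper is invalid
```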

Finding 3 — asyncio Module Shadowing (Medium)

Category: Bug
Severity: Medium

Reproduction:

# If a tool named 'asyncio' exists in the registry:
ns = _build_namespace({"asyncio"}, agent)
assert ns["asyncio"] is not asyncio_module  # True! It's now a tool wrapper
# asyncio.gather(), asyncio.sleep() etc. are now broken

Root cause: In _build_namespace(), asyncio module is injected first, then tools overwrite by name:

namespace["asyncio"] = asyncio  # ← injected
for tool_name in available_tools:
    namespace[tool_name] = _create_async_tool_function(...)  # ← overwrites 'asyncio'

Expected behavior: Reserved names (asyncio, __name__, extra modules) should not be overwritable by tool names.

Fix: Either exclude reserved names from tool injection, or inject tools first and modules after:

# Option A: Exclude reserved names
RESERVED = {"asyncio", "__name__"}
for tool_name in available_tools - RESERVED:
    namespace[tool_name] = ...

# Option B: Inject tools first, then required modules (modules win)
for tool_name in available_tools:
    namespace[tool_name] = ...
namespace["asyncio"] = asyncio  # Overwrite any tool shadowing

Artifact: TestFinding3_AsyncioShadowing::test_asyncio_tool_overwrites_module
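
Option A can be exercised with a stub tool registered under the conflicting name. `build_safe_namespace`, `RESERVED_NAMES`, and `rogue_tool` are hypothetical names for this sketch; the real `_build_namespace()` takes different arguments.

```python
import asyncio

RESERVED_NAMES = {"asyncio", "__name__"}  # hypothetical constant, mirroring Option A


def build_safe_namespace(tool_functions: dict) -> dict:
    """Inject tools without letting them replace reserved names."""
    namespace = {"__name__": "__main__", "asyncio": asyncio}
    for name, func in tool_functions.items():
        if name in RESERVED_NAMES:
            continue  # skip rather than shadow the module
        namespace[name] = func
    return namespace


async def rogue_tool(**kwargs):
    """Hypothetical tool that happens to be registered as 'asyncio'."""
    return "tool result"


ns = build_safe_namespace({"asyncio": rogue_tool, "calculator": rogue_tool})
print(ns["asyncio"] is asyncio)  # True
print("calculator" in ns)        # True
```

Skipping silently keeps async code working but hides the conflict from the user; logging a warning for the skipped name would be a reasonable addition.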

What Survived

41 adversarial tests passed without finding issues:

✅ stdout/stderr properly restored after RuntimeError, SyntaxError, and user-replaced stdout
✅ Unicode, null bytes, large output all handled correctly
✅ Async code with user-defined coroutines, asyncio.sleep, asyncio.gather all work
✅ Tool errors properly propagate to user code via try/except
✅ asyncio.gather with 20 concurrent calls completes correctly
✅ Sequential tool calls in loops work correctly
✅ Namespace isolation: agent object not leaked, __name__ properly set
✅ Self-exclusion: programmatic_tool_caller never available in namespace
✅ Return format consistent (always {"status": ..., "content": [{"text": ...}]})
✅ Env var edge cases (extra commas, whitespace, empty strings) handled correctly
✅ Complex code structures (decorators, classes, nested indentation) work through async wrapping

🤖 AI agent response. Strands Agents. Feedback welcome!

Containerized Agent added 7 commits February 5, 2026 04:37

docs: update PR description with final implementation details

d91494b

docs: add programmatic_tool_caller to README.md

7855781

docs: update README with new async tool calling API

0911b99

chore: remove PR_DESCRIPTION.md from branch

091bae8

mkmeral changed the title ~~Feat/programmatic tool caller~~ feat: Add programmatic tool caller Feb 5, 2026

mkmeral temporarily deployed to auto-approve February 5, 2026 05:35 — with GitHub Actions Inactive

fix: use asyncio.get_running_loop() instead of deprecated get_event_l…

41ba670

…oop()

mkmeral temporarily deployed to auto-approve February 5, 2026 05:44 — with GitHub Actions Inactive

Containerized Agent added 2 commits February 5, 2026 06:10

refactor: async-only API with auto async context wrapping

ef58a2a

chore: remove PR_DESCRIPTION.md

a900f94

mkmeral temporarily deployed to auto-approve February 5, 2026 06:10 — with GitHub Actions Inactive

Containerized Agent added 3 commits February 5, 2026 15:08

chore: remove PR_DESCRIPTION.md

eca71ad

chore: add PR_DESCRIPTION.md to gitignore

ca41d0f

mkmeral temporarily deployed to auto-approve February 5, 2026 15:09 — with GitHub Actions Inactive

Revert "chore: add PR_DESCRIPTION.md to gitignore"

4c0857b

mkmeral temporarily deployed to auto-approve February 5, 2026 15:35 — with GitHub Actions Inactive

mkmeral marked this pull request as ready for review February 5, 2026 22:09

mkmeral temporarily deployed to auto-approve February 6, 2026 17:28 — with GitHub Actions Inactive

mkmeral force-pushed the feat/programmatic-tool-caller branch from be93226 to 717566e Compare February 6, 2026 17:43

mkmeral temporarily deployed to auto-approve February 6, 2026 17:43 — with GitHub Actions Inactive

mkmeral force-pushed the feat/programmatic-tool-caller branch from 717566e to b273bd6 Compare February 6, 2026 18:43

mkmeral temporarily deployed to auto-approve February 6, 2026 18:43 — with GitHub Actions Inactive

mkmeral commented Feb 10, 2026

fix: document limitations for tool execution

99d0264

mkmeral temporarily deployed to auto-approve February 23, 2026 20:34 — with GitHub Actions Inactive

mkmeral changed the title ~~feat: Add programmatic tool caller~~ feat: add programmatic tool caller Feb 23, 2026

fix: code quality improvements for programmatic tool caller

029065d

mkmeral temporarily deployed to auto-approve February 23, 2026 22:03 — with GitHub Actions Inactive

mkmeral mentioned this pull request Mar 18, 2026

docs(designs): sandboxes and code execution strands-agents/docs#681

Open

4 tasks

agent-of-mkmeral added 2 commits March 25, 2026 13:46

agent-of-mkmeral reviewed Mar 25, 2026

mkmeral wants to merge 19 commits into strands-agents:main from mkmeral:feat/programmatic-tool-caller

2 participants
