Runnable companion to The Agent Is a Workflow That Writes Itself: recursive subagents lower to child workflows and PTC runs through a deterministic workflow-space interpreter.
The demo is deterministic by default. No LLM key is required — the model is a scripted activity that emits tool calls or final answers. You can opt into a live OpenAI or Anthropic model when you want to poke at it manually.
```sh
uv sync --group dev
```
```sh
uv run python -m scripts.demo
```

`scripts/demo.py` starts `temporal server start-dev` plus one worker on one task queue,
then drops you into a small REPL. Each prompt starts a real DemoAgentWorkflow,
prints the Temporal graph it produced, and leaves any file side effects in
.demo-sandbox for inspection. Open http://localhost:8233 while the REPL is
running to inspect workflow histories in Temporal Web. This default path expects
the Temporal CLI to be on PATH.
The default model is scripted, so no LLM key is required. Use /scenarios and
/scenario ptc inside the REPL for deterministic walkthroughs of direct tools,
programmatic tool calling, subagents, retries, and validation feedback. To use a
live provider, pass --model; otherwise the demo stays in scripted mode.
You can also run one prompt or scripted scenario and exit:
```sh
uv run python -m scripts.demo --scenario ptc
uv run python -m scripts.demo --scenario retry
uv run python -m scripts.demo "write a note in the sandbox"
```

If you already have a Temporal dev server running:
```sh
temporal server start-dev
uv run python -m scripts.demo --no-start-dev-server --address 127.0.0.1:7233
```

For a lightweight run without Temporal Web, use Temporal's in-process test server:
```sh
uv run python -m scripts.demo --in-process --scenario ptc
```

To run with a real provider instead of the scripted model, provide `--model` and
the corresponding API key. scripts/demo.py loads .env automatically, so you can put
keys there instead of exporting them in your shell:
```sh
echo 'OPENAI_API_KEY=...' > .env
uv run python -m scripts.demo --model gpt-5.5

# or
echo 'ANTHROPIC_API_KEY=...' > .env
uv run python -m scripts.demo --model claude-3-5-haiku-latest
```

Bare model names default to OpenAI, except names beginning with `claude`, which
default to Anthropic. You can also be explicit with `openai:gpt-5.5` or
`anthropic:claude-3-5-haiku-latest`.
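The prefix rule above can be sketched as a small resolver. This is an illustration of the documented behavior, not the demo's actual parsing code; the function name is hypothetical:

```python
def resolve_model(name: str) -> str:
    # Explicit "provider:model" names pass through unchanged.
    if ":" in name:
        return name
    # Bare names starting with "claude" default to Anthropic; everything else to OpenAI.
    provider = "anthropic" if name.startswith("claude") else "openai"
    return f"{provider}:{name}"

print(resolve_model("gpt-5.5"))                  # openai:gpt-5.5
print(resolve_model("claude-3-5-haiku-latest"))  # anthropic:claude-3-5-haiku-latest
```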
```sh
uv run pytest
```

The pytest suite runs the same workflow shape in Temporal's in-process test
environment and asserts that normal tools become generated activities, run_ptc
and spawn stay workflow-native, inner PTC calls use the target tool's
transport, retries happen at the Temporal boundary, validation failures are
model-visible, and cancellation propagates as cancellation.
Compose normal Pydantic AI capabilities, then wrap the agent with
TemporalAgent:
```python
from datetime import timedelta

from pydantic_ai import Agent
from pydantic_ai.durable_exec.temporal import TemporalAgent
from temporalio.common import RetryPolicy

from durable_agents.agent import run_durable_subagent
from durable_agents.capabilities import DemoDeps, build_capabilities

sandbox, faults, subagents, ptc = build_capabilities(run_durable_subagent)

agent = Agent(
    "openai:gpt-5.5",
    deps_type=DemoDeps,
    name="durable_capability_agent",
    instructions="Use tools for durable work. Use spawn for bounded subagents.",
    capabilities=[sandbox, faults, subagents, ptc],
)

durable_agent = TemporalAgent(
    agent,
    model_activity_config={
        "start_to_close_timeout": timedelta(seconds=30),
        "retry_policy": RetryPolicy(maximum_attempts=1),
    },
    tool_activity_config={
        **faults.durable_tool_activity_config,
        **subagents.durable_tool_activity_config,
        **ptc.durable_tool_activity_config,
    },
)
```

Most tools should become generated Temporal activities. Two tools are different:
spawn starts a child workflow, and run_ptc orchestrates inner tool calls
inside the parent workflow. Mark those wrappers with False:
```python
tool_activity_config = {
    "subagents": {"spawn": False},
    "ptc": {"run_ptc": False},
}
```

Their inner work still uses the normal durable transport. A `write_file(...)`
inside run_ptc is a sandbox activity. A spawn(...) inside run_ptc is a
child workflow.
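Conceptually, each inner call rides the same transport a direct call would use; only the orchestration stays in the workflow. A toy dispatcher makes the routing concrete (all function bodies here are stand-ins, not the repo's actual code):

```python
import asyncio

# Stand-in transports: what each durable mechanism would do in the real system.
async def _as_activity(tool: str, **kwargs) -> str:
    return f"activity:{tool}"          # stand-in for a generated Temporal activity

async def _as_child_workflow(tool: str, **kwargs) -> str:
    return f"child-workflow:{tool}"    # stand-in for starting a child workflow

# Each tool name binds to the transport a direct call would use.
TRANSPORTS = {"write_file": _as_activity, "spawn": _as_child_workflow}

async def ptc_call(tool: str, **kwargs) -> str:
    # run_ptc itself stays in the workflow; only the inner call uses a transport.
    return await TRANSPORTS[tool](tool, **kwargs)

print(asyncio.run(ptc_call("write_file", path="a.txt")))  # activity:write_file
print(asyncio.run(ptc_call("spawn", task="summarize")))   # child-workflow:spawn
```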
Capability tools are ordinary async Python functions. If they are not explicitly marked workflow-native, Pydantic AI's Temporal plugin runs them as generated activities:
```python
from dataclasses import dataclass
from typing import cast

from pydantic_ai import RunContext
from pydantic_ai.capabilities import AbstractCapability
from pydantic_ai.toolsets import AgentToolset
from pydantic_ai.toolsets.function import FunctionToolset

from durable_agents.capabilities import DemoDeps, sandbox_path


@dataclass
class NotesCapability(AbstractCapability[DemoDeps]):
    toolset_id: str = "notes"

    def get_toolset(self) -> AgentToolset[DemoDeps]:
        async def save_note(ctx: RunContext[DemoDeps], title: str, body: str) -> str:
            target = sandbox_path(ctx.deps.sandbox_dir, f"{title}.txt")
            target.write_text(body)
            return str(target)

        return cast(AgentToolset[DemoDeps], FunctionToolset([save_note], id=self.toolset_id))
```

The model calls `spawn`. The runtime starts a child `DemoAgentWorkflow` on the
same task queue with PARENT_CLOSE_POLICY_TERMINATE:
```
{
  "tool_name": "spawn",
  "args": {
    "agent_name": "generic",
    "task": "Read the sandbox files and summarize the plan.",
  },
}
```

Use `run_ptc` when the model needs loops, branches, fanout, or aggregation over
existing tools:
```
{
  "tool_name": "run_ptc",
  "args": {
    "code": """
results = await asyncio.gather(
    write_file(path="notes/a.txt", content="alpha"),
    spawn(agent_name="generic", task="Summarize alpha."),
)
results
""".strip(),
  },
}
```

`run_ptc` itself is not an activity. Its inner `write_file(...)` call is still a
generated activity, and its inner spawn(...) call is still a child workflow.
- model step -> generated Temporal activity
- normal capability tool -> generated Temporal activity
- subagent spawn -> child `DemoAgentWorkflow` on the same task queue
- programmatic tool calling -> workflow-native `PtcCapability`, no `run_ptc` activity
- PTC inner tool/spawn call -> the same activity/child workflow as a direct call
Transient infrastructure failures retry as Temporal activities. Validation and known execution failures come back to the model as repairable feedback. Parent cancellation propagates to active child work and does not trigger another model turn.
Tool errors are routed by class in `src/durable_agents/errors.py`:
`ToolValidationError` surfaces as repairable model feedback, `ToolExecutionError`
as a known terminal failure, and `ToolRetryableError` opts the activity into its
Temporal retry policy.
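As a rough illustration of that routing (not the repo's actual code; the class shapes and the `route_tool_error` helper are assumptions), the dispatch could look like:

```python
# Hypothetical sketch of class-based error routing; the real classes live in
# src/durable_agents/errors.py and their exact shapes are assumptions here.
class ToolValidationError(Exception):
    """Repairable: surfaced back to the model as feedback."""

class ToolExecutionError(Exception):
    """Known terminal failure: ends the tool call without retries."""

class ToolRetryableError(Exception):
    """Transient: re-raised so Temporal's activity retry policy applies."""

def route_tool_error(exc: Exception) -> str:
    if isinstance(exc, ToolValidationError):
        return f"model-feedback: {exc}"  # model sees this and can repair its call
    if isinstance(exc, ToolExecutionError):
        return f"terminal: {exc}"        # reported once, no infrastructure retry
    raise exc                            # retryable and unknown errors hit Temporal retry

print(route_tool_error(ToolValidationError("path escapes sandbox")))
```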
Each scenario in tests/helpers/scenarios.py declares a scripted model route,
expected sandbox side effects, and expected Temporal history shape. The matrix
covers:
- sandbox and fault tools are generated activities;
- `run_ptc` and `spawn` stay workflow-native;
- PTC inner calls use the target tool's activity or child-workflow transport;
- retryable activity failures retry by Temporal policy;
- validation failures are model-visible and not infrastructure retries;
- child workflow failures bubble back as model-visible spawn failures;
- PTC parallel branches cancel in-flight siblings on first failure;
- parent cancellation interrupts direct, child, and PTC-inner work.
See tests/README.md for the full scenario catalog.
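A scenario declaration of that shape could look like the following. This is a hypothetical sketch; the field names and the `Scenario` class are assumptions, and the real definitions live in tests/helpers/scenarios.py:

```python
from dataclasses import dataclass

@dataclass
class Scenario:
    # Hypothetical shape; field names are assumptions, not the repo's actual API.
    name: str
    model_script: list[dict]        # scripted model turns: tool calls, then a final answer
    expected_files: dict[str, str]  # sandbox path -> expected content
    expected_history: list[str]     # Temporal event kinds the test asserts on

ptc = Scenario(
    name="ptc",
    model_script=[
        {"tool_name": "run_ptc",
         "args": {"code": 'await write_file(path="a.txt", content="alpha")'}},
        {"final": "done"},
    ],
    expected_files={"a.txt": "alpha"},
    expected_history=["ActivityTaskScheduled"],  # inner write_file runs as an activity
)
print(ptc.name)
```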
- `src/durable_agents/capabilities/`: the capability package. `sandbox.py`, `subagents.py`, `ptc.py`, and `faults.py` hold the behavior; `__init__.py` wires the default demo set together.
- `src/durable_agents/agent.py`: builds the Pydantic AI `Agent`, wraps it with `TemporalAgent`, and marks `run_ptc` and `spawn` as workflow-native.
- `tests/helpers/scenarios.py`: the scenario matrix: model scripts plus the expected Temporal graph and sandbox files.
- `tests/README.md`: the test contract in prose.
- `scripts/migration_demo.py` / `scripts/migration_fuzz.py`: see Cross-worker migration.
scripts/migration_demo.py and scripts/migration_fuzz.py start two
subprocess.Popen workers — each its own Python interpreter and SDK client —
polling one task queue, then run a parent workflow that fans out subagents.
They print which worker identity completed each workflow's tasks, so you can
see when a child lands on a different worker than the parent.
```sh
uv run python -m scripts.migration_demo
uv run python -m scripts.migration_fuzz --iterations 10
```

`migration_fuzz` generates random workflow shapes from three families and
reports per-family cross-process placement frequency:
- `direct-spawns`: parent emits N parallel `spawn` tool calls.
- `ptc-spawns`: parent emits one `run_ptc` that gathers N `spawn` calls.
- `ptc-mixed`: parent emits one `run_ptc` that gathers W `write_file` calls and N `spawn` calls.
```sh
uv run python -m scripts.migration_fuzz --iterations 20 --workers 3 --max-n 5
uv run python -m scripts.migration_fuzz --kind ptc-spawns --iterations 10
```

Both scripts require the Temporal CLI on PATH. Temporal does not guarantee
fair dispatch across pollers, so any single run may keep all work on one
worker; the fuzzer repeats and reports how often a child landed on a different
worker than the parent.
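The fuzzer's report boils down to comparing worker identities between the parent and its children; a toy version of that bookkeeping (the data layout and function name are assumptions, not the script's actual code):

```python
def cross_process_rate(runs: list[dict[str, str]]) -> float:
    """Fraction of runs where any child completed on a different worker than the parent.

    Each run maps a workflow label to the worker identity that completed it;
    "parent" is the parent workflow. Toy bookkeeping, not the fuzzer's real code.
    """
    migrated = sum(
        1
        for run in runs
        if any(worker != run["parent"] for wf, worker in run.items() if wf != "parent")
    )
    return migrated / len(runs)

runs = [
    {"parent": "worker-1", "child-0": "worker-1", "child-1": "worker-2"},  # migrated
    {"parent": "worker-1", "child-0": "worker-1"},                         # stayed put
]
print(cross_process_rate(runs))  # 0.5
```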
This repo omits a few things the production setup has: policy gates, platform
adapters, controlled worker-failover drills, and detached spawn/wait
handles.
MIT. Take, copy, modify, and reuse the code freely. See LICENSE.