# GPT-4 Real Demo: Plan-and-Act + Real Tools

Navigation links:
- [Reading guide](../READING_GUIDE.md)
- [Project README](../README.md)
- [Reproduction plan](../REPRODUCTION_PLAN.md)
- [Architecture guide](../AGENT_FRAMEWORK_ARCHITECTURE.md)
- [Paper review](../PLAN_AND_ACT_review.md)

This notebook demonstrates:
1. Real GPT-4 calls through the Planner, Executor, and Replanner agents.
2. Real tools that need no model API key (`web_search`, `fetch_url`, `calculator`, `github_top_contributor`).
3. One end-to-end episode on the generic `tool` environment, run with deterministic heuristic agents.


## 1) Setup

- Locate the project root and load its `.env` file.
- Verify that `OPENAI_API_KEY` is set, since the GPT-4 cells require it.
- Add `src/` to `sys.path` for local imports.


In [1]:
from __future__ import annotations

import os
import sys
from pathlib import Path
from dotenv import load_dotenv

cwd = Path.cwd().resolve()
PROJECT_ROOT = cwd if (cwd / "src").exists() else cwd.parent
if not (PROJECT_ROOT / "src").exists():
    raise RuntimeError("Could not locate project root with src/ directory")

if str(PROJECT_ROOT / "src") not in sys.path:
    sys.path.insert(0, str(PROJECT_ROOT / "src"))

load_dotenv(PROJECT_ROOT / ".env")

HAS_KEY = bool((os.getenv("OPENAI_API_KEY") or "").strip())
MODEL_NAME = os.getenv("OPENAI_MODEL", "gpt-4")

print("Project root:", PROJECT_ROOT)
print("OPENAI_API_KEY detected:", HAS_KEY)
print("OPENAI_MODEL:", MODEL_NAME)

if not HAS_KEY:
    raise RuntimeError("OPENAI_API_KEY is required for this GPT-4 demo notebook.")


Project root: /Users/admin/TuanDung/paper_implementation/plan_and_act_repro
OPENAI_API_KEY detected: True
OPENAI_MODEL: gpt-4


## 2) Inspect Core Modules and Interfaces

Quick inspection of schemas and signatures before running the real demo.


In [2]:
import inspect
import importlib

core_modules = [
    "plan_and_act.core.types",
    "plan_and_act.core.schemas",
    "plan_and_act.core.state",
    "plan_and_act.environments.base",
    "plan_and_act.tools.base",
]

for module_name in core_modules:
    mod = importlib.import_module(module_name)
    print(f"\n=== {module_name} ===")
    symbols = [n for n in dir(mod) if not n.startswith("_")]
    print("symbols:", symbols)

from plan_and_act.core.schemas import PlanStep
from plan_and_act.agents.planner import PlannerAgent
from plan_and_act.agents.executor import ExecutorAgent
from plan_and_act.agents.replanner import ReplannerAgent

print("\nPlannerAgent.plan:", inspect.signature(PlannerAgent.plan))
print("ExecutorAgent.act:", inspect.signature(ExecutorAgent.act))
print("ReplannerAgent.replan:", inspect.signature(ReplannerAgent.replan))
print("PlanStep fields:", list(PlanStep.model_fields))



=== plan_and_act.core.types ===
symbols: ['ActionType', 'BaseModel', 'Field', 'Literal', 'ModelConfig', 'RuntimeConfig', 'annotations']

=== plan_and_act.core.schemas ===
symbols: ['ActionType', 'Any', 'BaseModel', 'EpisodeArtifact', 'ExecutorAction', 'Field', 'PlanStep', 'PlannerOutput', 'annotations']

=== plan_and_act.core.state ===
symbols: ['Any', 'PlanActState', 'TypedDict', 'annotations', 'build_initial_state']

=== plan_and_act.environments.base ===
symbols: ['EnvironmentAdapter', 'EnvironmentStepResult', 'ExecutorAction', 'Protocol', 'annotations', 'dataclass', 'field']

=== plan_and_act.tools.base ===
symbols: ['Any', 'Protocol', 'Tool', 'ToolRegistry', 'annotations', 'dataclass']

PlannerAgent.plan: (self, *, goal: 'str', observation: 'str', action_history: 'list[dict[str, Any]]', use_cot: 'bool') -> 'PlannerOutput'
ExecutorAgent.act: (self, *, goal: 'str', current_step: 'PlanStep', observation: 'str', step_index: 'int', total_steps: 'int', use_cot: 'bool') -> 'ExecutorActi

## 3) Real GPT-4 Planner/Executor/Replanner Calls

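This cell calls each of the three agents once with GPT-4: the planner produces a plan, the executor acts on the plan's first step, and the replanner revises the plan given a simulated observation. Each call is wrapped in `safe_call`, so a failed API call is reported in the result instead of stopping the cell.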


In [3]:
from plan_and_act.core.types import ModelConfig
from plan_and_act.prompts.templates import PromptTemplates

prompts = PromptTemplates(config_dir=str(PROJECT_ROOT / "configs" / "prompts"))
openai_cfg = ModelConfig(provider="openai", model=MODEL_NAME, temperature=0.0)

planner = PlannerAgent(openai_cfg, prompts)
executor = ExecutorAgent(openai_cfg, prompts)
replanner = ReplannerAgent(openai_cfg, prompts)

goal = "Find the top contributor of openai/openai-python and return their profile URL"


def safe_call(name: str, fn):
    """Run fn() and return an ok/value dict, or an ok/error dict tagged with name."""
    try:
        return {"ok": True, "value": fn()}
    except Exception as exc:
        return {"ok": False, "error": f"{name}: {type(exc).__name__}: {exc}"}

planner_res = safe_call(
    "planner",
    lambda: planner.plan(
        goal=goal,
        observation="Tool environment with web_search, fetch_url, calculator, github_top_contributor",
        action_history=[],
        use_cot=False,
    ),
)

if planner_res["ok"] and planner_res["value"].steps:
    first_step = planner_res["value"].steps[0]
else:
    first_step = PlanStep(step_id=1, intent="Search top contributor", success_criteria="Contributor identified")

executor_res = safe_call(
    "executor",
    lambda: executor.act(
        goal=goal,
        current_step=first_step,
        observation="No action executed yet.",
        step_index=0,
        total_steps=max(1, len(planner_res["value"].steps) if planner_res["ok"] else 1),
        use_cot=False,
    ),
)

replanner_res = safe_call(
    "replanner",
    lambda: replanner.replan(
        goal=goal,
        previous_plan=[s.model_dump() for s in planner_res["value"].steps] if planner_res["ok"] else [],
        action_history=[executor_res["value"].model_dump()] if executor_res["ok"] else [],
        observation="Tool result: top contributor is known.",
        use_cot=False,
    ),
)

{
    "planner": planner_res["value"].model_dump() if planner_res["ok"] else planner_res,
    "executor": executor_res["value"].model_dump() if executor_res["ok"] else executor_res,
    "replanner": replanner_res["value"].model_dump() if replanner_res["ok"] else replanner_res,
}


{'planner': {'goal': 'Find the top contributor of openai/openai-python and return their profile URL',
  'steps': [{'step_id': 1,
    'intent': 'Use the github_top_contributor tool to find the top contributor of the openai/openai-python repository',
    'success_criteria': "Successfully retrieved the top contributor's username"},
   {'step_id': 2,
    'intent': "Construct the GitHub profile URL using the top contributor's username",
    'success_criteria': 'Successfully constructed the GitHub profile URL'},
   {'step_id': 3,
    'intent': 'Return the GitHub profile URL of the top contributor',
    'success_criteria': 'Successfully returned the GitHub profile URL of the top contributor'}]},
 'executor': {'action_type': 'search',
  'target': 'github_top_contributor',
  'arguments': {'repository': 'openai/openai-python'},
  'rationale': 'To find the top contributor of the openai/openai-python repository, we need to use the github_top_contributor tool.',
  'is_final': False,
  'final_answer

## 4) Real Built-in Tools Demo (No Model Key Needed)

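These tool calls need no model API key. `web_search`, `fetch_url`, and `github_top_contributor` still need network access; `calculator` runs locally.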
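
A calculator tool can work without any model because the expression is evaluated locally. The block below is a minimal sketch of one way to do that safely with Python's `ast` module. It is not the project's implementation: `safe_calculate`, the operator whitelist, and the `error` key are hypothetical, and only the `ok` / `expression` / `result` keys are taken from the output shown in this notebook.

```python
import ast
import math
import operator

# Hypothetical whitelist; the project's calculator may support a different set.
_BIN_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}
_UNARY_OPS = {ast.USub: operator.neg, ast.UAdd: operator.pos}
_FUNCS = {"sqrt": math.sqrt}


def _eval(node: ast.AST) -> float:
    """Recursively evaluate an AST node, rejecting anything off the whitelist."""
    if (
        isinstance(node, ast.Constant)
        and isinstance(node.value, (int, float))
        and not isinstance(node.value, bool)
    ):
        return node.value
    if isinstance(node, ast.BinOp) and type(node.op) in _BIN_OPS:
        return _BIN_OPS[type(node.op)](_eval(node.left), _eval(node.right))
    if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
        return _UNARY_OPS[type(node.op)](_eval(node.operand))
    if (
        isinstance(node, ast.Call)
        and isinstance(node.func, ast.Name)
        and node.func.id in _FUNCS
        and not node.keywords
    ):
        return _FUNCS[node.func.id](*[_eval(arg) for arg in node.args])
    raise ValueError(f"Unsupported expression element: {type(node).__name__}")


def safe_calculate(expression: str) -> dict:
    """Evaluate an arithmetic expression and return an ok-flagged result dict."""
    try:
        tree = ast.parse(expression, mode="eval")
        return {"ok": True, "expression": expression, "result": float(_eval(tree.body))}
    except Exception as exc:
        return {"ok": False, "expression": expression, "error": f"{type(exc).__name__}: {exc}"}


print(safe_calculate("(42 * 13) / 7 + sqrt(81)"))
```

Walking the parsed tree instead of calling `eval` means that names, attribute access, and calls to anything other than the whitelisted functions are rejected with `ok: False`.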


In [4]:
from plan_and_act.tools.factory import build_default_tool_registry

registry = build_default_tool_registry()

search_out = registry.call("web_search", {"query": "plan and act llm agents", "max_results": 3})
fetch_out = registry.call("fetch_url", {"url": "https://arxiv.org/abs/2503.09572v3", "max_chars": 400})
calc_out = registry.call("calculator", {"expression": "(42 * 13) / 7 + sqrt(81)"})
gh_out = registry.call("github_top_contributor", {"owner": "openai", "repo": "openai-python"})

{
    "search_ok": search_out.get("ok"),
    "search_count": search_out.get("count"),
    "fetch_ok": fetch_out.get("ok"),
    "fetch_title": fetch_out.get("title"),
    "calc_out": calc_out,
    "github_out": gh_out,
}


{'search_ok': True,
 'search_count': 3,
 'fetch_ok': True,
 'fetch_title': '[2503.09572v3] Plan-and-Act: Improving Planning of Agents for Long-Horizon Tasks',
 'calc_out': {'ok': True,
  'expression': '(42 * 13) / 7 + sqrt(81)',
  'result': 87.0},
 'github_out': {'ok': True,
  'owner': 'openai',
  'repo': 'openai-python',
  'login': 'stainless-app[bot]',
  'contributions': 560,
  'profile_url': 'https://github.com/apps/stainless-app'}}
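
Note that the top contributor here is a bot account, and its `profile_url` lives under `/apps/` rather than `/<login>`. A plan step such as "construct the GitHub profile URL using the username" would therefore be wrong if it simply appended the login to `https://github.com/`. The sketch below illustrates the distinction. `profile_url_from_login` is a hypothetical helper, and the `[bot]` suffix rule is an assumption inferred from the output above; preferring the URL returned by the tool is the safer choice.

```python
BOT_SUFFIX = "[bot]"


def profile_url_from_login(login: str) -> str:
    """Build a profile URL from a login (hypothetical helper, not project code).

    Assumption: logins ending in "[bot]" belong to GitHub Apps, whose pages
    live under /apps/<slug>, as in the github_top_contributor output above.
    """
    if login.endswith(BOT_SUFFIX):
        slug = login[: -len(BOT_SUFFIX)]
        return f"https://github.com/apps/{slug}"
    return f"https://github.com/{login}"


print(profile_url_from_login("stainless-app[bot]"))
```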

## 5) End-to-End Episode on Generic Tool Environment (Deterministic Runtime)

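To keep notebook execution stable across runs, this end-to-end loop uses deterministic agents (`provider="heuristic"`) instead of GPT-4.
The real GPT-4 calls were already exercised in Section 3. Because the heuristic agents do not call a model, the run below reports a completion message as `final_answer` rather than the actual profile URL.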


In [6]:
from plan_and_act.core.state import build_initial_state
from plan_and_act.core.types import ModelConfig
from plan_and_act.environments.factory import build_environment
from plan_and_act.eval.metrics import compute_episode_metrics
from plan_and_act.graph.workflow import build_workflow

# Deterministic loop for robust notebook execution.
heuristic_cfg = ModelConfig(provider="heuristic", model=MODEL_NAME, temperature=0.0)
planner_loop = PlannerAgent(heuristic_cfg, prompts)
executor_loop = ExecutorAgent(heuristic_cfg, prompts)
replanner_loop = ReplannerAgent(heuristic_cfg, prompts)

env = build_environment("tool")
workflow = build_workflow(planner_loop, executor_loop, replanner_loop, env)

initial_state = build_initial_state(
    goal=goal,
    max_steps=4,
    dynamic_replanning=True,
    use_cot=False,
    observation=env.reset(goal=goal),
)

final_state = workflow.invoke(initial_state)
metrics = compute_episode_metrics(final_state)

{
    "success": final_state.get("success"),
    "step_count": final_state.get("step_count"),
    "final_answer": final_state.get("final_answer"),
    "latest_observation": final_state.get("observation"),
    "metrics": metrics,
}


{'success': True,
 'step_count': 2,
 'final_answer': 'Completed goal: Find the top contributor of openai/openai-python and return their profile URL',
 'latest_observation': 'Step 2: Exit action requested.',
 'metrics': {'task_success': True,
  'step_count': 2,
  'actions_taken': 2,
  'replans': 1}}
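
The control flow behind an episode like this one (plan, act, optionally replan, stop on a final action or a step budget) can be sketched with stub agents. This is an illustrative sketch only, not the graph built by `build_workflow`: `LoopState`, `stub_plan`, `stub_act`, `stub_replan`, and `run_episode` are all hypothetical names, and the project's state carries more fields than shown here.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class LoopState:
    """Hypothetical, simplified episode state."""
    goal: str
    max_steps: int
    plan: list[str] = field(default_factory=list)
    history: list[str] = field(default_factory=list)
    replans: int = 0
    success: bool = False


def stub_plan(goal: str) -> list[str]:
    # Deterministic stand-in for a planner: always the same two steps.
    return ["lookup", "exit"]


def stub_act(step: str) -> tuple[str, bool]:
    # Deterministic stand-in for an executor: returns (observation, is_final).
    return f"executed {step}", step == "exit"


def stub_replan(remaining: list[str], observation: str) -> list[str]:
    # Deterministic stand-in for a replanner: keeps the remaining steps as-is.
    return list(remaining)


def run_episode(goal: str, max_steps: int = 4, dynamic_replanning: bool = True) -> LoopState:
    state = LoopState(goal=goal, max_steps=max_steps, plan=stub_plan(goal))
    while state.plan and len(state.history) < state.max_steps:
        step = state.plan.pop(0)
        observation, is_final = stub_act(step)
        state.history.append(observation)
        if is_final:
            state.success = True
            break
        if dynamic_replanning:
            # Revise the remaining plan after every non-final action.
            state.plan = stub_replan(state.plan, observation)
            state.replans += 1
    return state


result = run_episode("demo goal")
print(result.success, len(result.history), result.replans)
```

The step budget guards against plans that never reach a final action, which is the role `max_steps` plays in the cell above.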

## 6) Notes

- This notebook is designed for real execution: Section 3 calls GPT-4 and Section 4 calls live tools, while Section 5 runs deterministically.
- For CI or offline checks, run the unit tests and the no-key tool demo script instead:
  - `pytest -q`
  - `./scripts/run_real_tools_demo.sh`
