# OpenTools Agents Demo: ZeroShot, CoT, ReAct, OctoTools, OpenTools

This notebook demonstrates how to run different OpenTools agents via `UnifiedSolver` and compare their behavior.

**Table of contents**

1. Setup
2. Agent demos
   - 2.1 ZeroShot (no tools, trace)
   - 2.2 Chain-of-Thought (no tools, trace)
   - 2.3 ReAct (simple task + tool-using task)
   - 2.4 OctoTools (simple task + tool-using task)
   - 2.5 OpenTools (tool-using multi-step task)


## 1. Setup

Run this section once to configure imports and helper utilities before running the agent demos below.

In [1]:
import os
import json
import sys
sys.path.insert(0, "..")
sys.path.insert(0, "src")
from opentools import UnifiedSolver



### Agent types and what to look for

This section summarizes each agent style **before** you see its trace. At a high level:

- **ZeroShot**: Single LLM call, **no tools**, no explicit planning. Fast and cheap, but can be brittle on harder tasks. The result dictionary mainly has a `direct_output` string.
- **Chain-of-Thought (CoT)**: Still **no tools**, but the LLM is prompted to reason step-by-step. You will usually see a longer textual reasoning trace and a final answer in `direct_output`.
- **ReAct**: Alternates between **Thought → Action (tool) → Observation**. The logs show multiple "ReAct Reasoning Cycle" steps, tool calls, and observations, followed by a **Final Answer** section.
- **OctoTools**: Uses a more structured **plan → tool calls → memory** loop. The trace is broken into named steps such as *Query Analysis*, *Action Prediction*, *Command Generation*, *Command Execution*, and *Context Verification*, plus a final summary.
- **OpenTools (more detailed)**: Think of this as a **small team of sub-agents** working together:
  1. **Input**: You pass a single `question` (and optional `image_path` / extra kwargs) into `UnifiedSolver(..., agent_name="opentools", ...)`.
  2. **Reasoner sub-agent**: Reads the question (and global memory) and breaks it into a **sub-problem** like “find this paper” or “compute this quantity and explain it”. This shows up in the logs as `sub_problem=...`.
  3. **Tool-call generator**: Given that sub-problem, it decides **which tools** to call (e.g. `Arxiv_Paper_Search_Tool`, `Search_Engine_Tool`, `Visual_AI_Tool`) and with what arguments. You will see a list of `Generated Tool calls: [...]`.
  4. **Executor**: Actually runs those tools and records raw results (JSON, text, etc.). This is where you see `Executed Tool calls: [...]` with concrete outputs.
  5. **Verifier**: Checks whether the tool results really answer the sub-problem, may clean up the output, and writes a concise summary.
  6. **Global memory**: The verified summary is stored in a shared memory dict (keyed by step index). Later reasoning cycles can **read this memory** instead of re-calling tools.
  7. **Loop + stopping**: Steps 2–6 repeat, using the growing memory, until the reasoner marks `stop=True`.
  8. **Final answer**: A final summarization step reads from global memory and produces a **single, user-facing answer**. In Python, you’ll see this in `result["direct_output"]` or `result["final_output"]`, while the long trace (logs + memory) explains how that answer was constructed.
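The eight steps above can be sketched as a small control loop. This is a toy illustration, not the OpenTools implementation: `run_loop`, `LoopState`, and every `toy_*` name are hypothetical, the tools are plain Python callables, and the final "summarization" is just a string join over memory.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

# Hypothetical types for illustration only; these are not OpenTools classes.
ToolCall = Tuple[str, str]  # (tool name, argument)


@dataclass
class LoopState:
    question: str
    # Verified summaries keyed by step index (the "global memory").
    memory: Dict[int, str] = field(default_factory=dict)


def run_loop(
    question: str,
    reasoner: Callable[[LoopState], Tuple[str, bool]],
    generator: Callable[[str], List[ToolCall]],
    tools: Dict[str, Callable[[str], str]],
    verifier: Callable[[str, List[str]], str],
    max_steps: int = 5,
) -> Dict[str, object]:
    state = LoopState(question)
    for step in range(max_steps):
        # Reasoner: pick the next sub-problem, or signal that we are done.
        sub_problem, stop = reasoner(state)
        if stop:
            break
        # Tool-call generator: decide which tools to call and with what input.
        calls = generator(sub_problem)
        # Executor: run each tool and keep the raw outputs.
        outputs = [tools[name](arg) for name, arg in calls]
        # Verifier: condense the outputs, then store the summary in memory.
        state.memory[step] = verifier(sub_problem, outputs)
    # Stand-in for the final summarization step: join the stored summaries.
    final = " ".join(state.memory[k] for k in sorted(state.memory))
    return {"direct_output": final, "memory": state.memory}


def toy_reasoner(state: LoopState) -> Tuple[str, bool]:
    if len(state.memory) == 0:
        return "find the paper", False
    if len(state.memory) == 1:
        return "find the first author's affiliation", False
    return "", True  # two summaries collected, so stop


def toy_generator(sub_problem: str) -> List[ToolCall]:
    return [("lookup", sub_problem)]


def toy_verifier(sub_problem: str, outputs: List[str]) -> str:
    return f"[{sub_problem}] {len(outputs)} tool output(s) checked."


toy_tools = {"lookup": lambda arg: f"raw result for: {arg}"}

result = run_loop("toy question", toy_reasoner, toy_generator, toy_tools, toy_verifier)
print(result["direct_output"])
print(result["memory"])
```

The point of the sketch is the data flow: later cycles see only the verified summaries in `memory`, not the raw tool outputs, which is why the reasoner can stop as soon as memory holds enough to answer.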

If you only care about the answer, focus on `direct_output` / `final_output`. If you want to understand *how* the agent solved the problem, scroll up through the OpenTools log and follow the numbered reasoning cycles and memory updates.
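A small helper can make the "answer only" path explicit. The sketch below assumes only what the text states, namely that the result is dictionary-like and may carry `direct_output` and/or `final_output`; the helper names `extract_answer` and `preview_result` are hypothetical and not part of OpenTools.

```python
import json
from typing import Any, Mapping, Optional


def extract_answer(result: Mapping[str, Any]) -> Optional[str]:
    """Return the first non-empty answer field, or None if neither is set."""
    for key in ("direct_output", "final_output"):
        value = result.get(key)
        if value:
            return str(value)
    return None


def preview_result(result: Mapping[str, Any], max_chars: int = 300) -> str:
    """Render the whole result as JSON, truncated for quick inspection."""
    # default=str keeps the dump from failing on non-JSON-serializable values.
    text = json.dumps(dict(result), indent=2, default=str)
    return text if len(text) <= max_chars else text[:max_chars] + "..."


example = {"direct_output": "Washington, D.C.", "extra": {"api_calls": 1}}
print(extract_answer(example))
print(preview_result(example))
```

In the demo cells this would be called as, for example, `extract_answer(zs_result)`.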

## 2. Agent demos

Each subsection below runs the same (or similar) task with a different agent so you can compare behavior and traces.
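To compare agents on one question without repeating boilerplate, a loop over agent names is enough. The sketch below takes the solver class as a parameter so it runs without API access; `compare_agents` and `EchoSolver` are hypothetical names introduced here, and the only interface assumed is the one used in this notebook: a constructor taking `agent_name` plus keyword arguments, and `solve(question=...)` returning a dict with `direct_output`.

```python
from typing import Any, Callable, Dict, Iterable


def compare_agents(
    solver_factory: Callable[..., Any],
    agent_names: Iterable[str],
    question: str,
    **solver_kwargs: Any,
) -> Dict[str, Any]:
    """Run the same question through several agents and collect the answers."""
    answers: Dict[str, Any] = {}
    for name in agent_names:
        solver = solver_factory(agent_name=name, **solver_kwargs)
        result = solver.solve(question=question)
        answers[name] = result["direct_output"]
    return answers


class EchoSolver:
    """Stand-in solver so the sketch is runnable offline; not an OpenTools class."""

    def __init__(self, agent_name: str, **kwargs: Any) -> None:
        self.agent_name = agent_name

    def solve(self, question: str) -> Dict[str, str]:
        return {"direct_output": f"{self.agent_name}: {question}"}


answers = compare_agents(EchoSolver, ["zero_shot", "chain_of_thought"], "2 + 2?")
for agent, answer in answers.items():
    print(f"{agent:>18} | {answer}")
```

In this notebook you would pass `UnifiedSolver` in place of `EchoSolver`, along with keyword arguments such as `llm_engine_name="gpt-4o-mini"` and `verbose=False`.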

### 2.1 ZeroShot (no tools)

A fast, single-shot answer from the LLM with no tools enabled. We print the `direct_output` field; with `verbose=True`, the log also shows the raw LLM response and token usage.

In [2]:
zero_shot_solver = UnifiedSolver(
    agent_name="zero_shot",
    llm_engine_name="gpt-4o-mini",
    verbose=True,
)

zs_question = "What is the capital of United States?"
zs_result = zero_shot_solver.solve(question=zs_question)

print("ZeroShot (no tools):", zs_result["direct_output"])

[19:17:20][ZeroShotLLM][INFO] Initializing ZeroShot LLM agent...
[19:17:21][ZeroShotLLM][INFO] ZeroShot LLM agent initialized successfully
UnifiedSolver initialized with agent: ZeroShotLLM
Agent description: Zero-shot LLM responses without tools - fast and simple
[19:17:21][ZeroShotLLM][INFO] Received question: What is the capital of United States?
[19:17:21][ZeroShotLLM][INFO] Generating direct LLM response...
[19:17:22][ZeroShotLLM][INFO] LLM Response:
The capital of the United States is Washington, D.C.
[19:17:22][ZeroShotLLM][INFO] Token Usage Summary:
[19:17:22][ZeroShotLLM][INFO]   Total tokens: 43
[19:17:22][ZeroShotLLM][INFO]   Prompt tokens: 30
[19:17:22][ZeroShotLLM][INFO]   Completion tokens: 13
[19:17:22][ZeroShotLLM][INFO]   API calls: 1
[19:17:22][ZeroShotLLM][INFO] Completed in 1.36 seconds
ZeroShot (no tools): The capital of the United States is Washington, D.C.


### 2.2 Chain-of-Thought (no tools)

Chain-of-Thought encourages the model to reason step by step. This example uses a small arithmetic word problem: the verbose log shows the reasoning, and the final answer is printed from `direct_output`.

In [3]:
cot_solver = UnifiedSolver(
    agent_name="chain_of_thought",
    llm_engine_name="gpt-4o-mini",
    verbose=True,
)

cot_question = (
    "Solve step by step: A store sells apples for $2 each and bananas for $1 each. "
    "If I buy 3 apples and 4 bananas, how much do I pay in total?"
)
cot_result = cot_solver.solve(question=cot_question)

print("Chain-of-Thought (no tools):", cot_result["direct_output"])

[19:17:22][ChainOfThought][INFO] Initializing Chain of Thought agent...
[19:17:22][ChainOfThought][INFO] Chain of Thought agent initialized successfully
UnifiedSolver initialized with agent: ChainOfThought
Agent description: Step-by-step reasoning without tools - good for complex logic problems
[19:17:22][ChainOfThought][INFO] Received question: Solve step by step: A store sells apples for $2 each and bananas for $1 each. If I buy 3 apples and 4 bananas, how much do I pay in total?
[19:17:22][ChainOfThought][INFO] Generating step-by-step reasoning...
[19:17:29][ChainOfThought][INFO] Chain of Thought Reasoning:
To solve the problem step by step, we need to calculate the total cost of the apples and bananas separately and then add them together.

1. **Determine the cost of the apples:**
   - The price of one apple is $2.
   - You are buying 3 apples.
   - To find the total cost for the apples, multiply the number of apples by the price per apple:
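As an independent check on the answer the agent should reach, the arithmetic in the question can be computed directly. This snippet is not agent output; it only restates the prices and quantities from `cot_question`.

```python
# Prices and quantities taken from the question text.
prices = {"apple": 2, "banana": 1}   # dollars per item
basket = {"apple": 3, "banana": 4}   # items bought

# Cost per item type, then the overall total.
subtotals = {item: prices[item] * qty for item, qty in basket.items()}
total = sum(subtotals.values())

print(subtotals)  # {'apple': 6, 'banana': 4}
print(total)      # 10
```

A correct Chain-of-Thought trace should therefore end at a total of $10.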


### 2.3 ReAct agent

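ReAct alternates between **Thought → Action (tool) → Observation**. The first cell asks a question about an image using `Visual_AI_Tool`; the second looks up a paper's authors using the search tools.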
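The Thought, Action, Observation cycle can be sketched in a few lines. This is a toy version under stated assumptions, not the OpenTools ReAct agent: `react_loop`, `toy_policy`, and the `count_chars` tool are hypothetical, and a hand-written policy function stands in for the LLM.

```python
from typing import Callable, Dict, List, Optional, Tuple

# One policy decision: (thought, action name or None, action input).
# When the action is None, the third field is the final answer.
Decision = Tuple[str, Optional[str], str]


def react_loop(
    policy: Callable[[str, List[str]], Decision],
    tools: Dict[str, Callable[[str], str]],
    question: str,
    max_cycles: int = 5,
) -> str:
    history: List[str] = []
    for cycle in range(1, max_cycles + 1):
        thought, action, action_input = policy(question, history)
        history.append(f"Thought {cycle}: {thought}")
        if action is None:
            return action_input  # final answer, no further tool calls
        observation = tools[action](action_input)
        history.append(f"Action {cycle}: {action}({action_input!r})")
        history.append(f"Observation {cycle}: {observation}")
    return "No final answer within the cycle limit."


def toy_policy(question: str, history: List[str]) -> Decision:
    observations = [h for h in history if h.startswith("Observation")]
    if not observations:
        return "I need to count the characters.", "count_chars", question
    # Use the most recent observation as the answer.
    return "I have the count.", None, observations[-1].split(": ", 1)[1]


toy_tools = {"count_chars": lambda text: str(len(text))}

answer = react_loop(toy_policy, toy_tools, "hello")
print(answer)
```

The growing `history` list plays the role of the "ReAct Reasoning Cycle" entries in the logs: each cycle can read every earlier thought, action, and observation before deciding what to do next.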

In [4]:
# 2.3.2 ReAct with tools (image illusion)
react_tool_solver = UnifiedSolver(
    agent_name="react",
    llm_engine_name="gpt-4o-mini",
    verbose=True,
    enabled_tools=["Visual_AI_Tool"],
    output_types="direct",
)

react_tool_question = (
    "Look at the provided image and use tools as needed. "
    "Question: What color is the dog in the image, what breed is it, and what is the dog lying next to? "
    "Answer in one or two short sentences."
)

react_tool_result = react_tool_solver.solve(
    question=react_tool_question,
    image_path=r"../assets/image.jpg",
)

print("ReAct image", react_tool_result["direct_output"])

[19:17:29][ReAct][INFO] Initializing ReAct reasoning components 🧠...
[19:17:29][ReAct][INFO] ReAct reasoning components initialized successfully 🧠
[19:17:29][ReAct][INFO] Enable FAISS retrieval: False at ReAct
[19:17:29][ReAct][INFO] Enabled tools 🔧: ['Visual_AI_Tool']
[19:17:29][ReAct][INFO] Initializing tool-based agent components...
[19:17:29][ReAct][INFO] Initializing tool capabilities...
[19:17:32][ReAct][INFO] Available tools that is successfully loaded 🔧: ['Visual_AI_Tool']
[19:17:32][ReAct][INFO] Tool capabilities initialized successfully
[19:17:32][ReAct][INFO] FAISS tool retrieval disabled - using all available tools
[19:17:32][ReAct][INFO] Tool-based agent components initialized successfully
UnifiedSolver initialized with agent: ReAct
Agent description: Reasoning and Acting agent - alternates between thinking and tool usage
[19:17:32][ReAct][INFO] Received question: Look at the provided image and use tools as needed. Question: What color is the dog in the image, what breed is it, and what is the dog lying next to? Answer in one or two short sentences.
[19:17:32][ReAct][INFO] Received image: ../assets/image.jpg
[19:17:32][ReAct][INFO] Using all available tools: ['Visual_AI_Tool']
[19:17:32][ReAct][INFO] Starting ReAct reasoning and acting loop 💭...

Agent Step 1: ReAct Reasoning Cycle 1


In [5]:
react_tool_solver = UnifiedSolver(
    agent_name="react",
    llm_engine_name="gpt-4o-mini",
    verbose=True,
    enabled_tools=[
        "Search_Engine_Tool",
        "URL_Text_Extractor_Tool",
        "Arxiv_Paper_Search_Tool",  
    ],
    output_types="direct",
)

react_tool_question = "Who is the author of the paper 'OctoTools: An Agentic Framework with Extensible Tools for Complex Reasoning'"
react_tool_result = react_tool_solver.solve(question=react_tool_question)

print("ReAct: ", react_tool_result["direct_output"])

[19:17:48][ReAct][INFO] Initializing ReAct reasoning components 🧠...
[19:17:48][ReAct][INFO] ReAct reasoning components initialized successfully 🧠
[19:17:48][ReAct][INFO] Enable FAISS retrieval: False at ReAct
[19:17:48][ReAct][INFO] Enabled tools 🔧: ['Search_Engine_Tool', 'URL_Text_Extractor_Tool', 'Arxiv_Paper_Search_Tool']
[19:17:48][ReAct][INFO] Initializing tool-based agent components...
[19:17:48][ReAct][INFO] Initializing tool capabilities...
[19:17:48][ReAct][INFO] Available tools that is successfully loaded 🔧: ['Arxiv_Paper_Search_Tool', 'Search_Engine_Tool', 'URL_Text_Extractor_Tool']
[19:17:48][ReAct][INFO] Tool capabilities initialized successfully
[19:17:48][ReAct][INFO] FAISS tool retrieval disabled - using all available tools
[19:17:48][ReAct][INFO] Tool-based agent components initialized successfully
UnifiedSolver initialized with agent: ReAct
Agent description: Reasoni

### 2.4 OctoTools agent

OctoTools uses a **plan → execute** style with memory. We first ask a question about an image using `Visual_AI_Tool`, then look up a paper's authors using the search tools.

In [6]:
# 2.4.1 OctoTools with image illusion
octotools_simple_solver = UnifiedSolver(
    agent_name="octotools",
    llm_engine_name="gpt-4o-mini",
    verbose=True,
    enabled_tools=["Visual_AI_Tool"],
    output_types="direct",
)

octotools_simple_question = (
    "Look at the provided image and use tools as needed. "
    "Question: What color is the dog in the image, what breed is it, and what is the dog lying next to? "
    "Answer in one or two short sentences."
)

octotools_simple_result = octotools_simple_solver.solve(
    question=octotools_simple_question,
    image_path=r"../assets/image.jpg",
)

print("OctoTools image", octotools_simple_result["direct_output"])

[19:18:19][OctoTools][INFO] Enabled tools 🔧: ['Visual_AI_Tool']
[19:18:19][OctoTools][INFO] Initializing tool-based agent components...
[19:18:19][OctoTools][INFO] Initializing tool capabilities...
[19:18:19][OctoTools][INFO] Available tools that is successfully loaded 🔧: ['Visual_AI_Tool']
[19:18:19][OctoTools][INFO] Tool capabilities initialized successfully
[19:18:19][OctoTools][INFO] FAISS tool retrieval disabled - using all available tools
[19:18:19][OctoTools][INFO] Tool-based agent components initialized successfully
[19:18:19][OctoTools][INFO] Initializing OctoTools reasoning components...
[19:18:19][OctoTools][INFO] OctoTools reasoning components initialized successfully
UnifiedSolver initialized with agent: OctoTools
Agent description: Advanced tool-based agent with planning, memory, and step-by-step execution for complex tasks
[19:18:19][OctoTools][INFO] Received Query: Look at the pr

In [7]:
# 2.4.2 OctoTools with search tools (paper author lookup)
octotools_tool_solver = UnifiedSolver(
    agent_name="octotools",
    llm_engine_name="gpt-4o-mini",
    verbose=True,
    enabled_tools=[
        "Search_Engine_Tool",
        "URL_Text_Extractor_Tool",
        "Arxiv_Paper_Search_Tool",  
    ],
    output_types="direct",
)

octotools_tool_question = "Who is the author of the paper 'OctoTools: An Agentic Framework with Extensible Tools for Complex Reasoning'"
octotools_tool_result = octotools_tool_solver.solve(question=octotools_tool_question)

print("OctoTools (search tools):", octotools_tool_result)

[19:18:58][OctoTools][INFO] Enabled tools 🔧: ['Search_Engine_Tool', 'URL_Text_Extractor_Tool', 'Arxiv_Paper_Search_Tool']
[19:18:58][OctoTools][INFO] Initializing tool-based agent components...
[19:18:58][OctoTools][INFO] Initializing tool capabilities...
[19:18:58][OctoTools][INFO] Available tools that is successfully loaded 🔧: ['Arxiv_Paper_Search_Tool', 'Search_Engine_Tool', 'URL_Text_Extractor_Tool']
[19:18:58][OctoTools][INFO] Tool capabilities initialized successfully
[19:18:58][OctoTools][INFO] FAISS tool retrieval disabled - using all available tools
[19:18:58][OctoTools][INFO] Tool-based agent components initialized successfully
[19:18:58][OctoTools][INFO] Initializing OctoTools reasoning components...
[19:18:58][OctoTools][INFO] OctoTools reasoning components initialized successfully
UnifiedSolver initialized with agent: OctoTools
Agent description: Advanced tool-based agent with planning, 

### 2.5 OpenTools agent

OpenTools runs a multi-agent style loop with a reasoner, tool-call generator, executor, verifier, and shared memory. This example uses the search tools (`Arxiv_Paper_Search_Tool`, `Search_Engine_Tool`) and the web extraction tool (`URL_Text_Extractor_Tool`) on a slightly more involved, web-intensive task.

In [8]:
opentools_solver = UnifiedSolver(
    agent_name="opentools",
    llm_engine_name="gpt-5-mini",
    verbose=True,
    enabled_tools=[
        "Arxiv_Paper_Search_Tool",
        "Search_Engine_Tool",
        "URL_Text_Extractor_Tool",
    ],
    output_types="direct",
)

opentools_question = (
    "Step 1: Find the first author of the paper titled 'OctoTools: An Agentic Framework with Extensible Tools for Complex Reasoning'.\n"
    "Step 2: Return the affiliation(s) of that person and his/her research interests"
)
opentools_result = opentools_solver.solve(question=opentools_question)

print("OpenTools (multi-step author + affiliation search)", opentools_result["direct_output"])

[19:19:31][OpenTools][INFO] Enabled tools 🔧: ['Arxiv_Paper_Search_Tool', 'Search_Engine_Tool', 'URL_Text_Extractor_Tool']
[19:19:31][OpenTools][INFO] Initializing tool-based agent components...
[19:19:31][OpenTools][INFO] Initializing tool capabilities...
[19:19:31][OpenTools][INFO] Available tools that is successfully loaded 🔧: ['Arxiv_Paper_Search_Tool', 'Search_Engine_Tool', 'URL_Text_Extractor_Tool']
[19:19:31][OpenTools][INFO] Tool capabilities initialized successfully
[19:19:31][OpenTools][INFO] FAISS tool retrieval disabled - using all available tools
[19:19:31][OpenTools][INFO] Tool-based agent components initialized successfully
UnifiedSolver initialized with agent: OpenTools
Agent description: OpenTools agent - uses tools to solve problems
[19:19:31][OpenTools][INFO] Received question: Step 1: Find the first author of the paper titled 'OctoTools: An Agentic Framework with Extensible Tools for Complex Re