# Building an Agent Reasoning Loop

What if a user asks a complex question that has to be answered in several steps, or one that needs clarification? This project implements a complete agent reasoning loop: instead of making a single-shot tool call, the agent can reason over the available tools across multiple steps.

## References

This project is based on the course **"Building Agentic RAG with LlamaIndex"** by **DeepLearning.AI**, available at the following [link](https://learn.deeplearning.ai/courses/building-agentic-rag-with-llamaindex/).

## Setup

In [None]:
# Mounting to Google Drive
from google.colab import drive
drive.mount('/content/drive')

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).


In [None]:
cd "YOUR-PATH-HERE"

In [None]:
%%capture
!pip install llama-index llama-index-llms-openai llama-index-embeddings-openai openai pypdf

In [None]:
!pip list | grep llama-index

llama-index                             0.12.19
llama-index-agent-openai                0.4.6
llama-index-cli                         0.4.0
llama-index-core                        0.12.19
llama-index-embeddings-openai           0.3.1
llama-index-indices-managed-llama-cloud 0.6.7
llama-index-llms-openai                 0.3.20
llama-index-multi-modal-llms-openai     0.4.3
llama-index-program-openai              0.3.1
llama-index-question-gen-openai         0.3.0
llama-index-readers-file                0.4.5
llama-index-readers-llama-parse         0.4.0


In [None]:
import os
import textwrap
from typing import List, Optional

from llama_index.core import (
    VectorStoreIndex,
    SummaryIndex,
    SimpleDirectoryReader,
    Settings,
)
from llama_index.core.node_parser import SentenceSplitter
from llama_index.core.prompts import PromptTemplate
from llama_index.core.query_engine import RouterQueryEngine
from llama_index.core.selectors import LLMSingleSelector
from llama_index.core.tools import FunctionTool, QueryEngineTool
from llama_index.core.vector_stores import MetadataFilters, FilterCondition
from llama_index.embeddings.openai import OpenAIEmbedding
from llama_index.llms.openai import OpenAI

In [None]:
# Set OpenAI API key
import openai

openai.api_key = 'YOUR-OPENAI-API-KEY-HERE'

In [None]:
import nest_asyncio
nest_asyncio.apply()

## Load the data

In [None]:
# The PDF file (the MetaGPT paper) can be downloaded with the command below
#!wget "https://openreview.net/pdf?id=VtmBAGCN7o" -O metagpt.pdf

## Setup the Query Tools

In [None]:
def get_doc_tools(
    file_path: str,
    name: str,
) -> tuple[FunctionTool, QueryEngineTool]:
    """Get vector query and summary query tools from a document."""

    # load documents
    documents = SimpleDirectoryReader(input_files=[file_path]).load_data()
    splitter = SentenceSplitter(chunk_size=1024)
    nodes = splitter.get_nodes_from_documents(documents)
    vector_index = VectorStoreIndex(nodes)

    def vector_query(
        query: str,
        page_numbers: Optional[List[str]] = None
    ) -> str:
        """Use to answer questions over the MetaGPT paper.

        Useful if you have specific questions over the MetaGPT paper.
        Always leave page_numbers as None UNLESS there is a specific page you want to search for.

        Args:
            query (str): the string query to be embedded.
            page_numbers (Optional[List[str]]): Filter by set of pages. Leave as NONE
                if we want to perform a vector search
                over all pages. Otherwise, filter by the set of specified pages.

        """

        page_numbers = page_numbers or []
        metadata_dicts = [
            {"key": "page_label", "value": p} for p in page_numbers
        ]

        query_engine = vector_index.as_query_engine(
            similarity_top_k=2,
            filters=MetadataFilters.from_dicts(
                metadata_dicts,
                condition=FilterCondition.OR
            )
        )
        response = query_engine.query(query)
        return response


    vector_query_tool = FunctionTool.from_defaults(
        name=f"vector_tool_{name}",
        fn=vector_query
    )

    summary_index = SummaryIndex(nodes)
    summary_query_engine = summary_index.as_query_engine(
        response_mode="tree_summarize",
        use_async=True,
    )
    summary_tool = QueryEngineTool.from_defaults(
        name=f"summary_tool_{name}",
        query_engine=summary_query_engine,
        description=(
            "Use ONLY IF you want to get a holistic summary of MetaGPT. "
            "Do NOT use if you have specific questions over MetaGPT."
        ),
    )

    return vector_query_tool, summary_tool

In [None]:
vector_tool, summary_tool = get_doc_tools("metagpt.pdf", "metagpt")
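To make the `page_numbers` argument more concrete, here is a minimal pure-Python sketch of the filter list that `vector_query` builds and of how an OR condition over those filters behaves. It does not call LlamaIndex: `build_page_filters` and `matches_any` are hypothetical helper names, the nodes are plain dicts standing in for indexed chunks, and treating an empty filter list as "no restriction" is an assumption taken from the tool's docstring rather than from the library's internals.

```python
from typing import Dict, List, Optional


def build_page_filters(page_numbers: Optional[List[str]] = None) -> List[Dict[str, str]]:
    """Mirror the list comprehension in vector_query: one filter dict per page."""
    return [{"key": "page_label", "value": p} for p in (page_numbers or [])]


def matches_any(metadata: Dict[str, str], filters: List[Dict[str, str]]) -> bool:
    """Toy OR semantics: no filters means no restriction; otherwise one match is enough."""
    if not filters:
        return True
    return any(metadata.get(f["key"]) == f["value"] for f in filters)


# Hypothetical stand-ins for indexed chunks and their metadata.
toy_nodes = [
    {"page_label": "1", "text": "abstract"},
    {"page_label": "2", "text": "introduction"},
    {"page_label": "7", "text": "experiments"},
]

filters = build_page_filters(["2", "7"])
selected = [n["text"] for n in toy_nodes if matches_any(n, filters)]
print(filters)
print(selected)
```

With `page_numbers=["2", "7"]` only the chunks from those two pages survive the filter; with `page_numbers=None` the filter list is empty and every chunk stays eligible for the similarity search.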

## Setup Function Calling Agent

In [None]:
llm = OpenAI(model="gpt-3.5-turbo", temperature=0)

### High-Level Agent Architecture

Based on LlamaIndex [documentation](https://docs.llamaindex.ai/en/stable/examples/agent/agent_runner/agent_runner/), **"agents"** are composed of **AgentRunner** objects that interact with **AgentWorkers**. AgentRunners are orchestrators that store state (including conversational memory), create and maintain tasks, run steps through each task, and offer the user-facing, high-level interface for users to interact with.

AgentWorkers control the step-wise execution of a Task. Given an input step, an agent worker is responsible for generating the next step. They can be initialized with parameters and act upon state passed down from the Task/TaskStep objects, but do not inherently store state themselves. The outer AgentRunner is responsible for calling an AgentWorker and collecting/aggregating the results.
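The split between the two roles can be illustrated with a toy sketch in plain Python. This is not the LlamaIndex implementation: `ToyTask`, `ToyStepOutput`, `ToyWorker`, and `ToyRunner` are hypothetical names, and the "reasoning" is replaced by a fixed list of sub-questions. The only point is to show where state lives: the runner owns the tasks, while the worker holds nothing and only acts on the task it is handed.

```python
import uuid
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ToyTask:
    """State for one user request; owned by the runner, not the worker."""
    task_id: str
    pending: List[str]  # sub-questions still to answer
    completed: List[str] = field(default_factory=list)


@dataclass
class ToyStepOutput:
    output: str
    is_last: bool


class ToyWorker:
    """Stateless: reads the task it is handed and produces exactly one step."""

    def run_step(self, task: ToyTask) -> ToyStepOutput:
        sub_question = task.pending.pop(0)
        # A real worker would call an LLM or a tool here.
        answer = f"answered: {sub_question}"
        task.completed.append(answer)
        return ToyStepOutput(output=answer, is_last=not task.pending)


class ToyRunner:
    """Orchestrator: creates tasks, keeps their state, delegates each step."""

    def __init__(self, worker: ToyWorker) -> None:
        self.worker = worker
        self.tasks: Dict[str, ToyTask] = {}

    def create_task(self, sub_questions: List[str]) -> ToyTask:
        task = ToyTask(task_id=str(uuid.uuid4()), pending=list(sub_questions))
        self.tasks[task.task_id] = task
        return task

    def run_step(self, task_id: str) -> ToyStepOutput:
        return self.worker.run_step(self.tasks[task_id])


toy_runner = ToyRunner(ToyWorker())
toy_task = toy_runner.create_task(["agent roles", "how agents communicate"])
first = toy_runner.run_step(toy_task.task_id)
second = toy_runner.run_step(toy_task.task_id)
print(first.output, first.is_last)
print(second.output, second.is_last)
```

Because the worker keeps no state of its own, the same worker instance could serve several tasks, and the runner can pause between steps without losing anything.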


In [None]:
from llama_index.core.agent import FunctionCallingAgentWorker
from llama_index.core.agent import AgentRunner

agent_worker = FunctionCallingAgentWorker.from_tools(
    [vector_tool, summary_tool],
    llm=llm,
    verbose=True
)
agent = AgentRunner(agent_worker)

In [None]:
response = agent.query(
    "Tell me about the agent roles in MetaGPT, "
    "and then how they communicate with each other."
)

Added user message to memory: Tell me about the agent roles in MetaGPT, and then how they communicate with each other.
=== Calling Function ===
Calling function: summary_tool_metagpt with args: {"input": "agent roles in MetaGPT"}
=== Function Output ===
In MetaGPT, the agent roles include the Product Manager responsible for creating the Product Requirement Document and analyzing requirements, the Architect who designs the system architecture and technical specifications, the Project Manager who breaks down the project into tasks and assigns them to Engineers, the Engineers who develop the code based on specifications, and the QA Engineer who reviews the code, creates unit tests, and ensures software quality. Each agent has a specialized role contributing to the collaborative software development process in MetaGPT.
=== Calling Function ===
Calling function: summary_tool_metagpt with args: {"input": "communication between agent roles in MetaGPT"}
=== Function Output ===
The communicatio

In [None]:
print(response.source_nodes[0].get_content(metadata_mode="all"))

page_label: 1
file_name: metagpt.pdf
file_path: metagpt.pdf
file_type: application/pdf
file_size: 16911937
creation_date: 2025-02-18
last_modified_date: 2025-02-14

Preprint
METAGPT: M ETA PROGRAMMING FOR A
MULTI -AGENT COLLABORATIVE FRAMEWORK
Sirui Hong1∗, Mingchen Zhuge2∗, Jonathan Chen1, Xiawu Zheng3, Yuheng Cheng4,
Ceyao Zhang4, Jinlin Wang1, Zili Wang, Steven Ka Shing Yau5, Zijuan Lin4,
Liyang Zhou6, Chenyu Ran1, Lingfeng Xiao1,7, Chenglin Wu1†, J¨urgen Schmidhuber2,8
1DeepWisdom, 2AI Initiative, King Abdullah University of Science and Technology,
3Xiamen University, 4The Chinese University of Hong Kong, Shenzhen,
5Nanjing University, 6University of Pennsylvania,
7University of California, Berkeley, 8The Swiss AI Lab IDSIA/USI/SUPSI
ABSTRACT
Remarkable progress has been made on automated problem solving through so-
cieties of agents based on large language models (LLMs). Existing LLM-based
multi-agent systems can already solve simple dialogue tasks. Solutions to more
complex tasks

In [None]:
# The agents can maintain the conversation history over time
response = agent.chat(
    "Tell me about the evaluation datasets used."
)

Added user message to memory: Tell me about the evaluation datasets used.
=== Calling Function ===
Calling function: summary_tool_metagpt with args: {"input": "evaluation datasets used in MetaGPT"}
=== Function Output ===
The evaluation datasets used in MetaGPT include HumanEval, MBPP, and a self-generated SoftwareDev dataset.
=== LLM Response ===
The evaluation datasets used in MetaGPT include HumanEval, MBPP, and a self-generated SoftwareDev dataset.


In [None]:
response = agent.chat("Tell me the results over one of the above datasets.")

Added user message to memory: Tell me the results over one of the above datasets.
=== Calling Function ===
Calling function: vector_tool_metagpt with args: {"query": "results over HumanEval dataset", "page_numbers": null}
=== Function Output ===
MetaGPT achieved a pass rate of 85.9% and 87.7% over the HumanEval dataset.
=== LLM Response ===
MetaGPT achieved a pass rate of 85.9% and 87.7% over the HumanEval dataset.


According to the LlamaIndex documentation, this lower-level agent control offers the following benefits:
* **Decoupling of Task Creation and Execution**: Users gain the flexibility to schedule task execution according to their needs.
* **Enhanced Debuggability**: Offers deeper insights into each step of the execution process, improving troubleshooting capabilities.
* **Steerability**: Allows users to directly modify intermediate steps and incorporate human feedback for refined control.

## Lower-Level: Debuggability and Control

In [None]:
agent_worker = FunctionCallingAgentWorker.from_tools(
    [vector_tool, summary_tool],
    llm=llm,
    verbose=True
)
agent = AgentRunner(agent_worker)

In [None]:
task = agent.create_task(
    "Tell me about the agent roles in MetaGPT, "
    "and then how they communicate with each other."
)

In [None]:
step_output = agent.run_step(task.task_id)

Added user message to memory: Tell me about the agent roles in MetaGPT, and then how they communicate with each other.
=== Calling Function ===
Calling function: summary_tool_metagpt with args: {"input": "agent roles in MetaGPT"}
=== Function Output ===
The agent roles in MetaGPT include the Product Manager, who is responsible for creating the Product Requirement Document and analyzing user stories and competitive analysis; the Architect, who designs the system architecture; the Project Manager, who breaks down tasks and assigns them to Engineers; the Engineers, who develop the code based on specifications; and the QA Engineer, who reviews the code, generates unit tests, and ensures software quality. Each agent has a specific role in the collaborative workflow to contribute to the success of the project.
=== Calling Function ===
Calling function: summary_tool_metagpt with args: {"input": "how agents communicate with each other in MetaGPT"}
=== Function Output ===
Agents in MetaGPT comm

In [None]:
completed_steps = agent.get_completed_steps(task.task_id)
print(f"Num completed for task {task.task_id}: {len(completed_steps)}")
print(completed_steps[0].output.sources[0].raw_output)

Num completed for task 24c88674-90be-4b3b-8c65-6a48758f9e85: 1
The agent roles in MetaGPT include the Product Manager, who is responsible for creating the Product Requirement Document and analyzing user stories and competitive analysis; the Architect, who designs the system architecture; the Project Manager, who breaks down tasks and assigns them to Engineers; the Engineers, who develop the code based on specifications; and the QA Engineer, who reviews the code, generates unit tests, and ensures software quality. Each agent has a specific role in the collaborative workflow to contribute to the success of the project.


In [None]:
upcoming_steps = agent.get_upcoming_steps(task.task_id)
print(f"Num upcoming steps for task {task.task_id}: {len(upcoming_steps)}")
upcoming_steps[0]

Num upcoming steps for task 24c88674-90be-4b3b-8c65-6a48758f9e85: 1


TaskStep(task_id='24c88674-90be-4b3b-8c65-6a48758f9e85', step_id='aa9c72e1-b31c-4cac-af1c-f82e76f572a3', input=None, step_state={}, next_steps={}, prev_steps={}, is_ready=True)

In [None]:
step_output = agent.run_step(
    task.task_id, input="What about how agents share information?"
)

Added user message to memory: What about how agents share information?
=== Calling Function ===
Calling function: summary_tool_metagpt with args: {"input": "how agents share information in MetaGPT"}
=== Function Output ===
Agents in MetaGPT share information through a structured communication protocol that includes a shared message pool and a publish-subscribe mechanism. This allows them to exchange messages directly and subscribe to relevant messages based on their profiles. The shared message pool and subscription mechanism enhance communication efficiency by providing a centralized platform for information exchange and ensuring that agents receive only task-related information, thus avoiding information overload.


In [None]:
step_output = agent.run_step(task.task_id)
print(step_output.is_last)

=== LLM Response ===
Agents in MetaGPT share information through a structured communication protocol that includes a shared message pool and a publish-subscribe mechanism. This allows them to exchange messages directly and subscribe to relevant messages based on their profiles. The shared message pool and subscription mechanism enhance communication efficiency by providing a centralized platform for information exchange and ensuring that agents receive only task-related information, thus avoiding information overload.
True


In [None]:
# final response
response = agent.finalize_response(task.task_id)

# Extract the response text (str() works for both plain strings and response objects)
response_text = str(response)

# Wrap text to a readable width
wrapped_response = textwrap.fill(response_text, width=80)

# Print structured output
print("=" * 80)
print("**Generated Final Response:**\n")
print(wrapped_response)
print("=" * 80)

**Generated Final Response:**

Agents in MetaGPT share information through a structured communication protocol
that includes a shared message pool and a publish-subscribe mechanism. This
allows them to exchange messages directly and subscribe to relevant messages
based on their profiles. The shared message pool and subscription mechanism
enhance communication efficiency by providing a centralized platform for
information exchange and ensuring that agents receive only task-related
information, thus avoiding information overload.
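The manual sequence above (create a task, run steps until `is_last` is true, then finalize) can be wrapped in a small loop. The sketch below is one possible way to do that, not part of LlamaIndex: `run_task_stepwise`, `get_feedback`, and `max_steps` are hypothetical names, and the helper relies only on the four calls used in this notebook (`create_task`, `run_step`, `is_last`, `finalize_response`). `FakeAgent` is a stand-in so the snippet runs without an API key; the helper has only been exercised against that stub.

```python
from types import SimpleNamespace
from typing import Any, Callable, List, Optional


def run_task_stepwise(
    agent: Any,
    query: str,
    get_feedback: Optional[Callable[[Any], Optional[str]]] = None,
    max_steps: int = 10,
) -> Any:
    """Run a task one step at a time, optionally injecting input between steps."""
    task = agent.create_task(query)
    step_output = agent.run_step(task.task_id)
    steps_run = 1
    while not step_output.is_last:
        if steps_run >= max_steps:
            raise RuntimeError(f"Task did not finish within {max_steps} steps")
        # Ask the caller whether to steer the next step (None means "carry on").
        extra_input = get_feedback(step_output) if get_feedback else None
        if extra_input is None:
            step_output = agent.run_step(task.task_id)
        else:
            step_output = agent.run_step(task.task_id, input=extra_input)
        steps_run += 1
    return agent.finalize_response(task.task_id)


class FakeAgent:
    """Stub exposing the same four calls; it finishes after n_steps steps."""

    def __init__(self, n_steps: int) -> None:
        self.n_steps = n_steps
        self.inputs: List[Optional[str]] = []

    def create_task(self, query: str) -> SimpleNamespace:
        return SimpleNamespace(task_id="toy-task")

    def run_step(self, task_id: str, input: Optional[str] = None) -> SimpleNamespace:
        self.inputs.append(input)
        return SimpleNamespace(is_last=len(self.inputs) >= self.n_steps)

    def finalize_response(self, task_id: str) -> str:
        return f"done after {len(self.inputs)} steps"


fake = FakeAgent(n_steps=3)
follow_up = "What about how agents share information?"
result = run_task_stepwise(
    fake,
    "Tell me about the agent roles in MetaGPT.",
    # Steer only once, right after the first step.
    get_feedback=lambda _step: follow_up if len(fake.inputs) == 1 else None,
)
print(result)
print(fake.inputs)
```

With the real agent, the call would presumably look like `run_task_stepwise(agent, "...")`; the `max_steps` guard is there so that a task that never reports `is_last` cannot loop indefinitely.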
