## 3. Building an Agent Reasoning Loop

So far, each query has been handled in a single forward pass: given the query, we call the right tool with the right parameters and get back the response. This is still quite limiting. What if the user asks a complex question consisting of multiple steps, or a vague question that needs clarification?

Let's define a complete agent reasoning loop. Instead of calling a tool in a single-shot setting, an agent can reason over tools across multiple steps.
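To make the idea concrete, here is a minimal sketch of such a loop in plain Python. It does not use LlamaIndex or a real LLM: `toy_policy`, the two tools in `TOOLS`, and the `Step` tuple are hypothetical stand-ins, chosen only to show the control flow of deciding, calling a tool, observing the result, and repeating until nothing is left to do.

```python
from typing import Callable, Dict, List, Optional, Tuple

Step = Tuple[str, str, str]  # (tool name, tool input, observation)

# Hypothetical tools: each maps a string input to a string observation.
TOOLS: Dict[str, Callable[[str], str]] = {
    "summary_tool": lambda text: f"[summary for: {text}]",
    "vector_tool": lambda text: f"[passages for: {text}]",
}


def toy_policy(question: str, history: List[Step]) -> Optional[Tuple[str, str]]:
    """Stand-in for the LLM: choose the next tool call, or None when done."""
    parts = [part.strip() for part in question.split(" and then ")]
    if len(history) >= len(parts):
        return None  # every part has been handled, so stop calling tools
    part = parts[len(history)]
    tool_name = "vector_tool" if "page" in part else "summary_tool"
    return tool_name, part


def run_agent(question: str, max_steps: int = 5) -> Tuple[str, List[Step]]:
    """Loop: decide, call the tool, record the observation, repeat."""
    history: List[Step] = []
    for _ in range(max_steps):
        decision = toy_policy(question, history)
        if decision is None:
            break
        tool_name, tool_input = decision
        observation = TOOLS[tool_name](tool_input)
        history.append((tool_name, tool_input, observation))
    # A real agent would ask the LLM to synthesize; here we just join observations.
    answer = " ".join(observation for _, _, observation in history)
    return answer, history


answer, history = run_agent("agent roles and then how they communicate")
```

In a real agent the policy is the LLM itself, and the final answer is synthesized by the LLM from the accumulated history rather than by string concatenation.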

You will use the function calling agent implementation, which natively integrates with the function calling capabilities of LLMs. 

We can compress all the work we did before into a helper module and call the relevant function, `get_doc_tools`, from it (the import below uses `utils3`).

In [1]:
from dotenv import load_dotenv
import nest_asyncio
import os
nest_asyncio.apply()
load_dotenv()

# Access variables
OPENAI_API_KEY = os.getenv("OPENAI_API_KEY")

In [6]:
from utils3 import get_doc_tools

vector_tool, summary_tool = get_doc_tools("metaGPT.pdf", "metagpt")

An agent in LlamaIndex consists of two main components: 
- an agent worker, responsible for executing the next step of a given agent, and 
- an agent runner, responsible for creating a task and orchestrating runs of agent workers on top of a given task, returning the final response to the user. 
  
We import both `FunctionCallingAgentWorker` and `AgentRunner` from LlamaIndex. The worker is created from the vector tool and the summary tool, with `verbose=True` so that we can view intermediate outputs.

The function calling agent worker's primary responsibility is to use function calling to decide the next step based on the existing conversation history, memory, and current user input. 

It decides whether to call a tool or return a final response. The agent runner provides the overall agent interface, and it is what we use to query the agent.
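The division of labor between worker and runner can be illustrated with a toy sketch. The classes below (`ToyTask`, `ToyStepOutput`, `ToyWorker`, `ToyRunner`) are hypothetical and are not the LlamaIndex implementation. They only mirror the idea that the worker executes one step at a time, while the runner creates the task, keeps its state, and drives the worker until a step is marked as the last one.

```python
import uuid
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ToyTask:
    task_id: str
    pending: List[str]  # sub-questions that have not been handled yet
    outputs: List[str] = field(default_factory=list)


@dataclass
class ToyStepOutput:
    output: str
    is_last: bool


class ToyWorker:
    """Executes exactly one step of a given task."""

    def run_step(self, task: ToyTask) -> ToyStepOutput:
        if task.pending:
            sub_question = task.pending.pop(0)
            result = f"[tool result for: {sub_question}]"
            task.outputs.append(result)
            return ToyStepOutput(output=result, is_last=False)
        # Nothing left to call: combine what was gathered into a final answer.
        return ToyStepOutput(output=" ".join(task.outputs), is_last=True)


class ToyRunner:
    """Creates tasks and orchestrates worker steps on top of them."""

    def __init__(self, worker: ToyWorker) -> None:
        self.worker = worker
        self.tasks: Dict[str, ToyTask] = {}

    def create_task(self, sub_questions: List[str]) -> ToyTask:
        task = ToyTask(task_id=str(uuid.uuid4()), pending=list(sub_questions))
        self.tasks[task.task_id] = task
        return task

    def run_step(self, task_id: str) -> ToyStepOutput:
        return self.worker.run_step(self.tasks[task_id])

    def query(self, sub_questions: List[str]) -> str:
        task = self.create_task(sub_questions)
        step = self.run_step(task.task_id)
        while not step.is_last:
            step = self.run_step(task.task_id)
        return step.output


runner = ToyRunner(ToyWorker())
final_answer = runner.query(["roles", "communication"])
```

Keeping the step logic in the worker and the task bookkeeping in the runner is what later makes it possible to run a task one step at a time.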

In [7]:
from llama_index.llms.openai import OpenAI

llm = OpenAI(model="gpt-3.5-turbo", temperature=0)

In [8]:
from llama_index.core.agent import FunctionCallingAgentWorker
from llama_index.core.agent import AgentRunner

agent_worker = FunctionCallingAgentWorker.from_tools(
    [vector_tool, summary_tool], 
    llm=llm, 
    verbose=True
)
agent = AgentRunner(agent_worker)

For the query below, the agent breaks the question down into steps:
- It first calls the summary tool to answer the part about agent roles in MetaGPT. The tool returns roles such as Product Manager, Architect, Project Manager, Engineer, and QA Engineer.
- It then calls the summary tool again to address how these roles communicate, which is through structured communication interfaces and a shared message pool.

The conversation history, including both tool outputs, is then used to generate the final response. The vector tool is also available to the agent for questions that need specific passages rather than a summary.

In [9]:
response = agent.query(
    "Tell me about the agent roles in MetaGPT, "
    "and then how they communicate with each other."
)

Added user message to memory: Tell me about the agent roles in MetaGPT, and then how they communicate with each other.
=== Calling Function ===
Calling function: summary_tool_metagpt with args: {"input": "agent roles in MetaGPT"}
=== Function Output ===
The agent roles in MetaGPT are the Product Manager, Architect, Project Manager, Engineer, and QA Engineer. Each role has specific responsibilities in the software development process, such as generating Product Requirement Documents, designing system architecture, task allocation, code implementation, and quality assurance through unit testing. These roles work collaboratively to ensure the successful completion of software projects within the MetaGPT framework.
=== Calling Function ===
Calling function: summary_tool_metagpt with args: {"input": "how agents communicate with each other in MetaGPT"}
=== Function Output ===
Agents in MetaGPT communicate with each other through structured communication interfaces and a shared message pool. 

With a multi-step query like this, it is useful to trace the sources behind the answer. 
- We can inspect `response.source_nodes` to trace the content back to the document. 
- The agent maintains conversation history over time with a memory module that uses a rolling buffer whose size depends on the LLM's context window. 
- When the agent uses a tool, it considers both the current chat message and the previous conversation history to decide on the next action. 
- To carry on a conversation, we switch from `agent.query` to `agent.chat` and ask about the evaluation datasets used. 
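The rolling-buffer idea mentioned above can be sketched in a few lines. This is only an illustration under stated assumptions: `rolling_buffer` and `count_tokens` are hypothetical names, the word count is a crude stand-in for a real tokenizer, and LlamaIndex's own memory module may differ in detail.

```python
from typing import List, Tuple

Message = Tuple[str, str]  # (role, content)


def count_tokens(text: str) -> int:
    """Crude stand-in for a tokenizer: one token per whitespace-separated word."""
    return len(text.split())


def rolling_buffer(messages: List[Message], token_limit: int) -> List[Message]:
    """Return the most recent messages whose combined size fits the limit."""
    kept: List[Message] = []
    used = 0
    # Walk backwards from the newest message and stop once the budget is spent.
    for role, content in reversed(messages):
        cost = count_tokens(content)
        if used + cost > token_limit:
            break
        kept.append((role, content))
        used += cost
    kept.reverse()  # restore chronological order
    return kept


chat_history = [
    ("user", "one two three"),
    ("assistant", "four five"),
    ("user", "six"),
]
recent = rolling_buffer(chat_history, token_limit=3)
```

With a limit of three "tokens", only the two most recent messages survive, so older turns drop out of the context first.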

In [10]:
print(response.source_nodes[0].get_content(metadata_mode="all"))

page_label: 1
file_name: metaGPT.pdf
file_path: metaGPT.pdf
file_type: application/pdf
file_size: 16715764
creation_date: 2024-06-08
last_modified_date: 2024-06-08

Preprint
METAGPT: M ETA PROGRAMMING FOR A
MULTI -AGENT COLLABORATIVE FRAMEWORK
Sirui Hong1∗, Mingchen Zhuge2∗, Jonathan Chen1, Xiawu Zheng3, Yuheng Cheng4,
Ceyao Zhang4,Jinlin Wang1,Zili Wang ,Steven Ka Shing Yau5,Zijuan Lin4,
Liyang Zhou6,Chenyu Ran1,Lingfeng Xiao1,7,Chenglin Wu1†,J¨urgen Schmidhuber2,8
1DeepWisdom,2AI Initiative, King Abdullah University of Science and Technology,
3Xiamen University,4The Chinese University of Hong Kong, Shenzhen,
5Nanjing University,6University of Pennsylvania,
7University of California, Berkeley,8The Swiss AI Lab IDSIA/USI/SUPSI
ABSTRACT
Remarkable progress has been made on automated problem solving through so-
cieties of agents based on large language models (LLMs). Existing LLM-based
multi-agent systems can already solve simple dialogue tasks. Solutions to more
complex tasks, however, 
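To make source tracing a bit more convenient, a small helper can collect the file name and page label from each source node. This is a sketch, not part of the original notebook: `summarize_sources` is a hypothetical helper, and it assumes each item exposes `.node.metadata` as a dict with the `file_name` and `page_label` keys seen in the output above. The stub objects only stand in for real source nodes so that the snippet runs on its own.

```python
from types import SimpleNamespace
from typing import Any, Dict, List


def summarize_sources(source_nodes: List[Any]) -> List[Dict[str, Any]]:
    """Collect file name and page label from each source node's metadata."""
    rows = []
    for item in source_nodes:
        metadata = item.node.metadata  # assumed to be a plain dict
        rows.append(
            {
                "file_name": metadata.get("file_name"),
                "page_label": metadata.get("page_label"),
            }
        )
    return rows


# Stub objects standing in for response.source_nodes (hypothetical data shape).
stub_nodes = [
    SimpleNamespace(
        node=SimpleNamespace(metadata={"file_name": "metaGPT.pdf", "page_label": "1"})
    ),
    SimpleNamespace(node=SimpleNamespace(metadata={"file_name": "metaGPT.pdf"})),
]
sources = summarize_sources(stub_nodes)
```

With a real response, the call would be `summarize_sources(response.source_nodes)`.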

In [11]:
response = agent.chat(
    "Tell me about the evaluation datasets used."
)

Added user message to memory: Tell me about the evaluation datasets used.
=== Calling Function ===
Calling function: summary_tool_metagpt with args: {"input": "evaluation datasets used in MetaGPT"}
=== Function Output ===
The evaluation datasets used in MetaGPT include HumanEval, MBPP, and a self-generated SoftwareDev dataset. The HumanEval dataset consists of 164 handwritten programming tasks, while the MBPP dataset consists of 427 Python tasks. The SoftwareDev dataset is a collection of 70 representative examples of software development tasks covering diverse scopes such as mini-games, image processing algorithms, and data visualization. These datasets were used to evaluate the performance of MetaGPT in code generation tasks.
=== LLM Response ===
The evaluation datasets used in MetaGPT include HumanEval, MBPP, and a self-generated SoftwareDev dataset. HumanEval consists of 164 handwritten programming tasks, MBPP includes 427 Python tasks, and the SoftwareDev dataset comprises 70 repr

Let's see if the agent is able to recall this conversation history by asking a follow-up question.

In [12]:
response = agent.chat("Tell me the results over one of the above datasets.")

Added user message to memory: Tell me the results over one of the above datasets.
=== Calling Function ===
Calling function: vector_tool_metagpt with args: {"query": "results over the HumanEval dataset", "page_numbers": ["6"]}
=== Function Output ===
The results over the HumanEval dataset were part of the experimental evaluation conducted in the study.
=== LLM Response ===
The results over the HumanEval dataset were part of the experimental evaluation conducted in the study.


The agent used the summary tool to identify the datasets: HumanEval, MBPP, and SoftwareDev. For the follow-up question about results over "one of the above datasets", it picked HumanEval and called the vector tool. This shows that the agent keeps track of the conversation history, even though the answer returned in this run is rather generic.

We now have a simple interface for interacting with the agent.

### 2. Lower-Level: Debuggability and Control



Now let's explore how to step through the agent's execution to get more granular control. This helps when building a higher-level research assistant over RAG pipelines, and also when debugging and controlling it.

Key benefits:
- Debuggability: you can inspect the execution of each step.
  - As a developer, you have more transparency and visibility into what is happening under the hood.
  - You can trace the agent's execution, check the reasons for failures, and test new inputs to obtain better results.
  - Ultimately, this provides a richer UX.
- Steerability: the user can inject feedback.
  - Suppose we want the agent to listen to human feedback while it is running. To achieve this, we can create an async queue through which human feedback is delivered during the agent's execution.
  - While the agent is working, if it receives any human feedback, it addresses it and modifies its response.
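The async-queue idea can be sketched with the standard library alone. This is a simplified illustration, not LlamaIndex code: `run_with_feedback`, `demo`, and `human` are hypothetical names, the "steps" are plain strings, and the feedback is only recorded in a log rather than used to change the agent's behavior.

```python
import asyncio
from typing import List


async def run_with_feedback(steps: List[str], feedback: "asyncio.Queue[str]") -> List[str]:
    """Run steps in order, checking the queue for feedback before each one."""
    log: List[str] = []
    for step in steps:
        # Drain any feedback that arrived since the previous step.
        while not feedback.empty():
            note = feedback.get_nowait()
            log.append(f"feedback: {note}")
        log.append(f"step: {step}")
        await asyncio.sleep(0)  # yield control so a producer can enqueue feedback
    return log


async def demo() -> List[str]:
    feedback: asyncio.Queue = asyncio.Queue()

    async def human() -> None:
        # Simulated human who sends one note while the agent is working.
        await feedback.put("focus on communication")

    producer = asyncio.create_task(human())
    log = await run_with_feedback(["roles", "communication"], feedback)
    await producer
    return log


log = asyncio.run(demo())
```

The feedback is picked up between the first and second step. A real agent would feed the note into its next decision instead of only logging it.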





In [13]:
agent_worker = FunctionCallingAgentWorker.from_tools(
    [vector_tool, summary_tool], 
    llm=llm, 
    verbose=True
)
agent = AgentRunner(agent_worker)

In [14]:
task = agent.create_task(
    "Tell me about the agent roles in MetaGPT, "
    "and then how they communicate with each other."
)

In [15]:
step_output = agent.run_step(task.task_id)

Added user message to memory: Tell me about the agent roles in MetaGPT, and then how they communicate with each other.
=== Calling Function ===
Calling function: summary_tool_metagpt with args: {"input": "agent roles in MetaGPT"}
=== Function Output ===
The agent roles in MetaGPT include the Product Manager, Architect, Project Manager, Engineer, and QA Engineer. The Product Manager is responsible for generating the Product Requirement Document (PRD) and competitive analysis. The Architect designs the system architecture and technical specifications. The Project Manager breaks down the project into tasks for execution. The Engineer implements the code based on the specifications. Lastly, the QA Engineer generates unit tests and ensures the quality of the software. Each role has specific responsibilities that contribute to the collaborative software development process within MetaGPT.


We see that only the first step has been executed, and then the agent stops. Let's see how many steps have been completed and what the output is so far.

In [16]:
completed_steps = agent.get_completed_steps(task.task_id)
print(f"Num completed for task {task.task_id}: {len(completed_steps)}")
print(completed_steps[0].output.sources[0].raw_output)

Num completed for task 92edf57d-bfe8-4f49-8478-f530b47fce56: 1
The agent roles in MetaGPT include the Product Manager, Architect, Project Manager, Engineer, and QA Engineer. The Product Manager is responsible for generating the Product Requirement Document (PRD) and competitive analysis. The Architect designs the system architecture and technical specifications. The Project Manager breaks down the project into tasks for execution. The Engineer implements the code based on the specifications. Lastly, the QA Engineer generates unit tests and ensures the quality of the software. Each role has specific responsibilities that contribute to the collaborative software development process within MetaGPT.


Let's take a look at the upcoming steps too.

In [17]:
upcoming_steps = agent.get_upcoming_steps(task.task_id)
print(f"Num upcoming steps for task {task.task_id}: {len(upcoming_steps)}")
upcoming_steps[0]

Num upcoming steps for task 92edf57d-bfe8-4f49-8478-f530b47fce56: 1


TaskStep(task_id='92edf57d-bfe8-4f49-8478-f530b47fce56', step_id='348d6ec7-2d92-4843-ba92-00a6981e9836', input=None, step_state={}, next_steps={}, prev_steps={}, is_ready=True)

- The input to the next step is `None` because the agent auto-generates the next action from the conversation history. 

- Another option at this point is to take the intermediate results and stop the flow. 

Now, let's change the question posed to the agent by injecting a new input.

In [18]:
step_output = agent.run_step(
    task.task_id, input="What about how agents share information?"
)

Added user message to memory: What about how agents share information?
=== Calling Function ===
Calling function: summary_tool_metagpt with args: {"input": "how agents share information in MetaGPT"}
=== Function Output ===
Agents in MetaGPT share information through a shared message pool where they can publish structured messages and subscribe to relevant messages based on their profiles. This method allows agents to efficiently exchange information without the need for direct one-to-one communication, enhancing communication efficiency and reducing information overload.


We see that the agent can handle an interruption and address a different question asked in the middle of a task. 

Now let's run the final step to synthesize the answer.

In [19]:
step_output = agent.run_step(task.task_id)
print(step_output.is_last)

=== LLM Response ===
Agents in MetaGPT share information through a shared message pool where they can publish structured messages and subscribe to relevant messages based on their profiles. This method allows agents to efficiently exchange information without the need for direct one-to-one communication, enhancing communication efficiency and reducing information overload.
True


In [20]:
response = agent.finalize_response(task.task_id)
print(str(response))

Agents in MetaGPT share information through a shared message pool where they can publish structured messages and subscribe to relevant messages based on their profiles. This method allows agents to efficiently exchange information without the need for direct one-to-one communication, enhancing communication efficiency and reducing information overload.
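The manual sequence above (run a step, check `is_last`, then finalize) can be wrapped in a small loop. The helper below is a sketch: `run_task_to_completion` and `StubAgent` are hypothetical names, and the helper only assumes that the agent object offers `run_step(task_id)` returning something with an `is_last` attribute, plus `finalize_response(task_id)`, as used in the cells above. The stub exists so that the snippet runs without an LLM.

```python
from types import SimpleNamespace
from typing import Any


def run_task_to_completion(agent: Any, task_id: str, max_steps: int = 10) -> Any:
    """Call agent.run_step until a step reports is_last, then finalize."""
    for _ in range(max_steps):
        step_output = agent.run_step(task_id)
        if step_output.is_last:
            return agent.finalize_response(task_id)
    raise RuntimeError(f"Task {task_id} did not finish within {max_steps} steps")


class StubAgent:
    """Hypothetical stand-in that finishes after a fixed number of steps."""

    def __init__(self, steps_needed: int) -> None:
        self.remaining = steps_needed

    def run_step(self, task_id: str) -> SimpleNamespace:
        self.remaining -= 1
        return SimpleNamespace(is_last=self.remaining <= 0)

    def finalize_response(self, task_id: str) -> str:
        return "final answer"


result = run_task_to_completion(StubAgent(steps_needed=3), "demo-task")
```

The `max_steps` cap is a design choice to avoid looping forever if the agent never produces a final step.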
