# Multi-Agent Report Generation with AgentWorkflow

In this notebook, we will explore how to use the `AgentWorkflow` class to create multi-agent systems. Specifically, we will create a system that can generate a report on a given topic.

## Setup

In this example, we demonstrate the use of local serving of a `qwen3-8b` Small Language Model (SML) as our LLM, served by LM Studio, and a state-of-the-art model `gemini-2.5-flash` hosted by Google Cloud. For all supported LLM inference providers and models, check out the [examples documentation](https://docs.llamaindex.ai/en/stable/examples/llm/openai/) or [LlamaHub](https://llamahub.ai/?tab=llms) for a list of all supported LLMs and how to install/use them.

If we wanted, each agent could have a different LLM, but for this example, we will use the same LLM for all agents.

In [1]:
# Load environment variables from .env file
import os
from dotenv import load_dotenv
load_dotenv()

# Environment variables for local LM studio inference
model = "qwen/qwen3-8b"
base_url = "http://127.0.0.1:1234/v1"
api_key = ""

# Environment variable for Google GenAI API inference
google_api_key = os.getenv("GOOGLE_API_KEY", "")
google_model = "gemini-2.5-flash"

# Environment variables for local data
dir_input = './data/input'
dir_output = './data/output'
dir_chromadb = './database/vector_store/'
chromadb_collection = 'internet_history'

In [2]:
# Fix for "RuntimeError: This event loop is already running"
import nest_asyncio
nest_asyncio.apply()

from llama_index.llms.lmstudio import LMStudio
from llama_index.core.base.llms.types import ChatMessage, MessageRole

# Initialize the LMStudio client with the model and base URL
#llm = LMStudio(
#    model_name=model,
#    base_url=base_url,
#    temperature=0.7,
#)

from llama_index.llms.google_genai import GoogleGenAI
# Initialize the Google GenAI client with the API key
llm = GoogleGenAI(
    model=google_model,
    api_key=google_api_key,  
)

In [3]:
# Test the LLM endpoint with a simple prompt
response = llm.complete("Write a paragraph on the history of the internet.")
print(str(response))

The internet's origins trace back to the late 1960s with ARPANET, a project by the U.S. Department of Defense designed to allow researchers to share information and computing resources across a robust, decentralized network. Throughout the 1970s and 80s, it evolved with the development of foundational protocols like TCP/IP, enabling diverse networks to communicate seamlessly. However, it was the invention of the World Wide Web by Tim Berners-Lee at CERN in the early 1990s that truly democratized access, introducing user-friendly concepts like hyperlinks and web browsers. This innovation, coupled with the subsequent commercialization and rapid adoption throughout the late 1990s and early 2000s, transformed the internet from a niche academic tool into an indispensable global infrastructure, profoundly reshaping communication, commerce, education, and entertainment worldwide.


## System Design

Our system will have three agents:

1. A `ResearchAgent` that will search local data as well as the web for information on the given topic.
2. A `WriteAgent` that will write the report using the information found by the `ResearchAgent`.
3. A `ReviewAgent` that will review the report and provide feedback.

We will use the `AgentWorkflow` class to create a multi-agent system that will execute these agents in order.

While there are many ways to implement this system, in this case, we will use a few tools to help with the research and writing processes.

1. A `web_search` tool to search the web for information on the given topic.
2. A `query_engine` tool to query local documents via Query Engine (RAG)
3. A `record_notes` tool to record notes on the given topic.
4. A `write_report` tool to write the report using the information found by the `ResearchAgent`.
5. A `review_report` tool to review the report and provide feedback.

Utilizing the `Context` class, we can pass state between agents, and each agent will have access to the current state of the system.


### Function convert_pdfs_to_markdown

The function takes two arguments: the directory containing PDF files and the directory where the converted Markdown files will be saved. The function checks if the output directory exists and creates it if necessary. It then iterates over all PDF files in the input directory, converts each to Markdown using the DocumentConverter class, and saves the result in the output directory.

In [7]:
from warnings import filterwarnings
from docling.document_converter import DocumentConverter

# Suppress warning from easyocr to avoid cluttering the output of the conversion process
filterwarnings(action="ignore", category=FutureWarning, module="easyocr") 


def convert_pdfs_to_markdown(pdf_dir, md_dir):
	if not os.path.exists(md_dir):
		os.makedirs(md_dir)

	pdf_files = [f for f in os.listdir(pdf_dir) if f.endswith('.pdf')]
	for pdf_file in pdf_files:
		pdf_path = os.path.join(pdf_dir, pdf_file)
		md_path = os.path.join(md_dir, f"{os.path.splitext(pdf_file)[0]}.md")

		if not os.path.exists(md_path):
			print(f"Converting `{pdf_file}` to Markdown ...")

			doc_converter = DocumentConverter()
			result = doc_converter.convert(source=pdf_path)
			
			with open(md_path, 'w', encoding='utf-8') as md_file:
				md_file.write(result.document.export_to_markdown())

  from .autonotebook import tqdm as notebook_tqdm


### Execute the convert_pdfs_to_markdown function

Convert all PDFs in the specified input directory to Markdown format and saving them in the output directory. The function prints messages to indicate the progress of the conversion process.

In [8]:
convert_pdfs_to_markdown(dir_input, dir_output)

Converting `internet-history-09.pdf` to Markdown ...




### Initializes models and clients required for generating the vector database.

We create an embedding model using the HuggingFace library then read the converted Markdown documents from the output directory and loads them into a SimpleDirectoryReader.

Next, the code initializes a ChromaDB client and creates or retrieves a collection within the database. It sets up a vector store using the ChromaDB collection and a storage context with default settings. Finally, it creates a VectorStoreIndex from the loaded documents, using the embedding model for vectorization. The process concludes with a print statement indicating that the vector database has been successfully generated.

In [11]:
from llama_index.embeddings.huggingface import HuggingFaceEmbedding
from llama_index.vector_stores.chroma import ChromaVectorStore
from llama_index.core import (
    SimpleDirectoryReader,
    VectorStoreIndex,
    StorageContext,
    load_index_from_storage,
)
import chromadb

chroma_embed_model = HuggingFaceEmbedding(model_name="BAAI/bge-small-en-v1.5")
documents = SimpleDirectoryReader(input_dir=dir_output).load_data()

chroma_client = chromadb.PersistentClient(path = dir_chromadb)
chroma_collection = chroma_client.get_or_create_collection(name=chromadb_collection)

vector_store = ChromaVectorStore(chroma_collection=chroma_collection)
storage_context = StorageContext.from_defaults(vector_store=vector_store)
index = VectorStoreIndex.from_documents(documents, storage_context=storage_context, embed_model=chroma_embed_model)

print("Vector database successfully generated!")

Vector database successfully generated!


### Test a simple query to the vector database

In [None]:
test_query = "Who published the first paper on packet switching theory?"
result = index.as_query_engine(llm=llm).query(test_query)
print(f"Q: {test_query}\nA: {result.response.strip()}\n\nSources:")
display([(n.text, n.metadata) for n in result.source_nodes])

Q: Who published the first paper on packet switching theory?
A: Leonard Kleinrock at MIT published the first paper on packet switching theory in July 1961.

Sources:


[("There is the operations and management aspect of a global and complex operational infrastructure. There is the social aspect, which resulted in a broad community of Internauts working together to create and evolve the technology. And there is the commercialization aspect, resulting in an extremely effective transition of research results into a broadly deployed and available information infrastructure.\n\nThe Internet today is a widespread information infrastructure, the initial prototype of what is often called the National (or Global or Galactic) Information Infrastructure. Its history is complex and involves many aspects - technological, organizational, and community. And its influence reaches not only to the technical fields of computer communications but throughout society as we move toward increasing use of online tools to accomplish electronic commerce, information acquisition, and community operations.\n\n## 2. ORIGINS OF THE INTERNET\n\nThe first recorded description of the

In [13]:
from tavily import AsyncTavilyClient
from llama_index.core.workflow import Context


async def search_web(query: str) -> str:
    """Useful for using the web to answer questions."""
    client = AsyncTavilyClient(api_key=os.environ.get("TAVILY_SEARCH_API_KEY"))
    return str(await client.search(query))

async def query_data(query: str) -> str:
    """Query local vector database for information on internet history."""
    engine = index.as_query_engine(llm=llm)
    formatted_output = f"Q: {query}\nA: {result.response.strip()}\n\nSources:\n{[(n.text, n.metadata) for n in result.source_nodes]}"
    return str(formatted_output)

async def record_notes(ctx: Context, notes: str, notes_title: str) -> str:
    """Useful for recording notes on a given topic. Your input should be notes with a title to save the notes under."""
    current_state = await ctx.get("state")
    if "research_notes" not in current_state:
        current_state["research_notes"] = {}
    current_state["research_notes"][notes_title] = notes
    await ctx.set("state", current_state)
    return "Notes recorded."


async def write_report(ctx: Context, report_content: str) -> str:
    """Useful for writing a report on a given topic. Your input should be a markdown formatted report."""
    current_state = await ctx.get("state")
    current_state["report_content"] = report_content
    await ctx.set("state", current_state)
    return "Report written."


async def review_report(ctx: Context, review: str) -> str:
    """Useful for reviewing a report and providing feedback. Your input should be a review of the report."""
    current_state = await ctx.get("state")
    current_state["review"] = review
    await ctx.set("state", current_state)
    return "Report reviewed."

With our tools defined, we can now create our agents.

If the LLM you are using supports tool calling, you can use the `FunctionAgent` class. Otherwise, you can use the `ReActAgent` class.

Here, the name and description of each agent is used so that the system knows what each agent is responsible for and when to hand off control to the next agent.

In [18]:
from llama_index.core.agent.workflow import FunctionAgent, ReActAgent

test_agent_search = ReActAgent(
    tools=[search_web],
    llm=llm,
    system_prompt="You are a helpful assistant that can search the web for information.",
)

test_agent_query = ReActAgent(
    tools=[query_data],
    llm=llm,
    system_prompt=(
        "You are a helpful assistant that can query a local vector database for information on internet history. "
        "You should first search the local vertor database with the query_data tool for information on the topic. "
        "You should return the response in a markdown format including the question, answer, and sources. "
    ),
)

research_agent = ReActAgent(
    name="ResearchAgent",
    description="Useful for searching the web for information on a given topic and recording notes on the topic.",
    system_prompt=(
        "You are the ResearchAgent that can search local data or on the web for information on a given topic and record notes on the topic."
        "You should first search the local vector database with the query_data tool for information on the topic if relevant to the information stored. "
        "If not sufficient, you should then search web with the search_web tool for information on the topic. "
        "You should always record notes on the topic using the record_notes tool. "
        "Once notes are recorded and once you are satisfied, you should always hand off control to the WriteAgent to write a report on the topic. "
        "You should have at least some notes on a topic before handing off control to the WriteAgent."
    ),
    llm=llm,
    tools=[query_data, search_web, record_notes],
    can_handoff_to=["WriteAgent"],
)

write_agent = ReActAgent(
    name="WriteAgent",
    description="Useful for writing a report on a given topic.",
    system_prompt=(
        "You are the WriteAgent that can write a report on a given topic. "
        "Your report should be in a markdown format. The content should be grounded in the research notes. "
        "Once the report is written, you should get feedback at least once from the ReviewAgent."
    ),
    llm=llm,
    tools=[write_report],
    can_handoff_to=["ReviewAgent", "ResearchAgent"],
)

review_agent = ReActAgent(
    name="ReviewAgent",
    description="Useful for reviewing a report and providing feedback.",
    system_prompt=(
        "You are the ReviewAgent that can review the write report and provide feedback. "
        "Your review should either approve the current report or request changes for the WriteAgent to implement. "
        "If you have feedback that requires changes, you should hand off control to the WriteAgent to implement the changes after submitting the review."
    ),
    llm=llm,
    tools=[review_report],
    can_handoff_to=["WriteAgent"],
)

## Testing a single agent

Use the test agent to ensure that agent and tools work with the chosen model

In [15]:
response = await test_agent_search.run(user_msg="What is the weather in San Francisco?")
print(str(response))

The weather in San Francisco is currently 57°F and partly cloudy, with a west wind at 10 mph and 77% humidity.


In [16]:
response = await test_agent_query.run(user_msg="Who published the first paper on packet switching theory?")
print(str(response))

Leonard Kleinrock at MIT published the first paper on packet switching theory in July 1961.


## Running the Workflow

With our agents defined, we can create our `AgentWorkflow` and run it.

In [19]:
from llama_index.core.agent.workflow import AgentWorkflow

agent_workflow = AgentWorkflow(
    agents=[research_agent, write_agent, review_agent],
    root_agent=research_agent.name,
    initial_state={
        "research_notes": {},
        "report_content": "Not written yet.",
        "review": "Review required.",
    },
)

As the workflow is running, we will stream the events to get an idea of what is happening under the hood.

In [20]:
from llama_index.core.agent.workflow import (
    AgentInput,
    AgentOutput,
    ToolCall,
    ToolCallResult,
    AgentStream,
)

handler = agent_workflow.run(
    user_msg=(
        "Search for the history of internet and write me a report on it. "
        "Briefly describe the history of the internet, including the development of the internet, the development of the web, "
        "and the development of the internet in the 21st century."
    )
)

current_agent = None
current_tool_calls = ""
async for event in handler.stream_events():
    if (
        hasattr(event, "current_agent_name")
        and event.current_agent_name != current_agent
    ):
        current_agent = event.current_agent_name
        print(f"\n{'='*50}")
        print(f"🤖 Agent: {current_agent}")
        print(f"{'='*50}\n")

    # if isinstance(event, AgentStream):
    #     if event.delta:
    #         print(event.delta, end="", flush=True)
    # elif isinstance(event, AgentInput):
    #     print("📥 Input:", event.input)
    elif isinstance(event, AgentOutput):
        if event.response.content:
            print("📤 Output:", event.response.content)
        if event.tool_calls:
            print(
                "🛠️  Planning to use tools:",
                [call.tool_name for call in event.tool_calls],
            )
    elif isinstance(event, ToolCallResult):
        print(f"🔧 Tool Result ({event.tool_name}):")
        print(f"  Arguments: {event.tool_kwargs}")
        print(f"  Output: {event.tool_output}")
    elif isinstance(event, ToolCall):
        print(f"🔨 Calling Tool: {event.tool_name}")
        print(f"  With arguments: {event.tool_kwargs}")


🤖 Agent: ResearchAgent

📤 Output: Thought: The current language of the user is: english. I need to use a tool to help me answer the question. The user wants a report on the history of the internet, covering its development, the web's development, and its evolution in the 21st century. I should use the `query_data` tool to gather this information.
Action: query_data
Action Input: {"query": "history of the internet, development of the web, internet in the 21st century"}
🛠️  Planning to use tools: ['query_data']
🔨 Calling Tool: query_data
  With arguments: {'query': 'history of the internet, development of the web, internet in the 21st century'}
🔧 Tool Result (query_data):
  Arguments: {'query': 'history of the internet, development of the web, internet in the 21st century'}
  Output: Q: history of the internet, development of the web, internet in the 21st century
A: Leonard Kleinrock at MIT published the first paper on packet switching theory in July 1961.

Sources:
[("There is the oper

  current_state = await ctx.get("state")
  await ctx.set("state", current_state)


📤 Output: Thought: The current language of the user is: english. I have successfully gathered all the necessary information and recorded it as notes. Now, I need to hand off to the `WriteAgent` to compile this information into a report as requested by the user.
Action: handoff
Action Input: {'to_agent': 'WriteAgent', 'reason': 'I have gathered all the necessary information on the history of the internet, including its development, the web, and its evolution in the 21st century. The WriteAgent is now needed to compile this into a report.'}
🛠️  Planning to use tools: ['handoff']
🔨 Calling Tool: handoff
  With arguments: {'to_agent': 'WriteAgent', 'reason': 'I have gathered all the necessary information on the history of the internet, including its development, the web, and its evolution in the 21st century. The WriteAgent is now needed to compile this into a report.'}
🔧 Tool Result (handoff):
  Arguments: {'to_agent': 'WriteAgent', 'reason': 'I have gathered all the necessary information

  current_state = await ctx.get("state")
  await ctx.set("state", current_state)


📤 Output: Thought: I can answer without using any more tools. I'll use the user's language to answer
Answer: I have successfully generated the report on the history of the internet, including its development, the development of the web, and its evolution in the 21st century. The report has been written and is available in the `report_content` state.


Now, we can retrieve the final report in the system for ourselves.

In [21]:
state = await handler.ctx.store.get("state")
print(state["report_content"])

# The History of the Internet

The Internet, a global network of interconnected computer networks, has revolutionized communication and information exchange. Its origins can be traced back to foundational research in the mid-20th century, evolving through several key stages to become the ubiquitous infrastructure it is today.

## Development of the Internet

The conceptual groundwork for the Internet began with **Leonard Kleinrock**'s work on packet switching theory, first published in July 1961. This theory proposed breaking data into small blocks (packets) for more efficient transmission, a fundamental departure from traditional circuit-switched networks.

In August 1962, **J.C.R. Licklider** of MIT envisioned a "Galactic Network," a globally interconnected set of computers allowing universal access to data and programs. Licklider, as the first head of the computer research program at DARPA (Defense Advanced Research Projects Agency), convinced his successors of the importance of thi