# DocumentAgent

This notebook demonstrates how to use DocumentAgent, which can:
1. Ingest documents from a local file or a URL.
2. Answer questions about them using retrieval-augmented generation (RAG).

### Installation

To get started with the DocumentAgent integration in AG2, install AG2 with the `rag` extra:
   ```bash
   pip install "ag2[rag]"
   ```
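To confirm the install before running the cells below, one option is to check that the package can be found. This is a minimal sketch using only the standard library; the helper name `is_module_available` is hypothetical, and it assumes AG2 is importable as `autogen`, as in the imports used later in this notebook.

```python
import importlib.util


def is_module_available(module_name: str) -> bool:
    """Return True if a top-level module can be located, without importing it."""
    return importlib.util.find_spec(module_name) is not None


# "autogen" is the import name used throughout this notebook.
print("autogen importable:", is_module_available("autogen"))
```

If this prints `False`, the install most likely went into a different Python environment than the one running the notebook kernel.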

**Warning:**
   1. The current DocumentAgent only supports questions related to the ingested documents.
   2. Answers may be inaccurate for documents that could not be parsed correctly into Markdown.

You're all set! You can now start using the DocumentAgent feature in AG2.

## Inside the DocumentAgent

![DocumentAgent's internal swarm](./documentagent/documentagent_swarm.png)

In [None]:
import os

import autogen

config_list = autogen.config_list_from_json(
    "../OAI_CONFIG_LIST",
    filter_dict={
        "model": ["gpt-4o"],
    },
)
os.environ["OPENAI_API_KEY"] = config_list[0]["api_key"]

llm_config = {
    "config_list": config_list,
}
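For readers unfamiliar with `OAI_CONFIG_LIST`: it is a JSON list of model configurations, and `filter_dict` keeps only the entries whose fields match the allowed values. The sketch below imitates that selection with the standard library so it can be run without an API key. `select_configs` is a hypothetical helper written for illustration, not AG2's implementation, and the keys shown are placeholders.

```python
import json

# Placeholder content in the shape of an OAI_CONFIG_LIST file (keys are fake).
SAMPLE_CONFIG_JSON = """
[
    {"model": "gpt-4o", "api_key": "sk-placeholder-1"},
    {"model": "gpt-4o-mini", "api_key": "sk-placeholder-2"}
]
"""


def select_configs(configs, filter_dict):
    """Keep configs whose value for every filter key is in the allowed list."""
    return [
        cfg
        for cfg in configs
        if all(cfg.get(key) in allowed for key, allowed in filter_dict.items())
    ]


configs = json.loads(SAMPLE_CONFIG_JSON)
selected = select_configs(configs, {"model": ["gpt-4o"]})
print([cfg["model"] for cfg in selected])  # ['gpt-4o']
```

If no entry matches the filter, the list is empty and indexing it with `[0]`, as the cell above does, raises an `IndexError`, so it is worth checking the filter against the models actually present in the file.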

### Ingesting local documents and answering questions

In [None]:
from autogen.agents.experimental.document_agent.document_agent import DocumentAgent

document_agent = DocumentAgent(llm_config=llm_config, collection_name="toast_report")
document_agent.run(
    "could you ingest ../test/agentchat/contrib/graph_rag/Toast_financial_report.pdf? What is the fiscal year 2024 financial summary?",
    max_turns=1,
)

### Fetching a webpage and answering questions

In [None]:
from autogen.agents.experimental.document_agent.document_agent import DocumentAgent

document_agent = DocumentAgent(llm_config=llm_config, collection_name="news_reports")
document_agent.run(
    "could you read 'https://www.independent.co.uk/space/earth-core-inner-shape-change-b2695585.html' and summarize the article?",
    max_turns=1,
)
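Both examples pass the source inside a natural-language request: a local path in the first and a URL in the second. A rough sketch of how such an input could be told apart, assuming a simple scheme-and-host test (the helper `classify_source` is hypothetical and is not the agent's actual logic):

```python
from urllib.parse import urlparse


def classify_source(source: str) -> str:
    """Label an input 'url' when it has an http(s) scheme and a host, else 'file'."""
    parsed = urlparse(source)
    if parsed.scheme in ("http", "https") and parsed.netloc:
        return "url"
    return "file"


# The two sources used in the cells above.
print(classify_source("../test/agentchat/contrib/graph_rag/Toast_financial_report.pdf"))  # file
print(classify_source("https://www.independent.co.uk/space/earth-core-inner-shape-change-b2695585.html"))  # url
```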

## Multiple DocumentAgents in a Swarm

In [1]:
from autogen import ConversableAgent, initiate_swarm_chat, AfterWorkOption
from autogen.agents.experimental import DocumentAgent

# Ensure the OPENAI_API_KEY is set in the environment
llm_config = {"model": "gpt-4o-mini", "api_type": "openai", "cache_seed": None}

nvidia_agent = DocumentAgent(
    name="nvidia_agent",
    llm_config=llm_config,
    collection_name="nvidia-demo",
)

amd_agent = DocumentAgent(
    name="amd_agent",
    llm_config=llm_config,
    collection_name="amd-demo",
)

analyst = ConversableAgent(
    name="financial_analyst",
    system_message=(
        "You are a financial analyst working with two specialist agents, amd_agent who handles all AMD documents and queries, and nvidia_agent who handles all NVIDIA documents and queries. "
        "Each agent knows how to load documents and answer questions from the document regarding their respective companies. "
        "Only mention one of the two agents at a time, prioritize amd_agent. You will be able to engage each agent separately in subsequent iterations. "
        "Work with only one agent at a time and provide (a) an indication for them to take action by saying '[Next to speak is ...]' together with (b) documents they need to ingest and (c) queries they need to run, if any. "
        "When all documents have been ingested and all queries have been answered, provide a summary and add 'TERMINATE' to the end of your summary."
    ),
    is_termination_msg=lambda x: x.get("content", "") and "terminate" in x.get("content", "").lower(),
    llm_config=llm_config,
)

result, _, _ = initiate_swarm_chat(
    initial_agent=analyst,
    agents=[analyst, nvidia_agent, amd_agent],
    messages=(
        "Use the amd_agent to load AMD's 4th quarter 2024 report from "
        "./documentagent/AMDQ4-2024.pdf "
        "and use the nvidia_agent to load NVIDIA's 3rd quarter 2025 report from "
        "./documentagent/NVIDIAQ3-2025.pdf. "
        "Then ask both agents what AMD and NVIDIA did, in detail, in regards to AI in their latest quarters and what the net revenues were."
    ),
    swarm_manager_args={
        "llm_config": llm_config,
        "system_message": (
            "You are managing a financial analyst and two specialist company agents. "
            "After each specialist agent, amd_agent or nvidia_agent speak, always have the financial analyst speak next."
        ),
        "is_termination_msg": lambda x: x.get("content", "") and "terminate" in x.get("content", "").lower(),
    },
    after_work=AfterWorkOption.SWARM_MANAGER,
)
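
The `is_termination_msg` lambdas above end the chat once a message contains "terminate" in any letter case. The same check can be written as a named function that always returns a strict boolean, including when `content` is missing or `None`. This is a standalone sketch for illustration, with made-up sample messages:

```python
def is_termination_msg(message: dict) -> bool:
    """Return True when the message content contains 'terminate' (any case)."""
    content = message.get("content") or ""  # treat missing or None content as empty
    return "terminate" in content.lower()


print(is_termination_msg({"content": "Summary of both reports. TERMINATE"}))  # True
print(is_termination_msg({"content": None}))  # False
print(is_termination_msg({}))  # False
```

Because the match is a plain substring test, any message that merely mentions the word also ends the chat, which is why the analyst's system message reserves 'TERMINATE' for the final summary.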

INFO:chromadb.telemetry.product.posthog:Anonymized telemetry enabled. See                     https://docs.trychroma.com/telemetry for more information.
INFO:autogen.agents.experimental.document_agent.docling_query_engine:Creating new collection nvidia-demo in the database.
INFO:autogen.agents.experimental.document_agent.docling_query_engine:Creating new collection amd-demo in the database.


_User (to chat_manager):

Use the amd_agent to load AMD's 4th quarter 2024 report from ./documentagent/AMDQ4-2024.pdf and use the nvidia_agent to load NVIDIA's 3rd quarter 2025 report from ./documentagent/NVIDIAQ3-2025.pdf. Then ask both agents what AMD and NVIDIA did, in detail, in regards to AI in their latest quarters and what the net revenues were.

--------------------------------------------------------------------------------

Next speaker: financial_analyst

>>>>>>>> USING AUTO REPLY...
financial_analyst (to chat_manager):

[Next to speak is amd_agent]  
Documents to ingest: ./documentagent/AMDQ4-2024.pdf  
Queries to run:  
1. What did AMD do regarding AI in the 4th quarter of 2024?  
2. What were the net revenues for AMD in the 4th quarter of 2024?  

---

After amd_agent has processed the requests, I will continue with nvidia_agent.

--------------------------------------------------------------------------------

Next speaker: amd_a

INFO:autogen.agents.experimental.document_agent.document_agent:initiate_tasks context_variables


_Swarm_Tool_Executor (to chat_manager):

***** Response from calling tool (call_hVFsjovgqvZOgNhxFN734TX2) *****
Updated context variables with task decisions
**********************************************************************

--------------------------------------------------------------------------------

Next speaker: TaskManagerAgent

>>>>>>>> USING AUTO REPLY...
TaskManagerAgent (to chat_manager):

***** Suggested tool call (call_2Gt2gcI9DVxoiGzwKbufBV03): transfer_TaskManagerAgent_to_DoclingDocIngestAgent *****
Arguments: 
{}
*******************************************************************************************************************

--------------------------------------------------------------------------------

Next speaker: _Swarm_Tool_Executor

>>>>>>>> EXECUTING FUNCTION transfer_TaskManagerAgent_to_DoclingDocIngestAgent...
Call ID: call_2Gt2gcI9DVxoiGzwKbufBV03
Input arguments

INFO:autogen.agents.experimental.document_agent.document_utils:Error when checking if ./documentagent/AMDQ4-2024.pdf is a valid URL: Invalid URL.
INFO:autogen.agents.experimental.document_agent.document_utils:Detected file. Returning file path...
INFO:docling.document_converter:Going to convert document batch...
INFO:docling.utils.accelerator_utils:Accelerator device: 'mps'
INFO:docling.utils.accelerator_utils:Accelerator device: 'mps'
INFO:docling.pipeline.base_pipeline:Processing document AMDQ4-2024.pdf
INFO:docling.document_converter:Finished converting document AMDQ4-2024.pdf in 20.78 sec.
INFO:autogen.agents.experimental.document_agent.parser_utils:Document converted in 20.78 seconds.
DEBUG:autogen.agents.experimental.document_agent.parser_utils:Document AMDQ4-2024.pdf converted.
Saved markdown output to: /Users/marksze/AG2/dev/code/2025-02-ag2documentagent/ag2/notebook/parsed_docs
DEBUG:autogen.agents.experimental.document_agent.parser_utils:item-0 at level 0: unspecified: group 

_Swarm_Tool_Executor (to chat_manager):

***** Response from calling tool (call_ASXli2VKFYxFuIFw931gumxB) *****
Data Ingestion Task Completed for ./documentagent/AMDQ4-2024.pdf
**********************************************************************

--------------------------------------------------------------------------------

Next speaker: TaskManagerAgent

>>>>>>>> USING AUTO REPLY...
TaskManagerAgent (to chat_manager):

***** Suggested tool call (call_gFeJH4sXdtRRMfjD5TwQmcUk): transfer_TaskManagerAgent_to_QueryAgent *****
Arguments: 
{}
********************************************************************************************************

--------------------------------------------------------------------------------

Next speaker: _Swarm_Tool_Executor

>>>>>>>> EXECUTING FUNCTION transfer_TaskManagerAgent_to_QueryAgent...
Call ID: call_gFeJH4sXdtRRMfjD5TwQmcUk
Input arguments: {}

INFO:autogen.agents.experimental.document_agent.document_agent:initiate_tasks context_variables


_Swarm_Tool_Executor (to chat_manager):

***** Response from calling tool (call_fWnvmd6Iotd6GXqxyunfsEqV) *****
Updated context variables with task decisions
**********************************************************************

--------------------------------------------------------------------------------

Next speaker: TaskManagerAgent

>>>>>>>> USING AUTO REPLY...
TaskManagerAgent (to chat_manager):

***** Suggested tool call (call_Xs0dShAhViT6G2gTn1Ceq7di): transfer_TaskManagerAgent_to_DoclingDocIngestAgent *****
Arguments: 
{}
*******************************************************************************************************************

--------------------------------------------------------------------------------

Next speaker: _Swarm_Tool_Executor

>>>>>>>> EXECUTING FUNCTION transfer_TaskManagerAgent_to_DoclingDocIngestAgent...
Call ID: call_Xs0dShAhViT6G2gTn1Ceq7di
Input arguments

INFO:autogen.agents.experimental.document_agent.document_utils:Error when checking if ./documentagent/NVIDIAQ3-2025.pdf is a valid URL: Invalid URL.
INFO:autogen.agents.experimental.document_agent.document_utils:Detected file. Returning file path...
INFO:docling.document_converter:Going to convert document batch...
INFO:docling.utils.accelerator_utils:Accelerator device: 'mps'
INFO:docling.utils.accelerator_utils:Accelerator device: 'mps'
INFO:docling.pipeline.base_pipeline:Processing document NVIDIAQ3-2025.pdf
INFO:docling.document_converter:Finished converting document NVIDIAQ3-2025.pdf in 15.61 sec.
INFO:autogen.agents.experimental.document_agent.parser_utils:Document converted in 15.61 seconds.
DEBUG:autogen.agents.experimental.document_agent.parser_utils:Document NVIDIAQ3-2025.pdf converted.
Saved markdown output to: /Users/marksze/AG2/dev/code/2025-02-ag2documentagent/ag2/notebook/parsed_docs
DEBUG:autogen.agents.experimental.document_agent.parser_utils:item-0 at level 0: unspeci

_Swarm_Tool_Executor (to chat_manager):

***** Response from calling tool (call_oHaMIiTd9vRpuyWZ2e4GYpT1) *****
Data Ingestion Task Completed for ./documentagent/NVIDIAQ3-2025.pdf
**********************************************************************

--------------------------------------------------------------------------------

Next speaker: TaskManagerAgent

>>>>>>>> USING AUTO REPLY...
TaskManagerAgent (to chat_manager):

***** Suggested tool call (call_7GVEhhQLCDszHrhzE8Rim1eI): transfer_TaskManagerAgent_to_QueryAgent *****
Arguments: 
{}
********************************************************************************************************

--------------------------------------------------------------------------------

Next speaker: _Swarm_Tool_Executor

>>>>>>>> EXECUTING FUNCTION transfer_TaskManagerAgent_to_QueryAgent...
Call ID: call_7GVEhhQLCDszHrhzE8Rim1eI
Input arguments: {}