# DocumentAgent

This notebook shows how DocumentAgent can, through natural language:

1. Ingest documents from a local file or URL
2. Answer questions about them using retrieval-augmented generation (RAG)

### Installation

To use the DocumentAgent integration in AG2, install AG2 with the `rag` extra:
```bash
pip install "ag2[rag]"
```

Note:

   1. Currently, the DocumentAgent only supports questions related to the ingested documents.
   2. Answers may not be accurate for documents that cannot be parsed correctly to Markdown format.

You're all set! You can now start using the DocumentAgent feature in AG2.

## Inside the DocumentAgent

![DocumentAgent's internal swarm](./documentagent/documentagent_swarm.png)

In [None]:
import os

import autogen
from autogen import AfterWorkOption, ConversableAgent, initiate_swarm_chat
from autogen.agents.experimental import DocumentAgent

In [None]:
config_list = autogen.config_list_from_json(
    "../OAI_CONFIG_LIST",
    filter_dict={
        "model": ["gpt-4o"],
    },
)
os.environ["OPENAI_API_KEY"] = config_list[0]["api_key"]

llm_config = {
    "config_list": config_list,
}
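
The `filter_dict` argument in the cell above keeps only the config entries whose `model` is `gpt-4o`. As a rough illustration of that filtering step, here is a minimal stdlib-only sketch. The file contents, the placeholder API keys, and the `filter_by_model` helper are all assumptions made for this example; it is not AG2's implementation.

```python
import json

# Hypothetical contents of an OAI_CONFIG_LIST file (placeholder keys, not real ones).
raw = """
[
    {"model": "gpt-4o", "api_key": "sk-placeholder-1"},
    {"model": "gpt-4o-mini", "api_key": "sk-placeholder-2"}
]
"""


def filter_by_model(configs, allowed_models):
    """Keep only the entries whose "model" value is in allowed_models.

    Illustrative helper only; it mimics the effect of
    filter_dict={"model": [...]} on a list of config dicts.
    """
    return [c for c in configs if c.get("model") in allowed_models]


example_configs = filter_by_model(json.loads(raw), ["gpt-4o"])
print([c["model"] for c in example_configs])
```

If no entry matches the filter, the list is empty, and indexing it as `config_list[0]` (as the cell above does) would raise an `IndexError`, so it is worth checking that the config file really contains a `gpt-4o` entry.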

### Ingesting local documents and answering questions

In [None]:
document_agent = DocumentAgent(llm_config=llm_config, collection_name="toast_report")
document_agent.run(
    "could you ingest ../test/agentchat/contrib/graph_rag/Toast_financial_report.pdf? What is the fiscal year 2024 financial summary?",
    max_turns=1,
)

### Fetching a webpage and answering questions

In [None]:
document_agent = DocumentAgent(llm_config=llm_config, collection_name="news_reports")
document_agent.run(
    "could you read 'https://www.independent.co.uk/space/earth-core-inner-shape-change-b2695585.html' and summarize the article?",
    max_turns=1,
)

## Multiple DocumentAgents in a Swarm

In [None]:
# Ensure the OPENAI_API_KEY is set in the environment
llm_config = {"model": "gpt-4o-mini", "api_type": "openai", "cache_seed": None}

nvidia_agent = DocumentAgent(
    name="nvidia_agent",
    llm_config=llm_config,
    collection_name="nvidia-demo",
)

amd_agent = DocumentAgent(
    name="amd_agent",
    llm_config=llm_config,
    collection_name="amd-demo",
)

analyst = ConversableAgent(
    name="financial_analyst",
    system_message=(
        "You are a financial analyst working with two specialist agents, amd_agent who handles all AMD documents and queries, and nvidia_agent who handles all NVIDIA documents and queries. "
        "Each agent knows how to load documents and answer questions from the document regarding their respective companies. "
        "Only mention one of the two agents at a time, prioritize amd_agent. You will be able to engage each agent separately in subsequent iterations. "
        "Work with only one agent at a time and provide (a) an indication for them to take action by saying '[Next to speak is ...]' together with (b) documents they need to ingest and (c) queries they need to run, if any. "
        "When all documents have been ingested and all queries have been answered, provide a summary and add 'TERMINATE' to the end of your summary."
    ),
    is_termination_msg=lambda x: x.get("content", "") and "terminate" in x.get("content", "").lower(),
    llm_config=llm_config,
)

result, _, _ = initiate_swarm_chat(
    initial_agent=analyst,
    agents=[analyst, nvidia_agent, amd_agent],
    messages=(
        "Use the amd_agent to load AMD's 4th quarter 2024 report from "
        "./documentagent/AMDQ4-2024.pdf "
        "and use the nvidia_agent to load NVIDIA's 3rd quarter 2025 report from "
        "./documentagent/NVIDIAQ3-2025.pdf. "
        "Then ask both agents what AMD and NVIDIA did, in detail, in regards to AI in their latest quarters and what the net revenues were."
    ),
    swarm_manager_args={
        "llm_config": llm_config,
        "system_message": "You are managing a financial analyst and two specialist company agents. After a specialist agent (amd_agent or nvidia_agent) speaks, always have the financial analyst speak next.",
        "is_termination_msg": lambda x: x.get("content", "") and "terminate" in x.get("content", "").lower(),
    },
    after_work=AfterWorkOption.SWARM_MANAGER,
)
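
Both the analyst and the swarm manager stop the chat through the same `is_termination_msg` predicate. The sketch below restates that predicate as a named function so its behavior on a few sample messages is easy to inspect. The function name and the sample messages are made up for illustration, and the result is coerced to `bool` for readability, which the original lambda does not do.

```python
def is_termination_msg(message):
    """Return True when the message content mentions "terminate" (case-insensitive).

    Same logic as the lambda used above, with the result coerced to bool.
    """
    content = message.get("content", "")
    # content may be None or empty (for example, on tool-call messages)
    return bool(content) and "terminate" in content.lower()


# Made-up sample messages for illustration.
samples = [
    {"content": "Summary of both reports. TERMINATE"},
    {"content": "[Next to speak is amd_agent]"},
    {"content": None},
    {},
]

for sample in samples:
    print(sample, is_termination_msg(sample))
```

Because this is a plain substring match, any message containing the word, such as "do not terminate yet", also ends the chat. A stricter check, for instance `content.rstrip().endswith("TERMINATE")`, is one possible alternative if that becomes a problem.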