# Build an AI Agent with MongoDB

This notebook is a companion to the [Build AI Agents with MongoDB](https://www.mongodb.com/docs/atlas/atlas-vector-search/ai-agents) page. For a more traditional Python development example and detailed explanations of the code, refer to the tutorial on the page.

This notebook demonstrates an AI agent that uses MongoDB as the database for both agentic RAG and agent memory.

<a target="_blank" href="https://colab.research.google.com/github/mongodb/docs-notebooks/blob/main/use-cases/ai-agent.ipynb">
  <img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/>
</a>

## Set up the environment

In [None]:
pip install --quiet --upgrade pymongo voyageai openai langchain langchain_mongodb langchain_community pypdf

In [None]:
import os

os.environ["VOYAGE_API_KEY"] = "<voyage-api-key>"
os.environ["OPENAI_API_KEY"] = "<openai-api-key>"
MONGODB_URI = "<connection-string>"

## Use MongoDB as a vector database
In this section, you configure the embedding model, chunk and ingest your data into a MongoDB collection, and then create a vector search index on your data to enable vector search.
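
To make the chunking step concrete, here is a minimal sketch of fixed-size chunking with overlap, using the same `chunk_size=400` and `chunk_overlap=20` values as the notebook. The `chunk_text` helper is hypothetical and deliberately simplified: `RecursiveCharacterTextSplitter` prefers to split on separators such as paragraph breaks and spaces, so its chunk boundaries will differ from this plain sliding window.

```python
def chunk_text(text: str, chunk_size: int = 400, chunk_overlap: int = 20) -> list:
    """Split text into character windows where neighbors share chunk_overlap characters."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # distance between the starts of two chunks
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reaches the end of the text
    return chunks

# Illustrative input: 1000 characters of repeating digits
sample = "".join(str(i % 10) for i in range(1000))
chunks = chunk_text(sample)
print([len(c) for c in chunks])
```

The overlap means the last 20 characters of one chunk reappear at the start of the next, which helps preserve context that would otherwise be cut at a chunk boundary.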

In [None]:
import voyageai

# Configure the embedding model to use for retrieval
model = "voyage-3-large"
voyage_client = voyageai.Client()

# Define a function to generate embeddings
def get_embedding(data, input_type = "document"):
  embeddings = voyage_client.embed(
      data, model = model, input_type = input_type
  ).embeddings
  return embeddings[0]

In [1]:
from pymongo import MongoClient
from langchain_community.document_loaders import PyPDFLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter

# Connect to your MongoDB cluster
mongo_client = MongoClient(MONGODB_URI)
collection = mongo_client["agent_db"]["test"]

# Chunk PDF data
loader = PyPDFLoader("https://investors.mongodb.com/node/13176/pdf")
data = loader.load()
text_splitter = RecursiveCharacterTextSplitter(chunk_size=400, chunk_overlap=20)
documents = text_splitter.split_documents(data)

# Ingest chunked documents into collection
docs_to_insert = [{
    "text": doc.page_content,
    "embedding": get_embedding(doc.page_content)
} for doc in documents]
result = collection.insert_many(docs_to_insert)

In [1]:
from pymongo.operations import SearchIndexModel
import time

# Create your index model, then create the search index
index_name="vector_index"
search_index_model = SearchIndexModel(
  definition = {
    "fields": [
      {
        "type": "vector",
        "numDimensions": 1024,
        "path": "embedding",
        "similarity": "cosine"
      }
    ]
  },
  name = index_name,
  type = "vectorSearch"
)
collection.create_search_index(model=search_index_model)

# Wait for the initial sync to complete
print("Polling to check if the index is ready. This may take up to a minute.")
while True:
    indices = list(collection.list_search_indexes(index_name))
    if indices and indices[0].get("queryable") is True:
        break
    time.sleep(5)
print(index_name + " is ready for querying.")

Polling to check if the index is ready. This may take up to a minute.
vector_index is ready for querying.
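
The index definition above sets `"similarity": "cosine"`. As a rough illustration of what that metric measures, the following sketch ranks a few toy vectors against a query vector. The three-dimensional vectors and their labels are made up for this example; real embeddings from the model have 1024 dimensions, and the database computes the ranking for you.

```python
import math

def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Made-up "embeddings" for three document chunks
docs = {
    "acquisition": [0.9, 0.1, 0.0],
    "revenue": [0.1, 0.9, 0.1],
    "leadership": [0.0, 0.2, 0.9],
}
query = [0.8, 0.2, 0.1]

# Rank the chunks from most to least similar to the query
ranked = sorted(docs, key=lambda name: cosine_similarity(query, docs[name]), reverse=True)
print(ranked)
```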


## Define tools for the agent
Next, you define two tools that the agent can use to complete tasks:
- `vector_search_tool`: Runs a vector search query to retrieve semantically similar documents from Atlas.
- `calculator_tool`: Uses Python's built-in `eval()` function to evaluate a math expression passed as a string. Because `eval()` executes arbitrary Python code, use it only with trusted input.
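
If the calculator might receive untrusted input, one possible alternative is to parse the expression with Python's `ast` module and evaluate only arithmetic nodes. The sketch below is not part of the original tutorial; `safe_calculator_tool` and its helpers are hypothetical names, and the set of supported operators is an assumption you can adjust.

```python
import ast
import operator

# Arithmetic operators the calculator accepts (an assumption for this sketch)
_BINARY_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.Mod: operator.mod,
}
_UNARY_OPS = {ast.UAdd: operator.pos, ast.USub: operator.neg}

def _evaluate(node):
    """Recursively evaluate a parsed expression, rejecting anything but arithmetic."""
    if (isinstance(node, ast.Constant)
            and isinstance(node.value, (int, float))
            and not isinstance(node.value, bool)):
        return node.value
    if isinstance(node, ast.BinOp) and type(node.op) in _BINARY_OPS:
        return _BINARY_OPS[type(node.op)](_evaluate(node.left), _evaluate(node.right))
    if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
        return _UNARY_OPS[type(node.op)](_evaluate(node.operand))
    raise ValueError("Unsupported expression")

def safe_calculator_tool(user_input: str) -> str:
    try:
        tree = ast.parse(user_input, mode="eval")
        return str(_evaluate(tree.body))
    except Exception as e:
        return f"Error: {e}"

print(safe_calculator_tool("15 * 25"))
print(safe_calculator_tool("__import__('os').getcwd()"))
```

The second call returns an error string instead of running the function call, because only numbers and the listed operators are evaluated.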

In [1]:
# Define a vector search tool
def vector_search_tool(user_input: str) -> list:
    # Embed the query text; Voyage AI distinguishes query and document inputs
    query_embedding = get_embedding(user_input, input_type="query")
    pipeline = [
        {
            "$vectorSearch": {
                "index": "vector_index",
                "queryVector": query_embedding,
                "path": "embedding",
                "exact": True,
                "limit": 5
            }
        }, {
            "$project": {
                "_id": 0,
                "text": 1
            }
        }
    ]

    # Run the aggregation and return the matching documents as a list
    results = collection.aggregate(pipeline)
    return list(results)

In [1]:
# Define a simple calculator tool
def calculator_tool(user_input: str) -> str:
    try:
        result = eval(user_input)
        return str(result)
    except Exception as e:
        return f"Error: {str(e)}" 

In [1]:
# Test the tools
print(vector_search_tool("What was MongoDB's latest acquisition?"))
print(calculator_tool("15 * 25"))

[{'text': 'dilutive impact of the acquisition consideration.\nFor the third consecutive year, MongoDB  was named a Leader in the 2024 Gartner® Magic Quadrant™ for Cloud\nDatabase Management Systems. Gartner evaluated 20 vendors based on Ability to Execute and Completeness of Vision.\nLombard Odier, a Swiss private bank, partnered with MongoDB  to migrate and modernize its legacy banking technology'}, {'text': '"Looking ahead, we remain incredibly excited about our long-term growth opportunity. MongoDB  removes the constraints of legacy databases,\nenabling businesses to innovate at AI speed with our flexible document model and seamless scalability. Following the Voyage AI acquisition, we'}, {'text': 'Measures."\nFourth Quarter Fiscal 2025 and Recent Business Highlights\nMongoDB  acquired Voyage AI, a pioneer in state-of-the-art embedding and reranking models that power next-generation\nAI applications. Integrating Voyage AI\'s technology with MongoDB  will enable organizations to easil

## Add memory to your agent
In this section, you define a basic system for agentic memory by using MongoDB. This system includes the following functions, which store and retrieve previous LLM interactions in your MongoDB collection:
- `store_chat_message`: to store information about the interaction in the collection.
- `retrieve_session_history`: to retrieve all interactions for a specific session.
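
To show the shape of the stored messages and the filter-then-sort retrieval without connecting to a cluster, here is an in-memory sketch. The list `_chat_history` and the `*_in_memory` function names are hypothetical stand-ins for the MongoDB collection and the functions in this notebook.

```python
from datetime import datetime, timezone

# Stand-in for the MongoDB collection: a plain list of message documents
_chat_history = []

def store_chat_message_in_memory(session_id: str, role: str, content: str) -> None:
    _chat_history.append({
        "session_id": session_id,
        "role": role,
        "content": content,
        "timestamp": datetime.now(timezone.utc),
    })

def retrieve_session_history_in_memory(session_id: str) -> list:
    # Keep only the messages for this session, oldest first
    messages = [m for m in _chat_history if m["session_id"] == session_id]
    messages.sort(key=lambda m: m["timestamp"])  # stable sort keeps insertion order on ties
    return [{"role": m["role"], "content": m["content"]} for m in messages]

store_chat_message_in_memory("demo_session", "user", "Sample input")
store_chat_message_in_memory("other_session", "user", "Unrelated input")
store_chat_message_in_memory("demo_session", "system", "Sample response")
history = retrieve_session_history_in_memory("demo_session")
print(history)
```

Only the `role` and `content` fields are returned, which is the message format that the chat completions call in this notebook expects.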

In [1]:
from datetime import datetime
from typing import List

# Create a new collection to store chat message history
memory_collection = mongo_client["ai_agent"]["chat_history"]

def store_chat_message(session_id: str, role: str, content: str) -> None:
    message = {
        "session_id": session_id,     # Unique identifier for the chat session
        "role": role,                 # Role of the sender (user or system) 
        "content": content,           # Content of the message
        "timestamp": datetime.now(),  # Timestamp of when the message was sent
    }
    memory_collection.insert_one(message)

def retrieve_session_history(session_id: str) -> List:
    # Query the collection for messages with a specific "session_id", sorted by ascending timestamp
    cursor = memory_collection.find({"session_id": session_id}).sort("timestamp", 1)

    # Iterate through the cursor and return a list of dictionaries with the message role and content
    # (an empty cursor yields an empty list)
    return [{"role": msg["role"], "content": msg["content"]} for msg in cursor]

In [None]:
# Test your memory functions
store_chat_message("test_session", "user", "Sample input")
store_chat_message("test_session", "system", "Sample response")
print(retrieve_session_history("test_session"))

[{'role': 'user', 'content': 'Sample input'}, {'role': 'system', 'content': 'Sample response'}]


## Configure the agent
This step configures agent planning, which includes how the agent handles tool execution and responses. After configuring the LLM, you define the following functions:
- `tool_selector`: Uses the LLM to determine the best tool based on the user's input.
- `get_llm_response`: Helper function for LLM response generation.
- `generate_response`: Orchestrates the agent workflow for a task.
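
The tool selector relies on the LLM returning a JSON object, but models sometimes wrap JSON in a Markdown code fence or name a tool that does not exist. The following sketch shows one way to parse such output defensively. `parse_tool_call` and `VALID_TOOLS` are hypothetical names introduced for illustration and are not part of the notebook.

```python
import json

VALID_TOOLS = {"vector_search_tool", "calculator_tool", "none"}
FENCE = "`" * 3  # Markdown code fence marker

def parse_tool_call(response: str, fallback_input: str):
    """Return (tool, input) from the LLM output, or ("none", fallback_input) on failure."""
    text = response.strip()
    # Drop a surrounding code fence if the model added one
    if text.startswith(FENCE):
        lines = text.splitlines()[1:]
        if lines and lines[-1].strip().startswith(FENCE):
            lines = lines[:-1]
        text = "\n".join(lines)
    try:
        tool_call = json.loads(text)
    except json.JSONDecodeError:
        return "none", fallback_input
    # Reject anything that is not an object naming a known tool
    if not isinstance(tool_call, dict) or tool_call.get("tool") not in VALID_TOOLS:
        return "none", fallback_input
    return tool_call["tool"], tool_call.get("input") or fallback_input

print(parse_tool_call('{"tool": "calculator_tool", "input": "15*25"}', "What is 15*25?"))
print(parse_tool_call("I think you should search.", "What is 15*25?"))
```

Falling back to `"none"` mirrors the behavior of the tool selector in this notebook, so an unparseable response degrades to a plain LLM answer.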

In [None]:
from openai import OpenAI

openai_client = OpenAI()
model_name = "gpt-4o"

In [12]:
import json

# Define a tool selector function that decides which tool to use based on user input and message history
def tool_selector(user_input, session_history=None):
    messages = [
        {
            "role": "system",
            "content": (
                "Select the appropriate tool from the options below. Consider the full context of the conversation before deciding.\n\n"
                "Tools available:\n"
                "- vector_search_tool: Retrieve specific context about recent MongoDB earnings and announcements\n"
                "- calculator_tool: For mathematical operations\n"
                "- none: For general questions without additional context\n\n"

                "Process for making your decision:\n"
                "1. First, analyze if the current question relates to or follows up on a previous vector search query\n"
                "2. For follow-up questions, incorporate context from previous exchanges to create a comprehensive search query\n"
                "3. Only use calculator_tool for explicit mathematical operations\n"
                "4. Default to none only when certain the other tools won't help\n\n"

                "When continuing a conversation:\n"
                "- Identify the specific topic being discussed\n"
                "- Include relevant details from previous exchanges\n"
                "- Formulate a query that stands alone but preserves conversation context\n\n"

                "Return a JSON object only: {\"tool\": \"selected_tool\", \"input\": \"your_query\"}"
            )
        }
    ]
    if session_history:
        messages.extend(session_history)
    messages.append({"role": "user", "content": user_input})

    response = openai_client.chat.completions.create(
        model=model_name,
        messages=messages
    ).choices[0].message.content
    try:
        # Parse the response as JSON rather than executing it with eval()
        tool_call = json.loads(response)
        return tool_call.get("tool"), tool_call.get("input")
    except (json.JSONDecodeError, TypeError, AttributeError):
        # Fall back to no tool if the response is not a valid JSON object
        return "none", user_input

In [1]:
# Test the tool_selector function
print(tool_selector("What was MongoDB's latest acquisition?"))
print(tool_selector("What is 15*25?"))

('vector_search_tool', 'MongoDB latest acquisition')
('calculator_tool', '15*25')


In [1]:
def get_llm_response(messages, system_message_content):
    """Helper function to get response from LLM with consistent formatting"""
    # Build the system message
    system_message = {
        "role": "system",
        "content": system_message_content,
    }

    # If the conversation already contains system messages, place the new one at the end
    # so that it follows the history; build a new list to avoid mutating the caller's input
    if any(msg.get("role") == "system" for msg in messages):
        messages = messages + [system_message]
    else:
        # Otherwise, put the system message at the beginning
        messages = [system_message] + messages
    
    # Get response from LLM
    response = openai_client.chat.completions.create(
        model=model_name,
        messages=messages
    ).choices[0].message.content
    
    return response

In [1]:
# Define the agent workflow
def generate_response(session_id: str, user_input: str) -> str:
    
    # Retrieve the session history before storing the new message,
    # so that the current user input is not passed to the LLM twice
    session_history = retrieve_session_history(session_id)

    # Store the user input in the chat history collection
    store_chat_message(session_id, "user", user_input)

    # Initialize the list of inputs to pass to the LLM with the session history
    llm_input = list(session_history)

    # Append the user message in the correct format
    user_message = {
        "role": "user",
        "content": user_input
    }
    llm_input.append(user_message)

    # Call the tool_selector function to determine which tool to use
    tool, tool_input = tool_selector(user_input, session_history)
    print("Tool selected: ", tool)
    
    # Process based on selected tool
    if tool == "vector_search_tool":
        context = vector_search_tool(tool_input)
        # Construct the system prompt using the retrieved context and append it to the LLM input
        system_message_content = (
            f"Answer the user's question based on the retrieved context and conversation history.\n"
            f"1. First, understand what specific information the user is requesting\n" 
            f"2. Then, locate the most relevant details in the context provided\n"
            f"3. Finally, provide a clear, accurate response that directly addresses the question\n\n"
            f"If the current question builds on previous exchanges, maintain continuity in your answer.\n"
            f"Only state facts clearly supported by the provided context. If information is not available, say 'I DON'T KNOW'.\n\n"
            f"Context:\n{context}"
        )
        response = get_llm_response(llm_input, system_message_content)
    elif tool == "calculator_tool":
        # Perform the calculation using the calculator tool
        response = calculator_tool(tool_input)
    else:
        system_message_content = "You are a helpful assistant. Respond to the user's prompt as best as you can based on the conversation history."
        response = get_llm_response(llm_input, system_message_content)
    
    # Store the system response in the chat history collection
    store_chat_message(session_id, "system", response)
    return response

## Test the agent
Run a few queries to test the agent. The agent determines the best tool to use based on the query and retains memory from previous interactions to inform future responses.

In [None]:
message_1 = generate_response(
    session_id="123",
    user_input="What was MongoDB's latest acquisition?",
)
print(message_1)

message_2 = generate_response(
    session_id="123",
    user_input="What do they do?",
)
print(message_2)

message_3 = generate_response(
    session_id="123",
    user_input="What's 15*25?",
)
print(message_3)

Tool selected:  vector_search_tool
MongoDB's latest acquisition was **Voyage AI**, a pioneer in state-of-the-art embedding and reranking models that power next-generation AI applications. Integrating Voyage AI's technology will enable organizations to build trustworthy AI-powered applications more easily.
Tool selected:  none
Voyage AI specializes in embedding and reranking models that enhance next-generation AI applications. Their technology supports building intelligent, AI-powered systems by making it easier for organizations to implement advanced, trustworthy AI solutions. These capabilities are likely focused on improving data retrieval, search functionality, and recommendations, enabling innovative applications in various industries.
Tool selected:  calculator_tool
375
