# MongoDB Query Engine Tutorial

This notebook demonstrates how to use the `MongoDBQueryEngine` for retrieval-augmented question answering over documents stored in MongoDB. It shows how to set up the engine with MongoDB and a simple text parser, index parsed Markdown files with LlamaIndex, and execute natural language queries against the indexed data.

The `MongoDBQueryEngine` integrates LlamaIndex with MongoDB vector storage, either a cloud MongoDB Atlas cluster or a MongoDB instance running on localhost, for efficient document retrieval.
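
Because the engine can target either Atlas or a local instance, the only thing that changes between the two is the connection string. The sketch below shows one way to choose it. The environment variable name `MONGODB_URI`, the helper `resolve_connection_string`, and the localhost default are assumptions made for illustration, not part of the notebook or of the engine's API.

```python
import os

# Assumed values for illustration only; neither name comes from the notebook.
ENV_VAR = "MONGODB_URI"  # hypothetical environment variable holding an Atlas URI
LOCAL_URI = "mongodb://localhost:27017/?directConnection=true"


def resolve_connection_string(env=None) -> str:
    """Return the URI found in the environment, or fall back to localhost."""
    env = os.environ if env is None else env
    uri = env.get(ENV_VAR, "").strip()
    if not uri:
        return LOCAL_URI
    # Atlas URIs normally use the mongodb+srv:// scheme; plain mongodb:// is also valid.
    if not uri.startswith(("mongodb://", "mongodb+srv://")):
        raise ValueError(f"{ENV_VAR} does not look like a MongoDB URI")
    return uri


# With nothing configured, the helper falls back to the local default.
print(resolve_connection_string({}))
```

The returned string could then be passed as the `connection_string` argument when constructing the engine.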

In [None]:
%pip install llama-index==0.12.16
%pip install llama-index-vector-stores-mongodb==0.6.0

Before calling the `MongoDBQueryEngine`, we create an embedding function. You can replace it with any built-in embedding function from LangChain or LlamaIndex, or with your own custom implementation.
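
To make the role of the embedding function concrete, here is a toy illustration in plain Python. It is not the OpenAI model and does not implement the LlamaIndex or Chroma interfaces. `make_toy_embedder` is a hypothetical bag-of-words stand-in that shows the basic contract: text goes in, a fixed-length vector comes out, and cosine similarity between vectors ranks stored chunks against a query.

```python
import math
from typing import Callable


def make_toy_embedder(vocab: list[str]) -> Callable[[str], list[float]]:
    """Build a hypothetical embedding function that counts known words."""
    index = {word: i for i, word in enumerate(vocab)}

    def embed(text: str) -> list[float]:
        vec = [0.0] * len(vocab)
        for word in text.lower().split():
            if word in index:  # words outside the vocabulary are ignored
                vec[index[word]] += 1.0
        norm = math.sqrt(sum(v * v for v in vec))
        # Normalize to unit length so a dot product equals cosine similarity.
        return [v / norm for v in vec] if norm else vec

    return embed


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two unit-length vectors."""
    return sum(x * y for x, y in zip(a, b))


chunks = [
    "research and development expenses increased",
    "the company trades on the new york stock exchange",
]
query = "research and development"

# Build the vocabulary from every word seen in the chunks and the query.
vocab = sorted({w for text in chunks + [query] for w in text.lower().split()})
embed = make_toy_embedder(vocab)

scores = [cosine(embed(query), embed(chunk)) for chunk in chunks]
best = chunks[scores.index(max(scores))]
print(best)  # the chunk sharing the most words with the query
```

A real embedding model replaces the word counts with learned dense vectors, but the retrieval step works the same way: embed the query, compare it with the stored vectors, and return the closest chunks.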

In [None]:
import os

from chromadb.utils import embedding_functions

import autogen

config_list = autogen.config_list_from_json(env_or_file="OAI_CONFIG_LIST")

assert len(config_list) > 0
print("models to use: ", [cfg["model"] for cfg in config_list])

# Put the OpenAI API key into the environment
os.environ["OPENAI_API_KEY"] = config_list[0]["api_key"]

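# Chroma-style OpenAI embedding function.
# Note: the engine created in the next cell is configured with LlamaIndex's OpenAIEmbedding instead.
openai_ef = embedding_functions.OpenAIEmbeddingFunction(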
    os.environ["OPENAI_API_KEY"],
    model_name="text-embedding-3-large",
)

In [None]:
from llama_index.embeddings.openai import OpenAIEmbedding

from autogen.agentchat.contrib.rag.mongodb_query_engine import MongoDBQueryEngine

query_engine = MongoDBQueryEngine(
    connection_string="",  # Set this to your MongoDB connection string
    database_name="vector_db_2",
    collection_name="test_collection",
    embedding_function=OpenAIEmbedding(),
)

In [None]:
print(query_engine.get_collection_name())

In [None]:
input_dir = "/root/ag2/test/agents/experimental/document_agent/pdf_parsed/"  # Update to match your input directory

In [None]:
input_docs = [input_dir + "nvidia_10k_2024.md"]  # Update to match your input documents
query_engine.init_db(new_doc_paths_or_urls=input_docs)
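
The cells above build document paths by string concatenation, which depends on `input_dir` ending with a slash. As a hedged alternative, the sketch below joins paths with `pathlib` and checks that each file exists before anything is indexed. `collect_docs` is a hypothetical helper, not part of the engine, and the temporary directory only stands in for a real input directory.

```python
import tempfile
from pathlib import Path


def collect_docs(input_dir: str, names: list[str]) -> list[str]:
    """Join file names onto input_dir and fail early if any file is missing."""
    base = Path(input_dir)
    paths = [base / name for name in names]
    missing = [str(p) for p in paths if not p.is_file()]
    if missing:
        raise FileNotFoundError(f"Missing input documents: {missing}")
    return [str(p) for p in paths]


# Demonstration with a throwaway directory in place of a real input directory.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "example.md").write_text("# Example\n")
    docs = collect_docs(tmp, ["example.md"])
    print(docs)
```

The resulting list has the same shape as `input_docs` above, so it could be passed as `new_doc_paths_or_urls`.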

In [None]:
question = "How much money did Nvidia spend in research and development"
answer = query_engine.query(question)
print(answer)

In [None]:
new_docs = [input_dir + "Toast_financial_report.md"]
query_engine.add_docs(new_doc_paths_or_urls=new_docs)

Use `query` to answer the user's question.

In [None]:
question = "What is the trading symbol for Toast"
answer = query_engine.query(question)
print(answer)