# Building a technical documentation agent with MongoDB, Gemini, and LangGraph

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/mongodb-developer/GenAI-Showcase/blob/main/notebooks/agents/docs_agent_mngodb_langgraph.ipynb)

#### **NOTE**: The 📚 emoji indicates reference documentation.

# Step 1: Install required libraries

* **pymongo**: Python driver for MongoDB
* **langchain**: Python library for LangChain, an LLM app orchestration framework
* **langchain-google-genai**: Python library to use Google's GenAI models in LangChain
* **langgraph**: Python library for LangChain's agent orchestration framework, LangGraph
* **langgraph-checkpoint-mongodb**: Python library to add MongoDB as a checkpointer in LangGraph
* **sentence_transformers**: Python library to use open-source ML models from Hugging Face

In [2]:
! pip install -qU pymongo langchain langchain-google-genai langgraph langgraph-checkpoint-mongodb sentence_transformers
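
Because `pip install -q` suppresses most output, it can be useful to confirm that the libraries listed above actually landed in the current environment. The following is a minimal sketch using only the standard library; the helper name `missing_packages` and the `REQUIRED` list are illustrative and not part of the original notebook.

```python
from importlib.metadata import PackageNotFoundError, version

# Distribution names taken from the pip command above.
REQUIRED = [
    "pymongo",
    "langchain",
    "langchain-google-genai",
    "langgraph",
    "langgraph-checkpoint-mongodb",
    "sentence-transformers",
]


def missing_packages(names):
    """Return the names that have no installed distribution in this environment."""
    missing = []
    for name in names:
        try:
            version(name)  # raises PackageNotFoundError if not installed
        except PackageNotFoundError:
            missing.append(name)
    return missing


print("Missing packages:", missing_packages(REQUIRED) or "none")
```

If the list is not empty, re-running the install cell (and restarting the kernel if needed) is usually enough.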

# Step 2: Set up prerequisites

* Register for a [free MongoDB Atlas account](https://www.mongodb.com/cloud/atlas/register)
* [Create a new database cluster](https://www.mongodb.com/docs/guides/atlas/cluster/)
* [Obtain the connection string](https://www.mongodb.com/docs/guides/atlas/connection-string/) for your database cluster

In [3]:
import getpass

from pymongo import MongoClient

In [4]:
# Paste your MongoDB connection string. Be sure to replace the password placeholder with your actual password.
MONGODB_URI = getpass.getpass("Enter your MongoDB connection string:")
# Initialize a MongoDB Python client
mongodb_client = MongoClient(MONGODB_URI, appname="devrel.workshop.agents")
# Check the connection to the server
mongodb_client.admin.command("ping")

Enter your MongoDB connection string: ········


{'ok': 1.0,
 '$clusterTime': {'clusterTime': Timestamp(1744928532, 43),
  'signature': {'hash': b'\x7fN\x9d\xa9\x98\xbf\xcb\xf9\x9cy\xf0\xdd\xd3\x83\xa9v\x9e\xb8\x17)',
   'keyId': 7456513059255746561}},
 'operationTime': Timestamp(1744928532, 43)}
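
A common reason for the ping to fail is a connection string that still contains the password placeholder. As a hedged illustration, the sketch below runs two cheap checks on the string before it is handed to `MongoClient`. The helper `uri_problems` is hypothetical, and the assumption that placeholders appear in angle brackets (for example `<password>`) may not hold for every connection string.

```python
import re

# Schemes accepted for MongoDB connection strings.
VALID_SCHEMES = ("mongodb://", "mongodb+srv://")
# Assumption: an unreplaced placeholder looks like <password> or <db_password>.
PLACEHOLDER = re.compile(r"<[^<>]+>")


def uri_problems(uri):
    """Return a list of human-readable problems found in a connection string."""
    problems = []
    if not uri.strip().startswith(VALID_SCHEMES):
        problems.append("URI should start with mongodb:// or mongodb+srv://")
    if PLACEHOLDER.search(uri):
        problems.append("URI still contains an angle-bracket placeholder")
    return problems


# Example with a made-up host and an unreplaced placeholder.
print(uri_problems("mongodb+srv://user:<password>@cluster0.example.mongodb.net/"))
```

In the notebook, this could be called as `uri_problems(MONGODB_URI)` right after the `getpass` prompt; an empty list means neither check found an issue, not that the credentials are valid.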

### **Do not change the values assigned to the variables below**

In [5]:
# Database name
DB_NAME = "mongodb_genai_devday"
# Name of the collection with full documents, used for summarization
FULL_COLLECTION_NAME = "mongodb-docs"
# Name of the collection for vector search, used for Q&A
VS_COLLECTION_NAME = "mongodb-docs-embedded"
# Name of the vector search index
VS_INDEX_NAME = "vector_index"

📚 https://pymongo.readthedocs.io/en/stable/tutorial.html#getting-a-database

In [6]:
# Connect to the `DB_NAME` database.
db = mongodb_client[DB_NAME]

📚 https://pymongo.readthedocs.io/en/stable/tutorial.html#getting-a-collection

In [7]:
# Connect to the `VS_COLLECTION_NAME` collection.
vs_collection = db[VS_COLLECTION_NAME]

In [8]:
# Connect to the `FULL_COLLECTION_NAME` collection.
full_collection = db[FULL_COLLECTION_NAME]
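
Note that PyMongo resolves databases and collections lazily: `db[...]` returns a handle even if the collection does not exist on the server yet. A small sketch of how one might check which of the expected collections are actually present, for example after the data has been imported, follows. The helper `missing_collections` is illustrative and not part of the original notebook; it relies only on `Database.list_collection_names()`.

```python
def missing_collections(db, names):
    """Return the collection names from `names` that do not exist in `db`.

    `db` is expected to be a pymongo Database (anything exposing
    `list_collection_names()` works).
    """
    existing = set(db.list_collection_names())
    return [name for name in names if name not in existing]


# Usage in the notebook, with the variables defined above:
# missing_collections(db, [FULL_COLLECTION_NAME, VS_COLLECTION_NAME])
```

An empty result means both collections exist; it says nothing about whether they contain documents or whether the vector search index has been built.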

In [9]:
# Endpoint used to import data and to get the LangChain LLM object
SERVERLESS_URL = "https://vtqjvgchmwcjwsrela2oyhlegu0hwqnw.lambda-url.us-west-2.on.aws/"
# Set the LLM provider to `google` to use Gemini 1.5 Pro as the brain of the agent
LLM_PROVIDER = "google"