# MongoDB Connection and LangChain Vector Store Setup

This notebook connects to MongoDB using the connection string stored in the `MONGODB_CONNECTION` environment variable, then shows how to set up a LangChain vector store on top of that connection.


In [None]:
# Add the required packages to pyproject.toml with uv (run once from the project root)
# Note: python-dotenv, langchain, and langchain-openai are already in pyproject.toml
!uv add pymongo langchain-mongodb

In [None]:
import os
from pprint import pprint

from dotenv import load_dotenv
from pymongo import MongoClient
from pymongo.errors import ServerSelectionTimeoutError

# Load environment variables from a .env file when available
load_dotenv()

mongo_uri = os.getenv("MONGODB_CONNECTION")
if not mongo_uri:
    raise ValueError(
        "Environment variable MONGODB_CONNECTION is missing. Set it in your .env file or shell before running this notebook."
    )

client = MongoClient(mongo_uri, serverSelectionTimeoutMS=5000)

try:
    # server_info() forces a round trip, so an unreachable server fails here
    server_info = client.server_info()
except ServerSelectionTimeoutError as exc:
    raise ConnectionError(
        "Unable to reach the MongoDB server. Verify the connection string and network access."
    ) from exc

print("MongoDB server version:", server_info.get("version"))
print("Available databases:")
pprint(client.list_database_names())


## Extending to LangChain Vector Store

1. Create a collection in Atlas or a local MongoDB deployment whose documents hold a `content` field (the original text) and an `embedding` field (the vector).
2. Choose an embedding model for LangChain, for example `OpenAIEmbeddings`, `OllamaEmbeddings`, or `HuggingFaceEmbeddings`.
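3. Use `MongoDBAtlasVectorSearch` from `langchain_mongodb` to expose the collection as a LangChain vector store. Note that `MongoDBStore` is a key-value document store rather than a vector store, and it appears to be provided by `langchain_community.storage`, not `langchain_mongodb`.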
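
To make the `content`/`embedding` layout from the steps above concrete, here is a minimal in-memory sketch. It is only an illustration: `toy_embed` is a hypothetical stand-in for a real embedding model, and the brute-force cosine ranking shows the idea behind vector search, not how Atlas implements it.

```python
import math
from typing import Callable, Sequence

Vector = list[float]


def toy_embed(text: str) -> Vector:
    """Hypothetical stand-in for an embedding model: counts of 'a', 'e', 'o'."""
    lowered = text.lower()
    return [float(lowered.count(ch)) for ch in "aeo"]


def to_vector_documents(
    texts: Sequence[str], embed: Callable[[str], Vector]
) -> list[dict]:
    """Build documents in the content/embedding shape described in step 1."""
    return [{"content": text, "embedding": embed(text)} for text in texts]


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rank(
    query: str, documents: list[dict], embed: Callable[[str], Vector]
) -> list[str]:
    """Return document texts ordered from most to least similar to the query."""
    query_vector = embed(query)
    ranked = sorted(
        documents,
        key=lambda doc: cosine_similarity(query_vector, doc["embedding"]),
        reverse=True,
    )
    return [doc["content"] for doc in ranked]
```

With a real model, `embed` would be something like `embeddings.embed_query`, and MongoDB would store the resulting documents and perform the ranking.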

Example code (assuming credentials and indexes are configured):

```python
from langchain_openai import OpenAIEmbeddings
from langchain_mongodb import MongoDBAtlasVectorSearch

embeddings = OpenAIEmbeddings(model="text-embedding-3-small")
collection = client["my_database"]["my_vector_collection"]

vector_store = MongoDBAtlasVectorSearch(
    collection=collection,
    embedding=embeddings,
    index_name="vector_index",  # Vector index name created in Atlas
    text_key="content",         # Original text field name
    embedding_key="embedding",  # Vector storage field name
)

# Embed the raw texts and insert them into the collection
texts = ["First document", "Second document"]
vector_store.add_texts(texts)
```
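
Once texts are stored and the vector index exists, the store can be queried. The helper below is a small sketch of that step; `top_matches` is a hypothetical name, and it assumes the store exposes LangChain's standard `similarity_search_with_score` method, which returns `(Document, score)` pairs.

```python
def top_matches(vector_store, query: str, k: int = 2) -> list[tuple[str, float]]:
    """Return the k closest texts and their similarity scores for a query."""
    results = vector_store.similarity_search_with_score(query, k=k)
    # Each result is a (Document, score) pair; keep only the text and the score
    return [(doc.page_content, score) for doc, score in results]
```

For example, `top_matches(vector_store, "First", k=1)` would return the single closest stored text with its score.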

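Vector search requires an Atlas Vector Search index on the `embedding` field. Create it first, for example in the MongoDB Atlas UI, with a dimension count that matches the embedding model (1536 for the default `text-embedding-3-small` output), then pass the same name as `index_name` in LangChain.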
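
The index can also be created from code instead of the UI. The following is a minimal sketch, assuming PyMongo 4.7 or later and an Atlas deployment that supports vector search. The helper names `build_vector_index_definition` and `create_vector_index` are hypothetical, and the defaults (1536 dimensions, cosine similarity) are assumptions chosen to match `text-embedding-3-small`.

```python
from typing import Any


def build_vector_index_definition(
    path: str = "embedding",
    num_dimensions: int = 1536,
    similarity: str = "cosine",
) -> dict[str, Any]:
    """Build an Atlas Vector Search index definition for one vector field."""
    if similarity not in {"cosine", "euclidean", "dotProduct"}:
        raise ValueError(f"Unsupported similarity function: {similarity}")
    return {
        "fields": [
            {
                "type": "vector",
                "path": path,  # must match embedding_key in LangChain
                "numDimensions": num_dimensions,  # must match the embedding model
                "similarity": similarity,
            }
        ]
    }


def create_vector_index(collection, name: str = "vector_index") -> str:
    """Create the index on the given collection and return its name."""
    # Imported lazily so the definition helper works without pymongo installed
    from pymongo.operations import SearchIndexModel

    model = SearchIndexModel(
        definition=build_vector_index_definition(),
        name=name,  # must match index_name in LangChain
        type="vectorSearch",
    )
    return collection.create_search_index(model)
```

Calling `create_vector_index(collection)` with the collection from the example above would request an index named `vector_index`; Atlas builds it asynchronously, so it may take a short while before queries can use it.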
