# MongoDB Manager Usage Examples

This notebook demonstrates how to use the `MongoDB` manager class to create, manage, and interact with a MongoDB database in a Docker container.

## Setup

### Install Required Packages

In [1]:
!pip install py-dockerdb pymongo






### Import Dependencies

In [None]:
import uuid
from pathlib import Path
from bson import ObjectId
from docker_db.mongo_db import MongoDBConfig, MongoDB

## Creating a MongoDB Instance

Let's create a temporary directory for our database files:

In [None]:
import tempfile

# Create a temporary working directory for the database files
temp_dir = Path(tempfile.mkdtemp())
print(f"Created temporary directory: {temp_dir}")

# Optional init script that the container runs on first start
init_script_path = Path("configs", "mongodb", "init.js")
init_script_path.exists()

Created temporary directory: C:\Users\acisse\AppData\Local\Temp\tmp6_jnqria


Now, let's set up the MongoDB configuration:

In [None]:
# Generate a unique container name
container_name = f"demo-mongodb-{uuid.uuid4().hex[:8]}"

# Create a configuration for our database
config = MongoDBConfig(
    user="demouser",
    password="demopass",
    database="demodb",
    root_username="admin",
    root_password="adminpass",
    project_name="demo",
    workdir=temp_dir,
    container_name=container_name,
    retries=20,
    delay=3,
    init_script=init_script_path,
    env_vars={"YourEnvVar": "TestEnvironment"},
)

# Initialize the database manager
db_manager = MongoDB(config)
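
The `retries` and `delay` settings suggest that the manager polls the container until MongoDB accepts connections. The library's actual logic is not shown in this notebook, so the following is only a minimal sketch of such a polling loop. `wait_until_ready` and `fake_check` are hypothetical names for illustration and are not part of `py-dockerdb`.

```python
import time
from typing import Callable


def wait_until_ready(check: Callable[[], bool], retries: int = 20, delay: float = 3.0) -> int:
    """Call `check` up to `retries` times, sleeping `delay` seconds after each failure.

    Returns the 1-based attempt number that succeeded and raises TimeoutError
    if every attempt fails. Hypothetical helper, for illustration only.
    """
    for attempt in range(1, retries + 1):
        if check():
            return attempt
        if attempt < retries:
            time.sleep(delay)
    raise TimeoutError(f"service not ready after {retries} attempts")


# A fake readiness check that succeeds on the third call
calls = {"n": 0}


def fake_check() -> bool:
    calls["n"] += 1
    return calls["n"] >= 3


attempt = wait_until_ready(fake_check, retries=5, delay=0.01)
print(f"Ready after {attempt} attempts")
```

With `retries=20` and `delay=3`, a loop like this would wait roughly one minute at most before giving up.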

## Start the Database

We'll now create and start the database:

In [5]:
# Create and start the database
db_manager.create_db()
print(f"Database started successfully in container '{container_name}'")
print("Connection details:")
print(f"  Host: {config.host}")
print(f"  Port: {config.port}")
print(f"  User: {config.user}")
print(f"  Database: {config.database}")

Ensuring database 'demodb' and user 'demouser' exist...
Created user 'demouser' with access to database 'demodb'
Database started successfully in container 'demo-mongodb-f9bbb039'
Connection details:
  Host: localhost
  Port: 27017
  User: demouser
  Database: demodb
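
If you want to connect from another tool, the values printed above can be combined into a standard MongoDB connection string. The sketch below assumes the usual `mongodb://user:password@host:port/database` form; `build_mongo_uri` is a hypothetical helper, not part of `py-dockerdb`, and depending on how the user was created you may also need an `authSource` option.

```python
from urllib.parse import quote_plus


def build_mongo_uri(user: str, password: str, host: str, port: int, database: str) -> str:
    """Build a MongoDB connection string, percent-encoding the credentials."""
    return f"mongodb://{quote_plus(user)}:{quote_plus(password)}@{host}:{port}/{database}"


uri = build_mongo_uri("demouser", "demopass", "localhost", 27017, "demodb")
print(uri)
```

Percent-encoding matters as soon as a password contains characters such as `@` or `/`, which would otherwise break the URI.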


## Connect and Create Collections

Now that our database is running, let's connect to it and create some collections:

In [6]:
# Connect to the database
client = db_manager.connection
db = client[config.database]

print("Connected to MongoDB successfully.")
print(f"Available collections: {db.list_collection_names()}")

Connected to MongoDB successfully.
Available collections: []


### Creating Collections and Inserting Data

In [7]:
# Create a users collection and insert some data
users_collection = db["users"]

# Sample user data
users_data = [
    {"_id": ObjectId("64a7b1c2d3e4f5a6b7c8d9e0"), "username": "alice", "email": "alice@example.com", "age": 28},
    {"_id": ObjectId("64a7b1c2d3e4f5a6b7c8d9e1"), "username": "bob", "email": "bob@example.com", "age": 32},
    {"_id": ObjectId("64a7b1c2d3e4f5a6b7c8d9e2"), "username": "charlie", "email": "charlie@example.com", "age": 25},
]

# Insert users
result = users_collection.insert_many(users_data)
print(f"Inserted users with IDs: {[str(inserted_id) for inserted_id in result.inserted_ids]}")
print("Users collection created successfully.")
print(f"Available collections: {db.list_collection_names()}")

Inserted users with IDs: ['64a7b1c2d3e4f5a6b7c8d9e0', '64a7b1c2d3e4f5a6b7c8d9e1', '64a7b1c2d3e4f5a6b7c8d9e2']
Users collection created successfully.
Available collections: ['users']


In [8]:
# Create a posts collection and insert some data
posts_collection = db["posts"]

# Sample post data
posts_data = [
    {
        "_id": ObjectId("64a7b1c2d3e4f5a6b7c8d9e3"),
        "user_id": ObjectId("64a7b1c2d3e4f5a6b7c8d9e0"),
        "title": "Alice's First Post",
        "content": "Hello world from Alice!",
        "created_at": "2025-05-20T09:30:00.000Z"
    },
    {
        "_id": ObjectId("64a7b1c2d3e4f5a6b7c8d9e4"),
        "user_id": ObjectId("64a7b1c2d3e4f5a6b7c8d9e0"),
        "title": "Alice's Second Post",
        "content": "Another post from Alice",
        "created_at": "2025-05-20T10:15:00.000Z"
    },
    {
        "_id": ObjectId("64a7b1c2d3e4f5a6b7c8d9e5"),
        "user_id": ObjectId("64a7b1c2d3e4f5a6b7c8d9e1"),
        "title": "Bob's Introduction",
        "content": "Hi, this is Bob!",
        "created_at": "2025-05-20T11:00:00.000Z"
    },
    {
        "_id": ObjectId("64a7b1c2d3e4f5a6b7c8d9e6"),
        "user_id": ObjectId("64a7b1c2d3e4f5a6b7c8d9e2"),
        "title": "Charlie's Notes",
        "content": "Some notes from Charlie",
        "created_at": "2025-05-20T12:45:00.000Z"
    }
]

# Insert posts
result = posts_collection.insert_many(posts_data)
print(f"Inserted posts with IDs: {[str(inserted_id) for inserted_id in result.inserted_ids]}")
print("Posts collection created successfully.")
print(f"Available collections: {db.list_collection_names()}")

Inserted posts with IDs: ['64a7b1c2d3e4f5a6b7c8d9e3', '64a7b1c2d3e4f5a6b7c8d9e4', '64a7b1c2d3e4f5a6b7c8d9e5', '64a7b1c2d3e4f5a6b7c8d9e6']
Posts collection created successfully.
Available collections: ['posts', 'users']

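
The posts above store `created_at` as ISO 8601 strings. If you later need date range queries or date arithmetic, one option is to convert these strings to timezone-aware `datetime` objects before inserting, which PyMongo stores as BSON dates. A minimal sketch, where `parse_created_at` is a hypothetical helper:

```python
from datetime import datetime, timezone


def parse_created_at(value: str) -> datetime:
    """Parse a timestamp such as '2025-05-20T09:30:00.000Z' into an aware UTC datetime."""
    return datetime.strptime(value, "%Y-%m-%dT%H:%M:%S.%fZ").replace(tzinfo=timezone.utc)


created = parse_created_at("2025-05-20T09:30:00.000Z")
print(created.isoformat())
```

The rest of this notebook keeps the string form, so the printed output below is unchanged.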

## Querying Data

Now let's perform some queries on our collections:

In [9]:
# Query all users
print("All users:")
print("-------------------------")
for user in users_collection.find():
    print(f"Username: {user['username']}, Email: {user['email']}, Age: {user['age']}")

# Find a specific user by username
print("\nFind user by username:")
print("-------------------------")
bob = users_collection.find_one({"username": "bob"})
print(f"Found user: {bob}")

# Find users within an age range
print("\nFind users by age range:")
print("-------------------------")
young_users = users_collection.find({"age": {"$lt": 30}})
for user in young_users:
    print(f"Username: {user['username']}, Email: {user['email']}, Age: {user['age']}")

All users:
-------------------------
Username: alice, Email: alice@example.com, Age: 28
Username: bob, Email: bob@example.com, Age: 32
Username: charlie, Email: charlie@example.com, Age: 25

Find user by username:
-------------------------
Found user: {'_id': ObjectId('64a7b1c2d3e4f5a6b7c8d9e1'), 'username': 'bob', 'email': 'bob@example.com', 'age': 32}

Find users by age range:
-------------------------
Username: alice, Email: alice@example.com, Age: 28
Username: charlie, Email: charlie@example.com, Age: 25


In [10]:
# Find posts by a specific user
alice_id = ObjectId("64a7b1c2d3e4f5a6b7c8d9e0")
print(f"Posts by Alice (user_id: {alice_id}):")
print("-------------------------")
alice_posts = posts_collection.find({"user_id": alice_id})
for post in alice_posts:
    print(f"Title: {post['title']}")
    print(f"Content: {post['content']}")
    print(f"Created at: {post['created_at']}\n")

# Find posts with a specific word in the title
print("Posts with 'notes' in the title:")
print("-------------------------")
notes_posts = posts_collection.find({"title": {"$regex": "notes", "$options": "i"}})
for post in notes_posts:
    print(f"Title: {post['title']}")
    print(f"Content: {post['content']}")
    print(f"Created at: {post['created_at']}")

Posts by Alice (user_id: 64a7b1c2d3e4f5a6b7c8d9e0):
-------------------------
Title: Alice's First Post
Content: Hello world from Alice!
Created at: 2025-05-20T09:30:00.000Z

Title: Alice's Second Post
Content: Another post from Alice
Created at: 2025-05-20T10:15:00.000Z

Posts with 'notes' in the title:
-------------------------
Title: Charlie's Notes
Content: Some notes from Charlie
Created at: 2025-05-20T12:45:00.000Z
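
The `$regex` filter with `"$options": "i"` performs a case-insensitive pattern match. In plain Python this corresponds roughly to `re.search` with `re.IGNORECASE`. The sketch below applies that to the sample titles; MongoDB uses a PCRE-style engine, so the two can differ on advanced patterns.

```python
import re

titles = [
    "Alice's First Post",
    "Alice's Second Post",
    "Bob's Introduction",
    "Charlie's Notes",
]

# Case-insensitive search anywhere in the title, like {"$regex": "notes", "$options": "i"}
pattern = re.compile("notes", re.IGNORECASE)
matching = [title for title in titles if pattern.search(title)]
print(matching)
```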


## Advanced MongoDB Operations

Let's look at a more advanced operation: an aggregation pipeline that joins each post to its author with `$lookup`, then groups and counts the posts per user:

In [11]:
# Aggregate to count posts by user
pipeline = [
    {
        "$lookup": {
            "from": "users",
            "localField": "user_id",
            "foreignField": "_id",
            "as": "user"
        }
    },
    {"$unwind": "$user"},
    {
        "$group": {
            "_id": "$user._id",
            "username": {"$first": "$user.username"},
            "post_count": {"$sum": 1}
        }
    },
    {"$sort": {"post_count": -1}}
]

print("Post count by user:")
print("-------------------------")
post_counts = posts_collection.aggregate(pipeline)
for result in post_counts:
    print(f"Username: {result['username']}, Post count: {result['post_count']}")

Post count by user:
-------------------------
Username: alice, Post count: 2
Username: bob, Post count: 1
Username: charlie, Post count: 1
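
To make the pipeline stages concrete, here is a rough pure-Python equivalent that runs on in-memory data. It is only an illustration of what `$lookup`, `$unwind`, `$group`, and `$sort` compute here, not how MongoDB executes them; the shortened ids (`u0`, `u1`, `u2`) are placeholders for the real `ObjectId` values.

```python
from collections import Counter

# _id -> username, with shortened placeholder ids
users = {"u0": "alice", "u1": "bob", "u2": "charlie"}
posts = [
    {"user_id": "u0"},
    {"user_id": "u0"},
    {"user_id": "u1"},
    {"user_id": "u2"},
]

# $lookup + $unwind: keep only posts whose user_id matches an existing user
joined = [post["user_id"] for post in posts if post["user_id"] in users]

# $group with {"$sum": 1}, then $sort by post_count descending
counts = Counter(joined)
post_counts = [
    {"username": users[user_id], "post_count": count}
    for user_id, count in sorted(counts.items(), key=lambda item: -item[1])
]
print(post_counts)
```

Note that MongoDB does not guarantee the relative order of users with the same `post_count`; add a second sort key if you need a stable order.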


In [12]:
# Create unique indexes: they speed up lookups and reject duplicate usernames or emails
print("Creating indexes...")
users_collection.create_index("username", unique=True)
users_collection.create_index("email", unique=True)
print("Indexes created successfully.")

# List available indexes
print("\nAvailable indexes on 'users' collection:")
print(users_collection.index_information().keys())

Creating indexes...
Indexes created successfully.

Available indexes on 'users' collection:
dict_keys(['_id_', 'username_1', 'email_1'])


## Working with MongoDB Schema Validation

MongoDB supports schema validation for documents. Let's create a new collection with validation rules:

In [13]:
# Create a new collection with schema validation
db.create_collection(
    "products",
    validator={
        "$jsonSchema": {
            "bsonType": "object",
            "required": ["name", "price", "category"],
            "properties": {
                "name": {
                    "bsonType": "string",
                    "description": "must be a string and is required"
                },
                "price": {
                    "bsonType": "number",
                    "minimum": 0,
                    "description": "must be a non-negative number and is required"
                },
                "category": {
                    "bsonType": "string",
                    "enum": ["Electronics", "Books", "Clothing", "Food"],
                    "description": "must be a string from the enum and is required"
                },
                "tags": {
                    "bsonType": "array",
                    "items": {
                        "bsonType": "string"
                    }
                }
            }
        }
    }
)

print("Products collection with schema validation created successfully.")

# Insert a valid document
products_collection = db["products"]
valid_product = {
    "name": "Laptop",
    "price": 999.99,
    "category": "Electronics",
    "tags": ["portable", "high-performance"]
}

result = products_collection.insert_one(valid_product)
print("Inserted valid product.")

# Try to insert an invalid document
from pymongo.errors import WriteError

try:
    invalid_product = {
        "name": "Invalid Product",
        "price": -10,  # Invalid: negative price
        "category": "Invalid"  # Invalid: not in enum
    }
    products_collection.insert_one(invalid_product)
except WriteError:
    # The server rejects documents that violate the $jsonSchema validator
    print("Document validation failed, as expected.")

print(f"Available collections: {db.list_collection_names()}")

Products collection with schema validation created successfully.
Inserted valid product.
Document validation failed, as expected.
Available collections: ['posts', 'users', 'products']
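
To see why the second document is rejected, the validator's rules can be restated as a small pure-Python check. This is only a sketch of the same rules for illustration; `validation_errors` is a hypothetical helper, and the real validation happens inside the MongoDB server.

```python
ALLOWED_CATEGORIES = {"Electronics", "Books", "Clothing", "Food"}


def validation_errors(doc: dict) -> list:
    """Return the rule violations for `doc`; an empty list means it would pass."""
    errors = []
    for field in ("name", "price", "category"):
        if field not in doc:
            errors.append(f"missing required field: {field}")
    if "name" in doc and not isinstance(doc["name"], str):
        errors.append("name must be a string")
    price = doc.get("price")
    # bool is a subclass of int in Python, so exclude it explicitly
    if "price" in doc and (
        isinstance(price, bool) or not isinstance(price, (int, float)) or price < 0
    ):
        errors.append("price must be a non-negative number")
    if "category" in doc and doc["category"] not in ALLOWED_CATEGORIES:
        errors.append("category must be one of the allowed values")
    tags = doc.get("tags")
    if "tags" in doc and not (isinstance(tags, list) and all(isinstance(t, str) for t in tags)):
        errors.append("tags must be an array of strings")
    return errors


valid = {"name": "Laptop", "price": 999.99, "category": "Electronics", "tags": ["portable"]}
invalid = {"name": "Invalid Product", "price": -10, "category": "Invalid"}
print(validation_errors(valid))
print(validation_errors(invalid))
```

The invalid product breaks two rules at once: the negative price and the category outside the enum.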


## Clean Up

When you're done with the database, you can delete it:

In [14]:
# Close the connection
print("Closing connection...")
client.close()

# Delete the database container
db_manager.delete_db()
print(f"Database container '{container_name}' deleted")

# Clean up the temporary directory
import shutil
shutil.rmtree(temp_dir)
print(f"Temporary directory '{temp_dir}' removed")

Closing connection...


Database container 'demo-mongodb-f9bbb039' deleted
Temporary directory 'C:\Users\acisse\AppData\Local\Temp\tmp6_jnqria' removed
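
If a cell fails partway through the notebook, the container can be left running. One way to guard against that is to wrap `create_db()` and `delete_db()` in a context manager so cleanup always happens. The sketch below uses a stand-in class so it runs without Docker; `managed_database` and `FakeManager` are hypothetical names, not part of `py-dockerdb`.

```python
from contextlib import contextmanager


@contextmanager
def managed_database(manager):
    """Start the database on entry and always delete it on exit."""
    manager.create_db()
    try:
        yield manager
    finally:
        manager.delete_db()


class FakeManager:
    """Stand-in for the real manager so the sketch runs without Docker."""

    def __init__(self):
        self.events = []

    def create_db(self):
        self.events.append("created")

    def delete_db(self):
        self.events.append("deleted")


fake = FakeManager()
try:
    with managed_database(fake):
        raise RuntimeError("simulated failure inside the notebook")
except RuntimeError:
    pass
print(fake.events)
```

With the real manager, the same pattern would read `with managed_database(MongoDB(config)) as db_manager:`, assuming `create_db()` and `delete_db()` behave as they do in the cells above.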


## Conclusion

This notebook demonstrated how to:

1. Configure and create a MongoDB database with `MongoDB`
2. Create collections and insert documents
3. Perform various queries and aggregations
4. Work with indexes and schema validation
5. Clean up the database when finished

The `MongoDB` manager provides a convenient way to spin up MongoDB instances in Docker containers for development, testing, or demonstration purposes.