<a href="https://colab.research.google.com/github/sreent/data-management-intro/blob/main/MongoDB%3A%20Selection%2C%20Projection%20and%20Sorting.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# 1 Setting Up MongoDB Environment

In [None]:
# Install MongoDB's dependencies
!sudo wget http://archive.ubuntu.com/ubuntu/pool/main/o/openssl/libssl1.1_1.1.1f-1ubuntu2_amd64.deb
!sudo dpkg -i libssl1.1_1.1.1f-1ubuntu2_amd64.deb

# Import the public key used by the package management system
!wget -qO - https://www.mongodb.org/static/pgp/server-4.4.asc | apt-key add -

# Create a list file for MongoDB
!echo "deb [ arch=amd64,arm64 ] http://repo.mongodb.org/apt/ubuntu bionic/mongodb-org/4.4 multiverse" | tee /etc/apt/sources.list.d/mongodb-org-4.4.list

# Reload the local package database
!apt-get update > /dev/null

# Install the MongoDB packages
!apt-get install -y mongodb-org > /dev/null

# Install pymongo
!pip install -q pymongo

# Create Data Folder
!mkdir -p /data/db

# Start MongoDB
!mongod --fork --logpath /var/log/mongodb.log --dbpath /data/db

In [None]:
from pymongo import MongoClient

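# Establish connection to MongoDB
# MongoClient connects lazily, so send a ping to confirm the server is reachable
try:
    client = MongoClient('localhost', 27017, serverSelectionTimeoutMS=5000)
    client.admin.command("ping")
    print("Connected to MongoDB")
except Exception as e:
    print("Error connecting to MongoDB: ", e)
    raise  # stop the cell here; the code below needs a working connection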

# List databases to check the connection
try:
    databases = client.list_database_names()
    print("Databases:", databases)
except Exception as e:
    print("Error listing databases: ", e)

# Retrieve server status
try:
    server_status = client.admin.command("serverStatus")
    print("Server Status:", server_status)
except Exception as e:
    print("Error retrieving server status: ", e)

# Perform basic database operations (Create, Read)
try:
    db = client.test_db
    collection = db.test_collection
    # Insert a document
    insert_result = collection.insert_one({"name": "test", "value": 123})
    print("Insert operation result:", insert_result.inserted_id)
    # Read a document
    read_result = collection.find_one({"name": "test"})
    print("Read operation result:", read_result)
except Exception as e:
    print("Error performing database operations: ", e)

# 2 Preparations

Databases and collections in MongoDB are created implicitly when data is first inserted. In this tutorial, you will create a collection of *books*. No collection exists yet, so create one by inserting a few documents.

In [None]:
query = """
db.collection.insertMany([
    {
        "ISBN": "978-0321751041",
        "title": "The Art of Computer Programming",
        "author": "Donald E. Knuth",
        "publisher": "Addison Wesley",
        "yearPublished": 1968,
        "price": 200
    },
    {
        "ISBN": "978-0201633610",
        "title": "Design Patterns: Elements of Reusable Object-Oriented Software",
        "author": "Erich Gamma et al.",
        "publisher": "Addison Wesley",
        "yearPublished": 1994,
        "price": 45
    },
    {
        "ISBN": "978-0321573513",
        "title": "Effective Java",
        "author": "Joshua Bloch",
        "publisher": "Addison Wesley",
        "yearPublished": 2008,
        "price": 50
    },
    {
        "ISBN": "978-0132350884",
        "title": "Clean Code: A Handbook of Agile Software Craftsmanship",
        "author": "Robert C. Martin",
        "publisher": "Addison Wesley",
        "yearPublished": 2008,
        "price": 40
    },
    {
        "ISBN": "978-0321127426",
        "title": "Refactoring: Improving the Design of Existing Code",
        "author": "Martin Fowler",
        "publisher": "Pearson",
        "yearPublished": 1999,
        "price": 55
    }
])
"""

!mongo --quiet --eval '{query}'

You can list the contents of the newly created collection by calling the `find()` method.

In [None]:
query = """db.collection.find()"""

!mongo --quiet --eval '{query}'
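
For readers who prefer to stay in Python, the same listing could be done through pymongo instead of the `mongo` shell. The sketch below rests on a few assumptions: `list_documents` is a hypothetical helper, not part of the original notebook, and the usage comment assumes the shell commands above ran against the shell's default `test` database, where `db.collection` refers to a collection literally named `collection`.

```python
def list_documents(collection):
    """Return every document in a collection as a list of dicts.

    `collection` is expected to be a pymongo Collection (or any object with a
    compatible find() method). The helper name is hypothetical.
    """
    # An empty filter matches all documents, like db.collection.find() in the shell.
    return list(collection.find({}))


# Hypothetical usage with the `client` created in section 1
# (assumes the shell's default database name, "test"):
# for doc in list_documents(client["test"]["collection"]):
#     print(doc)
```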

# 3 Querying

Find the books published by "Addison Wesley", output only their ISBN, title, and price, and sort the results by title in descending order.

In [None]:
query = """
db.collection.aggregate([
    {
        $match: {
            "publisher": "Addison Wesley"
        }
    },
    {
        $project: {
            _id: false,
            ISBN: true,
            title: true,
            price: true
        }
    },
    {
        $sort: {
            title: -1
        }
    }
])
"""

!mongo --quiet --eval '{query}'
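
The same pipeline can also be written as plain Python data and passed to pymongo's `Collection.aggregate`. This is a sketch rather than part of the original notebook: `build_pipeline` is a hypothetical helper, and the `test` database name in the usage comment assumes the shell's default. Unlike in the shell, the stage operators must be quoted strings in Python.

```python
def build_pipeline(publisher):
    """Build the match/project/sort pipeline as a list of stage dicts."""
    return [
        # Selection: keep only documents from the given publisher
        {"$match": {"publisher": publisher}},
        # Projection: keep ISBN, title and price; drop the _id field
        {"$project": {"_id": False, "ISBN": True, "title": True, "price": True}},
        # Sorting: -1 means descending, 1 means ascending
        {"$sort": {"title": -1}},
    ]


pipeline = build_pipeline("Addison Wesley")

# Hypothetical usage with the `client` created in section 1
# (assumes the shell's default database name, "test"):
# for doc in client["test"]["collection"].aggregate(pipeline):
#     print(doc)
```

Keeping the pipeline in a function makes it easy to reuse the same query for a different publisher without editing the stage definitions.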

# 4 Interpretation

- The `$match` stage filters the documents to include only those where the `publisher` is "Addison Wesley".
- The `$project` stage reshapes each document to include only the `ISBN`, `title`, and `price` fields, excluding the `_id` field.
- The `$sort` stage sorts the resulting documents by the `title` field in descending order (from Z to A).
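
To make the three stages concrete, the sketch below imitates them in plain Python on a trimmed copy of the sample documents. It only illustrates the semantics described above and is not how MongoDB executes a pipeline; the helper names `match_stage`, `project_stage`, and `sort_stage` are hypothetical.

```python
# Trimmed copies of the sample documents (author and yearPublished omitted)
books = [
    {"ISBN": "978-0321751041", "title": "The Art of Computer Programming",
     "publisher": "Addison Wesley", "price": 200},
    {"ISBN": "978-0201633610",
     "title": "Design Patterns: Elements of Reusable Object-Oriented Software",
     "publisher": "Addison Wesley", "price": 45},
    {"ISBN": "978-0321573513", "title": "Effective Java",
     "publisher": "Addison Wesley", "price": 50},
    {"ISBN": "978-0132350884",
     "title": "Clean Code: A Handbook of Agile Software Craftsmanship",
     "publisher": "Addison Wesley", "price": 40},
    {"ISBN": "978-0321127426",
     "title": "Refactoring: Improving the Design of Existing Code",
     "publisher": "Pearson", "price": 55},
]


def match_stage(docs, criteria):
    """Keep documents whose fields equal every value in `criteria`."""
    return [d for d in docs if all(d.get(k) == v for k, v in criteria.items())]


def project_stage(docs, fields):
    """Keep only the listed fields of each document."""
    return [{k: d[k] for k in fields if k in d} for d in docs]


def sort_stage(docs, key, direction):
    """Sort by `key`; direction -1 is descending, 1 is ascending."""
    return sorted(docs, key=lambda d: d[key], reverse=(direction == -1))


# Apply the stages in the same order as the pipeline: match, project, sort
result = sort_stage(
    project_stage(
        match_stage(books, {"publisher": "Addison Wesley"}),
        ["ISBN", "title", "price"],
    ),
    "title",
    -1,
)

for doc in result:
    print(doc)
```

Each stage takes the output of the previous one as its input, which is why the Pearson title never reaches the projection or sorting steps.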