<a href="https://colab.research.google.com/github/sreent/data-management-intro/blob/main/MongoDB%20Hand-On%20Lab%20-%20Solutions.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# 1 Setting Up MongoDB Environment

In [None]:
# Install MongoDB's dependencies
!sudo wget http://archive.ubuntu.com/ubuntu/pool/main/o/openssl/libssl1.1_1.1.1f-1ubuntu2_amd64.deb
!sudo dpkg -i libssl1.1_1.1.1f-1ubuntu2_amd64.deb

# Import the public key used by the package management system
!wget -qO - https://www.mongodb.org/static/pgp/server-4.4.asc | apt-key add -

# Create a list file for MongoDB
!echo "deb [ arch=amd64,arm64 ] http://repo.mongodb.org/apt/ubuntu bionic/mongodb-org/4.4 multiverse" | tee /etc/apt/sources.list.d/mongodb-org-4.4.list

# Reload the local package database
!apt-get update > /dev/null

# Install the MongoDB packages
!apt-get install -y mongodb-org > /dev/null

# Install pymongo
!pip install -q pymongo

# Create Data Folder
!mkdir -p /data/db

# Start MongoDB
!mongod --fork --logpath /var/log/mongodb.log --dbpath /data/db

In [None]:
from pymongo import MongoClient

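# Establish connection to MongoDB. MongoClient connects lazily, so a ping
# is sent to confirm that the server is actually reachable.
try:
    client = MongoClient('localhost', 27017, serverSelectionTimeoutMS=5000)
    client.admin.command("ping")
    print("Connected to MongoDB")
except Exception as e:
    print("Error connecting to MongoDB: ", e)
    raise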

# List databases to check the connection
try:
    databases = client.list_database_names()
    print("Databases:", databases)
except Exception as e:
    print("Error listing databases: ", e)

# Retrieve server status
try:
    server_status = client.admin.command("serverStatus")
    print("Server Status:", server_status)
except Exception as e:
    print("Error retrieving server status: ", e)

# Perform basic database operations (Create, Read)
try:
    db = client.test_db
    collection = db.test_collection
    # Insert a document
    insert_result = collection.insert_one({"name": "test", "value": 123})
    print("Insert operation result:", insert_result.inserted_id)
    # Read a document
    read_result = collection.find_one({"name": "test"})
    print("Read operation result:", read_result)
except Exception as e:
    print("Error performing database operations: ", e)

# 2 Preparations

Databases and collections in MongoDB are created implicitly when data is first inserted. In this tutorial, you will create a collection of *films*. No such collection exists yet, so create one by inserting a document.

In [None]:
query = """
db.films.insert({
    "title": "Star Trek Into Darkness",
    "year": 2013,
    "genre": [
        "Action",
        "Adventure",
        "Sci-Fi",
    ],
    "actors": [
        "Pine, Chris",
        "Quinto, Zachary",
        "Saldana, Zoe",
    ],
    "releases": [
        {
            "country": "USA",
            "date": ISODate("2013-05-17"),
            "prerelease": true
        },
        {
            "country": "Germany",
            "date": ISODate("2013-05-16"),
            "prerelease": false
        }
    ]
})"""

!mongo --quiet --eval '{query}'

Now, there is a *films* collection. You can list the contents of the newly created collection by calling the <code>find()</code> function.

In [None]:
query = """db.films.find()"""

In [None]:
!mongo --quiet --eval '{query}'

If you prefer your result nicely formatted, use <code>pretty()</code>:

In [None]:
query = """db.films.find().pretty()"""

In [None]:
!mongo --quiet --eval '{query}'

As you can see, there is now an <code>_id</code> field, which is unique for every document.

Now insert some more films:

In [None]:
query = """
db.films.insert({
    "title": "Iron Man 3",
    "year": 2013,
    "genre": [
        "Action",
        "Adventure",
        "Sci-Fi",
    ],
    "actors": [
        "Downey Jr., Robert",
        "Paltrow, Gwyneth",
    ]
})
""" # no releases

!mongo --quiet --eval '{query}'

In [None]:
query = """
db.films.insert({
    "title": "This Means War",
    "year": 2011,
    "genre": [
        "Action",
        "Comedy",
        "Romance",
    ],
    "actors": [
        "Pine, Chris",
        "Witherspoon, Reese",
        "Hardy, Tom",
    ],
    "releases": [
        {
            "country": "USA",
            "date": ISODate("2011-02-17"),
            "prerelease": false
        },
        {
            "country": "UK",
            "date": ISODate("2011-03-01"),
            "prerelease": true
        }
    ]
})
"""

!mongo --quiet --eval '{query}'

In [None]:
query = """
db.films.insert({
    "title": "The Amazing Spider-Man 2",
    "year": 2014,
    "genre": [
        "Action",
        "Adventure",
        "Fantasy",
    ],
    "actors": [
        "Stone, Emma",
        "Woodley, Shailene"
    ]
})
""" # also no releases

!mongo --quiet --eval '{query}'

# 3 Querying

Now query your collection! Have MongoDB return all films with title **"Iron Man 3"** by calling:

In [None]:
query = """
db.films.find({"title": "Iron Man 3"})
"""

!mongo --quiet --eval '{query}'

Using <code>findOne</code> instead of <code>find</code> produces at most one result (in pretty format):

In [None]:
query = """
db.films.findOne({"title": "Iron Man 3"})
"""

!mongo --quiet --eval '{query}'

Regular expressions can also be used to query a collection. In this tutorial, a short notation is used where the actual regular expression is delimited by slashes (/). The following call yields all movies whose title starts with the letter T:

In [None]:
query = """
db.films.find({"title": /^T/})
"""

!mongo --quiet --eval '{query}'

In [None]:
query = """
db.films.find({"title": {"$regex": "^T"}})
"""

!mongo --quiet --eval '{query}'
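
The anchor `^` is what restricts the match to the beginning of the title. As a rough illustration outside MongoDB, the same pattern can be tried with Python's `re` module. The `titles` list below is a hand-copied sample of the collection (an assumption for illustration), and Python's regex engine is assumed to behave like MongoDB's for a pattern this simple.

```python
import re

# Hand-copied sample of the titles inserted above (illustrative assumption).
titles = ["Star Trek Into Darkness", "Iron Man 3", "This Means War"]

# An unanchored pattern matches an uppercase "T" anywhere in the title ...
contains_t = [t for t in titles if re.search(r"T", t)]

# ... whereas ^T only matches titles that begin with "T".
starts_with_t = [t for t in titles if re.search(r"^T", t)]

print(contains_t)
print(starts_with_t)
```

Without the anchor, "Star Trek Into Darkness" would also be selected because of the "T" in "Trek".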

If you are only interested in certain attributes, you can use projection to thin out the produced result. While the selection criteria are given by the first argument of find, the projection is given by the second argument. An example:

In [None]:
query = """
db.films.find({"title": /^T/}, {"title": 1})
"""

!mongo --quiet --eval '{query}'

In [None]:
query = """
db.films.find({"title": {"$regex": "^T"}}, {"title": 1})
"""

!mongo --quiet --eval '{query}'

By default, the <code>_id</code> field is part of the output, so you have to suppress it explicitly if you don't want MongoDB to return it:

In [None]:
query = """
db.films.find({"title": /^T/}, {"_id": 0, "title": 1})
"""

!mongo --quiet --eval '{query}'

In [None]:
query = """
db.films.find({"title": {"$regex": "^T"}}, {"_id": 0, "title": 1})
"""

!mongo --quiet --eval '{query}'
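
Conceptually, a projection just filters the keys of each matching document, with `_id` treated as a special case. The following is a plain-Python sketch of that idea, not MongoDB's implementation; the helper `project` and the sample `_id` value are hypothetical.

```python
def project(doc, fields, include_id=True):
    """Keep only the listed fields; keep _id unless it is suppressed."""
    kept = {k: v for k, v in doc.items() if k in fields and k != "_id"}
    if include_id and "_id" in doc:
        # _id is returned by default, mirroring {"title": 1}
        kept = {"_id": doc["_id"], **kept}
    return kept


# Sample document with a made-up _id (illustrative assumption).
doc = {"_id": "hypothetical-id", "title": "This Means War", "year": 2011}

with_id = project(doc, ["title"])                       # like {"title": 1}
without_id = project(doc, ["title"], include_id=False)  # like {"_id": 0, "title": 1}

print(with_id)
print(without_id)
```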

You can also use conditional operators, for example to perform range queries. The following returns the titles of all films starting with the letter T where the year attribute is greater than 2009 and less than or equal to 2011:

In [None]:
query = """
db.films.find({
    "year": {
        $gt: 2009,
        $lte: 2011
    },
    "title": /^T/
},
{
    "_id": 0,
    "title": 1
})
"""

!mongo --quiet --eval '{query}'

In [None]:
query = """
db.films.find({
    "year": {
        $gt: 2009,
        $lte: 2011
    },
    "title": {"$regex": "^T"}
},
{
    "_id": 0,
    "title": 1
})
"""

!mongo --quiet --eval '{query}'
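
Several conditions in one query document are combined with a logical AND, and `$gt`/`$lte` give an interval that is open at the lower bound and closed at the upper bound. A minimal plain-Python sketch of that selection logic follows; the film "Test Film" is hypothetical and only added to show the exclusive lower bound.

```python
import re

films = [
    {"title": "Star Trek Into Darkness", "year": 2013},
    {"title": "Iron Man 3", "year": 2013},
    {"title": "This Means War", "year": 2011},
    {"title": "Test Film", "year": 2009},  # hypothetical, not in the collection
]


def matches(film):
    # $gt: 2009 excludes 2009 itself, $lte: 2011 includes 2011
    in_range = 2009 < film["year"] <= 2011
    starts_with_t = re.search(r"^T", film["title"]) is not None
    # Both conditions must hold (implicit AND)
    return in_range and starts_with_t


selected = [f["title"] for f in films if matches(f)]
print(selected)
```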

For a logical disjunction of the selection criteria, use the <code>$or</code> operator:

In [None]:
query = """
### INSERT YOUR CODE HERE ###
"""

!mongo --quiet --eval '{query}'

In [None]:
query = """
### INSERT YOUR REGEX BASED CODE HERE ###
"""

!mongo --quiet --eval '{query}'

There are also some options that can be appended to the regular expression, e.g. <code>i</code> for case insensitivity. The following call returns the titles of all movies whose title contains a lowercase t, ...

In [None]:
query = """
### INSERT YOUR CODE HERE ###
"""

!mongo --quiet --eval '{query}'

In [None]:
query = """
### INSERT YOUR REGEX BASED CODE HERE ###
"""

!mongo --quiet --eval '{query}'

... whereas the following call also returns titles that contain a T (uppercase):

In [None]:
query = """
### INSERT YOUR CODE HERE ###
"""

!mongo --quiet --eval '{query}'

In [None]:
query = """
### INSERT YOUR REGEX BASED CODE HERE ###
"""

!mongo --quiet --eval '{query}'
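
The effect of the `i` option can be previewed in Python, where `re.IGNORECASE` plays the same role. This is only a sketch on a hand-copied sample of the titles, assuming both regex engines agree for a single-letter pattern.

```python
import re

# Hand-copied sample of the titles inserted above (illustrative assumption).
titles = ["Star Trek Into Darkness", "Iron Man 3", "This Means War"]

# Case-sensitive: only a lowercase "t" counts.
lowercase_only = [t for t in titles if re.search(r"t", t)]

# Case-insensitive: "t" and "T" both count.
any_case = [t for t in titles if re.search(r"t", t, re.IGNORECASE)]

print(lowercase_only)
print(any_case)
```

"This Means War" contains no lowercase t, so it only appears once case is ignored.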

You can query for exact matches in lists, ...

In [None]:
query = """
db.films.find({"genre": "Adventure"}, {"_id": 0, "title": 1, "genre": 1})
"""

!mongo --quiet --eval '{query}'

... but you can also query for partial matches, e.g. to find all films that have at least one genre starting with the letter A:

In [None]:
query = """
### INSERT YOUR CODE HERE ###
"""

!mongo --quiet --eval '{query}'

In [None]:
query = """
### INSERT YOUR REGEX BASED CODE HERE ###
"""

!mongo --quiet --eval '{query}'
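
When a condition is applied to an array field such as `genre`, the document matches as soon as any single element satisfies the condition. The sketch below mimics this rule in plain Python; `field_matches` is a hypothetical helper, and the sample data is copied by hand from the inserts above.

```python
import re


def field_matches(value, predicate):
    """An array matches if any element matches; a scalar is tested directly."""
    if isinstance(value, list):
        return any(predicate(v) for v in value)
    return predicate(value)


films = [
    {"title": "Iron Man 3", "genre": ["Action", "Adventure", "Sci-Fi"]},
    {"title": "This Means War", "genre": ["Action", "Comedy", "Romance"]},
]

# Exact match on one array element
exact = [f["title"] for f in films
         if field_matches(f["genre"], lambda g: g == "Adventure")]

# Partial match: some element starts with "A"
partial = [f["title"] for f in films
           if field_matches(f["genre"], lambda g: re.search(r"^A", g) is not None)]

print(exact)
print(partial)
```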

There are also more complex operators for more complex selection criteria, e.g. the <code>$all</code> operator. The following call prints the title and actors of every movie for which each of two given regular expressions matches at least one of its actors:

In [None]:
query = """
### INSERT YOUR CODE HERE ###
"""

!mongo --quiet --eval '{query}'

In [None]:
query = """
### INSERT YOUR REGEX BASED CODE HERE ###
"""

!mongo --quiet --eval '{query}'
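
The semantics of `$all` with regular expressions can be read as "for every pattern, at least one array element matches it". A small plain-Python sketch under that reading follows; the helper name and the two example patterns are hypothetical and are not the ones intended by the exercise.

```python
import re


def all_patterns_match(values, patterns):
    """True if every pattern matches at least one of the values."""
    return all(any(re.search(p, v) for v in values) for p in patterns)


actors = ["Pine, Chris", "Witherspoon, Reese", "Hardy, Tom"]

both_present = all_patterns_match(actors, [r"^Pine", r"^Hardy"])
one_missing = all_patterns_match(actors, [r"^Pine", r"^Stone"])

print(both_present)
print(one_missing)
```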

In contrast, the <code>$nin</code> operator checks for the lack of matching values, i.e. it selects movies in which no actor name matches any of the given regular expressions:

In [None]:
query = """
### INSERT YOUR CODE HERE ###
"""

!mongo --quiet --eval '{query}'

In [None]:
query = """
### INSERT YOUR REGEX BASED CODE HERE ###
"""

!mongo --quiet --eval '{query}'
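
For an array field, `$nin` can be read as "no element matches any of the listed values or patterns"; documents that lack the field also qualify. The following plain-Python sketch illustrates that reading with a hypothetical helper and hypothetical example patterns.

```python
import re


def none_match(values, patterns):
    """True if no value is matched by any of the patterns."""
    return not any(re.search(p, v) for v in values for p in patterns)


films = [
    {"title": "Star Trek Into Darkness",
     "actors": ["Pine, Chris", "Quinto, Zachary", "Saldana, Zoe"]},
    {"title": "Iron Man 3",
     "actors": ["Downey Jr., Robert", "Paltrow, Gwyneth"]},
]

patterns = [r"^Pine", r"^Stone"]  # example patterns (assumption)

# A missing "actors" field is treated as an empty list, so it also qualifies.
selected = [f["title"] for f in films
            if none_match(f.get("actors", []), patterns)]
print(selected)
```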

The <code>$exists</code> operator can be used to check for the existence of an attribute, e.g. to select only movies with undefined releases:

In [None]:
query = """
### INSERT YOUR CODE HERE ###
"""

!mongo --quiet --eval '{query}'
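
Note that `$exists` tests for the presence of the field, not for its value: a field that is present but set to null still exists. The plain-Python sketch below shows the distinction; the film "Null Releases" is hypothetical and only serves to illustrate this edge case.

```python
films = [
    {"title": "Star Trek Into Darkness", "releases": [{"country": "USA"}]},
    {"title": "Iron Man 3"},                        # no releases field at all
    {"title": "Null Releases", "releases": None},   # hypothetical edge case
]

# Analogue of {"releases": {"$exists": false}}: the key must be absent.
field_absent = [f["title"] for f in films if "releases" not in f]

# A "no value" check is broader: it also catches the null field.
no_value = [f["title"] for f in films if f.get("releases") is None]

print(field_absent)
print(no_value)
```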

In MongoDB, it is also possible to query nested data, i.e. subdocuments. The following returns the title and releases of every movie that is known to have been released in the UK:

In [None]:
query = """
### INSERT YOUR CODE HERE ###
"""

!mongo --quiet --eval '{query}'

Please note that you have to use quotes to address nested fields.

Applying more complex selection criteria on a nested document, however, is a little tricky. For example, if you wanted MongoDB to return all movies that had their prerelease in the USA, you might try something like this:

In [None]:
query = """
db.films.find({
    "releases.country": "USA",
    "releases.prerelease": true
},
{
    "_id": 0,
    "title": 1,
    "releases": 1
})
"""

!mongo --quiet --eval '{query}'

However, This Means War is also returned, although its prerelease was in the UK. The call above actually returns all movies that have some release in the USA and some release that is a prerelease, not necessarily the same one. To select only movies where both conditions apply to the same release, the <code>$elemMatch</code> operator can be used:

In [None]:
query = """
### INSERT YOUR CODE HERE ###
"""

!mongo --quiet --eval '{query}'
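
The difference between dotted-path conditions and `$elemMatch` comes down to whether the conditions are checked per array or per array element. The plain-Python sketch below mimics both behaviours on a hand-copied, simplified sample of the collection; the function names are hypothetical and this is not how MongoDB evaluates queries internally.

```python
films = [
    {"title": "Star Trek Into Darkness", "releases": [
        {"country": "USA", "prerelease": True},
        {"country": "Germany", "prerelease": False},
    ]},
    {"title": "This Means War", "releases": [
        {"country": "USA", "prerelease": False},
        {"country": "UK", "prerelease": True},
    ]},
    {"title": "Iron Man 3"},  # no releases
]


def dotted_match(film):
    """Each condition may be satisfied by a different release."""
    releases = film.get("releases", [])
    return (any(r["country"] == "USA" for r in releases)
            and any(r["prerelease"] for r in releases))


def elem_match(film):
    """Both conditions must hold for the same release."""
    return any(r["country"] == "USA" and r["prerelease"]
               for r in film.get("releases", []))


dotted = [f["title"] for f in films if dotted_match(f)]
per_element = [f["title"] for f in films if elem_match(f)]

print(dotted)
print(per_element)
```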

Naturally, there are many other operators not covered by this tutorial.

# 4 Update

You can also add or update fields in a document by using the <code>$set</code> operator. For example, you can add a rating field to one of the movies:

In [None]:
query = """
db.films.update(
    {"title": "Star Trek Into Darkness"},
    {$set: {"rating": 6.4}}
)
"""

!mongo --quiet --eval '{query}'

If you do not use an update operator such as <code>$set</code>, the matching document is replaced as a whole by the document you pass as the second argument (only its <code>_id</code> is kept), so be careful!
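
The difference between a `$set` update and a whole-document replacement can be pictured with plain Python dictionaries. This is only a sketch of the resulting documents; `apply_set` and `replace` are hypothetical helpers and the `_id` value is made up.

```python
# Simplified stand-in for the stored document (illustrative assumption).
doc = {"_id": 1, "title": "Star Trek Into Darkness", "year": 2013}


def apply_set(doc, changes):
    """Like {$set: changes}: existing fields are kept, listed ones are set."""
    return {**doc, **changes}


def replace(doc, replacement):
    """Like an update without operators: only _id survives."""
    return {"_id": doc["_id"], **replacement}


after_set = apply_set(doc, {"rating": 6.4})
after_replace = replace(doc, {"rating": 6.4})

print(after_set)
print(after_replace)
```

After the replacement, the title and year are gone, which is rarely what is intended.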

Now, verify that the <code>rating</code> field has been added to the document:

In [None]:
query = """
### INSERT YOUR CODE HERE ###
"""

!mongo --quiet --eval '{query}'

To increment a numeric value, you can use the <code>$inc</code> operator:

In [None]:
query = """
db.films.update(
    {"title": "Star Trek Into Darkness"},
    {$inc: {"rating": 0.1}}
)
"""

!mongo --quiet --eval '{query}'

Verify if the rating value has been incremented by <code>0.1</code>.

In [None]:
query = """
### INSERT YOUR CODE HERE ###
"""

!mongo --quiet --eval '{query}'

Again, there are many other operators for different purposes, e.g. `$unset`, `$pop`, `$push` or `$addToSet`. (`$pushAll` was removed in MongoDB 3.6; use `$push` with `$each` instead.)

# 5 Delete

You can remove documents with the <code>remove</code> function. It works almost like <code>find</code>, except that it takes no projection parameter. If, for example, you want to remove all film documents whose title starts with the letter T, you can first query for all such movies...

In [None]:
query = """
db.films.find({"title": /^T/})
"""

!mongo --quiet --eval '{query}'

... to verify that your selection criteria are correct, and then replace the <code>find</code> in your call with <code>remove</code>:

In [None]:
query = """
db.films.remove({"title": /^T/})
"""

!mongo --quiet --eval '{query}'

Now, verify that the documents have been removed from the collection:

In [None]:
query = """
### INSERT YOUR CODE HERE ###
"""

!mongo --quiet --eval '{query}'