Q1. What is MongoDB? Explain non-relational databases in short. In which scenarios is MongoDB preferred
over SQL databases?

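MongoDB is a popular NoSQL database that emerged in 2009. Non-relational (NoSQL) databases store data in models other than tables of rows and columns, such as documents, key-value pairs, wide columns, or graphs. Unlike traditional SQL databases, which rely on rigid schemas and tables, MongoDB takes a more flexible, document-based approach. Here are the key points: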
Schema-less structure: MongoDB stores data as JSON-like documents (BSON format), allowing for dynamic data management. You don’t need to define a fixed schema upfront; fields can vary across documents.
Horizontal scalability: MongoDB excels at scaling horizontally. You can add more servers (nodes) to handle increased load, making it suitable for large-scale applications.
Development simplicity: Its document-oriented model aligns closely with how data is used in applications, simplifying development and reducing time-to-deployment.
SQL Overview:
SQL (Structured Query Language) has been around since the 1970s. It’s the backbone of relational databases, where data is organized into rows and columns within tables. Key points:
Strict schema: SQL databases enforce a fixed schema. You define the structure (columns, data types) beforehand, ensuring data consistency.
Vertical scaling: SQL databases typically scale vertically by adding more resources (CPU, RAM) to a single server.
Data integrity: SQL databases prioritize data relationships (expressed through joins) and maintain ACID properties (Atomicity, Consistency, Isolation, Durability).
When to Choose MongoDB Over SQL:
Use Cases for MongoDB:
High-performance scenarios: MongoDB shines when dealing with large data volumes and high write loads. Examples include:
Internet of Things (IoT) applications: Collecting sensor data in real time.
Real-time analytics: Storing and querying event logs.
Content management systems: Handling dynamic content with varying structures.
Scenarios Favoring MongoDB:
Flexible data models: If your data doesn’t fit neatly into tables and rows, MongoDB’s document-oriented approach is advantageous.
Agile development: MongoDB allows you to iterate quickly during development without rigid schema constraints.
Horizontal scalability needs: When your application demands distributed scaling, MongoDB’s architecture fits the bill.
When to Stick with SQL:
Data integrity matters: If maintaining strict relationships, enforcing constraints, and ensuring ACID properties are critical, SQL databases (e.g., MySQL, PostgreSQL) are better suited.
Complex queries and joins: SQL excels at complex queries involving multiple tables.
Mature ecosystem: SQL databases have been battle-tested for decades and offer robust features.
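
To make the contrast between the two data models concrete, the sketch below represents the same information twice: once as normalized rows linked by a key, and once as a single embedded document. It uses plain Python structures only; the `customers` and `orders` names and all field values are hypothetical and chosen purely for illustration.

```python
# Relational style: two "tables" linked by a foreign key (hypothetical data)
customers = [{"customer_id": 1, "name": "Alice"}]
orders = [
    {"order_id": 100, "customer_id": 1, "item": "Laptop"},
    {"order_id": 101, "customer_id": 1, "item": "Tablet"},
]

def join_orders(customer_id):
    """Emulate a SQL join: combine a customer's rows from both tables."""
    customer = next(c for c in customers if c["customer_id"] == customer_id)
    items = [o["item"] for o in orders if o["customer_id"] == customer_id]
    return {"name": customer["name"], "items": items}

# Document style: the same information stored as one self-contained document
customer_document = {"name": "Alice", "items": ["Laptop", "Tablet"]}

print(join_orders(1) == customer_document)  # True
```

The relational form needs a join at read time, while the document form returns everything in one lookup. The trade-off is that embedded data may be duplicated across documents.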

Q2. State and Explain the features of MongoDB.

Document Model:
MongoDB stores data as documents, which are essentially self-contained objects. These documents are grouped into collections.
Unlike traditional relational databases, MongoDB doesn’t enforce a rigid schema. Each document in a collection can have a different set of fields. This flexibility allows developers to iterate quickly and adapt to changing requirements.
Documents are stored in BSON (Binary JSON), a binary encoding that keeps JSON’s nested structure while adding types such as dates, 64-bit integers, and raw binary data. A single document is limited to 16 MB, so large files such as images or videos are usually stored through GridFS rather than inside one document.
Sharding:
Sharding is MongoDB’s way of distributing large datasets across multiple instances (shards). It helps handle massive amounts of data by distributing the workload.
With sharding, MongoDB can horizontally scale out, ensuring efficient query execution even for large datasets.
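
The idea of spreading documents across shards by a key can be sketched in plain Python. This is only a conceptual illustration: the `shard_for` and `distribute` helpers and the use of MD5 are assumptions made for the example, and they do not reproduce MongoDB's actual hashed-sharding function or its chunk-based balancing.

```python
import hashlib

def shard_for(key, num_shards):
    """Map a shard-key value to a shard index using a stable hash."""
    digest = hashlib.md5(str(key).encode("utf-8")).hexdigest()
    return int(digest, 16) % num_shards

def distribute(documents, shard_key, num_shards):
    """Group documents by the shard that their shard-key value hashes to."""
    shards = {i: [] for i in range(num_shards)}
    for doc in documents:
        shards[shard_for(doc[shard_key], num_shards)].append(doc)
    return shards

# Hypothetical documents keyed by user_id, spread over three shards
docs = [{"user_id": i, "name": f"user{i}"} for i in range(12)]
placement = distribute(docs, "user_id", 3)
for shard, members in placement.items():
    print(shard, [d["user_id"] for d in members])
```

Because the mapping depends only on the key, a query that includes the shard key can be sent to a single shard, while a query without it has to be broadcast to all of them.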
Horizontal Scaling and Load Balancing:
MongoDB excels at horizontal scaling. You can add more servers to your cluster as your data grows, without significant downtime.
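Load balancing: in a sharded cluster, the balancer redistributes data across shards in the background, and the mongos routers direct each query to the shards that hold the relevant data, which helps maintain performance as the cluster grows.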
Flexible Schema:
MongoDB’s flexible schema allows you to evolve your data model over time. You can add or remove fields without disrupting existing data.
However, if needed, you can enforce validation rules to maintain a more structured schema.
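
As a rough sketch of such validation rules, the snippet below defines a `$jsonSchema` validator and passes it to PyMongo's `create_collection`. The `users` collection, its `name` and `age` fields, and the `create_validated_collection` helper are hypothetical; `db` is assumed to be a PyMongo `Database` obtained from a running server.

```python
# Hypothetical rule set: every user needs a string name and a non-negative integer age
USER_VALIDATOR = {
    "$jsonSchema": {
        "bsonType": "object",
        "required": ["name", "age"],
        "properties": {
            "name": {"bsonType": "string"},
            "age": {"bsonType": "int", "minimum": 0},
        },
    }
}

def create_validated_collection(db, name="users"):
    """Create a collection whose writes are checked against USER_VALIDATOR.

    `db` is assumed to be a pymongo Database, e.g. MongoClient(uri)["my_database"].
    """
    return db.create_collection(name, validator=USER_VALIDATOR)
```

With this validator in place, the server would be expected to reject an insert such as `{"name": "Bob"}` because the required `age` field is missing, while documents may still carry extra fields beyond the two that are listed.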
Powerful Querying and Analytics:
MongoDB provides expressive query capabilities, including rich filtering, sorting, and aggregation.
Aggregation pipelines allow complex transformations and computations on data within the database.
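
A minimal sketch of an aggregation pipeline is shown here, assuming documents that carry hypothetical `city` and `age` fields. The `age_stats_by_city` helper is an illustrative name, and `collection` is assumed to be a PyMongo `Collection`.

```python
# Each stage receives the output of the previous one
PIPELINE = [
    # Stage 1: keep only documents for adults
    {"$match": {"age": {"$gte": 18}}},
    # Stage 2: group by city, computing the average age and a document count
    {"$group": {"_id": "$city", "avg_age": {"$avg": "$age"}, "count": {"$sum": 1}}},
    # Stage 3: put the largest groups first
    {"$sort": {"count": -1}},
]

def age_stats_by_city(collection):
    """Run PIPELINE on a pymongo Collection and return the result documents."""
    return list(collection.aggregate(PIPELINE))
```

Filtering with `$match` before `$group` keeps the grouping stage small, which is generally the recommended ordering.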
Change-Friendly Design:
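As your application evolves, MongoDB accommodates changes gracefully. Because fields can be added per document, many changes need no schema migration or downtime, although existing documents may still need to be backfilled by the application.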
This adaptability is especially valuable in agile development environments.
Widely Supported and Code-Native:
MongoDB has drivers for over 10 programming languages, making it accessible to developers across different tech stacks.
Developers can work with MongoDB using familiar programming paradigms.
BSON Storage Efficiency:
BSON’s binary format is faster to parse and traverse than text JSON because types and lengths are encoded explicitly, although a BSON document is not always smaller than its JSON equivalent.
Developers interact with BSON using MongoDB drivers, which abstract the binary encoding details.

Q3. Write a code to connect MongoDB to Python. Also, create a database and a collection in MongoDB.

In [1]:
pip install pymongo

Collecting pymongo
  Downloading pymongo-4.8.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.2 MB)
Collecting dnspython<3.0.0,>=1.16.0
  Downloading dnspython-2.6.1-py3-none-any.whl (307 kB)
Installing collected packages: dnspython, pymongo
Successfully installed dnspython-2.6.1 pymongo-4.8.0
Note: you may need to restart the kernel to use updated packages.


In [2]:
import pymongo
from pymongo.errors import ConnectionFailure

# Replace these values with your actual MongoDB connection details
MONGODB_URI = "mongodb://localhost:27017"  # Change to your MongoDB URI
DB_NAME = "my_database"  # Change to your desired database name
COLLECTION_NAME = "my_collection"  # Change to your desired collection name

def connect_to_mongodb():
    # MongoClient connects lazily, so creating it does not contact the server yet
    client = pymongo.MongoClient(MONGODB_URI, serverSelectionTimeoutMS=5000)
    try:
        # Force a round trip so that connection problems surface here
        client.admin.command("ping")
        print("Connected to MongoDB successfully!")

        # Access the database and collection; MongoDB creates both on the first write
        db = client[DB_NAME]
        collection = db[COLLECTION_NAME]

        # Insert a sample document (this is what actually creates the collection)
        sample_document = {"name": "John Doe", "age": 30}
        collection.insert_one(sample_document)
        print(f"Collection '{COLLECTION_NAME}' created in database '{DB_NAME}'.")
        print("Sample document inserted successfully.")

    except ConnectionFailure:
        print("Failed to connect to MongoDB. Check your connection settings.")
    finally:
        client.close()

if __name__ == "__main__":
    connect_to_mongodb()


Failed to connect to MongoDB. Check your connection settings.


Q4. Using the database and the collection created in question number 3, write a code to insert one record,
and insert many records. Use the find() and find_one() methods to print the inserted record.

In [5]:
pip install pymongo

Note: you may need to restart the kernel to use updated packages.


In [6]:
import pymongo

# Set up MongoDB connection
client = pymongo.MongoClient("mongodb://localhost:27017/")
db = client["my_database"]  # Database created in Q3
collection = db["my_collection"]  # Collection created in Q3


In [7]:
# Create a document (record) to insert
single_record = {
    "name": "John Doe",
    "age": 30,
    "email": "john@example.com"
}

# Insert the document into the collection
result = collection.insert_one(single_record)

# Print the inserted record's ID
print(f"Inserted record ID: {result.inserted_id}")


ServerSelectionTimeoutError: localhost:27017: [Errno 111] Connection refused (configured timeouts: socketTimeoutMS: 20000.0ms, connectTimeoutMS: 20000.0ms), Timeout: 30s, Topology Description: <TopologyDescription id: 66b325855b0f9ad992cea82d, topology_type: Unknown, servers: [<ServerDescription ('localhost', 27017) server_type: Unknown, rtt: None, error=AutoReconnect('localhost:27017: [Errno 111] Connection refused (configured timeouts: socketTimeoutMS: 20000.0ms, connectTimeoutMS: 20000.0ms)')>]>

In [None]:
# Create a list of documents (records) to insert
many_records = [
    {"name": "Alice", "age": 25, "email": "alice@example.com"},
    {"name": "Bob", "age": 28, "email": "bob@example.com"},
    {"name": "Eve", "age": 22, "email": "eve@example.com"}
]

# Insert the documents into the collection
result = collection.insert_many(many_records)

# Print the inserted record IDs
print(f"Inserted record IDs: {result.inserted_ids}")


In [None]:
# Find all records in the collection
all_records = collection.find()
for record in all_records:
    print(record)

# Find one specific record (e.g., the first one)
first_record = collection.find_one()
print("First record:")
print(first_record)


Q5. Explain how you can use the find() method to query the MongoDB database. Write a simple code to
demonstrate this.

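To use find(), you first connect to your MongoDB database using a driver (such as PyMongo for Python) and then call find() on a specific collection within the database.
find() takes two optional arguments: a filter document that selects which documents match (an empty filter matches all of them) and a projection that selects which fields are returned. It returns a cursor, which fetches the matching documents lazily as you iterate over it.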

In [8]:
# filter selects which documents match; projection selects which fields are returned
cursor = collection.find({"age": {"$gt": 25}}, {"_id": 0, "name": 1})

In [9]:
{
    "_id": 1,
    "name": "Alice",
    "age": 30,
    "email": "alice@example.com"
}


{'_id': 1, 'name': 'Alice', 'age': 30, 'email': 'alice@example.com'}

In [10]:
from pymongo import MongoClient

# Connect to MongoDB (assuming it's running locally)
client = MongoClient("mongodb://localhost:27017")
db = client["mydatabase"]
collection = db["users"]

# Query for users older than 25
query = { "age": { "$gt": 25 } }
result = collection.find(query)

# Print the results
for user in result:
    print(user)


ServerSelectionTimeoutError: localhost:27017: [Errno 111] Connection refused (configured timeouts: socketTimeoutMS: 20000.0ms, connectTimeoutMS: 20000.0ms), Timeout: 30s, Topology Description: <TopologyDescription id: 66b3266b5b0f9ad992cea82f, topology_type: Unknown, servers: [<ServerDescription ('localhost', 27017) server_type: Unknown, rtt: None, error=AutoReconnect('localhost:27017: [Errno 111] Connection refused (configured timeouts: socketTimeoutMS: 20000.0ms, connectTimeoutMS: 20000.0ms)')>]>
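
The pieces of a find() call can be combined in one small helper. The sketch below assumes a PyMongo `Collection` whose documents have `name` and `age` fields; the `names_older_than` function is a hypothetical name used only for illustration.

```python
def names_older_than(collection, min_age, limit=10):
    """Return up to `limit` documents with age > min_age, keeping only name and age."""
    query = {"age": {"$gt": min_age}}             # filter: which documents match
    projection = {"_id": 0, "name": 1, "age": 1}  # projection: 1 = include, 0 = exclude
    cursor = collection.find(query, projection).limit(limit)
    return list(cursor)  # iterating the cursor is what actually contacts the server
```

A call such as `names_older_than(collection, 25)` would then return a list of dictionaries. Other comparison operators such as `$lt`, `$gte`, and `$in` can be swapped into the filter in the same way.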

Q6. Explain the sort() method. Give an example to demonstrate sorting in MongoDB.

In [4]:
# PyMongo takes a list of (field, direction) pairs: 1 = ascending, -1 = descending
collection.find().sort([("field1", 1), ("field2", -1)])

In [None]:
{
    "_id": 1,
    "product_name": "Laptop",
    "price": 1200
},
{
    "_id": 2,
    "product_name": "Smartphone",
    "price": 800
},
{
    "_id": 3,
    "product_name": "Tablet",
    "price": 500
}
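
sort() is called on the cursor returned by find() and orders the results by one or more fields, where 1 means ascending and -1 means descending. As a minimal sketch, the helper below sorts the three product documents above by price. `products_by_price` is a hypothetical name, and `collection` is assumed to be a PyMongo `Collection` that already contains these documents.

```python
PRODUCTS = [
    {"_id": 1, "product_name": "Laptop", "price": 1200},
    {"_id": 2, "product_name": "Smartphone", "price": 800},
    {"_id": 3, "product_name": "Tablet", "price": 500},
]

def products_by_price(collection, descending=False):
    """Return all products ordered by price (1 = ascending, -1 = descending)."""
    direction = -1 if descending else 1
    return list(collection.find().sort("price", direction))

# Without a server, the same ascending order can be previewed in plain Python:
preview = sorted(PRODUCTS, key=lambda doc: doc["price"])
print([doc["product_name"] for doc in preview])  # ['Tablet', 'Smartphone', 'Laptop']
```

With a running server, `collection.insert_many(PRODUCTS)` followed by `products_by_price(collection)` would be expected to return the documents in that same order.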


Q7. Explain why delete_one(), delete_many(), and drop() are used.

delete_one(filter):
This method is used to delete a single document from a collection based on a specified filter condition.
You provide a filter (usually in the form of a dictionary) that defines which document(s) you want to delete.
If multiple documents match the filter, only the first one encountered will be deleted.


In [1]:
from pymongo import MongoClient
db = MongoClient("mongodb://localhost:27017")["mydatabase"]

# Assuming we have a "users" collection
result = db.users.delete_one({"username": "alice"})
print(f"Deleted {result.deleted_count} document(s).")

delete_many(filter):
Similar to delete_one(), but it deletes multiple documents that match the specified filter.
Useful when you want to remove several documents at once.

In [2]:
from pymongo import MongoClient
db = MongoClient("mongodb://localhost:27017")["mydatabase"]

# Delete all documents where the "age" field is less than 30
result = db.users.delete_many({"age": {"$lt": 30}})
print(f"Deleted {result.deleted_count} document(s).")

drop():
Unlike the previous methods, drop() operates at the collection level, not the document level.
It removes an entire collection from the database.
Use with caution because it is irreversible: the collection’s documents and indexes are removed and cannot be recovered without a backup.

In [3]:
from pymongo import MongoClient
db = MongoClient("mongodb://localhost:27017")["mydatabase"]
# Drop the "products" collection
db.products.drop()
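
Because deletions cannot be undone, one cautious pattern is to count the matching documents before removing them. The sketch below assumes a PyMongo `Collection`; `delete_with_preview` is a hypothetical helper name.

```python
def delete_with_preview(collection, query):
    """Count the documents matching `query`, delete them, and return both numbers."""
    matched = collection.count_documents(query)   # how many documents would be removed
    result = collection.delete_many(query)        # remove every matching document
    return matched, result.deleted_count
```

For example, `delete_with_preview(db.users, {"age": {"$lt": 30}})` returns a pair such as `(matched, deleted)`. The two numbers can differ if other clients modify the collection between the two calls.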