1.  What are the key differences between SQL and NoSQL databases?

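Ans: SQL databases are relational: they store data in tables with a fixed schema and combine tables with joins. NoSQL databases such as MongoDB use other data models (MongoDB stores documents), have a flexible schema, and are designed to scale horizontally.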

2.  What makes MongoDB a good choice for modern applications?

Ans: It scales horizontally through sharding, its flexible document model lets the schema evolve with the application, and it handles semi-structured and unstructured data well.

3.  Explain the concept of collections in MongoDB.

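Ans: Collections are groups of MongoDB documents, similar to tables in SQL. Unlike a table, a collection does not enforce a schema by default, so its documents can have different fields.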

4.  How does MongoDB ensure high availability using replication?

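Ans: MongoDB uses replica sets: one primary node accepts writes and secondary nodes replicate its data. If the primary becomes unavailable, the secondaries elect a new primary automatically.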

5.  What are the main benefits of MongoDB Atlas?

Ans: Atlas is a managed cloud service that provides automated backups, scaling, built-in security controls, and monitoring.

6.  What is the role of indexes in MongoDB, and how do they improve performance?

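Ans: Indexes let MongoDB locate matching documents without scanning the whole collection, which speeds up queries and sorts. The trade-off is extra storage and slightly slower writes.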
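
As a minimal sketch of creating an index from Python, assuming the `orders` collection and `Region` field used in the practical section; the helper name `ensure_region_index` is hypothetical:

```python
def ensure_region_index(collection):
    """Create an ascending index on Region (hypothetical helper).

    `collection` is assumed to be a pymongo Collection, e.g. db["orders"].
    """
    # 1 means ascending order; create_index returns the name of the index.
    return collection.create_index([("Region", 1)])
```

Against a live server, calling `ensure_region_index(orders_collection)` lets a query such as `find({"Region": "West"})` use the index instead of scanning every document.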

7.  Describe the stages of the MongoDB aggregation pipeline.

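Ans: A pipeline passes documents through an ordered list of stages. Typical stages: $match (filter), $group (aggregate), $sort (order), $project (reshape fields), $limit (cap the result count), and $lookup (join another collection).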
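
To illustrate how stages chain together, here is a small sketch that builds a pipeline as a plain Python list. The function name `build_sales_pipeline` and its parameters are assumptions for illustration; the field names come from the Superstore data used in the practical section.

```python
def build_sales_pipeline(min_sales, top_n):
    """Return an aggregation pipeline as a list of stage dicts."""
    return [
        # Keep only orders above the sales threshold.
        {"$match": {"Sales": {"$gt": min_sales}}},
        # Sum sales per region.
        {"$group": {"_id": "$Region", "total_sales": {"$sum": "$Sales"}}},
        # Largest totals first.
        {"$sort": {"total_sales": -1}},
        # Keep only the first top_n groups.
        {"$limit": top_n},
    ]


pipeline = build_sales_pipeline(min_sales=500, top_n=3)
```

The list can then be passed to `orders_collection.aggregate(pipeline)`. Stage order matters: filtering with `$match` first reduces the number of documents the later stages must process.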

8.  What is sharding in MongoDB? How does it differ from replication?

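Ans: Sharding is horizontal scaling: it partitions data across multiple servers (shards) using a shard key. Replication keeps copies of the same data on multiple servers for redundancy and high availability.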

9.  What is PyMongo, and why is it used?

Ans: PyMongo is the official Python driver for MongoDB; it lets Python applications connect to a MongoDB deployment and run queries and commands.

10.  What are the ACID properties in the context of MongoDB transactions?

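Ans: ACID stands for atomicity, consistency, isolation, and durability. Single-document operations are always atomic in MongoDB, and multi-document ACID transactions are supported since version 4.0 (on replica sets).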

11.  What is the purpose of MongoDB’s explain() function?

Ans: explain() reports how MongoDB executes a query, such as the chosen plan and whether an index was used, which helps with optimization.
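
The sketch below shows one way to read an explain result in Python. It assumes the classic output layout, where each plan node has a `"stage"` key and an optional `"inputStage"` child; the layout can differ between server versions. The helper `plan_stages` is hypothetical, and `sample` is hand-written for illustration, not real server output.

```python
def plan_stages(explain_output):
    """List the stage names of the winning plan, outermost first."""
    node = explain_output["queryPlanner"]["winningPlan"]
    stages = []
    while node is not None:
        stages.append(node["stage"])
        node = node.get("inputStage")  # None when there is no child stage
    return stages


# Hand-written sample in the shape explain() can return (illustrative only).
sample = {
    "queryPlanner": {
        "winningPlan": {
            "stage": "FETCH",
            "inputStage": {"stage": "IXSCAN", "indexName": "Region_1"},
        }
    }
}
print(plan_stages(sample))  # ['FETCH', 'IXSCAN']
```

With PyMongo, the input would come from something like `orders_collection.find({"Region": "West"}).explain()`. An `IXSCAN` stage indicates an index was used, while `COLLSCAN` indicates a full collection scan.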

12.  How does MongoDB handle schema validation?

Ans: MongoDB lets you attach validation rules to a collection, expressed with the $jsonSchema operator, and checks documents against them on insert and update.
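
A hedged sketch of such a rule set follows. The required fields, the collection name `orders_validated`, and the helper `create_validated_collection` are assumptions chosen for illustration.

```python
# Validator: every document must have a string Region and a non-negative Sales.
order_validator = {
    "$jsonSchema": {
        "bsonType": "object",
        "required": ["Region", "Sales"],
        "properties": {
            "Region": {"bsonType": "string"},
            "Sales": {"bsonType": ["int", "long", "double"], "minimum": 0},
        },
    }
}


def create_validated_collection(db, name="orders_validated"):
    """Create a collection that rejects documents failing the validator.

    `db` is assumed to be a pymongo Database object.
    """
    return db.create_collection(name, validator=order_validator)
```

With this in place, inserting a document without a `Region` field into the new collection would be rejected by the server.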

13.  What is the difference between a primary and a secondary node in a replica set?

Ans: The primary receives all writes and serves reads by default; secondaries replicate the primary's data and can serve reads when the client's read preference allows it.

14.  What security mechanisms does MongoDB provide for data protection?

Ans: Authentication, role-based authorization, encryption in transit (TLS/SSL) and at rest, and auditing.

15.  Explain the concept of embedded documents and when they should be used.

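Ans: Embedded documents store related data inside the parent document, so a single read returns everything. Use them when the data is accessed together and the nested data stays small (a document is limited to 16 MB).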
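
The difference between embedding and referencing can be seen in the shape of the documents themselves. The field names and values below are made up for illustration.

```python
# Embedded: the shipping address lives inside the order document.
order_embedded = {
    "_id": 1,
    "customer": "Alice",
    "shipping": {"city": "Seattle", "state": "Washington"},
}

# Referenced: the order stores only the id of a separate address document.
address = {"_id": 10, "city": "Seattle", "state": "Washington"}
order_referenced = {"_id": 1, "customer": "Alice", "address_id": 10}

# Reading the city from the embedded form needs no second lookup.
city = order_embedded["shipping"]["city"]

# Dot notation reaches into embedded documents in a query filter.
embedded_city_filter = {"shipping.city": "Seattle"}
```

The filter could be passed to `find()` on a collection holding documents shaped like `order_embedded`; the referenced form would need a second query or a `$lookup` to retrieve the address.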

16.  What is the purpose of MongoDB’s $lookup stage in aggregation?

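Ans: $lookup performs a left outer join: for each input document, it adds an array of matching documents from another collection, similar to a SQL LEFT OUTER JOIN.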
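
A minimal sketch of a `$lookup` stage follows. It assumes a second collection named `customers` and a shared `Customer ID` field, neither of which is created elsewhere in this notebook.

```python
lookup_pipeline = [
    {
        "$lookup": {
            "from": "customers",            # hypothetical collection to join
            "localField": "Customer ID",    # field on the input (orders) documents
            "foreignField": "Customer ID",  # field on the customers documents
            "as": "customer_info",          # name of the output array field
        }
    }
]
```

Passing this to `orders_collection.aggregate(lookup_pipeline)` would give each order a `customer_info` array; the array is empty when no customer matches, which is what makes the join a left outer join.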

17.  What are some common use cases for MongoDB?

Ans: Real-time analytics, content management, IoT, e-commerce, mobile apps.

18.  What are the advantages of using MongoDB for horizontal scaling?

Ans: Spreading data across shards distributes read and write load, adds capacity by adding servers, and lets datasets grow beyond what a single machine can hold.

19.  How do MongoDB transactions differ from SQL transactions?

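Ans: Both provide ACID guarantees. In MongoDB, single-document writes are atomic without a transaction, and multi-document transactions are run explicitly through a session in the driver API rather than with SQL statements such as BEGIN and COMMIT.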
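
The session-based style can be sketched as follows. This assumes a `MongoClient` connected to a replica set (transactions are not available on a standalone server); the helper `move_order` and the `orders_archive` collection are hypothetical.

```python
def move_order(client, order_id):
    """Move one order into an archive collection inside a transaction.

    `client` is assumed to be a pymongo MongoClient on a replica set.
    Returns False if no order with the given _id exists.
    """
    db = client["superstore_db"]
    with client.start_session() as session:
        # The transaction commits on normal exit and aborts on an exception.
        with session.start_transaction():
            order = db["orders"].find_one({"_id": order_id}, session=session)
            if order is None:
                return False
            db["orders_archive"].insert_one(order, session=session)
            db["orders"].delete_one({"_id": order_id}, session=session)
    return True
```

Every operation that belongs to the transaction receives the same `session` argument; if the insert or the delete raises, neither change is applied.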

20.  What are the main differences between capped collections and regular collections?

Ans: Capped collections have a fixed maximum size and preserve insertion order; once full, they overwrite the oldest documents. Regular collections grow without a fixed limit.
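
As a sketch, a capped collection is created by passing extra options at creation time. The collection name `recent_events`, the size values, and the helper `create_log_collection` are assumptions for illustration.

```python
# size is the byte limit (required for capped collections);
# max is an optional limit on the number of documents.
capped_options = {"capped": True, "size": 1024 * 1024, "max": 1000}


def create_log_collection(db, name="recent_events"):
    """Create a capped collection; `db` is assumed to be a pymongo Database."""
    return db.create_collection(name, **capped_options)
```

Once the collection reaches 1 MiB or 1000 documents, whichever comes first, each new insert displaces the oldest document.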

21.  What is the purpose of the $match stage in MongoDB’s aggregation pipeline?

Ans: Filters documents in the pipeline, like a WHERE clause.

22.  How can you secure access to a MongoDB database?

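Ans: Enable authentication, restrict network access with an IP allowlist, encrypt connections with TLS, and grant least-privilege roles through role-based access control.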

23.  What is MongoDB’s WiredTiger storage engine, and why is it important?

Ans: WiredTiger is the default storage engine since MongoDB 3.2; it provides document-level concurrency control and data compression, which improve throughput and reduce storage use.

# Practical

#1. Write a Python script to load the Superstore dataset from a CSV file into MongoDB.

import pandas as pd
from pymongo import MongoClient

# Load CSV
df = pd.read_csv('superstore.csv')

# Connect to MongoDB
client = MongoClient("mongodb://localhost:27017/")
db = client["superstore_db"]
orders_collection = db["orders"]

# Insert records
orders_collection.insert_many(df.to_dict(orient="records"))


#2. Retrieve and print all documents from the Orders collection.

for doc in orders_collection.find():
    print(doc)


#3. Count and display the total number of documents in the Orders collection.

print("Total documents:", orders_collection.count_documents({}))


#4. Write a query to fetch all orders from the "West" region.

for doc in orders_collection.find({"Region": "West"}):
    print(doc)

#5. Write a query to find orders where Sales is greater than 500.

for doc in orders_collection.find({"Sales": {"$gt": 500}}):
    print(doc)

#6. Fetch the top 3 orders with the highest Profit.

for doc in orders_collection.find().sort("Profit", -1).limit(3):
    print(doc)

#7. Update all orders with Ship Mode "First Class" to "Premium Class".

orders_collection.update_many(
    {"Ship Mode": "First Class"},
    {"$set": {"Ship Mode": "Premium Class"}}
)

#8. Delete all orders where Sales is less than 50.

orders_collection.delete_many({"Sales": {"$lt": 50}})


#9. Use aggregation to group orders by Region and calculate total sales per region.

pipeline = [
    {"$group": {"_id": "$Region", "total_sales": {"$sum": "$Sales"}}}
]
for doc in orders_collection.aggregate(pipeline):
    print(doc)

#10. Fetch all distinct values for Ship Mode from the collection.

print(orders_collection.distinct("Ship Mode"))

#11. Count the number of orders for each category.

pipeline = [
    {"$group": {"_id": "$Category", "order_count": {"$sum": 1}}}
]
for doc in orders_collection.aggregate(pipeline):
    print(doc)
