Q1-NoSQL Databases:

a) Write a Python program that connects to a MongoDB database and inserts a new document into a collection named "students". The document should include fields such as "name", "age", and "grade". Print a success message after the insertion.

from pymongo import MongoClient

# Connect to MongoDB
client = MongoClient("mongodb://localhost:27017/")

# Access the "students" collection
db = client["mydatabase"]
collection = db["students"]

# Create a new document
student = {
    "name": "John",
    "age": 18,
    "grade": "A"
}

# Insert the document into the collection
result = collection.insert_one(student)

# Print a success message
if result.acknowledged:
    print("Document inserted successfully.")

b) Implement a Python function that connects to a Cassandra database and inserts a new record into a table named "products". The record should contain fields like "id", "name", and "price". Handle any potential errors that may occur during the insertion.

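from cassandra.cluster import Cluster


def insert_product(session, product):
    """Insert one record into the "products" table; return True on success."""
    try:
        session.execute(
            "INSERT INTO products (id, name, price) VALUES (%s, %s, %s)",
            (product["id"], product["name"], product["price"]),
        )
        print("Record inserted successfully.")
        return True
    except Exception as e:
        print("Error occurred during insertion:", str(e))
        return False


# Connect to Cassandra. "mykeyspace" is a placeholder: the unqualified table
# name "products" only resolves once the session is bound to a keyspace.
cluster = Cluster(["localhost"])
session = cluster.connect("mykeyspace")

# Create a new record
product = {"id": 1, "name": "Product 1", "price": 9.99}

# Insert the record into the "products" table
insert_product(session, product)

# Close the connection to the cluster
cluster.shutdown()

Q2-Document-oriented NoSQL Databases: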

a) Given a MongoDB collection named "books", write a Python function that fetches all the books published in the last year and prints their titles and authors.


from pymongo import MongoClient
from datetime import datetime, timedelta, timezone

# Connect to MongoDB
client = MongoClient("mongodb://localhost:27017/")

# Access the "books" collection
db = client["mydatabase"]
collection = db["books"]

# Calculate the date one year ago in UTC, since MongoDB stores dates as UTC
one_year_ago = datetime.now(timezone.utc) - timedelta(days=365)

# Query the books published in the last year
query = {"publication_date": {"$gt": one_year_ago}}
books = collection.find(query)

# Print the titles and authors of the books
for book in books:
    print("Title:", book["title"])
    print("Author:", book["author"])
    print()
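The cutoff above uses timedelta(days=365), which drifts by one day whenever the window spans a leap day. A minimal sketch of a calendar-exact cutoff follows. The helper names one_year_before and recent_books_query are made up for illustration; only the "publication_date" field and the "$gt" operator come from the query above.

```python
from datetime import datetime, timezone


def one_year_before(moment):
    """Return the same calendar date one year earlier."""
    try:
        return moment.replace(year=moment.year - 1)
    except ValueError:
        # Only 29 February can fail here, because the year before a leap
        # year is never a leap year; fall back to 28 February.
        return moment.replace(year=moment.year - 1, day=28)


def recent_books_query(now=None):
    """Build the MongoDB filter for books published in the last year."""
    if now is None:
        now = datetime.now(timezone.utc)
    return {"publication_date": {"$gt": one_year_before(now)}}


# Example: the filter built for a fixed "now" of 1 March 2024 (UTC)
example_query = recent_books_query(datetime(2024, 3, 1, tzinfo=timezone.utc))
print(example_query)
```

Passing the resulting dictionary to collection.find() would work the same way as the hand-written query; accepting "now" as a parameter keeps the date logic checkable without a database.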

b) Design a schema for a document-oriented NoSQL database to store customer information for an e-commerce platform. Write a Python program to insert a new customer document into the database and handle any necessary validations.


from pymongo import MongoClient

# Connect to MongoDB
client = MongoClient("mongodb://localhost:27017/")

# Access the "customers" collection
db = client["mydatabase"]
collection = db["customers"]

# Validate the customer data (example validation)
def validate_customer(customer):
    required_fields = ["name", "email", "address"]
    for field in required_fields:
        if field not in customer:
            raise ValueError(f"Missing required field: {field}")

# Create a new customer document
customer = {
    "name": "John Doe",
    "email": "johndoe@example.com",
    "address": "123 Main Street"
}

# Validate the customer document
try:
    validate_customer(customer)
    collection.insert_one(customer)
    print("Customer document inserted successfully.")
except ValueError as e:
    print("Error occurred during customer validation:", str(e))
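The check above only confirms that the three fields are present. A slightly stricter validator, sketched below, also rejects empty values and obviously malformed email addresses. The function name find_customer_errors and the email pattern are illustrative assumptions; the pattern is deliberately loose and is not a complete email grammar.

```python
import re

# Deliberately loose pattern: something@something.tld with no whitespace.
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")
REQUIRED_FIELDS = ("name", "email", "address")


def find_customer_errors(customer):
    """Return a list of validation problems; an empty list means valid."""
    errors = []
    for field in REQUIRED_FIELDS:
        value = customer.get(field)
        if not isinstance(value, str) or not value.strip():
            errors.append(f"Missing or empty required field: {field}")
    email = customer.get("email")
    if isinstance(email, str) and email.strip() and not EMAIL_PATTERN.match(email):
        errors.append(f"Invalid email address: {email}")
    return errors


good = {"name": "John Doe", "email": "johndoe@example.com", "address": "123 Main Street"}
bad = {"name": "", "email": "not-an-email"}

print(find_customer_errors(good))
print(find_customer_errors(bad))
```

Collecting every problem in a list lets the caller report all issues at once; the insert would run only when the list comes back empty.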

Q3-High Availability and Fault Tolerance:

a) Explain the concept of replica sets in MongoDB. Write a Python program that connects to a MongoDB replica set and retrieves the status of the primary and secondary nodes.

from pymongo import MongoClient

# Connect to MongoDB replica set (host names and ports are placeholders)
client = MongoClient("mongodb://host1:27017,host2:27017,host3:27017/?replicaSet=myreplset")

# Get the status of the replica set
status = client.admin.command("replSetGetStatus")

# Print the status of the primary and secondary nodes
for member in status["members"]:
    if member["stateStr"] == "PRIMARY":
        print("Primary Node:", member["name"])
    elif member["stateStr"] == "SECONDARY":
        print("Secondary Node:", member["name"])
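A replica set is a group of mongod processes that hold the same data: one primary accepts writes, the secondaries replicate the primary's operation log, and if the primary becomes unreachable the remaining members elect a new one. The sketch below separates the status parsing from the connection so that it can be checked without a running cluster. The function name summarize_replica_set and the sample host names are hypothetical; only the "members", "name", and "stateStr" keys mirror what the loop above reads.

```python
def summarize_replica_set(status):
    """Group member names from a replSetGetStatus-style dict by role."""
    summary = {"primary": None, "secondaries": [], "other": []}
    for member in status.get("members", []):
        state = member.get("stateStr")
        if state == "PRIMARY":
            summary["primary"] = member["name"]
        elif state == "SECONDARY":
            summary["secondaries"].append(member["name"])
        else:
            # Any other state (for example ARBITER or RECOVERING)
            summary["other"].append((member["name"], state))
    return summary


# Hand-written sample shaped like the server's reply (hypothetical hosts)
sample_status = {
    "set": "myreplset",
    "members": [
        {"name": "host1:27017", "stateStr": "PRIMARY"},
        {"name": "host2:27017", "stateStr": "SECONDARY"},
        {"name": "host3:27017", "stateStr": "RECOVERING"},
    ],
}

summary = summarize_replica_set(sample_status)
print("Primary Node:", summary["primary"])
print("Secondary Nodes:", summary["secondaries"])
print("Other members:", summary["other"])
```

Keeping a separate "other" bucket makes members that are neither primary nor secondary visible instead of silently skipping them.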

b) Describe how Cassandra ensures high availability and fault tolerance in a distributed database system. Write a Python program that connects to a Cassandra cluster and fetches the status of the nodes.


from cassandra.cluster import Cluster

# Connect to Cassandra cluster
cluster = Cluster(["host1", "host2", "host3"])
session = cluster.connect()

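# Get the nodes the driver discovered from the cluster metadata.
# (system.local describes only the connected node and has no "ip" or
# "status" column, so the driver's host list is used instead.)
hosts = cluster.metadata.all_hosts()

# Print the status of the nodes as seen by this client
for host in hosts:
    status = "UP" if host.is_up else "DOWN"
    print("Node:", host.address, "Status:", status)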
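Cassandra has no single primary node: each row is copied to as many nodes as the keyspace's replication factor (RF) specifies, and every request chooses a consistency level that says how many of those replicas must respond. The arithmetic behind the QUORUM level can be sketched as follows; the function names are illustrative and not part of any driver API.

```python
def quorum(replication_factor):
    """Replicas that must respond at consistency level QUORUM."""
    return replication_factor // 2 + 1


def tolerated_failures(replication_factor):
    """Replicas that can be down while QUORUM requests still succeed."""
    return replication_factor - quorum(replication_factor)


def reads_see_latest_write(read_replicas, write_replicas, replication_factor):
    """True when every read set overlaps every write set (R + W > RF)."""
    return read_replicas + write_replicas > replication_factor


for rf in (1, 3, 5):
    print("RF", rf, "quorum", quorum(rf), "tolerates", tolerated_failures(rf))
```

With RF = 3, QUORUM reads and writes each need two replicas, so one replica can fail without blocking requests, and because 2 + 2 > 3 a quorum read always overlaps the latest quorum write.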

Q4-Sharding in MongoDB:

a) Explain the concept of sharding in MongoDB and how it improves performance and scalability. Write a Python program that sets up sharding for a MongoDB cluster and inserts multiple documents into a sharded collection.

from pymongo import MongoClient
from bson import ObjectId

# Connect to MongoDB
client = MongoClient("mongodb://mongos1:27017,mongos2:27017,mongos3:27017/")

# Enable sharding for a database
admin_db = client.admin
admin_db.command("enableSharding", "mydatabase")

# Shard a collection based on a specific key
db = client["mydatabase"]
collection = db["mycollection"]
collection.create_index([("shard_key", "hashed")])
admin_db.command("shardCollection", "mydatabase.mycollection", key={"shard_key": "hashed"})

# Insert multiple documents into the sharded collection
documents = [
    {"_id": ObjectId(), "shard_key": "abc", "field1": "value1"},
    {"_id": ObjectId(), "shard_key": "def", "field1": "value2"},
    {"_id": ObjectId(), "shard_key": "ghi", "field1": "value3"}
]
collection.insert_many(documents)
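Sharding splits a collection across several servers according to the shard key, so that storage and write load scale with the number of shards. The sketch below illustrates why a hashed key is often chosen when key values only ever increase: with range-based placement every new document lands on the shard that owns the highest range, while hashing spreads them out. This is a simplification under stated assumptions: MongoDB assigns ranges of hashed values to chunks rather than taking a modulus, and the shard count, range boundaries, and function names here are invented for the illustration.

```python
import hashlib
from collections import Counter

NUM_SHARDS = 3  # assumed cluster size for the illustration


def range_shard(key, boundaries=(1000, 2000)):
    """Range placement: shard 0 holds keys < 1000, shard 1 < 2000, shard 2 the rest."""
    for shard, upper in enumerate(boundaries):
        if key < upper:
            return shard
    return len(boundaries)


def hashed_shard(key, num_shards=NUM_SHARDS):
    """Hashed placement, simplified to hash(key) modulo the shard count."""
    digest = hashlib.sha256(str(key).encode("utf-8")).hexdigest()
    return int(digest, 16) % num_shards


# 1000 newly inserted documents with ever-increasing keys
new_keys = range(2000, 3000)
range_counts = Counter(range_shard(k) for k in new_keys)
hashed_counts = Counter(hashed_shard(k) for k in new_keys)

print("range: ", dict(range_counts))
print("hashed:", dict(hashed_counts))
```

The range-based counts show a single hot shard, whereas the hashed counts are spread over all three; the trade-off is that range queries on a hashed key can no longer be routed to one shard.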

b) Design a sharding strategy for a social media application where user data needs to be distributed across multiple shards. Write a Python program to demonstrate how data is distributed and retrieved from the sharded cluster.


from pymongo import MongoClient
from bson import ObjectId

# Connect to MongoDB
client = MongoClient("mongodb://mongos1:27017,mongos2:27017,mongos3:27017/")

# Enable sharding for a database
admin_db = client.admin
admin_db.command("enableSharding", "social_media")

# Shard the "users" collection based on the user ID
db = client["social_media"]
collection = db["users"]
collection.create_index([("_id", "hashed")])
admin_db.command("shardCollection", "social_media.users", key={ "_id": "hashed" })

# Insert user data into the sharded collection
users = [
    { "_id": ObjectId(), "name": "John Doe", "age": 25 },
    { "_id": ObjectId(), "name": "Jane Smith", "age": 30 },
    { "_id": ObjectId(), "name": "Tom Johnson", "age": 35 }
]
collection.insert_many(users)

# Query the user data from the sharded collection
result = collection.find({ "age": { "$gte": 30 } })

# Print the retrieved user data
for user in result:
    print("Name:", user["name"])
    print("Age:", user["age"])
    print()
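How a query is routed depends on whether it includes the shard key. A lookup by "_id" can be sent to the one shard that owns that key, while the "age" query above carries no shard key and therefore has to be sent to every shard. The in-memory model below is only a sketch of that routing behaviour: the class ToyShardedCollection, its methods, and the integer "_id" values are hypothetical stand-ins, and real routing is done by mongos using chunk ranges rather than a modulus.

```python
import hashlib


class ToyShardedCollection:
    """In-memory stand-in for a collection sharded on a hashed _id."""

    def __init__(self, num_shards=3):
        self.shards = [dict() for _ in range(num_shards)]

    def _shard_for(self, doc_id):
        digest = hashlib.sha256(str(doc_id).encode("utf-8")).hexdigest()
        return int(digest, 16) % len(self.shards)

    def insert(self, doc):
        self.shards[self._shard_for(doc["_id"])][doc["_id"]] = doc

    def find_by_id(self, doc_id):
        """Targeted query: the shard key identifies the single shard to ask."""
        shard = self._shard_for(doc_id)
        return self.shards[shard].get(doc_id), 1

    def find_by_min_age(self, min_age):
        """Scatter-gather query: no shard key, so every shard is asked."""
        matches = [
            doc
            for shard in self.shards
            for doc in shard.values()
            if doc["age"] >= min_age
        ]
        return matches, len(self.shards)


users = ToyShardedCollection()
people = [("John Doe", 25), ("Jane Smith", 30), ("Tom Johnson", 35)]
for i, (name, age) in enumerate(people):
    users.insert({"_id": i, "name": name, "age": age})

doc, shards_asked = users.find_by_id(1)
print(doc["name"], "found by asking", shards_asked, "shard")

matches, shards_asked_all = users.find_by_min_age(30)
names = sorted(m["name"] for m in matches)
print(names, "found by asking", shards_asked_all, "shards")
```

For a social media workload this suggests choosing a shard key that the most frequent queries include, so that those queries stay targeted.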

Q5-Indexing in MongoDB:

a) Describe the concept of indexing in MongoDB and its importance in query optimization. Write a Python program that creates an index on a specific field in a MongoDB collection and executes a query using that index.


from pymongo import MongoClient

# Connect to MongoDB
client = MongoClient("mongodb://localhost:27017/")

# Access the collection
db = client["mydatabase"]
collection = db["mycollection"]

# Create an index on a specific field
collection.create_index("field_name")

# Execute a query using the index
result = collection.find({"field_name": "value"})

# Print the query results
for document in result:
    print(document)
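An index keeps the values of a field in sorted order together with pointers to the documents, so an equality query can jump to the matching entries instead of examining every document. The toy model below contrasts the two access paths using a sorted list and binary search. MongoDB's default indexes are B-trees, not Python lists, and the sample documents and function names are invented for the illustration.

```python
from bisect import bisect_left, bisect_right

documents = [
    {"_id": 1, "field_name": "pear"},
    {"_id": 2, "field_name": "apple"},
    {"_id": 3, "field_name": "value"},
    {"_id": 4, "field_name": "fig"},
    {"_id": 5, "field_name": "value"},
]


def collection_scan(docs, value):
    """Without an index: examine every document."""
    return [d["_id"] for d in docs if d["field_name"] == value]


def build_index(docs):
    """Toy index: sorted field values plus the matching document positions."""
    entries = sorted((d["field_name"], pos) for pos, d in enumerate(docs))
    keys = [key for key, _ in entries]
    positions = [pos for _, pos in entries]
    return keys, positions


def index_scan(docs, index, value):
    """With an index: binary-search the sorted keys, then fetch only the matches."""
    keys, positions = index
    lo, hi = bisect_left(keys, value), bisect_right(keys, value)
    return [docs[pos]["_id"] for pos in positions[lo:hi]]


index = build_index(documents)
print(collection_scan(documents, "value"))
print(index_scan(documents, index, "value"))
```

Both paths return the same documents; the difference is how many entries are examined. Against a real server, collection.find({...}).explain() reports whether the winning plan is an index scan (IXSCAN) or a collection scan (COLLSCAN).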

b) Given a MongoDB collection named "products", write a Python function that searches for products with a specific keyword in the name or description. Optimize the query by adding appropriate indexes.


from pymongo import MongoClient

# Connect to MongoDB
client = MongoClient("mongodb://localhost:27017/")

# Access the collection
db = client["mydatabase"]
collection = db["products"]

# Create a single text index covering both fields. Ordinary single-field
# indexes cannot serve a case-insensitive, unanchored regex efficiently,
# so a text index is used for keyword search instead.
collection.create_index([("name", "text"), ("description", "text")])

# Search for products with a specific keyword.
# Note: $text matches whole words (case-insensitively), not substrings.
keyword = "keyword"
query = {"$text": {"$search": keyword}}
result = collection.find(query)

# Print the products matching the keyword
for product in result:
    print("Name:", product["name"])
    print("Description:", product["description"])
    print()
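Keyword search is usually served by an inverted index, which maps each word to the documents that contain it; this is the general idea behind MongoDB's text indexes. A minimal sketch follows. The sample products, the tokenizer, and the function names are illustrative assumptions, and real text indexes additionally apply stemming and stop-word removal, which this sketch omits.

```python
import re
from collections import defaultdict

# Hypothetical sample data for the illustration
products = [
    {"_id": 1, "name": "Trail Backpack", "description": "Lightweight pack for hiking"},
    {"_id": 2, "name": "Desk Lamp", "description": "LED lamp with a backpack-friendly folding arm"},
    {"_id": 3, "name": "Water Bottle", "description": "Insulated steel bottle"},
]


def tokenize(text):
    """Lower-case the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_inverted_index(docs, fields=("name", "description")):
    """Map every word in the chosen fields to the ids of documents containing it."""
    index = defaultdict(set)
    for doc in docs:
        for field in fields:
            for token in tokenize(doc.get(field, "")):
                index[token].add(doc["_id"])
    return index


def search(index, keyword):
    """Return the sorted ids of documents containing the keyword as a whole word."""
    tokens = tokenize(keyword)
    if not tokens:
        return []
    return sorted(index.get(tokens[0], set()))


inverted = build_inverted_index(products)
print(search(inverted, "backpack"))
print(search(inverted, "LAMP"))
print(search(inverted, "pack"))
```

The last search shows the whole-word behaviour: "pack" finds only the product whose description contains that word, not the ones that merely contain "backpack".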