1. NoSQL Databases:
   a. Write a Python program that connects to a MongoDB database and inserts a new document into a collection named "students". The document should include fields such as "name", "age", and "grade". Print a success message after the insertion.
   b. Implement a Python function that connects to a Cassandra database and inserts a new record into a table named "products". The record should contain fields like "id", "name", and "price". Handle any potential errors that may occur during the insertion.


In [None]:
# a)

from pymongo import MongoClient

# Connect to MongoDB
client = MongoClient('mongodb://localhost:27017/')
database = client['mydatabase']  # Replace 'mydatabase' with your database name
collection = database['students']  # Replace 'students' with your collection name

# Prepare the document
document = {
    'name': 'John Doe',
    'age': 20,
    'grade': 'A'
}

# Insert the document into the collection
collection.insert_one(document)

# Print success message
print("Document inserted successfully.")


In [None]:
#b)

from cassandra.cluster import Cluster
from cassandra.auth import PlainTextAuthProvider
from cassandra import ConsistencyLevel

def insert_product(product_id, product_name, product_price):
    # Connect to Cassandra
    auth_provider = PlainTextAuthProvider(username='your_username', password='your_password')
    cluster = Cluster(['localhost'], auth_provider=auth_provider)

    try:
        session = cluster.connect('your_keyspace')  # Replace 'your_keyspace' with your keyspace name

        # Prepare the query; the consistency level is set on the prepared statement itself
        prepared = session.prepare("""
            INSERT INTO products (id, name, price)
            VALUES (?, ?, ?)
        """)
        prepared.consistency_level = ConsistencyLevel.QUORUM

        # Execute the query with the bound values
        session.execute(prepared, (product_id, product_name, product_price))
        print("Record inserted successfully.")
    except Exception as e:
        # Covers connection, preparation, and execution failures
        print("Error occurred while inserting record:", str(e))
    finally:
        # Close the connection (and its sessions) even if the insertion failed
        cluster.shutdown()

# Example usage
insert_product(1, 'Product A', 9.99)


2. Document-oriented NoSQL Databases:
   a. Given a MongoDB collection named "books", write a Python function that fetches all the books published in the last year and prints their titles and authors.
   b. Design a schema for a document-oriented NoSQL database to store customer information for an e-commerce platform. Write a Python program to insert a new customer document into the database and handle any necessary validations.


In [None]:
#a)

from pymongo import MongoClient
from datetime import datetime, timedelta

def fetch_recent_books():
    # Connect to MongoDB
    client = MongoClient('mongodb://localhost:27017/')
    database = client['mydatabase']  # Replace 'mydatabase' with your database name
    collection = database['books']  # Replace 'books' with your collection name

    # Calculate the date one year ago from today
    one_year_ago = datetime.now() - timedelta(days=365)

    # Query for books published in the last year
    query = {'publish_date': {'$gte': one_year_ago}}
    projection = {'title': 1, 'author': 1}
    books = collection.find(query, projection)

    # Print book titles and authors
    for book in books:
        print(f"Title: {book['title']}")
        print(f"Author: {book['author']}")
        print()

    # Close the connection
    client.close()

# Example usage
fetch_recent_books()
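A fixed 365-day window is off by a day when the period spans 29 February, and a naive `datetime.now()` is local time while MongoDB stores dates in UTC. The following is a minimal sketch of building the same filter from a timezone-aware UTC timestamp, assuming `publish_date` is stored as a BSON date. `build_recent_books_query` is a hypothetical helper introduced here for illustration, not part of PyMongo.

```python
from datetime import datetime, timezone


def build_recent_books_query(now=None):
    """Return a MongoDB filter for books published within the last calendar year."""
    if now is None:
        now = datetime.now(timezone.utc)
    try:
        cutoff = now.replace(year=now.year - 1)
    except ValueError:
        # 29 February has no counterpart in the previous year; fall back to 28 February.
        cutoff = now.replace(year=now.year - 1, day=28)
    return {'publish_date': {'$gte': cutoff}}


# Fixed timestamp so the result is reproducible
query = build_recent_books_query(datetime(2024, 2, 29, 12, 0, tzinfo=timezone.utc))
print(query['publish_date']['$gte'].isoformat())
```

The returned dictionary can be passed to `collection.find()` in place of the hand-built `query` above.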


In [None]:
# b)

{
    "_id": "unique_customer_id",
    "name": "customer_name",
    "email": "customer_email",
    "address": {
        "street": "street_address",
        "city": "city_name",
        "state": "state_name",
        "country": "country_name",
        "zip_code": "zip_code"
    },
    "phone": "phone_number",
    "orders": [
        {
            "order_id": "unique_order_id",
            "order_date": "order_date",
            "items": [
                {
                    "product_id": "unique_product_id",
                    "product_name": "product_name",
                    "quantity": "quantity",
                    "price": "price"
                },
                ...
            ],
            "total_amount": "order_total_amount"
        },
        ...
    ]
}


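The schema above can be paired with a validation step before insertion. The sketch below is one possible approach under stated assumptions: the required fields, the email pattern, and the names `validate_customer`, `insert_customer`, and `InMemoryCollection` are all hypothetical choices for illustration. `insert_customer` accepts any object with an `insert_one()` method, so a PyMongo collection such as `MongoClient(...)['mydatabase']['customers']` could be passed in place of the in-memory stand-in.

```python
import re

EMAIL_PATTERN = re.compile(r'^[^@\s]+@[^@\s]+\.[^@\s]+$')
REQUIRED_FIELDS = ('name', 'email', 'address', 'phone')
REQUIRED_ADDRESS_FIELDS = ('street', 'city', 'state', 'country', 'zip_code')


def validate_customer(customer):
    """Return a list of validation errors; an empty list means the document is valid."""
    errors = []
    for field in REQUIRED_FIELDS:
        if not customer.get(field):
            errors.append(f"missing field: {field}")
    email = customer.get('email')
    if email and not EMAIL_PATTERN.match(email):
        errors.append("invalid email address")
    address = customer.get('address')
    if isinstance(address, dict):
        for field in REQUIRED_ADDRESS_FIELDS:
            if not address.get(field):
                errors.append(f"missing address field: {field}")
    elif address:
        errors.append("address must be an embedded document")
    return errors


def insert_customer(collection, customer):
    """Validate the customer, then insert it; raise ValueError if validation fails."""
    errors = validate_customer(customer)
    if errors:
        raise ValueError("; ".join(errors))
    customer.setdefault('orders', [])  # New customers start with no orders
    return collection.insert_one(customer)


class InMemoryCollection:
    """Stand-in for a PyMongo collection so the sketch runs without a server."""

    def __init__(self):
        self.documents = []

    def insert_one(self, document):
        self.documents.append(document)
        return document


customers = InMemoryCollection()
insert_customer(customers, {
    'name': 'Alice Example',
    'email': 'alice@example.com',
    'address': {'street': '1 Main St', 'city': 'Springfield', 'state': 'IL',
                'country': 'USA', 'zip_code': '62701'},
    'phone': '555-0100',
})
print("Stored customers:", len(customers.documents))

try:
    insert_customer(customers, {'name': 'Bob', 'email': 'not-an-email'})
except ValueError as exc:
    rejected = str(exc)
    print("Rejected:", rejected)
```

Collecting every error before raising lets the caller report all problems at once instead of one per attempt.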

3. High Availability and Fault Tolerance:
   a. Explain the concept of replica sets in MongoDB. Write a Python program that connects to a MongoDB replica set and retrieves the status of the primary and secondary nodes.
   b. Describe how Cassandra ensures high availability and fault tolerance in a distributed database system. Write a Python program that connects to a Cassandra cluster and fetches the status of the nodes.


In [None]:
# a. In MongoDB, a replica set is a group of MongoDB servers that hold the same data set to provide high availability and data redundancy. One node acts as the primary and the others act as secondaries. The primary receives all write operations and records them in its oplog; the secondaries replicate the oplog asynchronously and apply the same changes. If the primary becomes unavailable, the remaining members elect a new primary.

# Here's a Python program that connects to a MongoDB replica set and retrieves the status of the primary and secondary nodes:

from pymongo import MongoClient

# Connect to the replica set
client = MongoClient('mongodb://localhost:27017,localhost:27018,localhost:27019/?replicaSet=myReplicaSet')

# Get the replica set status
status = client.admin.command('replSetGetStatus')

# Print the status of each node
for member in status['members']:
    if member['stateStr'] == 'PRIMARY':
        print(f"Primary Node: {member['name']}")
    elif member['stateStr'] == 'SECONDARY':
        print(f"Secondary Node: {member['name']}")

# Close the connection
client.close()
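Beyond listing which member is primary, the same status document can show how far each secondary trails the primary. The following is a minimal sketch that works on a dictionary shaped like the `replSetGetStatus` reply, using its `name`, `stateStr`, and `optimeDate` member fields. The `sample_status` values and the helper name `summarize_members` are hypothetical, so the example runs without a replica set.

```python
from datetime import datetime


def summarize_members(status):
    """Group members by state and compute each secondary's lag behind the primary."""
    primary = next((m for m in status['members'] if m['stateStr'] == 'PRIMARY'), None)
    summary = {
        'primary': primary['name'] if primary else None,
        'secondaries': {},  # member name -> lag in seconds (None if no primary)
        'other': [],        # (member name, state) for every other state
    }
    for member in status['members']:
        if member['stateStr'] == 'SECONDARY':
            lag = None
            if primary is not None:
                lag = (primary['optimeDate'] - member['optimeDate']).total_seconds()
            summary['secondaries'][member['name']] = lag
        elif member['stateStr'] != 'PRIMARY':
            summary['other'].append((member['name'], member['stateStr']))
    return summary


# Hypothetical status document, shaped like the replSetGetStatus reply
sample_status = {
    'members': [
        {'name': 'localhost:27017', 'stateStr': 'PRIMARY',
         'optimeDate': datetime(2024, 1, 1, 12, 0, 10)},
        {'name': 'localhost:27018', 'stateStr': 'SECONDARY',
         'optimeDate': datetime(2024, 1, 1, 12, 0, 8)},
        {'name': 'localhost:27019', 'stateStr': 'RECOVERING',
         'optimeDate': datetime(2024, 1, 1, 11, 0, 0)},
    ]
}

summary = summarize_members(sample_status)
print(summary)
```

With a live replica set, the `status` value returned by `client.admin.command('replSetGetStatus')` could be passed to `summarize_members` directly.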


In [None]:
# b. Cassandra ensures high availability and fault tolerance in a distributed database system through the following mechanisms:

# Partitioning and Replication: Cassandra partitions data across multiple nodes using a distributed hash-based partitioner. Each partition is replicated to multiple nodes to ensure data redundancy and fault tolerance.

# Peer-to-Peer Architecture: Cassandra follows a peer-to-peer architecture, where all nodes are equal and communicate with each other without a central coordinator. This decentralized approach enables fault tolerance and scalability.

# Data Replication Strategy: Each keyspace is created with a replication strategy and a replication factor. SimpleStrategy places replicas on consecutive nodes of the ring and is intended for a single data center. NetworkTopologyStrategy sets a replication factor per data center and tries to place replicas on different racks, so the loss of a rack or a data center does not remove every copy of the data.

# Gossip Protocol: Cassandra uses a gossip protocol to disseminate information about the cluster's status and health among nodes. Nodes exchange information about the availability and status of other nodes, allowing them to detect failures and maintain a consistent view of the cluster.

# Here's an example Python program that connects to a Cassandra cluster and fetches the status of the nodes using the nodetool utility:


import subprocess

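def fetch_node_status():
    # Execute nodetool status command (nodetool must be on the PATH of this machine)
    output = subprocess.check_output(['nodetool', 'status']).decode('utf-8')

    # Node lines start with a two-letter code: U/D (up/down) followed by
    # N/L/J/M (normal/leaving/joining/moving); the address is the second column
    for line in output.splitlines():
        parts = line.split()
        if len(parts) >= 2 and len(parts[0]) == 2 and parts[0][0] in 'UD' and parts[0][1] in 'NLJM':
            status, address = parts[0], parts[1]
            print(f"Node: {address}, Status: {status}")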

# Example usage
fetch_node_status()
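The partitioning and replication mechanisms described above can be illustrated without a cluster. The sketch below is a simplified model, not Cassandra's implementation: it assumes one token per node (no virtual nodes), uses MD5 as a stand-in for the Murmur3 partitioner, and places replicas on consecutive nodes of the ring in the manner of SimpleStrategy. The node names and the `Ring` class are hypothetical.

```python
import bisect
import hashlib


def token_for(value):
    """Hash a string to a ring position (MD5 is an assumption for illustration)."""
    return int(hashlib.md5(value.encode('utf-8')).hexdigest(), 16)


class Ring:
    """One token per node; replicas live on the next nodes clockwise."""

    def __init__(self, nodes, replication_factor):
        self.entries = sorted((token_for(node), node) for node in nodes)
        self.tokens = [token for token, _ in self.entries]
        self.replication_factor = min(replication_factor, len(nodes))

    def replicas_for(self, partition_key):
        # The first node whose token is >= the key's token owns the key;
        # wrap around to the start of the ring if there is none.
        start = bisect.bisect_left(self.tokens, token_for(partition_key)) % len(self.entries)
        return [self.entries[(start + i) % len(self.entries)][1]
                for i in range(self.replication_factor)]


ring = Ring(['node1', 'node2', 'node3', 'node4'], replication_factor=3)
replicas = ring.replicas_for('product:1')
quorum = ring.replication_factor // 2 + 1

# Simulate the first replica failing: the remaining replicas still form a quorum.
live_replicas = replicas[1:]
print(f"Replicas: {replicas}")
print(f"Live after one failure: {live_replicas} (quorum needs {quorum})")
print(f"QUORUM still reachable: {len(live_replicas) >= quorum}")
```

With a replication factor of 3, a QUORUM read or write needs 2 replicas, which is why the loss of a single replica does not make the data unavailable.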


4. Sharding in MongoDB:
   a. Explain the concept of sharding in MongoDB and how it improves performance and scalability. Write a Python program that sets up sharding for a MongoDB cluster and inserts multiple documents into a sharded collection.
   b. Design a sharding strategy for a social media application where user data needs to be distributed across multiple shards. Write a Python program to demonstrate how data is distributed and retrieved from the sharded cluster.


In [None]:
# a. Sharding in MongoDB is a technique used to horizontally partition data across multiple servers called shards. Each shard contains a subset of the data, and collectively they form a sharded cluster. Sharding improves performance and scalability by distributing the data and workload across multiple servers, allowing for increased storage capacity, throughput, and query performance.

# When a sharded collection is created, MongoDB automatically partitions the data based on a shard key. The shard key is a field or combination of fields in the documents. MongoDB uses the shard key to determine which shard should store the document. The sharded cluster consists of the following components:

# 1. **Mongos**: The mongos acts as a router and provides the interface for client applications to interact with the sharded cluster. It receives queries from clients, determines the target shards based on the shard key, and forwards the queries to the appropriate shards.

# 2. **Config servers**: The config servers store the metadata and configuration for the sharded cluster. They maintain the mapping between the shard key ranges and the corresponding shards.

# 3. **Shards**: Each shard stores a portion of the sharded data. In production, every shard is deployed as its own replica set to provide high availability and data redundancy. The data is distributed across the shards based on the shard key.

# Here's a Python program that sets up sharding for a MongoDB cluster and inserts multiple documents into a sharded collection:

from pymongo import MongoClient

# Connect to the mongos router; sharding commands are issued through mongos, not directly against the config servers
client = MongoClient('mongodb://mongosServer1:27017,mongosServer2:27017')

# Enable sharding for a database
client.admin.command('enableSharding', 'mydatabase')  # Replace 'mydatabase' with your database name

# Shard a collection based on a shard key
client.admin.command('shardCollection', 'mydatabase.collection', key={'shardKeyField': 1})  # Replace 'collection' and 'shardKeyField' as needed

# Insert multiple documents into the sharded collection
documents = [
    {'shardKeyField': 1, 'field1': 'value1'},
    {'shardKeyField': 2, 'field1': 'value2'},
    # Add more documents as needed
]
client.mydatabase.collection.insert_many(documents)  # Replace 'collection' with your sharded collection name

# Close the connection
client.close()
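The routing role of mongos and the chunk metadata held by the config servers can be sketched in plain Python. This is a simplified model under stated assumptions, not MongoDB's implementation: the chunk boundaries and shard names in `CHUNKS` are hypothetical, and each chunk covers a range with an inclusive lower bound and an exclusive upper bound.

```python
import bisect

# Hypothetical chunk table: (exclusive upper bound of the shard key, shard name)
CHUNKS = [
    (100, 'shardA'),           # shardKeyField < 100
    (200, 'shardB'),           # 100 <= shardKeyField < 200
    (float('inf'), 'shardC'),  # shardKeyField >= 200
]
UPPER_BOUNDS = [upper for upper, _ in CHUNKS]


def route(shard_key_value):
    """Return the shard whose chunk range contains the shard key value."""
    index = bisect.bisect_right(UPPER_BOUNDS, shard_key_value)
    return CHUNKS[index][1]


def route_query(query):
    """Target one shard when the filter pins the shard key, otherwise broadcast."""
    value = query.get('shardKeyField')
    if value is not None and not isinstance(value, dict):
        return [route(value)]
    return sorted({shard for _, shard in CHUNKS})


print(route_query({'shardKeyField': 1}))     # targeted query
print(route_query({'shardKeyField': 150}))   # targeted query
print(route_query({'field1': 'value1'}))     # no shard key: sent to every shard
```

The last call shows why the choice of shard key matters: a filter that omits it has to be sent to every shard.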



# b. Designing a sharding strategy for a social media application depends on various factors such as the expected data volume, access patterns, and query requirements. One possible sharding strategy for a social media application is to shard the data based on the user's unique identifier or username. This approach ensures that user data is distributed across multiple shards while still allowing efficient retrieval of user-related information.

# Here's a Python program to demonstrate how data is distributed and retrieved from a sharded cluster based on the user's unique identifier:

from pymongo import MongoClient

# Connect to the mongos router
client = MongoClient('mongodb://mongosServer1:27017,mongosServer2:27017')

# Fetch user data based on the user's unique identifier
def fetch_user_data(user_id):
    collection = client.mydatabase.users  # Replace 'users' with your sharded collection name

    query = {'_id': user_id}
    user_data = collection.find_one(query)

    return user_data

# Insert or update a user document in the sharded collection
def insert_user(user_id, username, data):
    collection = client.mydatabase.users  # Replace 'users' with your sharded collection name

    document = {'_id': user_id, 'username': username, 'data': data}
    # Upsert so the demonstration can be rerun without a duplicate-key error
    collection.replace_one({'_id': user_id}, document, upsert=True)

# Demonstrate storing and fetching user data through the mongos router
user_id = '12345'
insert_user(user_id, 'johndoe', {'bio': 'Example user'})  # Sample values
user_data = fetch_user_data(user_id)
if user_data:
    print(f"User: {user_data['username']}")
    print(f"Data: {user_data['data']}")
else:
    print("User not found.")

# Close the connection
client.close()
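How a user identifier spreads documents across shards can be illustrated without a cluster. The sketch below is a simplified stand-in for a hashed shard key: it hashes each identifier and takes the result modulo the number of shards, whereas MongoDB assigns ranges of hash values to chunks. The shard names and the `shard_for` helper are hypothetical.

```python
import hashlib
from collections import Counter

SHARDS = ['shard0', 'shard1', 'shard2']  # Hypothetical shard names


def shard_for(user_id):
    """Map a user id to a shard by hashing it (a simplification of hashed sharding)."""
    digest = hashlib.md5(str(user_id).encode('utf-8')).digest()
    return SHARDS[int.from_bytes(digest[:8], 'big') % len(SHARDS)]


# Sequential ids, which a ranged shard key would route to the same chunk
user_ids = [f"user{n:05d}" for n in range(3000)]
distribution = Counter(shard_for(uid) for uid in user_ids)

for shard in SHARDS:
    print(f"{shard}: {distribution[shard]} users")

# Retrieval recomputes the same hash, so a lookup by id targets a single shard
print("user00042 lives on", shard_for('user00042'))
```

Hashing trades away efficient range queries on the identifier in exchange for an even spread of sequential ids.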


5. Indexing in MongoDB:
   a. Describe the concept of indexing in MongoDB and its importance in query optimization. Write a Python program that creates an index on a specific field in a MongoDB collection and executes a query using that index.
   b. Given a MongoDB collection named "products", write a Python function that searches for products with a specific keyword in the name or description. Optimize the query by adding appropriate indexes.


In [None]:
# a. Indexing in MongoDB is a technique used to improve query performance by creating indexes on specific fields in a collection. An index is a data structure that allows for efficient data retrieval based on the indexed fields. MongoDB uses B-tree indexes by default, which provide fast access to data in sorted order.

# Indexes in MongoDB are essential for query optimization as they can significantly reduce the number of documents that need to be scanned to satisfy a query. By creating indexes on frequently queried fields, MongoDB can quickly locate the relevant documents, leading to improved query performance.

# Here's a Python program that creates an index on a specific field in a MongoDB collection and executes a query using that index:


from pymongo import MongoClient

# Connect to MongoDB
client = MongoClient('mongodb://localhost:27017/')
database = client['mydatabase']  # Replace 'mydatabase' with your database name
collection = database['mycollection']  # Replace 'mycollection' with your collection name

# Create an index on the 'name' field
collection.create_index('name')

# Execute a query using the index
query = {'name': 'example'}
result = collection.find(query)

# Print the query results
for document in result:
    print(document)

# Close the connection
client.close()
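The effect of an index on the number of documents examined can be shown with a small in-memory model. This sketch is an illustration only, not how MongoDB stores its indexes: a sorted list searched with `bisect` stands in for the B-tree, and the generated documents and function names are hypothetical.

```python
import bisect

documents = [{'_id': n, 'name': f"item{n:04d}"} for n in range(1000)]


def collection_scan(name):
    """Examine every document, as a query without a usable index must."""
    examined = 0
    matches = []
    for doc in documents:
        examined += 1
        if doc['name'] == name:
            matches.append(doc)
    return matches, examined


# The "index": (name, position) pairs kept in sorted order, a stand-in for a B-tree
name_index = sorted((doc['name'], pos) for pos, doc in enumerate(documents))
index_keys = [key for key, _ in name_index]


def index_scan(name):
    """Binary-search the sorted keys, then fetch only the matching documents."""
    matches = []
    i = bisect.bisect_left(index_keys, name)
    while i < len(index_keys) and index_keys[i] == name:
        matches.append(documents[name_index[i][1]])
        i += 1
    return matches, len(matches)


scan_matches, scan_examined = collection_scan('item0500')
index_matches, index_examined = index_scan('item0500')
print(f"Collection scan examined {scan_examined} documents")
print(f"Index scan examined {index_examined} documents")
```

Against a real collection, calling `explain()` on the cursor returned by `find()` reports which plan the server chose, which is one way to check whether the query used the index.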


# b. Here's a Python function that searches for products with a specific keyword in the name or description in a MongoDB collection named "products". The function optimizes the query by adding appropriate indexes:


from pymongo import MongoClient, TEXT

def search_products(keyword):
    # Connect to MongoDB
    client = MongoClient('mongodb://localhost:27017/')
    database = client['mydatabase']  # Replace 'mydatabase' with your database name
    collection = database['products']  # Replace 'products' with your collection name

    # MongoDB allows only one text index per collection, so cover both fields
    # with a single compound text index (a no-op if the same index already exists)
    collection.create_index([('name', TEXT), ('description', TEXT)])

    # Execute the search query
    query = {'$text': {'$search': keyword}}
    result = collection.find(query)

    # Print the query results
    for document in result:
        print(document)

    # Close the connection
    client.close()

# Example usage
search_products('keyword')