# MongoDB : Theoretical Questions

1. What are the key differences between SQL and NoSQL databases ?

- Data Model :
  - SQL Databases are relational databases.
   - Structured data stored in tables with predefined schemas.
   - Relationships between data are defined using foreign keys.
   - Examples: MySQL, PostgreSQL, Oracle, SQL Server.

  - NoSQL Databases are non-relational.
   - Flexible data models: document, key-value, graph, or column-family.
   - Schema-less or dynamic schema.
   - Examples: MongoDB, Redis, Cassandra, Neo4j.

- Scalability & Schema Flexibility :
  - SQL Databases typically scale vertically.
   - Requires a fixed schema.
   - Changes to schema can be complex.

  - NoSQL Databases are designed to scale horizontally.
   - Schema-less or flexible schema.
   - Easier to evolve data models over time.

- Query Language :
  - SQL Databases use Structured Query Language for defining and manipulating data.
  - NoSQL Databases don't have a single, standardized query language like SQL. Query languages vary by database type.

- Performance & Use Cases :
  - SQL
   - Strong consistency and ACID compliance.
   - Best for complex queries and transactions.
   - Ideal for financial systems, ERP, CRM.

  - NoSQL
   - High performance for large volumes of unstructured or semi-structured data.
   - Many systems default to eventual or tunable consistency rather than strict ACID guarantees.
   - Great for real-time analytics, IoT, content management.

2. What makes MongoDB a good choice for modern applications?

 - MongoDB is an excellent choice for modern applications due to its flexible schema, horizontal scalability, high performance, and developer-friendliness.
  1. Flexible Document Model
    - Stores data as JSON-like documents, allowing nested structures.
    - No need for a fixed schema—perfect for evolving applications.
    - Ideal for handling unstructured or semi-structured data.

  2. Horizontal Scalability
    - Uses sharding to distribute data across multiple servers.
    - Lets large datasets and high traffic be spread across shards so that no single server becomes the bottleneck.
    - Great for applications expecting rapid growth or variable workloads.

  3. High Performance
    - Optimized for read/write-heavy workloads.
    - Uses indexes and an in-memory cache of frequently accessed data for fast access.
    - Suitable for real-time analytics, dashboards, and IoT systems.

  4. Developer-Friendly
    - Native support for many programming languages.
    - Integrates easily with modern frameworks.
    - JSON-like format maps naturally to objects in code.

  5. Real-Time Data Handling
    - Powerful aggregation framework for processing large datasets.
    - Supports geospatial queries and complex data relationships.

  6. High Availability & Fault Tolerance
    - Built-in replication and workload distribution.
    - Keeps data accessible when a single node fails, through automatic failover.

3. Explain the concept of collections in MongoDB ?

 - In MongoDB, a collection is a grouping of documents, similar to a table in a relational database but with much more flexibility.

    - Schema-less Nature: This is the most significant feature. A collection in MongoDB can hold documents with different fields and structures. For instance, a products collection could contain one document for a book with fields like title, author, and isbn, and another for a t-shirt with fields like brand, size, and color.  This flexibility makes it easy to store and manage diverse data types without having to alter the collection's structure.
    - Dynamic Creation: We don't need to explicitly create a collection before using it. When we insert the first document into a collection, MongoDB automatically creates the collection for us.
    - Documents and BSON: Collections store BSON documents, which are a binary representation of JSON. BSON supports more data types than JSON, such as Date and BinData, and is designed for efficient data traversal and manipulation.
    - Indexing: Just like tables in a relational database, collections can have indexes. These indexes allow MongoDB to quickly locate and retrieve documents, significantly improving query performance.
    - Primary Key: Each document in a collection has a unique primary key, _id. If we don't supply one, an ObjectId is generated automatically on insert. We can also provide our own custom _id value.
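
As a minimal sketch of the flexible-schema point above, the following shows two differently shaped documents destined for the same collection. The field values are illustrative assumptions, and `store_products` is a hypothetical helper that expects a PyMongo `Collection` object; nothing contacts a server until it is called.

```python
# Two documents with different fields that can live in the same collection.
book = {"type": "book", "title": "Example Book", "author": "A. Author", "isbn": "000-0-00-000000-0"}
tshirt = {"type": "t-shirt", "brand": "Acme", "size": "M", "color": "blue"}


def store_products(collection):
    """Insert both documents; `collection` is assumed to be a PyMongo Collection.

    MongoDB creates the collection on first insert, and each document
    without an `_id` receives one automatically.
    """
    result = collection.insert_many([book, tshirt])
    return result.inserted_ids


# Fields that the two documents do not share.
print(sorted(set(book) ^ set(tshirt)))
```

With a real connection, `store_products(db["products"])` would create `products` if it did not already exist.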

4.  How does MongoDB ensure high availability using replication?

  - MongoDB ensures high availability through a powerful feature called replication, specifically using replica sets. This architecture provides redundancy and an automatic failover mechanism, ensuring the database remains operational even if one server fails.
 - A replica set is a collection of MongoDB servers, each running a mongod process, that work together to maintain a consistent dataset.
     - Primary Node: Handles all write operations.
     - Secondary Nodes: Replicate data from the primary and can handle read operations.
     - Arbiter Node: Participates in elections but doesn't store data.

  - Replication Ensures High Availability :
     1. Automatic Failover
     - If the primary node fails, the replica set automatically elects a new primary from the secondaries.
     - This ensures that the database remains operational without manual intervention.
     2. Data Redundancy
     - All changes made to the primary are recorded in an oplog.
     - Secondary nodes continuously replicate this log to stay in sync.
     3. Disaster Recovery
     - Multiple copies of data across different servers protect against hardware failures or data corruption.
     4. Read Scaling
     - Secondary nodes can handle read queries, distributing the load and improving performance.
     5. No Downtime for Maintenance
     - We can perform backups or maintenance on secondary nodes without affecting the availability of the database.
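
From the application side, a replica set is usually reached through a connection string that lists several members and names the set. The sketch below builds such a string; the host names and the set name `rs0` are assumptions for illustration, and `replica_set_uri` is a hypothetical helper.

```python
from urllib.parse import urlencode


def replica_set_uri(hosts, replica_set, read_preference="primary"):
    """Build a MongoDB connection string that lists the replica set members."""
    options = urlencode({"replicaSet": replica_set, "readPreference": read_preference})
    return "mongodb://" + ",".join(hosts) + "/?" + options


uri = replica_set_uri(
    ["db1.example.net:27017", "db2.example.net:27017", "db3.example.net:27017"],
    replica_set="rs0",
    read_preference="secondaryPreferred",  # send reads to secondaries when available
)
print(uri)
```

Passing this string to `MongoClient(uri)` lets the driver discover the current primary and route writes to the newly elected primary after a failover.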

5. What are the main benefits of MongoDB Atlas ?

 - MongoDB Atlas is a fully managed cloud database service designed to simplify and supercharge how developers build and scale applications.

  1.  Simplicity and Automation : MongoDB Atlas simplifies database management, allowing us to focus on building applications rather than on administrative tasks.

    - Easy Deployment: We can provision a new MongoDB cluster in just a few clicks across major cloud providers like AWS, Google Cloud, and Microsoft Azure.
    - Automated Operations: Atlas handles critical tasks such as backups, patching, and upgrades automatically, which reduces manual effort and potential human error.
    - Built-in Monitoring: It provides real-time performance monitoring and an easy-to-use dashboard to help us identify and resolve issues quickly.

  2. Scalability and Performance : Atlas is designed to scale with our application's needs, ensuring high performance and availability.

    - Horizontal Scaling: It supports sharding, which automatically distributes our data across multiple servers to handle increasing data volumes and traffic.
    - High Availability: Every cluster is a replica set with automatic failover, meaning our database remains online even if a primary server fails.
    - Global Clusters: We can deploy clusters across multiple regions and clouds to reduce latency for global users and meet data residency requirements.

  3. Robust Security : Atlas provides enterprise-grade security features out of the box, protecting our data with minimal configuration.

    - Encryption: Data is encrypted both in transit and at rest, ensuring it's secure from unauthorized access.
    - Network Isolation: It allows us to restrict database access to specific IP addresses or connect through private networks, which provides an extra layer of security.
    - Authentication and Access Control: We can manage user roles and permissions with fine-grained control, ensuring that only authorized users can access specific data.

6.  What is the role of indexes in MongoDB, and how do they improve performance?

 - Indexes in MongoDB play a crucial role in improving query performance by allowing the database to quickly locate and retrieve documents without scanning the entire collection.
 - Indexes Improve Performance :
  1. Faster Query Execution
     - Indexes reduce the number of documents MongoDB needs to examine.
     - Queries can jump directly to relevant data instead of scanning everything.
  2. Efficient Sorting
     - Indexes maintain order, allowing MongoDB to return sorted results quickly.
     - Especially useful for queries with sort() operations.
  3. Optimized Range Queries
     - Indexes allow MongoDB to skip irrelevant data in range-based queries.
  4. Covered Queries
     - If all fields in the query filter and in the returned projection are part of the same index, MongoDB can return results from the index alone without accessing the actual documents, improving speed and reducing I/O.
  5. Uniqueness Enforcement
     - Unique indexes ensure no duplicate values, maintaining data integrity.
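
The index types above could be created from PyMongo roughly as follows. This is a sketch under assumptions: the field names (`email`, `age`, `created_at`) are invented, and `ensure_indexes` and `covered_query` are hypothetical helpers that expect a PyMongo `Collection`.

```python
# Index definitions as (keys, options) pairs; 1 = ascending, -1 = descending.
INDEXES = [
    ([("email", 1)], {"unique": True}),      # uniqueness enforcement
    ([("age", 1), ("created_at", -1)], {}),  # range filter on age, sorted by date
]


def ensure_indexes(collection):
    """Create each index; `collection` is assumed to be a PyMongo Collection."""
    return [collection.create_index(keys, **options) for keys, options in INDEXES]


def covered_query(collection):
    """Filter and projection use only `email`, so the email index can cover the query."""
    return collection.find({"email": "alice@example.com"}, {"email": 1, "_id": 0})
```

Excluding `_id` in the projection matters for the covered query, because `_id` is not part of the `email` index.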

7. Describe the stages of the MongoDB aggregation pipeline ?

 - The MongoDB aggregation pipeline is a powerful framework that processes data through a sequence of stages, each transforming the documents in some way.
 - Core Aggregation Pipeline Stages
    - $match : Filters documents based on specified criteria.
    - $group : Groups documents by a field and performs aggregations.
    - $project : Reshapes documents by including, excluding, or computing fields.
    - $sort : Sorts documents by specified fields.
    - $limit : Restricts the number of documents passed to the next stage.
    - $skip : Skips a specified number of documents.
    - $count : Returns a count of documents at that stage.
    - $addFields : Adds new fields or modifies existing ones.
    - $set : Alias for $addFields; overwrites existing fields if names match.
    - $unwind : Deconstructs arrays into multiple documents.
    - $lookup : Performs a left outer join with another collection.
    - $facet : Runs multiple pipelines in parallel on the same input.
    - $bucket : Categorizes documents into buckets based on boundaries.
    - $bucketAuto : Automatically creates buckets based on distribution.
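
In PyMongo a pipeline is simply a list of stage dictionaries, applied in order. The sketch below chains several of the stages listed above; the `orders` fields (`status`, `amount`, `customer_id`) are assumed for illustration, and `top_customers_pipeline` is a hypothetical helper.

```python
def top_customers_pipeline(min_amount, limit=5):
    """Build a pipeline that ranks customers by total completed order value."""
    return [
        # Filter early so later stages see fewer documents.
        {"$match": {"status": "completed", "amount": {"$gte": min_amount}}},
        # One output document per customer, with a running total and count.
        {"$group": {"_id": "$customer_id", "total": {"$sum": "$amount"}, "orders": {"$sum": 1}}},
        {"$sort": {"total": -1}},
        {"$limit": limit},
        # Rename _id to something more readable.
        {"$project": {"_id": 0, "customer_id": "$_id", "total": 1, "orders": 1}},
    ]


pipeline = top_customers_pipeline(min_amount=10)
stage_names = [next(iter(stage)) for stage in pipeline]
print(stage_names)
```

Given a PyMongo collection, `collection.aggregate(pipeline)` would run it and return a cursor over the results.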

8. What is sharding in MongoDB? How does it differ from replication ?

 - Sharding is MongoDB's method of horizontal scaling: it partitions a large dataset by a shard key and distributes the pieces across multiple servers called shards. Each shard holds only a portion of the data.
    - Shards: The actual servers that store a subset of the data. To ensure high availability, each shard is typically a replica set.
    - Query Routers (mongos): These act as a front-end to the cluster. Applications connect to a mongos instance, which routes queries to the appropriate shards.
    - Config Servers: These servers store the metadata for the cluster, including which data ranges are located on which shards.

 - Although both sharding and replication spread data across multiple servers, they serve fundamentally different purposes :
  1. Purpose
     - Sharding: Scalability. Distributes data to increase storage capacity and handle high write and read throughput.
     - Replication: High Availability & Data Redundancy. Creates copies of the same data to prevent data loss and ensure the database remains operational.
  2. Data Distribution
    - Sharding: Data is partitioned. Each shard holds only a subset of the entire dataset.
    - Replication: Data is duplicated. Each data-bearing member holds a full copy of the dataset.
  3. Scaling
    - Sharding: Scales writes and reads horizontally. By adding more shards, we increase the total capacity for both.
    - Replication: Scales reads horizontally by distributing queries across secondary nodes. Writes are still handled by a single primary node.
  4. Failure
    - Sharding: If an entire shard becomes unavailable, the data on that shard cannot be reached, but the rest of the cluster remains operational. Running each shard as a replica set guards against this.
    - Replication: If the primary node fails, a new primary is automatically elected from the secondary nodes, and all data remains available.
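
To make the sharding side concrete, the sketch below builds the two administrative commands typically used to shard a collection on a hashed key. The database, collection, and field names are assumptions, `sharding_commands` and `apply_sharding` are hypothetical helpers, and the commands only make sense when the client is connected to a mongos.

```python
def sharding_commands(database, collection, shard_key_field):
    """Return (command, value, extra fields) tuples for sharding one collection."""
    namespace = f"{database}.{collection}"
    return [
        ("enableSharding", database, {}),
        # A hashed shard key spreads documents evenly across shards.
        ("shardCollection", namespace, {"key": {shard_key_field: "hashed"}}),
    ]


def apply_sharding(client, database, collection, shard_key_field):
    """Run the commands; `client` is assumed to be a PyMongo MongoClient on a mongos."""
    for name, value, extra in sharding_commands(database, collection, shard_key_field):
        client.admin.command(name, value, **extra)
```

Replication, by contrast, needs no per-collection setup: every collection on a replica set is copied to all data-bearing members.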

9. What is PyMongo, and why is it used ?

 - PyMongo is the official Python driver for MongoDB, developed and maintained by MongoDB Inc. It provides a simple and intuitive interface for working with MongoDB directly from Python applications, including creating, reading, updating, and deleting documents.
     - Connection Management : Easily connect to local or cloud-hosted MongoDB instances.
     - CRUD Operations : Perform Create, Read, Update, and Delete operations on documents.
     - Aggregation Framework : Build powerful data pipelines for analytics.
     - Indexing & Query Optimization : Create indexes and optimize queries.
     - Geospatial Queries : Handle location-based data.
     - GridFS Support : Store and retrieve large files like images and videos.
     - Example :

           from pymongo import MongoClient

           # Connect to MongoDB
           client = MongoClient("mongodb://localhost:27017/")
           db = client["mydatabase"]
           collection = db["users"]

           # Insert a document
           collection.insert_one({"name": "Alice", "age": 30})

           # Query documents whose age is greater than 25
           for user in collection.find({"age": {"$gt": 25}}):
               print(user)


10. What are the ACID properties in the context of MongoDB transactions ?

 - The Atomicity, Consistency, Isolation, and Durability (ACID) properties ensure that database transactions are processed reliably.
 - ACID Properties in MongoDB Transactions
   1. Atomicity
     - All operations in a transaction are treated as a single unit.
     - Either all succeed, or none are applied.
     - Example: Transferring money between two accounts—if the debit fails, the credit won't happen either.

   2. Consistency
     - Transactions move the database from one valid state to another.
     - MongoDB enforces consistency through:
       - Schema validation
       - Unique indexes
       - Transaction-level checks
     - Example: If a document violates schema rules during a transaction, the entire transaction is aborted.

  3. Isolation
     - Transactions are isolated from each other.
     - Intermediate states are not visible to other operations.
     - Prevents dirty reads and ensures that concurrent transactions don't interfere.

  4. Durability
     - Once a transaction is committed, its changes are permanently saved, even in the event of a crash.
     - MongoDB writes to disk and uses journaling to ensure durability.
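
The money-transfer example under Atomicity could be sketched with PyMongo sessions as follows. The `bank.accounts` namespace and the `balance` field are assumptions, and `transfer_updates` and `transfer` are hypothetical helpers; transactions also require a replica set or sharded cluster rather than a standalone server.

```python
def transfer_updates(from_id, to_id, amount):
    """Return the (filter, update) pairs for debiting one account and crediting another."""
    # The debit filter only matches when the balance covers the amount.
    debit = ({"_id": from_id, "balance": {"$gte": amount}}, {"$inc": {"balance": -amount}})
    credit = ({"_id": to_id}, {"$inc": {"balance": amount}})
    return debit, credit


def transfer(client, from_id, to_id, amount):
    """Apply both updates atomically; `client` is assumed to be a PyMongo MongoClient."""
    accounts = client["bank"]["accounts"]
    debit, credit = transfer_updates(from_id, to_id, amount)
    with client.start_session() as session:
        # Leaving the inner block normally commits; an exception aborts the transaction.
        with session.start_transaction():
            result = accounts.update_one(*debit, session=session)
            if result.modified_count != 1:
                raise ValueError("insufficient funds or unknown account")
            accounts.update_one(*credit, session=session)
```

If the debit does not match, the raised exception aborts the transaction, so the credit is never applied.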

11.  What is the purpose of MongoDB’s explain() function?

 - The explain() function in MongoDB is a powerful diagnostic tool used to analyze and understand how queries are executed. It helps developers and database administrators optimize performance by revealing the internal workings of the query planner and execution engine.

 - The main purposes of MongoDB's explain() function :

   1. Query Optimization
      - Reveals how MongoDB plans to execute a query.
      - Shows whether indexes are used, and which ones.
      - Helps identify inefficient queries or missing indexes.

  2. Execution Statistics
      - Provides detailed metrics like number of documents scanned, time taken, and index usage.
      - Useful for performance tuning and debugging.

   3. Plan Selection Insight
      - Displays the winning plan chosen by MongoDB's query optimizer.
      - In allPlansExecution mode, shows stats for all candidate plans, not just the winner.
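
The output of explain() is a nested document, so a small helper can pull out the figures mentioned above. The sketch below assumes the classic layout (`queryPlanner.winningPlan` with nested `inputStage`, plus `executionStats`); the exact shape varies by server version, and the sample document is hand-written rather than real server output.

```python
def plan_stages(plan):
    """Collect stage names from a winning plan by following `inputStage` links."""
    stages = []
    while plan:
        stages.append(plan.get("stage"))
        plan = plan.get("inputStage")
    return stages


def summarize_explain(explain_output):
    """Reduce an explain document to the fields most useful for tuning."""
    stats = explain_output.get("executionStats", {})
    planner = explain_output.get("queryPlanner", {})
    return {
        "stages": plan_stages(planner.get("winningPlan")),
        "returned": stats.get("nReturned"),
        "keys_examined": stats.get("totalKeysExamined"),
        "docs_examined": stats.get("totalDocsExamined"),
        "millis": stats.get("executionTimeMillis"),
    }


# Hand-written sample for illustration only.
sample = {
    "queryPlanner": {
        "winningPlan": {"stage": "FETCH", "inputStage": {"stage": "IXSCAN", "indexName": "age_1"}}
    },
    "executionStats": {
        "nReturned": 3, "totalKeysExamined": 3, "totalDocsExamined": 3, "executionTimeMillis": 0
    },
}
summary = summarize_explain(sample)
print(summary)
```

A `COLLSCAN` stage, or a `docs_examined` figure far larger than `returned`, usually points to a missing or unused index.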

12.  How does MongoDB handle schema validation?

 - MongoDB handles schema validation using a flexible yet powerful mechanism that allows us to enforce rules on the structure and content of documents in a collection.
 - We define validation rules with the $jsonSchema operator, which is based on the JSON Schema standard, and attach them to a collection through the validator option when we create or modify the collection. The validator can check for various conditions, such as:
       - Field existence: Ensuring a document has all the required fields.
       - Data types: Verifying that a field's value is of a specific type.
       - Value constraints: Setting minimum or maximum values, or enforcing a specific list of allowed values.

 - MongoDB provides flexibility in how it enforces validation:

       - validationLevel: This option determines how strictly validation is applied.
         - strict : Validation is applied to all inserts and updates.
         - moderate : Validation is applied to inserts and to updates on existing valid documents. It skips validation on documents that were already invalid.
       - validationAction: This option specifies what happens when a document fails validation.
         - error : The write operation is rejected, and an error is returned.
         - warn : The write operation is permitted, but a warning message is logged in the MongoDB server logs.
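
Put together, a validator and its enforcement options might look like the sketch below. The `users` collection and its fields are assumptions for illustration, and `create_users_collection` is a hypothetical helper that expects a PyMongo `Database`.

```python
USER_VALIDATOR = {
    "$jsonSchema": {
        "bsonType": "object",
        "required": ["name", "email"],                          # field existence
        "properties": {
            "name": {"bsonType": "string"},                     # data type
            "email": {"bsonType": "string"},
            "age": {"bsonType": "int", "minimum": 0, "maximum": 150},  # value range
            "role": {"enum": ["admin", "member", "guest"]},     # allowed values
        },
    }
}


def create_users_collection(db):
    """Create `users` with validation; `db` is assumed to be a PyMongo Database."""
    return db.create_collection(
        "users",
        validator=USER_VALIDATOR,
        validationLevel="strict",   # check every insert and update
        validationAction="error",   # reject documents that fail
    )
```

Switching `validationAction` to `"warn"` is a common way to trial a new schema on existing data before enforcing it.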

13.  What is the difference between a primary and a secondary node in a replica set?

 - Primary Node
      - Role: Handles all write operations.
      - Oplog Source: Maintains an oplog that records all changes.
      - Replication: Secondary nodes replicate data from the primary's oplog.
      - Uniqueness: Only one primary exists at a time in a replica set.
      - Failover: If the primary fails, an election is triggered to promote a secondary to primary.
 - Secondary Node
      - Role: Replicates data from the primary and can handle read operations.
      - Read Preference: Can be used for reads to reduce load on the primary.
      - No Writes: Cannot accept writes unless promoted to primary.
      - Failover Capability: Eligible to become primary during automatic failover.

14. What security mechanisms does MongoDB provide for data protection ?

 - MongoDB provides a comprehensive set of security mechanisms to protect data across various deployment environments.

 1. Authentication
  - MongoDB supports multiple authentication methods to verify user identity:

      - SCRAM - Default and secure password-based authentication.
      - x.509 Certificates - Used for client and server authentication over TLS/SSL.
      - LDAP & Kerberos - Integration with enterprise identity systems for centralized user management.
      - OIDC/OAuth 2.0 - Available in MongoDB Atlas for federated identity management.

  2. Authorization
   - MongoDB uses Role-Based Access Control to define what authenticated users can do:

      - Create custom roles with granular permissions.
      - Limit access to specific databases, collections, or operations.
      - Enforce least privilege principles for enhanced security.

  3. Encryption
   - MongoDB offers encryption for both data in transit and data at rest:
      - In Transit - TLS/SSL encryption secures communication between clients and servers.
      - At Rest - Available in MongoDB Enterprise and Atlas; uses AES encryption via WiredTiger.
      - Field-Level - Client-side field-level encryption for sensitive fields like SSNs or credit cards.

  4. Auditing
   - MongoDB Enterprise includes an auditing framework to track:
       - User activity.
       - Access attempts.
       - Configuration changes.
  
  5. Network Security
       - IP Whitelisting - Restrict access to trusted IPs.
       - Private Endpoints & VPC Peering - Secure cloud deployments with isolated network paths.
       - Firewall Rules - Limit exposure by configuring firewalls and disabling unused ports.

  6. Additional Best Practices
       - Change default ports to reduce exposure to automated attacks.
       - Regularly rotate encryption keys.
       - Use strong passwords and enforce password policies.
       - Enable logging and monitor for suspicious activity.
       
15. Explain the concept of embedded documents and when they should be used ?

 - Embedded documents, also known as nested documents, are BSON documents that are stored inside another document. This allows us to represent a one-to-one or one-to-many relationship without using separate collections or performing joins, which is a key feature of MongoDB's flexible schema.
 - Embedded documents are best used when the nested data is closely related to the parent document and we'll typically access them together. This design choice can significantly improve performance and simplify our application logic.
     - We need to avoid joins: Retrieving related data from a single document is much faster than performing a join between two separate collections. This is a significant performance advantage in MongoDB, as it eliminates the need for multiple queries.
     - The relationship is a one-to-many relationship where the "many" side is relatively small and frequently accessed: For example, storing comments within a blog post document, or an address within a customer document.
     - The data is a logical part of the parent document: The embedded data has no meaning outside the context of the parent. For instance, an address embedded in a user document is a property of that user, not an independent entity.
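
The two embedding patterns mentioned above can be pictured as plain Python dictionaries, which is how PyMongo represents documents. All names and values here are illustrative assumptions.

```python
# One-to-one: the address is stored inside the customer document.
customer = {
    "_id": 1,
    "name": "Alice",
    "address": {"street": "1 Main St", "city": "Springfield"},
}

# One-to-many with a small "many" side: comments stored inside the post.
post = {
    "_id": 10,
    "title": "Hello",
    "comments": [
        {"author": "Bob", "text": "Nice post"},
        {"author": "Carol", "text": "Thanks"},
    ],
}

# Dot notation reaches into embedded documents and arrays in a query filter.
city_filter = {"address.city": "Springfield"}
commenter_filter = {"comments.author": "Bob"}
```

A single `find_one({"_id": 10})` on the posts collection would return the post together with its comments, with no second query.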

16. What is the purpose of MongoDB’s $lookup stage in aggregation ?

 - The $lookup stage in the MongoDB aggregation pipeline performs a left outer join from one collection to another within the same database. Its main purpose is to enrich documents in a source collection with data from a "foreign" or target collection, allowing us to combine related data without needing to perform multiple queries.

 - The $lookup stage operates by taking documents from the input collection and matching a specific field to a field in the target collection. It then adds a new array field to the output documents, which contains the matching documents from the foreign collection.

 - The syntax for $lookup typically includes these parameters:
     - from: The name of the foreign collection to join with.
     - localField: The field from the input documents.
     - foreignField: The field from the documents in the from collection.
     - as: The name of the new array field to add to the output documents.

 - We should use $lookup when we need to join related data on the fly within an aggregation pipeline. This is a powerful feature for:

     - Combining related data from different collections: For example, we can join an orders collection with a products collection to get the details of each product within an order.
     - Building a complete view of an entity: We could join a users collection with a posts collection to get a list of all posts written by each user.
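
The orders and products example could be written as the pipeline below. This is a sketch: the `product_id` field on orders is an assumption, and `orders_with_products_pipeline` is a hypothetical helper.

```python
def orders_with_products_pipeline():
    """Join each order to its product using the four $lookup parameters."""
    return [
        {"$lookup": {
            "from": "products",           # foreign collection
            "localField": "product_id",   # field on the order
            "foreignField": "_id",        # field on the product
            "as": "product",              # name of the new array field
        }},
        # $lookup always yields an array; $unwind flattens it to one embedded document.
        {"$unwind": "$product"},
    ]


pipeline = orders_with_products_pipeline()
```

Note that a plain `$unwind` drops orders whose `product` array is empty, which turns the left outer join into an inner join; the `preserveNullAndEmptyArrays` option keeps them.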

17. What are some common use cases for MongoDB ?

 - MongoDB is well-suited for a wide range of modern applications due to its flexible schema, scalability, and performance.
   1. E-Commerce Platforms
      - Why MongoDB? Flexible schema supports dynamic product catalogs, user profiles, and shopping carts.
      - Benefits: Fast search, personalized recommendations, and real-time inventory updates
  2. Mobile & Web Applications
      - Why MongoDB? JSON-like documents map naturally to app data structures.
      - Benefits: Rapid development, easy integration with frameworks like MERN/MEAN, and real-time sync.
  3. Real-Time Analytics & Operational Intelligence
      - Why MongoDB? Powerful aggregation framework and horizontal scalability.
      - Benefits: Live dashboards, fraud detection, and instant decision-making.
  4. Content Management Systems
      - Why MongoDB? Schema-less design accommodates varied content types.
      - Benefits: Easy updates, flexible metadata handling, and fast content delivery.
  5. Healthcare Systems
      - Why MongoDB? Handles complex, nested patient records and medical histories.
      - Benefits: Secure data storage, fast retrieval, and support for compliance standards.
   6. Internet of Things
      - Why MongoDB? Efficient storage of time-series and sensor data.
      - Benefits: Scalable ingestion, real-time monitoring, and predictive analytics.
   7. Big Data Applications
      - Why MongoDB? Sharding and flexible schema make it ideal for massive datasets.
      - Benefits: Distributed processing, fast queries, and support for unstructured data.
  8. Financial Services
      - Why MongoDB? ACID transactions and secure data handling.
      - Benefits: Customer segmentation, transaction tracking, and compliance reporting.

18. What are the advantages of using MongoDB for horizontal scaling?

 - MongoDB's horizontal scaling, primarily achieved through sharding, offers several compelling advantages, especially for applications that demand high performance, scalability, and resilience.
      - Cost-Effectiveness: Horizontal scaling uses commodity hardware, which is cheaper than the high-end, powerful servers required for vertical scaling. We can start with a small cluster and add more servers as our data and traffic grow, making it a more economical long-term solution.
      - High Performance and Throughput: Sharding distributes the database load across multiple servers. This means that read and write operations are processed in parallel, dramatically increasing the overall performance and throughput of the system. Each shard handles only a subset of the data, so queries are more efficient.
      - Increased Storage Capacity: As our dataset grows, we can simply add more shards to the cluster. This allows us to scale our storage capacity virtually without limits, making MongoDB ideal for big data applications and those with a high volume of data.
      - Improved Fault Tolerance: In a sharded cluster, the failure of a single shard does not bring down the entire database. The remaining shards can continue to operate, and the data from the failed shard can be recovered from its replica set. This provides a more resilient system compared to a single, monolithic database.
      - Simplified Management: MongoDB's sharding is designed to be relatively easy to manage. The mongos query router handles the complexity of routing queries to the correct shards, and the balancer automatically distributes data chunks to new shards, reducing the need for manual intervention.

19.  How do MongoDB transactions differ from SQL transactions?

 - MongoDB transactions differ from SQL transactions primarily in their granularity, scope, and implementation. While both ensure ACID properties, SQL transactions are designed for a relational model with joins and a fixed schema, whereas MongoDB transactions work with a flexible document model and were introduced later to handle multi-document consistency.
  1. Granularity and Data Model
      - SQL Transactions: SQL transactions are a fundamental part of the relational model. They typically involve multiple rows across multiple tables. Because SQL databases normalize data across different tables, a single logical operation often requires updates to multiple rows in different tables. SQL transactions are designed to handle this naturally.

      - MongoDB Transactions: Before version 4.0, MongoDB guaranteed atomicity only for operations on a single document. The document model encourages embedding related data, which often eliminates the need for multi-document transactions. Multi-document ACID transactions were introduced for replica sets in version 4.0 and extended to sharded clusters in version 4.2. However, the best practice in MongoDB is to design our data model to avoid multi-document transactions when possible, as they can add performance overhead.

  2. Scope and Implementation
      - SQL Transactions: SQL transactions are a native, long-standing feature of the language and database design. They are managed with simple commands like BEGIN TRANSACTION, COMMIT, and ROLLBACK. They are deeply integrated with the database's locking and concurrency control mechanisms to ensure isolation and consistency.
      - MongoDB Transactions: MongoDB's multi-document transactions are managed through a session. We start a session, begin a transaction within that session, perform our operations, and then commit or abort the transaction. This session-based approach is different from the traditional SQL model and reflects MongoDB's architecture, which prioritizes scalability and horizontal distribution.

  3. ACID Guarantees
      - Both SQL and MongoDB transactions guarantee ACID properties:
      - Atomicity: Both ensure that all operations within a transaction succeed, or none do.
      - Consistency: Both ensure that the database remains in a valid state after a transaction.
      - Isolation: Both use various mechanisms to ensure that concurrent transactions do not interfere with each other.
      - Durability: Both guarantee that once a transaction is committed, the changes are permanent and survive system failures.

20. What are the main differences between capped collections and regular collections ?

- Capped collections are fixed-size, circular collections that are designed for high-performance logging and data caching, while regular collections are dynamic in size and are the default type for general-purpose data storage.

 - Capped collections have a maximum size in bytes and, optionally, a maximum number of documents; whichever limit is reached first applies. When the collection reaches its limit, the oldest documents are automatically deleted to make room for new ones, which is why they are often referred to as "circular." This makes them ideal for use cases where we only need to store a finite amount of the most recent data.
     - Fixed Size: We must specify the maximum size in bytes at creation; a maximum document count is optional.
     - High Performance: Capped collections are optimized for fast insertion and retrieval in insertion order. Since documents are always added to the end and never moved, there's no need for disk-level updates, making them very efficient for logging.
     - Insertion Order: Documents are stored in the order they are inserted, and they cannot be updated if the update changes the document's size.
     - Automatic Deletion: They automatically remove the oldest documents to free up space for new ones.

 - Regular collections are the default and most common type of collection. They have no predefined size limit and are designed for general-purpose use where data persistence and flexibility are priorities.

      - Dynamic Size: They can grow and shrink as needed, without a predefined limit.
      - Flexibility: We can insert, update, and delete documents without restrictions on size or position.
      - Indexing: Regular collections support all types of indexes, which are essential for optimizing complex queries.
      - Manual Deletion: Data is not automatically removed; we must explicitly delete documents.
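
Creating a capped collection from PyMongo comes down to a few options on `create_collection`. In this sketch the collection name `app_logs` and the limits are assumptions, and both functions are hypothetical helpers.

```python
def capped_options(max_bytes, max_documents=None):
    """Build create_collection options for a capped collection."""
    options = {"capped": True, "size": max_bytes}   # size in bytes is required
    if max_documents is not None:
        options["max"] = max_documents              # document cap is optional
    return options


def create_log_collection(db):
    """Create a 1 MiB capped collection; `db` is assumed to be a PyMongo Database."""
    return db.create_collection("app_logs", **capped_options(1024 * 1024, max_documents=5000))
```

A regular collection needs none of this: it is created implicitly on first insert, with no size options.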

21.  What is the purpose of the $match stage in MongoDB’s aggregation pipeline ?

 - The $match stage in MongoDB's aggregation pipeline is used to filter the documents that enter the pipeline. Its purpose is to efficiently reduce the number of documents passed to subsequent stages, which significantly improves the performance of the entire aggregation operation.

      - Performance Improvement: By filtering documents early, we reduce the workload for all the following stages. This is especially critical for large collections, as it avoids unnecessary processing.
     - Efficient Index Usage: If the $match stage is the first in the pipeline, MongoDB's query optimizer can use an index to quickly find the matching documents. This transforms a potentially slow collection scan into a much faster index scan.
      - Reduced Memory Usage: Processing fewer documents means the aggregation pipeline uses less memory, which helps to stay under the 100 MB memory limit that applies to each pipeline stage unless allowDiskUse is enabled.

22.  How can you secure access to a MongoDB database ?

 - Securing access to a MongoDB database is essential to protect sensitive data and prevent unauthorized access. MongoDB offers a range of built-in features and best practices to help us lock down our deployment effectively.
 - Secure MongoDB :
  1. Enable Authentication
      - Prevents anonymous access to the database.
      - Use mechanisms like SCRAM-SHA-256, X.509 certificates, and LDAP or Kerberos for enterprise setups.
      - Create users with specific roles and privileges using Role-Based Access Control.

  2. Use Authorization
      - Define what authenticated users can do.
      - Assign roles like read, readWrite, dbAdmin, etc., to control access to collections and operations.

  3. Encrypt Data
      - In Transit: Enable TLS/SSL to protect data exchanged between clients and servers.
      - At Rest: Use MongoDB's built-in encryption or integrate with external key management systems.

  4. Restrict Network Access
      - Bind MongoDB to localhost or specific IPs.
      - Use firewalls or security groups to limit access.
      - In MongoDB Atlas, configure an IP Access List to allow only trusted sources.

  5. Change Default Port
      - MongoDB uses port 27017 by default, which is commonly targeted.
      - Modify the port in the mongod.conf file to reduce exposure to automated attacks.

  6. Enable Auditing and Monitoring
      - Track database activity and detect suspicious behavior.
      - Use MongoDB's auditing features or integrate with external monitoring tools.

  7. Regular Backups and Updates
      - Keep MongoDB updated with the latest security patches.
      - Implement automated backups and test recovery procedures regularly.
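
As one illustration of the authentication and authorization steps, the sketch below builds a `createUser` command that grants a single built-in role on a single database, following the least-privilege idea. The user name and database are assumptions, both functions are hypothetical helpers, and in practice the password would come from a secret store rather than source code.

```python
def create_user_command(username, password, database, role="readWrite"):
    """Return (command name, value, extra fields) for the createUser command."""
    return ("createUser", username, {"pwd": password, "roles": [{"role": role, "db": database}]})


def create_app_user(db, username, password):
    """Create the user on `db`, assumed to be a PyMongo Database opened by an admin user."""
    name, value, extra = create_user_command(username, password, db.name)
    return db.command(name, value, **extra)
```

The application would then connect with that user's credentials instead of an administrative account.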

23. What is MongoDB’s WiredTiger storage engine, and why is it important?

 - WiredTiger is MongoDB's default storage engine. It's the component of the database responsible for how data is managed, both in memory and on disk. WiredTiger is important because it introduced several key performance and efficiency improvements that made MongoDB a more robust and scalable database.
     - Document-Level Concurrency: Unlike the older MMAPv1 storage engine, which used collection-level locking, WiredTiger provides document-level concurrency control. This means that multiple clients can write to different documents within the same collection simultaneously. This significantly increases throughput and reduces contention, making MongoDB perform much better under high-write workloads.

      - Data Compression: WiredTiger supports compression for both collections and indexes. This reduces the disk space required to store our data and, by extension, the amount of I/O operations needed. Lower I/O helps to improve performance, especially when dealing with large datasets. The engine uses different compression algorithms, with Snappy being the default.

      - Efficient Memory Usage: WiredTiger uses an internal cache to manage frequently accessed data. It also works in conjunction with the operating system's file system cache to optimize data retrieval. This intelligent use of memory ensures that data is served from the fastest available source, improving read and write performance.

     - Durability and Recovery: WiredTiger uses checkpoints and a write-ahead log (the journal) to ensure data durability. It periodically writes a consistent snapshot of the data to disk as a checkpoint. If the database crashes, it can recover to the last good checkpoint and then use the journal to replay any changes that occurred after that point, so that journaled writes are not lost.
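
Compression can also be chosen per collection at creation time. The sketch below builds the WiredTiger option document for that; the `events` collection name and the choice of `zstd` are assumptions, and both functions are hypothetical helpers.

```python
def compression_options(block_compressor="zstd"):
    """Build create_collection options that select a WiredTiger block compressor."""
    return {
        "storageEngine": {
            "wiredTiger": {"configString": f"block_compressor={block_compressor}"}
        }
    }


def create_compressed_collection(db, name="events"):
    """Create a collection with the chosen compressor; `db` is assumed to be a PyMongo Database."""
    return db.create_collection(name, **compression_options())
```

Collections created without this option use the server-wide default compressor, which is Snappy unless configured otherwise.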
      










