1. What are the key differences between SQL and NoSQL databases?



**SQL Databases:** Relational, use fixed schemas with tables, enforce strong consistency through ACID transactions, support SQL for querying, and are traditionally scaled vertically (by adding resources to a single server). Best for structured data and complex joins.

**NoSQL Databases:** Non-relational, schema-less or flexible schemas, support various data models (e.g., document, key-value, wide-column, graph), horizontally scalable, and often relax strict consistency in favor of performance and scalability. Ideal for unstructured or semi-structured data and high-traffic applications.

2. What makes MongoDB a good choice for modern applications?



* Flexible schema (document-based) supports dynamic data structures.

* Scales horizontally via sharding for high traffic.

* Supports rich queries, aggregation, and indexing for performance.

* High availability through replication.

* Cloud-native with MongoDB Atlas for easy deployment and management.

* Handles large-scale, unstructured, or semi-structured data well.

3. Explain the concept of collections in MongoDB.



Collections are analogous to tables in SQL but store JSON-like BSON documents without a fixed schema. Documents within a collection can have different structures, allowing flexibility. Collections are stored in a database and can be queried, indexed, or aggregated.

4. How does MongoDB ensure high availability using replication?



MongoDB uses replica sets, a group of nodes (primary and secondary) that maintain copies of the data. The primary node handles writes, while secondaries replicate data asynchronously. If the primary fails, an election process promotes a secondary to primary, ensuring continuous availability.
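
The failover behaviour described above can be pictured with a toy model. The sketch below is plain Python, not MongoDB code: the `ToyReplicaSet` class, its method names, and the rule "elect the most up-to-date surviving node" are simplifying assumptions made for illustration (real elections also involve votes, priorities, and timeouts).

```python
class ToyReplicaSet:
    """Toy model of a replica set: one primary, the rest secondaries."""

    def __init__(self, node_names):
        # Each node keeps its own copy of the operation log.
        self.logs = {name: [] for name in node_names}
        self.primary = node_names[0]
        self.down = set()

    def write(self, op):
        # Only the primary accepts writes.
        self.logs[self.primary].append(op)

    def replicate(self, node):
        # A secondary catches up by copying the primary's log.
        if node not in self.down:
            self.logs[node] = list(self.logs[self.primary])

    def fail_primary(self):
        # Mark the primary as down and promote the most up-to-date survivor.
        self.down.add(self.primary)
        candidates = [n for n in self.logs if n not in self.down]
        self.primary = max(candidates, key=lambda n: len(self.logs[n]))
        return self.primary


rs = ToyReplicaSet(["node-a", "node-b", "node-c"])
rs.write({"_id": 1})
rs.replicate("node-b")   # node-b is caught up at this point
rs.write({"_id": 2})     # not yet replicated to any secondary
new_primary = rs.fail_primary()
print(new_primary)       # node-b
```

Note that the second write never reached a secondary before the failure. Because replication is asynchronous, a write acknowledged only by the primary can be lost in a failover, which is why a majority write concern is commonly used for important data.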

5. What are the main benefits of MongoDB Atlas?



* Fully managed cloud database service, reducing administrative overhead.
* Automated backups, scaling, and upgrades.
* Built-in security features like encryption and role-based access control.
* Global cluster support for low-latency access.
* Monitoring and performance optimization tools.

6. What is the role of indexes in MongoDB, and how do they improve performance?



Indexes store a subset of data (e.g., field values) in a structure optimized for fast queries. They reduce the need to scan entire collections, improving query performance for searches, sorts, and filters. Common types include single-field, compound, and text indexes.
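
To make the "avoid scanning the whole collection" point concrete, here is a minimal pure-Python sketch. It is an analogy only: the sorted list stands in for MongoDB's index structure, and the sample `users` documents and helper names are hypothetical.

```python
import bisect

# A small "collection": a list of documents (plain dicts).
users = [
    {"_id": 1, "name": "Asha", "age": 31},
    {"_id": 2, "name": "Ben", "age": 24},
    {"_id": 3, "name": "Chloe", "age": 45},
    {"_id": 4, "name": "Dev", "age": 24},
]


def collection_scan(docs, age):
    """Without an index: inspect every document."""
    return [d["_id"] for d in docs if d["age"] == age]


def build_index(docs, field):
    """Toy single-field index: sorted (value, position) pairs."""
    return sorted((d[field], pos) for pos, d in enumerate(docs))


def index_lookup(docs, index, value):
    """With an index: binary-search to the first match, then read neighbours."""
    i = bisect.bisect_left(index, (value, -1))
    matches = []
    while i < len(index) and index[i][0] == value:
        matches.append(docs[index[i][1]]["_id"])
        i += 1
    return matches


age_index = build_index(users, "age")
print(collection_scan(users, 24))          # [2, 4]
print(index_lookup(users, age_index, 24))  # [2, 4]
```

Both calls return the same documents, but the indexed lookup touches only the matching entries. With PyMongo, a real single-field index would be created with something like `collection.create_index([("age", 1)])`.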

7. Describe the stages of the MongoDB aggregation pipeline.



The aggregation pipeline processes documents through a series of stages, each transforming the data:

* $match: Filters documents.

* $group: Groups documents and applies accumulators (e.g., sum, count).

* $sort: Sorts documents.

* $project: Reshapes documents (select or compute fields).

* $limit / $skip: Limits or skips documents.

* $lookup: Joins data from another collection.

* $unwind: Expands arrays into separate documents.
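
As an illustration of how these stages chain together, the sketch below builds a pipeline as a plain Python list. The `orders` collection and its `status`, `customer_id`, and `amount` fields are assumptions invented for the example.

```python
# Hypothetical "orders" documents have the fields: status, customer_id, amount.
pipeline = [
    {"$match": {"status": "shipped"}},            # keep shipped orders only
    {"$group": {"_id": "$customer_id",            # one result per customer
                "total": {"$sum": "$amount"},
                "orders": {"$sum": 1}}},
    {"$sort": {"total": -1}},                     # biggest spenders first
    {"$limit": 5},                                # top five
    {"$project": {"_id": 0, "customer": "$_id", "total": 1, "orders": 1}},
]

# Each stage is a single-key dict; the key is the stage name.
stage_names = [next(iter(stage)) for stage in pipeline]
print(stage_names)

# With PyMongo, a list like this is passed to Collection.aggregate(), e.g.
#   results = list(db["orders"].aggregate(pipeline))
```

The order matters: each stage receives the output of the previous one, so filtering before grouping means the later stages handle fewer documents.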

8. What is sharding in MongoDB? How does it differ from replication?



* Sharding: Distributes data across multiple servers (shards) to handle large datasets and high throughput. Each shard holds a subset of data.

* Replication: Creates copies of data across nodes for redundancy and high availability. All nodes in a replica set hold the same data.

* Difference: Sharding focuses on scalability by partitioning data; replication focuses on availability and fault tolerance.
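
The contrast can be sketched in a few lines of Python. This is a toy model under stated assumptions: the shard names, the `shard_for` routing rule (hash modulo shard count), and the member names are all hypothetical, and MongoDB's actual hashed sharding assigns ranges of hash values to chunks rather than using a modulo.

```python
import hashlib

SHARDS = ["shard-0", "shard-1", "shard-2"]


def shard_for(shard_key_value, shards=SHARDS):
    """Pick a shard from a stable hash of the shard key (toy routing rule)."""
    digest = hashlib.sha256(str(shard_key_value).encode("utf-8")).hexdigest()
    return shards[int(digest, 16) % len(shards)]


def shard_documents(docs, key):
    """Sharding: each document is stored on exactly one shard."""
    placement = {name: [] for name in SHARDS}
    for doc in docs:
        placement[shard_for(doc[key])].append(doc)
    return placement


def replicate_documents(docs, members=("primary", "secondary-1", "secondary-2")):
    """Replication: every member holds a full copy of the data."""
    return {name: list(docs) for name in members}


docs = [{"user_id": i} for i in range(9)]
sharded = shard_documents(docs, "user_id")
replicated = replicate_documents(docs)

print(sum(len(v) for v in sharded.values()))   # 9: the data is split, not copied
print([len(v) for v in replicated.values()])   # [9, 9, 9]: three full copies
```

In production the two are combined: each shard is itself typically deployed as a replica set.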

9. What is PyMongo, and why is it used?



PyMongo is the official Python driver for MongoDB, enabling Python applications to interact with MongoDB databases. It’s used to perform CRUD operations, manage collections and indexes, and run queries and aggregations programmatically.
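
A minimal sketch of one create/read/update/delete cycle is shown below. The function accepts any PyMongo-style collection object; the `crud_round_trip` name, the sample document, and the `demo_db`/`people` names in the usage comment are assumptions for illustration.

```python
def crud_round_trip(collection):
    """Run one create/read/update/delete cycle on a PyMongo-style collection."""
    # Create: insert_one returns a result object carrying the new _id.
    inserted_id = collection.insert_one({"name": "Ada", "role": "engineer"}).inserted_id

    # Read: find_one returns the first matching document, or None.
    found = collection.find_one({"_id": inserted_id})

    # Update: $set changes only the listed fields.
    collection.update_one({"_id": inserted_id}, {"$set": {"role": "architect"}})
    updated = collection.find_one({"_id": inserted_id})

    # Delete: remove the document again so the demo leaves no trace.
    deleted = collection.delete_one({"_id": inserted_id}).deleted_count
    return found, updated, deleted


# Assumed usage against a local server (requires `pip install pymongo`):
#   from pymongo import MongoClient
#   client = MongoClient("mongodb://localhost:27017")
#   print(crud_round_trip(client["demo_db"]["people"]))
```

Keeping the database connection outside the function makes the logic easy to exercise against a test double as well as a real server.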

10. What are the ACID properties in the context of MongoDB transactions?





* Atomicity: Ensures all operations in a transaction complete successfully or are rolled back.

* Consistency: Transactions bring the database from one valid state to another, maintaining data integrity.

* Isolation: Transactions are executed independently, preventing partial changes from being visible.

* Durability: Committed transactions are permanently saved, even in case of a system failure.

* MongoDB supports multi-document ACID transactions on replica sets since version 4.0 and on sharded clusters since version 4.2. Operations on a single document have always been atomic.
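
As a rough sketch of what atomicity looks like in application code, the function below wraps two updates in one transaction using PyMongo's session API. The `transfer` function, the `bank` database, the `accounts` collection, and the `balance` field are hypothetical; running it for real assumes a replica set or sharded cluster, since standalone servers do not support transactions.

```python
def transfer(client, from_id, to_id, amount):
    """Move `amount` between two account documents in one transaction."""
    accounts = client["bank"]["accounts"]   # hypothetical database and collection
    with client.start_session() as session:
        # The block commits on normal exit and aborts if an exception escapes,
        # so either both updates are applied or neither is.
        with session.start_transaction():
            accounts.update_one(
                {"_id": from_id}, {"$inc": {"balance": -amount}}, session=session
            )
            accounts.update_one(
                {"_id": to_id}, {"$inc": {"balance": amount}}, session=session
            )


# Assumed usage (requires pymongo and a replica set):
#   from pymongo import MongoClient
#   transfer(MongoClient("mongodb://localhost:27017/?replicaSet=rs0"), "alice", "bob", 25)
```

Every operation that should be part of the transaction must receive the same `session` argument; an operation called without it runs outside the transaction.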

11. What is the purpose of MongoDB's explain() function?



The explain() function provides details about how MongoDB executes a query, including index usage, query plan, and execution statistics. It helps optimize queries by identifying performance bottlenecks.
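
One common use of explain output is checking whether a query fell back to a full collection scan. The helper below is a minimal sketch: the function names are hypothetical, and the two sample dictionaries are hand-written to mimic the general shape of explain output, whose exact fields vary between MongoDB versions.

```python
def plan_stages(node):
    """Collect every "stage" value found in a (nested) explain plan."""
    stages = []
    if isinstance(node, dict):
        if "stage" in node:
            stages.append(node["stage"])
        for value in node.values():
            stages.extend(plan_stages(value))
    elif isinstance(node, list):
        for item in node:
            stages.extend(plan_stages(item))
    return stages


def uses_collection_scan(explain_output):
    """True when the winning plan contains a full collection scan (COLLSCAN)."""
    winning_plan = explain_output["queryPlanner"]["winningPlan"]
    return "COLLSCAN" in plan_stages(winning_plan)


# Hand-written samples (assumed shape, not captured from a real server).
without_index = {"queryPlanner": {"winningPlan": {"stage": "COLLSCAN"}}}
with_index = {"queryPlanner": {"winningPlan": {
    "stage": "FETCH",
    "inputStage": {"stage": "IXSCAN", "indexName": "age_1"},
}}}

print(uses_collection_scan(without_index))  # True
print(uses_collection_scan(with_index))     # False

# With PyMongo, real output comes from a cursor, e.g.
#   collection.find({"age": 24}).explain()
```

A COLLSCAN stage on a frequently run query is usually a hint that an index on the filtered field is worth considering.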

12. How does MongoDB handle schema validation?



MongoDB supports schema validation using JSON Schema (the $jsonSchema operator). You can define validation rules (e.g., required fields, data types) for a collection using the validator option in createCollection or collMod. By default, documents that fail validation are rejected on insert or update; setting validationAction to "warn" logs the violation instead.
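
A validator is just a document, so it can be built as a Python dictionary. The sketch below assumes a hypothetical `users` collection with `name`, `email`, and `age` fields; the rules themselves are illustrative.

```python
# Hypothetical rules for a "users" collection.
user_validator = {
    "$jsonSchema": {
        "bsonType": "object",
        "required": ["name", "email"],
        "properties": {
            "name": {"bsonType": "string", "description": "must be a string"},
            "email": {"bsonType": "string", "pattern": "^.+@.+$"},
            "age": {"bsonType": "int", "minimum": 0},
        },
    }
}

print(sorted(user_validator["$jsonSchema"]["required"]))  # ['email', 'name']

# Assumed usage with PyMongo (needs a running server):
#   db.create_collection("users", validator=user_validator)
# or, for an existing collection:
#   db.command("collMod", "users", validator=user_validator)
```

Fields that are not listed under `properties` remain allowed, so the collection keeps its flexible schema for everything the rules do not mention.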

13. What is the difference between a primary and a secondary node in a replica set?



* Primary Node: Handles all write operations and serves read operations (by default). There’s only one primary per replica set.

* Secondary Node: Replicates data from the primary and can serve read operations. It can become primary during failover.

14. What security mechanisms does MongoDB provide for data protection?



* Authentication (e.g., SCRAM, LDAP, x.509 certificates).

* Role-based access control (RBAC) for user permissions.

* Encryption at rest (WiredTiger's native encryption, available in MongoDB Enterprise and Atlas) and in transit (TLS/SSL).

* Auditing for tracking database activity.

* IP access lists (allowlisting) and VPC peering in MongoDB Atlas.

15. Explain the concept of embedded documents and when they should be used.



Embedded documents are nested JSON-like structures within a document. They’re used for one-to-few relationships or when data is frequently accessed together (e.g., user profile with address). They reduce the need for joins but can increase document size.
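
The difference between embedding and referencing is easiest to see side by side. The documents below are made-up examples using plain Python dictionaries.

```python
# Embedded: the address lives inside the user document (one read fetches both).
user_embedded = {
    "_id": 1,
    "name": "Asha",
    "address": {"street": "12 Lake Rd", "city": "Pune"},
}

# Referenced: the address is a separate document linked by an id.
user_referenced = {"_id": 1, "name": "Asha", "address_id": 501}
addresses = {501: {"_id": 501, "street": "12 Lake Rd", "city": "Pune"}}

city_embedded = user_embedded["address"]["city"]
city_referenced = addresses[user_referenced["address_id"]]["city"]  # second lookup
print(city_embedded, city_referenced)  # Pune Pune

# In a MongoDB query, an embedded field is addressed with dot notation,
# e.g. the filter {"address.city": "Pune"}.
```

Embedding suits data that is bounded in size and read together with its parent. A single document cannot exceed 16 MB, so unbounded or independently accessed data is usually better referenced.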

16. What is the purpose of MongoDB's $lookup stage in aggregation?



The $lookup stage performs a left outer join, linking documents from one collection to documents in another based on a matching condition. It’s used to combine related data across collections.
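
The sketch below shows the shape of an equality `$lookup` stage and imitates its left-outer-join semantics in plain Python. The `orders` and `customers` collections, their fields, and the `emulate_lookup` helper are assumptions for illustration; the helper is not how MongoDB executes the stage.

```python
lookup_stage = {
    "$lookup": {
        "from": "customers",          # collection to join with
        "localField": "customer_id",  # field in the input (orders) documents
        "foreignField": "_id",        # field in the "customers" documents
        "as": "customer",             # name of the output array field
    }
}


def emulate_lookup(left_docs, right_docs, local_field, foreign_field, as_field):
    """Pure-Python imitation of an equality $lookup (left outer join)."""
    joined = []
    for doc in left_docs:
        matches = [r for r in right_docs if r.get(foreign_field) == doc.get(local_field)]
        joined.append({**doc, as_field: matches})  # unmatched docs keep an empty list
    return joined


orders = [{"_id": 1, "customer_id": 10}, {"_id": 2, "customer_id": 99}]
customers = [{"_id": 10, "name": "Asha"}]
spec = lookup_stage["$lookup"]
result = emulate_lookup(
    orders, customers, spec["localField"], spec["foreignField"], spec["as"]
)
print(result)
```

The order with no matching customer is still present in the result, with an empty `customer` array. That is what makes the join a left outer join rather than an inner join.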

17. What are some common use cases for MongoDB?



* E-commerce (product catalogs, user profiles).

* Real-time analytics (event logging, IoT).
* Content management (blogs, CMS).
* Social media platforms (user feeds, comments).
* Big data applications requiring horizontal scaling.

18. What are the advantages of using MongoDB for horizontal scaling?



* Sharding distributes data across servers, handling large datasets and high traffic.

* Flexible schema supports dynamic data growth.
* Automatic balancing ensures even data distribution.
* Easy integration with cloud platforms like MongoDB Atlas.

19. How do MongoDB transactions differ from SQL transactions?



* **MongoDB:** Multi-document transactions are supported on replica sets (since v4.0) and sharded clusters (since v4.2). Single-document operations are always atomic, so many workloads need no explicit transaction; multi-document transactions add performance overhead, especially when they span shards.

* **SQL:** Multi-row, multi-table transactions are a core feature, with mature support for complex joins and constraints such as foreign keys.

20. What are the main differences between capped collections and regular collections?



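* **Capped Collections:** Fixed-size, preserve insertion order, and automatically overwrite the oldest documents once full (FIFO). Ideal for logging or caching. They have an index on the _id field by default, and updates that change a document's size have historically been rejected.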

* **Regular Collections:** No size limit, support indexes, and allow flexible updates. Suitable for general-purpose data storage.
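
The overwrite-oldest behaviour of a capped collection resembles a bounded queue. The sketch below uses Python's `collections.deque` as an analogy; the `events` name and the sizes are made up.

```python
from collections import deque

# A deque with maxlen drops the oldest entry when full, like a capped collection.
# (A real capped collection is limited by bytes, optionally also by document count.)
recent_events = deque(maxlen=3)
for n in range(1, 6):
    recent_events.append({"event": n})

print([e["event"] for e in recent_events])  # [3, 4, 5]

# Assumed PyMongo usage for a real capped collection of about 1 MiB:
#   db.create_collection("events", capped=True, size=1024 * 1024)
```

Events 1 and 2 were discarded automatically once the limit was reached, with no explicit delete.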

21. What is the purpose of the $match stage in MongoDB's aggregation pipeline?



The $match stage filters documents based on specified criteria, similar to a find query. It’s used early in the pipeline to reduce the dataset for subsequent stages.

22. How can you secure access to a MongoDB database?



* Enable authentication and use strong credentials.

* Implement RBAC to limit user access.
* Use TLS/SSL for encrypted connections.
* Configure firewall rules or IP whitelisting.
* Enable encryption at rest.
* Regularly audit and monitor database activity.
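
Several of these points meet in the connection string. The sketch below builds a URI with escaped credentials, an authentication database, and TLS enabled. The `build_secure_uri` helper, the host, and the credentials are hypothetical; in practice credentials would come from the environment or a secrets manager rather than source code.

```python
from urllib.parse import quote_plus


def build_secure_uri(username, password, host, auth_db="admin"):
    """Build a MongoDB URI with escaped credentials and TLS turned on."""
    user = quote_plus(username)
    pwd = quote_plus(password)  # escapes characters such as "@" and ":"
    return f"mongodb://{user}:{pwd}@{host}/?authSource={auth_db}&tls=true"


uri = build_secure_uri("app_user", "p@ss:word", "db.example.com:27017")
print(uri)
# mongodb://app_user:p%40ss%3Aword@db.example.com:27017/?authSource=admin&tls=true
```

Escaping matters because an unescaped "@" or ":" in a password would be misread as a URI delimiter.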

23. What is MongoDB's WiredTiger storage engine, and why is it important?



WiredTiger is MongoDB’s default storage engine (since version 3.2). It provides document-level concurrency control, which reduces contention between concurrent writers, and compresses data and indexes on disk to reduce storage use. Together these make it central to MongoDB's performance and scalability.

# For practical examples, refer to the links below:

* Click on "View raw" to download the project archive:

https://github.com/DM2ih/Python_data-structure/blob/main/mongo%20project.zip

* Or refer to the PDF:

https://github.com/DM2ih/Python_data-structure/blob/main/mongo-project.pdf