# 17_February_16th_Assignment

## Q1. What is MongoDB? Explain non-relational databases in short. In which scenarios is it preferred to use MongoDB over SQL databases?

**MongoDB** is a popular **NoSQL (non-relational)** database that stores data as flexible, JSON-like documents, encoded internally as **BSON** (Binary JSON). It is designed to handle semi-structured data and to scale horizontally across servers, which makes it a good fit for many modern applications.

---

### **What Are Non-Relational Databases?**

**Non-relational databases** (or NoSQL databases) are databases that do not use the traditional table-based schema of relational databases. Instead, they store data in various formats like:
1. **Document-based** (e.g., MongoDB, CouchDB)
2. **Key-value pairs** (e.g., Redis, DynamoDB)
3. **Graph-based** (e.g., Neo4j)
4. **Column-family stores** (e.g., Cassandra, HBase)

**Key Features of Non-Relational Databases:**
- **Schema flexibility**: No fixed schema; data structures can evolve over time.
- **Horizontal scalability**: Easily scales across multiple servers.
- **High performance**: Optimized for specific data models and use cases.

---

### **When to Prefer MongoDB Over SQL Databases**

**Scenarios Where MongoDB Is Preferred:**
1. **Dynamic Schema Requirements**:
   - When the data structure is frequently changing or unstructured, MongoDB’s schema-less nature is advantageous.

2. **Handling Large Volumes of Data**:
   - MongoDB scales horizontally, making it suitable for applications with massive amounts of data.

3. **High-Performance Applications**:
   - Applications requiring low-latency data access and high throughput.

4. **Document-Oriented Data**:
   - When the data is naturally represented as documents (e.g., JSON), such as in content management systems or product catalogs.

5. **Real-Time Applications**:
   - Use cases like IoT, social media platforms, and analytics where real-time data processing is crucial.

6. **Geospatial Data**:
   - MongoDB has built-in geospatial indexes (such as `2dsphere`) and query operators (such as `$near` and `$geoWithin`), which suit location-based applications.

7. **Cloud-Native Applications**:
   - MongoDB is available as a managed cloud service (MongoDB Atlas) on AWS, Azure, and Google Cloud, which helps developers build scalable, distributed systems.

---

### **When to Prefer SQL Databases Over MongoDB**
- **Structured Data**: When the data has a fixed schema and requires complex relationships.
- **ACID Compliance**: For applications requiring strong transactional guarantees (e.g., banking systems).
- **Complex Joins**: When complex queries and joins are frequent, SQL databases are usually the better fit, because joins are a core operation of the relational model.

---

### **Comparison Between MongoDB and SQL Databases**

| **Aspect**          | **MongoDB**                          | **SQL Databases**                   |
|----------------------|---------------------------------------|--------------------------------------|
| **Data Model**       | Document-based (JSON/BSON)           | Table-based (rows and columns)      |
| **Schema**           | Flexible (schema-less)               | Fixed (schema-dependent)            |
| **Scalability**      | Horizontal (sharding across servers) | Primarily vertical (adding resources to a single server) |
| **Transactions**     | Multi-document ACID transactions (since version 4.0) | Fully ACID-compliant                |
| **Performance**      | Optimized for unstructured data and high throughput | Optimized for structured data and complex queries |
| **Use Cases**        | Real-time, big data, IoT, CMS        | Banking, ERP, CRM                   |


## Q2. State and Explain the features of MongoDB.

### **Features of MongoDB**

MongoDB is a powerful NoSQL database known for its flexibility, scalability, and high performance. Below are its key features:

---

### **1. Schema-less Database**
- MongoDB is schema-less, meaning it does not require a predefined schema for storing data.
- Documents in a collection can have different fields, data types, and structures.
- **Example:** A collection can store documents like:

In [13]:
{ "name": "Alice", "age": 25 }
{ "name": "Bob", "address": "123 Main St" }



### **2. Document-Oriented Storage**
- Data is stored in the form of **documents** (JSON-like BSON format).
- Each document represents a record and contains key-value pairs.
- **Example:**

In [16]:
{
  "name": "John Doe",
  "email": "john@example.com",
  "age": 30
}



### **3. High Scalability**
- MongoDB supports **horizontal scaling** through **sharding**, where data is distributed across multiple servers.
- This makes it ideal for handling large datasets and high-traffic applications.
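
To make the idea of distributing documents by a shard key more concrete, here is a minimal conceptual sketch in plain Python. It is not how MongoDB routes data internally (MongoDB assigns ranges of shard-key values, or of their hashes, to shards); the function name `pick_shard`, the shard names, and the modulo scheme are assumptions made only for illustration.

```python
import hashlib

SHARDS = ["shard-a", "shard-b", "shard-c"]  # hypothetical shard names


def pick_shard(shard_key_value, shards=SHARDS):
    """Map a shard-key value to one shard using a stable hash."""
    digest = hashlib.md5(str(shard_key_value).encode("utf-8")).hexdigest()
    return shards[int(digest, 16) % len(shards)]


# Distribute a few hypothetical user documents by their `user_id` value.
placement = {}
for user_id in range(1, 10):
    placement.setdefault(pick_shard(user_id), []).append(user_id)
```

The same key always maps to the same shard, so a query that includes the shard key can be sent to a single server instead of all of them.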



### **4. Indexing**
- MongoDB supports indexing to improve query performance.
- You can create indexes on any field, including embedded fields within documents.
- **Example:** Indexing the `email` field:
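
A minimal sketch of the `email` index using PyMongo's `Collection.create_index`. It assumes `collection` is a PyMongo collection backed by a running server; the helper name `create_email_index` and the optional `unique` flag are illustrative additions, not part of the original notebook.

```python
def create_email_index(collection, unique=False):
    """Create an ascending index on `email` and return the index name."""
    # The direction 1 means ascending (the same value as pymongo.ASCENDING).
    return collection.create_index([("email", 1)], unique=unique)


# Usage, assuming a MongoDB server on localhost and a hypothetical `users` collection:
#   from pymongo import MongoClient
#   users = MongoClient("mongodb://localhost:27017/")["my_database"]["users"]
#   create_email_index(users)
```

Passing `unique=True` would additionally make MongoDB reject documents whose `email` duplicates an existing one.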


### **5. Aggregation Framework**
- MongoDB provides a powerful **aggregation framework** for data processing and analytics.
- Operations like filtering, grouping, and sorting can be performed on the server side.
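
As a rough illustration, an aggregation pipeline in PyMongo is a list of stage documents. The sketch below filters, groups, and sorts; the `city` field and the helper name `build_age_summary_pipeline` are assumptions made for the example.

```python
def build_age_summary_pipeline(min_age):
    """Return a pipeline: filter by age, group by city, sort by group size."""
    return [
        {"$match": {"age": {"$gte": min_age}}},   # filtering
        {"$group": {                               # grouping
            "_id": "$city",
            "count": {"$sum": 1},
            "avg_age": {"$avg": "$age"},
        }},
        {"$sort": {"count": -1}},                  # sorting, largest group first
    ]


pipeline = build_age_summary_pipeline(18)
# Usage, assuming `users` is a PyMongo collection:
#   for row in users.aggregate(pipeline):
#       print(row)
```

Because the whole pipeline is sent to the server, only the grouped results travel back to the client.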



### **6. Geospatial Queries**
- MongoDB supports geospatial data and queries, making it suitable for location-based applications.
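
A small sketch of a "find places near a point" filter. It assumes documents store a GeoJSON point in a field called `location` (a hypothetical field name) and that the collection has a `2dsphere` index on it; the helper name `near_point_query` is also illustrative.

```python
def near_point_query(longitude, latitude, max_distance_m):
    """Build a filter for documents within max_distance_m metres of a point."""
    return {
        "location": {
            "$near": {
                # GeoJSON order is [longitude, latitude].
                "$geometry": {"type": "Point", "coordinates": [longitude, latitude]},
                "$maxDistance": max_distance_m,
            }
        }
    }


query = near_point_query(-73.9857, 40.7484, 1000)
# Usage, assuming `places` is a PyMongo collection:
#   places.create_index([("location", "2dsphere")])  # $near needs a geospatial index
#   nearby = list(places.find(query))
```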


### **7. High Availability**
- MongoDB ensures high availability through **replication**.
- A **replica set** keeps copies of the same data on several nodes: one primary node that accepts writes and one or more secondary nodes. If the primary fails, a secondary is elected as the new primary.
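
From the client side, connecting to a replica set mostly means listing several members and naming the set in the connection string. The sketch below only builds such a URI; the host names and the set name `rs0` are placeholders.

```python
def replica_set_uri(hosts, replica_set_name):
    """Build a MongoDB connection string that lists several replica set members."""
    return "mongodb://" + ",".join(hosts) + "/?replicaSet=" + replica_set_name


uri = replica_set_uri(
    ["db1.example.com:27017", "db2.example.com:27017", "db3.example.com:27017"],
    "rs0",
)
# Usage: MongoClient(uri) discovers the current primary and follows it after a failover.
```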


### **8. Ad Hoc Queries**
- MongoDB supports ad hoc queries, allowing you to query documents using dynamic conditions.
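
One way to picture "dynamic conditions" is a filter document assembled at run time from whatever criteria the caller supplies. This is a sketch under assumptions: the helper name `build_user_filter` is hypothetical, and the `name` and `age` fields follow the sample documents used earlier.

```python
import re


def build_user_filter(min_age=None, name_prefix=None):
    """Assemble a MongoDB filter from optional search criteria."""
    query = {}
    if min_age is not None:
        query["age"] = {"$gte": min_age}
    if name_prefix is not None:
        # Escape the prefix so it is matched literally by the regular expression.
        query["name"] = {"$regex": "^" + re.escape(name_prefix)}
    return query


user_filter = build_user_filter(min_age=25, name_prefix="Al")
# Usage, assuming `users` is a PyMongo collection:
#   matches = list(users.find(user_filter))
```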

### **9. Flexible Data Model**
- MongoDB allows embedding documents and arrays, reducing the need for complex joins.
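
For instance, an order can carry its customer details and line items inside a single document, where a relational design would typically use separate tables. All field names and values below are made up for illustration.

```python
# One document holds what might be three joined tables in a relational schema.
order = {
    "order_id": 1001,
    "customer": {"name": "Alice", "address": {"city": "Springfield"}},
    "items": [
        {"sku": "A-1", "qty": 2},
        {"sku": "B-7", "qty": 1},
    ],
}

# Dot notation reaches into embedded documents and arrays in a query filter.
orders_from_springfield = {"customer.address.city": "Springfield"}
orders_containing_sku = {"items.sku": "B-7"}
# Usage, assuming `orders` is a PyMongo collection:
#   orders.insert_one(order)
#   orders.find_one(orders_containing_sku)
```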

### **10. Transaction Support**
- MongoDB supports **multi-document ACID transactions** (since version 4.0), so a group of writes to multiple documents either all succeed or all roll back.
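
A hedged sketch of a transaction with PyMongo's session API. It assumes `client` is a `MongoClient` connected to a replica set or sharded cluster (transactions are not available on a standalone server) and that `accounts` is a collection whose documents have a `balance` field; the function name and field names are hypothetical.

```python
def transfer_funds(client, accounts, from_id, to_id, amount):
    """Move `amount` between two account documents inside one transaction."""
    with client.start_session() as session:
        # The transaction commits when the block exits normally
        # and aborts if an exception is raised inside it.
        with session.start_transaction():
            accounts.update_one(
                {"_id": from_id}, {"$inc": {"balance": -amount}}, session=session
            )
            accounts.update_one(
                {"_id": to_id}, {"$inc": {"balance": amount}}, session=session
            )
```

Passing `session=session` to each operation is what ties it to the transaction; operations without it run outside the transaction.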


### **11. Open-Source**
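- MongoDB Community Server is free to download, and its source code is publicly available. Since October 2018 it has been licensed under the Server Side Public License (SSPL), which is not an OSI-approved open-source license; earlier releases used the AGPL.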


### **12. Integration with Big Data and Analytics**
- MongoDB can be connected to big data and BI tools such as Apache Spark and Hadoop through dedicated connectors.


### **13. Platform Independence**
- MongoDB runs on multiple platforms, including Windows, Linux, and macOS.


### **14. Easy Integration with Programming Languages**
- MongoDB has drivers for many programming languages, including Python, Java, Node.js, C#, and PHP.


### **15. GridFS**
- MongoDB supports storing large files (like images and videos) using **GridFS**, which splits each file into chunks (255 KB by default) and is intended for files that exceed the 16 MB BSON document size limit.
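
The chunking idea can be sketched in plain Python. This only illustrates the splitting step and is not the GridFS implementation; the helper name `split_into_chunks` and the sample payload are assumptions.

```python
DEFAULT_CHUNK_SIZE = 255 * 1024  # GridFS default chunk size in bytes


def split_into_chunks(data, chunk_size=DEFAULT_CHUNK_SIZE):
    """Split a bytes object into consecutive chunks of at most chunk_size bytes."""
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


payload = b"x" * (600 * 1024)  # a hypothetical 600 KiB file
chunks = split_into_chunks(payload)
```

In real code the `gridfs` module that ships with PyMongo does this for you, for example `gridfs.GridFS(db).put(payload, filename="video.mp4")`, where `db` is a PyMongo database and the file name is a placeholder.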


### **Summary Table**

| **Feature**           | **Description**                                                   |
|------------------------|-------------------------------------------------------------------|
| Schema-less            | No predefined schema; flexible data structure.                  |
| Document-Oriented      | Data stored as JSON-like BSON documents.                        |
| High Scalability       | Horizontal scaling with sharding.                               |
| Indexing               | Improves query performance.                                     |
| Aggregation Framework  | Powerful analytics and data processing.                        |
| Geospatial Queries     | Supports location-based queries.                                |
| High Availability      | Ensures availability through replication.                      |
| Ad Hoc Queries         | Dynamic and flexible query capabilities.                       |
| Flexible Data Model    | Supports embedded documents and arrays.                        |
| Transaction Support    | Multi-document ACID transactions.                              |
| Open-Source            | Source code is public; Community Server is licensed under the SSPL. |
| Big Data Integration   | Works with Hadoop, Spark, and BI tools.                        |
| Platform Independence  | Compatible with various operating systems.                     |
| Language Integration   | Drivers for multiple programming languages.                    |
| GridFS                | Handles large files efficiently.                                |

## Q3. Write a code to connect MongoDB to Python. Also, create a database and a collection in MongoDB.

Here is a complete code example to connect MongoDB to Python, create a database, and create a collection using the `pymongo` library.

---

### **Step 1: Install `pymongo`**


In [44]:
pip install pymongo


Note: you may need to restart the kernel to use updated packages.


### **Step 2: Python Code to Connect to MongoDB, Create a Database, and Create a Collection**

In [55]:
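from pymongo import MongoClient

# Step 1: Connect to MongoDB
client = MongoClient("mongodb://localhost:27017/")  # Replace with your MongoDB connection string
client.admin.command("ping")  # MongoClient connects lazily; ping raises an error if the server is unreachable
print("Connected to MongoDB successfully!")

# Step 2: Create a Database
db = client["my_database"]  # Replace 'my_database' with your desired database name
print(f"Database created: {db.name}")

# Step 3: Create a Collection
collection = db["my_collection"]  # Replace 'my_collection' with your desired collection name
print(f"Collection created: {collection.name}")

# Step 4: Insert a Document (this first write is what actually creates the database and collection)
collection.insert_one({"name": "John Doe", "email": "john.doe@example.com", "age": 30})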



Connected to MongoDB successfully!
Database created: my_database
Collection created: my_collection


---

### **Explanation of the Code**

1. **Connecting to MongoDB**:
   - `MongoClient("mongodb://localhost:27017/")` connects to the MongoDB server running locally. Replace the connection string with your remote MongoDB URI if needed.

2. **Creating a Database**:
   - `client["my_database"]` returns a handle to a database named `my_database`. MongoDB creates the database lazily, only when data is first written to it.

3. **Creating a Collection**:
   - `db["my_collection"]` returns a handle to a collection named `my_collection`. As with databases, MongoDB creates the collection only when data is first written to it.

4. **Inserting a Document**:
   - `collection.insert_one()` inserts a single document into the collection.
   - The document is a Python dictionary (e.g., `{"name": "John Doe", "email": "john.doe@example.com", "age": 30}`).
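
Because databases and collections are created lazily, it can be useful to check whether they exist on the server yet. The helpers below are a small sketch using PyMongo's `list_database_names` and `list_collection_names`; the helper names themselves are hypothetical.

```python
def database_exists(client, name):
    """Return True if the server already has a database with this name."""
    return name in client.list_database_names()


def collection_exists(db, name):
    """Return True if the database already has a collection with this name."""
    return name in db.list_collection_names()


# Usage, assuming `client` is a connected MongoClient:
#   database_exists(client, "my_database")                      # False until the first write
#   collection_exists(client["my_database"], "my_collection")   # True after the first insert
```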

