
# MongoDB Assignment – Questions & Answers

## Theoretical Questions

### 1. Key differences between SQL and NoSQL databases
- SQL: Fixed schema, relational tables linked by keys, ACID transactions by default, typically scaled vertically.
- NoSQL: Flexible schema, several data models (document, key-value, wide-column, graph), designed to scale horizontally.

### 2. Why MongoDB is good for modern applications
- Schema flexibility, scalability, JSON-like documents, high performance.

### 3. Collections in MongoDB
- A collection is a group of documents similar to a table in SQL.

### 4. High availability using replication
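- A replica set keeps copies of the data on one primary and one or more secondary nodes; if the primary fails, the secondaries elect a new primary automatically.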

### 5. Benefits of MongoDB Atlas
- Managed cloud DB, backups, scaling, monitoring.

### 6. Role of indexes
- Improve query performance by reducing scanned documents.
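
As a minimal sketch, an index on a frequently filtered field could be created as below. It assumes a PyMongo collection with a `Region` field (as in the practical section); the helper name `ensure_region_index` is hypothetical.

```python
def ensure_region_index(collection):
    """Create an ascending single-field index on "Region" (1 = ascending).

    `collection` is assumed to be a PyMongo Collection. create_index returns
    the index name and does nothing if an identical index already exists.
    """
    return collection.create_index([("Region", 1)])

# With a live connection: ensure_region_index(db["orders"])
```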

### 7. Aggregation pipeline stages
- $match, $group, $project, $sort, $lookup, $limit.
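
The stages above are usually chained, each one consuming the previous stage's output. The sketch below builds such a pipeline as plain Python data; the function name `top_regions_pipeline` and the `Sales`/`Region` fields are assumptions taken from the practical section's dataset.

```python
def top_regions_pipeline(min_sales, limit):
    """Build a pipeline: filter, group, sort, trim, then reshape the output."""
    return [
        {"$match": {"Sales": {"$gte": min_sales}}},          # keep qualifying orders
        {"$group": {"_id": "$Region", "totalSales": {"$sum": "$Sales"}}},
        {"$sort": {"totalSales": -1}},                       # largest total first
        {"$limit": limit},                                   # keep the top N groups
        {"$project": {"_id": 0, "region": "$_id", "totalSales": 1}},
    ]

pipeline = top_regions_pipeline(min_sales=100, limit=3)
# With a live connection: results = list(collection.aggregate(pipeline))
```

Placing `$match` first is a common choice because it reduces the number of documents the later stages have to process.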

### 8. Sharding vs Replication
- Sharding = data distribution, Replication = data redundancy.
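
The difference can be illustrated with a toy model in plain Python. This is only an analogy, not how MongoDB routes data internally: the modulo-of-a-hash placement and the names `shard_for` and `place_documents` are assumptions made for the sketch.

```python
import hashlib

def shard_for(key, num_shards):
    """Pick a shard by hashing the shard key (toy stand-in for hashed sharding)."""
    digest = hashlib.sha256(str(key).encode("utf-8")).hexdigest()
    return int(digest, 16) % num_shards

def place_documents(docs, key_field, num_shards, replicas_per_shard):
    """Sharding splits documents across shards; replication copies each shard."""
    shards = [[] for _ in range(num_shards)]
    for doc in docs:
        shards[shard_for(doc[key_field], num_shards)].append(doc)
    # Each shard is modelled as a replica set: every member holds the same data.
    return [[list(shard) for _ in range(replicas_per_shard)] for shard in shards]

cluster = place_documents([{"id": i} for i in range(20)], "id", 3, 3)
```

Each document lives on exactly one shard (distribution), while every member of that shard's replica set holds a copy of it (redundancy).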

### 9. PyMongo
- Python driver to interact with MongoDB.

### 10. ACID properties
- Atomicity, Consistency, Isolation, Durability.

### 11. explain() function
- Shows the query execution plan, for example whether the query used an index scan or a full collection scan.
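
A small helper can make the plan easier to read. The sketch below walks a plan through its `inputStage` links; the sample dictionary is hand-written to resemble a simple winning plan and is not real server output, and newer server versions may nest the plan differently.

```python
def plan_stages(plan):
    """Return stage names from the outermost stage down to the leaf.

    Follows the single `inputStage` links used by simple query plans.
    """
    stages = []
    while plan is not None:
        stages.append(plan.get("stage"))
        plan = plan.get("inputStage")
    return stages

# Hand-written sample shaped like a simple winning plan (illustrative only).
sample_explain = {
    "queryPlanner": {
        "winningPlan": {
            "stage": "FETCH",
            "inputStage": {"stage": "IXSCAN", "indexName": "Region_1"},
        }
    }
}
stages = plan_stages(sample_explain["queryPlanner"]["winningPlan"])
# With a live connection: collection.find({"Region": "West"}).explain()
```

An `IXSCAN` stage indicates an index was used, while `COLLSCAN` means every document in the collection was scanned.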

### 12. Schema validation
- Enforces rules using JSON Schema.
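
A validator is just a document passed when the collection is created. The following is a sketch only: the required fields and the collection name `validated_orders` are assumptions chosen to match the Superstore data used later.

```python
# Validator document: every order must have a string Region and non-negative Sales.
order_validator = {
    "$jsonSchema": {
        "bsonType": "object",
        "required": ["Region", "Sales"],
        "properties": {
            "Region": {"bsonType": "string"},
            "Sales": {"bsonType": ["double", "int"], "minimum": 0},
        },
    }
}

# With a live connection (collection name is hypothetical):
# db.create_collection("validated_orders", validator=order_validator)
```

With this in place, inserts that omit `Region` or carry a negative `Sales` value would be rejected by the server.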

### 13. Primary vs Secondary node
- The primary accepts all writes; secondaries replicate its data, can optionally serve reads, and can be elected primary if it fails.
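
From the client side, the roles show up in the connection string. The sketch below assembles one; the host names, the replica set name `rs0`, and the helper `replica_set_uri` are all hypothetical.

```python
def replica_set_uri(hosts, replica_set, read_preference="primary"):
    """Build a connection string that lists every replica set member."""
    return (
        f"mongodb://{','.join(hosts)}/"
        f"?replicaSet={replica_set}&readPreference={read_preference}"
    )

uri = replica_set_uri(
    ["node1.example.com:27017", "node2.example.com:27017", "node3.example.com:27017"],
    replica_set="rs0",
    read_preference="secondaryPreferred",  # reads go to a secondary when one is available
)
# With a live deployment: client = MongoClient(uri)
```

Writes still go to the primary regardless of the read preference.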

### 14. Security mechanisms
- Authentication (verifying who connects), role-based authorization (what they may do), and encryption in transit (TLS) and at rest.

### 15. Embedded documents
- Store related data together for faster reads.
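
As an illustration, an order with its customer and line items embedded might look like the dictionary below. All field names and values are made up for the sketch.

```python
# One document holds the order, its customer, and its line items.
order = {
    "Order ID": "EX-0001",  # hypothetical identifier
    "customer": {"name": "Ada Example", "city": "Seattle"},
    "items": [
        {"product": "Stapler", "quantity": 2, "price": 12.5},
        {"product": "Paper", "quantity": 5, "price": 4.0},
    ],
}

# Everything needed for the total arrives in a single read, with no join.
order_total = sum(item["quantity"] * item["price"] for item in order["items"])

# Dot notation reaches into embedded documents in a query filter.
city_filter = {"customer.city": "Seattle"}
# With a live connection: collection.find(city_filter)
```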

### 16. $lookup stage
- Performs join-like operations.
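
A sketch of a `$lookup` stage follows. It assumes a separate `customers` collection whose `_id` matches a `Customer ID` field on each order; neither is defined elsewhere in these notes.

```python
lookup_pipeline = [
    {
        "$lookup": {
            "from": "customers",          # collection to join with (assumed to exist)
            "localField": "Customer ID",  # field on the orders documents
            "foreignField": "_id",        # field on the customers documents
            "as": "customer",             # output array holding the matches
        }
    },
    # $lookup always produces an array; $unwind flattens it to one document per match.
    {"$unwind": "$customer"},
]
# With a live connection: list(collection.aggregate(lookup_pipeline))
```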

### 17. Common use cases
- E-commerce, real-time analytics, IoT.

### 18. Horizontal scaling advantages
- Adding more servers spreads data and traffic across machines, so capacity grows without depending on a single, ever-larger server.

### 19. MongoDB vs SQL transactions
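- Single-document operations are always atomic. MongoDB also supports multi-document ACID transactions (on replica sets since 4.0 and on sharded clusters since 4.2), but they carry more overhead than in a typical SQL database, so related data is usually modelled in one document where possible.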
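
A multi-document transaction in PyMongo runs inside a session. The sketch below is illustrative only: the `inventory` collection, its `sku`/`qty` fields, and the function `transfer_stock` are hypothetical, and transactions require a replica set or sharded cluster rather than a standalone server.

```python
def transfer_stock(client, from_sku, to_sku, qty):
    """Move stock between two documents so both updates commit or neither does."""
    inventory = client["superstore"]["inventory"]  # hypothetical collection

    def callback(session):
        # Both writes carry the session, so they belong to the same transaction.
        inventory.update_one({"sku": from_sku}, {"$inc": {"qty": -qty}}, session=session)
        inventory.update_one({"sku": to_sku}, {"$inc": {"qty": qty}}, session=session)

    with client.start_session() as session:
        # with_transaction starts the transaction, runs the callback, and commits.
        session.with_transaction(callback)
```

If the callback raises, the transaction is aborted and neither update is applied.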

### 20. Capped vs regular collections
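- A capped collection has a fixed maximum size and preserves insertion order; once it is full, the oldest documents are overwritten (FIFO). A regular collection grows without a fixed limit.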
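
The overwrite behaviour can be pictured with a bounded queue. The `deque` below is only an analogy in plain Python, not MongoDB's implementation, and the collection name `events` in the comment is hypothetical.

```python
from collections import deque

# A deque with maxlen drops its oldest entry when a new one would overflow it.
capped = deque(maxlen=3)
for event in ["e1", "e2", "e3", "e4"]:
    capped.append(event)

remaining = list(capped)  # "e1" was pushed out by "e4"

# With a live connection, a real capped collection is limited by size in bytes
# and optionally by document count:
# db.create_collection("events", capped=True, size=1_048_576, max=1000)
```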

### 21. $match stage
- Filters documents in aggregation.

### 22. Securing MongoDB access
- Use auth, roles, firewall, TLS.
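
On the client side, these measures mostly appear in the connection string. The sketch below builds one with credentials and TLS; the user, password, and host are placeholders, and `build_secure_uri` is a hypothetical helper.

```python
from urllib.parse import quote_plus

def build_secure_uri(username, password, host, auth_db="admin"):
    """Build an authenticated, TLS-enabled connection string.

    Credentials are percent-encoded so characters like '@' or '/' cannot
    break the URI.
    """
    user = quote_plus(username)
    pwd = quote_plus(password)
    return f"mongodb://{user}:{pwd}@{host}/?authSource={auth_db}&tls=true"

uri = build_secure_uri("app_user", "p@ss/word", "db.example.com:27017")
# With a live deployment: client = MongoClient(uri)
```

In practice the password would come from an environment variable or secret store rather than source code.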

### 23. WiredTiger engine
- MongoDB's default storage engine; it provides document-level concurrency control, data compression, and checkpointing.

---

## Practical Questions (Python + PyMongo)

```python
from pymongo import MongoClient
import pandas as pd

client = MongoClient("mongodb://localhost:27017/")
db = client["superstore"]
collection = db["orders"]
```

### Load CSV into MongoDB
```python
df = pd.read_csv("Superstore.csv")
collection.insert_many(df.to_dict("records"))
```

### Retrieve all documents
```python
for doc in collection.find():
    print(doc)
```

### Count documents
```python
# An empty filter matches every document in the collection.
print(collection.count_documents({}))
```

### Orders from West region
```python
for doc in collection.find({"Region": "West"}):
    print(doc)
```

### Sales > 500
```python
for doc in collection.find({"Sales": {"$gt": 500}}):
    print(doc)
```

### Top 3 highest profit
```python
for doc in collection.find().sort("Profit", -1).limit(3):  # -1 = descending
    print(doc)
```

### Update Ship Mode
```python
result = collection.update_many({"Ship Mode": "First Class"}, {"$set": {"Ship Mode": "Premium Class"}})
print(result.modified_count)  # number of documents changed
```

### Delete Sales < 50
```python
result = collection.delete_many({"Sales": {"$lt": 50}})
print(result.deleted_count)  # number of documents removed
```

### Total sales by region
```python
pipeline = [{"$group": {"_id": "$Region", "totalSales": {"$sum": "$Sales"}}}]
for row in collection.aggregate(pipeline):  # aggregate returns a cursor
    print(row)
```
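
To see what this `$group` stage computes, the same totals can be worked out in plain Python. The three sample rows are made up for illustration and are not taken from `Superstore.csv`.

```python
from collections import defaultdict

# Made-up sample rows standing in for documents in the orders collection.
rows = [
    {"Region": "West", "Sales": 100.0},
    {"Region": "East", "Sales": 200.0},
    {"Region": "West", "Sales": 50.5},
]

# Equivalent of {"$group": {"_id": "$Region", "totalSales": {"$sum": "$Sales"}}}
totals = defaultdict(float)
for row in rows:
    totals[row["Region"]] += row["Sales"]

grouped = [{"_id": region, "totalSales": total} for region, total in totals.items()]
```

Each distinct `Region` value becomes one output document, with `totalSales` holding the sum of `Sales` over the documents in that group.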

### Distinct Ship Mode
```python
# distinct returns a plain list of the unique values for the field.
print(collection.distinct("Ship Mode"))
```

### Count orders per category
```python
pipeline = [{"$group": {"_id": "$Category", "count": {"$sum": 1}}}]
for row in collection.aggregate(pipeline):  # one row per category
    print(row)
```
