### Feb 17 Assignment

###### Q1. What is MongoDB? Explain non-relational databases in short. In which scenarios is MongoDB preferred over SQL databases?

**MongoDB** is a popular document-oriented NoSQL database management system that provides a flexible and scalable approach to data storage and retrieval. Its server source code is publicly available (under the Server Side Public License since 2018). Unlike traditional SQL databases, MongoDB is a non-relational database: it stores data as documents rather than as rows in related tables.

**Non-relational databases**, often referred to as NoSQL databases, are designed to handle large volumes of unstructured or semi-structured data. They offer a more flexible and dynamic approach to data storage compared to traditional relational databases. Non-relational databases are characterized by the following features:

1. **Schema Flexibility**: Non-relational databases do not require a fixed schema. Data can be stored in a more dynamic and self-describing manner, allowing for changes in data structure without disrupting existing data.

2. **Scalability**: NoSQL databases are built to scale out horizontally across multiple servers or nodes, making them suitable for handling large amounts of data and high traffic loads.

3. **High Performance**: Non-relational databases are optimized for specific use cases, such as read-heavy or write-heavy workloads, and can deliver high performance for these scenarios.

4. **Variety of Data Models**: NoSQL databases support various data models, such as key-value, document, column-family, and graph, to accommodate different data storage requirements.

5. **Distributed Architecture**: Many NoSQL databases are designed for distributed and decentralized architectures, allowing them to be easily deployed across clusters of servers.
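As a rough illustration of schema flexibility and the document model, the sketch below uses only Python's standard `json` module. The field names (`name`, `email`, `addresses`, `tags`) are made up for the example. Two records that belong to the same logical collection carry different fields, which a fixed relational table could only accommodate through a schema change or nullable columns.

```python
import json

# Two hypothetical "customer" documents with different shapes.
customers = [
    {"name": "abc", "email": "abc@example.com"},
    {
        "name": "xyz",
        # Nested array of sub-documents, stored inside the parent document.
        "addresses": [{"city": "Pune", "zip": "411001"}],
        "tags": ["premium"],
    },
]

# Each document is self-describing: its own keys say which fields it has.
field_sets = [sorted(doc.keys()) for doc in customers]
print(field_sets)

# Documents of this kind round-trip through JSON without a predefined schema.
serialized = json.dumps(customers)
restored = json.loads(serialized)
print(restored == customers)
```

A document database stores records in roughly this form, so adding the `addresses` field to new documents does not require touching the older ones.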

**When to Use MongoDB Over SQL Databases**:

MongoDB is often preferred over traditional SQL databases in the following scenarios:

1. **Unstructured or Semi-Structured Data**: When dealing with data that doesn't fit neatly into tabular structures, such as JSON-like documents, MongoDB provides a more natural and flexible way to store and retrieve this data.

2. **Dynamic Schema**: In cases where the data schema evolves frequently or is not well-defined upfront, MongoDB's schema-less nature allows for easy modifications without changing the existing data.

3. **Scalability**: MongoDB excels in handling large-scale applications that require horizontal scalability, especially in scenarios with high write or read throughput.

4. **Agile Development**: MongoDB's flexible schema and ease of use are well-suited for agile development environments where requirements may change rapidly.

5. **Complex Queries**: MongoDB's rich querying capabilities, including support for geospatial queries and text search, make it suitable for applications that require complex querying.

6. **Real-time Analytics**: When real-time data analytics and reporting are required, MongoDB's ability to handle fast and dynamic data ingestion can be advantageous.

It's important to note that the choice between MongoDB and SQL databases depends on the specific requirements of your application. While MongoDB offers benefits in terms of flexibility and scalability, SQL databases have their own strengths, particularly when dealing with well-defined, structured data and complex joins. The decision should be based on a careful evaluation of your application's needs and constraints.

###### Q2. State and Explain the features of MongoDB.

MongoDB is a popular NoSQL database management system known for its flexibility, scalability, and ease of use. It offers a range of features that make it well-suited for handling various types of data and applications. Here are some key features of MongoDB:

1. **Document-Oriented**: MongoDB stores data in flexible, self-describing documents using a format similar to JSON (BSON). Each document can have a different structure, allowing for dynamic and evolving schemas.

2. **Schema Flexibility**: MongoDB's schema-less nature allows you to easily add or modify fields in documents without affecting other documents. This is particularly useful in agile development environments where requirements change frequently.

3. **Dynamic Queries**: MongoDB supports dynamic queries using a rich query language that allows you to specify complex conditions and filters for data retrieval.

4. **Indexing**: MongoDB supports indexing to improve query performance. It provides various types of indexes, including single-field and compound indexes, as well as geospatial and text indexes.

5. **Aggregation Framework**: MongoDB's aggregation framework allows you to perform complex data transformations and aggregations within the database, reducing the need for extensive data processing in the application layer.

6. **High Availability**: MongoDB supports replica sets, which are groups of database instances that maintain the same data. This provides automatic failover and data redundancy, enhancing availability and reliability.

7. **Horizontal Scalability**: MongoDB can be easily scaled horizontally across multiple servers or clusters to handle increased data volumes and traffic loads.

8. **Geospatial Capabilities**: MongoDB supports geospatial data storage and queries, making it suitable for applications that require location-based services and spatial analysis.

9. **Full-Text Search**: MongoDB includes full-text search capabilities, allowing you to perform text-based searches across documents and collections.

10. **Auto-Sharding**: MongoDB's built-in auto-sharding distributes data across multiple shards (partitions), enabling seamless scaling of read and write operations.

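11. **Document Versioning**: MongoDB has no built-in document versioning, but version histories can be kept at the application level (for example, by storing a version number and writing older revisions to a separate collection). The default `ObjectId` in the `_id` field embeds a creation timestamp, which helps with auditing when a document was first inserted.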

12. **Ease of Development**: MongoDB provides native drivers for various programming languages, a flexible and intuitive query language, and comprehensive documentation, making it easy for developers to work with.

13. **Ad Hoc Queries**: MongoDB allows you to run ad hoc queries on large datasets without the need for predefined views or complex joins.

14. **Replica Sets**: MongoDB's replica sets offer data redundancy and high availability. They consist of primary and secondary nodes, with automatic failover in case of primary node failure.

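15. **Atomic Transactions**: Operations on a single document are always atomic. MongoDB also supports multi-document ACID transactions (on replica sets since version 4.0 and on sharded clusters since version 4.2), ensuring data consistency and integrity for complex operations.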

16. **Security Features**: MongoDB provides authentication, authorization, role-based access control, and encryption options to secure your data.

Overall, MongoDB's features make it a versatile choice for a wide range of applications, from content management systems and e-commerce platforms to real-time analytics and Internet of Things (IoT) applications. Its flexible schema, scalability, and rich document model make it suitable for both small-scale projects and large, high-performance applications.
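To make the aggregation framework from the list above more concrete, here is a minimal sketch of how a pipeline is expressed from Python: it is simply a list of stage dictionaries. The `$match`, `$group`, `$sort`, `$gte`, and `$sum` operators are standard MongoDB operators, while the `orders` data shape (`total`, `city`) and the function name are assumptions made up for this example.

```python
def orders_per_city_pipeline(min_total):
    """Build a pipeline: filter orders, group them by city, sort by count.

    The field names "total" and "city" are hypothetical.
    """
    return [
        # Stage 1: keep only orders whose total is at least min_total.
        {"$match": {"total": {"$gte": min_total}}},
        # Stage 2: group by city and count the orders in each group.
        {"$group": {"_id": "$city", "order_count": {"$sum": 1}}},
        # Stage 3: cities with the most orders first.
        {"$sort": {"order_count": -1}},
    ]


pipeline = orders_per_city_pipeline(100)
for stage in pipeline:
    print(stage)
```

With PyMongo, a list like this would be passed to `collection.aggregate(pipeline)`, and the grouping would then run inside the database rather than in application code.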

###### Q3. Write a code to connect MongoDB to Python. Also, create a database and a collection in MongoDB.

To connect to MongoDB from Python, we can use the `pymongo` library, which provides a convenient way to interact with MongoDB databases. First, you need to install the library if you haven't already:

```bash
pip install pymongo
```

Here's an example Python code to connect to MongoDB, create a database, and a collection:

```python
import pymongo

# Establish a connection to the MongoDB server
client = pymongo.MongoClient("mongodb://localhost:27017/")

# Get a handle to the "mydatabase" database
# (MongoDB creates it lazily, on the first write)
mydb = client["mydatabase"]

# Get a handle to the "customers" collection (also created on the first write)
customers = mydb["customers"]

# Define a document (record) for the "customers" collection
customer_data = {
    "first_name": "abc",
    "last_name": "cde",
    "email": "abc@example.com"
}

# Insert the document into the collection
customer_id = customers.insert_one(customer_data)

# Print the inserted document's ID
print("Inserted document ID:", customer_id.inserted_id)

# Close the MongoDB connection
client.close()
```

Explanation:

1. Import the `pymongo` module.

2. Establish a connection to the MongoDB server using the `MongoClient` class and specifying the connection URI. In this example, the URI specifies the host (`localhost`) and the default MongoDB port (`27017`); it does not name a database.

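3. Get a reference to a database named `"mydatabase"` using the dictionary-like syntax of the `client` object. MongoDB creates databases lazily, so the database does not actually exist on the server until data is first written to it.

4. Get a reference to a collection named `"customers"` within the `"mydatabase"` database using the dictionary-like syntax of the `mydb` object. Like the database, the collection is created when the first document is inserted.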

5. Define a dictionary `customer_data` containing the customer's information.

6. Use the `insert_one()` method of the `customers` collection to insert the `customer_data` document into the collection. The method returns an object that contains the inserted document's ID.

7. Print the ID of the inserted document.

8. Close the MongoDB connection using the `close()` method of the `client` object.
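The connection URI in step 2 can also carry credentials. As a small sketch (the helper name `build_mongo_uri` and the sample credentials are hypothetical), the function below builds a `mongodb://` URI with the standard library only. Reserved characters such as `@` and `/` in a username or password have to be percent-encoded before they go into the URI, which `urllib.parse.quote_plus` takes care of.

```python
from urllib.parse import quote_plus


def build_mongo_uri(host="localhost", port=27017, username=None, password=None):
    """Return a mongodb:// connection URI (hypothetical helper for this example)."""
    if username is None:
        # No authentication: same form as the URI used in the code above.
        return f"mongodb://{host}:{port}/"
    # Percent-encode credentials so characters like "@" or "/" cannot
    # be mistaken for URI delimiters.
    user = quote_plus(username)
    pwd = quote_plus(password or "")
    return f"mongodb://{user}:{pwd}@{host}:{port}/"


print(build_mongo_uri())
print(build_mongo_uri(username="app", password="p@ss/word"))
```

The resulting string can be passed to `pymongo.MongoClient(...)` in place of the hard-coded URI.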

###### Q4. Using the database and the collection created in question number 3, write a code to insert one record, and insert many records. Use the find() and find_one() methods to print the inserted record.

The code below inserts one record with `insert_one()` and several records with `insert_many()`, then uses the `find()` and `find_one()` methods to retrieve and print them:

```python
import pymongo

# Establish a connection to the MongoDB server
client = pymongo.MongoClient("mongodb://localhost:27017/")

# Access the "mydatabase" database and "customers" collection
mydb = client["mydatabase"]
customers = mydb["customers"]

# Insert one record into the "customers" collection
customer_data_one = {
    "first_name": "abc",
    "last_name": "def",
    "email": "abc@example.com"
}
customers.insert_one(customer_data_one)

# Insert many records into the "customers" collection
customer_data_many = [
    {"first_name": "xyz", "last_name": "ijk", "email": "xyz@example.com"},
    {"first_name": "pqr", "last_name": "lmn", "email": "pqr@example.com"}
]
customers.insert_many(customer_data_many)

# Retrieve and print the inserted records using find()
print("All inserted records:")
for customer in customers.find():
    print(customer)

# Retrieve and print one inserted record using find_one()
print("\nOne inserted record:")
one_record = customers.find_one({"first_name": "abc"})
print(one_record)

# Close the MongoDB connection
client.close()
```

Explanation:

1. Establish a connection to the MongoDB server and access the "mydatabase" database and "customers" collection as done in the previous example.

2. Insert one record (document) using the `insert_one()` method.

3. Insert multiple records (documents) using the `insert_many()` method.

4. Use the `find()` method with no filter to retrieve and iterate through every document in the collection (including the one inserted in Q3, if that code was run), printing each record.

5. Use the `find_one()` method to retrieve and print the first document that matches a query.

6. Close the MongoDB connection.
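The two insert calls above return result objects: `insert_one()` exposes the new ID as `inserted_id`, and `insert_many()` exposes a list as `inserted_ids`. The sketch below wraps both in one hypothetical helper, `insert_customers`, which is not part of PyMongo. It accepts any object with those two methods, so `collection` is assumed to be a PyMongo collection such as `customers`.

```python
def insert_customers(collection, docs):
    """Insert a list of documents and return the list of generated IDs.

    Hypothetical helper: `collection` is assumed to behave like a
    PyMongo collection (insert_one / insert_many).
    """
    if not docs:
        # Nothing to insert; skip the round trip to the server.
        return []
    if len(docs) == 1:
        result = collection.insert_one(docs[0])
        return [result.inserted_id]
    result = collection.insert_many(docs)
    return list(result.inserted_ids)
```

With the collection from the example above, it could be called as `insert_customers(customers, customer_data_many)`. Note that running the same insert script twice stores the same customers twice, because each insert generates a new `_id`.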

###### Q5. Explain how you can use the find() method to query the MongoDB database. Write a simple code to demonstrate this. 

The `find()` method queries a collection and retrieves the documents that match the specified query criteria. You can filter with a query document, restrict the returned fields with a projection, and chain cursor methods such as `sort()` and `limit()` to order and cap the results. It returns a cursor, which is an iterator over the matching documents.

Here's a basic explanation of how to use the `find()` method and a simple code example to demonstrate it:

**Basic Syntax of `find()` Method**:
```python
cursor = collection.find(query, projection)
```

- `collection`: The MongoDB collection you want to query.
- `query`: A dictionary specifying the query criteria. You can use operators like `$eq`, `$gt`, `$lt`, etc., to specify conditions.
- `projection`: A dictionary specifying which fields to include or exclude from the retrieved documents.

**Example Code**:
Let's assume we have a MongoDB collection named `"students"` with documents containing information about students. We'll use the `find()` method to retrieve students who are older than 20 years.

```python
import pymongo

# Establish a connection to the MongoDB server
client = pymongo.MongoClient("mongodb://localhost:27017/")

# Access the "mydatabase" database and "students" collection
mydb = client["mydatabase"]
students = mydb["students"]

# Query to find students older than 20 years
query = {"age": {"$gt": 20}}

# Use the find() method to retrieve matching documents
cursor = students.find(query)

# Iterate through the cursor and print the retrieved documents
print("Students older than 20 years:")
for student in cursor:
    print(student)

# Close the MongoDB connection
client.close()
```

In this example:
1. We establish a connection to the MongoDB server and access the "mydatabase" database and "students" collection.
2. We define a query to find students older than 20 years using the `$gt` (greater than) operator.
3. We use the `find()` method to retrieve documents that match the query criteria. The result is a cursor.
4. We iterate through the cursor and print the retrieved documents.

The `find()` method allows you to perform various types of queries by specifying different query criteria, projection, sorting, and limiting options. It's a powerful tool for retrieving specific data from MongoDB collections based on your application's needs.
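To show how a query document and a projection are interpreted, here is a simplified in-memory sketch in plain Python. It is not how MongoDB evaluates queries internally and it covers only a few comparison operators. The helper names `matches` and `project` and the sample data are assumptions for illustration.

```python
import operator

# A small subset of MongoDB comparison operators mapped to Python functions.
_OPS = {
    "$eq": operator.eq,
    "$gt": operator.gt,
    "$gte": operator.ge,
    "$lt": operator.lt,
    "$lte": operator.le,
}


def matches(doc, query):
    """Return True if doc satisfies every condition in query (simplified)."""
    for field, condition in query.items():
        if field not in doc:
            return False
        value = doc[field]
        if isinstance(condition, dict):
            # Operator form, e.g. {"$gt": 20, "$lt": 25}: all must hold.
            if not all(_OPS[op](value, operand) for op, operand in condition.items()):
                return False
        elif value != condition:
            # Plain value form, e.g. {"name": "A"}: implicit equality.
            return False
    return True


def project(doc, fields):
    """Keep only the listed fields, like an inclusion projection."""
    return {key: doc[key] for key in fields if key in doc}


students = [
    {"name": "A", "age": 19},
    {"name": "B", "age": 22},
    {"name": "C", "age": 25},
]
query = {"age": {"$gt": 20, "$lt": 25}}
result = [project(s, ["name"]) for s in students if matches(s, query)]
print(result)
```

Against a real collection, the same request would be written as `students.find({"age": {"$gt": 20, "$lt": 25}}, {"name": 1, "_id": 0})`, where `"_id": 0` suppresses the `_id` field that is otherwise always returned.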

###### Q6. Explain the sort() method. Give an example to demonstrate sorting in MongoDB.

The `sort()` method is called on the cursor returned by `find()` and orders the query results by one or more fields; it does not change how documents are stored in the collection. Sorting can be done in ascending (default) or descending order, and it allows you to organize query results according to your desired criteria.

**Basic Syntax of `sort()` Method**:
```python
cursor = collection.find(query).sort(sort_field, sort_order)
```

- `collection`: The MongoDB collection you want to query.
- `query`: A dictionary specifying the query criteria.
- `sort_field`: The field based on which you want to sort the documents.
- `sort_order`: The sorting order, which can be `pymongo.ASCENDING` (1) for ascending order or `pymongo.DESCENDING` (-1) for descending order.

**Example Code**:
Let's continue with the "students" collection and sort the students by their ages in descending order.

```python
import pymongo

# Establish a connection to the MongoDB server
client = pymongo.MongoClient("mongodb://localhost:27017/")

# Access the "mydatabase" database and "students" collection
mydb = client["mydatabase"]
students = mydb["students"]

# Query to retrieve all students and sort by age in descending order
query = {}
sort_field = "age"
sort_order = pymongo.DESCENDING
cursor = students.find(query).sort(sort_field, sort_order)

# Iterate through the cursor and print the sorted documents
print("Students sorted by age (descending):")
for student in cursor:
    print(student)

# Close the MongoDB connection
client.close()
```

In this example:
1. We establish a connection to the MongoDB server and access the "mydatabase" database and "students" collection.
2. We define an empty query to retrieve all students.
3. We specify the field `"age"` as the sort field and use `pymongo.DESCENDING` to indicate descending order.
4. We use the `sort()` method along with the `find()` method to retrieve and sort the documents based on the specified criteria.
5. We iterate through the cursor and print the sorted documents.

The `sort()` method is useful for arranging query results in a specific order, which can be valuable when presenting data to users or when performing analysis on the retrieved data. It allows you to customize the order in which documents are returned based on the values of specific fields.
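Sorting by more than one field works by listing `(field, direction)` pairs in order of priority. The sketch below emulates that behaviour on an in-memory list in plain Python, purely to show the resulting order. The helper `sort_documents` and the sample data are assumptions, and the two constants simply mirror the integer values of `pymongo.ASCENDING` and `pymongo.DESCENDING`.

```python
ASCENDING, DESCENDING = 1, -1  # same values as pymongo.ASCENDING / pymongo.DESCENDING


def sort_documents(docs, sort_spec):
    """Sort dicts by a list of (field, direction) pairs, most significant first."""
    result = list(docs)
    # Python's sort is stable, so sorting by the least significant key first
    # and the most significant key last yields a correct multi-key ordering.
    for field, direction in reversed(sort_spec):
        result.sort(key=lambda doc: doc[field], reverse=(direction == DESCENDING))
    return result


students = [
    {"name": "B", "age": 22},
    {"name": "A", "age": 22},
    {"name": "C", "age": 19},
    {"name": "D", "age": 25},
]
ordered = sort_documents(students, [("age", DESCENDING), ("name", ASCENDING)])
names = [s["name"] for s in ordered]
print(names)
```

In PyMongo the same ordering is requested with `students.find().sort([("age", pymongo.DESCENDING), ("name", pymongo.ASCENDING)])`, and the sorting is then done by the server.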

###### Q7. Explain why delete_one(), delete_many(), and drop() are used.

In MongoDB, the `delete_one()`, `delete_many()`, and `drop()` methods remove data: the first two delete documents from a collection, while `drop()` removes an entire collection. Each method serves a different purpose and is used in specific scenarios:

1. **`delete_one()` Method**:
   The `delete_one()` method removes a single document from a collection: the first one that matches the specified filter. If several documents match, the others are left untouched.

   Syntax:
   ```python
   result = collection.delete_one(filter)
   ```

   - `collection`: The MongoDB collection from which to delete a document.
   - `filter`: A dictionary specifying the filter criteria to identify the document to be deleted.

   The `delete_one()` method returns a `DeleteResult` object that provides information about the deletion operation, such as the number of documents deleted.

   Use Case:
   When you want to delete a single document that matches specific criteria, such as deleting a single user account based on a unique identifier.

2. **`delete_many()` Method**:
   The `delete_many()` method is used to remove multiple documents that match a specified filter from a collection.

   Syntax:
   ```python
   result = collection.delete_many(filter)
   ```

   - `collection`: The MongoDB collection from which to delete documents.
   - `filter`: A dictionary specifying the filter criteria to identify the documents to be deleted.

   Like `delete_one()`, the `delete_many()` method returns a `DeleteResult` object.

   Use Case:
   When you want to delete multiple documents that meet certain criteria, such as removing all user accounts that haven't been active for a specified period.

3. **`drop()` Method**:
   The `drop()` method is used to completely remove an entire collection from the database.

   Syntax:
   ```python
   collection.drop()
   ```

   - `collection`: The MongoDB collection to be dropped.

   The `drop()` method doesn't take a filter, and in PyMongo it returns `None`. It permanently deletes the collection, including all of its documents and indexes.

   Use Case:
   When you want to completely remove a collection and all its documents, typically because it's no longer needed or you want to restructure your data model.

It's important to use these methods with caution, especially the `delete_many()` and `drop()` methods, as they can result in permanent data loss. Always make sure to double-check your filter criteria before performing deletions. Additionally, consider taking database backups or using mechanisms like soft deletes if you need to retain historical data.
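One way to act on this advice is to wrap `delete_many()` in a small guard. The sketch below is a hypothetical helper, not part of PyMongo. It refuses an empty filter (which would match every document) and defaults to a dry run that only counts the matching documents with `count_documents()`. The `collection` argument is assumed to be a PyMongo collection.

```python
def safe_delete_many(collection, filter_doc, dry_run=True):
    """Delete matching documents, guarding against accidental mass deletion.

    Hypothetical helper: returns the number of documents that match
    (dry run) or the number actually deleted.
    """
    if not filter_doc:
        # An empty filter {} matches every document in the collection.
        raise ValueError("refusing to delete with an empty filter")
    if dry_run:
        # Only report how many documents would be removed.
        return collection.count_documents(filter_doc)
    result = collection.delete_many(filter_doc)
    return result.deleted_count
```

A typical use would be to call `safe_delete_many(customers, {"first_name": "abc"})` first, check the reported count, and then repeat the call with `dry_run=False`.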