
# Q1. What is MongoDB? Explain non-relational databases in short. In which scenarios is it preferred to use MongoDB over SQL databases?

A1. MongoDB is a popular document-oriented database management system and one of the best-known members of the non-relational (NoSQL) family. Its Community Server is source-available under the Server Side Public License (SSPL). Non-relational databases store and retrieve data without requiring a fixed, predefined schema, in contrast to the tabular structure of SQL databases.

In non-relational databases like MongoDB, data is organized and stored in flexible, self-contained documents (typically in JSON-like format) rather than structured tables. The key characteristics of non-relational databases include:

- Schema flexibility: Non-relational databases allow for dynamic and flexible schemas, meaning each document can have its own structure and fields without enforcing a predefined schema.

- Horizontal scalability: Non-relational databases excel at scaling horizontally by distributing data across multiple servers, allowing for easy scaling as data volume or user load increases.

- High performance: Non-relational databases often prioritize high-speed data access and can provide low-latency read and write operations, making them suitable for use cases that require real-time or near-real-time data processing.

MongoDB is preferred over SQL databases in scenarios where:

- Flexibility is required: MongoDB's flexible schema allows easy adaptation to evolving data requirements, making it suitable for projects with changing or undefined schemas.

- Handling unstructured or semi-structured data: If the data being stored does not fit neatly into a fixed schema or has varying attributes, MongoDB's document-oriented approach is more appropriate.

- Scalability and performance: MongoDB's horizontal scalability and distributed nature make it a good choice for high-volume, high-throughput applications that need to handle large amounts of data and concurrent requests.

- Rapid development: MongoDB's flexible and intuitive document model, along with its support for complex data types and query capabilities, can expedite the development process for certain use cases.

It's important to note that the choice between MongoDB and SQL databases depends on the specific requirements of the project and the nature of the data being stored.
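To make the schema-flexibility point concrete, here is a minimal sketch that uses plain Python dictionaries as stand-ins for MongoDB documents. No database is involved, and the `customers` list and its field names are illustrative assumptions.

```python
import json

# Two "documents" destined for the same collection. Unlike rows in a SQL
# table, they do not have to share the same set of fields.
customers = [
    {"name": "John", "age": 30, "city": "New York"},
    {"name": "Alice", "email": "alice@example.com", "tags": ["vip", "newsletter"]},
]

# Collect the distinct field names that appear across all documents.
all_fields = sorted({field for doc in customers for field in doc})

# Serialize to JSON, the text format that MongoDB documents resemble.
as_json = json.dumps(customers, indent=2)

print(all_fields)
print(as_json)
```

In a SQL table, the second record would need either extra nullable columns or a separate table for `tags`; in a document store, both shapes can sit side by side.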

# Q2. State and explain the features of MongoDB.

A2. MongoDB offers several key features that make it a popular choice for data storage and retrieval:

- **Document-oriented**: MongoDB stores data in flexible, self-contained documents (BSON format) that can vary in structure and fields within a collection. This allows for easy handling of semi-structured or unstructured data.

- **Ad hoc queries**: MongoDB supports dynamic queries using a rich query language, including filtering, sorting, and field projection. It allows for complex queries and indexing to optimize query performance.

- **Scalability**: MongoDB provides horizontal scalability by allowing the distribution of data across multiple servers or clusters. It supports sharding, which enables data partitioning and automatic load balancing.

- **High availability**: MongoDB supports replica sets, which provide automatic failover and data redundancy. Replica sets ensure that data remains available even in the event of server failures.

- **Flexible indexing**: MongoDB offers various indexing options, including single-field, compound, geospatial, and text indexes. Indexes help improve query performance and allow efficient data retrieval.

- **Aggregation framework**: MongoDB's powerful aggregation framework allows for advanced data analysis and processing, including grouping, filtering, joining, and transformation operations.

- **Support for transactions**: MongoDB supports multi-document transactions, enabling atomic operations across multiple documents within a single replica set or a sharded cluster.

- **Schema evolution**: MongoDB allows schema changes without requiring application downtime, making it suitable for agile development and accommodating evolving data models.

- **Geospatial capabilities**: MongoDB provides extensive support for geospatial data and queries, making it well-suited for location-based services and applications.

These features, along with a vibrant community and extensive documentation, contribute to the popularity and versatility of MongoDB as a NoSQL database.
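As an illustration of the aggregation framework mentioned above, the following sketch builds a three-stage pipeline and passes it to PyMongo's `aggregate()` method. The helper name `average_age_by_city` and the `age` and `city` fields are assumptions chosen for this example; the function expects a PyMongo collection object as its argument.

```python
def average_age_by_city(collection):
    """Return one summary document per city, sorted by city name."""
    pipeline = [
        # Keep only documents that actually have an "age" field.
        {"$match": {"age": {"$exists": True}}},
        # Group by city and compute the mean age and the document count.
        {"$group": {"_id": "$city", "avg_age": {"$avg": "$age"}, "count": {"$sum": 1}}},
        # Sort the grouped results alphabetically by city (the group key).
        {"$sort": {"_id": 1}},
    ]
    return list(collection.aggregate(pipeline))
```

Each stage receives the output of the previous one, so filtering early with `$match` reduces the work done by `$group`.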

# Q3. Write a code to connect MongoDB to Python. Also, create a database and a collection in MongoDB.

A3. Here's an example code to connect to MongoDB in Python, create a database, and create a collection:

```python
import pymongo

# Establish a connection to MongoDB
client = pymongo.MongoClient("mongodb://localhost:27017/")

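# Get a handle to the database (MongoDB creates it lazily on the first write)
mydb = client["mydatabase"]

# Get a handle to the collection (also created lazily on the first write)
mycol = mydb["customers"]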
```

In this code, the `pymongo` library is used to interact with MongoDB. The `pymongo.MongoClient` class is used to establish a connection to MongoDB, specifying the connection URL (in this case, connecting to the local MongoDB instance on the default port).

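The expression `client["mydatabase"]` returns a handle to a database named "mydatabase", and `mydb["customers"]` returns a handle to a collection named "customers" inside it. MongoDB creates databases and collections lazily: neither exists on the server until the first document is written, so after running this code alone, `client.list_database_names()` will not yet include "mydatabase".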

Note that this code assumes the `pymongo` package is installed and a MongoDB server is running on your local machine.
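Because `MongoClient` connects lazily and MongoDB creates databases and collections only on first write, it can be useful to check both explicitly. The helpers below are a sketch, not part of PyMongo: the names `is_server_reachable` and `collection_exists` are hypothetical, and both functions expect a `MongoClient` instance.

```python
def is_server_reachable(client):
    """Return True if the server answers a ping command."""
    try:
        client.admin.command("ping")
        return True
    except Exception:
        # PyMongo reports connection problems with exceptions such as
        # ConnectionFailure; a broad catch keeps this sketch import-free.
        return False


def collection_exists(client, db_name, collection_name):
    """Return True only once the collection actually exists on the server."""
    if db_name not in client.list_database_names():
        return False
    return collection_name in client[db_name].list_collection_names()
```

With the `client` from the example above, `collection_exists(client, "mydatabase", "customers")` would be expected to return `False` until a document has been inserted.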

# Q4. Using the database and the collection created in question number 3, write a code to insert one record and insert many records. Use the find() and find_one() methods to print the inserted record.

A4. Here's an example code to insert one record and insert many records into a MongoDB collection and then retrieve the inserted records using the `find()` and `find_one()` methods:

```python
import pymongo

# Establish a connection to MongoDB
client = pymongo.MongoClient("mongodb://localhost:27017/")

# Access the database and collection
mydb = client["mydatabase"]
mycol = mydb["customers"]

# Insert one record
record1 = {"name": "John", "age": 30, "city": "New York"}
inserted_record1 = mycol.insert_one(record1)
print("Inserted record ID:", inserted_record1.inserted_id)

# Insert many records
records = [
    {"name": "Alice", "age": 25, "city": "London"},
    {"name": "Bob", "age": 35, "city": "Paris"},
    {"name": "Eve", "age": 40, "city": "Berlin"}
]
inserted_records = mycol.insert_many(records)
print("Inserted records IDs:", inserted_records.inserted_ids)

# Retrieve the inserted records using find() and find_one()
print("All records:")
for record in mycol.find():
    print(record)

print("One record:")
print(mycol.find_one())
```

In this code, the `insert_one()` method is used to insert a single record (`record1`) into the "customers" collection. The `inserted_id` property of the returned `InsertOneResult` object is printed to display the ID of the inserted record.

The `insert_many()` method is used to insert multiple records (`records`) into the "customers" collection. The `inserted_ids` property of the returned `InsertManyResult` object is printed to display the IDs of the inserted records.

To retrieve the records, `find()` is called with no filter, which returns a cursor over every document in the collection; the loop prints each one. `find_one()`, also called with no filter, returns the first document it encounters (or `None` if the collection is empty), which is not necessarily the most recently inserted one.
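When the goal is to read back exactly the document that was just written, one option is to query by the `_id` that `insert_one()` reports. This is a minimal sketch; `insert_and_fetch` is a hypothetical helper name, and the function expects a PyMongo collection.

```python
def insert_and_fetch(collection, document):
    """Insert one document, then read that same document back by its _id."""
    result = collection.insert_one(document)
    # Filtering on _id guarantees we get the document we just inserted,
    # regardless of what else is already stored in the collection.
    return collection.find_one({"_id": result.inserted_id})
```

For example, `insert_and_fetch(mycol, {"name": "Carol", "age": 28})` would return the stored document, including the `_id` field that MongoDB assigned.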

# Q5. Explain how you can use the find() method to query the MongoDB database. Write a simple code to demonstrate this.

A5. The `find()` method in MongoDB allows you to query the database and retrieve documents based on specified criteria. You can provide a query filter to match documents that meet certain conditions.

Here's an example code that demonstrates the usage of the `find()` method to query a MongoDB database:

```python
import pymongo

# Establish a connection to MongoDB
client = pymongo.MongoClient("mongodb://localhost:27017/")

# Access the database and collection
mydb = client["mydatabase"]
mycol = mydb["customers"]

# Find documents that match a specific condition
query = {"city": "New York"}
results = mycol.find(query)

# Print the matching documents
for document in results:
    print(document)
```

In this code, a connection to MongoDB is established, and the "mydatabase" database and "customers" collection are accessed.

A query filter is defined as `{"city": "New York"}`, specifying the condition that documents with the "city" field equal to "New York" should be retrieved.

The `find()` method is then called with the query filter. It returns a cursor that can be iterated over to access the matching documents. In this example, each matching document is printed using a loop.
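Filters are not limited to equality. The sketch below shows two common variations: a comparison operator (`$gt`) combined with a projection that limits the returned fields, and a membership test (`$in`). The helper names and the `age`, `name`, and `city` fields are assumptions for illustration; both functions expect a PyMongo collection.

```python
def find_customers_older_than(collection, min_age):
    """Return the name and age of customers whose age exceeds min_age."""
    query = {"age": {"$gt": min_age}}
    # Projection: 1 includes a field, 0 excludes it (here, the _id field).
    projection = {"_id": 0, "name": 1, "age": 1}
    return list(collection.find(query, projection))


def find_customers_in_cities(collection, cities):
    """Return customers whose city is any one of the given cities."""
    query = {"city": {"$in": list(cities)}}
    return list(collection.find(query))
```

Wrapping the cursor in `list()` reads all matches into memory, which is convenient for small result sets; for large ones, iterating over the cursor directly is usually preferable.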

# Q6. Explain the sort() method. Give an example to demonstrate sorting in MongoDB.

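A6. The `sort()` method orders the results of a query by one or more fields. In PyMongo it is called on the cursor returned by `find()`, and each field is sorted in ascending (`1` or `pymongo.ASCENDING`) or descending (`-1` or `pymongo.DESCENDING`) order. It does not change how documents are stored in the collection.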

Here's an example to demonstrate sorting in MongoDB:

```python
import pymongo

# Establish a connection to MongoDB
client = pymongo.MongoClient("mongodb://localhost:27017/")

# Access the database and collection
mydb = client["mydatabase"]
mycol = mydb["customers"]

# Sort the documents based on the "age" field in descending order
results = mycol.find().sort("age", -1)

# Print the sorted documents
for document in results:
    print(document)
```

In this code, after establishing a connection to MongoDB and accessing the "mydatabase" database and "customers" collection, the `find()` method is called without any query filter to retrieve all documents.

The `sort()` method is then chained to `find()` with the field "age" and the direction -1, so the cursor yields documents from the highest age to the lowest. The loop prints them in that order.
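Sorting on more than one field is done by passing a list of `(field, direction)` pairs, and `limit()` can be chained to keep only the top results. The following is a sketch under the assumption that documents have `age` and `name` fields; `oldest_customers` is a hypothetical helper that expects a PyMongo collection.

```python
def oldest_customers(collection, n):
    """Return the n oldest customers, breaking age ties alphabetically by name."""
    cursor = (
        collection.find()
        .sort([("age", -1), ("name", 1)])  # age descending, then name ascending
        .limit(n)
    )
    return list(cursor)
```

The order of the pairs matters: the first pair is the primary sort key, and later pairs only decide the order among documents that tie on the earlier ones.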

# Q7. Explain why `delete_one()`, `delete_many()`, and `drop()` are used.

A7. In MongoDB, the `delete_one()` and `delete_many()` methods are used to remove documents from a collection based on specified criteria. The `drop()` method is used to completely remove an entire collection from the database.

- `delete_one(filter)` deletes a single document that matches the specified filter criteria. If multiple documents match the filter, it deletes only the first matching document encountered.

- `delete_many(filter)` deletes multiple documents that match the specified filter criteria. All documents that match the filter will be removed from the collection.

- `drop()` removes an entire collection from the database. It permanently deletes the collection, all the documents it contains, and its indexes. The operation is not reversible.

Use these methods when you need to remove specific documents or an entire collection. Deletions cannot be undone, so check the filter first (for example, by running `find()` with the same filter) and make sure you have backups before deleting data. Be especially careful with an empty filter: `delete_many({})` removes every document in the collection.
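The difference between `delete_one()` and `delete_many()` can be sketched as follows. The helper `remove_customers` and the `city` field are assumptions for illustration; the function expects a PyMongo collection and returns the `deleted_count` reported by the result object.

```python
def remove_customers(collection, city, only_first=False):
    """Delete customers from the given city and return how many were removed."""
    query = {"city": city}
    if only_first:
        # Removes at most one document, even if several match.
        result = collection.delete_one(query)
    else:
        # Removes every document that matches the filter.
        result = collection.delete_many(query)
    return result.deleted_count
```

To remove the whole collection rather than selected documents, call `mycol.drop()` on the collection object instead.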