Q1. What is MongoDB? Explain non-relational databases in short. In which scenarios is it preferred to use MongoDB over SQL databases?

Answer:
MongoDB is an open-source, document-oriented database that is part of the NoSQL database family. Instead of storing data in tables with rows and columns like traditional relational databases (SQL), MongoDB stores data in flexible, JSON-like documents.
Non-relational databases (NoSQL) are databases that do not use the traditional table-based relational model. They are designed for handling large volumes of unstructured or semi-structured data and are known for their flexibility, scalability, and performance. Common types include document stores (like MongoDB), key-value stores, wide-column stores, and graph databases.
MongoDB is preferred over SQL databases in the following scenarios:
Unstructured or Semi-structured Data: When dealing with data that doesn't fit neatly into a fixed schema, such as user-generated content, logs, or IoT data.
Agile Development: The flexible, schema-less nature allows for rapid iteration and changes to the data model as application requirements evolve.
Horizontal Scalability: MongoDB is designed to scale out easily across multiple servers using a technique called sharding, making it ideal for applications with large datasets and high traffic.
Hierarchical Data: Data that has a nested or tree-like structure can be stored naturally in a single document, so it can be read back in one lookup instead of being reassembled by joining multiple tables as in SQL.
Big Data and Real-time Analytics: High read and write throughput, together with built-in aggregation, makes it suitable for big data applications and real-time analytics.
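
To make the hierarchical-data point concrete, here is a minimal sketch in plain Python (no database needed). The order, its field names, and the row layout are hypothetical and chosen only for illustration. It contrasts one nested, JSON-like document with the flat rows a relational design would typically use.

```python
import json

# A hypothetical order stored as ONE nested, JSON-like document.
order_document = {
    "order_id": 1001,
    "customer": {"name": "John Doe", "city": "New York"},
    "items": [
        {"product": "Keyboard", "qty": 1, "price": 45.0},
        {"product": "Mouse", "qty": 2, "price": 15.0},
    ],
}

# The same data split the relational way: rows in two tables,
# linked by order_id and reassembled with a join at query time.
orders_rows = [(1001, "John Doe", "New York")]
order_items_rows = [
    (1001, "Keyboard", 1, 45.0),
    (1001, "Mouse", 2, 15.0),
]


def order_total(doc):
    """Compute the order total directly from the nested document."""
    return sum(item["qty"] * item["price"] for item in doc["items"])


print(json.dumps(order_document, indent=2))
print("Total:", order_total(order_document))
```

Everything needed to compute the total lives inside the single document, whereas the relational layout needs both tables.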


Q2. State and Explain the features of MongoDB.

Answer:
The key features of MongoDB are:
Document-Oriented: Data is stored in BSON (Binary JSON) documents, which map naturally to objects in application code, making it intuitive for developers to work with.
Schema-less: Documents within the same collection do not need to have the same set of fields or structure. This flexibility allows the data model to evolve over time.
High Performance: MongoDB supports fast read and write operations. This comes from features such as indexes, the storage engine's in-memory cache, and the ability to embed related data in a single document, which avoids many joins.
High Availability: MongoDB's replica sets provide automatic failover and data redundancy. A replica set is a group of MongoDB instances that host the same data set, ensuring that if one server fails, the others can take over.
Horizontal Scalability (Sharding): MongoDB can be scaled horizontally using sharding. Sharding distributes data across multiple servers, allowing it to handle massive datasets and high throughput that would be difficult for a single server to manage.
Rich Query Language: MongoDB supports a powerful query language that allows for filtering, sorting, and projecting data, as well as performing aggregations to process and analyze data.
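
The idea behind sharding can be sketched in plain Python. This is a toy illustration only: the function names, the `user_id` shard key, and the hash-modulo scheme are assumptions made for the example. MongoDB's real mechanism (ranges of shard key values assigned to shards, with queries routed to the right shard) is more involved.

```python
import hashlib


def shard_for(shard_key_value, num_shards):
    """Map a shard key value to a shard index by hashing it (toy scheme)."""
    digest = hashlib.md5(str(shard_key_value).encode("utf-8")).hexdigest()
    return int(digest, 16) % num_shards


def distribute(documents, shard_key, num_shards):
    """Group documents into per-shard lists based on their shard key."""
    shards = {index: [] for index in range(num_shards)}
    for doc in documents:
        shards[shard_for(doc[shard_key], num_shards)].append(doc)
    return shards


# Hypothetical documents; 'user_id' is an assumed shard key.
documents = [{"user_id": n, "name": f"user{n}"} for n in range(1, 13)]
shards = distribute(documents, "user_id", num_shards=3)

for index, docs in shards.items():
    print(f"shard {index}: {[d['user_id'] for d in docs]}")
```

Because the same key always hashes to the same shard, a lookup by `user_id` only needs to visit one of the three lists rather than all of them.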

Q3. Write code to connect to MongoDB from Python. Also, create a database and a collection in MongoDB.

In [None]:
!pip install pymongo
import pymongo


# The port is redacted here; replace "xxxxx" with your server's port
client = pymongo.MongoClient("mongodb://localhost:xxxxx")

# Get a handle to a database named 'mydatabase'
# MongoDB creates the database lazily, on the first write operation
db = client['mydatabase']

# Get a handle to a collection named 'mycollection'
# MongoDB creates the collection lazily, on the first write operation
collection = db['mycollection']

# MongoClient connects lazily, so ping the server to confirm it is reachable
client.admin.command("ping")
print("Connected to MongoDB.")
print("Database 'mydatabase' and collection 'mycollection' selected.")

# The collection shows up in this list only after the first write
print("Collections in 'mydatabase':", db.list_collection_names())

Collecting pymongo
  Downloading pymongo-4.13.2-cp310-cp310-win_amd64.whl.metadata (22 kB)
Collecting dnspython<3.0.0,>=1.16.0 (from pymongo)
  Downloading dnspython-2.7.0-py3-none-any.whl.metadata (5.8 kB)
Downloading pymongo-4.13.2-cp310-cp310-win_amd64.whl (800 kB)
Downloading dnspython-2.7.0-py3-none-any.whl (313 kB)
Installing collected packages: dnspython, pymongo

ServerSelectionTimeoutError: localhost:27017: [WinError 10061] No connection could be made because the target machine actively refused it (configured timeouts: socketTimeoutMS: 20000.0ms, connectTimeoutMS: 20000.0ms), Timeout: 30s, Topology Description: <TopologyDescription id: 6875fca9076ade3ca9ac4679, topology_type: Unknown, servers: [<ServerDescription ('localhost', 27017) server_type: Unknown, rtt: None, error=AutoReconnect('localhost:27017: [WinError 10061] No connection could be made because the target machine actively refused it (configured timeouts: socketTimeoutMS: 20000.0ms, connectTimeoutMS: 20000.0ms)')>]>

Q4. Using the database and the collection created in question number 3, write code to insert one record and to insert many records. Use the find() and find_one() methods to print the inserted records.

In [None]:
import pymongo

client = pymongo.MongoClient("mongodb://localhost:xxxxx")
db = client['mydatabase']
collection = db['mycollection']


# 1. Insert one record
one_record = {
    "name": "John Doe",
    "age": 30,
    "city": "New York"
}
collection.insert_one(one_record)
print("One record inserted.")

# 2. Insert many records
many_records = [
    {"name": "Jane Smith", "age": 25, "city": "Los Angeles"},
    {"name": "Peter Jones", "age": 35, "city": "Chicago"},
    {"name": "Alice Williams", "age": 28, "city": "Houston"}
]
collection.insert_many(many_records)
print("Many records inserted.")

# 3. Use find_one() to print one record
print("\n--- Using find_one() ---")
found_one = collection.find_one({"name": "John Doe"})
print(found_one)

# 4. Use find() to print all inserted records
print("\n--- Using find() ---")
all_records = collection.find()
for record in all_records:
    print(record)

# Close the connection
client.close()

Q5. Explain how you can use the find() method to query the MongoDB database. Write simple code to demonstrate this.

Answer:
The find() method is used to retrieve multiple documents from a collection that match a specific query filter. It returns a cursor, which is an iterable object that you can loop through to access the matching documents.
You can pass a query document (a dictionary) to find() to specify the criteria for the documents you want to retrieve. An empty query document {} will match all documents in the collection.

In [None]:
import pymongo

client = pymongo.MongoClient("mongodb://localhost:27017/")
db = client['mydatabase']
collection = db['mycollection']

# Query for documents where 'age' is greater than 28
# The '$gt' operator means 'greater than'
query = {"age": {"$gt": 28}}

print("Finding documents where age > 28:")
results = collection.find(query)

for document in results:
    print(document)

client.close()
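
To see what a query document such as {"age": {"$gt": 28}} selects without needing a running server, the following sketch emulates a small part of find() in plain Python. The `matches` and `find` helpers are hypothetical stand-ins written for this illustration, not PyMongo functions, and they cover only equality and a few comparison operators. The sample documents mirror the records inserted in Q4.

```python
# Sample documents mirroring the records inserted in Q4.
documents = [
    {"name": "John Doe", "age": 30, "city": "New York"},
    {"name": "Jane Smith", "age": 25, "city": "Los Angeles"},
    {"name": "Peter Jones", "age": 35, "city": "Chicago"},
    {"name": "Alice Williams", "age": 28, "city": "Houston"},
]

# Only a handful of query operators are emulated here.
OPERATORS = {
    "$gt": lambda value, target: value > target,
    "$gte": lambda value, target: value >= target,
    "$lt": lambda value, target: value < target,
    "$lte": lambda value, target: value <= target,
    "$in": lambda value, target: value in target,
}


def matches(document, query):
    """Return True if the document satisfies every condition in the query."""
    for field, condition in query.items():
        if field not in document:
            return False
        value = document[field]
        if isinstance(condition, dict):
            # Operator form, e.g. {"age": {"$gt": 28}}
            if not all(OPERATORS[op](value, target)
                       for op, target in condition.items()):
                return False
        elif value != condition:
            # Plain equality form, e.g. {"city": "Chicago"}
            return False
    return True


def find(docs, query):
    """In-memory stand-in for collection.find(query)."""
    return [doc for doc in docs if matches(doc, query)]


print([d["name"] for d in find(documents, {"age": {"$gt": 28}})])
print([d["name"] for d in find(documents, {"age": {"$gte": 25, "$lt": 30}})])
print([d["name"] for d in find(documents, {"city": {"$in": ["Chicago", "Houston"]}})])
```

Conditions on several fields, or several operators on one field, must all hold for a document to match. An empty query {} has no conditions, so it matches every document.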

Q6. Explain the sort() method. Give an example to demonstrate sorting in MongoDB.
Answer:
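The sort() method is used to order the results returned by a find() query. It is chained to the find() method. In PyMongo it takes a field name and a direction, or a list of (field, direction) pairs when sorting by several fields; in the mongo shell it takes a document that maps each field to a direction.
The sort order is specified with:
1 (pymongo.ASCENDING) for ascending order.
-1 (pymongo.DESCENDING) for descending order.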

In [None]:
import pymongo

client = pymongo.MongoClient("mongodb://localhost:27017/")
db = client['mydatabase']
collection = db['mycollection']

# Find all documents and sort by age in descending order
print("Sorting documents by age (descending):")
sorted_results = collection.find().sort("age", -1)

for document in sorted_results:
    print(document)
    
client.close()
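
Sorting by more than one field is easier to follow with a small in-memory sketch. The `apply_sort` helper below is a hypothetical stand-in that mimics passing a list of (field, direction) pairs to sort(); it is not part of PyMongo. The documents mirror the Q4 records, plus one made-up record ("Bob Brown") added so that two people share the same age.

```python
# Q4 records plus one hypothetical record that creates a tie on 'age'.
documents = [
    {"name": "John Doe", "age": 30, "city": "New York"},
    {"name": "Jane Smith", "age": 25, "city": "Los Angeles"},
    {"name": "Peter Jones", "age": 35, "city": "Chicago"},
    {"name": "Alice Williams", "age": 28, "city": "Houston"},
    {"name": "Bob Brown", "age": 30, "city": "Boston"},
]

# Same values as pymongo.ASCENDING and pymongo.DESCENDING.
ASCENDING, DESCENDING = 1, -1


def apply_sort(docs, sort_spec):
    """Order docs by a list of (field, direction) pairs."""
    ordered = list(docs)
    # Python's sort is stable, so sorting by the last key first and the
    # first key last produces the correct multi-key ordering.
    for field, direction in reversed(sort_spec):
        ordered.sort(key=lambda doc: doc[field],
                     reverse=(direction == DESCENDING))
    return ordered


by_age_then_name = apply_sort(documents, [("age", DESCENDING), ("name", ASCENDING)])
for doc in by_age_then_name:
    print(doc["age"], doc["name"])
```

With a live collection, the corresponding call would be collection.find().sort([("age", -1), ("name", 1)]): the first pair is the primary sort key and later pairs break ties.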

Q7. Explain why delete_one(), delete_many(), and drop() are used.
Answer:
These three methods are used for removing data from MongoDB, but they operate at different levels:

delete_one(filter):
Purpose: To delete a single document from a collection.
How it works: It finds the first document that matches the provided filter and removes it. If multiple documents match the filter, only the first one encountered is deleted.
Use Case: Deleting a specific, unique record, like a user with a specific ID.

delete_many(filter):
Purpose: To delete multiple documents from a collection.
How it works: It finds all documents that match the provided filter and removes them.
Use Case: Cleaning up data, such as deleting all logs older than 30 days or removing all records associated with a deactivated account.

drop():
Purpose: To delete an entire collection.
How it works: This method is called on a collection object (e.g., db.mycollection.drop()) and completely removes the collection and all of its documents and indexes.
Use Case: When a collection is no longer needed. This is usually much more efficient than using delete_many({}) to delete all documents, because the collection is removed as a whole rather than document by document, and its indexes and metadata are removed along with it.
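
The difference between the three methods can be sketched with a tiny in-memory stand-in. `TinyCollection` is a hypothetical class written for this illustration; it supports equality filters only and returns plain integers, whereas PyMongo's delete_one() and delete_many() return a result object whose deleted_count attribute holds the number of removed documents. The log records are made-up sample data.

```python
class TinyCollection:
    """Hypothetical in-memory stand-in for a collection (equality filters only)."""

    def __init__(self, documents):
        self.documents = list(documents)
        self.dropped = False

    def _matches(self, document, query):
        return all(document.get(field) == value for field, value in query.items())

    def delete_one(self, query):
        """Remove the first matching document; return how many were removed."""
        for index, document in enumerate(self.documents):
            if self._matches(document, query):
                del self.documents[index]
                return 1
        return 0

    def delete_many(self, query):
        """Remove every matching document; return how many were removed."""
        before = len(self.documents)
        self.documents = [d for d in self.documents if not self._matches(d, query)]
        return before - len(self.documents)

    def drop(self):
        """Discard the whole collection in one step."""
        self.documents = []
        self.dropped = True


logs = TinyCollection([
    {"level": "debug", "msg": "a"},
    {"level": "debug", "msg": "b"},
    {"level": "error", "msg": "c"},
    {"level": "debug", "msg": "d"},
])

removed_one = logs.delete_one({"level": "debug"})    # removes only "a"
removed_many = logs.delete_many({"level": "debug"})  # removes "b" and "d"
remaining = [d["msg"] for d in logs.documents]
print(removed_one, removed_many, remaining)

logs.drop()
print(logs.documents, logs.dropped)
```

delete_one() stops at the first match even though three records have level "debug", delete_many() clears the remaining matches, and drop() discards whatever is left without looking at any filter.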