# Week 6 - Assignment 2 (MongoDB Assignment) Solutions


## Q1. What is MongoDB? Explain non-relational databases in short. In which scenarios it is preferred to use MongoDB over SQL databases?


**Answer -**
MongoDB is a popular document-oriented NoSQL (Not Only SQL) database that stores data in flexible, JSON-like documents, encoded in a binary format called BSON (Binary JSON), and lets us work with that data efficiently. It is designed to provide high performance, scalability, and flexibility for modern applications that require dynamic, complex data structures.

Non-relational databases, also known as NoSQL (Not Only SQL) databases, do not use the table-based relational model of SQL databases. Instead, they use flexible data models, such as key-value, document-oriented, or graph-based, that can handle large amounts of semi-structured and unstructured data. This makes them well suited to big data and real-time data processing, and they can be more flexible and scalable than traditional SQL databases.

MongoDB is preferred over SQL databases in scenarios where:

* **Flexibility and scalability are important**: Documents in a collection do not have to share a fixed schema, and data can be distributed across multiple servers, which makes MongoDB a good choice for handling large and complex data sets.

* **Agile development**: MongoDB's flexible schema allows for more agile and iterative development, as changes can be made to the database schema without requiring a complete redesign of the application.

* **Big Data and real-time analytics**: MongoDB is well-suited for big data and real-time analytics, as it can easily handle unstructured and semi-structured data.

* **Cloud-based applications**: MongoDB is often used for cloud-based applications, as it can easily scale to meet the demands of a growing user base.

* **Geospatial data**: MongoDB has built-in support for geospatial data, making it a popular choice for applications that involve location-based services.

## Q2. State and Explain the features of MongoDB.


**Answer -**

MongoDB is a popular document-oriented NoSQL database that provides a wide range of features that make it a popular choice for modern application development. In MongoDB, data is organized into a hierarchy of **databases, collections, and documents**.

A **database** is a container for a set of collections, each with its own set of documents. You can have multiple databases within a single MongoDB server instance.

A **collection** is a group of MongoDB documents, similar to a table in a relational database. Collections are used to group related documents together and typically have a common schema, although MongoDB's flexible document model allows for some variation within a collection. Each collection within a MongoDB database is given a unique name that is used to identify and access the collection.

A **document** is a set of key-value pairs that is stored within a collection. A document is similar to a record in a relational database, but it is stored in a flexible and dynamic format known as BSON (Binary JSON). Each document can have its own unique structure and can include nested sub-documents and arrays.
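
As a rough illustration of the document model described above, the sketch below builds two documents as plain Python dictionaries, which is how pymongo represents documents. All field names and values are made up for illustration, and a plain list stands in for a collection; no database is involved.

```python
import json

# Hypothetical document: the field names are illustrative only.
student = {
    "name": "Asha",
    "courses": ["MongoDB", "Python"],                     # array field
    "address": {"city": "Kolkata", "country": "India"},   # nested sub-document
}

# A second document in the same "collection" may carry different fields.
other = {"name": "Ravi", "phone": "1212121212"}

# A plain list standing in for a collection of documents.
collection = [student, other]

# Documents serialize naturally to JSON-like text.
print(json.dumps(student, indent=2))

# Nested fields are reached by walking the sub-document.
print(student["address"]["city"])
```

The two dictionaries share only the `name` field, which mirrors how documents in one collection can vary in structure.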

Some important features of MongoDB are:

* **Document-based data model**: MongoDB stores data in documents, which are similar to JSON objects. This allows for a flexible and scalable data model that can easily handle complex data structures.
* **Scalability**: MongoDB is highly scalable and can handle large amounts of data and traffic with ease. It supports sharding, which allows you to distribute data across multiple servers for improved performance and availability.
* **High availability**: MongoDB provides automatic failover and replica sets to ensure that your data is always available, even in the event of hardware or network failures.
* **Flexible schema**: MongoDB has a flexible schema, which means that you can easily modify your data model as your application evolves.
* **Rich query language**: MongoDB provides a powerful and flexible query language that allows you to perform complex queries on your data.
* **Aggregation framework**: MongoDB provides a built-in aggregation framework that allows you to perform complex data processing tasks such as grouping, filtering, and transforming data.
* **Geospatial support**: MongoDB provides built-in support for geospatial data, allowing you to easily perform queries based on location.
* **Full-text search**: MongoDB provides full-text search capabilities that allow you to perform text-based searches on your data.
* **ACID transactions**: MongoDB supports multi-document ACID transactions (since version 4.0), allowing you to perform multiple operations on your data as a single, atomic transaction.

## Q3. Write a code to connect MongoDB to Python. Also, create a database and a collection in MongoDB.


**Answer -**

To use MongoDB, we first need a MongoDB server to connect to: either a local installation (which can be browsed with the MongoDB Compass desktop application) or a hosted cluster on the MongoDB Atlas web application. Suppose we use Atlas. After logging in at mongodb.com and setting up the account, we go to Database -> Connect -> Driver, select Python as the driver together with version 3.6, and copy the connection string URL.

Then, on our local system, we first install the pymongo library.

In [None]:
# Installing library.
%pip install pymongo

Next, we import the pymongo module.

In [None]:
# Importing pymongo
import pymongo

Next, we connect to MongoDB using the MongoClient() class of pymongo. We create a connection variable and paste in the copied connection string URL. **Note** that the password placeholder inside the connection string must be replaced with the actual password of the database user.

In [None]:
# Connecting MongoDB with Python
client=pymongo.MongoClient("mongodb+srv://subhajits:subhajits@cluster0.v2rp7tf.mongodb.net/?retryWrites=true&w=majority")
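
Hardcoding a password in a notebook is easy to leak, and special characters in a username or password have to be percent-escaped before they go into the URI. The sketch below is one possible way to assemble the connection string from environment variables. The variable names `MONGO_USER`, `MONGO_PASSWORD` and `MONGO_HOST`, the helper `build_uri`, and the fallback values are all assumptions made for illustration, not part of the original setup.

```python
import os
from urllib.parse import quote_plus


def build_uri(user: str, password: str, host: str) -> str:
    """Return a mongodb+srv URI with the credentials percent-escaped."""
    return (
        f"mongodb+srv://{quote_plus(user)}:{quote_plus(password)}@{host}"
        "/?retryWrites=true&w=majority"
    )


# Hypothetical environment variable names; the fallbacks are placeholders,
# not real credentials.
uri = build_uri(
    os.environ.get("MONGO_USER", "demo_user"),
    os.environ.get("MONGO_PASSWORD", "p@ss/word"),
    os.environ.get("MONGO_HOST", "cluster0.example.mongodb.net"),
)

# The resulting string can then be passed to pymongo:
# client = pymongo.MongoClient(uri)
```

With this approach the password stays out of the notebook itself, and characters such as `@` or `/` in it no longer break the URI.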

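Now we can select a database. Say our database name is 'pwDB'. We access it through the connection variable 'client' and store the handle in a database variable 'db'. MongoDB creates databases lazily, so 'pwDB' only appears on the server once the first document is written to it.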

In [None]:
#creating DB
db = client['pwDB']

Next, we use the database variable db to get a collection 'my_collection' within it, and we store it in a collection variable 'my_coll'. The collection is only actually created on the server when its first document is inserted.

In [None]:
my_coll = db['my_collection']

## Q4. Using the database and the collection created in question number 3, write a code to insert one record, and insert many records. Use the find() and find_one() methods to print the inserted record.


**Answer -**
Continuing from the previous code, we create a variable `data1` holding one record and another variable `data2` holding a list of records.

In [None]:
# Creating one record
data1 = { "name" : "Subhajit Sarkar",
          "mail_id" : "subhajitsakar@gmail.com" , 
          "phone" : "1212121212", 
          "address" : "Kolkata, India"}

In [None]:
# Creating multiple records - in a list
data2 = [
{ "name" : "Susanta Sarkar",
          "mail_id" : "susantasakar@gmail.com" , 
          "phone" : "2323232323", 
          "address" : "Purulia, India"},
{ "name" : "Chhabi Sarkar",
          "mail_id" : "chhabisakar@gmail.com" , 
          "phone" : "3434343434", 
          "address" : "Purulia, India"},
{ "name" : "Subhadeep Sarkar",
          "mail_id" : "subhadeepsakar@gmail.com" , 
          "phone" : "4545454545", 
          "address" : "Delhi, India"},
{ "name" : "Soumita Sarkar",
          "mail_id" : "soumitasakar@gmail.com" , 
          "phone" : "5656565656", 
          "address" : "Kolkata, India"}
]

Next we insert data1 and data2 into the collection `my_collection` through the collection variable `my_coll`, using ***collection_variable.insert_one(data_variable)*** for a single document and ***collection_variable.insert_many(data_variable)*** for a list of documents.

In [None]:
# Inserting single data into collection
x = my_coll.insert_one(data1)

In [None]:
# Inserting multiple data into collection
y = my_coll.insert_many(data2)
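
Both insert methods return a result object that carries the generated `_id` values: `inserted_id` for `insert_one()` and `inserted_ids` for `insert_many()`. The helper below is a minimal sketch of reading them back. The name `insert_and_report` is hypothetical, and the function expects an already connected pymongo collection to be passed in.

```python
from typing import Any


def insert_and_report(coll: Any, one: dict, many: list) -> tuple:
    """Insert one document, then a batch, and return the generated ids."""
    single = coll.insert_one(one)     # pymongo returns an InsertOneResult
    batch = coll.insert_many(many)    # pymongo returns an InsertManyResult
    return single.inserted_id, batch.inserted_ids


# Usage against the collection from Q3 (requires a live connection and
# dictionaries that have not been inserted yet):
# first_id, other_ids = insert_and_report(my_coll, new_record, new_records)
# print(first_id, other_ids)
```

pymongo adds the generated `_id` to each dictionary in place, so passing dictionaries that were already inserted would raise a duplicate key error.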

Next we print all the records using ***collection_variable.find()***. It returns a cursor, stored here in `result`; iterating over the cursor yields the documents. To fetch only one record, we use ***collection_variable.find_one()***, which returns a single document (or None if nothing matches).

In [None]:
# To fetch all the records,
result = my_coll.find()

# Iterating over output object,
for i in result:
    print(i)

In [None]:
# To fetch only one record (first record)
my_coll.find_one()

## Q5. Explain how you can use the find() method to query the MongoDB database. Write a simple code to demonstrate this.


**Answer -**
The **find()** method queries a collection and retrieves the documents that match the given criteria. It takes an optional query (filter) object, typically a dictionary of key-value pairs in which each key names a field of the document and each value is either the value to match or an operator expression such as `{"$gt": ...}`. Calling find() with no filter returns every document in the collection.
The find() method supports a wide variety of search criteria, including exact matches, ranges, regular expressions, and more. It returns a cursor that can be iterated over to access the matching documents. Consider the following example: we create another collection inside the database and insert multiple records into it.

In [None]:
# setting up
import pymongo

client = pymongo.MongoClient("mongodb+srv://subhajits:subhajits@cluster0.v2rp7tf.mongodb.net/?retryWrites=true&w=majority")

# Selecting the database from Q3 again and creating a new collection in it
db = client["pwDB"]
col1 = db["col1"]

In [None]:
# multiple data

data2 = [
{ "name" : "Subhajit Sarkar",
          "mail_id" : "subhajitsakar@gmail.com" , 
          "phone" : "2323232323", 
          "age" : "30"},
{ "name" : "Susanta Sarkar",
          "mail_id" : "susantasakar@gmail.com" , 
          "phone" : "2323232323", 
          "age" : "72"},
{ "name" : "Chhabi Sarkar",
          "mail_id" : "chhabisakar@gmail.com" , 
          "phone" : "3434343434", 
          "age" : "68"},
{ "name" : "Subhadeep Sarkar",
          "mail_id" : "subhadeepsakar@gmail.com" , 
          "phone" : "4545454545", 
          "age" : "33"},
{ "name" : "Soumita Sarkar",
          "mail_id" : "soumitasakar@gmail.com" , 
          "phone" : "5656565656", 
          "age" : "28"}
]

In [None]:
# Inserting multiple data into collection
a = col1.insert_many(data2)

In [None]:
# To print all the records,
for record in col1.find():
    print(record)

# To filter records by an exact field value
res1 = col1.find({ "name": "Subhajit Sarkar" })
for record in res1:
    print(record)

res2 = col1.find({ "age": "30" })
for record in res2:
    print(record)

# To filter records with a GREATER THAN OR EQUAL criterion - age of 30 or more
# (ages are stored as strings here, so the comparison is made on strings)
res3 = col1.find({ "age": { "$gte": "30" } })
for record in res3:
    print(record)

# To filter records with GREATER THAN & LESS THAN criteria - age greater than 30 and less than 70
res4 = col1.find({ "age": { "$gt": "30", "$lt": "70" } })
for record in res4:
    print(record)
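
One caveat with the range queries above: the ages are stored as strings, and strings are compared character by character rather than by numeric value. The pure-Python sketch below shows the effect with ordinary string comparison, which behaves the same way for digit strings. The extra age `"9"` is a hypothetical value added to expose the problem, and no database is involved.

```python
# Ages stored as strings compare character by character.
string_ages = ["30", "72", "68", "33", "28", "9"]
print([a for a in string_ages if a > "30"])   # ['72', '68', '33', '9']

# Ages stored as integers compare by value.
int_ages = [int(a) for a in string_ages]
print([a for a in int_ages if a > 30])        # [72, 68, 33]

# A filter with numeric bounds, in the same shape find() accepts.
age_filter = {"age": {"$gt": 30, "$lt": 70}}
```

The string comparison wrongly keeps `"9"`, so storing ages as integers and querying with numeric bounds, as in `age_filter`, is usually the safer choice.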

## Q6. Explain the sort() method. Give an example to demonstrate sorting in MongoDB.


**Answer -**
The sort() method orders the results of a query in ascending or descending order based on one or more fields. In pymongo it is called on the cursor returned by find(), takes either a single field name with an optional direction or a list of (field, direction) pairs, and returns the cursor so that the sorted results can be iterated over. The default direction is ascending (1); pass -1 to sort in descending order. Continuing the example from Q5, we can sort the documents in the collection by ascending or descending age.

In [None]:
# Sorting data by age in Ascending order

res1 = col1.find().sort("age")
for record in res1:
    print(record)
    
# Sorting data by age in Descending order
res2 = col1.find().sort("age", -1)
for record in res2:
    print(record)

# Sorting data by name and age
res3 = col1.find().sort([("name", 1), ("age", -1)])
for record in res3:
    print(record)
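
To make the meaning of a multi-field sort concrete, the sketch below emulates a specification such as `[("name", 1), ("age", -1)]` on a plain Python list. The records and the helper `sort_records` are hypothetical; this is an emulation of the ordering, not how MongoDB implements it.

```python
def sort_records(records, spec):
    """Order records by a list of (field, direction) pairs, 1 or -1."""
    result = list(records)
    # Apply the keys from last to first: Python's sort is stable, so the
    # first key in the spec ends up as the primary ordering.
    for field, direction in reversed(spec):
        result.sort(key=lambda r: r[field], reverse=(direction == -1))
    return result


# Hypothetical records with numeric ages.
records = [
    {"name": "B", "age": 30},
    {"name": "A", "age": 25},
    {"name": "A", "age": 40},
]

ordered = sort_records(records, [("name", 1), ("age", -1)])
print([(r["name"], r["age"]) for r in ordered])  # [('A', 40), ('A', 25), ('B', 30)]
```

Records are ordered by name first, and age only breaks ties between records with the same name. pymongo also provides the constants `pymongo.ASCENDING` and `pymongo.DESCENDING` as readable alternatives to 1 and -1.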

## Q7. Explain why delete_one(), delete_many(), and drop() is used.


**Answer -**
To demonstrate these functions, we use the example from Q5.

**delete_one()**: This method deletes the first document that matches the specified filter criteria. If multiple documents match the filter, only the first one is deleted. This method is useful when you want to delete a single document from a collection based on a specific filter criteria. For example, we use delete_one() to remove record with "name" : "Subhajit Sarkar"

In [None]:
# Deleting Record
col1.delete_one({"name": "Subhajit Sarkar"})

In [None]:
# Checking Record for confirmation
for record in col1.find():
    print(record)

**delete_many()**: This method deletes all documents that match the specified filter criteria. It is useful when you want to remove multiple documents from a collection in one call. For example, we use delete_many() to remove all records having age > 40.

In [None]:
# To delete records in which age > 40 (ages are stored as strings)
col1.delete_many({"age": {"$gt": "40"}})

In [None]:
# Checking Record for confirmation
for record in col1.find():
    print(record)
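
Instead of re-reading the collection to confirm a deletion, the result object returned by `delete_one()` and `delete_many()` can be inspected: its `deleted_count` attribute reports how many documents were removed. The helper below is a minimal sketch; the name `delete_matching` and its `only_first` parameter are hypothetical, and it expects an already connected pymongo collection.

```python
from typing import Any


def delete_matching(coll: Any, query: dict, only_first: bool = False) -> int:
    """Delete the first match or all matches and return how many were removed."""
    result = coll.delete_one(query) if only_first else coll.delete_many(query)
    return result.deleted_count  # attribute of pymongo's DeleteResult


# Usage against the collection from Q5 (requires a live connection):
# removed = delete_matching(col1, {"name": "Subhajit Sarkar"}, only_first=True)
# print(removed)
```

A count of 0 means the filter matched nothing, which is often a quicker check than printing every remaining record.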

**drop()**: This method deletes an entire collection along with all of its documents. For example, you might use drop() to remove a collection that is no longer needed.

In [None]:
# Deleting the entire collection
col1.drop()

**Finally we close the connection by using connection_variable.close()**

In [None]:
client.close()