# Assignment 16 - Feb 17 '23 - MongoDB

### 1. What is MongoDB? Explain non-relational databases in short. In which scenarios is it preferred to use MongoDB over SQL databases?

#### What is MongoDB?
* MongoDB is an open-source, document-oriented database designed to store large volumes of data and to work with that data efficiently. It is classified as a NoSQL (Not only SQL) database because it stores and retrieves data as documents rather than as rows in tables.

#### Non-relational databases
* A non-relational database does not use the table/key model that relational database management systems (RDBMS) promote. Such databases rely on data models and processing techniques designed for the big-data problems that large companies face. Non-relational databases are commonly grouped under the name NoSQL (Not Only SQL).
* Non-relational databases might be based on data structures like documents. A document can be highly detailed while containing a range of different types of information in different formats. This ability to digest and organize various types of information side by side makes non-relational databases much more flexible than relational databases.
* As a document database, MongoDB makes it easy for developers to store structured or unstructured data. It stores documents in a JSON-like format that maps directly to native objects in most modern programming languages, so developers do not need to normalize data before storing it. MongoDB can also handle high volumes and can scale either vertically or horizontally to accommodate large data loads.
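
To make the "maps to native objects" point concrete, here is a minimal sketch using only the standard library. It assumes nothing about a running server: the document is a plain Python dict, and the fields skills and contact are hypothetical additions for illustration. MongoDB itself stores documents as BSON, a binary format that covers the JSON types and a few more.

```python
import json

# A document is nested key-value data; in Python it is a plain dict.
student = {
    "name": "Payas",
    "class": "Data Science Masters",
    "skills": ["python", "mongodb"],          # hypothetical array field
    "contact": {"city": "Example City"},      # hypothetical embedded sub-document
}

# Round-trip through JSON text to show that the structure survives unchanged.
as_json = json.dumps(student)
restored = json.loads(as_json)

print(restored == student)
print(restored["contact"]["city"])
```

A dict like this can be passed as-is to insert_one(); no table definition or column mapping is needed first.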

#### Scenarios where MongoDB is preferred over SQL databases
* MongoDB is well suited to storing unstructured data, which it organizes as documents.
* NoSQL data stores such as MongoDB are RDBMS alternatives that are useful for applications that scale massively and need fast access to large data sets.
* These databases can be simpler to implement than a regular RDBMS, since they store simple key-value or document-style binary objects that are serialized directly to disk.
* Many NoSQL data stores do not enforce a schema and relax the ACID guarantees, which lets them scale out and achieve fast reads and writes. MongoDB itself makes single-document writes atomic and has supported multi-document ACID transactions since version 4.0, but it does not require a fixed schema.
* MongoDB is best used for:
> * storing unstructured data
> * high write loads
> * unstable schemas
> * datasets that are expected to grow large (scale)
> * location-based data
> * high availability in unstable environments
> * teams without dedicated database administrators

### 2. State and explain the features of MongoDB.

* **Schema-less Database** : 
> A schema-less database means that one collection can hold different types of documents. In MongoDB, a single collection can hold multiple documents that differ in their number of fields, content, and size; one document does not have to look like another, unlike rows in a relational table. This gives MongoDB databases great flexibility.
* **Document Oriented** : 
> In MongoDB, data is stored in documents instead of tables as in an RDBMS. Within a document, data is stored in fields (key-value pairs) instead of rows and columns, which makes the data much more flexible than in an RDBMS. Each document has its own unique object id (_id).
* **Indexing** : 
> In MongoDB, any field in a document can be indexed with primary and secondary indexes, which makes searching the data easier and faster. Only the _id field is indexed by default. If a queried field is not indexed, the database must scan every document in the collection to evaluate the query, which is slow and inefficient.
* **Scalability**: 
> MongoDB provides horizontal scalability through sharding. Sharding distributes data across multiple servers: a large data set is partitioned into chunks using the shard key, and these chunks are distributed evenly across shards that reside on many physical servers. New machines can also be added to a running database.
* **Replication** : 
> MongoDB provides high availability and redundancy through replication. It keeps multiple copies of the data on different servers, so that if one server fails, the data can be retrieved from another.
* **Aggregation** : 
> Aggregation performs operations on grouped data and returns a single or computed result, similar to the SQL GROUP BY clause. MongoDB provides three kinds of aggregation: the aggregation pipeline, the map-reduce function, and single-purpose aggregation methods.
* **High Performance** : 
> MongoDB offers high performance and data persistence compared with other databases, owing to features such as scalability, indexing, and replication.
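
The schema-less and indexing features can be illustrated without a server. The following is only an in-memory sketch under stated assumptions: the documents are sample data (the third one is hypothetical), and the dict-based index stands in for the idea of an index rather than for MongoDB's actual index structures.

```python
from collections import defaultdict

# One "collection" holding documents with different fields (schema-less).
docs = [
    {"_id": 1, "name": "Amy", "address": "Apple st 652"},
    {"_id": 2, "name": "Hannah", "address": "Mountain 21"},
    {"_id": 3, "name": "Amy", "class": "Data Science Masters"},  # hypothetical document
]

def scan(documents, field, value):
    """Without an index: inspect every document in the collection."""
    return [d for d in documents if d.get(field) == value]

def build_index(documents, field):
    """With an index: map each field value to the documents that contain it."""
    index = defaultdict(list)
    for d in documents:
        if field in d:
            index[d[field]].append(d)
    return index

name_index = build_index(docs, "name")

# Both approaches find the same documents; the index avoids the full scan.
print(scan(docs, "name", "Amy") == name_index["Amy"])
```

In PyMongo, an index on a real collection is requested with create_index(), for example coll.create_index([("name", 1)]), where coll is any collection object.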

### 3. Write code to connect MongoDB to Python. Also, create a database and a collection in MongoDB.

In [3]:
pip install pymongo

Collecting pymongo
  Downloading pymongo-4.3.3-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (492 kB)
Collecting dnspython<3.0.0,>=1.16.0
  Downloading dnspython-2.3.0-py3-none-any.whl (283 kB)
Installing collected packages: dnspython, pymongo
Successfully installed dnspython-2.3.0 pymongo-4.3.3
Note: you may need to restart the kernel to use updated packages.


In [4]:
# code to connect MongoDB to Python
# and to create a database and a collection in MongoDB

import pymongo

# establish a connection with mongoDB
client = pymongo.MongoClient("mongodb+srv://payas:startpsn@cluster0.tbaeqts.mongodb.net/?retryWrites=true&w=majority")

# select the database (MongoDB creates it lazily, on the first insert)
db = client['pwskills']

# select the collection inside the database (also created on the first insert)
coll_create = db['my_record']
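
The connection string above contains a username and password. One common way to keep such secrets out of a notebook is to read the URI from an environment variable. The sketch below assumes a variable named MONGODB_URI and a local default address; both are hypothetical choices, not part of the original assignment.

```python
import os

def get_mongo_uri(default="mongodb://localhost:27017"):
    """Return the connection string from the environment.

    MONGODB_URI is a hypothetical variable name; the default points at a
    local server and is only an assumption for illustration.
    """
    return os.environ.get("MONGODB_URI", default)

uri = get_mongo_uri()

# The client is then created exactly as in the cell above:
# client = pymongo.MongoClient(uri)
```

With this approach the notebook can be shared without exposing credentials, and the same code runs against a local or a hosted cluster depending on the environment.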

### 4. Using the database and the collection created in question number 3, write code to insert one record and to insert many records. Use the find() and find_one() methods to print the inserted records.

In [4]:
# inserting records in collection

# inserting only one record
data = {
    "name" : "Payas",
    "class" : "Data Science Masters"
}

coll_create.insert_one(data)  # insert_one() will insert only one record at a time

<pymongo.results.InsertOneResult at 0x7f71366604c0>

In [6]:
# inserting multiple records
data1 = [
    { "name": "Amy", "address": "Apple st 652" },
    { "name": "Hannah", "address": "Mountain 21" },
    { "name": "Michael", "address": "Valley 345" },
    { "name": "Sandy", "address": "Ocean blvd 2" },
    { "name": "Betty", "address": "Green Grass 1" },
    { "name": "Richard", "address": "Sky st 331" },
    { "name": "Susan", "address": "One way 98" },
    { "name": "Vicky", "address": "Yellow Garden 2" },
    { "name": "Ben", "address": "Park Lane 38" },
    { "name": "William", "address": "Central st 954" },
    { "name": "Chuck", "address": "Main Road 989" },
    { "name": "Viola", "address": "Sideway 1633" }
]

coll_create.insert_many(data1)  # insert_many() will insert multiple records at a time

<pymongo.results.InsertManyResult at 0x7f7135e55270>
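
The result objects printed above are more useful than their default representation suggests: InsertOneResult exposes inserted_id and InsertManyResult exposes inserted_ids. A minimal sketch, assuming any PyMongo collection is passed in (the helper name insert_and_report is hypothetical):

```python
def insert_and_report(collection, one_doc, many_docs):
    """Insert documents and return the ids MongoDB assigned to them."""
    one_result = collection.insert_one(one_doc)       # InsertOneResult
    many_result = collection.insert_many(many_docs)   # InsertManyResult
    # inserted_id is a single _id; inserted_ids is a list, one per document.
    return one_result.inserted_id, many_result.inserted_ids
```

For example, insert_and_report(coll_create, data, data1) would return the ObjectId of the single record and the list of ObjectIds for the batch.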

In [7]:
# Fetching the records in collection

# find() to fetch all the records
for i in coll_create.find():
    print(i)

{'_id': ObjectId('63f267860bd662bc09b2e384'), 'name': 'Payas', 'class': 'Data Science Masters'}
{'_id': ObjectId('63f268450bd662bc09b2e386'), 'name': 'Amy', 'address': 'Apple st 652'}
{'_id': ObjectId('63f268450bd662bc09b2e387'), 'name': 'Hannah', 'address': 'Mountain 21'}
{'_id': ObjectId('63f268450bd662bc09b2e388'), 'name': 'Michael', 'address': 'Valley 345'}
{'_id': ObjectId('63f268450bd662bc09b2e389'), 'name': 'Sandy', 'address': 'Ocean blvd 2'}
{'_id': ObjectId('63f268450bd662bc09b2e38a'), 'name': 'Betty', 'address': 'Green Grass 1'}
{'_id': ObjectId('63f268450bd662bc09b2e38b'), 'name': 'Richard', 'address': 'Sky st 331'}
{'_id': ObjectId('63f268450bd662bc09b2e38c'), 'name': 'Susan', 'address': 'One way 98'}
{'_id': ObjectId('63f268450bd662bc09b2e38d'), 'name': 'Vicky', 'address': 'Yellow Garden 2'}
{'_id': ObjectId('63f268450bd662bc09b2e38e'), 'name': 'Ben', 'address': 'Park Lane 38'}
{'_id': ObjectId('63f268450bd662bc09b2e38f'), 'name': 'William', 'address': 'Central st 954'}
{'

In [9]:
# find_one() to fetch only one record
coll_create.find_one()  # this will fetch only one record, the first record

{'_id': ObjectId('63f267860bd662bc09b2e384'),
 'name': 'Payas',
 'class': 'Data Science Masters'}

### 5. Explain how you can use the find() method to query the MongoDB database. Write a simple code to demonstrate this.

* In MongoDB, the find() method selects documents in a collection and returns a cursor to the selected documents.
* A cursor is a pointer to the result set of a query. Iterating over the cursor returned by find() yields the selected documents one by one.
* Called without arguments, find() returns every document stored in the collection.
* It takes two optional parameters:
> * The first parameter of find() is a query object. The first example below uses an empty query object, which selects all documents in the collection. **Note**: This works like SELECT * without a WHERE clause.
> * The second parameter is an object that specifies which fields to include in the result. If it is omitted, all fields of each matching document are returned. A field is included when its value in this object is 1 and excluded when its value is 0.
* Syntax:
> * *find(query_object, specific_field_to_be_included)*

In [5]:
# code to demonstrate find()

# find() to fetch all the records
for i in coll_create.find():
    print(i)

{'_id': ObjectId('63f267860bd662bc09b2e384'), 'name': 'Payas', 'class': 'Data Science Masters'}
{'_id': ObjectId('63f268450bd662bc09b2e386'), 'name': 'Amy', 'address': 'Apple st 652'}
{'_id': ObjectId('63f268450bd662bc09b2e387'), 'name': 'Hannah', 'address': 'Mountain 21'}
{'_id': ObjectId('63f268450bd662bc09b2e388'), 'name': 'Michael', 'address': 'Valley 345'}
{'_id': ObjectId('63f268450bd662bc09b2e389'), 'name': 'Sandy', 'address': 'Ocean blvd 2'}
{'_id': ObjectId('63f268450bd662bc09b2e38a'), 'name': 'Betty', 'address': 'Green Grass 1'}
{'_id': ObjectId('63f268450bd662bc09b2e38b'), 'name': 'Richard', 'address': 'Sky st 331'}
{'_id': ObjectId('63f268450bd662bc09b2e38c'), 'name': 'Susan', 'address': 'One way 98'}
{'_id': ObjectId('63f268450bd662bc09b2e38d'), 'name': 'Vicky', 'address': 'Yellow Garden 2'}
{'_id': ObjectId('63f268450bd662bc09b2e38e'), 'name': 'Ben', 'address': 'Park Lane 38'}
{'_id': ObjectId('63f268450bd662bc09b2e38f'), 'name': 'William', 'address': 'Central st 954'}
{'

In [6]:
# find_one() to fetch only one record
coll_create.find_one()  # this will fetch only one record, the first record

{'_id': ObjectId('63f267860bd662bc09b2e384'),
 'name': 'Payas',
 'class': 'Data Science Masters'}

In [8]:
# find() to fetch specific record
for i in coll_create.find({'class' : 'Data Science Masters'}):
    print(i)

{'_id': ObjectId('63f267860bd662bc09b2e384'), 'name': 'Payas', 'class': 'Data Science Masters'}
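
The cells above use only the first parameter of find(). The following sketch combines a query object with the second parameter (the field selection) and also shows a comparison operator. It assumes a PyMongo collection is passed in; the helper names names_in_class and names_from are hypothetical.

```python
def names_in_class(collection, class_name):
    """Return only the names of documents whose 'class' field matches."""
    query = {"class": class_name}
    projection = {"_id": 0, "name": 1}   # exclude _id, include name
    return [doc["name"] for doc in collection.find(query, projection)]

def names_from(collection, first_letter):
    """Return names that sort at or after the given letter, using $gte."""
    query = {"name": {"$gte": first_letter}}
    return [doc["name"] for doc in collection.find(query, {"_id": 0, "name": 1})]
```

Called as names_in_class(coll_create, "Data Science Masters"), the first helper would return a list of plain strings instead of full documents.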


### 6. Explain the sort() method. Give an example to demonstrate sorting in MongoDB.

* The sort() method specifies the order in which the query returns the matching documents from the given collection. 
* You must apply this method to the cursor before retrieving any documents from the database.
* sort() accepts two parameters: the first is the field name and the second is the sort direction (ascending by default).
* A direction of 1 specifies an ascending sort and -1 a descending sort.
* Syntax:
> *sort(key_or_list, direction)*
* Parameter:
> * **key_or_list** : a single key or a list of (key, direction) pairs specifying the keys to sort on
> * **direction (optional)** : only used if key_or_list is a single key, if not given ASCENDING is assumed
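
Both forms of the key_or_list parameter described above can be sketched as follows. The code assumes a PyMongo collection is passed in; the helper names are hypothetical.

```python
def find_sorted_by_name(collection, descending=False):
    """Single key plus direction: 1 is ascending, -1 is descending."""
    direction = -1 if descending else 1
    return list(collection.find().sort("name", direction))

def find_sorted_multi(collection):
    """List of (key, direction) pairs: name ascending, then address descending."""
    return list(collection.find().sort([("name", 1), ("address", -1)]))
```

For instance, find_sorted_by_name(coll_create, descending=True) would return the documents in reverse alphabetical order of name.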

In [17]:
# example of sorting
# Using sort() function to sort the result alphabetically by name

for i in coll_create.find().sort("name", 1):
    print(i)

{'_id': ObjectId('63f268450bd662bc09b2e386'), 'name': 'Amy', 'address': 'Apple st 652'}
{'_id': ObjectId('63f268450bd662bc09b2e38e'), 'name': 'Ben', 'address': 'Park Lane 38'}
{'_id': ObjectId('63f268450bd662bc09b2e38a'), 'name': 'Betty', 'address': 'Green Grass 1'}
{'_id': ObjectId('63f268450bd662bc09b2e390'), 'name': 'Chuck', 'address': 'Main Road 989'}
{'_id': ObjectId('63f268450bd662bc09b2e387'), 'name': 'Hannah', 'address': 'Mountain 21'}
{'_id': ObjectId('63f268450bd662bc09b2e388'), 'name': 'Michael', 'address': 'Valley 345'}
{'_id': ObjectId('63f267860bd662bc09b2e384'), 'name': 'Payas', 'class': 'Data Science Masters'}
{'_id': ObjectId('63f268450bd662bc09b2e38b'), 'name': 'Richard', 'address': 'Sky st 331'}
{'_id': ObjectId('63f268450bd662bc09b2e389'), 'name': 'Sandy', 'address': 'Ocean blvd 2'}
{'_id': ObjectId('63f268450bd662bc09b2e38c'), 'name': 'Susan', 'address': 'One way 98'}
{'_id': ObjectId('63f268450bd662bc09b2e38d'), 'name': 'Vicky', 'address': 'Yellow Garden 2'}
{'_id

### 7. Explain why delete_one(), delete_many(), and drop() are used.

#### delete_one() :
* In MongoDB, a single document can be deleted with the delete_one() method.
* The first parameter of the method is a query object that identifies the document to delete.
* If multiple documents match the filter query, only the first matching document is deleted.
> * **Note**: Deleting a document is the same as deleting a record in the case of SQL.

#### delete_many() :
* delete_many() is used to delete more than one document.
* A query object that selects the documents to delete is passed as the first parameter to delete_many().

#### drop() :
* You can delete a collection (the MongoDB counterpart of a table) by using the drop() method.
* In the MongoDB shell, drop() returns true if the collection was dropped successfully and false if the collection does not exist. In PyMongo, Collection.drop() does not return this flag.
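
The three methods can be sketched together as follows. This is an illustration under assumptions: a PyMongo collection is passed in, the filters are example values, and the helper names delete_examples and drop_collection are hypothetical.

```python
def delete_examples(collection):
    """Delete one document, then many, and report how many were removed."""
    # Removes at most one document, even if several match the filter.
    one = collection.delete_one({"name": "Amy"})
    # Removes every document whose address starts with "S".
    many = collection.delete_many({"address": {"$regex": "^S"}})
    return one.deleted_count, many.deleted_count

def drop_collection(collection):
    """Remove the whole collection, including its documents and indexes."""
    collection.drop()
```

Passing an empty query object, delete_many({}), would remove every document but keep the collection itself, whereas drop() removes the collection as well.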