# Task 1: Create an account and get familiar with MongoDB

## 1.1 - Set up a MongoDB Atlas account
Follow the instructions here: https://docs.google.com/document/d/1Puyz0RLfEqiCRl-ZaKdtKloEqVsN8GKdMuraKn1ZdoI/edit?usp=sharing


## 1.2 - MongoDB concepts compared to Relational DB concepts
In MongoDB, a **database** is a container for collections, and a **collection** is a container for documents. **Documents** are sets of key/value pairs whose values can also be arrays and subdocuments, and they support a range of data types. More information here: https://docs.mongodb.com/manual/reference/bson-types/.

| Relational DB  | MongoDB  |
|---|---|
|  Database | Database  |  
| Tables  |  Collections |
| Rows  | Documents  |
| Index  |  Index |
 	


## 1.3 - Document structure
You can find more information about the MongoDB document structure here: https://docs.mongodb.com/manual/core/document/. If you are not familiar with the JSON and BSON specifications, you might wish to read about them here:
- JSON: https://www.json.org/json-en.html
- BSON: http://bsonspec.org
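
To make the document model more concrete, here is a minimal sketch of a document written as a Python dictionary, which is how PyMongo represents documents. The `tags` and `contact` fields are hypothetical; they only illustrate an array and a subdocument and are not part of the dataset used later.

```python
import json

# A hypothetical student document: plain key/value pairs, an array, and a subdocument.
student_doc = {
    "name": "Anna Adams",
    "DAPS_assignment": 7,
    "tags": ["first-year", "part-time"],        # array field (illustrative)
    "contact": {"email": "anna@example.com"},   # embedded subdocument (illustrative)
}

# A document made only of these basic types serialises directly to JSON text.
as_json = json.dumps(student_doc, indent=2)
print(as_json)

# Nested values are reached by chaining keys; in a MongoDB query filter the
# same path is written in dot notation, i.e. "contact.email".
print(student_doc["contact"]["email"])
```

BSON extends this model with extra types, such as the `ObjectId` and `datetime` values you will see in the outputs below, which plain JSON cannot express directly.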



# Task 2: Query a dataset

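We need `pymongo` and `dnspython`; the latter is required to resolve `mongodb+srv://` connection strings. The exercise was written for Python 3.6, although the install log below comes from a Python 3.9 environment (`cp39`).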

In [1]:
# install missing library
!pip install pymongo
!pip install dnspython

Collecting pymongo
  Using cached pymongo-4.3.2-cp39-cp39-win_amd64.whl (381 kB)
Collecting dnspython<3.0.0,>=1.16.0
  Using cached dnspython-2.2.1-py3-none-any.whl (269 kB)
Installing collected packages: dnspython, pymongo
Successfully installed dnspython-2.2.1 pymongo-4.3.2


If you use Google Colab, you now have to restart the runtime: select `Runtime -> Restart runtime` or press `Ctrl+M .`.
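
If you are unsure whether the installation succeeded, you can check it from Python. The helper below is a small sketch, not part of the original exercise; `installed_version` is a hypothetical name, and `importlib.metadata` is assumed to be available (Python 3.8 or newer).

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of `package`, or None if it is missing."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Prints a version string for each package, or None if it is not installed.
for package in ("pymongo", "dnspython"):
    print(package, installed_version(package))
```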


In [2]:
#@title Imports
import pymongo
from pprint import pprint
from random import randint

## 2.1 - Establish a connection to MongoDB

- **_[TO DO]_** : Connect to MongoDB using the `MongoClient` class from the PyMongo library.




In [3]:
###########################
# Task: 
#   use MongoClient class to connect to MongoDB
#
###########################


client = pymongo.MongoClient("mongodb+srv://Ys1ong:123@cluster0.mxp06uy.mongodb.net/?retryWrites=true&w=majority")
db=client.admin


#########
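
Hard-coding a username and password in a notebook makes them visible to everyone who receives the file. One possible alternative, sketched below, reads the credentials from environment variables and percent-escapes them with `urllib.parse.quote_plus`, which matters as soon as a password contains characters such as `@` or `/`. The function name `build_atlas_uri`, the variable names `MONGO_USER`, `MONGO_PASSWORD` and `MONGO_HOST`, and the fallback values are all assumptions made for illustration.

```python
import os
from urllib.parse import quote_plus


def build_atlas_uri(user, password, host):
    """Build a mongodb+srv URI with percent-escaped credentials."""
    return (
        f"mongodb+srv://{quote_plus(user)}:{quote_plus(password)}"
        f"@{host}/?retryWrites=true&w=majority"
    )


# Hypothetical environment variables; the fallbacks are placeholders, not real credentials.
uri = build_atlas_uri(
    os.environ.get("MONGO_USER", "user"),
    os.environ.get("MONGO_PASSWORD", "p@ss/word"),
    os.environ.get("MONGO_HOST", "cluster0.example.mongodb.net"),
)

# With a real cluster you would then connect as before:
# client = pymongo.MongoClient(uri)
```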


Let's check whether everything works properly by retrieving the server status and printing the results, as follows:

In [4]:
serverStatusResult=db.command("serverStatus")
pprint(serverStatusResult)

{'$clusterTime': {'clusterTime': Timestamp(1667014203, 2),
                  'signature': {'hash': b'\xc5\x19\xe6\xd9\xc6y\xb3G'
                                        b'r\xfe\x8d\x93*\x95\xfb\xdc\t>Uy',
                                'keyId': 7098350483697303556}},
 'atlasVersion': {'gitVersion': '14bc9397d8af3fc806b476e052a5cf881cc9ff27',
                  'version': '20220914.0.0.1663348381'},
 'connections': {'available': 497, 'current': 3, 'totalCreated': 112},
 'extra_info': {'note': 'fields vary by platform', 'page_faults': 0},
 'host': 'ac-u8le93h-shard-00-02.mxp06uy.mongodb.net:27017',
 'localTime': datetime.datetime(2022, 10, 29, 3, 30, 3, 567000),
 'mem': {'bits': 64,
         'mapped': 0,
         'mappedWithJournal': 0,
         'resident': 0,
         'supported': True,
         'virtual': 0},
 'metrics': {'aggStageCounters': {'search': 0,
                                  'searchBeta': 0,
                                  'searchMeta': 0},
             'atlas': {'conne
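
The full server status is very long, so you may prefer to print only a few fields. The sketch below assumes that the top-level keys `host`, `localTime` and `connections`, which appear in the output above, are the ones of interest; `summarise_status` is a hypothetical helper, and `sample_status` is a stand-in dictionary so that the example runs without a server.

```python
def summarise_status(status, keys=("host", "localTime", "connections")):
    """Return only the selected top-level fields that are present in `status`."""
    return {key: status[key] for key in keys if key in status}


# Stand-in for the real result; with a live connection you would pass
# db.command("serverStatus") instead.
sample_status = {
    "host": "ac-u8le93h-shard-00-02.mxp06uy.mongodb.net:27017",
    "connections": {"available": 497, "current": 3, "totalCreated": 112},
    "mem": {"bits": 64},
}
print(summarise_status(sample_status))
```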


## 2.2 - Create sample data

Let's create our synthetic dataset of students, their marks (on a scale of 1 to 10) and their reviewers for DAPS 2020.


In [5]:
names = ['Anna','Maria','George', 'Mike', 'Alex','Paul','Nick', 'Andrew','Ellie', 'Natalia']
surname = ['Adams', 'Baker', 'Palmer', 'Peterson', 'Roberts', 'Turner', 'Armstrong']
reviewer = ['Laura','Miguel']
student=[]
for i in range(1, 30):
    student.append({
        'name' : names[randint(0, (len(names)-1))] + ' '  + surname[randint(0, (len(surname)-1))],
        'DAPS_assignment' : randint(1, 10),
        'reviewer':  reviewer[randint(0, (len(reviewer)-1))] })
pprint(student)

[{'DAPS_assignment': 3, 'name': 'Nick Baker', 'reviewer': 'Laura'},
 {'DAPS_assignment': 4, 'name': 'Mike Roberts', 'reviewer': 'Miguel'},
 {'DAPS_assignment': 7, 'name': 'George Palmer', 'reviewer': 'Miguel'},
 {'DAPS_assignment': 10, 'name': 'Andrew Peterson', 'reviewer': 'Miguel'},
 {'DAPS_assignment': 1, 'name': 'Alex Baker', 'reviewer': 'Laura'},
 {'DAPS_assignment': 3, 'name': 'Nick Baker', 'reviewer': 'Miguel'},
 {'DAPS_assignment': 3, 'name': 'Andrew Peterson', 'reviewer': 'Miguel'},
 {'DAPS_assignment': 9, 'name': 'Paul Baker', 'reviewer': 'Miguel'},
 {'DAPS_assignment': 1, 'name': 'Maria Baker', 'reviewer': 'Miguel'},
 {'DAPS_assignment': 3, 'name': 'Nick Palmer', 'reviewer': 'Miguel'},
 {'DAPS_assignment': 1, 'name': 'Maria Armstrong', 'reviewer': 'Miguel'},
 {'DAPS_assignment': 7, 'name': 'Nick Peterson', 'reviewer': 'Laura'},
 {'DAPS_assignment': 2, 'name': 'Alex Turner', 'reviewer': 'Miguel'},
 {'DAPS_assignment': 5, 'name': 'Alex Adams', 'reviewer': 'Miguel'},
 {'DAPS_as
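
Note that `range(1, 30)` produces 29 students, not 30, and that the data changes every time the cell runs. If you want the same dataset on every run, one option is to seed a dedicated random generator, as in the sketch below. The function `make_students` and the seed value are assumptions for illustration, not part of the original exercise.

```python
import random


def make_students(n, seed=None):
    """Generate `n` synthetic student documents; a fixed seed makes the output repeatable."""
    rng = random.Random(seed)
    names = ['Anna', 'Maria', 'George', 'Mike', 'Alex', 'Paul', 'Nick', 'Andrew', 'Ellie', 'Natalia']
    surnames = ['Adams', 'Baker', 'Palmer', 'Peterson', 'Roberts', 'Turner', 'Armstrong']
    reviewers = ['Laura', 'Miguel']
    return [
        {
            'name': f"{rng.choice(names)} {rng.choice(surnames)}",
            'DAPS_assignment': rng.randint(1, 10),  # both endpoints are included
            'reviewer': rng.choice(reviewers),
        }
        for _ in range(n)
    ]


students = make_students(29, seed=2020)
print(len(students))
```

Because the generator is seeded, calling `make_students(29, seed=2020)` again returns an identical list, which makes the query results in the following cells comparable between runs.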



**_[TO DO]_**: Upload this dataset to the database using the `insert_one` or `insert_many` command.


In [6]:
# Drop any "daps" collection left over from a previous run so that reruns start clean
client.students.daps.drop()

In [7]:
# Create a database object called “students”
db = client.students
# Create a new collection object called “daps” in database "students"
daps = db.daps

In [8]:
###########################
# Task: 
#   upload this database using insert_one or insert_many command
#
###########################


### TO DO
daps.insert_many(student)

#########

<pymongo.results.InsertManyResult at 0x2c9a5b14bb0>
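
`insert_many` returns an `InsertManyResult` whose `inserted_ids` attribute lists the generated `_id` values. PyMongo also adds the `_id` field to the dictionaries you pass in, so inserting the same list a second time fails with a duplicate key error. A possible way around this, sketched below with the hypothetical helper `insert_copies`, is to insert shallow copies and leave the original list untouched.

```python
def insert_copies(collection, documents):
    """Insert shallow copies of `documents` and return the generated ids.

    `collection` is assumed to be a PyMongo collection such as `daps`.
    """
    copies = [dict(doc) for doc in documents]
    result = collection.insert_many(copies)
    return result.inserted_ids


# Usage with the objects from the cells above (requires a live connection):
# ids = insert_copies(daps, student)
# print(len(ids), "documents inserted")
```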

In [9]:
for student_list in daps.find():
    print(student_list)

{'_id': ObjectId('635c9e3cb4eda1e5369e5596'), 'name': 'Nick Baker', 'DAPS_assignment': 3, 'reviewer': 'Laura'}
{'_id': ObjectId('635c9e3cb4eda1e5369e5597'), 'name': 'Mike Roberts', 'DAPS_assignment': 4, 'reviewer': 'Miguel'}
{'_id': ObjectId('635c9e3cb4eda1e5369e5598'), 'name': 'George Palmer', 'DAPS_assignment': 7, 'reviewer': 'Miguel'}
{'_id': ObjectId('635c9e3cb4eda1e5369e5599'), 'name': 'Andrew Peterson', 'DAPS_assignment': 10, 'reviewer': 'Miguel'}
{'_id': ObjectId('635c9e3cb4eda1e5369e559a'), 'name': 'Alex Baker', 'DAPS_assignment': 1, 'reviewer': 'Laura'}
{'_id': ObjectId('635c9e3cb4eda1e5369e559b'), 'name': 'Nick Baker', 'DAPS_assignment': 3, 'reviewer': 'Miguel'}
{'_id': ObjectId('635c9e3cb4eda1e5369e559c'), 'name': 'Andrew Peterson', 'DAPS_assignment': 3, 'reviewer': 'Miguel'}
{'_id': ObjectId('635c9e3cb4eda1e5369e559d'), 'name': 'Paul Baker', 'DAPS_assignment': 9, 'reviewer': 'Miguel'}
{'_id': ObjectId('635c9e3cb4eda1e5369e559e'), 'name': 'Maria Baker', 'DAPS_assignment': 1,

## 2.3 - Query a document


**_[TO DO]_** : Find one student with a score of 5. You can use the command `find_one`.



In [10]:
###########################
# Task: 
#   find one student with final DAPS_assignment score equal 5
#
###########################


### TO DO
for student_list in daps.find({"DAPS_assignment": 5}):
    print(student_list)

print('\n')
print(daps.find_one({"DAPS_assignment": 5}))


#########

{'_id': ObjectId('635c9e3cb4eda1e5369e55a3'), 'name': 'Alex Adams', 'DAPS_assignment': 5, 'reviewer': 'Miguel'}
{'_id': ObjectId('635c9e3cb4eda1e5369e55ae'), 'name': 'Paul Armstrong', 'DAPS_assignment': 5, 'reviewer': 'Laura'}


{'_id': ObjectId('635c9e3cb4eda1e5369e55a3'), 'name': 'Alex Adams', 'DAPS_assignment': 5, 'reviewer': 'Miguel'}
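
Because the dataset is random, it is possible that no student has a score of 5, in which case `find_one` returns `None`. The sketch below handles that case and uses a projection to fetch only the `name` field; `find_name_with_score` is a hypothetical helper, and `collection` is assumed to be a PyMongo collection such as `daps`.

```python
def find_name_with_score(collection, score):
    """Return the name of one student with the given score, or None if there is none."""
    doc = collection.find_one(
        {"DAPS_assignment": score},   # query filter
        {"_id": 0, "name": 1},        # projection: return only the name
    )
    return doc["name"] if doc is not None else None


# Usage with the collection from the cells above (requires a live connection):
# print(find_name_with_score(daps, 5))
```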



**_[TO DO]_** : Query the database to find the total number of students with a score of 8 or 3. You can use the `aggregation` or `find` command.


In [11]:
###########################
# Task: 
#   Count the total students with final DAPS_assignment score equal to 3 and 8.
#
###########################


### TO DO
for student_list in daps.find({"DAPS_assignment": 8}):
    print(student_list)
    
for student_list in daps.find({"DAPS_assignment": 3}):
    print(student_list)
    
score_8_or_3 = daps.count_documents({"$or": [{"DAPS_assignment": 8}, {"DAPS_assignment": 3}]})
print('\nThe total number of students with score 8 and 3 is', score_8_or_3)


#########

{'_id': ObjectId('635c9e3cb4eda1e5369e55a8'), 'name': 'Anna Turner', 'DAPS_assignment': 8, 'reviewer': 'Laura'}
{'_id': ObjectId('635c9e3cb4eda1e5369e55ac'), 'name': 'Paul Roberts', 'DAPS_assignment': 8, 'reviewer': 'Miguel'}
{'_id': ObjectId('635c9e3cb4eda1e5369e5596'), 'name': 'Nick Baker', 'DAPS_assignment': 3, 'reviewer': 'Laura'}
{'_id': ObjectId('635c9e3cb4eda1e5369e559b'), 'name': 'Nick Baker', 'DAPS_assignment': 3, 'reviewer': 'Miguel'}
{'_id': ObjectId('635c9e3cb4eda1e5369e559c'), 'name': 'Andrew Peterson', 'DAPS_assignment': 3, 'reviewer': 'Miguel'}
{'_id': ObjectId('635c9e3cb4eda1e5369e559f'), 'name': 'Nick Palmer', 'DAPS_assignment': 3, 'reviewer': 'Miguel'}
{'_id': ObjectId('635c9e3cb4eda1e5369e55a4'), 'name': 'George Peterson', 'DAPS_assignment': 3, 'reviewer': 'Miguel'}
{'_id': ObjectId('635c9e3cb4eda1e5369e55ab'), 'name': 'Maria Turner', 'DAPS_assignment': 3, 'reviewer': 'Laura'}

The total number of students with score 8 and 3 is 8
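
The task also mentions aggregation, so here is a hedged sketch of that route. It matches the scores with `$in`, which is equivalent to the `$or` of equality tests used above when all conditions are on the same field, and then groups by score so that you get a count per score as well as the total. `count_by_score` is a hypothetical helper, and `collection` is assumed to be a PyMongo collection such as `daps`.

```python
def count_by_score(collection, scores):
    """Return a {score: count} dictionary for the given scores using an aggregation pipeline."""
    pipeline = [
        # Keep only the documents whose score is one of `scores`.
        {"$match": {"DAPS_assignment": {"$in": list(scores)}}},
        # Count the documents in each score group.
        {"$group": {"_id": "$DAPS_assignment", "count": {"$sum": 1}}},
    ]
    return {row["_id"]: row["count"] for row in collection.aggregate(pipeline)}


# Usage with the collection from the cells above (requires a live connection):
# counts = count_by_score(daps, [3, 8])
# print(counts, "total:", sum(counts.values()))
```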


Dr. Laura Toni is happy today and she is going to pass all students with a final score of 4.

**_[TO DO]_** : Change the score of all students with a final mark of 4 to 5. You can use one of the following commands: `update_one`, `update_many` and `replace_one`.

In [12]:
###########################
# Task: 
#   Adjust the score on all the students with final mark 4 to 5.
#
###########################


### TO DO
print('Original Students with Score 4:')
for student_list in daps.find({"DAPS_assignment": 4}):
    print(student_list)
    
print('\nOriginal Students with Score 5:')
for student_list in daps.find({"DAPS_assignment": 5}):
    print(student_list)
    
    
print('\n\nUpdate the score on students with final mark 4 to 5 marked by Dr. Laura')

myquery = {"$and": [{"DAPS_assignment": 4}, {"reviewer": "Laura"}]}
myupdate = {"$set":  {"DAPS_assignment": 5}}
daps.update_many(myquery, myupdate)

print('\nAfter update Students with Score 4:')
for student_list in daps.find({"DAPS_assignment": 4}):
    print(student_list)
    
print('\nAfter Update Students with Score 5:')
for student_list in daps.find({"DAPS_assignment": 5}):
    print(student_list)


#########

Original Students with Score 4:
{'_id': ObjectId('635c9e3cb4eda1e5369e5597'), 'name': 'Mike Roberts', 'DAPS_assignment': 4, 'reviewer': 'Miguel'}
{'_id': ObjectId('635c9e3cb4eda1e5369e55a6'), 'name': 'Anna Turner', 'DAPS_assignment': 4, 'reviewer': 'Laura'}
{'_id': ObjectId('635c9e3cb4eda1e5369e55a7'), 'name': 'Maria Turner', 'DAPS_assignment': 4, 'reviewer': 'Laura'}
{'_id': ObjectId('635c9e3cb4eda1e5369e55ad'), 'name': 'Maria Turner', 'DAPS_assignment': 4, 'reviewer': 'Laura'}

Original Students with Score 5:
{'_id': ObjectId('635c9e3cb4eda1e5369e55a3'), 'name': 'Alex Adams', 'DAPS_assignment': 5, 'reviewer': 'Miguel'}
{'_id': ObjectId('635c9e3cb4eda1e5369e55ae'), 'name': 'Paul Armstrong', 'DAPS_assignment': 5, 'reviewer': 'Laura'}


Update the score on students with final mark 4 to 5 marked by Dr. Laura

After update Students with Score 4:
{'_id': ObjectId('635c9e3cb4eda1e5369e5597'), 'name': 'Mike Roberts', 'DAPS_assignment': 4, 'reviewer': 'Miguel'}

After Update Students with Sco
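
Instead of printing the documents before and after, you can also read the outcome from the `UpdateResult` that `update_many` returns. The sketch below assumes a PyMongo collection such as `daps`; `bump_score` is a hypothetical helper. It also writes the filter without `$and`, because several fields in one filter dictionary are combined with a logical AND by default.

```python
def bump_score(collection, reviewer, old_score, new_score):
    """Set `new_score` on every document of `reviewer` that has `old_score`.

    Returns a (matched, modified) tuple taken from the UpdateResult.
    """
    result = collection.update_many(
        {"DAPS_assignment": old_score, "reviewer": reviewer},  # implicit AND
        {"$set": {"DAPS_assignment": new_score}},
    )
    return result.matched_count, result.modified_count


# Usage with the collection from the cells above (requires a live connection):
# matched, modified = bump_score(daps, "Laura", 4, 5)
# print(matched, "matched,", modified, "modified")
```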

That was an unfair move!

**_[TO DO]_** : Let's delete all the documents that Dr. Laura Toni marked!

In [13]:
###########################
# Task: 
#   Delete all documents with `reviewer:Laura`.
#
###########################


### TO DO
print('\nOriginal Students marked by Dr. Laura Toni:')
for student_list in daps.find({"reviewer": "Laura"}):
    print(student_list)

students_by_Laura = daps.count_documents({"reviewer": "Laura"})
all_students = daps.count_documents({})
print('\nThe total number of students marked by Dr. Laura Toni is', students_by_Laura, ', Total = ', all_students)


print('\n\nDelete all the documents that Dr. Laura Toni marked')
delete_Laura = daps.delete_many({"reviewer": "Laura"})
students_by_Laura = daps.count_documents({"reviewer": "Laura"})
all_students = daps.count_documents({})
print('\nThe total number of students marked by Dr. Laura Toni is', students_by_Laura, ', Total = ', all_students)



#########


Original Students marked by Dr. Laura Toni:
{'_id': ObjectId('635c9e3cb4eda1e5369e5596'), 'name': 'Nick Baker', 'DAPS_assignment': 3, 'reviewer': 'Laura'}
{'_id': ObjectId('635c9e3cb4eda1e5369e559a'), 'name': 'Alex Baker', 'DAPS_assignment': 1, 'reviewer': 'Laura'}
{'_id': ObjectId('635c9e3cb4eda1e5369e55a1'), 'name': 'Nick Peterson', 'DAPS_assignment': 7, 'reviewer': 'Laura'}
{'_id': ObjectId('635c9e3cb4eda1e5369e55a5'), 'name': 'Alex Palmer', 'DAPS_assignment': 1, 'reviewer': 'Laura'}
{'_id': ObjectId('635c9e3cb4eda1e5369e55a6'), 'name': 'Anna Turner', 'DAPS_assignment': 5, 'reviewer': 'Laura'}
{'_id': ObjectId('635c9e3cb4eda1e5369e55a7'), 'name': 'Maria Turner', 'DAPS_assignment': 5, 'reviewer': 'Laura'}
{'_id': ObjectId('635c9e3cb4eda1e5369e55a8'), 'name': 'Anna Turner', 'DAPS_assignment': 8, 'reviewer': 'Laura'}
{'_id': ObjectId('635c9e3cb4eda1e5369e55aa'), 'name': 'Mike Baker', 'DAPS_assignment': 7, 'reviewer': 'Laura'}
{'_id': ObjectId('635c9e3cb4eda1e5369e55ab'), 'name': 'Mari
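
`delete_many` returns a `DeleteResult` whose `deleted_count` tells you how many documents were removed, so the second pair of `count_documents` calls is optional. Keep in mind that an empty filter `{}` matches every document. The sketch below refuses an empty filter as a precaution; `delete_matching` is a hypothetical helper, and this guard is a design choice of the sketch, not something PyMongo does for you.

```python
def delete_matching(collection, query):
    """Delete all documents matching `query` and return how many were removed."""
    if not query:
        # An empty filter would wipe the whole collection, so reject it here.
        raise ValueError("refusing to delete with an empty filter")
    result = collection.delete_many(query)
    return result.deleted_count


# Usage with the collection from the cells above (requires a live connection):
# removed = delete_matching(daps, {"reviewer": "Laura"})
# print(removed, "documents deleted")
```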

GOOD JOB! You finished the tasks!


You might be asking yourself now: why and when is a non-relational database useful? MongoDB stores data in documents, which is very useful when you have a lot of many-to-many relationships. Other advantages include:
- it enables the fast development of applications,
- it supports highly diverse data types,
- and it allows efficient interactions with applications at scale.

Read more here: https://www.mongodb.com/compare/mongodb-mysql


You can learn more about developing MongoDB-based applications here:
- https://university.mongodb.com/courses/M121/about?jmp=M101Pap
- https://university.mongodb.com/courses/M220P/about?jmp=M101Pap
- https://university.mongodb.com/courses/M320/about?jmp=M101Pap