## Using pymongo 

#### Installing pymongo in Jupyter Notebook
Using pip: `!pip install pymongo==3.12.3`

For Anaconda: open the Anaconda Prompt and type `conda install -c anaconda pymongo`, or add the pymongo module in the Environments tab (make sure the version is not higher than 3.12.3).

#### Import the books JSON file into the MongoDB server
* Start the MongoDB server
* Check whether a bookdb database already exists
* Remove the bookdb database if it exists
* Use mongoimport to import the books.json file into the MongoDB server.
* ``` mongoimport --db bookdb --collection book --file c:\data\books.json ```


#### Connecting to the mongoDB database using pymongo

```
##import pymongo
from pymongo import MongoClient

##connect the client(program) to a mongoDB server
client = MongoClient("localhost", 27017)

##list databases in your mongoDB server
dbs = client.list_database_names()
for d in dbs:
    print(d)
print(client.database_names())  #deprecated method - use list_database_names

##close the connection
client.close()
```
#### Exercise 1
Copy the above code and run it! Make sure you have pymongo installed in your Jupyter notebook!

In [1]:
from pymongo import MongoClient

##connect the client(program) to a mongoDB server
client = MongoClient("localhost", 27017)

##list databases in your mongoDB server
dbs = client.list_database_names()
for d in dbs:
    print(d)
print(client.database_names())  #deprecated method - use list_database_names

admin
bookdb
local
['admin', 'bookdb', 'local']




#### Creating Collection using pymongo 

```
from pymongo import MongoClient

#connect the client(program) to a mongoDB server
client = MongoClient("localhost", 27017)

mydb = client["mydatabase"]   ## use the database mydatabase
mycoll = mydb["customers"]    ## get the collection - customers
client.close()
```
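Note that `mydb["customers"]` only returns a handle: MongoDB creates the collection (and the database) on the server when the first document is inserted. A short sketch of how this can be checked, using a hypothetical helper `collection_exists` (not part of pymongo) built on `list_collection_names()`:

```python
def collection_exists(db, name):
    """Return True if `name` is a collection that already exists in `db`.

    `collection_exists` is a hypothetical helper for this sketch; `db` is
    expected to be a pymongo Database such as client["mydatabase"].
    """
    return name in db.list_collection_names()
```

Called before `client.close()` in the code above, `collection_exists(mydb, "customers")` would be expected to return `False` on a fresh server, and `True` once a document has been inserted into `customers`.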
#### Dropping Collection using pymongo 

```
from pymongo import MongoClient

#connect the client(program) to a mongoDB server
client = MongoClient("localhost", 27017)

mydb = client["mydatabase"]   ## use the database mydatabase
mycoll = mydb["customers"]    ## get the collection - customers
mycoll.drop()
client.close()
```
#### Exercise 2 :

Copy the above code and run it!  You can try dropping some of the collections you have created in your mongoDB server!

In [2]:
mydb = client["mydatabase"]   ## use the database mydatabase
mycoll = mydb["customers"]    ## get the collection - customers

In [3]:
mydb = client["mydatabase"]   ## use the database mydatabase
mycoll = mydb["customers"]    ## get the collection - customers
mycoll.drop()
client.close()

#### List the collections in the database

```
from pymongo import MongoClient

##connect the client(program) to a mongoDB server
try:
    client = MongoClient("localhost", 27017)
    print("Connected")
except:
    print("Cannot connect to database")

mydb = client["mydatabase"]   ## use the database mydatabase
mycoll = mydb["customers"]    ## get the collection - customers

print(mydb.list_collection_names())
client.close()

```
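One caveat about the `try`/`except` above: since PyMongo 3.0, `MongoClient(...)` returns immediately and connects in the background, so the constructor does not raise when the server is down and "Connected" is printed either way. A minimal sketch of a check that does detect a missing server, assuming a hypothetical helper named `server_is_up` (not part of pymongo) that sends the `ping` command:

```python
def server_is_up(client):
    """Return True if the MongoDB server behind `client` answers a ping.

    `server_is_up` is a hypothetical helper for this sketch. `client` is
    expected to be a pymongo MongoClient (anything exposing
    `client.admin.command` works).
    """
    try:
        # First real round trip to the server; raises if it is unreachable.
        client.admin.command("ping")
    except Exception as exc:  # pymongo raises a ConnectionFailure subclass here
        print("Cannot connect to database:", exc)
        return False
    print("Connected")
    return True
```

With a real client, passing `serverSelectionTimeoutMS=2000` to `MongoClient` keeps the wait short, for example `server_is_up(MongoClient("localhost", 27017, serverSelectionTimeoutMS=2000))`. Without it, the default server selection timeout is 30 seconds.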
#### Exercise 3

Copy the above code and run it! 

In [4]:
from pymongo import MongoClient

##connect the client(program) to a mongoDB server
try:
    client = MongoClient("localhost", 27017)
    print("Connected")
except:
    print("Cannot connect to database")

mydb = client["mydatabase"]   ## use the database mydatabase
mycoll = mydb["customers"]    ## get the collection - customers

print(mydb.list_collection_names())
client.close()

Connected
[]


#### Exercise 4

List the collections in the bookdb database.

In [5]:
mydb = client["bookdb"]   ## use the database bookdb
print(mydb.list_collection_names())

['book']


#### Inserting document/s into collection

* Using insert_one() to insert document. 

The insert_one() method returns an instance of InsertOneResult, which has a property, inserted_id, that holds the id of the inserted document.


```
from pymongo import MongoClient

#connect the client(program) to a mongoDB server
try:
    client = MongoClient("localhost", 27017)
    print("Connected")
except:
    print("Cannot connect to database")

mydb = client["mydatabase"]   ## use the database mydatabase
mycoll = mydb["customers"]    ## get the collection - customers

document = {"name":"Peter", "address":"abc def"} ##dictionary
x=mycoll.insert_one(document)
print(x.inserted_id)
client.close()
```
* Using insert_many to insert documents. 

The insert_many() method returns an instance of InsertManyResult, which has a property, inserted_ids, that holds the list of ids of the inserted documents

```
from pymongo import MongoClient

custList =[
    {"name": "Amy", "address":"Apple ST 652"},
    {"name": "Hannah", "address":"Montain 21"},
    {"name": "Michael", "address":"Valley 345"},
    {"name": "Sandy", "address":"Ocean blvd 2"},
    {"name": "Betty", "address":"Green Grass 1"},
    {"name": "Richard", "address":"Sky st 331"}
]

#connect the client(program) to a mongoDB server
try:
    client = MongoClient("localhost", 27017)
    print("Connected")
except:
    print("Cannot connect to database")

mydb = client["mydatabase"]   ## use the database mydatabase
mycoll = mydb["customers"]    ## get the collection - customers
x=mycoll.insert_many(custList)  ## insert many - a list of customers
print(x.inserted_ids)
client.close()
```
#### Exercise 5

Copy and try the code above.


In [6]:
from pymongo import MongoClient

#connect the client(program) to a mongoDB server
try:
    client = MongoClient("localhost", 27017)
    print("Connected")
except:
    print("Cannot connect to database")

mydb = client["mydatabase"]   ## use the database mydatabase
mycoll = mydb["customers"]    ## get the collection - customers

document = {"name":"Peter", "address":"abc def"} ##dictionary
x=mycoll.insert_one(document)
print(x.inserted_id)
client.close()

Connected
625e308b99667a96a7be10a9


In [7]:
from pymongo import MongoClient

custList =[
    {"name": "Amy", "address":"Apple ST 652"},
    {"name": "Hannah", "address":"Montain 21"},
    {"name": "Michael", "address":"Valley 345"},
    {"name": "Sandy", "address":"Ocean blvd 2"},
    {"name": "Betty", "address":"Green Grass 1"},
    {"name": "Richard", "address":"Sky st 331"}
]

#connect the client(program) to a mongoDB server
try:
    client = MongoClient("localhost", 27017)
    print("Connected")
except:
    print("Cannot connect to database")

mydb = client["mydatabase"]   ## use the database mydatabase
mycoll = mydb["customers"]    ## get the collection - customers
x=mycoll.insert_many(custList)  ## insert many - a list of customers
print(x.inserted_ids)
client.close()

Connected
[ObjectId('625e308b99667a96a7be10ab'), ObjectId('625e308b99667a96a7be10ac'), ObjectId('625e308b99667a96a7be10ad'), ObjectId('625e308b99667a96a7be10ae'), ObjectId('625e308b99667a96a7be10af'), ObjectId('625e308b99667a96a7be10b0')]


#### Exercise 6

Insert the following book to the book collection in bookdb.

```
{ "title" : "Cryptography Demystified", "isbn" : "0071406387", "pageCount" : 356, "thumbnailUrl" : "https://m.media-amazon.com/images/I/710c8mSVe9L._AC_UY218_ML3_.jpg", "status" : "MEAP", "authors" : [ "John Hershey"], "categories" : ['Demystified'] }

```


In [8]:
from pymongo import MongoClient

# Connect the client(program) to a mongoDB server
try:
    client = MongoClient("localhost", 27017)
    print("Connected")
except:
    print("Cannot connect to database")
    
mydb = client["bookdb"]   ## use the database bookdb
mycoll = mydb["book"]    ## get the collection - book

document = { "title" : "Cryptography Demystified", "isbn" : "0071406387", 
            "pageCount" : 356, "thumbnailUrl" : "https://m.media-amazon.com/images/I/710c8mSVe9L._AC_UY218_ML3_.jpg", 
            "status" : "MEAP", "authors" : [ "John Hershey"], "categories" : ['Demystified']}
x=mycoll.insert_one(document)
print(x.inserted_id)
client.close()

Connected
625e308b99667a96a7be10b2


#### Exercise 7

Insert the following list of books to the book collection in bookdb.

```
[
{ "title" : "Serious Cryptography: A Practical Introduction to Modern Encryption", "isbn" : "1593278268", "pageCount" : 313, "thumbnailUrl" : "https://m.media-amazon.com/images/I/51wv16GC0FL.jpg", "status" : "MEAP", "authors" : [ "Jean-Philippe Aumasson"], "categories" : [] },

{ "title" : "Rootkits and Bootkits: Reversing Modern Malware and Next Generation Threats", "isbn" : "B07P8J5HZJ", "pageCount" : 448, "thumbnailUrl" : "https://m.media-amazon.com/images/I/51+Zko5mWpL.jpg", "status" : "MEAP", "authors" : [ "Alex Matrosov", "Eugene Rodionov", "Sergey Bratus" ], "categories" : ["Viruses & Malware", "Computer Viruses"] },

{ "title" : "Understanding Cryptography: A Textbook for Students and Practitioners", "isbn" : "B014P9I39Q", "pageCount" : 390, "thumbnailUrl" : "https://m.media-amazon.com/images/I/61TXcy7R+kL._AC_UY218_ML3_.jpg", "status" : "MEAP", "authors" : [ "Prof. Dr.-Ing. Christof Paar", "Prof. Dr.-Ing. January Pelzl"], "categories" : [" Computer Information Theory", "Encryption"] }
]

```

In [9]:
from pymongo import MongoClient

# Connect the client(program) to a mongoDB server
try:
    client = MongoClient("localhost", 27017)
    print("Connected")
except:
    print("Cannot connect to database")
    
mydb = client["bookdb"]   ## use the database bookdb
mycoll = mydb["book"]    ## get the collection - book

document = [{ "title" : "Serious Cryptography: A Practical Introduction to Modern Encryption", "isbn" : "1593278268", "pageCount" : 313, "thumbnailUrl" : "https://m.media-amazon.com/images/I/51wv16GC0FL.jpg", "status" : "MEAP", "authors" : [ "Jean-Philippe Aumasson"], "categories" : [] },
            { "title" : "Rootkits and Bootkits: Reversing Modern Malware and Next Generation Threats", "isbn" : "B07P8J5HZJ", "pageCount" : 448, "thumbnailUrl" : "https://m.media-amazon.com/images/I/51+Zko5mWpL.jpg", "status" : "MEAP", "authors" : [ "Alex Matrosov", "Eugene Rodionov", "Sergey Bratus" ], "categories" : ["Viruses & Malware", "Computer Viruses"] },
            { "title" : "Understanding Cryptography: A Textbook for Students and Practitioners", "isbn" : "B014P9I39Q", "pageCount" : 390, "thumbnailUrl" : "https://m.media-amazon.com/images/I/61TXcy7R+kL._AC_UY218_ML3_.jpg", "status" : "MEAP", "authors" : [ "Prof. Dr.-Ing. Christof Paar", "Prof. Dr.-Ing. January Pelzl"], "categories" : [" Computer Information Theory", "Encryption"] }
           ]
x=mycoll.insert_many(document)
client.close()


Connected


#### Find document/s in the collection

* Using find() to list the documents in a collection.

    ```
    from pymongo import MongoClient
    #connect the client(program) to a mongoDB server
    try:
        client = MongoClient("localhost", 27017)
        print("Connected")
    except:
        print("Cannot connect to database")

    mydb = client["mydatabase"]   ## use the database mydatabase
    mycoll = mydb["customers"]    ## get the collection - customers

    for document in mycoll.find():
        print(document)
    client.close()
    
    ```
 * List documents with conditions/queries.
 
 ```
    from pymongo import MongoClient
    #connect the client(program) to a mongoDB server
    try:
        client = MongoClient("localhost", 27017)
        print("Connected")
    except:
        print("Cannot connect to database")

    mydb = client["mydatabase"]   ## use the database mydatabase
    mycoll = mydb["customers"]    ## get the collection - customers

    myquery = {"address": {"$gt":"S"}}
    myprojections ={"_id":0}
    myresult = mycoll.find(myquery, myprojections)
    for x in myresult:
        print(x)
    client.close()
 ```
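Queries can also match part of a string, and the cursor returned by `find()` can be sorted and limited before it is iterated. The following is a sketch under a few assumptions: `contains_filter`, `format_row` and `print_matches` are hypothetical helper names, `coll` is the `customers` collection from the examples above, and the column widths are arbitrary.

```python
import re


def contains_filter(field, text):
    """Build a filter for documents whose `field` contains `text`, ignoring case."""
    # re.escape keeps characters such as "+" or "." from being read as regex syntax.
    return {field: {"$regex": re.escape(text), "$options": "i"}}


def format_row(name, address):
    """Left-align the two values in fixed-width columns (15 and 25 characters)."""
    return "{:<15}{:<25}".format(name, address)


def print_matches(coll, text, limit=5):
    """Print up to `limit` customers whose name contains `text`, sorted by name."""
    cursor = coll.find(contains_filter("name", text), {"_id": 0})
    print(format_row("Name", "Address"))
    # sort("name", 1) sorts ascending; limit() caps the number of documents returned.
    for doc in cursor.sort("name", 1).limit(limit):
        print(format_row(doc["name"], doc["address"]))
```

For example, `print_matches(mycoll, "an")` would list customers such as Hannah and Sandy, whose names contain "an".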
 


#### Exercise 8
Find books with title containing the word **Hadoop**. Display the result in the following format:

```
Title                                    ISBN                                     Author/s                                          
Hadoop in Action                         1935182196                               Chuck Lam                                         
Hadoop in Practice                       1617290238                               Alex Holmes                                       
Hadoop in Practice, Second Edition       1617292222                               Alex Holmes              
```


In [10]:
# your code here


#### Exercise 9

Find books whose title contains the word **Programming**, whose pageCount is not zero, and that have author information. 
Display the documents as shown. The documents should be sorted by title in ascending order,
and only the first 5 documents should be listed.
```
Title                                              ISBN           Author/s                                          
Distributed Programming with Java                  1884777651     Qusay H. Mahmoud                                  
Elements of Programming with Perl                  1884777805     Andrew L. Johnson                                 
Graphics Programming with Perl                     1930110022     Martien Verbruggen                                
Java 3D Programming                                1930110359     Daniel Selman                                     
Java Applets and Channels Without Programming      1884777392     Ronny Richardson,Michael Shoffner,Marq Singer,Bruce Murray,,Jack Gambol                                                    
```

In [11]:
# your code here

#### Update document/s in the collection

* Using update_one() to update the first document that satisfies the query/filter in a collection.

    ```
    from pymongo import MongoClient
    #connect the client(program) to a mongoDB server
    try:
        client = MongoClient("localhost", 27017)
        print("Connected")
    except:
        print("Cannot connect to database")

    mydb = client["mydatabase"]   ## use the database mydatabase
    mycoll = mydb["customers"]    ## get the collection - customers
    myquery = {"address": "Valley 345"}
    newval = {"$set":{"address":"Canyon 123"}}
    mycoll.update_one(myquery, newval)
    
    for document in mycoll.find():
        print(document)
    client.close()
    
    ```
* Using update_many() to update all documents that satisfy the query/filter in a collection.

    ```
    from pymongo import MongoClient
    #connect the client(program) to a mongoDB server
    try:
        client = MongoClient("localhost", 27017)
        print("Connected")
    except:
        print("Cannot connect to database")

    mydb = client["mydatabase"]   ## use the database mydatabase
    mycoll = mydb["customers"]    ## get the collection - customers
    myquery = {"name": "Minnies"}
    newval = {"$set":{"name":"Minnie"}}
    x = mycoll.update_many(myquery, newval)

    print("No of documents updated : {}".format(x.modified_count))

    client.close()
    ```
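`$set` overwrites the whole field value. To change only part of a string, one approach is to read the document, build the new string in Python, and write it back. A minimal sketch, assuming a hypothetical helper `replace_in_field` and the `customers` collection used above:

```python
def replace_in_field(coll, query, field, old, new):
    """Replace `old` with `new` inside the `field` string of the first match.

    Returns the updated string, or None when no document matches `query`.
    `replace_in_field` is a hypothetical helper, not a pymongo method.
    """
    doc = coll.find_one(query)
    if doc is None:
        return None
    updated = doc[field].replace(old, new)
    # Filter on _id so exactly the document that was read gets updated.
    coll.update_one({"_id": doc["_id"]}, {"$set": {field: updated}})
    return updated
```

For example, `replace_in_field(mycoll, {"name": "Sandy"}, "address", "blvd", "Boulevard")` would turn "Ocean blvd 2" into "Ocean Boulevard 2". The read-then-write sequence is not atomic, which is acceptable for a single-user exercise.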

#### Exercise 10

Find the first book whose title contains the phrase **in Action** and replace the word **Action** with **Motion**.
Take note that only the word **Action** is replaced, not the whole title.
After the replacement, display the document as follows (the book shown might not be the same):
```
Title                                              ISBN           Author/s                                          
Distributed Programming with Java in Motion        1884777651     Qusay H. Mahmoud                                  
```

In [12]:
# your code here

#### Other useful methods

* count_documents() - to count the documents in a collection (count() is deprecated since pymongo 3.7).
    ```
    mycoll = mydb["cars"]
    print(mycoll.count_documents({}))
    ```
* Using cursor 
    ```
    cursor = mycoll.find()
    for x in cursor:
        print(x)
    ```
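The collection method `count_documents(filter)` counts only the documents that match a filter, and `distinct(field)` returns the distinct values of a field, so the two can be combined to count documents per value. A sketch, assuming a hypothetical helper `count_by_value`:

```python
def count_by_value(coll, field):
    """Return {value: number of documents} for every distinct value of `field`.

    `count_by_value` is a hypothetical helper; `coll` is expected to be a
    pymongo Collection.
    """
    counts = {}
    for value in coll.distinct(field):
        # One count query per distinct value of the field.
        counts[value] = coll.count_documents({field: value})
    return counts
```

This issues one query per distinct value, which is fine for a field with only a handful of values.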

#### Exercise 11

Print out the number of documents left in the **book** collection.


#### Exercise 12

For each status in the book collection, print out the number of documents.
```
Status:PUBLISH Count:363
Status:MEAP Count:68
```

In [13]:
# your code here

#### Exercise 13

For every document whose **status** field is "PUBLISH", change the value to "PUBLISHED".

After the replacement, print out the number of the documents with status = "PUBLISHED".
```
Status:PUBLISHED Count:363  
       
```

In [14]:
# your code here

#### Delete document/s in the collection

* Using delete_one() to delete the first document that satisfies the query/filter in a collection.

    ```
    from pymongo import MongoClient
    #connect the client(program) to a mongoDB server
    try:
        client = MongoClient("localhost", 27017)
        print("Connected")
    except:
        print("Cannot connect to database")

    mydb = client["mydatabase"]   ## use the database mydatabase
    mycoll = mydb["customers"]    ## get the collection - customers
    myquery = {"address": "Canyon 123"}
    mycoll.delete_one(myquery)
    
    for document in mycoll.find():
        print(document)
    client.close()
    
    ```
* Using delete_many() to delete all documents that satisfy the query/filter in a collection.

    ```
    from pymongo import MongoClient
    #connect the client(program) to a mongoDB server
    try:
        client = MongoClient("localhost", 27017)
        print("Connected")
    except:
        print("Cannot connect to database")

    mydb = client["mydatabase"]   ## use the database mydatabase
    mycoll = mydb["customers"]    ## get the collection - customers
    myquery = {"name": "Minnie"}
    x = mycoll.delete_many(myquery)

    print("No of documents deleted : {}".format(x.deleted_count))

    client.close()
    ```
* Note that mycoll.delete_many({}) will delete all documents in the collection.  Use with care!
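
Both `delete_one()` and `delete_many()` return a `DeleteResult`, whose `deleted_count` property holds the number of documents removed. A small sketch of a guard against the empty-filter case, assuming a hypothetical wrapper `delete_matching`:

```python
def delete_matching(coll, query):
    """Delete every document matching `query` and return how many were removed.

    `delete_matching` is a hypothetical wrapper around Collection.delete_many.
    """
    if not query:
        # An empty filter matches everything, so refuse it here.
        raise ValueError("refusing to delete with an empty filter")
    result = coll.delete_many(query)
    return result.deleted_count
```

For example, `delete_matching(mycoll, {"name": "Minnie"})` returns the number of customers named Minnie that were deleted, while `delete_matching(mycoll, {})` raises `ValueError` instead of emptying the collection.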

#### Exercise 14

Delete the book with isbn = '1930110596' from the collection. Verify that the document is deleted using the MongoDB shell.


In [15]:
# your code here

#### Exercise 15

Delete all the books that have pageCount=0 from the collection. Verify that the documents are deleted using the MongoDB shell.


In [16]:
# your code here

#### Exercise 16

Verify that the documents with pageCount=0 are deleted by counting the documents with pageCount=0.

In [17]:
# your code here

#### Exercise 17
Import data in csv file into MongoDB

a. Write Python code to read **language.csv** and insert the records into the collection **language** in database **lang**  of MongoDB.  

b. Verify that the data has been inserted.
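
As a starting point, `csv.DictReader` turns each CSV row into a dictionary keyed by the header row, which is the shape `insert_many()` expects. This is only a sketch: `rows_from_csv` is a hypothetical helper, and the sample columns (`name`, `year`) are made up because the real columns of language.csv are not shown here.

```python
import csv
import io


def rows_from_csv(fileobj):
    """Read CSV text with a header row and return a list of dicts, one per record."""
    return [dict(row) for row in csv.DictReader(fileobj)]


# Hypothetical sample standing in for language.csv.
sample = io.StringIO("name,year\nPython,1991\nGo,2009\n")
records = rows_from_csv(sample)
print(records)
```

With the real file, `open("language.csv", newline="")` would replace the `StringIO` object, and the resulting list could be passed to `insert_many()`. Note that `csv` reads every value as a string, so numeric columns need converting if numbers are wanted in MongoDB.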

In [18]:
# your code here
import csv

#### Exercise 18
Import data in json file into MongoDB

a. Write Python code to read **intern.json** and insert the records into **intern** collection of database **hrdb** in mongoDB.

b. Verify that the data has been inserted.
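
A JSON file may hold either a single array of documents or one document per line (the layout that mongoexport writes by default). Which layout intern.json uses is not stated here, so the sketch below handles both; `docs_from_json` is a hypothetical helper and the sample data is made up.

```python
import io
import json


def docs_from_json(fileobj):
    """Return a list of documents from a JSON file.

    Accepts either one JSON array or one JSON object per line; which of the
    two intern.json uses is an assumption to check against the real file.
    """
    text = fileobj.read().strip()
    if text.startswith("["):
        return json.loads(text)
    # One object per line: parse each non-empty line separately.
    return [json.loads(line) for line in text.splitlines() if line.strip()]


# Hypothetical sample standing in for intern.json.
sample = io.StringIO('{"name": "Ann"}\n{"name": "Ben"}\n')
print(docs_from_json(sample))
```

With the real file, `open("intern.json")` would replace the `StringIO` object, and the returned list could be passed to `insert_many()` on the **intern** collection.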