# Learning MongoDB

# Create Database

Create a database called "mydatabase".

MongoDB waits until you have created a collection (table) with at least one document (record) before it actually creates the database (and collection).

In [4]:
import pymongo

myclient = pymongo.MongoClient("mongodb://localhost:27017/")

mydb = myclient["mydatabase"]

In [6]:
print(myclient.list_database_names())

dblist = myclient.list_database_names()
if "mydatabase" in dblist:
    print("The database exists.")
else:
    print("The database does not exist.")

['admin', 'config', 'local']
The database does not exist.


# Create Collection

Create a collection called "customers".

MongoDB waits until you have inserted a document before it actually creates the collection.

In [7]:
mycol = mydb["customers"]

print(mydb.list_collection_names())

collist = mydb.list_collection_names()
if "customers" in collist:
    print("The collection exists.")
else:
    print("The collection does not exist.")

[]
The collection does not exist.


# Insert Single Document

To insert a record, or document as it is called in MongoDB, into a collection, we use the insert_one() method.

The first parameter of the insert_one() method is a dictionary containing the name(s) and value(s) of each field in the document you want to insert.

The insert_one() method returns an InsertOneResult object, which has a property, inserted_id, that holds the id of the inserted document.

If you do not specify an _id field, then MongoDB will add one for you and assign a unique id for each document.

In the example below, no _id field was specified, so MongoDB assigned a unique _id for the record (document).

In [8]:
mydict = { "name": "John", "address": "Highway 37" }

x = mycol.insert_one(mydict)

mydict = { "name": "Peter", "address": "Lowstreet 27" }

x = mycol.insert_one(mydict)

print(x.inserted_id) 

5dd30ba1da8cebec98809bf0
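
The automatically assigned _id is an ObjectId, and its first four bytes store the creation time as a Unix timestamp in seconds. As a minimal sketch using only the standard library, the timestamp can be decoded from the hexadecimal string printed above. The helper name `objectid_timestamp` is made up for this illustration and is not part of pymongo.

```python
from datetime import datetime, timezone


def objectid_timestamp(hex_id: str) -> datetime:
    """Decode the creation time stored in the first 4 bytes of an ObjectId.

    Hypothetical helper for illustration; it only parses the hex string.
    """
    if len(hex_id) != 24:
        raise ValueError("an ObjectId is 12 bytes, i.e. 24 hex characters")
    seconds = int(hex_id[:8], 16)  # first 4 bytes = 8 hex digits
    return datetime.fromtimestamp(seconds, tz=timezone.utc)


# The id printed by the insert_one() cell above
created = objectid_timestamp("5dd30ba1da8cebec98809bf0")
print(created.isoformat())
```

With pymongo installed, the same value is available directly as the generation_time attribute of an ObjectId, so the manual decoding is only meant to show what the id contains.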


# Insert Multiple Documents

To insert multiple documents into a collection in MongoDB, we use the insert_many() method.

The first parameter of the insert_many() method is a list containing dictionaries with the data you want to insert:

In [9]:
mylist = [
  { "name": "Amy", "address": "Apple st 652"},
  { "name": "Hannah", "address": "Mountain 21"},
  { "name": "Michael", "address": "Valley 345"},
  { "name": "Sandy", "address": "Ocean blvd 2"},
  { "name": "Betty", "address": "Green Grass 1"},
  { "name": "Richard", "address": "Sky st 331"},
  { "name": "Susan", "address": "One way 98"},
  { "name": "Vicky", "address": "Yellow Garden 2"},
  { "name": "Ben", "address": "Park Lane 38"},
  { "name": "William", "address": "Central st 954"},
  { "name": "Chuck", "address": "Main Road 989"},
  { "name": "Viola", "address": "Sideway 1633"}
]

x = mycol.insert_many(mylist)

# Print the list of _id values of the inserted documents:
print(x.inserted_ids) 

[ObjectId('5dd30ba1da8cebec98809bf1'), ObjectId('5dd30ba1da8cebec98809bf2'), ObjectId('5dd30ba1da8cebec98809bf3'), ObjectId('5dd30ba1da8cebec98809bf4'), ObjectId('5dd30ba1da8cebec98809bf5'), ObjectId('5dd30ba1da8cebec98809bf6'), ObjectId('5dd30ba1da8cebec98809bf7'), ObjectId('5dd30ba1da8cebec98809bf8'), ObjectId('5dd30ba1da8cebec98809bf9'), ObjectId('5dd30ba1da8cebec98809bfa'), ObjectId('5dd30ba1da8cebec98809bfb'), ObjectId('5dd30ba1da8cebec98809bfc')]


# Insert Multiple Documents, with Specified IDs

If you do not want MongoDB to assign unique ids to your documents, you can specify the _id field yourself when you insert them. The values must be unique: two documents in the same collection cannot share an _id.

In [10]:
mylist = [
  { "_id": 1, "name": "John", "address": "Highway 37"},
  { "_id": 2, "name": "Peter", "address": "Lowstreet 27"},
  { "_id": 3, "name": "Amy", "address": "Apple st 652"},
  { "_id": 4, "name": "Hannah", "address": "Mountain 21"},
  { "_id": 5, "name": "Michael", "address": "Valley 345"},
  { "_id": 6, "name": "Sandy", "address": "Ocean blvd 2"},
  { "_id": 7, "name": "Betty", "address": "Green Grass 1"},
  { "_id": 8, "name": "Richard", "address": "Sky st 331"},
  { "_id": 9, "name": "Susan", "address": "One way 98"},
  { "_id": 10, "name": "Vicky", "address": "Yellow Garden 2"},
  { "_id": 11, "name": "Ben", "address": "Park Lane 38"},
  { "_id": 12, "name": "William", "address": "Central st 954"},
  { "_id": 13, "name": "Chuck", "address": "Main Road 989"},
  { "_id": 14, "name": "Viola", "address": "Sideway 1633"}
]

x = mycol.insert_many(mylist)

# Print the list of _id values of the inserted documents:
print(x.inserted_ids)

[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14]
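
Because every _id must be unique within a collection, it can help to check a batch for repeated ids before sending it. The sketch below is plain Python and needs no server. The function name `duplicate_ids` is an assumption made for this example, and it only inspects the batch itself, not the documents already stored in the collection.

```python
from collections import Counter


def duplicate_ids(documents):
    """Return the _id values that occur more than once in a list of dicts.

    Hypothetical pre-check for illustration; documents without an _id
    are skipped because MongoDB would generate one for them.
    """
    counts = Counter(doc["_id"] for doc in documents if "_id" in doc)
    return [value for value, n in counts.items() if n > 1]


batch = [
    {"_id": 1, "name": "John", "address": "Highway 37"},
    {"_id": 2, "name": "Peter", "address": "Lowstreet 27"},
    {"_id": 2, "name": "Amy", "address": "Apple st 652"},  # reuses _id 2
]
print(duplicate_ids(batch))
```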


# Find One

To select data from a collection in MongoDB, we can use the find_one() method.

The find_one() method returns the first occurrence in the selection.

In [11]:
x = mycol.find_one()

print(x) 

{'_id': ObjectId('5dd30ba1da8cebec98809bef'), 'name': 'John', 'address': 'Highway 37'}


# Find All

To select data from a collection in MongoDB, we can also use the find() method.

The find() method returns a cursor over all occurrences in the selection, which you can iterate with a for loop.

The first parameter of the find() method is a query object. In this example we use an empty query object, which selects all documents in the collection.

In [12]:
for x in mycol.find():
  print(x) 

{'_id': ObjectId('5dd30ba1da8cebec98809bef'), 'name': 'John', 'address': 'Highway 37'}
{'_id': ObjectId('5dd30ba1da8cebec98809bf0'), 'name': 'Peter', 'address': 'Lowstreet 27'}
{'_id': ObjectId('5dd30ba1da8cebec98809bf1'), 'name': 'Amy', 'address': 'Apple st 652'}
{'_id': ObjectId('5dd30ba1da8cebec98809bf2'), 'name': 'Hannah', 'address': 'Mountain 21'}
{'_id': ObjectId('5dd30ba1da8cebec98809bf3'), 'name': 'Michael', 'address': 'Valley 345'}
{'_id': ObjectId('5dd30ba1da8cebec98809bf4'), 'name': 'Sandy', 'address': 'Ocean blvd 2'}
{'_id': ObjectId('5dd30ba1da8cebec98809bf5'), 'name': 'Betty', 'address': 'Green Grass 1'}
{'_id': ObjectId('5dd30ba1da8cebec98809bf6'), 'name': 'Richard', 'address': 'Sky st 331'}
{'_id': ObjectId('5dd30ba1da8cebec98809bf7'), 'name': 'Susan', 'address': 'One way 98'}
{'_id': ObjectId('5dd30ba1da8cebec98809bf8'), 'name': 'Vicky', 'address': 'Yellow Garden 2'}
{'_id': ObjectId('5dd30ba1da8cebec98809bf9'), 'name': 'Ben', 'address': 'Park Lane 38'}
{'_id': ObjectI

# Return Only Some Fields

The second parameter of the find() method is an object describing which fields to include in the result.

This parameter is optional, and if omitted, all fields will be included in the result.

In [13]:
for x in mycol.find({},{ "_id": 0, "name": 1, "address": 1 }):
  print(x) 

{'name': 'John', 'address': 'Highway 37'}
{'name': 'Peter', 'address': 'Lowstreet 27'}
{'name': 'Amy', 'address': 'Apple st 652'}
{'name': 'Hannah', 'address': 'Mountain 21'}
{'name': 'Michael', 'address': 'Valley 345'}
{'name': 'Sandy', 'address': 'Ocean blvd 2'}
{'name': 'Betty', 'address': 'Green Grass 1'}
{'name': 'Richard', 'address': 'Sky st 331'}
{'name': 'Susan', 'address': 'One way 98'}
{'name': 'Vicky', 'address': 'Yellow Garden 2'}
{'name': 'Ben', 'address': 'Park Lane 38'}
{'name': 'William', 'address': 'Central st 954'}
{'name': 'Chuck', 'address': 'Main Road 989'}
{'name': 'Viola', 'address': 'Sideway 1633'}
{'name': 'John', 'address': 'Highway 37'}
{'name': 'Peter', 'address': 'Lowstreet 27'}
{'name': 'Amy', 'address': 'Apple st 652'}
{'name': 'Hannah', 'address': 'Mountain 21'}
{'name': 'Michael', 'address': 'Valley 345'}
{'name': 'Sandy', 'address': 'Ocean blvd 2'}
{'name': 'Betty', 'address': 'Green Grass 1'}
{'name': 'Richard', 'address': 'Sky st 331'}
{'name': 'Susa

You are not allowed to specify both 0 and 1 values in the same object (except if one of the fields is the _id field). If you specify a field with the value 0, all other fields get the value 1, and vice versa.

In [14]:
for x in mycol.find({},{ "address": 0 }):
  print(x) 

{'_id': ObjectId('5dd30ba1da8cebec98809bef'), 'name': 'John'}
{'_id': ObjectId('5dd30ba1da8cebec98809bf0'), 'name': 'Peter'}
{'_id': ObjectId('5dd30ba1da8cebec98809bf1'), 'name': 'Amy'}
{'_id': ObjectId('5dd30ba1da8cebec98809bf2'), 'name': 'Hannah'}
{'_id': ObjectId('5dd30ba1da8cebec98809bf3'), 'name': 'Michael'}
{'_id': ObjectId('5dd30ba1da8cebec98809bf4'), 'name': 'Sandy'}
{'_id': ObjectId('5dd30ba1da8cebec98809bf5'), 'name': 'Betty'}
{'_id': ObjectId('5dd30ba1da8cebec98809bf6'), 'name': 'Richard'}
{'_id': ObjectId('5dd30ba1da8cebec98809bf7'), 'name': 'Susan'}
{'_id': ObjectId('5dd30ba1da8cebec98809bf8'), 'name': 'Vicky'}
{'_id': ObjectId('5dd30ba1da8cebec98809bf9'), 'name': 'Ben'}
{'_id': ObjectId('5dd30ba1da8cebec98809bfa'), 'name': 'William'}
{'_id': ObjectId('5dd30ba1da8cebec98809bfb'), 'name': 'Chuck'}
{'_id': ObjectId('5dd30ba1da8cebec98809bfc'), 'name': 'Viola'}
{'_id': 1, 'name': 'John'}
{'_id': 2, 'name': 'Peter'}
{'_id': 3, 'name': 'Amy'}
{'_id': 4, 'name': 'Hannah'}
{'_id'

In [15]:
try:
    for x in mycol.find({},{ "name": 1, "address": 0 }):
        print(x) 
except Exception as error:
    print(repr(error))

OperationFailure('Projection cannot have a mix of inclusion and exclusion.')
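
The projection rules can be modelled in a few lines of plain Python. This is a simplified sketch for flat documents only, not how the server implements projections, and `apply_projection` is a hypothetical name chosen for this example.

```python
def apply_projection(doc, projection):
    """Mimic find()'s projection on one flat document (simplified model)."""
    # Flags for every field except _id, which may be combined with either mode.
    other = {k: bool(v) for k, v in projection.items() if k != "_id"}
    if len(set(other.values())) > 1:
        raise ValueError("Projection cannot have a mix of inclusion and exclusion.")

    include_id = bool(projection.get("_id", True))  # _id is returned by default
    # Inclusion mode: some field is set to 1, or only "_id": 1 was given.
    inclusion = any(other.values()) or (not other and bool(projection.get("_id")))

    if inclusion:
        result = {k: v for k, v in doc.items() if k in other}
    else:
        result = {k: v for k, v in doc.items() if k != "_id" and k not in other}

    if include_id and "_id" in doc:
        result = {"_id": doc["_id"], **result}
    return result


doc = {"_id": 1, "name": "John", "address": "Highway 37"}
print(apply_projection(doc, {"_id": 0, "name": 1, "address": 1}))
print(apply_projection(doc, {"address": 0}))
```

Passing `{"name": 1, "address": 0}` to this sketch raises a ValueError, mirroring the OperationFailure shown above.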


# Filter the Result

When finding documents in a collection, you can filter the result by using a query object.

The first argument of the find() method is a query object, and is used to limit the search.

In [16]:
myquery = { "address": "Park Lane 38" }

mydoc = mycol.find(myquery)

for x in mydoc:
  print(x) 

{'_id': ObjectId('5dd30ba1da8cebec98809bf9'), 'name': 'Ben', 'address': 'Park Lane 38'}
{'_id': 11, 'name': 'Ben', 'address': 'Park Lane 38'}
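
Conceptually, a query object with plain values is an equality test on each listed field. The following plain-Python sketch shows that idea on a small in-memory list. The `matches` function is an illustrative assumption, and it ignores MongoDB's special handling of arrays, nested documents, and null values.

```python
def matches(doc, query):
    """True if every field in the query equals the same field in the document."""
    return all(key in doc and doc[key] == value for key, value in query.items())


customers = [
    {"_id": 10, "name": "Vicky", "address": "Yellow Garden 2"},
    {"_id": 11, "name": "Ben", "address": "Park Lane 38"},
    {"_id": 12, "name": "William", "address": "Central st 954"},
]

hits = [doc for doc in customers if matches(doc, {"address": "Park Lane 38"})]
print(hits)

# An empty query places no conditions, so it matches every document,
# just like find({}) in the earlier cells.
print(len([doc for doc in customers if matches(doc, {})]))
```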


# Advanced Query

To make advanced queries, you can use query operators as values in the query object.

For example, to find the documents where the "address" field starts with the letter "S" or higher (alphabetically), use the greater-than operator: {"$gt": "S"}

In [17]:
# Find documents where the address starts with the letter "S" or higher

myquery = { "address": { "$gt": "S" } }

mydoc = mycol.find(myquery)

for x in mydoc:
  print(x) 

{'_id': ObjectId('5dd30ba1da8cebec98809bf3'), 'name': 'Michael', 'address': 'Valley 345'}
{'_id': ObjectId('5dd30ba1da8cebec98809bf6'), 'name': 'Richard', 'address': 'Sky st 331'}
{'_id': ObjectId('5dd30ba1da8cebec98809bf8'), 'name': 'Vicky', 'address': 'Yellow Garden 2'}
{'_id': ObjectId('5dd30ba1da8cebec98809bfc'), 'name': 'Viola', 'address': 'Sideway 1633'}
{'_id': 5, 'name': 'Michael', 'address': 'Valley 345'}
{'_id': 8, 'name': 'Richard', 'address': 'Sky st 331'}
{'_id': 10, 'name': 'Vicky', 'address': 'Yellow Garden 2'}
{'_id': 14, 'name': 'Viola', 'address': 'Sideway 1633'}
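
By default, MongoDB compares strings with a simple binary comparison, which for ASCII text behaves like Python's own string comparison. The sketch below uses that similarity to show which addresses count as greater than "S". The bare "S" and the lowercase address are hypothetical values added only to illustrate the edge cases.

```python
addresses = [
    "Park Lane 38",
    "Sky st 331",
    "Sideway 1633",
    "Valley 345",
    "S",             # hypothetical: equal to "S", so not greater
    "apple st 652",  # hypothetical: lowercase letters sort after uppercase
]

# Same idea as {"address": {"$gt": "S"}}, evaluated in plain Python
greater = [a for a in addresses if a > "S"]
print(greater)
```

Note that an address starting with a lowercase letter also passes the test, because lowercase letters come after all uppercase letters in this ordering.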


# Filter With Regular Expressions

You can also use regular expressions as a query operator.

Regular expressions can only be used to query strings.

To find only the documents where the "address" field starts with the letter "S", use the regular expression {"$regex": "^S"}

In [18]:
# Find documents where the address starts with the letter "S"

myquery = { "address": { "$regex": "^S" } }

mydoc = mycol.find(myquery)

for x in mydoc:
  print(x) 

{'_id': ObjectId('5dd30ba1da8cebec98809bf6'), 'name': 'Richard', 'address': 'Sky st 331'}
{'_id': ObjectId('5dd30ba1da8cebec98809bfc'), 'name': 'Viola', 'address': 'Sideway 1633'}
{'_id': 8, 'name': 'Richard', 'address': 'Sky st 331'}
{'_id': 14, 'name': 'Viola', 'address': 'Sideway 1633'}
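
For a simple anchored pattern such as "^S", Python's re module gives the same matches, which makes it easy to try a pattern locally before using it in a query. This is only a sketch: the address list is a small sample, and the last entry is a hypothetical value added to show the effect of the anchor.

```python
import re

addresses = [
    "Ocean blvd 2",
    "Sky st 331",
    "One way 98",
    "Sideway 1633",
    "Valley 345",
    "Main Street S",  # hypothetical: contains "S" but does not start with it
]

pattern = re.compile(r"^S")  # "^" anchors the match to the start of the string
starts_with_s = [a for a in addresses if pattern.search(a)]
print(starts_with_s)
```

Unlike the {"$gt": "S"} query, the regular expression excludes "Valley 345", since it tests the first letter rather than the alphabetical position.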


# Sort the Result

Use the sort() method to sort the result in ascending or descending order.

The sort() method takes one parameter for "fieldname" and one parameter for "direction" (ascending is the default direction).

In [19]:
# Sort the result alphabetically by name

mydoc = mycol.find().sort("name")

for x in mydoc:
  print(x) 

{'_id': ObjectId('5dd30ba1da8cebec98809bf1'), 'name': 'Amy', 'address': 'Apple st 652'}
{'_id': 3, 'name': 'Amy', 'address': 'Apple st 652'}
{'_id': ObjectId('5dd30ba1da8cebec98809bf9'), 'name': 'Ben', 'address': 'Park Lane 38'}
{'_id': 11, 'name': 'Ben', 'address': 'Park Lane 38'}
{'_id': ObjectId('5dd30ba1da8cebec98809bf5'), 'name': 'Betty', 'address': 'Green Grass 1'}
{'_id': 7, 'name': 'Betty', 'address': 'Green Grass 1'}
{'_id': ObjectId('5dd30ba1da8cebec98809bfb'), 'name': 'Chuck', 'address': 'Main Road 989'}
{'_id': 13, 'name': 'Chuck', 'address': 'Main Road 989'}
{'_id': ObjectId('5dd30ba1da8cebec98809bf2'), 'name': 'Hannah', 'address': 'Mountain 21'}
{'_id': 4, 'name': 'Hannah', 'address': 'Mountain 21'}
{'_id': ObjectId('5dd30ba1da8cebec98809bef'), 'name': 'John', 'address': 'Highway 37'}
{'_id': 1, 'name': 'John', 'address': 'Highway 37'}
{'_id': ObjectId('5dd30ba1da8cebec98809bf3'), 'name': 'Michael', 'address': 'Valley 345'}
{'_id': 5, 'name': 'Michael', 'address': 'Valley

# Sort Descending

Use the value -1 as the second parameter to sort descending.

sort("name", 1) #ascending
sort("name", -1) #descending
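
The two directions correspond to an ordinary ascending and descending sort on one key. As a rough plain-Python equivalent on a small in-memory sample (not the server's implementation), the same orderings can be produced with sorted():

```python
from operator import itemgetter

customers = [
    {"name": "Hannah", "address": "Mountain 21"},
    {"name": "Amy", "address": "Apple st 652"},
    {"name": "William", "address": "Central st 954"},
    {"name": "Ben", "address": "Park Lane 38"},
]

# Comparable to sort("name", 1)
ascending = sorted(customers, key=itemgetter("name"))
# Comparable to sort("name", -1)
descending = sorted(customers, key=itemgetter("name"), reverse=True)

print([doc["name"] for doc in ascending])
print([doc["name"] for doc in descending])
```

In pymongo, the constants pymongo.ASCENDING and pymongo.DESCENDING can be used in place of the literal values 1 and -1.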

In [20]:
# Sort the result reverse alphabetically by name

mydoc = mycol.find().sort("name", -1)

for x in mydoc:
  print(x) 

{'_id': ObjectId('5dd30ba1da8cebec98809bfa'), 'name': 'William', 'address': 'Central st 954'}
{'_id': 12, 'name': 'William', 'address': 'Central st 954'}
{'_id': ObjectId('5dd30ba1da8cebec98809bfc'), 'name': 'Viola', 'address': 'Sideway 1633'}
{'_id': 14, 'name': 'Viola', 'address': 'Sideway 1633'}
{'_id': ObjectId('5dd30ba1da8cebec98809bf8'), 'name': 'Vicky', 'address': 'Yellow Garden 2'}
{'_id': 10, 'name': 'Vicky', 'address': 'Yellow Garden 2'}
{'_id': ObjectId('5dd30ba1da8cebec98809bf7'), 'name': 'Susan', 'address': 'One way 98'}
{'_id': 9, 'name': 'Susan', 'address': 'One way 98'}
{'_id': ObjectId('5dd30ba1da8cebec98809bf4'), 'name': 'Sandy', 'address': 'Ocean blvd 2'}
{'_id': 6, 'name': 'Sandy', 'address': 'Ocean blvd 2'}
{'_id': ObjectId('5dd30ba1da8cebec98809bf6'), 'name': 'Richard', 'address': 'Sky st 331'}
{'_id': 8, 'name': 'Richard', 'address': 'Sky st 331'}
{'_id': ObjectId('5dd30ba1da8cebec98809bf0'), 'name': 'Peter', 'address': 'Lowstreet 27'}
{'_id': 2, 'name': 'Peter',

# Delete Document

To delete one document, we use the delete_one() method.

The first parameter of the delete_one() method is a query object defining which document to delete.

Note: If the query finds more than one document, only the first occurrence is deleted.

In [21]:
# Delete the document with the address "Mountain 21"
myquery = { "address": "Mountain 21" }

mycol.delete_one(myquery)

<pymongo.results.DeleteResult at 0x10b8c9fa0>

# Delete Many Documents

To delete more than one document, use the delete_many() method.

The first parameter of the delete_many() method is a query object defining which documents to delete.

In [22]:
# Delete all documents where the address starts with the letter S
myquery = { "address": {"$regex": "^S"} }

x = mycol.delete_many(myquery)

print(x.deleted_count, "documents deleted.")

4 documents deleted.


# Delete All Documents in a Collection

To delete all documents in a collection, pass an empty query object to the delete_many() method:

In [23]:
# Delete all documents in the "customers" collection
x = mycol.delete_many({})

print(x.deleted_count, "documents deleted.")

23 documents deleted.
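
The counts reported by the delete cells follow from the earlier inserts: 2 + 12 + 14 = 28 documents were stored, with each address appearing twice. The sketch below replays the three delete steps on an in-memory list to show where the numbers 4 and 23 come from. It is a plain-Python simulation, not a call to MongoDB.

```python
addresses = [
    "Highway 37", "Lowstreet 27", "Apple st 652", "Mountain 21",
    "Valley 345", "Ocean blvd 2", "Green Grass 1", "Sky st 331",
    "One way 98", "Yellow Garden 2", "Park Lane 38", "Central st 954",
    "Main Road 989", "Sideway 1633",
]
# Every address was inserted twice across the insert cells.
documents = [{"address": a} for _ in range(2) for a in addresses]

# delete_one: only the first matching document is removed.
for index, doc in enumerate(documents):
    if doc["address"] == "Mountain 21":
        del documents[index]
        break

# delete_many with "^S": every address starting with "S" is removed.
before = len(documents)
documents = [d for d in documents if not d["address"].startswith("S")]
deleted_s = before - len(documents)

# delete_many({}): everything that is left is removed.
deleted_all = len(documents)
documents.clear()

print(deleted_s, deleted_all)
```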


# Delete Collection

You can delete a table, or collection as it is called in MongoDB, by using the drop() method.

In [24]:
# Delete the "customers" collection

mycol.drop() 