The config module contains the client (similar to a MySQL connector), which gives us access to the database.

NOTE: Throughout this tutorial there are REFER tags. This notebook is meant to get you started and make you aware of the MongoDB features we commonly use. The REFER tags point to additional operators, commands, and functions beyond what is shown here.

In [57]:
import sys
sys.path.append("../..")
from pprint import pprint

from config import client

Declaring the Database to be used from the Mongo Server. Here, we'll just use the playground.

In [58]:
db = client['playground']

Creating a New Collection in the Database (replace with your name, otherwise you may be using someone else's!)

In [59]:
YOUR_NAME = "victor"  # your name goes here

In [60]:
basket = db[f'{YOUR_NAME}s_fruit_basket'] 

Inserting a document into the collection (notice the freedom you have)

In [61]:
basket.insert_one(
    {
        'fruit': 'apple', 
        'price': 1.25,
        'details': {
            'color': 'red'
        }
    },
)

<pymongo.results.InsertOneResult at 0x7fa381981d00>

Inserting many documents into the collection

In [62]:
basket.insert_many([
    {
        'fruit': 'banana',
        'price': .50,
        'details': {
            'color': 'yellow'
        }
    },
    {
        'fruit': 'strawberry',
        'price': .20,
        'details': {
            'color': 'red'
        }
    },
])

<pymongo.results.InsertManyResult at 0x7fa3818a0ac0>

Finding a single document with a field

In [63]:
basket.find_one({
    'fruit': 'banana'
})

{'_id': ObjectId('60669e1914bbdb315c63aa60'),
 'fruit': 'banana',
 'price': 0.5,
 'details': {'color': 'yellow'}}

Finding a document with a nested field

In [64]:
result = basket.find({
    'details': {
        'color': 'red'
    }
})

result

<pymongo.cursor.Cursor at 0x7fa381975eb0>

Wait, what happened? We didn't get apple and strawberry back! That's because find, unlike find_one, does not return documents directly. It returns a cursor, which we iterate over to retrieve the matching documents.

In [65]:
for fruit in result:
    pprint(fruit)

{'_id': ObjectId('60669e1914bbdb315c63aa5f'),
 'details': {'color': 'red'},
 'fruit': 'apple',
 'price': 1.25}
{'_id': ObjectId('60669e1914bbdb315c63aa61'),
 'details': {'color': 'red'},
 'fruit': 'strawberry',
 'price': 0.2}


There we go! The cursor is now exhausted, so it cannot be read again:

In [66]:
result.next() # gets the next document, but none are left

StopIteration: 
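
A cursor behaves like any one-pass Python iterator: once consumed, it yields nothing further. The sketch below uses a plain generator as a stand-in for the cursor, so it runs without a database. The fake_cursor function and the sample documents are illustrative assumptions, not part of pymongo.

```python
def fake_cursor(documents):
    """Yield documents one at a time, the way iterating a cursor does."""
    for document in documents:
        yield document


sample_docs = [{'fruit': 'apple'}, {'fruit': 'strawberry'}]

cursor = fake_cursor(sample_docs)
first_pass = list(cursor)   # consumes every document
second_pass = list(cursor)  # nothing left to read

print(len(first_pass), len(second_pass))

# next() with a default avoids the StopIteration seen above
leftover = next(cursor, None)
print(leftover)
```

With a real cursor, the usual way around this is to call find again, or to store list(result) once and reuse that list.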

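Some other commonly used filter operators are:
1. $lt (<), $gt (>), $lte (<=), $gte (>=)
2. $in

REFER: https://docs.mongodb.com/manual/reference/operator/query/

Note: the two cells below were run after the update and bulk write cells later in this notebook (see their cell numbers), so their outputs show the updated prices and the melon document.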

In [81]:
list(
    basket.find({
        'price': {
            '$lte': 1.00
        }
    })
)

[{'_id': ObjectId('60669e1914bbdb315c63aa60'),
  'fruit': 'banana',
  'price': 0.75,
  'details': {'color': 'yellow'}}]
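
To make the comparison operators concrete, here is a rough in-memory sketch of how a filter such as {'price': {'$lte': 1.00}} could be evaluated against each document. The COMPARISONS table and the matches helper are hypothetical stand-ins written for illustration. The real evaluation happens inside the MongoDB server and handles many more cases (missing fields, arrays, mixed types).

```python
import operator

# Hypothetical lookup table: query operator -> Python comparison
COMPARISONS = {
    '$lt': operator.lt,
    '$lte': operator.le,
    '$gt': operator.gt,
    '$gte': operator.ge,
    '$in': lambda value, choices: value in choices,
}


def matches(document, field, condition):
    """Return True if document[field] satisfies every operator in condition."""
    value = document[field]
    return all(
        COMPARISONS[op](value, operand) for op, operand in condition.items()
    )


fruits = [
    {'fruit': 'apple', 'price': 1.25},
    {'fruit': 'banana', 'price': 0.50},
    {'fruit': 'strawberry', 'price': 0.20},
]

# Several operators in one condition are combined with AND
mid_priced = [
    f['fruit'] for f in fruits
    if matches(f, 'price', {'$gte': 0.25, '$lte': 1.00})
]
print(mid_priced)  # ['banana']
```

Placing two operators under the same field, as in the last filter, is how a price range is usually expressed in a single query.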

In [86]:
list(
    basket.find({
        'details.color': {
            '$in': ['red', 'green']
        }
    })
)

[{'_id': ObjectId('60669e1914bbdb315c63aa5f'),
  'fruit': 'apple',
  'price': 1.75,
  'details': {'color': 'red'}},
 {'_id': ObjectId('60669e1914bbdb315c63aa61'),
  'fruit': 'strawberry',
  'price': 1.75,
  'details': {'color': 'red',
   'updated': datetime.datetime(2021, 4, 2, 4, 32, 21, 195000)}},
 {'_id': ObjectId('60669e5514bbdb315c63aa62'),
  'fruit': 'melon',
  'price': 3.0,
  'details': {'color': 'green'}}]
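
Note that strawberry is matched here even though its details sub-document has an extra updated field. Dot notation compares a single nested field, whereas passing a whole sub-document (as in the earlier 'details': {'color': 'red'} query) asks for an exact match of the entire embedded document. The sketch below imitates both behaviours in plain Python. The get_path helper is hypothetical, and field order (which also matters for exact matches in MongoDB) is not modelled.

```python
from datetime import datetime


def get_path(document, dotted_path):
    """Follow a dotted path such as 'details.color' through nested dicts."""
    current = document
    for key in dotted_path.split('.'):
        if not isinstance(current, dict) or key not in current:
            return None
        current = current[key]
    return current


apple = {'fruit': 'apple', 'details': {'color': 'red'}}
strawberry = {
    'fruit': 'strawberry',
    'details': {'color': 'red', 'updated': datetime(2021, 4, 2)},
}

results = {}
for doc in (apple, strawberry):
    exact = doc['details'] == {'color': 'red'}        # whole sub-document
    dotted = get_path(doc, 'details.color') == 'red'  # single nested field
    results[doc['fruit']] = (exact, dotted)

print(results)
```

For nested fields, the dotted form is usually the safer choice, since it keeps matching when other keys are later added to the sub-document.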

find accepts a second argument (a projection) that lets you pick the fields you want returned. This is useful when documents are very large but we only need a couple of fields:

In [None]:
result = basket.find({}, {
    'fruit': True # Indicate the fields you want selected
}) # empty dict for first argument means we are selecting all documents

list(result)  # will convert the cursor into a list of the documents found

The _id field is always returned unless you explicitly exclude it. Instead of choosing the fields you wish to _include_, you can choose the ones to _exclude_.

In [67]:
result = basket.find(
    {
        'fruit': 'banana' # You can use both arguments together!
    }, {
        'details.color': False  # Indicate the fields you want excluded (you can do nested with '.')
    })  

list(result)

[{'_id': ObjectId('60669e1914bbdb315c63aa60'),
  'fruit': 'banana',
  'price': 0.5,
  'details': {}}]
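
The inclusion form of a projection can be pictured as copying only the requested keys, plus _id unless it is switched off. Below is a simplified in-memory sketch. The project_include helper is hypothetical, handles top-level fields only, and is not how pymongo or the server implements projections.

```python
def project_include(document, projection):
    """Keep fields marked True; keep '_id' unless it is explicitly False."""
    wanted = {field for field, keep in projection.items() if keep}
    if projection.get('_id', True):
        wanted.add('_id')
    return {key: value for key, value in document.items() if key in wanted}


banana = {
    '_id': 'placeholder-id',  # stand-in for an ObjectId
    'fruit': 'banana',
    'price': 0.5,
    'details': {'color': 'yellow'},
}

with_id = project_include(banana, {'fruit': True})
without_id = project_include(banana, {'fruit': True, '_id': False})

print(with_id)
print(without_id)
```

Assuming the same basket collection, the second call corresponds to basket.find({}, {'fruit': True, '_id': False}).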

Update a document by filtering with the first argument, and setting with the second

In [68]:
basket.update_one({
    'fruit': 'banana'
},
    {
        '$set': {
            'price': .75
        }
    }
)

basket.find_one({'fruit': 'banana'})

{'_id': ObjectId('60669e1914bbdb315c63aa60'),
 'fruit': 'banana',
 'price': 0.75,
 'details': {'color': 'yellow'}}

Update many documents

For more update operators REFER: https://docs.mongodb.com/manual/reference/operator/update/

In [71]:
basket.update_many({
    'details.color': 'red'
},
    {
        '$set': {
            'price': 1.75
        }
    }
)

list(basket.find({'details.color': 'red'}))

[{'_id': ObjectId('60669e1914bbdb315c63aa5f'),
  'fruit': 'apple',
  'price': 1.75,
  'details': {'color': 'red'}},
 {'_id': ObjectId('60669e1914bbdb315c63aa61'),
  'fruit': 'strawberry',
  'price': 1.75,
  'details': {'color': 'red'}}]

There may be situations where you'd like several different tasks done together. Introducing bulk write:

In [72]:
from pymongo import UpdateOne, InsertOne
from datetime import datetime

updates_to_fruit_basket = [
    UpdateOne(  # same syntax as .update_one
        {
            'fruit': 'strawberry'
        },
        {
            '$set': {
                'details.updated': datetime.now()  # new date field
            }
        }
    ),
    InsertOne(
        {
            'fruit': 'melon',
            'price': 3.00,
            'details': {
                'color': 'green'
            }
        }
    )
]

basket.bulk_write(updates_to_fruit_basket)

<pymongo.results.BulkWriteResult at 0x7fa3818a0b80>

In [73]:
list(basket.find({}))

[{'_id': ObjectId('60669e1914bbdb315c63aa5f'),
  'fruit': 'apple',
  'price': 1.75,
  'details': {'color': 'red'}},
 {'_id': ObjectId('60669e1914bbdb315c63aa60'),
  'fruit': 'banana',
  'price': 0.75,
  'details': {'color': 'yellow'}},
 {'_id': ObjectId('60669e1914bbdb315c63aa61'),
  'fruit': 'strawberry',
  'price': 1.75,
  'details': {'color': 'red',
   'updated': datetime.datetime(2021, 4, 2, 4, 32, 21, 195000)}},
 {'_id': ObjectId('60669e5514bbdb315c63aa62'),
  'fruit': 'melon',
  'price': 3.0,
  'details': {'color': 'green'}}]
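
Notice that '$set' with the dotted key 'details.updated' added a field inside the existing details sub-document rather than replacing it. A rough in-memory sketch of that behaviour follows. The apply_set helper is hypothetical and not part of pymongo, and a plain string stands in for the datetime value.

```python
import copy


def apply_set(document, changes):
    """Return a copy of document with each dotted path in changes assigned."""
    updated = copy.deepcopy(document)
    for dotted_path, value in changes.items():
        *parents, last = dotted_path.split('.')
        target = updated
        for key in parents:
            # create intermediate sub-documents when they are missing
            target = target.setdefault(key, {})
        target[last] = value
    return updated


strawberry = {'fruit': 'strawberry', 'details': {'color': 'red'}}
result = apply_set(
    strawberry,
    {'details.updated': '2021-04-02', 'price': 1.75},
)

print(result)
print(strawberry)  # the original dict is left untouched
```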

For performance, you can index fields to speed up sorting and retrieval

You can also index on multiple fields at once (a compound index)! REFER: https://docs.mongodb.com/manual/core/index-compound/

In [74]:
basket.create_index('fruit')

'fruit_1'
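
An index over several fields is described as a list of (field, direction) pairs, where 1 means ascending and -1 descending. The returned name 'fruit_1' above follows the pattern field_direction, and the sketch below reproduces that naming in plain Python. The default_index_name helper is hypothetical and written only for illustration, and the two-field specification is an example, not an index this notebook creates.

```python
ASCENDING = 1    # same values as pymongo.ASCENDING and pymongo.DESCENDING
DESCENDING = -1


def default_index_name(keys):
    """Join each (field, direction) pair as 'field_direction'."""
    return '_'.join(f'{field}_{direction}' for field, direction in keys)


single = [('fruit', ASCENDING)]
compound = [('details.color', ASCENDING), ('price', DESCENDING)]

print(default_index_name(single))    # fruit_1
print(default_index_name(compound))  # details.color_1_price_-1
```

Assuming the same basket collection, a list such as compound can be passed to basket.create_index in place of the single field name.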

Speaking of sorting...

In [76]:
list(basket.find({}).sort('fruit')) # alphabetical

[{'_id': ObjectId('60669e1914bbdb315c63aa5f'),
  'fruit': 'apple',
  'price': 1.75,
  'details': {'color': 'red'}},
 {'_id': ObjectId('60669e1914bbdb315c63aa60'),
  'fruit': 'banana',
  'price': 0.75,
  'details': {'color': 'yellow'}},
 {'_id': ObjectId('60669e5514bbdb315c63aa62'),
  'fruit': 'melon',
  'price': 3.0,
  'details': {'color': 'green'}},
 {'_id': ObjectId('60669e1914bbdb315c63aa61'),
  'fruit': 'strawberry',
  'price': 1.75,
  'details': {'color': 'red',
   'updated': datetime.datetime(2021, 4, 2, 4, 32, 21, 195000)}}]

For more complicated sorting REFER: https://docs.mongodb.com/manual/reference/method/cursor.sort/

In [103]:
list(basket.find({}).sort('details.color'))  # remember, the nested '.' syntax works for sort keys too!

[{'_id': ObjectId('60669e5514bbdb315c63aa62'),
  'fruit': 'melon',
  'price': 3.0,
  'details': {'color': 'green'}},
 {'_id': ObjectId('60669e1914bbdb315c63aa5f'),
  'fruit': 'apple',
  'price': 1.75,
  'details': {'color': 'red'}},
 {'_id': ObjectId('60669e1914bbdb315c63aa61'),
  'fruit': 'strawberry',
  'price': 1.75,
  'details': {'color': 'red',
   'updated': datetime.datetime(2021, 4, 2, 4, 32, 21, 195000)}},
 {'_id': ObjectId('60669e1914bbdb315c63aa60'),
  'fruit': 'banana',
  'price': 0.75,
  'details': {'color': 'yellow'}}]
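
Sorting on more than one key takes a list of (field, direction) pairs, with later keys breaking ties in earlier ones. The sketch below imitates that ordering on plain dicts using Python's stable sort. The sort_documents helper is hypothetical, and the data mirrors the basket's current prices.

```python
def sort_documents(documents, key_directions):
    """Sort by several (field, direction) pairs; 1 ascending, -1 descending."""
    result = list(documents)
    # Apply the least significant key first; stable sorting keeps earlier order
    for field, direction in reversed(key_directions):
        result.sort(key=lambda doc, f=field: doc[f], reverse=(direction == -1))
    return result


fruits = [
    {'fruit': 'strawberry', 'price': 1.75},
    {'fruit': 'banana', 'price': 0.75},
    {'fruit': 'apple', 'price': 1.75},
    {'fruit': 'melon', 'price': 3.0},
]

# most expensive first, ties broken alphabetically
ordered = sort_documents(fruits, [('price', -1), ('fruit', 1)])
names = [doc['fruit'] for doc in ordered]
print(names)  # ['melon', 'apple', 'strawberry', 'banana']
```

Assuming the same basket collection, the equivalent query would be basket.find({}).sort([('price', -1), ('fruit', 1)]).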

Perhaps you'd like to do several operations in sequence. Introducing aggregation pipelines:

For more aggregation operators REFER: https://docs.mongodb.com/manual/reference/operator/aggregation/

In [104]:
# You define the steps of your pipeline
pipeline = [
    # step 1: Filtering
    {
        '$match': {  # equivalent to find. Aggregation commands are named differently
            'details.color': 'red',
            'price': {  # just showing you can filter on multiple fields at once
                '$lt': 100.00  # doesn't filter any fruits out, since none of our fruits are overpriced :)
            }
        }
    },
    # step 2: Setting & updating fields
    {
        '$set': {  # set a field on each document in our result
            'description': 'rounding prices down',
            'price': {
                '$floor': '$price'  # prefix a field name with $ to reference its value
            }
        }
    },
    # step 3: Sorting by price
    {
        '$sort': {
            'price': 1
        }
    }
]

Performing the aggregation. Note the added description and the rounded price. Remember, aggregation doesn't change what's in the database (unless you tell it to, for example with a $out or $merge stage).

In [106]:
list(basket.aggregate(pipeline))

[{'_id': ObjectId('60669e1914bbdb315c63aa5f'),
  'fruit': 'apple',
  'price': 1.0,
  'details': {'color': 'red'},
  'description': 'rounding prices down'},
 {'_id': ObjectId('60669e1914bbdb315c63aa61'),
  'fruit': 'strawberry',
  'price': 1.0,
  'details': {'color': 'red',
   'updated': datetime.datetime(2021, 4, 2, 4, 32, 21, 195000)},
  'description': 'rounding prices down'}]
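
To see what each stage contributes, the same three steps can be imitated on plain Python lists. This is only an in-memory analogy with made-up variable names. The real pipeline runs inside the MongoDB server.

```python
import math

fruits = [
    {'fruit': 'apple', 'price': 1.75, 'details': {'color': 'red'}},
    {'fruit': 'banana', 'price': 0.75, 'details': {'color': 'yellow'}},
    {'fruit': 'strawberry', 'price': 1.75, 'details': {'color': 'red'}},
    {'fruit': 'melon', 'price': 3.0, 'details': {'color': 'green'}},
]

# step 1: $match -> keep red fruits cheaper than 100
matched = [
    f for f in fruits
    if f['details']['color'] == 'red' and f['price'] < 100.00
]

# step 2: $set -> build new documents instead of editing the originals
with_fields = [
    {
        **f,
        'description': 'rounding prices down',
        'price': float(math.floor(f['price'])),
    }
    for f in matched
]

# step 3: $sort -> ascending by price
ordered = sorted(with_fields, key=lambda f: f['price'])

summary = [(f['fruit'], f['price']) for f in ordered]
print(summary)             # [('apple', 1.0), ('strawberry', 1.0)]
print(fruits[0]['price'])  # 1.75, the source data is unchanged
```

Each stage receives the output of the one before it, which is why the order of the stages in the pipeline list matters.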