
**Read Concerns**

* The default read concern in MongoDB is "local".
  * It returns the most recent data on the node that serves the read, without checking whether that data has been replicated.
* The read concern "majority" provides more durable reads.
  * It returns only data that has been replicated to a majority of nodes.
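
The difference between the two read concerns can be sketched with a toy in-memory model. This is only an illustration, not how the server is implemented: the `Node` class, the write labels `w1` to `w3`, and both read functions are hypothetical names invented for this example.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """A toy replica-set member holding the writes it has applied so far."""
    name: str
    applied: list  # writes in the order they were applied


def read_local(node):
    """Return the newest write on the connected node, replicated or not."""
    return node.applied[-1] if node.applied else None


def read_majority(nodes):
    """Return the newest write that a majority of nodes have applied."""
    majority = len(nodes) // 2 + 1
    # Walk the longest history from newest to oldest.
    longest = max(nodes, key=lambda n: len(n.applied)).applied
    for write in reversed(longest):
        holders = sum(1 for n in nodes if write in n.applied)
        if holders >= majority:
            return write
    return None


# Hypothetical state: the primary is ahead of both secondaries.
primary = Node("primary", ["w1", "w2", "w3"])
secondary_a = Node("secondary_a", ["w1", "w2"])
secondary_b = Node("secondary_b", ["w1"])
replica_set = [primary, secondary_a, secondary_b]

local_value = read_local(primary)            # newest write on the primary
majority_value = read_majority(replica_set)  # newest majority-replicated write
print(local_value, majority_value)
```

In this model the "local" read sees `w3`, which exists on only one node, while the "majority" read returns `w2`, the newest write held by two of the three nodes.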

**Read Concerns**<br>
**Problem:**<br>

Which of the following Read Concerns are valid in a 3-node replica set?<br>

* "local"

    * This will return the latest data from the node your application is connected to. This is the default read concern in MongoDB.

* "majority"

    * This will return data that has been committed to a majority of nodes in the replica set. In a 3-node set, 2 nodes constitute a majority.
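
The majority arithmetic above can be checked with a one-line helper. This is a minimal sketch; `majority_size` is a hypothetical name, and it assumes every member of the set counts toward the majority.

```python
def majority_size(members: int) -> int:
    """Smallest number of members that is more than half of the set."""
    return members // 2 + 1


for members in (3, 5, 7):
    print(f"{members}-node set: majority is {majority_size(members)}")
```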

**Ticket: User Report**<br>
Problem:<br>

User Story<br>

"As an administrator, I want to be able to view the top 20 users by their number of comments."<br>

Task<br>

For this ticket, you'll be required to modify one method in db.py, most_active_commenters. This method produces a report of the 20 most frequent commenters on the MFlix site.

In [None]:
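from pymongo.read_concern import ReadConcern

def most_active_commenters():
    # `db` is assumed to be the database handle that db.py already provides
    # group comments by author email and count the comments in each group
    group = {
        "$group": {
            "_id": "$email",
            "count": {"$sum": 1}
        }
    }
    # most comments first, then keep only the top 20
    sort = {"$sort": {"count": -1}}
    limit = {"$limit": 20}
    pipeline = [group, sort, limit]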

    # we used Read Concern "majority" to make sure the data we read has been
    # majority-committed
    rc = ReadConcern("majority")
    comments = db.comments.with_options(read_concern=rc)
    result = comments.aggregate(pipeline)
    return list(result)
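
To make clear what the three pipeline stages compute, here is a rough in-memory equivalent in plain Python. It is only a sketch for reasoning about the result shape: the function name `most_active_commenters_in_memory` and the sample comments are made up for this example, and it does not involve read concerns or the database at all.

```python
from collections import Counter


def most_active_commenters_in_memory(comments, limit=20):
    """Mimic $group (count per email), $sort (descending), and $limit."""
    counts = Counter(comment["email"] for comment in comments)
    # most_common() returns (email, count) pairs, highest count first.
    return [
        {"_id": email, "count": count}
        for email, count in counts.most_common(limit)
    ]


# Hypothetical sample data, not taken from the MFlix dataset.
sample_comments = [
    {"email": "a@example.com", "text": "first"},
    {"email": "b@example.com", "text": "second"},
    {"email": "a@example.com", "text": "third"},
    {"email": "c@example.com", "text": "fourth"},
    {"email": "a@example.com", "text": "fifth"},
    {"email": "b@example.com", "text": "sixth"},
]

report = most_active_commenters_in_memory(sample_comments, limit=2)
print(report)
```

Each entry has the same shape as a document produced by the aggregation: the grouped email under `_id` and the number of comments under `count`.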

**Bulk Writes**

**Ordered Bulk Write**<br>
* The default setting for bulk writes in MongoDB
* Executes writes sequentially
    * Will end execution after first write failure

**Unordered Bulk Write**<br>
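  * Has to be requested explicitly with the flag { ordered: false } (ordered=False in PyMongo's bulk_write)
  * May execute writes in parallel, and continues with the remaining writes after a failure

* Bulk writes allow database clients to send multiple writes in a single request.
* They can be either ordered or unordered.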
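
The failure behavior of the two modes can be sketched with a small simulation. This is a toy model under stated assumptions, not the server's implementation: `run_bulk`, `ok`, and `fail` are hypothetical helpers, and the loop runs sequentially in both modes, so only the stop-on-first-error difference is modeled.

```python
def run_bulk(operations, ordered=True):
    """Apply each operation and return (successes, errors).

    Ordered mode stops at the first failure; unordered mode records
    the failure and keeps going with the remaining operations.
    """
    successes, errors = [], []
    for index, operation in enumerate(operations):
        try:
            successes.append(operation())
        except ValueError as exc:
            errors.append((index, str(exc)))
            if ordered:
                break
    return successes, errors


def ok(value):
    """Build an operation that succeeds and returns `value`."""
    return lambda: value


def fail(message):
    """Build an operation that fails with `message`."""
    def _raise():
        raise ValueError(message)
    return _raise


ops = [ok("insert 1"), fail("duplicate key"), ok("insert 3")]
ordered_result = run_bulk(ops, ordered=True)
unordered_result = run_bulk(ops, ordered=False)
print(ordered_result)
print(unordered_result)
```

In the ordered run the third write is never attempted, while the unordered run still applies it and reports the single failure.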

Problem:<br>

Which of the following is true about bulk writes?<br>

* Bulk writes decrease the effect of latency on overall operation time.

    * By sending multiple documents in the same round trip, bulk writes reduce the effect of latency on the execution of an entire batch.

* By default, bulk writes are ordered.

    * This is the default behavior, but you can change this by passing the flag { ordered: false }.
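
A back-of-the-envelope calculation shows why batching reduces the effect of latency. The figures below (1000 writes, a 20 ms round trip, 1 ms of server work per write) are illustrative assumptions, and the model ignores that a driver may split a very large batch into several requests.

```python
def total_time_ms(num_writes, round_trip_ms, per_write_ms, batched):
    """Estimate total time: network round trips plus server work."""
    round_trips = 1 if batched else num_writes
    return round_trips * round_trip_ms + num_writes * per_write_ms


# Illustrative numbers only.
individual = total_time_ms(1000, round_trip_ms=20, per_write_ms=1, batched=False)
bulk = total_time_ms(1000, round_trip_ms=20, per_write_ms=1, batched=True)
print(f"individual writes: {individual} ms, bulk write: {bulk} ms")
```

Under these assumptions the latency cost is paid once instead of 1000 times, so the total drops from 21000 ms to 1020 ms.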

**Ticket: Migration**<br>
Problem:<br>

Task<br>

For this ticket, you'll be required to complete the command-line script located in the migrations directory of mflix called movie_last_updated_migration.py.<br>

Things always change, and a requirement has come down that the lastupdated value in each document of the movies collection needs to be stored as an ISODate rather than a String.
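
The core of the migration is turning each string into a Python datetime, which PyMongo stores as a BSON date (shown as ISODate in the shell). The sketch below uses only the standard library rather than dateutil, and it assumes the strings look like "YYYY-MM-DD HH:MM:SS" with an optional fractional part; that format, the sample value, and the helper name `parse_lastupdated` are assumptions made for illustration.

```python
from datetime import datetime


def parse_lastupdated(value: str) -> datetime:
    """Parse 'YYYY-MM-DD HH:MM:SS[.fraction]' into a datetime."""
    date_part, _, fraction = value.partition(".")
    # strptime's %f accepts at most 6 digits, so trim to microseconds.
    micro = fraction[:6].ljust(6, "0") if fraction else "000000"
    return datetime.strptime(f"{date_part}.{micro}", "%Y-%m-%d %H:%M:%S.%f")


# Hypothetical document with a string timestamp.
doc = {"lastupdated": "2015-08-26 00:03:50.133000000"}
converted = parse_lastupdated(doc["lastupdated"])
print(type(doc["lastupdated"]).__name__, "->", type(converted).__name__)
```

A parser such as dateutil is more forgiving about input formats, which matters if the real data is less uniform than this sketch assumes.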

In [None]:
from pymongo import MongoClient, UpdateOne
from pymongo.errors import InvalidOperation
from bson import ObjectId
import dateutil.parser as parser

host = "mongodb://localhost:27017"
mflix = MongoClient(host)["sample_mflix"]

# here we're making sure "lastupdated" exists in the document as a string
predicate = {"lastupdated": {"$exists": True, "$type": "string"}}
# this projection only sends the "lastupdated" and "_id" fields back to the client
projection = {"lastupdated": 1}

cursor = mflix.movies.find(predicate, projection)

updates = []
for doc in cursor:
    doc_id = doc.get('_id')
    lastupdated = doc.get('lastupdated', None)
    updates.append(
        {
            "doc_id": ObjectId(doc_id),
            "lastupdated": parser.parse(lastupdated)
        }
    )

print(f"{len(updates)} documents to update")

try:
    # this will gather UpdateOne operations into a bulk_updates array
    # we target the document with "_id" and then set its "lastupdated" field
    # to the new ISODate type
    bulk_updates = [UpdateOne(
        {"_id": update.get("doc_id")},
        {"$set": {"lastupdated": update.get("lastupdated")}}
    ) for update in updates]

    bulk_results = mflix.movies.bulk_write(bulk_updates)
    print(f"{bulk_results.modified_count} documents updated")

except InvalidOperation:
    print("no updates necessary")
except Exception as e:
    print(str(e))