Update 4. Aggregation Pipelines: Let the Server Do It For You.md

ortizfram · web-flow · commit 6c1dc0119daf · 2022-11-29T10:18:58.000-03:00
diff --git a/Introduction to MongoDB in Python/4. Aggregation Pipelines: Let the Server Do It For You.md b/Introduction to MongoDB in Python/4. Aggregation Pipelines: Let the Server Do It For You.md
@@ -1,7 +1,69 @@
-## 📚 Intro to Aggregation
+## ➡️ 📚 Intro to Aggregation
 ### 📚 Queries have implicit stages
 ![image](https://user-images.githubusercontent.com/51888893/204533592-e32876e1-5043-4fd8-b96d-a1ea21813b4a.png)
 ### 📚 Adding sort and skip stages
 ![image](https://user-images.githubusercontent.com/51888893/204533832-631539bf-080a-4c98-a2c8-e3c6d574a376.png)
 ### 📚 Can I count?
 ![image](https://user-images.githubusercontent.com/51888893/204533977-5e57c13f-2a3f-4ce4-a24d-24e78f04782e.png)
+
+## 🦍 Sequencing stages
+Here is a cursor, followed by four aggregation pipeline stages:
+
+      cursor = (db.laureates.find(
+          projection={"firstname": 1, "prizes.year": 1, "_id": 0},
+          filter={"gender": "org"})
+       .limit(3).sort("prizes.year", -1))
+
+      project_stage = {"$project": {"firstname": 1, "prizes.year": 1, "_id": 0}}
+      match_stage = {"$match": {"gender": "org"}}
+      limit_stage = {"$limit": 3}
+      sort_stage = {"$sort": {"prizes.year": -1}}
+> Which sequence `pipeline` of the above four stages produces a cursor `db.laureates.aggregate(pipeline)` equivalent to `cursor` above?
+Possible Answers
+- [ ] [project_stage, match_stage, limit_stage, sort_stage]
+- [ ] [project_stage, match_stage, sort_stage, limit_stage]
+- [ ] [match_stage, project_stage, limit_stage, sort_stage]
+- [x] [match_stage, project_stage, sort_stage, limit_stage]
+
+## 🦍 Aggregating a few individuals' country data
+The following query cursor yields birth-country and prize-affiliation-country information for three non-organization laureates:
+
+      cursor = (db.laureates.find(
+          {"gender": {"$ne": "org"}},
+          ["bornCountry", "prizes.affiliations.country"]
+      ).limit(3))
+- [x] Translate the above cursor `cursor` to an equivalent aggregation cursor, saving the pipeline stages to `pipeline`. Recall that the `find` collection method's `filter` parameter maps to the `$match` aggregation stage, its `projection` parameter maps to the `$project` stage, and its `limit` parameter (or cursor method) maps to the `$limit` stage.
+```py
+# Translate cursor to aggregation pipeline
+pipeline = [
+    {'$match': {'gender': {'$ne': 'org'}}},
+    {'$project': {'bornCountry': 1, 'prizes.affiliations.country': 1}},
+    {'$limit': 3}
+]
+
+for doc in db.laureates.aggregate(pipeline):
+    print("{bornCountry}: {prizes}".format(**doc))
+```
+
+## 🦍 Passing the aggregation baton to Python
+- [x] Save to `pipeline` an aggregation pipeline that collects prize documents as detailed above. Use Python's `collections.OrderedDict` to specify any sorting.
+```py
+from collections import OrderedDict
+from itertools import groupby
+from operator import itemgetter
+
+original_categories = set(db.prizes.distinct("category", {"year": "1901"}))
+
+# Save a pipeline to collect original-category prizes
+pipeline = [
+    {"$match": {"category": {"$in": list(original_categories)}}},
+    {"$project": {"category": 1, "year": 1}},
+    {"$sort": OrderedDict([("year", -1)])}
+]
+cursor = db.prizes.aggregate(pipeline)
+for key, group in groupby(cursor, key=itemgetter("year")):
+    missing = original_categories - {doc["category"] for doc in group}
+    if missing:
+        print("{year}: {missing}".format(
+            year=key, missing=", ".join(sorted(missing))))
+```
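
Note on the last hunk: `itertools.groupby` only merges adjacent items that share a key, which is presumably why the pipeline sorts by `year` before the cursor is handed to Python. Below is a minimal sketch of that behavior. It does not touch the database: `docs` is a hypothetical in-memory stand-in for documents from `db.prizes`, and `categories_by_year` is a hypothetical helper introduced only for illustration.

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical stand-ins for db.prizes documents (illustrative, not real data).
docs = [
    {"year": "1902", "category": "physics"},
    {"year": "1901", "category": "physics"},
    {"year": "1902", "category": "peace"},
]


def categories_by_year(documents):
    """Group documents by year using groupby, which only merges adjacent equal keys."""
    return [
        (year, sorted(doc["category"] for doc in group))
        for year, group in groupby(documents, key=itemgetter("year"))
    ]


# Without sorting, the two "1902" documents are not adjacent and form two groups.
unsorted_groups = categories_by_year(docs)

# Sorting by year (descending, like the $sort stage) makes each year one group.
sorted_groups = categories_by_year(
    sorted(docs, key=itemgetter("year"), reverse=True)
)

print(unsorted_groups)
print(sorted_groups)
```

With unsorted input, a year split across several groups would make the per-year set difference in the committed code report categories as missing when they are not.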