Commit 6c1dc01

Update 4. Aggregation Pipelines: Let the Server Do It For You.md
1 parent 5d5edb5 commit 6c1dc01

1 file changed: +63 -1 lines changed

@@ -1,7 +1,69 @@
-## 📚 Intro to Aggregation
+## ➡️ 📚 Intro to Aggregation
### 📚 Queries have implicit stages
![image](https://user-images.githubusercontent.com/51888893/204533592-e32876e1-5043-4fd8-b96d-a1ea21813b4a.png)
### 📚 Adding sort and skip stages
![image](https://user-images.githubusercontent.com/51888893/204533832-631539bf-080a-4c98-a2c8-e3c6d574a376.png)
### 📚 Can I count?
![image](https://user-images.githubusercontent.com/51888893/204533977-5e57c13f-2a3f-4ce4-a24d-24e78f04782e.png)

## 🦍 Sequencing stages
Here is a cursor, followed by four aggregation pipeline stages:

    cursor = (db.laureates.find(
        projection={"firstname": 1, "prizes.year": 1, "_id": 0},
        filter={"gender": "org"})
     .limit(3).sort("prizes.year", -1))

    project_stage = {"$project": {"firstname": 1, "prizes.year": 1, "_id": 0}}
    match_stage = {"$match": {"gender": "org"}}
    limit_stage = {"$limit": 3}
    sort_stage = {"$sort": {"prizes.year": -1}}

> What sequence `pipeline` of the above four stages produces a cursor `db.laureates.aggregate(pipeline)` equivalent to `cursor` above?

Possible Answers
- [ ] [project_stage, match_stage, limit_stage, sort_stage]
- [ ] [project_stage, match_stage, sort_stage, limit_stage]
- [ ] [match_stage, project_stage, limit_stage, sort_stage]
- [x] [match_stage, project_stage, sort_stage, limit_stage]
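
The checked answer puts `sort_stage` before `limit_stage` because a `find` cursor applies the sort before the limit no matter which order `.sort()` and `.limit()` are chained, whereas aggregation stages run strictly in list order. The sketch below imitates the two orderings in plain Python. The `docs` list and the flat `year` field are hypothetical stand-ins for the matched documents (the real collection sorts on the nested `prizes.year` field), and the two helper functions only mimic the stages; they are not PyMongo calls.

```python
# Hypothetical stand-in data: each dict mimics a matched, projected document.
docs = [
    {"firstname": "A", "year": 1917},
    {"firstname": "B", "year": 2012},
    {"firstname": "C", "year": 1944},
    {"firstname": "D", "year": 1999},
    {"firstname": "E", "year": 1963},
]


def sort_stage(items):
    """Mimic {"$sort": {"year": -1}}: most recent year first."""
    return sorted(items, key=lambda d: d["year"], reverse=True)


def limit_stage(items, n=3):
    """Mimic {"$limit": n}: keep only the first n documents."""
    return items[:n]


# Sort, then limit: the three most recent documents overall.
sort_then_limit = [d["year"] for d in limit_stage(sort_stage(docs))]

# Limit, then sort: the first three documents in storage order, then sorted.
limit_then_sort = [d["year"] for d in sort_stage(limit_stage(docs))]

print(sort_then_limit)
print(limit_then_sort)
```

The two lists differ, which is why only the sort-then-limit ordering can match the cursor.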
## 🦍 Aggregating a few individuals' country data
The following query cursor yields birth-country and prize-affiliation-country information for three non-organization laureates:

    cursor = (db.laureates.find(
        {"gender": {"$ne": "org"}},
        ["bornCountry", "prizes.affiliations.country"]
    ).limit(3))

- [x] Translate the above cursor, `cursor`, to an equivalent aggregation cursor, saving the pipeline stages to `pipeline`. Recall that the `find` collection method's "filter" parameter maps to the "$match" aggregation stage, its "projection" parameter maps to the "$project" stage, and the "limit" parameter (or cursor method) maps to the "$limit" stage.
```py
# Translate cursor to aggregation pipeline
pipeline = [
    {'$match': {'gender': {'$ne': 'org'}}},
    {'$project': {'bornCountry': 1, 'prizes.affiliations.country': 1}},
    {'$limit': 3}
]

# Print each laureate's birth country alongside the projected prize data
for doc in db.laureates.aggregate(pipeline):
    print("{bornCountry}: {prizes}".format(**doc))
```
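
The filter/projection/limit mapping described in the exercise can be sketched as a small builder function. `find_to_pipeline` is a hypothetical helper written for illustration, not part of PyMongo, and it covers only the three parameters named above (no sort or skip). It builds the stage list without contacting a database.

```python
def find_to_pipeline(filter=None, projection=None, limit=None):
    """Build aggregation stages from find()-style arguments.

    Hypothetical helper for illustration only. The parameter names mirror
    find(), so `filter` shadows the builtin inside this function.
    """
    pipeline = []
    if filter:
        pipeline.append({"$match": filter})
    if projection:
        # find() also accepts a list of field names; $project needs a dict.
        if not isinstance(projection, dict):
            projection = {field: 1 for field in projection}
        pipeline.append({"$project": projection})
    if limit:
        pipeline.append({"$limit": limit})
    return pipeline


pipeline = find_to_pipeline(
    filter={"gender": {"$ne": "org"}},
    projection=["bornCountry", "prizes.affiliations.country"],
    limit=3,
)
print(pipeline)
```

The resulting list has the same three stages as the hand-written pipeline in the exercise and could be passed to `db.laureates.aggregate` in the same way.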
## 🦍 Passing the aggregation baton to Python
- [x] Save to `pipeline` an aggregation pipeline to collect prize documents as detailed above. Use Python's `collections.OrderedDict` to specify any sorting.
```py
from collections import OrderedDict
from itertools import groupby
from operator import itemgetter

original_categories = set(db.prizes.distinct("category", {"year": "1901"}))

# Save a pipeline to collect original-category prizes
pipeline = [
    {"$match": {"category": {"$in": list(original_categories)}}},
    {"$project": {"category": 1, "year": 1}},
    {"$sort": OrderedDict([("year", -1)])}
]
cursor = db.prizes.aggregate(pipeline)

# groupby relies on the cursor already being sorted by year
for key, group in groupby(cursor, key=itemgetter("year")):
    missing = original_categories - {doc["category"] for doc in group}
    if missing:
        print("{year}: {missing}".format(
            year=key, missing=", ".join(sorted(missing))))
```
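
The Python half of this exercise depends on `itertools.groupby` receiving documents already ordered by `year`, which is what the `$sort` stage provides. The sketch below replays that step on a hypothetical in-memory list instead of a live cursor; the documents and the three-category set are made-up sample data, not results from the real `prizes` collection.

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical documents, already sorted by year in descending order,
# standing in for what the aggregation cursor would yield.
docs = [
    {"year": "1902", "category": "physics"},
    {"year": "1902", "category": "peace"},
    {"year": "1901", "category": "physics"},
    {"year": "1901", "category": "peace"},
    {"year": "1901", "category": "medicine"},
]
original_categories = {"physics", "peace", "medicine"}

# Collect, per year, the original categories with no prize document.
missing_by_year = {}
for year, group in groupby(docs, key=itemgetter("year")):
    missing = original_categories - {doc["category"] for doc in group}
    if missing:
        missing_by_year[year] = sorted(missing)

print(missing_by_year)

# Without sorting, groupby splits one year into several groups,
# because it only merges adjacent items with equal keys.
shuffled = [docs[0], docs[2], docs[1]]
unsorted_keys = [year for year, _ in groupby(shuffled, key=itemgetter("year"))]
print(unsorted_keys)
```

In the unsorted case the same year appears as two separate groups, so each group would report categories as missing that are actually present elsewhere in the input.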
