Commit 2339401

Update 3. Get Only What You Need, and Fast.md
1 parent fa09cfa commit 2339401

1 file changed

+53
-0
lines changed

Introduction to MongoDB in Python/3. Get Only What You Need, and Fast.md

Lines changed: 53 additions & 0 deletions
@@ -194,3 +194,56 @@ docs = db.prizes.find(
for doc in docs:
    print(doc)
```
## 🦍 High-share categories
> Which of the following indexes is best suited to speeding up the operation
> `db.prizes.distinct("category", {"laureates.share": {"$gt": "3"}})`?

Possible Answers

- [ ] `[("category", 1)]`
- [ ] `[("category", 1), ("laureates.share", 1)]`
- [ ] `[("laureates.share", 1)]`
- [x] `[("laureates.share", 1), ("category", 1)]`

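For intuition on why the filtered field (`laureates.share`) leads the chosen index and the distinct field (`category`) follows, here is a minimal in-memory sketch. It is a toy model only, not MongoDB's index implementation: the sample documents are invented, and the sorted list of tuples merely stands in for the compound index.

```python
from bisect import bisect_right

# Invented sample documents shaped like db.prizes entries (not real data)
prizes = [
    {"category": "physics", "laureates": [{"share": "4"}, {"share": "4"}, {"share": "2"}]},
    {"category": "chemistry", "laureates": [{"share": "1"}]},
    {"category": "medicine", "laureates": [{"share": "3"}, {"share": "3"}, {"share": "3"}]},
    {"category": "peace", "laureates": [{"share": "2"}, {"share": "4"}, {"share": "4"}]},
]

# Toy stand-in for the index [("laureates.share", 1), ("category", 1)]:
# sorted (share, category) pairs, one per distinct pair found in the data
index = sorted({
    (laureate["share"], prize["category"])
    for prize in prizes
    for laureate in prize["laureates"]
})

# Range scan: skip every entry whose share is <= "3"
# (string comparison, matching the string-valued shares in the query)
shares = [share for share, _ in index]
start = bisect_right(shares, "3")

# In this toy model the categories are read directly from the remaining entries
high_share_categories = sorted({category for _, category in index[start:]})
print(high_share_categories)
```

Because the entries are ordered by share first, the matching range is contiguous and the category sits next to each match. With `category` leading instead, matching shares would be scattered across the whole list. Whether MongoDB can avoid fetching documents in the real case depends on its query planner and on the restrictions that apply to indexes over array fields, so treat this only as an illustration of key order.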
## 🦍 Recently single?
- [x] Specify an index model that indexes first on category (ascending) and second on year (descending).
- [x] Save a string report listing the last single-laureate year for each distinct category, one category per line. To do this, for each distinct prize category, find the most recent prize in that category (filter on the category and sort by year, descending) that has a laureate share of `"1"`.
```py
# Specify an index model for compound sorting
index_model = [('category', 1), ('year', -1)]
db.prizes.create_index(index_model)

# Collect the last single-laureate year for each category
report = ""
for category in sorted(db.prizes.distinct("category")):
    doc = db.prizes.find_one(
        {'category': category, "laureates.share": "1"},
        sort=[('year', -1)]
    )
    report += "{category}: {year}\n".format(**doc)

print(report)
```
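The ordering that the `(category ascending, year descending)` index model describes can be sketched without a database. The following is a rough in-memory illustration under stated assumptions: the documents are invented placeholders, years are four-digit strings, and the `None` check is an addition for categories with no single-laureate prize.

```python
from itertools import groupby
from operator import itemgetter

# Invented sample documents shaped like db.prizes entries (not real data)
prizes = [
    {"category": "physics", "year": "2017",
     "laureates": [{"share": "2"}, {"share": "4"}, {"share": "4"}]},
    {"category": "physics", "year": "1992", "laureates": [{"share": "1"}]},
    {"category": "physics", "year": "1991", "laureates": [{"share": "1"}]},
    {"category": "peace", "year": "2016", "laureates": [{"share": "1"}]},
    {"category": "peace", "year": "2015", "laureates": [{"share": "1"}]},
]

# Mimic the compound ordering: category ascending, then year descending.
# Python's sort is stable, so sort by the secondary key first.
ordered = sorted(prizes, key=itemgetter("year"), reverse=True)
ordered.sort(key=itemgetter("category"))

report = ""
for category, group in groupby(ordered, key=itemgetter("category")):
    # First prize in this category (latest year first) with a share of "1"
    doc = next(
        (p for p in group if any(l["share"] == "1" for l in p["laureates"])),
        None,
    )
    if doc is not None:  # skip categories without a single-laureate prize
        report += "{category}: {year}\n".format(**doc)

print(report)
```

With the data in this order, the answer for each category is the first qualifying entry in its group, which is the property the descending `year` key is meant to provide.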

## 🦍 Born and affiliated
- [x] Create an index on country of birth (`"bornCountry"`) for `db.laureates` to ensure efficient gathering of distinct values and counting of documents.
- [x] Complete the skeleton dictionary comprehension to construct `n_born_and_affiliated`, the count of laureates as described above for each distinct country of birth. For each call to `count_documents`, ensure that you use the value of `country` to filter documents properly.
```py
from collections import Counter

# Ensure an index on country of birth
db.laureates.create_index([("bornCountry", 1)])

# Collect a count of laureates for each country of birth
n_born_and_affiliated = {
    country: db.laureates.count_documents({
        "bornCountry": country,
        "prizes.affiliations.country": country
    })
    for country in db.laureates.distinct("bornCountry")
}

five_most_common = Counter(n_born_and_affiliated).most_common(5)
print(five_most_common)
```

[('USA', 241), ('United Kingdom', 56), ('France', 26), ('Germany', 19), ('Japan', 17)]
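
To make the counting logic concrete, here is a small in-memory sketch of the same dictionary comprehension and `Counter` step. The documents and the helper `affiliated_in` are hypothetical, introduced only for illustration; the helper approximates how the dotted path `"prizes.affiliations.country"` matches when any nested affiliation has that country.

```python
from collections import Counter

# Invented sample documents shaped like db.laureates entries (not real data)
laureates = [
    {"bornCountry": "USA", "prizes": [{"affiliations": [{"country": "USA"}]}]},
    {"bornCountry": "USA", "prizes": [{"affiliations": [{"country": "United Kingdom"}]}]},
    {"bornCountry": "USA", "prizes": [{"affiliations": [{"country": "USA"}]},
                                      {"affiliations": [{"country": "USA"}]}]},
    {"bornCountry": "France", "prizes": [{"affiliations": [{"country": "France"}]}]},
    {"bornCountry": "Japan", "prizes": [{}]},  # prize with no affiliation
]


def affiliated_in(laureate, country):
    """Hypothetical helper: True if any prize affiliation is in `country`."""
    return any(
        affiliation.get("country") == country
        for prize in laureate.get("prizes", [])
        for affiliation in prize.get("affiliations", [])
    )


# Count laureates (not prizes) born in and affiliated with the same country
n_born_and_affiliated = {
    country: sum(
        1 for laureate in laureates
        if laureate["bornCountry"] == country and affiliated_in(laureate, country)
    )
    for country in sorted({laureate["bornCountry"] for laureate in laureates})
}

most_common = Counter(n_born_and_affiliated).most_common(1)
print(most_common)
```

Each laureate is counted once per country even when several prizes match, which mirrors counting documents rather than counting matching array elements.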
