# Projection and Sorting

## What is "projection"?

- Reducing data to fewer dimensions
- Asking certain data to "speak up"!

![](assets/map-projections.jpg)

## Projection in MongoDB

In [None]:
from pymongo import MongoClient

client = MongoClient()
db = client.nobel

When using `db.collection.find()`, the first argument is the filter and the second is the projection.

In [None]:
db.laureates.find({}, {})

Because `db.collection.find()` returns a cursor, we need to either iterate over its contents or convert it to a list:

In [None]:
for doc in db.laureates.find({}, {}):
    print(doc)

In [None]:
list(db.laureates.find({}, {}))[:3]

By default, an empty projection dictionary displays only the `"_id"` field of each document. To project the `"prizes.affiliations"` field without the `"_id"` field, we would use the projection `{"prizes.affiliations": 1, "_id": 0}`.

In [None]:
list(db.laureates.find({}, {"prizes.affiliations": 1, "_id": 0}))[:3]

Here, the value `1` turns on `"prizes.affiliations"` and the value `0` turns off `"_id"`.
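To make the on/off flags concrete, here is a minimal plain-Python sketch of how an inclusion projection behaves on a single dictionary. The `project` helper and the sample document are invented for illustration and are not part of PyMongo. The sketch only covers top-level fields that are switched on, plus the optional exclusion of `"_id"`. It does not handle dotted paths such as `"prizes.affiliations"`.

```python
def project(doc, projection):
    """Imitate an inclusion projection on top-level fields (hypothetical helper)."""
    # "_id" is returned unless the projection explicitly sets it to 0.
    keep_id = bool(projection.get("_id", 1))
    # Fields switched on with a truthy value such as 1.
    wanted = {name for name, flag in projection.items() if flag and name != "_id"}
    result = {}
    for name, value in doc.items():
        if name == "_id":
            if keep_id:
                result[name] = value
        elif name in wanted:
            result[name] = value
    return result


# Invented sample document, not taken from the Nobel dataset.
doc = {"_id": 42, "firstname": "Ada", "surname": "Example", "gender": "female"}

with_id = project(doc, {"firstname": 1})
without_id = project(doc, {"firstname": 1, "_id": 0})
print(with_id)
print(without_id)
```

In the real database, the server performs the projection, so only the requested fields travel over the network.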

## Missing fields

In [None]:
list(db.laureates.find({"gender": "org"}, {"firstname": 1, "born": 1, "_id": 0}))

In [None]:
list(db.laureates.find({"gender": "org"}, {"favoriteIceCreamFlavor": 1, "_id": 0}))
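Projecting a field that a document lacks is not an error: MongoDB leaves that field out of the returned document. Code that consumes projected documents therefore should not assume every key is present. The sketch below uses invented stand-in documents (not real query results) and a hypothetical `describe` helper to show one defensive pattern, `dict.get` with a fallback value.

```python
# Invented stand-ins for documents returned with the projection
# {"firstname": 1, "born": 1, "_id": 0}; the second one lacks "born".
projected = [
    {"firstname": "Example Person", "born": "1901-02-03"},
    {"firstname": "Example Organization"},
]


def describe(doc):
    """Format one projected document, tolerating absent fields."""
    return "{} (born: {})".format(
        doc.get("firstname", "unknown"),
        doc.get("born", "n/a"),
    )


lines = [describe(doc) for doc in projected]
for line in lines:
    print(line)
```

Indexing with `doc["born"]` instead would raise a `KeyError` for the second document.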

## Shares of the 1963 Prize in Physics

Let's examine the laureates of the 1963 prize in physics and how they split the prize. Here is a query without projection:

```python
db.laureates.find({"prizes": {"$elemMatch": {"category": "physics", "year": "1963"}}})
```

How would we fetch the laureates' full names and prize share info?

In [None]:
projection = {"firstname": 1, "surname": 1, "prizes.share": 1, "_id": 0}

list(db.laureates.find({"prizes": {"$elemMatch": {"category": "physics", "year": "1963"}}}, projection))

## Sorting post-query with Python

In [None]:
from operator import itemgetter

docs = list(db.prizes.find({"category": "physics"}, ["year"]))

docs = sorted(docs, key=itemgetter("year"))
print([doc["year"] for doc in docs][:5])

In [None]:
docs = sorted(docs, key=itemgetter("year"), reverse=True)
print([doc["year"] for doc in docs][:5])
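Note that the `"year"` values in these queries are strings, so `sorted` compares them character by character. That gives the chronological order here only because every year has four digits. The following sketch, using invented sample documents rather than a live query, shows the same `itemgetter` pattern and where string ordering would differ from numeric ordering.

```python
from operator import itemgetter

# Invented sample documents shaped like the projected prize documents.
docs = [{"year": "1963"}, {"year": "1901"}, {"year": "2001"}]

ascending = [doc["year"] for doc in sorted(docs, key=itemgetter("year"))]
descending = [doc["year"] for doc in
              sorted(docs, key=itemgetter("year"), reverse=True)]
print(ascending)
print(descending)

# Strings of different lengths do not sort numerically...
as_text = sorted(["999", "1000"])
# ...unless the key converts them to integers first.
as_numbers = sorted(["999", "1000"], key=int)
print(as_text, as_numbers)
```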

## Sorting in-query with MongoDB

In [None]:
cursor = db.prizes.find({"category": "physics"}, ["year"],
                        sort=[("year", 1)])
print([doc["year"] for doc in cursor][:5])

In [None]:
cursor = db.prizes.find({"category": "physics"}, ["year"],
                        sort=[("year", -1)])
print([doc["year"] for doc in cursor][:5])

## Primary and secondary sorting

In [None]:
for doc in db.prizes.find(
        {"year": {"$gt": "1966", "$lt": "1970"}},
        {"category": 1, "year": 1, "_id": 0},
        sort=[("year", 1), ("category", -1)]):
    print("{year} {category}".format(**doc))
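For comparison, the same primary and secondary ordering can be sketched post-query in plain Python. Because Python's sort is stable, one approach is to sort by the secondary key first and then by the primary key. The documents below are invented samples, not results from the `prizes` collection.

```python
from operator import itemgetter

# Invented sample documents in arbitrary order.
docs = [
    {"year": "1968", "category": "physics"},
    {"year": "1967", "category": "chemistry"},
    {"year": "1968", "category": "chemistry"},
    {"year": "1967", "category": "physics"},
]

# Secondary key first: category, descending.
docs = sorted(docs, key=itemgetter("category"), reverse=True)
# Primary key last: year, ascending. Ties keep the category order from above.
docs = sorted(docs, key=itemgetter("year"))

lines = ["{year} {category}".format(**doc) for doc in docs]
for line in lines:
    print(line)
```

Sorting in the query is usually preferable when the result set is large, since the server can do the work before any documents are returned.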

## What the sort?

This block prints out the first five projections of a sorted query. What "sort" argument fills the blank?

```python
docs = list(db.laureates.find(
    {"born": {"$gte": "1900"},
     "prizes.year": {"$gte": "1954"},
     "gender": {"$in": ["male", "female"]}},
    {"born": 1, "prizes.year": 1, "_id": 0},
    sort=____))
for doc in docs[:5]:
    print(doc)
```
```
{'born': '1916-08-25', 'prizes': [{'year': '1954'}]}
{'born': '1915-06-15', 'prizes': [{'year': '1954'}]}
{'born': '1901-02-28', 'prizes': [{'year': '1954'}, {'year': '1962'}]}
{'born': '1913-07-12', 'prizes': [{'year': '1955'}]}
{'born': '1911-01-26', 'prizes': [{'year': '1955'}]}
```

The primary sort is ascending by `"prizes.year"` and the secondary sort is descending by `"born"`, so the answer is `sort=[("prizes.year", 1), ("born", -1)]`.

In [None]:
docs = list(db.laureates.find(
    {"born": {"$gte": "1900"},
     "prizes.year": {"$gte": "1954"},
     "gender": {"$in": ["male", "female"]}},
    {"born": 1, "prizes.year": 1, "_id": 0},
    sort=[("prizes.year", 1), ("born", -1)]))
for doc in docs[:5]:
    print(doc)
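One detail worth noticing in the printed output is the laureate with two prizes (`'1954'` and `'1962'`), who is grouped with the 1954 winners. When sorting ascending on an array field, MongoDB compares documents by the smallest element of the array. The sketch below reproduces the printed order in plain Python under that assumption, using the five documents from the output above in shuffled order. The `earliest_prize_year` helper is hypothetical.

```python
# The five documents from the printed output, deliberately out of order.
docs = [
    {"born": "1911-01-26", "prizes": [{"year": "1955"}]},
    {"born": "1901-02-28", "prizes": [{"year": "1954"}, {"year": "1962"}]},
    {"born": "1916-08-25", "prizes": [{"year": "1954"}]},
    {"born": "1913-07-12", "prizes": [{"year": "1955"}]},
    {"born": "1915-06-15", "prizes": [{"year": "1954"}]},
]


def earliest_prize_year(doc):
    """Smallest "year" among a laureate's prizes (hypothetical helper)."""
    return min(prize["year"] for prize in doc["prizes"])


# Secondary key first ("born", descending), then the primary key
# (earliest prize year, ascending); Python's sort is stable.
docs = sorted(docs, key=lambda doc: doc["born"], reverse=True)
docs = sorted(docs, key=earliest_prize_year)

born_order = [doc["born"] for doc in docs]
print(born_order)
```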