# Lab: Using Cursor-like aggregation stages

## For this lab, you'll have to use cursor-like aggregation stages to find the answer for the following scenario.

#### The dataset for this lab can be downloaded [here](https://s3.amazonaws.com/edu-static.mongodb.com/lessons/coursera/aggregation/movies.json) for upload to your own cluster.

### Movie Night

Your organization has a movie night scheduled, and you've again been tasked with coming up with a selection.

HR has polled employees and assembled the following list of preferred actresses and actors.

For movies released in the **USA** with a ``tomatoes.viewer.rating`` greater
than or equal to **3**, calculate a new field called ``num_favs`` that represents how
many **favorites** appear in the ``cast`` field of the movie.

Sort your results by ``num_favs``, ``tomatoes.viewer.rating``, and ``title``,
all in descending order.

What is the ``title`` of the **25th** film in the aggregation result?

**Hint**: MongoDB has a handy expression for finding the elements that two lists have in common: ``$setIntersection``.
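
As a rough illustration of what the hint is getting at, the sketch below mimics ``$size`` applied to ``$setIntersection`` using plain Python sets. The helper name ``count_favorites`` and the sample cast lists are made up for illustration; in the lab itself the computation happens server-side inside the aggregation pipeline.

```python
def count_favorites(favorites, cast):
    """Count distinct names present in both lists, like $size of $setIntersection."""
    # Sets drop duplicates, which mirrors $setIntersection treating arrays as sets.
    return len(set(favorites) & set(cast))


# Hypothetical sample data, not taken from the movies collection.
sample_favorites = ["Sandra Bullock", "Tom Hanks"]
print(count_favorites(sample_favorites, ["Tom Hanks", "Meg Ryan"]))        # 1
print(count_favorites(sample_favorites, ["Tom Hanks", "Sandra Bullock"]))  # 2
print(count_favorites(sample_favorites, ["Meg Ryan"]))                     # 0
```

One difference from the Python version: if ``cast`` is missing, ``$setIntersection`` yields null and ``$size`` raises an error, which is presumably why the ``$match`` stage in this notebook filters on ``cast`` first.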


In [96]:
import pymongo

In [97]:
course_cluster_uri = "mongodb://agg-student:agg-password@cluster0-shard-00-00-jxeqq.mongodb.net:27017,cluster0-shard-00-01-jxeqq.mongodb.net:27017,cluster0-shard-00-02-jxeqq.mongodb.net:27017/test?ssl=true&replicaSet=Cluster0-shard-0&authSource=admin"
course_client = pymongo.MongoClient(course_cluster_uri)

In [98]:
movies = course_client['aggregations']['movies']

In [99]:
favorites = [
  "Sandra Bullock",
  "Tom Hanks",
  "Julia Roberts",
  "Kevin Spacey",
  "George Clooney"
]

In [100]:
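predicate = {
    "$match": {
        "title": { "$exists": True },
        # An equality match on an array field matches when any element equals "USA".
        # ("$countries" inside a query is a literal string, not a field reference.)
        "countries": "USA",
        # Require a non-empty cast array so $setIntersection and $size receive an array.
        "cast": { "$elemMatch": { "$exists": True } },
        "tomatoes.viewer.rating": { "$gte": 3 }
    }
}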

In [112]:
projection = {
    "$project": {
        "_id": 0,
        "cast": 1,
        "title": 1,
        "num_favs": { "$size": { "$setIntersection" : [ favorites, "$cast"] } }
    }
}

In [117]:
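sorting = {
    "$sort": {
        "num_favs": -1,
        # Note: the $project stage above does not pass tomatoes.viewer.rating
        # through, so this key only affects the order if that field is projected.
        "tomatoes.viewer.rating": -1,
        "title": -1
    }
}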


In [118]:
skipping = {
    "$skip": 24
}

In [119]:
limiting = {
    "$limit": 1
}
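
For intuition, ``$sort`` followed by ``$skip: 24`` and ``$limit: 1`` behaves like sorting a list and then taking the element at index 24. The sketch below shows the same idea on a handful of made-up documents with a smaller offset. ``nth_result`` is a hypothetical helper, not part of pymongo, and the flat ``rating`` key stands in for ``tomatoes.viewer.rating``.

```python
def nth_result(docs, n):
    """Return the n-th (1-based) document after a descending sort, or None."""
    # Descending on every key, like {"num_favs": -1, "rating": -1, "title": -1}.
    ordered = sorted(
        docs,
        key=lambda d: (d["num_favs"], d["rating"], d["title"]),
        reverse=True,
    )
    page = ordered[n - 1:n]  # same effect as $skip: n - 1 followed by $limit: 1
    return page[0] if page else None


# Made-up documents for illustration only.
docs = [
    {"title": "A", "num_favs": 1, "rating": 3.5},
    {"title": "B", "num_favs": 2, "rating": 3.0},
    {"title": "C", "num_favs": 1, "rating": 4.0},
    {"title": "D", "num_favs": 1, "rating": 3.5},
]
print(nth_result(docs, 3)["title"])  # D
```

Asking for a position past the end of the results returns ``None`` here, just as the pipeline would return an empty list.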

In [120]:
pipeline = [
    predicate,
    projection,
    sorting,
    skipping,
    limiting
]

display(list(movies.aggregate(pipeline)))

[{'title': 'The Ref',
  'cast': ['Denis Leary',
   'Judy Davis',
   'Kevin Spacey',
   'Robert J. Steinmiller Jr.'],
  'num_favs': 1}]
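
Because the ``$project`` stage above drops ``tomatoes.viewer.rating``, the second sort key has nothing to compare by the time ``$sort`` runs. One possible variant, sketched below, keeps that field in the projection. ``build_pipeline`` and its ``position`` parameter are hypothetical names introduced here, and this variant has not been run against the course cluster, so no result is shown.

```python
def build_pipeline(favorites, position=25):
    """Build the lab pipeline, keeping the rating so $sort can use it."""
    return [
        {"$match": {
            "title": {"$exists": True},
            "countries": "USA",
            "cast": {"$elemMatch": {"$exists": True}},
            "tomatoes.viewer.rating": {"$gte": 3},
        }},
        {"$project": {
            "_id": 0,
            "title": 1,
            "tomatoes.viewer.rating": 1,  # kept so the sort below can see it
            "num_favs": {"$size": {"$setIntersection": [favorites, "$cast"]}},
        }},
        {"$sort": {"num_favs": -1, "tomatoes.viewer.rating": -1, "title": -1}},
        {"$skip": position - 1},  # skip everything before the requested position
        {"$limit": 1},
    ]


pipeline = build_pipeline(["Sandra Bullock", "Tom Hanks"])
print([next(iter(stage)) for stage in pipeline])
```

The returned list can be passed to ``movies.aggregate(...)`` in the same way as the ``pipeline`` cell above.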