# MongoDB Quiz

This notebook contains a list of exercises to assess your MongoDB knowledge.

The deadline is <font color='red'>April 12, 23:59</font>

## Introduction

The quiz uses the <font color="red">sample_mflix</font> database.

Make sure all your answers execute correctly before submitting your notebook.

In [1]:
!pip install pymongo
!curl ifconfig.me



In [2]:
from pymongo.mongo_client import MongoClient
from pymongo.server_api import ServerApi
import certifi
# Initiate the database connection
username = "sanjayadg98"
password = "sana1998"
cluster_url = "cluster0.8efpevm.mongodb.net"

# Build the Atlas connection string from the variables defined above
uri = f"mongodb+srv://{username}:{password}@{cluster_url}/?retryWrites=true&w=majority&appName=Cluster0"

# Create a new client and connect to the server
client = MongoClient(uri, server_api=ServerApi('1'), tlsCAFile=certifi.where())

# Send a ping to confirm a successful connection
try:
    client.admin.command('ping')
    print("Pinged your deployment. You successfully connected to MongoDB!")
except Exception as e:
    print(e)


Pinged your deployment. You successfully connected to MongoDB!
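
Hard-coding the Atlas username and password in the notebook exposes them to anyone who receives the file. A minimal sketch of one alternative, assuming the credentials are exported as environment variables (the names `MONGODB_USER`, `MONGODB_PASSWORD`, and `MONGODB_HOST` are hypothetical) and percent-encoded with `urllib.parse.quote_plus` so that special characters do not break the URI:

```python
import os
from urllib.parse import quote_plus


def build_uri(username: str, password: str, host: str) -> str:
    """Build an Atlas SRV connection string with percent-encoded credentials."""
    return (
        f"mongodb+srv://{quote_plus(username)}:{quote_plus(password)}@{host}"
        "/?retryWrites=true&w=majority"
    )


# Hypothetical variable names; the fallbacks are placeholders, not real credentials.
uri = build_uri(
    os.environ.get("MONGODB_USER", "user"),
    os.environ.get("MONGODB_PASSWORD", "p@ss/word"),
    os.environ.get("MONGODB_HOST", "cluster0.example.mongodb.net"),
)
print(uri)
```

The resulting `uri` can be passed to `MongoClient` exactly as in the cell above.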


## 1. Easy: Find Documents

Retrieve one document from the `movies` collection where the `genres` contains `Comedy`.

* Print the result.
* Store the _id of the movie into a variable.

In [3]:
# Type your answer
# Select the sample_mflix database
db = client["sample_mflix"]

# Retrieve one document where genres contains "Comedy"
movie = db.movies.find_one({"genres": "Comedy"})

# Print the result
print(movie)

# Store the _id of the movie into a variable
movie_id = movie["_id"]

# Print the movie_id
print("Movie ID:", movie_id)


{'_id': ObjectId('573a1390f29313caabcd4803'), 'plot': 'Cartoon figures announce, via comic strip balloons, that they will move - and move they do, in a wildly exaggerated style.', 'genres': ['Animation', 'Short', 'Comedy'], 'runtime': 7, 'cast': ['Winsor McCay'], 'num_mflix_comments': 0, 'poster': 'https://m.media-amazon.com/images/M/MV5BYzg2NjNhNTctMjUxMi00ZWU4LWI3ZjYtNTI0NTQxNThjZTk2XkEyXkFqcGdeQXVyNzg5OTk2OA@@._V1_SY1000_SX677_AL_.jpg', 'title': 'New Title', 'fullplot': 'Cartoonist Winsor McCay agrees to create a large set of drawings that will be photographed and made into a motion picture. The job requires plenty of drawing supplies, and the cartoonist must also overcome some mishaps caused by an assistant. Finally, the work is done, and everyone can see the resulting animated picture.', 'languages': ['English'], 'released': datetime.datetime(1911, 4, 8, 0, 0), 'directors': ['Winsor McCay', 'J. Stuart Blackton'], 'writers': ['Winsor McCay (comic strip "Little Nemo in Slumberland")

## 2. Easy: Find Documents

Get the number of documents in the `movies` collection where the `genres` contains `Drama` and `History`.

* Print the result.

In [4]:
# Type your answer
# Select the sample_mflix database
db = client["sample_mflix"]

# Count the number of documents where genres contain both Drama and History
count = db.movies.count_documents({"genres": {"$all": ["Drama", "History"]}})

# Print the result
print("Number of documents where genres contain Drama and History:", count)

Number of documents where genres contain Drama and History: 640
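
`$all` matches only when the array holds every listed value, whereas `$in` would match when it holds any of them. The difference can be sketched in plain Python on a few made-up documents (the sample data and helper names are illustrative and not part of `sample_mflix`):

```python
# Made-up documents used only to illustrate the two operators.
docs = [
    {"title": "A", "genres": ["Drama", "History"]},
    {"title": "B", "genres": ["Drama"]},
    {"title": "C", "genres": ["History", "War", "Drama"]},
    {"title": "D", "genres": ["Comedy"]},
]
wanted = ["Drama", "History"]


def matches_all(doc, values):
    """Emulate {"genres": {"$all": values}}: every value must be present."""
    return all(v in doc.get("genres", []) for v in values)


def matches_in(doc, values):
    """Emulate {"genres": {"$in": values}}: at least one value must be present."""
    return any(v in doc.get("genres", []) for v in values)


all_titles = [d["title"] for d in docs if matches_all(d, wanted)]
in_titles = [d["title"] for d in docs if matches_in(d, wanted)]
print(all_titles)  # ['A', 'C']
print(in_titles)   # ['A', 'B', 'C']
```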


## 3. Easy: Basic Update

Update the title of the movie previously retrieved to "New Title"

* Use the variable created in #1
* Print the movie title before and after the update using find_one with projection and the `_id` from #1

In [5]:
# Type your answer
#Select the sample_mflix database
db = client["sample_mflix"]

# Retrieve the movie title using the movie_id variable from #1 and a projection
movie = db.movies.find_one({"_id": movie_id}, projection={"title": 1})

# Print the movie title before the update
print("Movie title before update:", movie["title"])

# Update the title of the movie to "New Title"
db.movies.update_one({"_id": movie_id}, {"$set": {"title": "New Title"}})

# Retrieve the updated movie document
updated_movie = db.movies.find_one({"_id": movie_id}, projection={"title": 1})

# Print the movie title after the update
print("Movie title after update:", updated_movie["title"])

Movie title before update: New Title
Movie title after update: New Title


## 4. Easy: Simple deletion

Delete all documents from the `comments` collection where the field `name` is `Peggy Heath`.

In [6]:
# Type your answer
# Select the sample_mflix database
db = client["sample_mflix"]

# Delete all documents from the "comments" collection where the field `name` is `Peggy Heath`
result = db.comments.delete_many({"name": "Peggy Heath"})

# Confirm the deletion (result.deleted_count holds the number of deleted documents)
print("Deletion is completed")

Deletion is completed


## 5. Moderate: Conditional Find

Retrieve all movies from the `movies` collection released in the year 2015 with an `imdb.rating` greater than 4 and a `rated` value of `R`.

Print the count of the documents.

In [7]:
# Type your answer
# Select the sample_mflix database
db = client["sample_mflix"]

# Define the query criteria
query = {
    "year": 2015,
    "imdb.rating": {"$gt": 4},
    "rated": "R"
}

# Count the number of documents directly on the collection
count = db.movies.count_documents(query)

# Print the count of documents
print("Number of movies released in 2015 with an imdb rating greater than 4 and rated R:", count)

Number of movies released in 2015 with an imdb rating greater than 4 and rated R: 48
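
The query above filters on the integer `year` field. If "released in 2015" is instead read against the `released` date (a `datetime`, as the document printed in #1 shows), a half-open date range is one way to express it. A sketch under that assumption; the helper name `released_in_year_query` is hypothetical:

```python
from datetime import datetime


def released_in_year_query(year, min_rating, rated):
    """Build a filter on the `released` date instead of the `year` field."""
    start = datetime(year, 1, 1)      # inclusive lower bound
    end = datetime(year + 1, 1, 1)    # exclusive upper bound
    return {
        "released": {"$gte": start, "$lt": end},
        "imdb.rating": {"$gt": min_rating},
        "rated": rated,
    }


query_by_date = released_in_year_query(2015, 4, "R")
print(query_by_date)
```

Passing `query_by_date` to `db.movies.count_documents` may give a different count from the `year` filter, since the two fields are stored independently.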


## 6. Moderate: Conditional Update

Increment the `runtime` field by 1 for the movie with _id `573a13cdf29313caabd841e8`

* Print the runtime before and after the update using find_one with the `_id` and projection.
* You can import ObjectId using `from bson.objectid import ObjectId`

In [8]:
# Type your answer
from bson.objectid import ObjectId

# Select the sample_mflix database
db = client["sample_mflix"]

# Define the _id of the movie
movie_id = ObjectId("573a13cdf29313caabd841e8")

# Find the movie with the specified _id and projection to get the current runtime
movie_before_update = db.movies.find_one({"_id": movie_id}, {"runtime": 1})

# Print the runtime before the update
print("Runtime before update:", movie_before_update["runtime"])

# Increment the runtime by 1
db.movies.update_one({"_id": movie_id}, {"$inc": {"runtime": 1}})

# Find the movie with the specified _id and projection to get the updated runtime
movie_after_update = db.movies.find_one({"_id": movie_id}, {"runtime": 1})

# Print the runtime after the update
print("Runtime after update:", movie_after_update["runtime"])


Runtime before update: 107
Runtime after update: 108


## 7. Difficult: Aggregation Pipeline

Find the average `imdb.rating` for each genre in the `movies` collection and sort the result in descending order.

* Movies can have multiple `genres`, use an operator to transform the list of `genres`
* [Example of average calculation](https://www.mongodb.com/docs/manual/reference/operator/aggregation/avg/#use-in--group-stage)
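
The effect of transforming the `genres` list before grouping can be sketched in plain Python. The snippet emulates `$unwind`, `$group` with `$avg`, and `$sort` on made-up documents (titles and ratings are illustrative only), showing that a movie with two genres contributes its rating to both groups:

```python
from collections import defaultdict

# Made-up documents; the titles and ratings are illustrative only.
movies = [
    {"title": "A", "genres": ["Drama", "War"], "imdb": {"rating": 8.0}},
    {"title": "B", "genres": ["Drama"], "imdb": {"rating": 6.0}},
    {"title": "C", "genres": ["War"], "imdb": {"rating": 7.0}},
]

# $unwind: emit one (genre, rating) row per element of the genres array.
rows = [(g, m["imdb"]["rating"]) for m in movies for g in m["genres"]]

# $group with $avg: collect ratings per genre, then average them.
ratings_by_genre = defaultdict(list)
for genre, rating in rows:
    ratings_by_genre[genre].append(rating)
averages = {g: sum(r) / len(r) for g, r in ratings_by_genre.items()}

# $sort: highest average first.
ranked = sorted(averages.items(), key=lambda item: item[1], reverse=True)
print(ranked)  # [('War', 7.5), ('Drama', 7.0)]
```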

In [9]:
# Type your answer
# Select the sample_mflix database
db = client["sample_mflix"]

# Pipeline for aggregation
pipeline = [
    {
        "$unwind": "$genres"  # Unwind the genres array
    },
    {
        "$group": {
            "_id": "$genres",  # Group by genre
            "average_rating": {"$avg": "$imdb.rating"}  # Calculate the average rating for each genre
        }
    },
    {
        "$sort": {"average_rating": -1}  # Sort the result in descending order of average rating
    }
]

# Execute the aggregation pipeline
result = list(db.movies.aggregate(pipeline))

# Print the result
for genre_data in result:
    print("Genre:", genre_data["_id"])
    print("Average IMDb Rating:", genre_data["average_rating"])
    print()


Genre: Film-Noir
Average IMDb Rating: 7.397402597402598

Genre: Short
Average IMDb Rating: 7.377574370709382

Genre: Documentary
Average IMDb Rating: 7.365679824561403

Genre: News
Average IMDb Rating: 7.252272727272728

Genre: History
Average IMDb Rating: 7.1696100917431185

Genre: War
Average IMDb Rating: 7.128591954022989

Genre: Biography
Average IMDb Rating: 7.087984189723319

Genre: Talk-Show
Average IMDb Rating: 7.0

Genre: Animation
Average IMDb Rating: 6.89669603524229

Genre: Music
Average IMDb Rating: 6.883333333333334

Genre: Western
Average IMDb Rating: 6.823553719008264

Genre: Drama
Average IMDb Rating: 6.803377338624768

Genre: Sport
Average IMDb Rating: 6.749041095890411

Genre: Crime
Average IMDb Rating: 6.688585405625764

Genre: Musical
Average IMDb Rating: 6.665831435079727

Genre: Romance
Average IMDb Rating: 6.6564272782136396

Genre: Mystery
Average IMDb Rating: 6.527425044091711

Genre: Adventure
Average IMDb Rating: 6.493680884676145

Genre: Comedy
Average IMDb

Find the average `imdb.rating` for each genre in the `movies` collection, <font color="red">for movies that have been nominated for at least one award</font> and sort the result in descending order.

* Award information for a movie is available under the `awards` field.

In [10]:
# Type your answer
#Select the sample_mflix database
db = client["sample_mflix"]

# Pipeline for aggregation
pipeline = [
    {
        "$match": {
            "awards.nominations": {"$gt": 0}  # Filter movies with at least one nomination
        }
    },
    {
        "$unwind": "$genres"  # Unwind the genres array
    },
    {
        "$group": {
            "_id": "$genres",  # Group by genre
            "average_rating": {"$avg": "$imdb.rating"}  # Calculate the average rating for each genre
        }
    },
    {
        "$sort": {"average_rating": -1}  # Sort the result in descending order of average rating
    }
]

# Execute the aggregation pipeline
result = list(db.movies.aggregate(pipeline))

# Print the result
for genre_data in result:
    print("Genre:", genre_data["_id"])
    print("Average IMDb Rating:", genre_data["average_rating"])
    print()

Genre: Short
Average IMDb Rating: 7.387550200803213

Genre: Film-Noir
Average IMDb Rating: 7.365384615384615

Genre: Documentary
Average IMDb Rating: 7.359469417833456

Genre: News
Average IMDb Rating: 7.346875

Genre: History
Average IMDb Rating: 7.170079787234042

Genre: War
Average IMDb Rating: 7.143382352941177

Genre: Biography
Average IMDb Rating: 7.092733878292462

Genre: Talk-Show
Average IMDb Rating: 7.0

Genre: Music
Average IMDb Rating: 6.868005952380953

Genre: Animation
Average IMDb Rating: 6.840551724137931

Genre: Western
Average IMDb Rating: 6.8269938650306745

Genre: Drama
Average IMDb Rating: 6.807097021358107

Genre: Sport
Average IMDb Rating: 6.707073954983923

Genre: Crime
Average IMDb Rating: 6.697655502392345

Genre: Romance
Average IMDb Rating: 6.64971160778659

Genre: Musical
Average IMDb Rating: 6.640340909090909

Genre: Mystery
Average IMDb Rating: 6.546060606060607

Genre: Adventure
Average IMDb Rating: 6.473182651191204

Genre: Comedy
Average IMDb Rating: 6

Find the average `imdb.rating` for each genre in the `movies` collection, <font color="red">for movies that have won at least one award</font> and sort the result in descending order.

* Award information for a movie is available under the `awards` field.

In [11]:

# Type your answer
#Select the sample_mflix database
db = client["sample_mflix"]

# Pipeline for aggregation
pipeline = [
    {
        "$match": {
            "awards.wins": {"$gt": 0}  # Filter movies with at least one award win
        }
    },
    {
        "$unwind": "$genres"  # Unwind the genres array
    },
    {
        "$group": {
            "_id": "$genres",  # Group by genre
            "average_rating": {"$avg": "$imdb.rating"}  # Calculate the average rating for each genre
        }
    },
    {
        "$sort": {"average_rating": -1}  # Sort the result in descending order of average rating
    }
]

# Execute the aggregation pipeline
result = list(db.movies.aggregate(pipeline))

for genre_data in result:
    print("Genre:", genre_data["_id"])
    print("Average IMDb Rating:", genre_data["average_rating"])
    print()


## 8. Difficult: Complex Aggregation

Calculate the total number of comments for each user in the `comments` collection, then find the user with the highest number of comments.

In [12]:
# Type your answer


db = client["sample_mflix"]

# Pipeline for aggregation
pipeline = [
    {
        "$group": {
            "_id": "$name",  # Group by user name
            "total_comments": {"$sum": 1}  # Count comments for each user
        }
    },
    {
        "$sort": {"total_comments": -1}  # Sort by total_comments in descending order
    },
    {
        "$limit": 1  # Limit to the top user with the highest number of comments
    }
]

result = list(db.comments.aggregate(pipeline))

if result:
    top_user = result[0]
    print("User with the highest number of comments:")
    print("User:", top_user["_id"])
    print("Total Comments:", top_user["total_comments"])
else:
    print("No comments found.")


User with the highest number of comments:
User: Mace Tyrell
Total Comments: 277


Calculate the total number of comments for each movie in the `comments` collection, then find the movie name with the highest number of comments.

* Use the [$lookup](https://www.mongodb.com/docs/manual/reference/operator/aggregation/lookup/) stage to get information about the movie

In [13]:
# Type your answer
# Aggregate to get the total number of comments for each movie
pipeline = [
    {
        "$group": {
            "_id": "$movie_id",
            "total_comments": {"$sum": 1}
        }
    },
    {
        "$sort": {"total_comments": -1}  # Most-commented movie first
    },
    {
        "$limit": 1  # Keep only the top movie so the lookup runs for a single document
    },
    {
        "$lookup": {
            "from": "movies",
            "localField": "_id",
            "foreignField": "_id",
            "as": "movie_info"
        }
    },
    {
        "$addFields": {
            "movie_name": {"$arrayElemAt": ["$movie_info.title", 0]}
        }
    }
]


result = list(db.comments.aggregate(pipeline))


if result:
    top_movie = result[0]
    print("Movie with the highest number of comments:")
    print("Movie Name:", top_movie["movie_name"])
    print("Total Comments:", top_movie["total_comments"])
else:
    print("No comments found.")


Movie with the highest number of comments:
Movie Name: The Taking of Pelham 1 2 3
Total Comments: 161


## 9. Challenging: Aggregation and Lookup

Find the average number of comments per movie for movies released in 2000. Consider data from both the `movies` and `comments` collections.

In [14]:
# Type your answer

pipeline = [
    # Match movies released in 2000
    {
        "$match": {
            "year": 2000
        }
    },
    # Perform a left outer join with the comments collection
    {
        "$lookup": {
            "from": "comments",  # Collection to join with
            "localField": "_id",  # Field from the movies collection
            "foreignField": "movie_id",  # Field from the comments collection
            "as": "comments"  # Name for the new array field
        }
    },
    # Unwind the comments array; movies with no comments are dropped here, so the
    # final average covers only movies that have at least one comment
    {
        "$unwind": "$comments"
    },
    # Group by movie and calculate the count of comments for each movie
    {
        "$group": {
            "_id": "$_id",  # Group by movie
            "comments_count": {"$sum": 1}  # Calculate the count of comments
        }
    },
    # Calculate the average number of comments per movie
    {
        "$group": {
            "_id": None,  # Group all documents together
            "average_comments_per_movie": {"$avg": "$comments_count"}  # Calculate the average
        }
    }
]


result = list(db.movies.aggregate(pipeline))
print(result)



[{'_id': None, 'average_comments_per_movie': 10.492537313432836}]
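
Because `$unwind` discards movies whose `comments` array is empty, the figure above is the average over commented movies only. If movies without comments should count as zero, one option is to take the array length with `$size` instead. This is a sketch of that variant rather than the required answer, with a plain-Python illustration of the difference on made-up counts:

```python
# Variant pipeline: count comments with $size instead of $unwind + $group,
# so a movie with an empty comments array contributes 0 to the average.
pipeline_with_zeros = [
    {"$match": {"year": 2000}},
    {"$lookup": {
        "from": "comments",
        "localField": "_id",
        "foreignField": "movie_id",
        "as": "comments",
    }},
    {"$project": {"comments_count": {"$size": "$comments"}}},
    {"$group": {
        "_id": None,
        "average_comments_per_movie": {"$avg": "$comments_count"},
    }},
]

# Plain-Python illustration of the difference on made-up counts.
counts = [3, 0, 1, 0]
avg_with_zeros = sum(counts) / len(counts)
commented = [c for c in counts if c > 0]
avg_commented_only = sum(commented) / len(commented)
print(avg_with_zeros, avg_commented_only)  # 1.0 2.0
```

Running `list(db.movies.aggregate(pipeline_with_zeros))` would be expected to return a lower average than the `$unwind` version whenever some movies from 2000 have no comments.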


## 10. Challenging: Aggregation

Calculate the average `imdb.rating` for movies in the `movies` collection released after 2012. Consider only movies with at least 100 `imdb.votes`.

In [15]:
# Type your answer

pipeline = [
    # Match movies released after 2012 and with at least 100 imdb.votes
    {
        "$match": {
            "year": {"$gt": 2012},  # Released after 2012
            "imdb.votes": {"$gte": 100}  # At least 100 imdb.votes
        }
    },
  
    {
        "$group": {
            "_id": None,  # Group all documents together
            "average_rating": {"$avg": "$imdb.rating"}  # Calculate the average imdb.rating
        }
    }
]

result = db.movies.aggregate(pipeline)
for doc in result:
    print("Average IMDB Rating for movies released after 2012 with at least 100 votes:", doc["average_rating"])


Average IMDB Rating for movies released after 2012 with at least 100 votes: 6.506469500924214


## 11. Really Challenging ✌

Retrieve the top 3 genres with the highest average `imdb.rating`, and for each genre, list the `title` and `imdb.rating` of the 5 movies with the highest ratings, sorted by the ratings. Use the `movies` collection, where genres are stored as an array.

In [16]:
pipeline = [
    # Unwind the genres array
    {"$unwind": "$genres"},
    # Group by genre and calculate the average imdb.rating
    {"$group": {"_id": "$genres", "avg_rating": {"$avg": "$imdb.rating"}}},
    # Sort by average rating in descending order
    {"$sort": {"avg_rating": -1}},
    # Limit to the top 3 genres
    {"$limit": 3},
]


top_genres = db.movies.aggregate(pipeline)

# For each top genre, find the top 5 movies
for genre_data in top_genres:
    genre = genre_data["_id"]
    pipeline = [
        # Match documents with the selected genre
        {"$match": {"genres": genre}},
        # Sort by imdb.rating in descending order and limit to the top 5 movies
        {"$sort": {"imdb.rating": -1}},
        {"$limit": 5},
        # Project only the title and imdb.rating fields
        {"$project": {"title": 1, "imdb.rating": 1}},
    ]
    
    top_movies = list(db.movies.aggregate(pipeline))
    
    print("Genre:", genre)
    print("Average IMDB Rating:", genre_data["avg_rating"])
    print("Top 5 Movies:")
    for movie in top_movies:
        print("Title:", movie["title"], "- IMDB Rating:", movie["imdb"]["rating"])
    print()


Genre: Film-Noir
Average IMDB Rating: 7.397402597402598
Top 5 Movies:
Title: Double Indemnity - IMDB Rating: 8.4
Title: Touch of Evil - IMDB Rating: 8.2
Title: Laura - IMDB Rating: 8.1
Title: Strangers on a Train - IMDB Rating: 8.1
Title: Notorious - IMDB Rating: 8.1

Genre: Short
Average IMDB Rating: 7.377574370709382
Top 5 Movies:
Title: Day One - IMDB Rating: 
Title: Absent Minded - IMDB Rating: 
Title: Manoman - IMDB Rating: 
Title: Hallway - IMDB Rating: 
Title: Krot na more - IMDB Rating: 

Genre: Documentary
Average IMDB Rating: 7.365679824561403
Top 5 Movies:
Title: No Tomorrow - IMDB Rating: 
Title: Catching the Sun - IMDB Rating: 
Title: Junior - IMDB Rating: 
Title: Emergency Exit: Young Italians Abroad - IMDB Rating: 
Title: All Eyes and Ears - IMDB Rating: 
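
The blank ratings printed for Short and Documentary suggest that some documents store `imdb.rating` as an empty string. In MongoDB's comparison order strings sort above numbers, so a descending sort places those documents first. A hedged sketch of a per-genre pipeline that keeps only numeric ratings (the helper name `top_movies_pipeline` is hypothetical):

```python
def top_movies_pipeline(genre, limit=5):
    """Return aggregation stages for the best-rated movies of one genre."""
    return [
        # Keep the genre and drop non-numeric ratings such as "".
        {"$match": {"genres": genre, "imdb.rating": {"$type": "number"}}},
        {"$sort": {"imdb.rating": -1}},
        {"$limit": limit},
        {"$project": {"_id": 0, "title": 1, "imdb.rating": 1}},
    ]


short_pipeline = top_movies_pipeline("Short")
print(short_pipeline[0])
```

The genre averages themselves are unaffected, since `$avg` ignores non-numeric values. Only the per-genre ranking needs the extra filter.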



Who are the top 3 writers with the best rating?<br>
Use the average `imdb.rating` for each writer from the `writers` array in the `movies` collection

In [17]:
# Type your answer

pipeline = [
    # Unwind the writers array
    {"$unwind": "$writers"},
    # Group by writer and calculate the average imdb.rating
    {"$group": {"_id": "$writers", "avg_rating": {"$avg": "$imdb.rating"}}},
    # Sort by average rating in descending order
    {"$sort": {"avg_rating": -1}},
    # Limit to the top 3 writers
    {"$limit": 3}
]

# Execute the pipeline to get the top 3 writers
top_writers = db.movies.aggregate(pipeline)


for writer_data in top_writers:
    writer = writer_data["_id"]
    avg_rating = writer_data["avg_rating"]
    print("Writer:", writer)
    print("Average IMDB Rating:", avg_rating)
    print()


Writer: Stephen King (short story "Rita Hayworth and Shawshank Redemption")
Average IMDB Rating: 9.3

Writer: Kevin Derek
Average IMDB Rating: 9.3

Writer: Mario Puzo (novel)
Average IMDB Rating: 9.2
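
The `writers` entries carry credit annotations such as `(novel)`, so the same person can end up in several groups. A small sketch of one possible normalization, assuming the annotation is always a trailing parenthetical (the helper name `normalize_writer` is hypothetical):

```python
import re


def normalize_writer(name):
    """Strip a trailing parenthetical credit such as ' (novel)'."""
    return re.sub(r"\s*\(.*\)\s*$", "", name)


samples = [
    'Stephen King (short story "Rita Hayworth and Shawshank Redemption")',
    "Kevin Derek",
    "Mario Puzo (novel)",
]
normalized = [normalize_writer(s) for s in samples]
print(normalized)  # ['Stephen King', 'Kevin Derek', 'Mario Puzo']
```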



Who are the top 3 actors with the best rating?<br>
Use the average `imdb.rating` for each actor from the `cast` array in the `movies` collection

In [18]:
# Type your answer

pipeline = [
    # Unwind the cast array
    {"$unwind": "$cast"},
    # Group by actor and calculate the average imdb.rating
    {"$group": {"_id": "$cast", "avg_rating": {"$avg": "$imdb.rating"}}},
    # Sort by average rating in descending order
    {"$sort": {"avg_rating": -1}},
    # Limit to the top 3 actors
    {"$limit": 3}
]

# Execute the pipeline to get the top 3 actors
top_actors = db.movies.aggregate(pipeline)


for actor_data in top_actors:
    actor = actor_data["_id"]
    avg_rating = actor_data["avg_rating"]
    print("Actor:", actor)
    print("Average IMDB Rating:", avg_rating)
    print()


Actor: Lizzie Velasquez
Average IMDB Rating: 9.4

Actor: Carl Sagan
Average IMDB Rating: 9.3

Actor: Michael Chavez
Average IMDB Rating: 9.3



Finally, what is the winning combination of actor and director?
Find the top 5 combinations of `actor` and `director` with the best average rating, along with the titles of the movies in which each combination appears.

<font color="red">Filter on combinations that appear in more than one movie, and only consider movies after the year 2000.</font>

In [19]:
pipeline = [
    # Match movies released after the year 2000
    {"$match": {"year": {"$gt": 2000}}},
    # Unwind the cast array
    {"$unwind": "$cast"},
    # Group by actor and the movie's directors array (the array is used whole, not unwound)
    {"$group": {
        "_id": {"actor": "$cast", "director": "$directors"},
        "movies": {"$push": "$title"},
        "average_rating": {"$avg": "$imdb.rating"},
        "count": {"$sum": 1}
    }},
    # Filter combinations appearing in more than one movie
    {"$match": {"count": {"$gt": 1}}},
    # Project fields
    {"$project": {
        "_id": 0,
        "actor": "$_id.actor",
        "director": "$_id.director",
        "average_rating": 1,
        "movies": 1
    }}
]

# Execute the pipeline
result = db.movies.aggregate(pipeline)

# Print the result
for doc in result:
    print("Actor:", doc["actor"])
    print("Director:", doc.get("director", "Unknown"))
    print("Average Rating:", doc["average_rating"])
    print("Movies:", doc["movies"])
    print()


Actor: Ajith Kumar
Director: ['K.S. Ravikumar']
Average Rating: 7.0
Movies: ['Villain', 'Varalaaru']

Actor: Hilary Swank
Director: ['Richard LaGravenese']
Average Rating: 7.3
Movies: ['P.S. I Love You', 'Freedom Writers']

Actor: Ganesh
Director: ['Gautham Menon']
Average Rating: 7.9
Movies: ['Will You Cross the Skies for Me?', 'Will You Cross the Skies for Me?']

Actor: Miori Takimoto
Director: ['Hayao Miyazaki']
Average Rating: 7.8
Movies: ['The Wind Rises', 'The Wind Rises']

Actor: Chad Michael Murray
Director: ['Nathan Frankowski']
Average Rating: 6.6
Movies: ['To Write Love on Her Arms', 'To Write Love on Her Arms']

Actor: Peter Clayton-Luce
Director: ['Bryan Bertino']
Average Rating: 6.2
Movies: ['The Strangers', 'The Strangers']

Actor: Valerio Mastandrea
Director: ['Ivano De Matteo']
Average Rating: 6.9
Movies: ['Balancing Act', 'Balancing Act']

Actor: Sophia Loren
Director: ['Edoardo Ponti']
Average Rating: 6.2
Movies: ['Between Strangers', 'Between Strangers']

Actor: Fra
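
The output above is neither sorted nor limited to five rows, and each `director` value is a whole array. A sketch of one way to tighten the pipeline, assuming each director should be paired separately with each actor and that distinct titles are what "more than one movie" means (the helper name and its parameters are hypothetical):

```python
def top_pairs_pipeline(min_year=2000, min_movies=2, limit=5):
    """Stages ranking (actor, director) pairs by average imdb.rating."""
    return [
        # Movies after min_year with a numeric rating.
        {"$match": {"year": {"$gt": min_year},
                    "imdb.rating": {"$type": "number"}}},
        # One row per (actor, director) pair.
        {"$unwind": "$cast"},
        {"$unwind": "$directors"},
        {"$group": {
            "_id": {"actor": "$cast", "director": "$directors"},
            "movies": {"$addToSet": "$title"},  # distinct titles only
            "average_rating": {"$avg": "$imdb.rating"},
        }},
        # Keep pairs that appear in at least min_movies distinct titles.
        {"$match": {"$expr": {"$gte": [{"$size": "$movies"}, min_movies]}}},
        {"$sort": {"average_rating": -1}},
        {"$limit": limit},
        {"$project": {"_id": 0, "actor": "$_id.actor",
                      "director": "$_id.director",
                      "average_rating": 1, "movies": 1}},
    ]


pairs_pipeline = top_pairs_pipeline()
print(len(pairs_pipeline))  # 8
```

Collecting titles with `$addToSet` avoids counting a repeated title twice, at the cost of merging two different movies that share a title.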

## 12 Extra

<font color='green'>This question is for extra credit. It is not necessary to complete.</font>

MongoDB offers the [$geoNear](https://www.mongodb.com/docs/manual/reference/operator/aggregation/geoNear/) stage for calculating distances between points.
Let's test this stage.

Use the `theaters` collection to find all the theaters within a certain distance from a defined center point.

Use the defined `display_points_on_map` function to visualize the results, and validate that all points are indeed within the radius.


In [20]:
def display_points_on_map(theaters, center = [ -76.512016, 38.29697 ], radius=60000):

  # Import/Install the required library
  try:
    from ipyleaflet import Map, Marker, Circle
  except ImportError:
    !pip install ipyleaflet
    from ipyleaflet import Map, Marker, Circle

  # Create the center point.
  center_point = (center[1], center[0])

  # Create the map
  map = Map(center=center_point, zoom=9)

  # Add a circle
  circle = Circle()
  circle.location = center_point
  circle.radius = radius
  circle.color = "green"
  circle.fill_color = "green"
  map.add(circle)

  # Add the points
  for theater in theaters:
    coordinates = theater['location']['geo']['coordinates']
    marker = Marker(location=(coordinates[1], coordinates[0]), draggable=False)  # (lat, lon)
    map.add(marker)

  # Display the map
  display(map)

In [21]:
center = [ -76.512016, 38.29697 ]
radius = 60000

# Type your answer
# Fetch theaters from the MongoDB collection
theaters = db.theaters.find({
    "location.geo": {
        "$geoWithin": {
            "$centerSphere": [center, radius / 6378100]  # Convert radius to radians
        }
    }
})

# Call the display_points_on_map function
display_points_on_map(theaters, center, radius)

Map(center=[38.29697, -76.512016], controls=(ZoomControl(options=['position', 'zoom_in_text', 'zoom_in_title',…
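
The cell above answers with `$geoWithin`, while the question asks about `$geoNear`. A hedged sketch of the equivalent aggregation stage follows, assuming a `2dsphere` index exists on `location.geo` (which `$geoNear` requires). It also includes a plain-Python haversine helper for checking the returned points against the radius. The output field name `distance_m` and both helper names are hypothetical:

```python
import math

EARTH_RADIUS_M = 6378100  # same value the $centerSphere query divides by


def geo_near_pipeline(center, radius_m):
    """Build a $geoNear stage returning documents within radius_m meters."""
    return [
        {"$geoNear": {
            "near": {"type": "Point", "coordinates": center},  # [lon, lat]
            "distanceField": "distance_m",  # hypothetical output field name
            "maxDistance": radius_m,        # meters when `near` is GeoJSON
            "spherical": True,
            "key": "location.geo",          # indexed field to search
        }}
    ]


def haversine_m(p1, p2):
    """Great-circle distance in meters between two [lon, lat] points."""
    lon1, lat1, lon2, lat2 = map(math.radians, (p1[0], p1[1], p2[0], p2[1]))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


center = [-76.512016, 38.29697]
geo_pipeline = geo_near_pipeline(center, 60000)
print(geo_pipeline)
print(haversine_m(center, center))  # 0.0
```

With a live connection, `list(db.theaters.aggregate(geo_pipeline))` would return the theaters with their distance attached, and `haversine_m(center, t["location"]["geo"]["coordinates"]) <= 60000` could serve as an independent check for each result `t`.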