# Lesson 3: Projections in MongoDB Vector Search

## What You'll Learn
**Projections** let you select only specific fields from search results, reducing data transfer and improving performance.

## Part A: Setup
Load the data, connect to MongoDB, and create the vector search index.

In [1]:
# Cell 1: Imports and setup
import warnings
warnings.filterwarnings('ignore')

import custom_utils
from datasets import load_dataset
import pandas as pd
from pydantic import BaseModel
from typing import Optional
from IPython.display import display, HTML

print("✅ All imports loaded successfully!")

✅ All imports loaded successfully!


In [2]:
# Cell 2: Load environment variables
# Check that API keys are loaded from .env file

print("🔑 Checking environment variables...")
print(f"  OpenAI API Key: {'✅ Found' if custom_utils.OPENAI_API_KEY else '❌ Missing'}")
print(f"  MongoDB URI: {'✅ Found' if custom_utils.MONGO_URI else '❌ Missing'}")

if not custom_utils.OPENAI_API_KEY or not custom_utils.MONGO_URI:
    print("\n⚠️ Warning: Missing API keys. Check your .env file!")
else:
    print("\n✅ All credentials loaded!")

🔑 Checking environment variables...
  OpenAI API Key: ✅ Found
  MongoDB URI: ✅ Found

✅ All credentials loaded!


In [3]:
# Cell 3: Load Airbnb dataset
print("📦 Loading Airbnb embeddings dataset...")

# Stream the dataset from Hugging Face and take the first 100 records
dataset = load_dataset("MongoDB/airbnb_embeddings", streaming=True, split="train")
dataset = dataset.take(100)

# Convert to pandas DataFrame
dataset_df = pd.DataFrame(dataset)

print(f"✅ Loaded {len(dataset_df)} Airbnb listings")
print(f"📊 Columns: {len(dataset_df.columns)} fields per listing")
print(f"🔢 Embedding dimensions: {len(dataset_df.iloc[0]['text_embeddings'])}")

# Show first listing
sample = dataset_df.iloc[0]
print(f"\n📋 Sample listing:")
print(f"   Name: {sample['name']}")
print(f"   Price: ${sample['price']}")
print(f"   Location: {sample['address']['market']}, {sample['address']['country']}")

📦 Loading Airbnb embeddings dataset...
✅ Loaded 100 Airbnb listings
📊 Columns: 43 fields per listing
🔢 Embedding dimensions: 1536

📋 Sample listing:
   Name: Ribeira Charming Duplex
   Price: $80
   Location: Porto, Portugal


In [4]:
# Cell 4: Process and validate records
print("🔄 Processing records with Pydantic validation...")

# Use custom_utils to clean and validate data
listings = custom_utils.process_records(dataset_df)

if listings:
    print(f"\n✅ Ready to insert {len(listings)} validated listings into MongoDB")
else:
    print("\n❌ Error: No listings were validated")

🔄 Processing records with Pydantic validation...
✅ Processed 100 listings successfully

✅ Ready to insert 100 validated listings into MongoDB


In [5]:
# Cell 5: Connect to MongoDB Atlas
print("🔌 Connecting to MongoDB Atlas...\n")

# Connect using custom_utils
db, collection = custom_utils.connect_to_database(
    database_name="airbnb_dataset",
    collection_name="listings_reviews"
)

print(f"\n✅ Connected successfully!")
print(f"📊 Current document count: {collection.count_documents({})}")

🔌 Connecting to MongoDB Atlas...

✅ Connection to MongoDB successful
📋 Database: airbnb_dataset
📋 Collection: listings_reviews

✅ Connected successfully!
📊 Current document count: 100


In [6]:
# Cell 6: Insert data into MongoDB
print("💾 Inserting data into MongoDB...\n")

# Clear existing data (fresh start)
delete_result = collection.delete_many({})
print(f"🗑️ Deleted {delete_result.deleted_count} existing documents")

# Insert validated listings
insert_result = collection.insert_many(listings)
print(f"📥 Inserted {len(insert_result.inserted_ids)} new documents")

# Verify
final_count = collection.count_documents({})
print(f"\n✅ Collection now has {final_count} documents")

# Show a sample document structure
sample_doc = collection.find_one()
print(f"\n📋 Sample document has {len(sample_doc.keys())} fields:")
print(f"   {list(sample_doc.keys())[:10]}...")

💾 Inserting data into MongoDB...

🗑️ Deleted 100 existing documents
📥 Inserted 100 new documents

✅ Collection now has 100 documents

📋 Sample document has 40 fields:
   ['_id', 'listing_url', 'name', 'summary', 'space', 'description', 'neighborhood_overview', 'notes', 'transit', 'access']...


In [7]:
# Cell 7: Create vector search index with filters
print("🔍 Creating vector search index with filterable fields...\n")

# Create the enhanced index (supports filtering by accommodates and bedrooms)
custom_utils.setup_vector_search_index_with_filter(
    collection=collection,
    index_name="vector_index_with_filter"
)

print("\n📋 Index configuration:")
print("   - Field: text_embeddings (1536 dimensions)")
print("   - Similarity: cosine")
print("   - Filterable fields: accommodates, bedrooms")
print("\n⏳ Index may take 1-2 minutes to fully initialize")
print("✅ You can proceed to the next part!")

🔍 Creating vector search index with filterable fields...

Creating index with filters...
✅ Index 'vector_index_with_filter' created successfully: vector_index_with_filter
💡 Wait a few minutes before conducting searches

📋 Index configuration:
   - Field: text_embeddings (1536 dimensions)
   - Similarity: cosine
   - Filterable fields: accommodates, bedrooms

⏳ Index may take 1-2 minutes to fully initialize
✅ You can proceed to the next part!
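The body of `custom_utils.setup_vector_search_index_with_filter` is not shown in this lesson. As a rough sketch, assuming it submits a standard Atlas Vector Search definition matching the configuration printed above, the definition might look like the dictionary built below. The function name `build_index_definition` and its parameters are hypothetical and exist only for illustration.

```python
def build_index_definition(
    vector_path="text_embeddings",
    dimensions=1536,
    similarity="cosine",
    filter_paths=("accommodates", "bedrooms"),
):
    """Return an Atlas Vector Search index definition as a plain dict.

    Assumption: this mirrors what the lesson's helper sends to Atlas;
    the helper's real implementation is not shown in the notebook.
    """
    # One "vector" field describes the embedding to search over.
    fields = [
        {
            "type": "vector",
            "path": vector_path,
            "numDimensions": dimensions,
            "similarity": similarity,
        }
    ]
    # Each "filter" field can later be used for pre-filtering in $vectorSearch.
    fields.extend({"type": "filter", "path": path} for path in filter_paths)
    return {"fields": fields}


definition = build_index_definition()
for field in definition["fields"]:
    print(field)
```

Keeping the definition as data makes it easy to compare against the index configuration summary printed by the cell.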


## Part B: Search Without Projection
Run a vector search that returns every field

In [None]:
# Cell 8: Define SearchResultItem model and search function (NO projection)
class SearchResultItem(BaseModel):
    """Model for search results - matches what we'll display"""
    name: str
    accommodates: Optional[int] = None
    bedrooms: Optional[int] = None
    address: custom_utils.Address
    summary: Optional[str] = None
    space: Optional[str] = None
    neighborhood_overview: Optional[str] = None
    notes: Optional[str] = None

def handle_user_query_no_projection(query, db, collection):
    """
    Search WITHOUT projection - returns ALL fields from MongoDB.
    
    This is the 'before' picture - lots of data!
    """
    print(f"🔍 Searching for: '{query}'")
    print("📦 Mode: WITHOUT projection (all fields returned)\n")
    
    # Run vector search with NO additional stages (no projection)
    get_knowledge = custom_utils.vector_search_with_filter(
        query, 
        db, 
        collection, 
        additional_stages=[],  # Empty - no projection!
        vector_index="vector_index_with_filter"
    )
    
    if not get_knowledge:
        return "No results found."
    
    print(f"\n📊 Found {len(get_knowledge)} results")
    print(f"🔑 First result has {len(get_knowledge[0].keys())} fields")
    print(f"📋 Field names: {list(get_knowledge[0].keys())}\n")
    
    # Convert to our model (only for display, MongoDB still sent ALL fields)
    search_results_models = [
        SearchResultItem(**result)
        for result in get_knowledge
    ]
    
    search_results_df = pd.DataFrame([item.dict() for item in search_results_models])
    
    # Generate recommendation with GPT-4.1
    print("🤖 Generating recommendation with GPT-4.1...\n")
    
    completion = custom_utils.openai.chat.completions.create(
        model="gpt-4.1",
        messages=[
            {
                "role": "system",
                "content": "You are an Airbnb listing recommendation system."
            },
            {
                "role": "user",
                "content": f"Answer this user query: {query} with the following context:\n{search_results_df}"
            }
        ]
    )
    
    system_response = completion.choices[0].message.content
    
    # Display results
    print(f"━" * 80)
    print(f"❓ USER QUESTION:")
    print(f"{query}\n")
    print(f"━" * 80)
    print(f"🤖 GPT-4.1 RECOMMENDATION:")
    print(f"{system_response}\n")
    print(f"━" * 80)
    print(f"📋 TOP 5 RESULTS:")
    display(HTML(search_results_df.head(5).to_html()))
    
    return system_response

print("✅ Search function ready (NO projection)")
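The `additional_stages` argument is the hook that later lets a projection be appended to the search. The internals of `custom_utils.vector_search_with_filter` are not shown, so the following is only a minimal sketch of how such a pipeline might be assembled. The function name `build_pipeline`, the `num_candidates=150` default, and the `limit=20` default (chosen because the lesson's searches return 20 results) are assumptions.

```python
def build_pipeline(query_embedding, index_name, additional_stages=None,
                   path="text_embeddings", num_candidates=150, limit=20):
    """Assemble a $vectorSearch stage followed by any extra stages.

    Hypothetical helper: the defaults are illustrative assumptions,
    not values confirmed by the lesson's custom_utils module.
    """
    vector_search_stage = {
        "$vectorSearch": {
            "index": index_name,
            "path": path,
            "queryVector": query_embedding,
            "numCandidates": num_candidates,
            "limit": limit,
        }
    }
    # $vectorSearch must be the first stage; everything else follows it.
    return [vector_search_stage] + list(additional_stages or [])


fake_embedding = [0.0] * 1536  # placeholder instead of a real query embedding
no_projection = build_pipeline(fake_embedding, "vector_index_with_filter")
with_projection = build_pipeline(
    fake_embedding,
    "vector_index_with_filter",
    additional_stages=[{"$project": {"_id": 0, "name": 1}}],
)
print(len(no_projection), len(with_projection))  # 1 2
```

With an empty `additional_stages` list the pipeline contains only the search stage, so each matching document comes back with all of its fields.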

In [10]:
# Cell 9: Test search WITHOUT projection
query = """
I want to stay in a place that's warm and friendly, 
and not too far from restaurants. Can you recommend a place? 
Include a reason as to why you've chosen your selection.
"""

# Run the search WITHOUT projection
result = handle_user_query_no_projection(query, db, collection)

print("\n" + "=" * 80)
print("💡 NOTICE: MongoDB returned ALL 40+ fields per listing!")
print("   Even though we only display 8 fields, the database sent everything.")
print("   This is inefficient - let's see how projections fix this in Part C!")
print("=" * 80)

🔍 Searching for: '
I want to stay in a place that's warm and friendly, 
and not too far from restaurants. Can you recommend a place? 
Include a reason as to why you've chosen your selection.
'
📦 Mode: WITHOUT projection (all fields returned)

⚡ Search completed in 0.254165 milliseconds

📊 Found 20 results
🔑 First result has 40 fields
📋 Field names: ['_id', 'listing_url', 'name', 'summary', 'space', 'description', 'neighborhood_overview', 'notes', 'transit', 'access', 'interaction', 'house_rules', 'property_type', 'room_type', 'bed_type', 'minimum_nights', 'maximum_nights', 'cancellation_policy', 'last_scraped', 'calendar_last_scraped', 'first_review', 'last_review', 'accommodates', 'bedrooms', 'beds', 'number_of_reviews', 'bathrooms', 'amenities', 'price', 'security_deposit', 'cleaning_fee', 'extra_people', 'guests_included', 'images', 'host', 'address', 'availability', 'review_scores', 'reviews', 'text_embeddings']

🤖 Generating recommendation with GPT-4.1...

━━━━━━━━━━━━━━━━━━━━━━━━

Unnamed: 0,name,accommodates,bedrooms,address,summary,space,neighborhood_overview,notes
0,Cozy house at Beyoğlu,2,1,"{'street': 'Beyoğlu, İstanbul, Turkey', 'government_area': 'Beyoglu', 'market': 'Istanbul', 'country': 'Turkey', 'country_code': 'TR', 'location': {'type': 'Point', 'coordinates': [28.95825, 41.03777], 'is_location_exact': False}}","Hello dear Guests, wellcome to istanbul. My House is 2+1 and at second floor. 1 privite room is for my international guests. House is Very close to Taksim Square. You can Walk in 30 minutes or you can take a bus. The bus stop is only 100 m from home. You can go Taksim, Eminönü, Karaköy, Kadıköy, Beyazıt, Sultanahmet easily from home. I have 1 bed, two people can sleep together. Second person should pay extra. You can use kitchen, bathroom, free Wifi, dishwasher, washing machine, Ironing.","Safe, quite, big house, wiev, Central, near the bus stop.","Beyoğlu / Centre of İstanbul It calls Hasköy area, near the Golden Horn",Just enjoy your holiday
1,Downtown Oporto Inn (room cleaning),2,1,"{'street': 'Porto, Porto, Portugal', 'government_area': 'Cedofeita, Ildefonso, Sé, Miragaia, Nicolau, Vitória', 'market': 'Porto', 'country': 'Portugal', 'country_code': 'PT', 'location': {'type': 'Point', 'coordinates': [-8.60867, 41.1543], 'is_location_exact': False}}","Tradicional building, with high ceilings next to City Hall or Trindade Subway station, at a short walking distance from the historic center of this beautiful city. R It is the property of a book novel writer.","Cozy, located near the most interesting points of the city to provide a nice stay, with a low budget. Has a gift shop to buy handicraft, books and other gifts,","Exciting, urban and dinamic, stay with us, near the center, and enjoy a unique stay!",No private parking.
2,Homely Room in 5-Star New Condo@MTR,2,1,"{'street': 'Mongkok, Kowloon, Hong Kong', 'government_area': 'Yau Tsim Mong', 'market': 'Hong Kong', 'country': 'Hong Kong', 'country_code': 'HK', 'location': {'type': 'Point', 'coordinates': [114.17094, 22.32074], 'is_location_exact': False}}","Located in Mongkok, close to everything. 2min walk to both Mongkok and Mongkok East station. Gym, sauna and swimming pool (in summer) are available in the clubhouse. You'll have a private double room. Washroom and kitchen are shared with host. We are family of 3, my husband, 1y old son and me. The guest bedroom can accommodate two people, the 3rd person has to sleep on the couch (3'x6') in the living room.","You will stay with my son, my husband and me. We couple love travelling very much and have been to more than 35 countries in the past few years. We like to share our travel tips and photos with everyone. There is a luxury clubhouse in my building, with gym and swimming pool. The building is newly built and it's the most luxury one in Mongkok area.",Many restaurants and shops nearby.,"Just feel as home. We will give you all assistance. The 3rd guest is allowed to sleep on the sofa in the living room, and it's subject to an extra charge HK$250 per night."
3,Banyan Bungalow,2,0,"{'street': 'Waialua, HI, United States', 'government_area': 'North Shore Oahu', 'market': 'Oahu', 'country': 'United States', 'country_code': 'US', 'location': {'type': 'Point', 'coordinates': [-158.1602, 21.57561], 'is_location_exact': False}}",The place to be on the north shore is where you can be steps from the ocean and watch the stars at night. Our 2 acre property (with tropical greenhouses) hosts a quiet cottage with private driveway/private access.,"Big, open space with lots of natural light. The cottage is clean and quiet - perfect for a good night's sleep. Meals can be easily prepared in the small kitchen. Microwave, hot plate, toaster, blender, coffee maker, full size fridge are available.","This desirable neighborhood is comprised of other vacation rentals, local families, public beach access, and even a campground. Roosters and hens have made their home here as well. Many native birds can be seen and their many sweet sounds can be enjoyed.",
4,Cheerful new renovated central apt,8,3,"{'street': 'Beyoğlu, İstanbul, Turkey', 'government_area': 'Beyoglu', 'market': 'Istanbul', 'country': 'Turkey', 'country_code': 'TR', 'location': {'type': 'Point', 'coordinates': [28.97477, 41.03735], 'is_location_exact': False}}","The full equipped apartment located in the heritage district of Istanbul, colorful Tarlabaşı. If you are looking for a place where you really want to taste the chaos with harmony like a real Istanbuller you are very welcome to stay in my apartment.","Hi there! My name is Aybike. I love to travel, to discover new places and to meet new people. I will be glad to hosting you in Istanbul at my place. My apartment is newly renovated, clean, cosy, comfortable, large enough for 8 people and is situated literally at the heart of Istanbul. Apartment has one of the unique examples of turn-of-the-century Levantine architecture in Turkey: slim, four-storey bow-fronted homes that huddle along winding, narrow streets. Located in a street as it was used to be; the ground floors often served as stores or workshops. More likes to come to your posts in Instagram ! As a traveller my wish is to make you feel at home; drink your morning coffee while listening to the sound of Istanbul then take your map, jump into the street with friendly neighborhood, and enjoy the city without running, walking, exploring, watching, reading and hearing. I know how it is important to be able to feel the city you are visiting. So if you are looking for a place where","Great location will allow you to explore and enjoy Taksim, Pera,Galata , Şişhane and Cihangir. (great walking distance for all the events), Easily accessible to subway, tram, ferries. The neighbourhood is friendly and diverse. It is only 3 minutes by walk to the Galatasaray Square which is located approximately at the center of the Istiklal Avenue. 
Istiklal Avenue is located in the historic Beyoğlu (Pera) district, it is an elegant pedestrian street, 1.4 kilometers long, which houses boutiques, music stores, bookstores, art galleries, cinemas, theatres, libraries, cafés, pubs, night clubs with live music, historical patisseries, chocolateries and restaurants. You should definately walk through the local bazaar on Sunday right beside the apartment. You can find anything you need with cheap prices; fruits, vegetables, nuts, fish, meat, dairy products! Local bazaars will be very helpful to understand the culture of Turkish people as well. There are couple of restaurants close by where yo","From/To Airports: There are several ways to get from the airport to the apartment, but the most convenient manner is to take “HAVATAŞ"" shuttle to Taksim Square departing every 30 minutes from the airport (from both airports- Atatürk and Sabiha Gökçen). As you may be unfamiliar with the area, I am happy to come and pick you up in front of Galatasaray Highschool (on Istiklal Street) which is 10 minutes walk from Taksim Square where you will get off. I can always advise you cheaper public transport options if you ask for. Useful information: You can rent the apartment/room for (a) day(s), week, month or longer periods of time. There is various supermarkets conveniently situated a block away from the apartment on the way to Istiklal street, also a small kiosk right next to the apartment and a laundry in 100 meters distance."



💡 NOTICE: MongoDB returned ALL 40+ fields per listing!
   Even though we only display 8 fields, the database sent everything.
   This is inefficient - let's see how projections fix this in Part C!
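The claim that sending every field is inefficient can be checked by measuring the results rather than counting fields. The sketch below approximates payload size by serializing documents to JSON. This is a stand-in for the BSON that MongoDB actually sends, so the numbers are estimates. The helper name `estimate_payload_bytes` and both sample documents are made up for illustration.

```python
import json


def estimate_payload_bytes(documents):
    """Approximate the serialized size of a list of result documents.

    JSON is used as a rough proxy for BSON; default=str handles values
    such as ObjectId or datetime that JSON cannot encode natively.
    """
    return len(json.dumps(documents, default=str).encode("utf-8"))


# Made-up documents: one with bulky fields, one with only display fields.
full_doc = {
    "name": "Example listing",
    "summary": "Near restaurants.",
    "reviews": [{"comments": "Great stay!"}] * 50,
    "text_embeddings": [0.123456] * 1536,
}
lean_doc = {"name": full_doc["name"], "summary": full_doc["summary"]}

full_size = estimate_payload_bytes([full_doc])
lean_size = estimate_payload_bytes([lean_doc])
print(f"full: {full_size} bytes, projected: {lean_size} bytes")
print(f"reduction: {100 * (1 - lean_size / full_size):.1f}%")
```

Passing the list returned by a search to `estimate_payload_bytes` gives a measured figure for that query, which is more reliable than a percentage derived from field counts.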


## Part C: Search With Projection
Define a `$project` stage and rerun the same query

In [11]:
# Cell 10: Define the PROJECTION STAGE
# This is the KEY to reducing data transfer!

projection_stage = {
    "$project": {
        # 0 = exclude this field, 1 = include this field
        "_id": 0,  # Don't send MongoDB's internal ID
        
        # Include only the fields we actually need:
        "name": 1,
        "accommodates": 1,
        
        # Nested fields - use dot notation:
        "address.street": 1,
        "address.government_area": 1,
        "address.market": 1,
        "address.country": 1,
        "address.country_code": 1,
        "address.location.type": 1,
        "address.location.coordinates": 1,
        "address.location.is_location_exact": 1,
        
        # Text fields:
        "summary": 1,
        "space": 1,
        "neighborhood_overview": 1,
        "notes": 1,
        
        # Special: Add the vector search similarity score
        # This is metadata calculated by MongoDB, not stored in the document
        "score": {"$meta": "vectorSearchScore"}
    }
}

print("✅ Projection stage defined!")
print("\n📋 What this does:")
print("   - Tells MongoDB: 'Only send these 14 field paths, plus the score'")
print("   - Excludes: reviews, images, host info, amenities, etc.")
print("   - Adds: similarity score (how well it matches the query)")
print("\n💡 Key concepts:")
print("   - '1' = include field")
print("   - '0' = exclude field (alongside inclusions, only _id may be excluded)")
print("   - 'address.street' = nested field access")
print("   - '$meta' = access calculated metadata")
print("\n🎯 Result: 40 top-level fields → 8 top-level fields (80% fewer)")

✅ Projection stage defined!

📋 What this does:
   - Tells MongoDB: 'Only send these 14 field paths, plus the score'
   - Excludes: reviews, images, host info, amenities, etc.
   - Adds: similarity score (how well it matches the query)

💡 Key concepts:
   - '1' = include field
   - '0' = exclude field (alongside inclusions, only _id may be excluded)
   - 'address.street' = nested field access
   - '$meta' = access calculated metadata

🎯 Result: 40 top-level fields → 8 top-level fields (80% fewer)
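To see why several dotted paths such as `address.market` and `address.country` come back as a single nested `address` field, it can help to mimic an inclusion projection on a plain dictionary. The sketch below is an illustration only: `apply_inclusion_projection` is a hypothetical helper, it ignores arrays and computed fields such as `$meta`, and it is not how MongoDB implements `$project`.

```python
def apply_inclusion_projection(document, projection):
    """Mimic an inclusion-style $project on a plain dict (illustration only)."""
    result = {}
    for path, flag in projection.items():
        if flag != 1:
            continue  # skips "_id": 0 and computed fields such as $meta
        keys = path.split(".")
        # Walk down the source document to find the value at this path.
        value = document
        for key in keys:
            if not isinstance(value, dict) or key not in value:
                break  # path missing in this document: include nothing
            value = value[key]
        else:
            # Path fully resolved: rebuild the same nesting in the result.
            target = result
            for key in keys[:-1]:
                target = target.setdefault(key, {})
            target[keys[-1]] = value
    return result


# Made-up listing used only to demonstrate the behaviour.
listing = {
    "_id": "abc123",
    "name": "Example listing",
    "accommodates": 2,
    "address": {"market": "Porto", "country": "Portugal", "country_code": "PT"},
    "reviews": ["Great stay!"],
}
projection = {"_id": 0, "name": 1, "address.market": 1, "address.country": 1}

projected = apply_inclusion_projection(listing, projection)
print(projected)
# {'name': 'Example listing', 'address': {'market': 'Porto', 'country': 'Portugal'}}
print(len(projection) - 1, "projected paths ->", len(projected), "top-level fields")
# 3 projected paths -> 2 top-level fields
```

The same collapsing explains why the number of projected paths in a `$project` stage is usually larger than the number of top-level keys in each returned document.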


In [13]:
# Cell 11: Search function WITH projection + Test it
def handle_user_query_with_projection(query, db, collection, projection_stage):
    """
    Search WITH projection - returns ONLY the fields we specify.
    
    This is the 'after' picture - lean and efficient!
    """
    print(f"🔍 Searching for: '{query}'")
    print("📦 Mode: WITH projection (only selected fields returned)\n")
    
    # Run vector search WITH projection stage
    get_knowledge = custom_utils.vector_search_with_filter(
        query, 
        db, 
        collection, 
        additional_stages=[projection_stage],  # ⭐ ADD PROJECTION HERE!
        vector_index="vector_index_with_filter"
    )
    
    if not get_knowledge:
        return "No results found."
    
    print(f"\n📊 Found {len(get_knowledge)} results")
    print(f"🔑 First result has {len(get_knowledge[0].keys())} fields")
    print(f"📋 Field names: {list(get_knowledge[0].keys())}")
    print(f"✅ Notice the 'score' field - that's the similarity score!\n")
    
    # Show first result in detail
    print("🔎 First result details:")
    for key, value in list(get_knowledge[0].items())[:5]:
        if isinstance(value, str) and len(value) > 60:
            print(f"   {key}: {value[:60]}...")
        else:
            print(f"   {key}: {value}")
    
    # Generate recommendation with GPT-4.1
    print("\n🤖 Generating recommendation with GPT-4.1...\n")
    
    # Convert to DataFrame for GPT
    results_df = pd.DataFrame(get_knowledge)
    
    completion = custom_utils.openai.chat.completions.create(
        model="gpt-4.1",
        messages=[
            {
                "role": "system",
                "content": "You are an Airbnb listing recommendation system."
            },
            {
                "role": "user",
                "content": f"Answer this user query: {query} with the following context:\n{results_df}"
            }
        ]
    )
    
    system_response = completion.choices[0].message.content
    
    # Display results
    print(f"━" * 80)
    print(f"❓ USER QUESTION:")
    print(f"{query}\n")
    print(f"━" * 80)
    print(f"🤖 GPT-4.1 RECOMMENDATION:")
    print(f"{system_response}\n")
    print(f"━" * 80)
    print(f"📋 TOP 5 RESULTS (with scores!):")
    display(HTML(results_df.head(5).to_html()))
    
    return system_response

# Run the SAME query, but WITH projection
result_with_projection = handle_user_query_with_projection(query, db, collection, projection_stage)

print("\n" + "=" * 80)
print("🎉 SUCCESS: MongoDB returned only the 8 top-level fields we asked for!")
print("   Compare this to Part B, where it sent all 40 fields.")
print("   That is 80% fewer top-level fields, with reviews, images, and embeddings left out.")
print("=" * 80)

🔍 Searching for: '
I want to stay in a place that's warm and friendly, 
and not too far from restaurants. Can you recommend a place? 
Include a reason as to why you've chosen your selection.
'
📦 Mode: WITH projection (only selected fields returned)

⚡ Search completed in 0.196114 milliseconds

📊 Found 20 results
🔑 First result has 8 fields
📋 Field names: ['name', 'summary', 'space', 'neighborhood_overview', 'notes', 'accommodates', 'address', 'score']
✅ Notice the 'score' field - that's the similarity score!

🔎 First result details:
   name: Cozy house at Beyoğlu
   summary: Hello dear Guests, wellcome to istanbul. My House is 2+1 and...
   space: Safe, quite, big house, wiev, Central, near the bus stop.
   neighborhood_overview: Beyoğlu / Centre of İstanbul It calls Hasköy area, near the ...
   notes: Just enjoy your holiday

🤖 Generating recommendation with GPT-4.1...

━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
❓ USER QUESTION:

I want to stay in

Unnamed: 0,name,summary,space,neighborhood_overview,notes,accommodates,address,score
0,Cozy house at Beyoğlu,"Hello dear Guests, wellcome to istanbul. My House is 2+1 and at second floor. 1 privite room is for my international guests. House is Very close to Taksim Square. You can Walk in 30 minutes or you can take a bus. The bus stop is only 100 m from home. You can go Taksim, Eminönü, Karaköy, Kadıköy, Beyazıt, Sultanahmet easily from home. I have 1 bed, two people can sleep together. Second person should pay extra. You can use kitchen, bathroom, free Wifi, dishwasher, washing machine, Ironing.","Safe, quite, big house, wiev, Central, near the bus stop.","Beyoğlu / Centre of İstanbul It calls Hasköy area, near the Golden Horn",Just enjoy your holiday,2,"{'street': 'Beyoğlu, İstanbul, Turkey', 'government_area': 'Beyoglu', 'market': 'Istanbul', 'country': 'Turkey', 'country_code': 'TR', 'location': {'type': 'Point', 'coordinates': [28.95825, 41.03777], 'is_location_exact': False}}",0.700824
1,Downtown Oporto Inn (room cleaning),"Tradicional building, with high ceilings next to City Hall or Trindade Subway station, at a short walking distance from the historic center of this beautiful city. R It is the property of a book novel writer.","Cozy, located near the most interesting points of the city to provide a nice stay, with a low budget. Has a gift shop to buy handicraft, books and other gifts,","Exciting, urban and dinamic, stay with us, near the center, and enjoy a unique stay!",No private parking.,2,"{'street': 'Porto, Porto, Portugal', 'government_area': 'Cedofeita, Ildefonso, Sé, Miragaia, Nicolau, Vitória', 'market': 'Porto', 'country': 'Portugal', 'country_code': 'PT', 'location': {'type': 'Point', 'coordinates': [-8.60867, 41.1543], 'is_location_exact': False}}",0.693967
2,Homely Room in 5-Star New Condo@MTR,"Located in Mongkok, close to everything. 2min walk to both Mongkok and Mongkok East station. Gym, sauna and swimming pool (in summer) are available in the clubhouse. You'll have a private double room. Washroom and kitchen are shared with host. We are family of 3, my husband, 1y old son and me. The guest bedroom can accommodate two people, the 3rd person has to sleep on the couch (3'x6') in the living room.","You will stay with my son, my husband and me. We couple love travelling very much and have been to more than 35 countries in the past few years. We like to share our travel tips and photos with everyone. There is a luxury clubhouse in my building, with gym and swimming pool. The building is newly built and it's the most luxury one in Mongkok area.",Many restaurants and shops nearby.,"Just feel as home. We will give you all assistance. The 3rd guest is allowed to sleep on the sofa in the living room, and it's subject to an extra charge HK$250 per night.",2,"{'street': 'Mongkok, Kowloon, Hong Kong', 'government_area': 'Yau Tsim Mong', 'market': 'Hong Kong', 'country': 'Hong Kong', 'country_code': 'HK', 'location': {'type': 'Point', 'coordinates': [114.17094, 22.32074], 'is_location_exact': False}}",0.681879
3,Banyan Bungalow,The place to be on the north shore is where you can be steps from the ocean and watch the stars at night. Our 2 acre property (with tropical greenhouses) hosts a quiet cottage with private driveway/private access.,"Big, open space with lots of natural light. The cottage is clean and quiet - perfect for a good night's sleep. Meals can be easily prepared in the small kitchen. Microwave, hot plate, toaster, blender, coffee maker, full size fridge are available.","This desirable neighborhood is comprised of other vacation rentals, local families, public beach access, and even a campground. Roosters and hens have made their home here as well. Many native birds can be seen and their many sweet sounds can be enjoyed.",,2,"{'street': 'Waialua, HI, United States', 'government_area': 'North Shore Oahu', 'market': 'Oahu', 'country': 'United States', 'country_code': 'US', 'location': {'type': 'Point', 'coordinates': [-158.1602, 21.57561], 'is_location_exact': False}}",0.677247
4,Cheerful new renovated central apt,"The full equipped apartment located in the heritage district of Istanbul, colorful Tarlabaşı. If you are looking for a place where you really want to taste the chaos with harmony like a real Istanbuller you are very welcome to stay in my apartment.","Hi there! My name is Aybike. I love to travel, to discover new places and to meet new people. I will be glad to hosting you in Istanbul at my place. My apartment is newly renovated, clean, cosy, comfortable, large enough for 8 people and is situated literally at the heart of Istanbul. Apartment has one of the unique examples of turn-of-the-century Levantine architecture in Turkey: slim, four-storey bow-fronted homes that huddle along winding, narrow streets. Located in a street as it was used to be; the ground floors often served as stores or workshops. More likes to come to your posts in Instagram ! As a traveller my wish is to make you feel at home; drink your morning coffee while listening to the sound of Istanbul then take your map, jump into the street with friendly neighborhood, and enjoy the city without running, walking, exploring, watching, reading and hearing. I know how it is important to be able to feel the city you are visiting. So if you are looking for a place where","Great location will allow you to explore and enjoy Taksim, Pera,Galata , Şişhane and Cihangir. (great walking distance for all the events), Easily accessible to subway, tram, ferries. The neighbourhood is friendly and diverse. It is only 3 minutes by walk to the Galatasaray Square which is located approximately at the center of the Istiklal Avenue. Istiklal Avenue is located in the historic Beyoğlu (Pera) district, it is an elegant pedestrian street, 1.4 kilometers long, which houses boutiques, music stores, bookstores, art galleries, cinemas, theatres, libraries, cafés, pubs, night clubs with live music, historical patisseries, chocolateries and restaurants. 
You should definately walk through the local bazaar on Sunday right beside the apartment. You can find anything you need with cheap prices; fruits, vegetables, nuts, fish, meat, dairy products! Local bazaars will be very helpful to understand the culture of Turkish people as well. There are couple of restaurants close by where yo","From/To Airports: There are several ways to get from the airport to the apartment, but the most convenient manner is to take “HAVATAŞ"" shuttle to Taksim Square departing every 30 minutes from the airport (from both airports- Atatürk and Sabiha Gökçen). As you may be unfamiliar with the area, I am happy to come and pick you up in front of Galatasaray Highschool (on Istiklal Street) which is 10 minutes walk from Taksim Square where you will get off. I can always advise you cheaper public transport options if you ask for. Useful information: You can rent the apartment/room for (a) day(s), week, month or longer periods of time. There is various supermarkets conveniently situated a block away from the apartment on the way to Istiklal street, also a small kiosk right next to the apartment and a laundry in 100 meters distance.",8,"{'street': 'Beyoğlu, İstanbul, Turkey', 'government_area': 'Beyoglu', 'market': 'Istanbul', 'country': 'Turkey', 'country_code': 'TR', 'location': {'type': 'Point', 'coordinates': [28.97477, 41.03735], 'is_location_exact': False}}",0.676147



🎉 SUCCESS: MongoDB returned only the 8 top-level fields we asked for!
   Compare this to Part B, where it sent all 40 fields.
   That is 80% fewer top-level fields, with reviews, images, and embeddings left out.
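The `score` column in the results above comes from `{"$meta": "vectorSearchScore"}`. For an index that uses cosine similarity, the Atlas documentation describes the score as the raw cosine similarity rescaled from the range [-1, 1] to [0, 1]. The sketch below works through that normalization on toy vectors. The function names are hypothetical, and the formula should be checked against the Atlas documentation for the version in use.

```python
import math


def cosine_similarity(a, b):
    """Plain cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def atlas_cosine_score(a, b):
    """Rescale cosine similarity from [-1, 1] to [0, 1]: (1 + cos) / 2."""
    return (1 + cosine_similarity(a, b)) / 2


print(atlas_cosine_score([1.0, 0.0], [1.0, 0.0]))   # 1.0 (same direction)
print(atlas_cosine_score([1.0, 0.0], [0.0, 1.0]))   # 0.5 (orthogonal)
print(atlas_cosine_score([1.0, 0.0], [-1.0, 0.0]))  # 0.0 (opposite)
```

Under this normalization, the top score of about 0.70 in the table corresponds to a raw cosine similarity of about 0.40.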


In [14]:
# Cell 12: Side-by-side comparison
print("=" * 80)
print("📊 PART B vs PART C COMPARISON")
print("=" * 80)
print("\n🔴 WITHOUT Projection (Part B):")
print("   - Fields returned: 40")
print("   - Includes: All reviews, images, host info, amenities, etc.")
print("   - Data transfer: 100% (full documents)")
print("   - Use case: When you need ALL the data")

print("\n🟢 WITH Projection (Part C):")
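print("   - Fields returned: 8 top-level (from 14 projected field paths, plus score)")
print("   - Includes: Only name, accommodates, address, summary, space, neighborhood_overview, notes, score")
print("   - Data transfer: much smaller (80% fewer top-level fields; no reviews, images, or embeddings)")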
print("   - Use case: When you know exactly what fields you need")

print("\n" + "=" * 80)
print("💡 KEY LEARNINGS ABOUT PROJECTIONS")
print("=" * 80)
print("\n1️⃣  What is a projection?")
print("   A field selection that tells MongoDB which fields to send back (it does not filter documents).")

print("\n2️⃣  How does it work?")
print("   Use the $project stage in the aggregation pipeline:")
print("   - '1' means include this field")
print("   - '0' means exclude this field")
print("   - Works with nested fields (address.street)")
print("   - Can add calculated fields ($meta)")

print("\n3️⃣  Why use projections?")
print("   ✅ Faster - Less data over the network")
print("   ✅ Cheaper - Reduced bandwidth costs")
print("   ✅ Cleaner - Only get what you need")
print("   ✅ LLM-friendly - Focused context for GPT")

print("\n4️⃣  When to use projections?")
print("   ✅ Building APIs that return specific fields")
print("   ✅ Feeding data to LLMs (keep context focused)")
print("   ✅ Large documents where you only need a few fields")
print("   ✅ Performance-critical applications")

print("\n5️⃣  The $meta operator")
print("   Special operator to access MongoDB-calculated metadata:")
print("   - vectorSearchScore: How similar to the query (0-1)")
print("   - textScore: Full-text search relevance")
print("   - This data isn't stored - it's calculated during search!")

print("\n" + "=" * 80)
print("🎓 LESSON 3 COMPLETE!")
print("=" * 80)
print("\nYou now understand:")
print("✅ What projections are")
print("✅ How to define a projection stage")
print("✅ When and why to use projections")
print("✅ How $meta adds calculated fields")
print("✅ The performance benefits (40 → 8 top-level fields per result)")
print("\n🚀 Next: Apply this to your own projects!")
print("=" * 80)

📊 PART B vs PART C COMPARISON

🔴 WITHOUT Projection (Part B):
   - Fields returned: 40
   - Includes: All reviews, images, host info, amenities, etc.
   - Data transfer: 100% (full documents)
   - Use case: When you need ALL the data

🟢 WITH Projection (Part C):
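   - Fields returned: 8 top-level (from 14 projected field paths, plus score)
   - Includes: Only name, accommodates, address, summary, space, neighborhood_overview, notes, score
   - Data transfer: much smaller (80% fewer top-level fields; no reviews, images, or embeddings)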
   - Use case: When you know exactly what fields you need

💡 KEY LEARNINGS ABOUT PROJECTIONS

1️⃣  What is a projection?
   A field selection that tells MongoDB which fields to send back (it does not filter documents).

2️⃣  How does it work?
   Use the $project stage in the aggregation pipeline:
   - '1' means include this field
   - '0' means exclude this field
   - Works with nested fields (address.street)
   - Can add calculated fields ($meta)

3️⃣  Why use projections?
   ✅ Faster - Less data over the network
   ✅ Cheaper - Reduced bandwidth costs
   ✅ Cleaner - Only get what you need
   ✅ LLM-friendly - Focused context