# 5-Day Gen AI Intensive Course with Google
# Capstone Project: Gen AI Powered Caribbean Getaway Travel Agent
# Will Contreras

## 1. Introduction: Use Case and Problem Statement

This notebook demonstrates the creation of a simulated AI-powered travel agent specializing in weekend getaways to the Caribbean. The core problem is that traditional travel websites often rely on static listings or rigid, rule-based recommendation systems, making it difficult for users to discover personalized and engaging travel options that truly match their unique interests and preferences. This project explores how a Large Language Model (LLM) can dynamically generate relevant promotional content and interpret user queries in a more flexible and context-aware manner, offering a more intuitive and personalized travel planning experience. While this notebook presents a simulation and does not connect to real-time booking or inventory systems, it effectively showcases the potential of Generative AI to power such an intelligent travel agent.

## 2. Gen AI Capabilities Utilized

This project leverages the following Generative AI capabilities, drawing from the provided Capstone Project Capabilities List:

- **Few-shot Prompting**: The travel promotion descriptions are generated with a set of explicit few-shot examples in the prompt. These examples demonstrate the desired format and content style for different Caribbean destinations and themes, so the LLM can imitate them when generating new content (see the `few_shot_examples` list in section 3.3).
- **Embeddings**: We utilized the Gemini embedding model to create numerical representations (embeddings) of the generated travel promotion descriptions. These embeddings capture the semantic meaning of the text, enabling us to perform efficient semantic similarity searches.
- **Retrieval Augmented Generation (RAG)**: The core of our recommendation system employs RAG. When a user enters a query, we generate an embedding of the query and then use vector search to retrieve the most semantically similar travel promotions from our embedded database. The descriptions of these retrieved promotions are then presented to the user as relevant recommendations.
- **Vector Search**: The vector_search function implements the capability to find the top-N most similar embeddings to a user's query embedding. This function calculates cosine similarity between the query embedding and the embeddings of the travel promotions, ranking them by similarity to retrieve the most relevant ones.
- **Gen AI Evaluation**: Although not a fully automated evaluation pipeline, the project includes a step where the generated travel promotion descriptions are evaluated by the LLM itself based on criteria like relevance, attractiveness, conciseness, and accuracy. This demonstrates a basic form of Gen AI evaluation.

## 3. Implementation
### 3.1 Configuring libraries and API
**3.1.1 Installing the Gemini API client library (version 1.7.0)**

In [None]:
!pip install -Uq "google-genai==1.7.0"

**3.1.2 Importing SDK and some helpers for rendering the output**
- Importing the genai library for interacting with the Gemini models, the types module, and Markdown and display from IPython.display for richer output.
- Printing the installed version of the google-genai library for verification.

In [None]:
from google import genai
from google.genai import types

from IPython.display import Markdown, display

genai.__version__

**3.1.3 Configuring API key, initializing a model, and testing it**
- Gemini API client initialized with API key.
- Testing gemini-1.5-flash model and API connection with a simple request.
- Printing text response.

In [None]:
client = genai.Client(api_key="My_API_key")

response = client.models.generate_content(
    model="gemini-1.5-flash",
    contents="How many colors are in a rainbow?")

print(response.text)
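
Pasting a key directly into a notebook cell makes it easy to leak when the notebook is shared. The following is a minimal sketch of one alternative, assuming the key has been stored in an environment variable; the variable name `GOOGLE_API_KEY` and the helper `load_api_key` are assumptions for illustration, not part of the original notebook.

```python
import os


def load_api_key(var_name: str = "GOOGLE_API_KEY") -> str:
    """Read the API key from an environment variable (the variable name is an assumption)."""
    key = os.environ.get(var_name, "").strip()
    if not key:
        raise RuntimeError(f"Set the {var_name} environment variable before creating the client.")
    return key


# Usage in the notebook (requires the variable to be set beforehand):
# client = genai.Client(api_key=load_api_key())
```

Failing early with a clear message is a design choice here: a missing key would otherwise only surface later as an authentication error on the first request.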

### 3.2 Automated API Call Retries for Robustness

This section adds automated retry logic for calls to the Gemini API so that the application can recover from transient failures. The retry predicate covers two HTTP status codes: 429 (Too Many Requests, which indicates rate limiting) and 503 (Service Unavailable). Other errors are raised immediately.

- Importing the retry module from the google-api-core library, which provides functionality for automatically retrying operations.
- Reusing the `genai` module imported in section 3.1.2; its `genai.errors.APIError` exception is what the retry predicate inspects.

In [None]:
from google.api_core import retry

# Retry only on rate limiting (429) and temporary unavailability (503).
is_retriable = lambda e: (isinstance(e, genai.errors.APIError) and e.code in {429, 503})

# client.models is an instance of genai.models.Models, so wrapping the class methods
# covers the client.models.generate_content and client.models.embed_content calls below.
# The __wrapped__ check avoids wrapping a method twice if this cell is re-run.
if not hasattr(genai.models.Models.generate_content, '__wrapped__'):
    genai.models.Models.generate_content = retry.Retry(predicate=is_retriable)(genai.models.Models.generate_content)

if not hasattr(genai.models.Models.embed_content, '__wrapped__'):
    genai.models.Models.embed_content = retry.Retry(predicate=is_retriable)(genai.models.Models.embed_content)
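
To make the retry behavior concrete without calling the API, here is a small standalone sketch of retrying with exponential backoff. It is not how `google.api_core.retry` is implemented; `TransientError`, `call_with_retry`, and `flaky_call` are hypothetical names introduced only for this illustration.

```python
import time


class TransientError(Exception):
    """Hypothetical stand-in for an API error that carries an HTTP status code."""

    def __init__(self, code: int):
        super().__init__(f"HTTP {code}")
        self.code = code


def call_with_retry(func, retriable_codes=(429, 503), max_attempts=4, base_delay=0.01):
    """Call func(); on a retriable error wait base_delay * 2**attempt, then try again."""
    for attempt in range(max_attempts):
        try:
            return func()
        except TransientError as err:
            last_attempt = attempt == max_attempts - 1
            if err.code not in retriable_codes or last_attempt:
                raise  # non-retriable error, or no attempts left
            time.sleep(base_delay * 2 ** attempt)


# A fake API call that fails twice with 503 and then succeeds.
attempts = {"count": 0}


def flaky_call():
    attempts["count"] += 1
    if attempts["count"] >= 3:
        return "ok"
    raise TransientError(503)


result = call_with_retry(flaky_call)
print(result, attempts["count"])  # ok 3
```

The delay doubles after each failure so that a rate-limited service gets progressively more time to recover, while errors outside the retriable set surface immediately.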

### 3.3 Generating Few-Shot Examples for Promotional Text Generation

This section defines a function to generate text using the Gemini model with a higher temperature for more creative outputs and then uses it to create a list of few-shot examples for generating promotional content.  

- High temperature configuration: Defines high_temp_config with a temperature of 1.0 to encourage more varied text generation.

- generate_text function: This function takes a prompt, model name (defaulting to 'gemini-1.5-flash'), and maximum tokens (defaulting to 150) as input. It sends the prompt to the specified Gemini model with the high-temperature configuration and returns the generated text. It includes error handling to print a message and return None if text generation fails.

- Destinations and their prompts: The destinations_prompts list contains tuples. Each tuple includes a destination name, a primary activity/theme associated with that destination, and a specific prompt instructing the LLM to act as a tourism brochure writer and generate a concise description emphasizing the activity and including one more popular tourist attraction.

- Generating and storing examples: The code initializes an empty list called few_shot_examples. It then iterates through the destinations_prompts list. For each destination and prompt, it calls the generate_text function. If text is successfully generated, a dictionary containing the destination, the associated activity, and the generated description is appended to the few_shot_examples list.

- The generated description is also printed to the console for review. If text generation fails for a particular destination, a failure message is printed.

In [None]:
high_temp_config = types.GenerateContentConfig(temperature=1.0)

def generate_text(prompt: str, model_name: str = "gemini-1.5-flash", max_tokens: int = 150):
    """
    Generates text using the specified Gemini model.

    Args:
        prompt: The text prompt to send to the model.
        model_name: The name of the Gemini model to use (default: "gemini-1.5-flash").
        max_tokens: Maximum number of tokens to generate (default: 150).

    Returns:
        The generated text, or None if an error occurs.
    """
    try:
        response = client.models.generate_content(
            model=model_name,
            # Reuse the high temperature and apply the max_tokens limit,
            # which was previously accepted but never passed to the model.
            config=types.GenerateContentConfig(
                temperature=high_temp_config.temperature,
                max_output_tokens=max_tokens,
            ),
            contents=prompt,
        )
        return response.text
    except Exception as e:
        print(f"Error generating text: {e}")
        return None

destinations_prompts = [
    ("Bonaire", "Water Sports", "As a writer of tourism brochures, generate a concise, tourism-oriented description for Bonaire, emphasizing its appeal for water sports due to its shore diving, diver's paradise status, protected coastline marine park, windsurfing and kiteboarding mecca in Lac Bay, snorkeling the shallow reefs, and sailing opportunities. Include one more notable and popular touristic activity."),
    ("St. Lucia", "Romance", "As a writer of tourism brochures, generate a concise, tourism-oriented description for St. Lucia, emphasizing its romantic atmosphere with its stunning Piton mountains, lush rainforests, pristine beaches ideal for couples, luxurious resorts, and opportunities for intimate experiences. Include one more notable and popular touristic activity."),
    ("Dominica", "Adventure and thrill seeking", "As a writer of tourism brochures, generate a concise, tourism-oriented description for Dominica, emphasizing its natural beauty as the 'Nature Island' with its Boiling Lake, Trafalgar Falls, Emerald Pool, Waitukubuli National Trail for hiking, whale watching opportunities, and vibrant Kalinago culture. Include one more notable and popular touristic activity."),
    ("Turks and Caicos", "Wellness", "As a writer of tourism brochures, generate a concise, tourism-oriented description for Turks and Caicos, emphasizing its appeal for wellness activities such as its luxurious retreats like Amanyara, Kokomo Botanical Resort, and eco-focused wellness activities. Include one more notable and popular touristic activity."),
    ("Bahamas", "Family", "As a writer of tourism brochures, generate a concise, tourism-oriented description for the Bahamas, emphasizing its appeal for family-oriented activities, such as Atlantis Paradise Island, Aquaventure Water Park, Marine Habitats, iconic architecture, beaches, Pirates of Nassau Museum, Cable Beach, Blue Lagoon Island to encounter dolphins. Include one more notable and popular touristic activity."),
    ("Puerto Rico", "History and Culture", "As a writer of tourism brochures, generate a concise, tourism-oriented description for Puerto Rico, emphasizing its rich historic, cultural heritage as found in Old San Juan, Castillo San Felipe del Morro, Castillo San Cristobal, La Fortaleza, The City Walls. Include one more notable and popular touristic activity."),
    ]

few_shot_examples = []
print("Few Shot Examples generation:\n")
for destination, activity, prompt in destinations_prompts:
    description = generate_text(prompt)
    # Print a readable header instead of the raw (destination, activity) tuple
    print(f"--- {destination} ({activity}) ---")
    if description:
        few_shot_examples.append({"destination": destination, "activity": activity, "description": description})
        print(description)
    else:
        print("Failed to generate description.")
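
As an illustration of how a list shaped like `few_shot_examples` can be turned into a few-shot prompt, here is a minimal sketch. It uses a labeled "Destination / Description" layout, which is an assumption for illustration and differs from the notebook, where the raw descriptions are appended to the prompt; `build_few_shot_prompt` and the sample texts are hypothetical.

```python
def build_few_shot_prompt(instruction: str, examples: list[dict], destination: str) -> str:
    """Assemble an instruction, labeled examples, and an open slot for the new destination."""
    lines = [instruction, ""]
    for example in examples:
        lines.append(f"Destination: {example['destination']} ({example['activity']})")
        lines.append(f"Description: {example['description']}")
        lines.append("")
    # Leave the last description empty so the model completes it.
    lines.append(f"Destination: {destination}")
    lines.append("Description:")
    return "\n".join(lines)


# Placeholder examples with the same keys as few_shot_examples; the texts are illustrative only.
sample_examples = [
    {"destination": "Bonaire", "activity": "Water Sports", "description": "Shore diving and windsurfing in Lac Bay."},
    {"destination": "St. Lucia", "activity": "Romance", "description": "Sunset views of the Pitons for two."},
]

prompt = build_few_shot_prompt("Write a concise tourism description.", sample_examples, "Dominica")
print(prompt)
```

Ending the prompt on an empty `Description:` slot signals where the model should continue and makes the expected output format explicit.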

### 3.4 Generating and Evaluating Travel Promotions with Package Information

This section focuses on generating more detailed travel promotions for a list of Caribbean destinations, including a concise description and package information (potential inclusions, example dates, trip duration, and estimated price). It also incorporates an evaluation step to assess the quality of the generated descriptions.

* **Importing Libraries:** The cell starts by importing necessary libraries:
    * `import json`: For handling JSON data structures.
    * `import re`: For regular expression operations used to extract information from the LLM's responses.

* **Defining common themes and new destinations:** The `common_themes` variable stores a string of potential travel themes. The `new_themes_destinations` list contains dictionaries, each specifying a destination and the `common_themes` that might apply.

* **Initializing `generated_promotions` list:** An empty list called `generated_promotions` is created to store the generated travel package details for each destination.

* **`generate_package_info` function:** This function takes a `destination` name and its generated `description` as input. It constructs a detailed prompt asking the LLM to generate a concise summary of potential inclusions for a 3-night weekend getaway, suggest three example Friday starting dates in the summer of 2025 for a 4-day/3-night trip, provide the trip duration as "3 nights", and estimate a total price per person (including round-trip airfare and 3 nights' accommodation). It uses regular expressions to extract this structured information from the LLM's response and returns it as a tuple of inclusion summary, list of dates, duration (integer), and price (integer). If the LLM response is not in the expected format, it returns default values.

* **Main loop for generating and evaluating promotions:** The code iterates through each item in the `new_themes_destinations` list. For each destination:
    * It constructs a detailed prompt for the LLM, instructing it to act as a travel advisor and generate a short (under 20 tokens) paragraph describing the most popular activities, considering the common themes and beach activities relevant to the destination. It emphasizes vivid sensory details and specific examples and includes a few-shot example description for guidance (with adjustments for "San Juan, Puerto Rico").
    * It calls `client.models.generate_content` with `temperature=0.5` and `top_p=0.1` to get the promotional description.
    * It then calls the `generate_package_info` function to get the package details based on the generated description.
    * An evaluation prompt is constructed, asking the LLM (using the `gemini-2.0-flash` model) to act as a travel content evaluator and score the generated description based on relevance to the Caribbean, attractiveness, conciseness, and accuracy. It requests a score (1-5) and a brief justification for each criterion, along with overall feedback. The description and its evaluation are then printed.
    * Finally, a dictionary containing the extracted information (destination, theme, description, availability, trip duration, price, includes) is appended to the `generated_promotions` list, and the key package details are printed. Error handling is included for the description generation.

* **Printing generated promotions:** After processing all destinations, the `generated_promotions` list, containing all the generated travel packages, is printed to the console in a nicely formatted JSON structure.
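
The extraction step that `generate_package_info` performs can be tried offline. The sketch below applies the same `Dates` and `Price` patterns to a hand-written sample response (not real model output) and adds one extra check that is not in the notebook: confirming that each returned date really falls on a Friday, since the model is not guaranteed to honor that constraint.

```python
import re
from datetime import date

# Hand-written sample in the format the prompt requests; not real model output.
sample_response = """**Summary:** Airport transfers, a guided snorkeling tour, and daily breakfast.
**Dates:** 2025-06-06, 2025-06-13, 2025-07-05
**Trip Duration:** 3 nights
**Price:** 1450"""

dates_match = re.search(r"\*\*Dates:\*\*\s*([\d\-\, ]+)", sample_response)
price_match = re.search(r"\*\*Price:\*\*\s*(\d+)", sample_response)

dates = [d.strip() for d in dates_match.group(1).split(",")] if dates_match else []
price = int(price_match.group(1)) if price_match else None

# date.weekday() returns 4 for Friday; keep only dates that satisfy the constraint.
fridays = [d for d in dates if date.fromisoformat(d).weekday() == 4]

print(dates)    # ['2025-06-06', '2025-06-13', '2025-07-05']
print(fridays)  # ['2025-06-06', '2025-06-13']
print(price)    # 1450
```

Note that the price pattern expects bare digits: a response such as `$1,450` would not parse as intended, which is one reason the function falls back to `None`.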

In [None]:
import json
import re

common_themes = "romance, family, adventure, wellness, culture, water sports"
new_themes_destinations = [
    {"theme": common_themes, "destination": "St. Lucia"},
    {"theme": common_themes, "destination": "Bonaire"},
    {"theme": common_themes, "destination": "Dominica"},
    {"theme": common_themes, "destination": "Turks and Caicos"},
    {"theme": common_themes, "destination": "San Juan, Puerto Rico"}
]

generated_promotions = []

def generate_package_info(destination, description):
    prompt = f"""Generate a concise summary (max 30 tokens) of what a typical 3-night weekend getaway travel package to {destination} **might include**, considering the following description of the destination: "{description}". Think about common inclusions such as airport transfers, perhaps one guided activity or tour related to the main attractions, and possibly some meals or discounts. Provide three example available **starting** dates in the summer of 2025 that fall on a **Friday**, for a **4-day/3-night trip**, in YYYY-MM-DD format. Provide the **trip duration** as "3 nights". Also, provide an estimated total price per person (including round-trip airfare from a major US city and accommodation for 3 nights) as a single integer. Respond in the following format:

**Summary:** [your summary]
**Dates:** [YYYY-MM-DD], [YYYY-MM-DD], [YYYY-MM-DD]
**Trip Duration:** [number] nights
**Price:** [integer price]"""
    try:
        response = client.models.generate_content(
            model="gemini-1.5-flash",
            contents=prompt,
        )
        if response.text:
            inclusion_match = re.search(r"\*\*Summary:\*\*\s*(.*)", response.text)
            dates_match = re.search(r"\*\*Dates:\*\*\s*([\d\-\, ]+)", response.text)
            duration_match = re.search(r"\*\*Trip Duration:\*\*\s*(\d+)\s*nights", response.text)
            price_match = re.search(r"\*\*Price:\*\*\s*(\d+)", response.text)

            inclusion = inclusion_match.group(1).strip() if inclusion_match else None
            dates_str = dates_match.group(1).strip() if dates_match else None
            duration_str = duration_match.group(1).strip() if duration_match else None
            price_str = price_match.group(1).strip() if price_match else None

            dates = [d.strip() for d in dates_str.split(',')] if dates_str else []
            duration = int(duration_str) if duration_str and duration_str.isdigit() else None
            price = int(price_str) if price_str and price_str.isdigit() else None

            return inclusion, dates, duration, price
        else:
            return None, [], None, None
    except Exception as e:
        print(f"Error generating package info for {destination}: {e}")
        return None, [], None, None

# Main loop: generate a description per destination, evaluate it, then build the package details
for item in new_themes_destinations:
    print(f"Processing Destination: {item['destination']}")
    prompt_parts = [
        "You are a superb travel advisor. Generate a short paragraph in no more than 20 tokens describing the most popular and famous activities at",
        item["destination"],
        "considering the following potential themes: romance, family, adventure, wellness, and culture and history. Do not forget beach activities suited to the destination's particularities.",
        "Based on knowledge about tourism in the Caribbean, identify the 2 or 3 most popular or significant of these themes for",
        item["destination"],
        "and focus most of the description on activities related to these top themes. Ensure the description is relevant to a Caribbean vacation.",
        "**To make the description highly attractive and engaging, incorporate vivid sensory details (sights, sounds, tastes, smells, feelings) and mention specific examples of locations, activities, or unique aspects that make this destination stand out.**",
        "Do not include 'Theme:' or 'Destination:' prefixes in your response. Here are examples:",
    ]
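    # Add a destination-specific example description (or guidance) to the prompt.
    # Take the name from the current item; without this, `destination` would still
    # hold the last value from the few-shot loop in section 3.3.
    destination = item["destination"].strip()
    if destination == "St. Lucia":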
        prompt_parts.append("Picture dramatic emerald Pitons soaring above turquoise waters. Indulge in couples' spa treatments with volcanic mud. Hike through fragrant rainforests to thundering waterfalls. Snorkel kaleidoscopic coral reefs teeming with life. Savor the zesty flavors of freshly caught grilled fish. St. Lucia: where breathtaking beauty ignites romance and adventure.")
    elif destination == "Bonaire":
        prompt_parts.append("Dive into Bonaire's underwater paradise, where visibility stretches endlessly over vibrant coral gardens teeming with colorful fish. Feel the gentle tug of the current as you drift along pristine reefs. Above the surface, the warm sun kisses your skin on tranquil beaches, followed by the exquisite taste of freshly grilled seafood at sunset. Bonaire: an unforgettable haven for divers and nature lovers.")
    elif destination == "Dominica":
        prompt_parts.append("Discover Dominica, the 'Nature Island': hike through emerald rainforests alive with the sounds of exotic birds and the scent of tropical blooms. Feel the primal heat of the Boiling Lake and swim in the crystal-clear embrace of hidden waterfalls. Explore dramatic coastlines and encounter unique wildlife. Dominica: an adventure for the soul, far beyond the ordinary beach escape.")
    elif destination == "Turks and Caicos":
        prompt_parts.append("Sink your toes into the unbelievably soft, powdery white sands of Grace Bay, where turquoise waters stretch to the horizon, inviting you for a refreshing swim. Feel the warm caress of the sun and the gentle whisper of the sea breeze. Indulge in world-class dining and luxurious spa treatments, as the breathtaking beauty of Turks and Caicos rejuvenates your senses. Pure island indulgence.")
    elif destination == "San Juan, Puerto Rico":
        prompt_parts.append("To enhance appeal, incorporate more sensory details, such as the sounds of Old San Juan or the flavors of the local cuisine.")
        prompt_parts.append("Mention specific examples of historical sites or cultural experiences beyond just the names.")
        prompt_parts.append("Keep the description concise and short.")

    for example in few_shot_examples:
        prompt_parts.append(f"\n{example['description']}")

    prompt_description = " ".join(prompt_parts)

    model_config = types.GenerateContentConfig(
        temperature=0.5,
        top_p=0.1
    )
    try:
        response_description = client.models.generate_content(
            model="gemini-1.5-flash",
            config=model_config,
            contents=prompt_description
        )
        description = response_description.text.strip() if response_description.text else "No description generated."

        # --- Evaluation Step ---
        evaluation_prompt = f"""You are acting as an expert travel content evaluator. You will assess the following promotional description for a Caribbean destination based on several criteria. Provide a score from 1 (very poor) to 5 (excellent) for each criterion and a brief justification for each score.

        Destination Description: {description}

        Evaluation Criteria:
        - Relevance to Caribbean travel: How well does the description highlight aspects and activities commonly associated with a Caribbean vacation?
        - Attractiveness and Persuasiveness: How compelling and inviting is the description to a potential traveler?
        - Conciseness and Clarity: Is the description concise, easy to understand, and within a reasonable length?
        - Accuracy of implied information: Based on your general knowledge, does the description seem to align with typical offerings and attractions of such a destination?

        Provide your evaluation in the following format:
        Relevance Score: [score]
        Justification: [justification]
        Attractiveness Score: [score]
        Justification: [justification]
        Conciseness Score: [score]
        Justification: [justification]
        Accuracy Score: [score]
        Justification: [justification]
        Overall Feedback: [A brief summary of the strengths and weaknesses of the description]
        """

        response_evaluation = client.models.generate_content(model="gemini-2.0-flash", contents=evaluation_prompt)
        evaluation_text = response_evaluation.text.strip() if response_evaluation.text else "No evaluation generated."
        print(f"\n--- {item['destination']} ---")
        print(description)
        print("\n--- Evaluation of Description ---")
        print(evaluation_text)

        inclusion, availability, duration, price = generate_package_info(item["destination"], description)

        generated_promotions.append({
            "destination": item["destination"].split('(')[0].strip(),
            "theme": item["theme"],
            "description": description,
            "availability": availability,
            "trip_duration_nights": duration,
            "price_per_person": price,
            "includes": inclusion
        })
        if inclusion:
            print(f"Includes: {inclusion}")
        if availability:
            print(f"Availability (starting Fridays): {', '.join(availability)}")
        if duration is not None:
            print(f"Trip Duration: {duration} nights")
        if price is not None:
            print(f"Price per person (including airfare & 3 nights): ${price}\n")

    except Exception as e:
        print(f"\n--- {item['destination']} ---")
        print(f"Error generating description: {e}")

print("\n--- Generated Promotions ---")
print(json.dumps(generated_promotions, indent=4))
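
The evaluation text is only printed in the cell above. If the scores need to be compared across destinations, they can be pulled out of the requested format with a regular expression. This is a sketch on a hand-written sample (not real model output); `parse_scores` is a hypothetical helper, and it assumes the model follows the `<Criterion> Score: <n>` layout exactly.

```python
import re

# Hand-written sample in the requested evaluation format; not real model output.
sample_evaluation = """Relevance Score: 5
Justification: Focuses on beaches and reefs.
Attractiveness Score: 4
Justification: Vivid, but slightly generic.
Conciseness Score: 3
Justification: Longer than requested.
Accuracy Score: 5
Justification: Matches well-known attractions.
Overall Feedback: Strong imagery; could be shorter."""


def parse_scores(evaluation_text: str) -> dict[str, int]:
    """Extract '<Criterion> Score: <n>' pairs (n from 1 to 5) into a dictionary."""
    pairs = re.findall(r"^\s*(\w+) Score:\s*([1-5])\b", evaluation_text, flags=re.MULTILINE)
    return {name: int(score) for name, score in pairs}


scores = parse_scores(sample_evaluation)
average = sum(scores.values()) / len(scores)
print(scores)   # {'Relevance': 5, 'Attractiveness': 4, 'Conciseness': 3, 'Accuracy': 5}
print(average)  # 4.25
```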

### 3.5 Generating Embeddings for Promotion Descriptions

This section defines a function to generate embedding vectors for text using the Gemini Embedding model and then applies this function to the descriptions of the generated travel promotions. These embeddings will be used for semantic search and retrieval.

* **`get_embedding` function:** This function takes a text string (`text`), an optional embedding model name (defaulting to 'models/gemini-embedding-exp-03-07'), and an optional task type (defaulting to 'retrieval_document') as input. It calls the Gemini Embedding model through the `client.models.embed_content` method to generate an embedding for the input text. It includes error handling to catch potential exceptions during the embedding process and prints informative messages if the embedding generation fails or if the result structure is unexpected. It returns the list of embedding values or `None` if an error occurs.

* **Generating embeddings for promotions:** This loop iterates through the `generated_promotions` list (created in the previous section). For each promotion, it retrieves the 'description'. If a description exists, it calls the `get_embedding` function to obtain its vector embedding. If the embedding is successfully generated, it is added as a new key 'description_embedding' to the promotion dictionary. If the embedding fails or if there is no description, the 'description_embedding' is set to `None`. A message is printed to the console indicating the destination for which an embedding was generated.

* **Embedding generation completion message:** After processing all promotions, a message "Embeddings generation complete." is printed to the console.

* **Optional: Printing promotions with truncated embeddings as JSON:** The last part of the cell prints the promotions together with their embeddings in a readable form; it does not write a file. It iterates through the `generated_promotions` list and, for each promotion that has an embedding, replaces the full vector with a string showing only its first and last five values separated by dots, collecting the results in a new list, `promotions_with_embeddings_json`. Promotions without an embedding are included unchanged. The list is then printed as formatted JSON.

In [None]:
def get_embedding(text: str, model_name="models/gemini-embedding-exp-03-07", task_type="retrieval_document") -> list[float] | None:
    """Returns embeddings for the given text using Gemini."""
    try:
        result = client.models.embed_content(
            model=model_name,
            contents=text,
            config=types.EmbedContentConfig(task_type=task_type)
        )
        if result and result.embeddings and len(result.embeddings) > 0 and result.embeddings[0].values:
            return result.embeddings[0].values
        else:
            print(f"Unexpected embedding result structure for '{text[:20]}...': {result}")
            return None
    except Exception as e:
        print(f"Error generating embedding: {e}")
        return None

# Generate embeddings for our generated promotions
for promo in generated_promotions:
    description = promo.get("description")
    if description:
        embedding = get_embedding(description)
        if embedding is not None:
            promo['description_embedding'] = embedding
            print(f"Generated embedding for: {promo['destination']}")
        else:
            promo['description_embedding'] = None # Store None if embedding fails
    else:
        promo['description_embedding'] = None # If no description

print("\nEmbeddings generation complete.")

# Optional: print the promotions as JSON, with embeddings truncated for readability
import json
promotions_with_embeddings_json = []
for promo in generated_promotions:
    embedding = promo.get("description_embedding")
    if embedding:
        truncated_embedding = f"{embedding[:5]}............{embedding[-5:]}"
        promotions_with_embeddings_json.append({**promo, "description_embedding": truncated_embedding})
    else:
        promotions_with_embeddings_json.append(promo)

print("\nPromotions with embeddings (JSON - truncated embeddings):")
print(json.dumps(promotions_with_embeddings_json, indent=4))


### 3.6 Semantic Similarity Search Function

This section defines the vector_search function, which is responsible for finding the travel promotions that are most semantically similar to a user's search query, while also ensuring a minimum level of relevance. It achieves this by comparing the embedding of the query with the embeddings of our generated promotion descriptions using cosine similarity and applying a threshold.

- **Importing Libraries:**

    - from sklearn.metrics.pairwise import cosine_similarity: Imports the function to calculate cosine similarity between vectors.
    - import numpy as np: Imports the NumPy library, which provides support for large, multi-dimensional arrays and matrices, along with a collection of high-level mathematical functions to operate on these arrays, including the argsort function used for sorting.

- **vector_search function:**

**Arguments:**
- query_embedding (list[float]): The embedding vector of the query.
- embeddings (list[list[float]]): A list of embedding vectors to search within.
- top_n (int, optional): The maximum number of top similar results to return. Defaults to 3.
- similarity_threshold (float, optional): Minimum cosine similarity score to consider a result. Defaults to 0.7.

**Functionality:**
- It calculates the cosine similarity between the query_embedding and all the embeddings using cosine_similarity.
- It then ranks the embeddings based on their similarity to the query using np.argsort and reverses the order to get the most similar first.
- It iterates through the ranked indices and adds the index to the top_indices list only if its corresponding similarity score is greater than or equal to the similarity_threshold and the number of top_indices is less than top_n.
- Finally, it returns the top_indices list, containing the indices of the most similar embeddings that meet the specified threshold.

In [None]:
from sklearn.metrics.pairwise import cosine_similarity
import numpy as np

def vector_search(query_embedding: list[float], embeddings: list[list[float]], top_n: int = 3, similarity_threshold: float = 0.7):
    """
    Performs vector search to find the top N most similar embeddings to the query embedding
    that are above a specified similarity threshold.

    Args:
        query_embedding (list[float]): The embedding vector of the query.
        embeddings (list[list[float]]): A list of embedding vectors to search within.
        top_n (int, optional): The maximum number of top similar results to return. Defaults to 3.
        similarity_threshold (float, optional): Minimum cosine similarity score to consider a result. Defaults to 0.7.

    Returns:
        list[int]: A list of indices corresponding to the top N most similar embeddings
                     that meet the threshold, sorted in descending order of similarity.
    """
    similarities = cosine_similarity([query_embedding], embeddings)[0]
    ranked_indices = np.argsort(similarities)[::-1]
    top_indices = []
    for index in ranked_indices:
        if similarities[index] >= similarity_threshold:
            top_indices.append(index)
            if len(top_indices) >= top_n:
                break
    return top_indices
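
To see how the ranking and the threshold interact, here is a worked example on toy 2-D vectors. It is a dependency-free sketch of the same idea rather than the notebook's function: `cosine` and `top_matches` are hypothetical helpers written in plain Python so that the arithmetic is visible.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity: dot(a, b) / (|a| * |b|)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_matches(query, vectors, top_n=3, threshold=0.7):
    """Return indices of the most similar vectors at or above the threshold, best first."""
    scored = sorted(((cosine(query, v), i) for i, v in enumerate(vectors)), reverse=True)
    return [i for score, i in scored if score >= threshold][:top_n]


# Toy 2-D vectors standing in for real embeddings.
query = [1.0, 0.0]
vectors = [
    [1.0, 0.0],   # same direction, similarity 1.0
    [1.0, 1.0],   # 45 degrees away, similarity about 0.707
    [0.0, 1.0],   # orthogonal, similarity 0.0
    [-1.0, 0.0],  # opposite direction, similarity -1.0
]

print(top_matches(query, vectors))                 # [0, 1]
print(top_matches(query, vectors, threshold=0.8))  # [0]
```

With the default threshold of 0.7 the vector at 45 degrees barely qualifies; raising the threshold to 0.8 drops it, which shows that the function can return fewer than `top_n` results.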

### 3.7 Implementing Retrieval-Augmented Generation (RAG) for Travel Recommendations

This section defines a function `travel_agent_rag_search` that simulates a travel agent using Retrieval-Augmented Generation (RAG). It takes a user's search query and a list of generated promotions (with their embeddings) as input to find the most relevant promotions based on semantic similarity.

* **`travel_agent_rag_search` function:**
    * **Arguments:**
        * `query` (str): The user's search query.
        * `generated_promotions` (list[dict]): A list of promotion dictionaries, where each dictionary is expected to contain a 'description' and a 'description_embedding'.
        * `top_n` (int, optional): The number of top matching promotions to retrieve. Defaults to 3.
    * **Functionality:**
        * It first generates an embedding for the user's `query` using the `get_embedding` function. If the embedding fails, it prints an error and returns an empty list.
        * It then extracts the valid promotion embeddings and their corresponding promotion dictionaries from the `generated_promotions` list (ignoring promotions without embeddings).
        * It performs a vector search using the `vector_search` function (defined in a previous cell) to find the `top_n` most similar promotion embeddings to the `query_embedding`.
        * It retrieves the actual promotion dictionaries corresponding to the indices returned by `vector_search`.
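        * Along the way, it prints diagnostic output: the truncated `query_embedding`, the number of valid embeddings and promotions, the indices returned by the search, and the matching promotions with their embeddings truncated.
        * Finally, it returns the matching promotion dictionaries. The list holds at most `top_n` entries and may hold fewer, because `vector_search` is called with its default `similarity_threshold` of 0.7 and drops any promotion that scores below it.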

* **Example Usage:** The code then demonstrates how to use the `travel_agent_rag_search` function with several example user queries ("romantic getaway", "best snorkeling spots", "cultural heritage tours", "relaxing beach vacation", "affordable family trips"). For each query, it calls the search function and prints the 'Destination', 'Theme', and a truncated 'Description Embedding' of the top matching promotions. If no matching promotions are found for a query, it prints a corresponding message. The output for each query is separated by "---".

In [None]:
def travel_agent_rag_search(query: str, generated_promotions: list[dict], top_n: int = 3):
    """
    Simulates a travel agent that uses embeddings and vector search (RAG)
    to find relevant promotions.

    Args:
        query (str): The user's search query.
        generated_promotions (list[dict]): A list of promotion dictionaries,
                                            where each dictionary contains 'description'
                                            and 'description_embedding'.
        top_n (int, optional): The number of top matching promotions to return. Defaults to 3.

    Returns:
        list[dict]: A list of the top N matching promotion dictionaries.
    """
    query_embedding = get_embedding(query)

    if query_embedding is None:
        print("Error generating embedding for the query.")
        return []

    print(f"Query Embedding: {query_embedding[:5]}............{query_embedding[-5:] if isinstance(query_embedding, list) and len(query_embedding) > 5 else query_embedding}...")

    # Extract valid promotion embeddings and their corresponding promotions
    valid_promotion_data = [(promo['description_embedding'], promo)
                            for promo in generated_promotions
                            if promo.get('description_embedding') is not None]

    if not valid_promotion_data:
        print("No valid promotion embeddings available.")
        return []

    # valid_promotion_data is known to be non-empty here, so zip(*...) always yields two tuples
    valid_embeddings, valid_promotions = zip(*valid_promotion_data)

    print(f"Number of valid embeddings: {len(valid_embeddings)}")
    print(f"Number of valid promotions: {len(valid_promotions)}")

    similar_indices = vector_search(query_embedding, list(valid_embeddings), top_n=top_n)

    print(f"Similar Indices: {similar_indices}")

    matching_promotions_truncated = []
    for promo in [valid_promotions[i] for i in similar_indices]:
        embedding = promo.get("description_embedding")
        if isinstance(embedding, list) and len(embedding) > 10:
            truncated_embedding = f"{embedding[:5]}............{embedding[-5:]}"
            matching_promotions_truncated.append({**{k: v for k, v in promo.items() if k != "description_embedding"}, "description_embedding": truncated_embedding})
        else:
            matching_promotions_truncated.append(promo)

    print(f"Matching Promotions: {matching_promotions_truncated}")

    return [valid_promotions[i] for i in similar_indices]

# Example usage: run the same search-and-print routine for each sample query
def print_matching_results(query: str, results: list[dict]) -> None:
    """Prints the destination, theme, and a truncated embedding for each result."""
    print(f"\nMatching results for '{query}':")
    if not results:
        print("  No matching promotions found.")
        return
    for result in results:
        print(f"  Destination: {result['destination']}")
        print(f"  Theme: {result['theme']}")
        embedding = result.get("description_embedding")
        if isinstance(embedding, list) and len(embedding) > 10:
            truncated_embedding = f"{embedding[:5]}............{embedding[-5:]}"
            print(f"  Description Embedding: {truncated_embedding}")
        else:
            print(f"  Description Embedding: {embedding}")
        print("-" * 20)

example_queries = [
    "romantic getaway",
    "best snorkeling spots",
    "cultural heritage tours",
    "relaxing beach vacation",
    "affordable family trips",
]

# Keep each query's results, keyed by the query text
matching_results_rag = {}
for example_query in example_queries:
    matching_results_rag[example_query] = travel_agent_rag_search(example_query, generated_promotions)
    print_matching_results(example_query, matching_results_rag[example_query])
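
Because `get_embedding` calls the Gemini embedding model, the retrieval flow above cannot be exercised offline. The following sketch shows the same embed, filter, rank, and return steps with a hypothetical stand-in embedding that simply counts words from a tiny vocabulary. Everything here (`VOCAB`, `toy_get_embedding`, `cosine`, `toy_rag_search`, and the sample promotions) is an assumption made for illustration, not the notebook's actual implementation, and a word-count vector is a much cruder notion of meaning than a real embedding.

```python
import math

# Hypothetical toy vocabulary; the notebook itself uses Gemini embeddings.
VOCAB = ["beach", "romantic", "snorkeling", "family", "culture"]

def toy_get_embedding(text):
    """Stand-in for get_embedding: counts vocabulary words, or None if none occur."""
    words = text.lower().split()
    vector = [float(words.count(term)) for term in VOCAB]
    return vector if any(vector) else None

def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def toy_rag_search(query, promotions, top_n=3, similarity_threshold=0.7):
    """Embed the query, skip promotions without embeddings, return the best matches."""
    query_vector = toy_get_embedding(query)
    if query_vector is None:
        return []
    scored = [
        (cosine(query_vector, promo["description_embedding"]), promo)
        for promo in promotions
        if promo.get("description_embedding") is not None
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [promo for score, promo in scored if score >= similarity_threshold][:top_n]

# Made-up promotions for illustration (not generated by the LLM)
toy_promotions = [
    {"destination": "Aruba", "description": "romantic beach dinner"},
    {"destination": "Bonaire", "description": "snorkeling reef tour"},
    {"destination": "Jamaica", "description": "family beach fun"},
    {"destination": "Barbados", "description": "rum distillery visit"},
]
for promo in toy_promotions:
    promo["description_embedding"] = toy_get_embedding(promo["description"])

romantic_hits = toy_rag_search("romantic getaway", toy_promotions)
print([promo["destination"] for promo in romantic_hits])  # ['Aruba']
```

The Barbados entry ends up with no embedding and is skipped, just as promotions without a `description_embedding` are ignored in `travel_agent_rag_search`. The Aruba match scores about 0.707, only just above the 0.7 threshold, which illustrates how sensitive the result list can be to that setting.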

### 3.8 Building a Conversational Agent

This section defines the `conversational_travel_agent` function, which provides an interactive experience for users to get travel recommendations. It simulates a conversation where the user can input their preferences, and the system will respond with relevant Caribbean getaway options based on the RAG implementation we've built.

* **Function Definition:** The `conversational_travel_agent` function takes an optional boolean argument `offer_interaction`, which defaults to `True`. This flag controls whether the interactive prompt is offered to the user.

* **Offering Interaction:** If `offer_interaction` is `True`, the function prompts the user if they would like to try the Caribbean Getaway Recommender. It takes user input and converts it to lowercase.

* **Interactive Loop:** If the user responds with 'yes', the function enters a `while True` loop, allowing for multiple queries within the same session:
    * **User Input:** It prompts the user to enter their travel preferences or type 'no' to continue to the next step (which will break the loop).
    * **Exiting the Loop:** If the user enters 'no', the loop breaks, and an "Exiting the travel agent." message is printed.
    * **Searching for Promotions:** For any other input, the `travel_agent_rag_search` function (defined in a previous cell) is called with the user's query, the `generated_promotions` data, and a `top_n` of 3 to find the most relevant travel options.
    * **Displaying Recommendations:** If matching promotions are found:
        * It prints a header indicating the recommendations.
        * It iterates through the `matching_promotions`. To avoid showing the same destination multiple times if it appears in the top results with slightly different details, it uses a `set` called `displayed_destinations` to keep track of already shown destinations.
        * For each unique destination, it prints the destination name, description, and any available price per person, trip duration in nights, inclusions, and availability (starting Fridays). It numbers each unique option for better readability.
    * **Handling No Results:** If no matching promotions are found for the user's query, a "Sorry, no getaways found." message is displayed.

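* **Skipping Interaction:** If the initial response to the "Would you like to try..." prompt is anything other than 'yes', the function prints "Skipping the interactive travel agent demo." If the `offer_interaction` flag is set to `False`, no prompt is shown and the function prints "Interactive travel agent demo is set to skip." instead.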

* **Error Handling:** The `try...except` block handles potential `EOFError` (if no input is received, e.g., in a non-interactive environment) and `KeyboardInterrupt` (if the user interrupts the input, e.g., by pressing Ctrl+C), gracefully skipping the interactive demo in these cases.

* **Main Execution Block:** The `if __name__ == "__main__":` block calls `conversational_travel_agent(offer_interaction=True)` whenever the code is executed directly, which includes running the cell in a notebook, and so starts the interactive session. Passing `offer_interaction=False` skips the demo.

In [None]:
def conversational_travel_agent(offer_interaction=True):
    if offer_interaction:
        try:
            print("Would you like to try the Caribbean Getaway Recommender? (yes/no)")
            response = input().lower()
            if response == 'yes':
                print("Welcome to the Caribbean Getaway Recommender!")
                while True:
                    user_query = input("\nPlease enter your travel preferences (or type 'no' to continue to the next step): ")
                    if user_query.lower() == 'no':
                        break
                    matching_promotions = travel_agent_rag_search(user_query, generated_promotions, top_n=3)
                    if matching_promotions:
                        print("\nHere are some recommendations based on your query:")
                        displayed_destinations = set()
                        option_number = 1
                        for promo in matching_promotions:
                            if promo['destination'] not in displayed_destinations:
                                print(f"\n--- Option {option_number} ---")
                                print(f"Destination: {promo['destination']}")
                                print(f"Description: {promo['description']}")
                                if promo.get("price_per_person"):
                                    print(f"Price per person: ${promo['price_per_person']}")
                                if promo.get("trip_duration_nights"):
                                    print(f"Trip Duration: {promo['trip_duration_nights']} nights")
                                if promo.get("includes"):
                                    print(f"Includes: {promo['includes']}")
                                if promo.get("availability"):
                                    print(f"Availability (starting Fridays): {', '.join(promo['availability'])}")
                                displayed_destinations.add(promo['destination'])
                                option_number += 1
                    else:
                        print("Sorry, no getaways found.")
                print("Exiting the travel agent.")
            else:
                print("Skipping the interactive travel agent demo.")
        except EOFError:
            print("\nNo input received, skipping interactive travel agent demo.")
        except KeyboardInterrupt:
            print("\nUser interrupted, skipping interactive travel agent demo.")
    else:
        print("Interactive travel agent demo is set to skip.")

if __name__ == "__main__":
    conversational_travel_agent(offer_interaction=True)  # Set to False to skip the demo
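
The display logic inside the loop (de-duplicating by destination and printing only the fields that are present) can be hard to test because it is interleaved with `input()` calls. One possible way to isolate it is sketched below as two small pure functions. The helper names `unique_by_destination` and `format_option`, as well as the sample data, are hypothetical and do not appear in the notebook; the field names follow the ones used by `conversational_travel_agent`.

```python
def unique_by_destination(promotions):
    """Keep the first promotion for each destination, preserving ranking order."""
    seen_destinations = set()
    unique_promotions = []
    for promo in promotions:
        if promo["destination"] not in seen_destinations:
            seen_destinations.add(promo["destination"])
            unique_promotions.append(promo)
    return unique_promotions

def format_option(option_number, promo):
    """Build the text for one recommendation, skipping fields that are missing."""
    lines = [
        f"--- Option {option_number} ---",
        f"Destination: {promo['destination']}",
        f"Description: {promo['description']}",
    ]
    if promo.get("price_per_person"):
        lines.append(f"Price per person: ${promo['price_per_person']}")
    if promo.get("trip_duration_nights"):
        lines.append(f"Trip Duration: {promo['trip_duration_nights']} nights")
    if promo.get("includes"):
        lines.append(f"Includes: {promo['includes']}")
    if promo.get("availability"):
        lines.append(f"Availability (starting Fridays): {', '.join(promo['availability'])}")
    return "\n".join(lines)

# Made-up sample data for illustration only
sample_promotions = [
    {"destination": "Aruba", "description": "Sunset dinner on the beach.", "price_per_person": 899},
    {"destination": "Aruba", "description": "Catamaran cruise."},
    {"destination": "Curacao", "description": "Old town walking tour.", "trip_duration_nights": 2},
]

unique_promotions = unique_by_destination(sample_promotions)
for number, promo in enumerate(unique_promotions, start=1):
    print(format_option(number, promo))
```

With the sample data, the second Aruba entry is dropped and two options are printed. Keeping the formatting separate from the input loop would also make it easier to reuse the same output in a non-interactive setting.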

## 4. Conclusion and Next Steps

This notebook has demonstrated a simulated AI-powered travel agent specializing in Caribbean getaways. It combines several Generative AI capabilities: few-shot prompting for dynamic content generation, embeddings for capturing semantic meaning, vector search for finding relevant promotions, Retrieval-Augmented Generation (RAG) for context-aware search, and a basic form of Gen AI evaluation. Together, these show how an LLM can provide relevant and engaging travel recommendations based on user queries.

**Key achievements of this project include**:

- Generating diverse and appealing promotional content for a range of Caribbean destinations using few-shot prompting.
- Creating a semantic search mechanism, powered by embeddings and vector search, that can match the intent behind user queries and retrieve relevant travel options even without exact keyword matches.
- Implementing a basic conversational interface that lets users interact with the travel agent and receive recommendations.

**While this project provides a solid foundation and showcases several key Gen AI capabilities, there are numerous potential avenues for future exploration and improvement**:

- Integration with real-time data sources: Connecting to APIs for flight and accommodation availability and pricing would make the agent significantly more practical.
- Function Calling: Implementing function calling would allow the agent to interact with external tools and services, such as checking availability or even simulating booking processes.
- Personalization: Incorporating user history and preferences to provide even more tailored and relevant recommendations.
- More sophisticated conversational capabilities: Enhancing the agent's ability to understand complex queries, ask clarifying questions, and engage in more natural and multi-turn dialogue.
- Multimodal integration: Incorporating images and videos into the recommendations to create a richer and more engaging user experience.

This Capstone Project serves as a valuable demonstration of the transformative power of Generative AI in **revolutionizing the travel planning experience**, offering a glimpse into a future where AI agents can provide truly personalized and intuitive travel assistance.