# TravelAIAgent: Overview ℹ️
Author: Ramis Hasanli
https://www.linkedin.com/in/hasanliramis/

Welcome to my project for the 5-day Google x Kaggle Generative AI course!\
**TravelAIAgent: A Multimodal Conversational Travel Planner**\
In this project, we build **TravelAIAgent**, a multimodal conversational assistant that helps users plan and enjoy their trips using natural language and photos. It combines real-time weather and event data, image understanding, user preferences, and grounded travel knowledge to deliver an intelligent, personalized travel experience.

## 🧠 What Problem Does It Solve?

Planning a trip often involves jumping between apps — weather forecasts, events calendars, translation tools, tourist guides, and more. Many travelers also struggle to get real-time, personalized advice while already on the move.

**TravelAIAgent solves this by acting like a smart, always-available travel agent** that:
- Understands natural language queries and photos
- Offers contextual suggestions tailored to the user’s preferences
- Uses real-time data (e.g., events, weather) to enhance planning
- Generates complete itineraries, answers travel questions, and guides users through unfamiliar locations

The result is a smoother, more interactive travel experience — whether you're planning in advance or exploring on the go.

---

## ⚙️ How It Works: Architecture and Core Design

The assistant uses **intent-based routing** to understand what the user is asking, then routes requests to specialized modules that handle tasks like weather forecasting, itinerary generation, photo analysis, and event lookup.

### 🔧 Key Components:
- **Gemini 2.0 Flash & 1.5 Pro Vision** – for language and image understanding
- **OpenWeather API** – for current weather and multi-day forecasts
- **Ticketmaster Discovery API** – to fetch local events
- **ChromaDB + `text-embedding-004`** – to retrieve grounded cultural tips (RAG)
- **Python CLI loop** – simulates an interactive chat experience (notebook-friendly)
- **Modular intent handlers** – clean structure to process user requests

---

## ✨ Functionality

TravelAIAgent supports a wide range of features, including:

- **🌦 Real-time weather updates** and multi-day forecasts with Gemini-powered summaries
- **🧳 Travel tips and cultural insights** for 30+ cities via Retrieval Augmented Generation (RAG)
- **🎟 Local event discovery** using Ticketmaster's API, summarized by Gemini
- **🗺 Personalized itinerary generation**, combining forecast, events, travel tips, and user preferences
- **🖼️ Tourist photo understanding**, including landmark descriptions and cultural context
- **📸 Text extraction and translation** from uploaded photos (e.g., signs, menus)
- **🧠 Session memory** of user preferences and previously mentioned cities

All responses are generated in a natural, friendly tone — just like a helpful travel agent might speak.

---


# TravelAIAgent: Setup 🛠️
First, install ChromaDB and the Gemini API Python SDK. Then, import all necessary Python dependencies.

In [None]:
!pip uninstall -qqy jupyterlab kfp
!pip install -qU "google-genai==1.7.0" "chromadb==0.6.3"

In [None]:
from google import genai
from google.genai import types
from google.api_core import retry
from IPython.display import HTML, Markdown, display
import chromadb
from chromadb.utils.embedding_functions import EmbeddingFunction
import os

### Automated retry

In [None]:
# Define a retry policy. The model might make multiple consecutive calls automatically
# for a complex query; this ensures the client retries if it hits quota limits.
from google.api_core import retry

# Retry only on HTTP 429 (quota exceeded) and 503 (service unavailable).
is_retriable = lambda e: (isinstance(e, genai.errors.APIError) and e.code in {429, 503})

# Wrap generate_content once; the __wrapped__ check avoids wrapping it again if the cell is re-run.
if not hasattr(genai.models.Models.generate_content, '__wrapped__'):
  genai.models.Models.generate_content = retry.Retry(
      predicate=is_retriable)(genai.models.Models.generate_content)
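
To make the retry behavior concrete, here is a minimal, dependency-free sketch of the same idea: a decorator that calls a function again while a predicate says the error is retriable. It is not how `google.api_core.retry.Retry` is implemented. `simple_retry`, `FakeAPIError`, `is_retriable_sketch`, and the delay values are hypothetical names and numbers chosen for illustration.

```python
import time
from functools import wraps


class FakeAPIError(Exception):
    """Hypothetical stand-in for genai.errors.APIError; carries an HTTP-style code."""

    def __init__(self, code):
        super().__init__(f"API error {code}")
        self.code = code


def simple_retry(predicate, max_attempts=4, initial_delay=0.01):
    """Retry the wrapped function while predicate(exc) is true, doubling the delay each time."""
    def decorator(func):
        @wraps(func)  # also sets wrapper.__wrapped__ = func
        def wrapper(*args, **kwargs):
            delay = initial_delay
            for attempt in range(1, max_attempts + 1):
                try:
                    return func(*args, **kwargs)
                except Exception as exc:
                    # Give up on non-retriable errors or when attempts run out.
                    if not predicate(exc) or attempt == max_attempts:
                        raise
                    time.sleep(delay)
                    delay *= 2
        return wrapper
    return decorator


def is_retriable_sketch(exc):
    # Same rule as the notebook's predicate: retry only on 429 and 503.
    return isinstance(exc, FakeAPIError) and exc.code in {429, 503}


calls = {"count": 0}


@simple_retry(is_retriable_sketch)
def flaky_call():
    # Fails twice with a quota error, then succeeds.
    calls["count"] += 1
    if calls["count"] < 3:
        raise FakeAPIError(429)
    return "ok"


result = flaky_call()
```

In this sketch `functools.wraps` records the original function as `__wrapped__`, which is the same attribute the notebook's `hasattr` guard looks for before wrapping `generate_content`.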

## 🔑 API keys

To run the following cell, your API keys must be stored in [Kaggle secrets](https://www.kaggle.com/discussions/product-feedback/114053) named `GOOGLE_API_KEY`, `WEATHER_API_KEY`, and `EVENTS_API_KEY`:\
`GOOGLE_API_KEY` - Gemini API key.\
`WEATHER_API_KEY` - OpenWeather API key.\
`EVENTS_API_KEY` - Ticketmaster Discovery API key.\
To make the keys available through Kaggle secrets, choose `Secrets` from the `Add-ons` menu and follow the instructions to add each key or enable it for this notebook.

In [None]:
from kaggle_secrets import UserSecretsClient
user_secrets = UserSecretsClient()
GOOGLE_API_KEY = user_secrets.get_secret('GOOGLE_API_KEY')
WEATHER_API_KEY = user_secrets.get_secret('WEATHER_API_KEY')
EVENTS_API_KEY = user_secrets.get_secret('EVENTS_API_KEY')
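
Outside Kaggle, the `kaggle_secrets` module is not importable. A minimal sketch of a fallback follows, assuming you export the same key names as environment variables. `load_api_key` is a hypothetical helper and is not part of the notebook.

```python
import os


def load_api_key(name: str) -> str:
    """Return the secret called `name` from Kaggle secrets, else from the environment."""
    try:
        # Only importable inside a Kaggle notebook session.
        from kaggle_secrets import UserSecretsClient
        return UserSecretsClient().get_secret(name)
    except ImportError:
        pass

    value = os.environ.get(name)
    if not value:
        raise RuntimeError(
            f"Missing API key: set the {name} secret or environment variable"
        )
    return value
```

With this helper, a line such as `GOOGLE_API_KEY = load_api_key("GOOGLE_API_KEY")` would behave the same on Kaggle and on a local machine.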

## ✨ System Personality and Onboarding

To create a travel assistant that feels more like a friendly, human travel planner than a generic chatbot, we start by shaping its "personality" using system instructions and an engaging onboarding message.

### `TRAVELAGENT_SYSINT` 🧠

This string defines how our AI should behave during the entire conversation. It's part of Gemini's **system instruction** capability, a powerful way to steer the tone, style, and behavior of the model. Here, we tell the assistant to act like a well-traveled guide who gives practical, friendly advice and uses emojis only where they help. It also encourages the bot to remember preferences the user shares and to speak like a helpful person, not a robot.

This instruction is passed into the Gemini API's config to control every future response the assistant generates.

---

### `onboarding_message` 👋

This is the first message the user sees when launching the chatbot — kind of like the assistant's welcome pitch. It sets expectations for what the assistant can help with: weather, tips, events, image translation, itinerary planning, and more. The list-style format, combined with emojis, keeps it light and easy to read. This message is printed to the user and stored in the chat history as the assistant's first message.

---

### ⚙️ Connecting to Gemini using genai.Client

In [None]:
TRAVELAGENT_SYSINT = (
    "You are TravelAIAgent, a helpful and friendly AI travel companion. You're chatting with someone who is planning or enjoying a trip. "
    "Your role is to offer smart, practical advice in a natural, engaging tone — like a well-traveled friend or human travel planner. "
    "You assist with weather forecasts, cultural tips, local events, itinerary planning, image/text translation, and tourist insights. "
    "You remember preferences the user shares (like what they enjoy or dislike) and use that to personalize your suggestions. "
    "Be relaxed and human-sounding. Avoid robotic or overly formal language. Use emojis only when they help convey emotion or friendliness. "
    "Offer helpful suggestions based on what the user says — but don't overload them with too much info unless they ask for it."
)

onboarding_message = (
    "Hi there! I'm TravelAIAgent — your AI-powered companion for smarter travel planning 🌍✈️\n\n"
    "Here’s what I can help you with:\n"
    "- 🌦️ Get real-time weather info and multi-day forecasts for any city\n"
    "- 🧳 Share cultural tips, local etiquette, and safety advice\n"
    "- 🌐 Translate short phrases or signs into English\n"
    "- 📸 Read and translate text from uploaded travel photos\n"
    "- 🏛️ Describe and answer questions about landmarks in your photos\n"
    "- 🎟️ Find events, concerts, and things to do during your trip\n"
    "- 🗺️ Build a personalized day-by-day travel itinerary using your preferences, the weather, and local events\n\n"
    "Just talk to me naturally — what's on your mind?"
)

# This block connects the assistant to the Gemini API and sets up the generation behavior:
# - temperature = 0.7 for a balanced level of creativity and stability, with just enough flair for conversational responses.
# - max_output_tokens = 512 to cap the length of replies.
# - system_instruction = TRAVELAGENT_SYSINT to keep messages in character as a smart, fun travel companion.
# Pass config=config to a generate_content call to apply these settings; a call that omits it uses the API defaults.
client = genai.Client(api_key=GOOGLE_API_KEY)
config = types.GenerateContentConfig(
    temperature = 0.7,
    max_output_tokens = 512,
    system_instruction = TRAVELAGENT_SYSINT,
)

# TravelAIAgent: Backend 🧑‍💻

## 🧠 Interpreting User Intent
One of the most important parts of TravelAIAgent is understanding what the user is actually asking for — whether they want the weather, cultural tips, a travel itinerary, or help with a photo. To handle this, we use a technique called function routing, powered by Gemini.

The `interpret_user_request()` function acts as the brain of this routing system. It sends the user's message to Gemini with a carefully crafted prompt that explains how to classify the input. The prompt lists the dictionary shape expected for each intent, so natural language is converted into a structured Python dictionary such as `{'intent': 'get_weather', 'location': 'Paris'}`, `{'intent': 'plan_itinerary', 'location': 'Rome', 'days': 3, 'when': 'next week'}`, or `{'intent': 'get_tip', 'query': 'public transport in Tokyo'}`.

Gemini then replies with a dictionary string, and we use Python’s `ast.literal_eval()` to safely convert that string into a real Python dictionary. This dictionary tells the assistant which intent to handle and which handler function to trigger.
If something goes wrong — such as Gemini returning an invalid dictionary — we default to a generic 'chat' intent, so the assistant can still respond gracefully.
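
The parsing step can be sketched on its own, without calling Gemini. The version below adds two checks that the notebook's function does not perform: it drops a Markdown code fence if the model adds one anyway, and it verifies that the result is a dictionary with an `intent` key. `parse_router_reply` is a hypothetical name.

```python
import ast

FENCE = "`" * 3  # a Markdown code fence marker


def parse_router_reply(reply_text: str) -> dict:
    """Turn the router's text reply into a dict, falling back to {'intent': 'chat'}."""
    fallback = {"intent": "chat"}
    text = reply_text.strip()

    # Models sometimes wrap output in a code fence despite instructions; drop it.
    if text.startswith(FENCE):
        lines = text.splitlines()[1:]  # remove the opening fence line
        if lines and lines[-1].strip() == FENCE:
            lines = lines[:-1]  # remove the closing fence line
        text = "\n".join(lines).strip()

    try:
        parsed = ast.literal_eval(text)
    except (ValueError, SyntaxError, TypeError):
        return fallback

    # literal_eval accepts any literal (lists, numbers, ...), so check the shape too.
    if not isinstance(parsed, dict) or "intent" not in parsed:
        return fallback
    return parsed
```

Because `ast.literal_eval()` only evaluates literals, a reply containing a function call or an import is rejected rather than executed, which is why it is preferred over `eval()` here.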

---

## 🏙️ Extracting City Names
Sometimes users don’t explicitly state a city in a clean format, especially when asking for travel tips or events. To improve contextual understanding, we use the `extract_city_from_text()` function, a lightweight utility that asks Gemini to pull a city name from any given sentence.

For example:
“Tell me more about Zurich” → returns `"Zurich"`; “What can I do there?” → returns `None` (Gemini answers `'None'`, which the function converts to Python's `None`).

It helps ensure the assistant tracks which city is being discussed, so it can recall that later (e.g. when asked "What's the weather like there?"). This adds a subtle layer of memory and personalization to the interaction, without requiring complex NER models or external parsers.
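
A minimal sketch of how the extracted city could feed the session memory, assuming memory is a plain dictionary with a `last_city` key. `resolve_city` is a hypothetical helper for illustration.

```python
def resolve_city(extracted_city, memory):
    """Return the city to use for this turn and remember it for later turns."""
    if extracted_city:
        memory["last_city"] = extracted_city
        return extracted_city
    # No city in this message: fall back to whatever was mentioned before.
    return memory.get("last_city")


memory = {}
first_turn = resolve_city("Zurich", memory)   # "Tell me more about Zurich"
second_turn = resolve_city(None, memory)      # "What can I do there?"
```

Here `second_turn` resolves to the city remembered from the first turn, which is what lets a follow-up like "What's the weather like there?" work.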

In [None]:
import ast

def interpret_user_request(user_input):
    prompt = (
        "You are a function router. Based on the user message, output ONLY a Python dictionary.\n"
        "- If the user is asking for weather info, return: {'intent': 'get_weather', 'location': '<CITY>'}\n"
        "- If the user is asking for tips or local info (scams, etiquette, transport, etc.), return: {'intent': 'get_tip', 'query': '<QUESTION>'}\n"
        "- If the user asks about the weather in last mentioned city, return: {'intent': 'get_weather_last'}\n"
        "- If the user asks what city they last mentioned, return: {'intent': 'get_last_city'}\n"
        "- If the user asks to read and translate a photo, return: {'intent': 'image_translate', 'filename': '<FILENAME>'}\n"
        "- If the user asks what is shown in a specific photo, return: {'intent': 'describe_photo', 'filename': '<FILENAME>'}\n"
        "- If the user asks a question about a previously uploaded photo, return: {'intent': 'ask_about_photo', 'question': '<QUESTION>'}\n"
        # Each rule must end with "\n"; without it, adjacent rules are concatenated onto one line.
        "- If the user asks for a travel plan or itinerary, return: {'intent': 'plan_itinerary', 'location': '<CITY>', 'days': <NUMBER>, 'when': '<TIMEFRAME>'}\n"
        "- If the user asks for weather in the future (e.g. weather next week, forecast for the next few days), return: {'intent': 'get_weather_forecast', 'location': '<CITY>'}\n"
        "- If the user asks for upcoming events or things to do in a city, return: {'intent': 'get_events', 'location': '<CITY>'}\n"
        "- Otherwise, return: {'intent': 'chat'}\n"
        "Respond with ONLY the dictionary. No explanations, no code blocks.\n\n"
        f"User: {user_input}"
    )

    response = client.models.generate_content(
        model="gemini-2.0-flash",
        contents=[types.Part.from_text(text=prompt)]
    )

    try:
        return ast.literal_eval(response.text.strip())
    except Exception:
        return {'intent': 'chat'}


def extract_city_from_text(user_input: str) -> str | None:
    prompt = (
        "You are a city name extractor. If the following sentence clearly refers to a known city, "
        "output ONLY the city name as a string, with no extra text.\n"
        "If no city is clearly mentioned, output 'None'.\n\n"
        f"Sentence: {user_input}"
    )

    response = client.models.generate_content(
        model="gemini-2.0-flash",
        contents=[types.Part.from_text(text=prompt)]
    )

    result = response.text.strip().strip("'\"")
    return None if result.lower() == "none" else result



## 🌦️ Real-Time & Forecasted Weather
TravelAIAgent helps users make smart decisions while traveling — and weather is a big part of that! These two functions (get_weather and get_weather_summary) fetch and summarize weather data in a friendly, travel-savvy tone using a combination of the OpenWeather API and Gemini 2.0 Flash.

**get_weather(city) – Current Conditions**\
This function fetches the real-time weather for a given city using the OpenWeather API. Instead of just dumping raw data, it prepares a natural-language summary with the help of Gemini. The API response is parsed to extract key weather details: current temperature, feels-like temperature, humidity, wind speed, and a general weather description. Then, a prompt instructs Gemini to summarize this like a local travel agent would: conversational, helpful, and using emojis where appropriate.\
For example:
“It’s around 15°C in Lisbon right now with a nice breeze and clear skies — perfect for walking the old town! 🌤️👟”
This makes the response feel more human and trip-relevant, not just like reading numbers.

**get_weather_summary(city) – 5-Day Forecast**\
Instead of a snapshot, this function gathers a multi-day forecast from OpenWeather, broken into 3-hour segments across five days. The forecast entries are grouped by date and then passed to Gemini in a structured format. The LLM is asked to summarize the general vibe of each day, highlight rain or great outdoor weather, and recommend the best days for activities. This is especially useful when building itineraries, since it helps the assistant match sunny days with walking tours or outdoor activities, and reserve indoor options for rainy ones.\
For example:
“Tuesday looks rainy 🌧️ — maybe plan some museum time. But Wednesday through Friday look warm and mostly sunny — great for beach walks or sightseeing! ☀️🌊”
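
The grouping step can be sketched in isolation with hand-made data. This version assumes that each entry's `dt` is a Unix timestamp in UTC and that the city's UTC offset in seconds is available (to my understanding the forecast response carries it as `city.timezone`, but check the OpenWeather docs). The sample entries are invented, and `group_forecast_by_date` is a hypothetical name.

```python
from collections import defaultdict
from datetime import datetime, timedelta, timezone


def group_forecast_by_date(entries, utc_offset_seconds=0):
    """Group 3-hour forecast entries by calendar date in the city's local time."""
    city_tz = timezone(timedelta(seconds=utc_offset_seconds))
    by_day = defaultdict(list)
    for entry in entries:
        local_dt = datetime.fromtimestamp(entry["dt"], tz=city_tz)
        desc = entry["weather"][0]["description"]
        temp = entry["main"]["temp"]
        by_day[local_dt.date().isoformat()].append(
            f"{local_dt:%H:%M}: {desc}, {temp:.1f}°C"
        )
    return dict(by_day)


# Invented sample in the shape of the forecast "list" field.
sample_entries = [
    {"dt": 1735689600, "weather": [{"description": "light rain"}], "main": {"temp": 8.2}},
    {"dt": 1735700400, "weather": [{"description": "clear sky"}], "main": {"temp": 9.0}},
    {"dt": 1735776000, "weather": [{"description": "few clouds"}], "main": {"temp": 11.4}},
]
grouped = group_forecast_by_date(sample_entries, utc_offset_seconds=3600)
# grouped["2025-01-01"] holds the 01:00 and 04:00 slots; grouped["2025-01-02"] holds one slot.
```

Passing an explicit offset keeps the day boundaries aligned with the destination rather than with the clock of the machine running the notebook.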

In [None]:
# Get weather info
import requests
import datetime

# Current weather
def get_weather(city):
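    # Send the query via `params` so requests URL-encodes the city name, and use HTTPS
    # so the API key is not sent in clear text.
    url = "https://api.openweathermap.org/data/2.5/weather"
    params = {"q": city, "appid": WEATHER_API_KEY, "units": "metric"}
    response = requests.get(url, params=params, timeout=10)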
    if response.status_code != 200:
        return f"Sorry, I couldn’t fetch the weather for {city}."

    data = response.json()
    description = data["weather"][0]["description"]
    temp = data["main"]["temp"]
    feels_like = data["main"]["feels_like"]
    humidity = data["main"].get("humidity", "N/A")
    wind_speed = data["wind"].get("speed", "N/A")

    # Create input for Gemini summarization
    weather_text = (
        f"City: {city}\n"
        f"Description: {description}\n"
        f"Temperature: {temp}°C (feels like {feels_like}°C)\n"
        f"Humidity: {humidity}%\n"
        f"Wind Speed: {wind_speed} m/s"
    )

    prompt = (
        "You are a friendly travel agent giving a casual weather update. "
        "Use the details below to tell the user what the weather is like right now and whether it’s good for being outdoors. "
        "Sound conversational, not robotic. Use emojis to represent weather conditions.\n\n"
        f"{weather_text}"
    )

    response = client.models.generate_content(
        model="gemini-2.0-flash",
        contents=[types.Part.from_text(text=prompt)]
    )

    return response.text.strip()


# Future forecast
def get_weather_summary(city: str) -> str:
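    # Send the query via `params` so requests URL-encodes the city name, and use HTTPS
    # so the API key is not sent in clear text.
    url = "https://api.openweathermap.org/data/2.5/forecast"
    params = {"q": city, "appid": WEATHER_API_KEY, "units": "metric"}
    response = requests.get(url, params=params, timeout=10)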
    if response.status_code != 200:
        return f"Sorry, I couldn’t fetch the forecast for {city}."

    data = response.json()

    # Group by date
    forecasts_by_day = {}
    for entry in data["list"]:
        dt = datetime.datetime.fromtimestamp(entry["dt"])
        date_str = dt.date().isoformat()
        time_str = dt.strftime("%H:%M")
        desc = entry["weather"][0]["description"]
        temp = entry["main"]["temp"]

        if date_str not in forecasts_by_day:
            forecasts_by_day[date_str] = []
        forecasts_by_day[date_str].append(f"{time_str}: {desc}, {temp:.1f}°C")

    # Prepare text for Gemini to summarize
    forecast_text = f"City: {city}\nForecast for the next 5 days:\n\n"
    for date, slots in forecasts_by_day.items():
        forecast_text += f"Date: {date}\n" + "\n".join(f"  - {slot}" for slot in slots) + "\n\n"

    # Ask Gemini to summarize it in a friendly way
    prompt = (
        f"You’re a helpful travel assistant. Write a casual, helpful summary of this 5-day weather forecast in {city}. "
        f"Include what days look nice for being outdoors, if there’s rain or cold, and suggest the best days to explore. "
        f"Sound like a local giving friendly advice. Use emojis to represent weather conditions.\n\n{forecast_text}"
    )

    response = client.models.generate_content(
        model="gemini-2.0-flash",
        contents=[types.Part.from_text(text=prompt)]
    )

    return response.text.strip()

## 🖼️ Understanding & Translating Travel Photos

Travelers often encounter signs, menus, landmarks, and cultural artifacts they don’t fully understand. These three functions add powerful image understanding capabilities to TravelAIAgent by leveraging Gemini 1.5 Pro Vision for visual tasks and Gemini 2.0 Flash for language translation. The result is a travel assistant that can read and interpret photos like a human guide would.

**extract_text_from_image(file_path) – OCR with Gemini Vision** 📷🔤\
This function allows users to upload an image (like a street sign, museum label, or food menu), and have the assistant extract all visible text from it. It uses Gemini 1.5 Pro’s multimodal vision model, which is capable of reading embedded text from real-world photos.\
The function opens the image file in binary mode, attaches it to a prompt asking Gemini to extract all visible text (but not translate), and then returns the raw string. This step is useful when the goal is to understand what’s written in a local language before translating or explaining it.

**translate_to_english(text) – Text Translation** 🌐🗣️\
Once the text is extracted from an image, we pass it to this function, which uses Gemini 2.0 Flash to translate it into English. The prompt is simple and direct: translate the following text.\
This lets users take photos of signs, menus, schedules, or maps and instantly understand what they mean — perfect for tourists navigating unfamiliar environments.\
For example: A French street sign like "Rue des Bourneaux" → "Bourneaux Street"

**describe_photo(file_path) – Landmark and Scene Description** 🏛️✨\
This function goes beyond OCR and answers a higher-level question: “What am I looking at?”\
It uses Gemini 1.5 Pro to analyze the image and describe the scene, including its possible cultural or historical context. It’s especially useful for travel photos like landmarks, public art, or architecture. The prompt instructs Gemini to describe what’s in the image and why it might matter to a traveler — similar to how a museum guide or tour book would explain a location.\
For example: Upload a photo of the Eiffel Tower → “This is the Eiffel Tower in Paris, an iconic symbol of France built for the 1889 World's Fair…”
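
Travel photos are often JPEGs rather than PNGs, so the MIME type passed alongside the image bytes may need to vary. A minimal sketch using the standard library's `mimetypes` module to guess it from the file extension follows. `guess_image_mime` is a hypothetical helper, and falling back to `image/png` is an assumption.

```python
import mimetypes


def guess_image_mime(file_path: str, default: str = "image/png") -> str:
    """Guess an image MIME type from the file extension, with a fallback."""
    mime, _ = mimetypes.guess_type(file_path)
    if mime and mime.startswith("image/"):
        return mime
    return default


jpeg_mime = guess_image_mime("menu.jpg")
png_mime = guess_image_mime("street_sign.png")
```

The result could then be passed as the `mime_type` argument of `types.Part.from_bytes(...)` in place of a fixed string.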

In [None]:
# Extract text from image
def extract_text_from_image(file_path: str) -> str:
    with open(file_path, "rb") as img:
        image_bytes = img.read()

    prompt = "Extract all visible text from this image. Don't translate it."

    response = client.models.generate_content(
        model="gemini-1.5-pro-latest",
        contents=[
            types.Part.from_text(text=prompt),
            types.Part.from_bytes(data=image_bytes, mime_type="image/png")
        ]
    )
    return response.text.strip()

# Translate extracted text using Gemini Flash
def translate_to_english(text: str) -> str:
    prompt = f"Translate the following to English:\n\n{text}"
    
    response = client.models.generate_content(
        model="gemini-2.0-flash",
        contents=[types.Part.from_text(text=prompt)]
    )
    return response.text.strip()

def describe_photo(file_path: str) -> str:
    with open(file_path, "rb") as img:
        image_bytes = img.read()

    prompt = (
        "You are a travel assistant. Describe this photo in detail. "
        "Explain what is shown, its possible cultural or historical significance, "
        "and any information that could help a traveler understand what they’re looking at."
    )

    response = client.models.generate_content(
        model="gemini-1.5-pro-latest",
        contents=[
            types.Part.from_text(text=prompt),
            types.Part.from_bytes(data=image_bytes, mime_type="image/png")  # assumes a PNG; use "image/jpeg" for JPEG photos
        ]
    )

    return response.text.strip()

## 🎟️ Discovering Local Events
One of the most exciting travel upgrades in TravelAIAgent is its ability to suggest real events happening in your destination. Whether you’re into concerts, theater, or cultural exhibits, the assistant uses this function to bring your itinerary to life with up-to-date entertainment options — all powered by the Ticketmaster Discovery API and summarized naturally using Gemini.

**get_events(city, start_date, end_date, max_events) – Event Discovery + Summarization**\
This function starts by querying the Ticketmaster Discovery API, using the name of the city and optional start/end date filters. The API returns a list of upcoming events sorted by date: concerts, sports, exhibitions, shows, and more. We limit the number of results using `max_events` (default is 5) to avoid overwhelming the user and focus on the most relevant options. For each event, we extract key info such as name, date, and venue.\
Instead of listing raw data directly to the user, the assistant passes the list to Gemini 2.0 Flash using a well-crafted prompt. Gemini then summarizes the events in a casual, friendly tone, just like a travel agent would.\
For example: “This week in San Francisco: balloon art exhibits, jazz at Birdland, and some Broadway action if you’re feeling theatrical. 🎷🎭”\
The prompt also nudges Gemini to recommend which events suit different travelers (families, art lovers, music fans), highlight variety and uniqueness, and keep the tone fun and helpful.

In [None]:
def get_events(city: str, start_date: str | None = None, end_date: str | None = None, max_events: int = 5) -> str:
    url = "https://app.ticketmaster.com/discovery/v2/events.json"

    params = {
        "apikey": EVENTS_API_KEY,
        "city": city,
        "sort": "date,asc",
        "locale": "*",
    }

    if start_date:
        params["startDateTime"] = start_date
    if end_date:
        params["endDateTime"] = end_date

    response = requests.get(url, params=params)
    if response.status_code != 200:
        return f"Sorry, I couldn't fetch events for {city}."

    data = response.json()
    events = data.get("_embedded", {}).get("events", [])[:max_events]

    if not events:
        return f"No upcoming events found in {city}."

    # Prepare input for Gemini
    raw_event_lines = []
    for event in events:
        name = event.get("name", "Unnamed Event")
        date = event.get("dates", {}).get("start", {}).get("localDate", "Unknown Date")
        venue = event.get("_embedded", {}).get("venues", [{}])[0].get("name", "Unknown Venue")
        line = f"{name} on {date} at {venue}"
        raw_event_lines.append(line)

    event_input = "\n".join(raw_event_lines)

    # Gemini prompt
    prompt = (
        f"You’re a cheerful travel assistant. Summarize these events in {city} in a relaxed, travel-friendly tone. "
        f"Pick out highlights — like concerts, art, festivals, or unique experiences — and suggest what type of traveler they might appeal to.\n\n"
        f"{event_input}"
    )


    summary_response = client.models.generate_content(
        model="gemini-2.0-flash",
        contents=[types.Part.from_text(text=prompt)]
    )

    return summary_response.text.strip()
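
The `start_date` and `end_date` arguments of `get_events()` are passed to the API unchanged, so they need to be in the format the API accepts. To my knowledge the Discovery API expects UTC timestamps of the form `YYYY-MM-DDTHH:mm:ssZ`, but treat that as an assumption to verify against the Ticketmaster documentation. A minimal sketch of building such strings from plain dates, using the hypothetical helper `to_ticketmaster_datetime`:

```python
from datetime import date, datetime, time, timezone


def to_ticketmaster_datetime(day: date, end_of_day: bool = False) -> str:
    """Format a date as a UTC timestamp string such as 2025-06-01T00:00:00Z."""
    clock = time(23, 59, 59) if end_of_day else time(0, 0, 0)
    stamp = datetime.combine(day, clock, tzinfo=timezone.utc)
    return stamp.strftime("%Y-%m-%dT%H:%M:%SZ")


trip_start = to_ticketmaster_datetime(date(2025, 6, 1))
trip_end = to_ticketmaster_datetime(date(2025, 6, 7), end_of_day=True)
```

These two strings could then be supplied as `get_events(city, start_date=trip_start, end_date=trip_end)`.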


## 🔍 Local Insights & Landmark Knowledge
To give TravelAIAgent the ability to answer grounded, location-specific questions — like cultural etiquette, local dos and don'ts, or tips about a photo — we use Retrieval Augmented Generation (RAG). This section combines ChromaDB (a vector database), Gemini embeddings, and Gemini's language generation to bring real-world knowledge into the assistant's responses.

**🧳 documents – Curated Travel Tips**\
We start with a hardcoded list of helpful, city-specific cultural tips and travel advice. These act as a lightweight knowledge base. Each item is a short, practical sentence — perfect for answering questions like: “What should I know before going to Tokyo?”, “Is it okay to haggle in Marrakech?”

**🧠 GeminiEmbeddingFunction – Semantic Understanding**\
To search through these tips in a meaningful way, we embed both the tips and the user's queries using Gemini's text-embedding-004 model. This converts text into high-dimensional vectors that capture their semantic meaning. The class supports two modes: "retrieval_document" – for indexing passages and "retrieval_query" – for embedding the user's question. This ensures the model can retrieve relevant advice even if the user asks in a different phrasing than what’s stored.

**🗃️ chroma_client + chromaDB – Embedding Storage**\
We initialize a ChromaDB collection named "travel_tips", embed the list of tips, and store them by unique IDs. This allows the assistant to quickly search for tips that are most relevant to the user's question at runtime.

**❓ search_and_answer(user_question) – RAG Answering**\
When a user asks for tips or cultural context, the function switches the embedder to query mode, finds the top 2 semantically similar tips in ChromaDB, and passes them to Gemini 2.0 Flash with a prompt that instructs it to generate a helpful, grounded answer. This pattern keeps answers relevant and informed by the actual data, a key aspect of building trust in AI assistants.

**🖼️ Image Insights: photo_collection + search_photo_insights()**\
We use the same RAG pipeline to store and query descriptions of user-uploaded photos (e.g. landmarks, statues, monuments). After Gemini Vision describes a photo, the description is embedded and added to a separate collection (`photo_insights`). Then, if the user asks “What is that building?” or “What’s the story behind this statue?”, we use `search_photo_insights()` to pull relevant image descriptions from the vector DB and let Gemini answer based on that context.


Together, these components turn your assistant into a context-aware, grounded travel expert. It can recall city-specific etiquette, offer photo-based insight, and explain things like a human guide — all while staying efficient and scalable. ✈️🧠📸
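
The retrieval step behind `search_and_answer()` and `search_photo_insights()` comes down to nearest-neighbour search over embedding vectors. The toy sketch below shows the idea with hand-written 3-dimensional vectors and cosine similarity. ChromaDB does this internally and its default distance metric may differ, so this is an illustration of the principle, not of ChromaDB's implementation. All names and vectors are made up.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query_vec, doc_vecs, k=2):
    """Return indices of the k documents most similar to the query, best first."""
    ranked = sorted(
        range(len(doc_vecs)),
        key=lambda i: cosine_similarity(query_vec, doc_vecs[i]),
        reverse=True,
    )
    return ranked[:k]


# Made-up "embeddings"; real ones from text-embedding-004 have far more dimensions.
toy_docs = ["tipping in Cairo", "trains in Zurich", "street food in Seoul"]
toy_vecs = [[0.9, 0.1, 0.0], [0.1, 0.9, 0.1], [0.0, 0.2, 0.9]]
toy_query = [0.8, 0.3, 0.0]

best = [toy_docs[i] for i in top_k(toy_query, toy_vecs, k=2)]
# best == ["tipping in Cairo", "trains in Zurich"]
```

The two retrieved texts play the role of the `PASSAGE:` lines that are appended to the grounded prompt.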

In [None]:
documents = [
    "In Bangkok, visit Chatuchak Market on the weekend and avoid tuk-tuk scams near tourist spots.",
    "In Paris, always greet with 'Bonjour' before asking questions. Most museums are closed on Mondays.",
    "In London, public transport uses contactless cards. Tipping is not expected in pubs.",
    "In Dubai, public displays of affection are discouraged. Dress modestly in public areas.",
    "In Singapore, chewing gum is banned. It’s one of the cleanest and safest cities globally.",
    "In Kuala Lumpur, try nasi lemak for breakfast. Petronas Towers are best viewed at sunset.",
    "In New York City, walk fast, tip 15–20%, and explore boroughs beyond Manhattan.",
    "In Istanbul, try local ferry rides and visit both European and Asian sides of the city.",
    "In Tokyo, avoid loud phone calls on trains. Convenience stores (konbini) have everything.",
    "In Antalya, the old town (Kaleiçi) is walkable and filled with ancient Roman ruins.",
    "In Seoul, Korean street food markets like Gwangjang are must-visit. Public transport is efficient.",
    "In Rome, beware of pickpockets at popular landmarks. Tap water is drinkable.",
    "In Phuket, island tours are best booked locally. Avoid animal-based attractions.",
    "In Mecca, non-Muslims cannot enter the central holy area. Stay hydrated in the heat.",
    "In Hong Kong, use the Octopus card for transport and street food. Don't miss Victoria Peak.",
    "In Barcelona, be cautious of pickpockets on Las Ramblas. Tapas culture is big—dine late.",
    "In Zurich, trains are punctual to the minute. Public water fountains offer safe drinking water.",
    "In Cairo, tipping (baksheesh) is expected for most services. Visit the pyramids early to avoid crowds.",
    "In Sydney, sun protection is essential year-round. Use an Opal card for public transport.",
    "In Marrakech, haggle respectfully in souks. Fridays are holy—many shops may close.",
    "In Amsterdam, watch for bikes at crossings. Many museums require advance reservations.",
    "In Mexico City, avoid tap water. Street tacos are a must, but go where locals eat.",
    "In Athens, the Acropolis is best visited early morning. Tipping is appreciated but not required.",
    "In Vancouver, use contactless or Compass Card for transit. Pack for rain, even in summer.",
    "In Buenos Aires, late dinners and tango shows are local staples. Beware of counterfeit currency.",
    "In Prague, the Old Town is stunning but crowded—explore Žižkov and Letná for a local vibe.",
    "In Vienna, classical concerts abound, but dress semi-formally. Public transport is safe and clean.",
    "In Cape Town, avoid walking after dark in certain areas. Visit Table Mountain on a clear day.",
    "In Lisbon, trams can get crowded—beware of pickpockets. Try pastel de nata from local bakeries.",
    "In Los Angeles, public transport is limited—consider renting a car. Tipping is standard at 20%."
]


In [None]:
# Gemini Embedding Function class
class GeminiEmbeddingFunction(EmbeddingFunction):
    document_mode = True

    @retry.Retry(predicate=is_retriable)
    def __call__(self, input):
        task_type = "retrieval_document" if self.document_mode else "retrieval_query"
        response = client.models.embed_content(
            model="models/text-embedding-004",
            contents=input,
            config=types.EmbedContentConfig(task_type=task_type)
        )
        return [e.values for e in response.embeddings]

# Initialize ChromaDB
chroma_client = chromadb.Client()
DB_NAME = "travel_tips"
embed_fn = GeminiEmbeddingFunction()

db = chroma_client.get_or_create_collection(
    name=DB_NAME,
    embedding_function=embed_fn
)

# Store tips
db.add(documents=documents, ids=[str(i) for i in range(len(documents))])

# Search and answer
def search_and_answer(user_question):
    # Switch to query embedding mode
    embed_fn.document_mode = False

    # Retrieve the two best-matching passages
    result = db.query(query_texts=[user_question], n_results=2)
    passages = result["documents"][0]

    # Create grounded prompt for Gemini
    prompt = (
        "You are a helpful travel assistant. Use the information from the following passages "
        "to answer the user's question clearly. If irrelevant, ignore it.\n\n"
        f"QUESTION: {user_question}\n"
    )

    for p in passages:
        prompt += f"PASSAGE: {p.strip()}\n"

    # Generate response
    response = client.models.generate_content(
        model="gemini-2.0-flash",
        contents=[types.Part.from_text(text=prompt)]
    )
    return response.text

# Store image description with ChromaDB
photo_collection = chroma_client.get_or_create_collection(
    name="photo_insights",
    embedding_function=embed_fn
)

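def add_photo_description_to_chromadb(photo_id: str, description: str):
    # Embed as a document; an earlier query may have left embed_fn in query mode
    embed_fn.document_mode = True
    photo_collection.add(
        documents=[description],
        ids=[photo_id]
    )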

# Search photo description with a question
def search_photo_insights(user_question: str, top_k: int = 2) -> str:
    embed_fn.document_mode = False  # use query embedding

    results = photo_collection.query(query_texts=[user_question], n_results=top_k)
    matches = results["documents"][0]

    prompt = (
        "You are a helpful travel assistant. Use the following descriptions to answer the user's question.\n\n"
        f"QUESTION: {user_question}\n"
    )

    for desc in matches:
        prompt += f"DESCRIPTION: {desc.strip()}\n"

    response = client.models.generate_content(
        model="gemini-2.0-flash",
        contents=[types.Part.from_text(text=prompt)]
    )
    return response.text.strip()


## 🧠 Modular Intent Handlers
To keep TravelAIAgent organized and extensible, every user intent is handled by a dedicated function — known as a “handler.” These functions make the assistant more maintainable and allow for clear separation of logic, especially when integrating multiple tools like weather APIs, image understanding, vector search, and LLM generation.\
Each handler performs three key tasks:
1. Executes a specific capability (like getting weather or generating an itinerary)
2. Updates the assistant’s memory if needed (e.g. storing the last mentioned city)
3. Prints and stores the response in the conversation history
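
The three steps can be sketched with plain Python structures. This is only an illustration under stated assumptions: `handle_example`, `fake_fetch`, and the dict-based history entries are hypothetical stand-ins for the notebook's real handlers, API helpers, and `types.ModelContent` objects.

```python
def fake_fetch(city):
    # Hypothetical stand-in for a real capability such as a weather lookup
    return f"Sunny in {city}"


def handle_example(action, history, memory, fetch=fake_fetch):
    city = action["location"]
    # 1. Execute the capability
    reply = fetch(city)
    # 2. Update session memory so follow-up questions can refer to this city
    memory["last_city"] = city
    # 3. Print the response and store it in the conversation history
    print("AI:", reply)
    history.append({"role": "model", "text": reply})
    return reply


demo_history, demo_memory = [], {"last_city": None}
handle_example({"location": "Lisbon"}, demo_history, demo_memory)
```

Passing the capability in as `fetch` is a choice made for this sketch so it can run without any API keys; the handlers in this notebook call their helpers directly.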

---

**🌤️ handle_get_weather & handle_get_weather_forecast**\
These handlers fetch current conditions or a 5-day forecast from OpenWeather and summarize it using Gemini. They store the city as memory["last_city"] so the assistant can reference it later when the user says, “What’s the weather like there?”

**🧳 handle_get_tip**\
This pulls grounded cultural advice from ChromaDB using a RAG-style query. It also extracts and stores the city mentioned in the user’s tip-related question to maintain context.

**📸 handle_describe_photo, handle_ask_about_photo, handle_image_translate**\
These three handlers cover photo-related requests:
1. describe_photo: Uses Gemini Vision to describe what’s in an uploaded image (e.g. a statue or landmark)
2. ask_about_photo: Answers follow-up questions using photo-based RAG from ChromaDB
3. image_translate: Extracts and translates visible text from a photo (e.g. street signs or menus)

They help users understand and interact with their surroundings more easily, especially in unfamiliar places.
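
One detail worth illustrating is that a photo only needs to be described once. The sketch below shows that idea with a plain dictionary standing in for the ChromaDB collection; `describe_once` and `fake_describe` are hypothetical names, and unlike the notebook's handler (which only reports that a description already exists) this version returns the stored description.

```python
def describe_once(filename, store, describe):
    """Return (description, is_new); call `describe` only for unseen files."""
    if filename in store:
        return store[filename], False
    description = describe(filename)
    store[filename] = description
    return description, True


demo_calls = []


def fake_describe(name):
    # Hypothetical stand-in for a vision-model call; records each invocation
    demo_calls.append(name)
    return f"A photo named {name}"


demo_store = {}
first = describe_once("statue.jpg", demo_store, fake_describe)
second = describe_once("statue.jpg", demo_store, fake_describe)
```

The second call is served from `demo_store`, so the (comparatively slow and costly) description step runs only once per file.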

**🌆 handle_get_events**\
This fetches events in a city for the next seven days using the Ticketmaster Discovery API and summarizes them using Gemini, which is useful for finding things to do on the fly.
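
The date window sent to the events API can be computed in isolation. The helper below is a small sketch, assuming a hypothetical function name `event_window` and a fixed example date so the result is reproducible; it produces the same `YYYY-MM-DDTHH:MM:SSZ` strings that the handler builds.

```python
import datetime


def event_window(now, days=7):
    """Return (start, end) UTC timestamps from today through `days` ahead."""
    later = now + datetime.timedelta(days=days)
    start = now.strftime("%Y-%m-%dT00:00:00Z")   # start of the current day
    end = later.strftime("%Y-%m-%dT23:59:59Z")   # end of the final day
    return start, end


# Fixed example date (an assumption for illustration, not from the notebook)
fixed_now = datetime.datetime(2025, 4, 14, 9, 30, tzinfo=datetime.timezone.utc)
window_start, window_end = event_window(fixed_now)
```

Taking `now` as a parameter keeps the helper easy to test; in the chat loop it would be called with the current UTC time.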

**🗺️ handle_plan_itinerary**\
This handler builds a personalized, multi-day itinerary using:
1. Weather forecasts
2. Cultural travel tips (from ChromaDB)
3. User preferences (stored throughout the session)

It crafts a prompt that instructs Gemini to combine all that information into a conversational, day-by-day itinerary, with more indoor activities suggested for rainy days.

**🧠 handle_get_weather_last & handle_get_last_city**\
These support context-awareness. If the user asks “What’s the weather like there?” or “Which city was I talking about?”, the assistant answers from the memory dictionary (memory["last_city"]), keeping the conversation coherent. If no city has been mentioned yet, it says so instead of guessing.
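
The lookup behind these follow-ups can be sketched as a single helper. `resolve_city` is a hypothetical name used only for illustration; the notebook's handlers read memory["last_city"] directly.

```python
def resolve_city(action, memory):
    """Prefer an explicit location; otherwise fall back to the remembered city."""
    return action.get("location") or memory.get("last_city")


demo_memory = {"last_city": None}
before = resolve_city({"intent": "get_weather_last"}, demo_memory)

demo_memory["last_city"] = "Vienna"
after = resolve_city({"intent": "get_weather_last"}, demo_memory)
explicit = resolve_city({"intent": "get_weather", "location": "Prague"}, demo_memory)
```

A result of `None` corresponds to the case where the assistant has to tell the user that no city has been mentioned yet.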


In [None]:
# Handlers
def handle_get_weather(action, history, memory):
    city = action["location"]
    memory["last_city"] = city
    weather_info = get_weather(city)
    print("AI:", weather_info)
    history.append(types.ModelContent(parts=[types.Part.from_text(text=weather_info)]))


def handle_get_tip(action, history, memory):
    query = action["query"]
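    city = extract_city_from_text(query)
    # Only update the remembered city when one was found; storing the raw
    # query would break follow-ups such as "What's the weather like there?"
    if city:
        memory["last_city"] = city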
    tip_answer = search_and_answer(query)
    print("AI:", tip_answer)
    history.append(types.ModelContent(parts=[types.Part.from_text(text=tip_answer)]))


def handle_describe_photo(action, history):
    filename = action["filename"]
    filepath = f"/kaggle/input/photoss/{filename}"

    try:
        with open(filepath, "rb") as f:
            f.read(1)
        if filename not in photo_collection.get()['ids']:
            description = describe_photo(filepath)
            add_photo_description_to_chromadb(photo_id=filename, description=description)
            print("AI: Here's what I see in that photo:")
            print(description)
            history.append(types.ModelContent(parts=[types.Part.from_text(text=description)]))
        else:
            print(f"AI: I already have a description for `{filename}`.")
    except FileNotFoundError:
        print(f"AI: I couldn't find the image `{filename}` in /kaggle/input/photoss/.")


def handle_ask_about_photo(user_input, history):
    if not photo_collection.get()['documents']:
        print("AI: I haven’t analyzed any photos yet. Ask me to describe one first.")
        return
    photo_answer = search_photo_insights(user_input)
    print("AI:", photo_answer)
    history.append(types.ModelContent(parts=[types.Part.from_text(text=photo_answer)]))


def handle_image_translate(action, history):
    filename = action["filename"]
    filepath = f"/kaggle/input/photoss/{filename}"

    try:
        raw_text = extract_text_from_image(filepath)
        if not raw_text or raw_text.strip().lower() == "none":
            print("AI: I couldn't read any text from the image.")
            return

        translated = translate_to_english(raw_text)
        print("AI: Here's the translated text from the image:")
        print(translated)
        history.append(types.ModelContent(parts=[types.Part.from_text(text=translated)]))
    except FileNotFoundError:
        print(f"AI: I couldn't find `{filename}`. Upload it to /kaggle/input/photoss/.")


def handle_get_weather_last(memory, history):
    if memory["last_city"]:
        city = memory["last_city"]
        weather_info = get_weather(city)
        print(f"AI: Here's the latest weather in {city}:\n{weather_info}")
        history.append(types.ModelContent(parts=[types.Part.from_text(text=weather_info)]))
    else:
        print("AI: I don't know your last mentioned city yet.")


def handle_get_last_city(memory, history):
    if memory["last_city"]:
        print(f"AI: The last city you mentioned was {memory['last_city']}.")
        history.append(types.ModelContent(parts=[types.Part.from_text(text=memory['last_city'])]))
    else:
        print("AI: You haven't mentioned a city yet.")

def handle_get_weather_forecast(action, memory, history):
    city = action["location"]
    memory["last_city"] = city
    forecast = get_weather_summary(city)
    print("AI:", forecast)
    history.append(types.ModelContent(parts=[types.Part.from_text(text=forecast)]))

def handle_plan_itinerary(action, memory, history):
    city = action["location"]
    days = action.get("days", 3)
    when = action.get("when", "next week")
    memory["last_city"] = city

    # Get forecast summary (multi-day, formatted)
    forecast = get_weather_summary(city)

    # Get cultural tips
    tips = search_and_answer(f"travel tips for {city}")

    # Combine prompt
    user_prefs = "\n".join(memory["preferences"]) if memory["preferences"] else "No specific preferences given."

    prompt = (
        f"You are a friendly travel planner. Create a detailed day-by-day travel itinerary for {city} for {days} days starting {when}.\n"
        f"Use the weather forecast, travel tips, and user preferences below:\n\n"
        f"Weather:\n{forecast}\n\n"
        f"Cultural Tips:\n{tips}\n\n"
        f"User Preferences:\n{user_prefs}\n\n"
        f"Return a helpful, conversational itinerary. If a day is rainy, suggest more indoor activities. Mention why each activity fits the user’s interests."
    )

    response = client.models.generate_content(
        model="gemini-2.0-flash",
        contents=[types.Part.from_text(text=prompt)]
    )

    itinerary = response.text.strip()
    print("AI:", itinerary)
    history.append(types.ModelContent(parts=[types.Part.from_text(text=itinerary)]))

# Get events info
def handle_get_events(action, history, memory):
    city = action["location"]
    memory["last_city"] = city

    # datetime.utcnow() is deprecated since Python 3.12; use an aware UTC datetime
    today = datetime.datetime.now(datetime.timezone.utc)
    next_week = today + datetime.timedelta(days=7)
    start = today.strftime("%Y-%m-%dT00:00:00Z")
    end = next_week.strftime("%Y-%m-%dT23:59:59Z")

    events_info = get_events(city=city, start_date=start, end_date=end)
    print("AI:", events_info)
    history.append(types.ModelContent(parts=[types.Part.from_text(text=events_info)]))


# 💬 **Chat with TravelAIAgent**
This block of code powers the entire real-time conversation between the user and TravelAIAgent. It’s where everything comes together — intent recognition, memory tracking, handler dispatching, and fallback chat. Here's how it works:

---

**🕘 history and memory Initialization**\
The assistant starts with two critical variables:
1. history: a list that stores the full back-and-forth conversation (in Gemini’s expected format). It keeps context across turns.
2. memory: a dictionary that stores session-specific context:
   - last_city: the most recently mentioned city (used for follow-ups like "What's the weather like there?")
   - preferences: any user-specified likes (e.g. “I love seafood”) that guide future responses

---
**👋 First Impression: Onboarding Message**\
The assistant prints a warm welcome using the onboarding_message and also adds it to the conversation history. This sets the tone and immediately informs the user of the assistant’s capabilities.

---

**🔁 The Main Loop**\
This is a while True: loop that simulates an ongoing conversation. Each time the user enters input, the message is read and stripped of surrounding whitespace, and its intent is interpreted using interpret_user_request(). Typing q ends the session.

The assistant then routes the request to the appropriate handler: weather, cultural tips, photo analysis, event search, itinerary planning, translation, or memory recall. Handlers take care of fetching data, calling APIs, or prompting Gemini, then store the result back in history.
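
The routing step can also be expressed as a lookup table. The sketch below is an alternative formulation, not the notebook's code (which uses an if/elif chain); `route`, `DEMO_HANDLERS`, and the two demo handlers are hypothetical names. Each intent maps to a handler plus the key that must be present in the parsed action.

```python
def route(action, handlers):
    """Return the handler for an action, or None to signal the fallback branch."""
    entry = handlers.get(action.get("intent"))
    if entry is None:
        return None
    handler, required_key = entry
    # Some intents need a parameter (e.g. a location); others need none
    if required_key is not None and required_key not in action:
        return None
    return handler


def demo_weather(action):
    return f"weather for {action['location']}"


def demo_last_city(action):
    return "last city"


DEMO_HANDLERS = {
    "get_weather": (demo_weather, "location"),
    "get_last_city": (demo_last_city, None),
}

chosen = route({"intent": "get_weather", "location": "Lisbon"}, DEMO_HANDLERS)
missing_key = route({"intent": "get_weather"}, DEMO_HANDLERS)
unknown = route({"intent": "small_talk"}, DEMO_HANDLERS)
```

With a table like this, adding an intent means adding one dictionary entry, at the cost of requiring all handlers to share a call signature.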

---

**🧠 Fallback Branch**\
If no intent matches (e.g. the user says something casual or vague), the input is added to history and Gemini is called with the full context to continue the conversation naturally. If the message contains a preference phrase such as "I love" or "I prefer" (e.g. "I love museums"), it is also stored in memory["preferences"] for future personalization. A try/except block catches any Gemini API failure and prints a friendly error message, so the loop keeps running instead of crashing.
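
The preference check in this branch is a simple phrase match, which can be sketched on its own. `remember_preference` and `PREFERENCE_PHRASES` are hypothetical names; the phrase list mirrors the one used in the chat loop.

```python
PREFERENCE_PHRASES = ("i like", "i prefer", "i enjoy", "i love", "i want")


def remember_preference(user_input, memory):
    """Append the message to memory['preferences'] if it states a preference."""
    text = user_input.lower()
    if any(phrase in text for phrase in PREFERENCE_PHRASES):
        memory["preferences"].append(user_input)
        return True
    return False


demo_memory = {"last_city": None, "preferences": []}
stored = remember_preference("I love museums", demo_memory)
ignored = remember_preference("Tell me a joke", demo_memory)
```

Because this is substring matching, it can produce false positives (for example, "i want" also matches "I want to cancel"), so the stored preferences are best treated as hints for the model rather than strict rules.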

In [None]:
def start_travel_chat():
    history = []
    memory = {
        "last_city": None,
        "preferences": []
    }

    print("TravelAI:", onboarding_message)
    history.append(types.ModelContent(parts=[types.Part.from_text(text=onboarding_message)]))

    while True:
        user_input = input("You: ").strip()
        if user_input.lower() == 'q':
            print("Goodbye!")
            break

        action = interpret_user_request(user_input)

        if action.get("intent") == "get_weather" and "location" in action:
            handle_get_weather(action, history, memory)
        elif action.get("intent") == "get_tip" and "query" in action:
            handle_get_tip(action, history, memory)
        elif action.get("intent") == "describe_photo" and "filename" in action:
            handle_describe_photo(action, history)
        elif action.get("intent") == "ask_about_photo" and "question" in action:
            handle_ask_about_photo(user_input, history)
        elif action.get("intent") == "image_translate" and "filename" in action:
            handle_image_translate(action, history)
        elif action.get("intent") == "get_weather_last":
            handle_get_weather_last(memory, history)
        elif action.get("intent") == "get_last_city":
            handle_get_last_city(memory, history)
        elif action.get("intent") == "get_weather_forecast" and "location" in action:
            handle_get_weather_forecast(action, memory, history)
        elif action.get("intent") == "plan_itinerary" and "location" in action:
            handle_plan_itinerary(action, memory, history)
        elif action.get("intent") == "get_events" and "location" in action:
            handle_get_events(action, history, memory)
        else:
            if user_input:
                history.append(types.UserContent(parts=[types.Part.from_text(text=user_input)]))

                if any(phrase in user_input.lower() for phrase in ["i like", "i prefer", "i enjoy", "i love", "i want"]):
                    memory["preferences"].append(user_input)

                try:
                    response = client.models.generate_content(
                        model="gemini-2.0-flash",
                        config=config,
                        contents=history
                    )
                    history.append(response.candidates[0].content)
                    print("AI:", response.text)
                except Exception as e:
                    print("⚠️ AI: I had trouble generating a response. Please try rephrasing.")
                    print("Error:", str(e))

## Feel free to run the following code and start chatting with TravelAIAgent!
To begin your travel planning session, just run the cell below. Type naturally — like you're talking to a friend — and ask anything about your destination. Type `q` to exit the chat anytime.

In [None]:
start_travel_chat()