# 🧠 GenAI Travel Planner: Personalized Itinerary Generator

Welcome to the **GenAI Travel Planner**, a project developed as part of the [Gen AI Intensive Course Capstone 2025Q1](https://www.kaggle.com/competitions/gen-ai-intensive-course-capstone-2025q1).

In this notebook, we build a smart itinerary planner that:
- Generates **personalized travel plans** based on user preferences
- Uses real-world datasets of **museums**, **restaurants**, **natural attractions**, and **transit stations**
- Leverages **Generative AI capabilities** to produce structured and creative multi-day itineraries

### 🚀 GenAI Capabilities Demonstrated:
- **Structured Output / JSON Mode** — for day-by-day travel itineraries
- **Few-shot Prompting** — to guide the itinerary generation
- **Retrieval Augmented Generation (RAG)** — we extract real-world POIs from datasets and use them as context

Let’s get planning!

## 📁 Step 1: Load Datasets

We begin by loading four public datasets related to places of interest:
- **Museums Dataset**: Cultural institutions from the IMLS dataset
- **Yelp Restaurants Dataset**: Dining places with location and category
- **Valley Metro Stations**: Public transport stations in the Phoenix area
- **GNIS Natural Features**: Parks, lakes, trails, etc. from GNIS

These will be used as input for our retrieval and GenAI itinerary generation.

In [1]:
# Dataset sources:
# - Museums: https://www.kaggle.com/datasets/imls/museum-directory/data

import pandas as pd

# Load datasets
museums_df = pd.read_csv("/kaggle/input/museum-directory/museums.csv", low_memory=False)  # From IMLS Kaggle dataset
yelp_df = pd.read_csv("/kaggle/input/yelp-restaurants/yelp_restaurants.csv")  # Custom-cleaned subset from Yelp Academic
valley_metro_df = pd.read_csv("/kaggle/input/phoenix-valley-metro-rail-stations/ValleyMetroRailStations.csv")  # Phoenix Light Rail station dataset
gnis_df = pd.read_csv("/kaggle/input/gnisnational/gnis.csv")  # GNIS natural features dataset
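
Because the preprocessing step selects columns by exact name, it can help to confirm that each loaded frame has the expected columns before going further. The following is a minimal sketch, not part of the original notebook: `missing_columns` is a hypothetical helper, and `sample_df` is a toy stand-in with made-up values rather than one of the real datasets.

```python
import pandas as pd


def missing_columns(df: pd.DataFrame, required: list[str]) -> list[str]:
    """Return the names in `required` that are absent from `df`, in order."""
    return [col for col in required if col not in df.columns]


# Toy stand-in for a loaded frame (hypothetical values, not real data).
sample_df = pd.DataFrame({
    "StationName": ["Example Station"],
    "POINT_Y": [33.5],
    "POINT_X": [-112.1],
})

# The toy frame deliberately lacks "Address", so it is reported as missing.
print(missing_columns(sample_df, ["StationName", "POINT_Y", "POINT_X", "Address"]))
```

An empty list means every required column is present; anything else points to a renamed or absent column in the source file.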

## 🧹 Step 2: Preprocess the Datasets

We standardize and clean each dataset so that it is usable in our GenAI pipeline:

- Museums: Extract name, type, city/state, and location
- Yelp Restaurants: Keep name, cuisine categories, and coordinates
- Valley Metro: Extract station name, location, and address
- GNIS: Extract natural feature names, types, and geo-coordinates

Each dataset is then filtered to drop rows that are missing a latitude or longitude.
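
All four datasets go through the same three moves: select columns, rename them to a shared schema, and drop rows without coordinates. As an illustration only, that pattern could be wrapped in one function. `to_poi` is a hypothetical helper that the notebook itself does not define, and the toy frame below uses invented values with GNIS-style column names.

```python
import pandas as pd


def to_poi(df: pd.DataFrame, column_map: dict[str, str]) -> pd.DataFrame:
    """Select the keys of `column_map`, rename them to its values,
    and drop rows whose latitude or longitude is missing."""
    poi = df[list(column_map)].rename(columns=column_map)
    return poi.dropna(subset=["latitude", "longitude"])


# Toy frame with GNIS-style column names (hypothetical values, not real data).
toy_raw = pd.DataFrame({
    "FEATURE_NAME": ["Example Creek", "Example Park"],
    "FEATURE_CLASS": ["Stream", "Park"],
    "PRIM_LAT_DEC": [36.46, None],  # second row has no latitude
    "PRIM_LONG_DEC": [-109.47, -112.77],
    "STATE_ALPHA": ["AZ", "AZ"],
})

toy_poi = to_poi(toy_raw, {
    "FEATURE_NAME": "name",
    "FEATURE_CLASS": "type",
    "PRIM_LAT_DEC": "latitude",
    "PRIM_LONG_DEC": "longitude",
    "STATE_ALPHA": "state",
})
print(toy_poi)
```

The row without a latitude is dropped, and the remaining columns follow the shared `name` / `type` / `latitude` / `longitude` / `state` naming. The sketch assumes the mapping always produces `latitude` and `longitude` columns.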

In [2]:
# --- 🏛️ Museums ---
museums_poi = museums_df[[
    'Museum Name',
    'Museum Type',
    'Latitude',
    'Longitude',
    'City (Administrative Location)',
    'State (Administrative Location)'
]].rename(columns={
    'Museum Name': 'name',
    'Museum Type': 'type',
    'Latitude': 'latitude',
    'Longitude': 'longitude',
    'City (Administrative Location)': 'city',
    'State (Administrative Location)': 'state'
}).dropna(subset=['latitude', 'longitude'])

# --- 🍽️ Yelp Restaurants ---
yelp_poi = yelp_df[[
    'name', 'categories', 'latitude', 'longitude', 'city', 'state'
]].dropna(subset=['latitude', 'longitude'])

# --- 🚉 Valley Metro Stations ---
metro_poi = valley_metro_df[[
    'StationName', 'POINT_Y', 'POINT_X', 'Address'
]].rename(columns={
    'StationName': 'name',
    'POINT_Y': 'latitude',
    'POINT_X': 'longitude',
    'Address': 'address'
}).dropna(subset=['latitude', 'longitude'])

# --- 🌄 GNIS Natural Features ---
gnis_poi = gnis_df[[
    'FEATURE_NAME', 'FEATURE_CLASS', 'PRIM_LAT_DEC', 'PRIM_LONG_DEC', 'STATE_ALPHA'
]].rename(columns={
    'FEATURE_NAME': 'name',
    'FEATURE_CLASS': 'type',
    'PRIM_LAT_DEC': 'latitude',
    'PRIM_LONG_DEC': 'longitude',
    'STATE_ALPHA': 'state'
}).dropna(subset=['latitude', 'longitude'])

# ✅ Show Cleaned Samples
print("🏛️ Cleaned Museums:")
display(museums_poi.head())

print("🍽️ Cleaned Restaurants:")
display(yelp_poi.head())

print("🚉 Cleaned Metro Stations:")
display(metro_poi.head())

print("🌄 Cleaned Natural Features:")
display(gnis_poi.head())

🏛️ Cleaned Museums:


| | name | type | latitude | longitude | city | state |
|---|---|---|---|---|---|---|
| 0 | ALASKA AVIATION HERITAGE MUSEUM | HISTORY MUSEUM | 61.17925 | -149.97254 | ANCHORAGE | AK |
| 1 | ALASKA BOTANICAL GARDEN | ARBORETUM, BOTANICAL GARDEN, OR NATURE CENTER | 61.1689 | -149.76708 | ANCHORAGE | AK |
| 2 | ALASKA CHALLENGER CENTER FOR SPACE SCIENCE TEC... | SCIENCE & TECHNOLOGY MUSEUM OR PLANETARIUM | 60.56149 | -151.21598 | KENAI | AK |
| 3 | ALASKA EDUCATORS HISTORICAL SOCIETY | HISTORIC PRESERVATION | 60.5628 | -151.26597 | KENAI | AK |
| 4 | ALASKA HERITAGE MUSEUM | HISTORY MUSEUM | 61.17925 | -149.97254 | ANCHORAGE | AK |


🍽️ Cleaned Restaurants:


| | name | categories | latitude | longitude | city | state |
|---|---|---|---|---|---|---|
| 0 | Emerald Chinese Restaurant | Specialty Food\|Restaurants\|Dim Sum\|Imported Fo... | 43.605499 | -79.652289 | Mississauga | ON |
| 1 | Musashi Japanese Restaurant | Sushi Bars\|Restaurants\|Japanese | 35.092564 | -80.859132 | Charlotte | NC |
| 2 | Taco Bell | Restaurants\|Breakfast & Brunch\|Mexican\|Tacos\|T... | 33.495194 | -112.028588 | Phoenix | AZ |
| 3 | Marcos Pizza | Italian\|Restaurants\|Pizza\|Chicken Wings | 41.70852 | -81.359556 | Mentor-on-the-Lake | OH |
| 4 | Carluccios Tivoli Gardens | Restaurants\|Italian | 36.100016 | -115.128529 | Las Vegas | NV |


🚉 Cleaned Metro Stations:


| | name | latitude | longitude | address |
|---|---|---|---|---|
| 0 | 19th Ave / Dunlap | 33.56709 | -112.099389 | 1935 W Dunlap Ave |
| 1 | Center / Main St | 33.415098 | -111.83066 | 26 East Main Street |
| 2 | Northern / 19th Ave | 33.55319 | -112.09936 | 7832 N 19th Ave |
| 3 | Glendale / 19th Ave | 33.538643 | -112.099329 | 6813 N 19th Ave |
| 4 | 44th St / Washington | 33.44817 | -111.987983 | 4203 East Washington Street |


🌄 Cleaned Natural Features:


| | name | type | latitude | longitude | state |
|---|---|---|---|---|---|
| 0 | Agua Sal Creek | Stream | 36.461112 | -109.478439 | AZ |
| 1 | Agua Sal Wash | Valley | 36.546112 | -109.517607 | AZ |
| 2 | Aguaje Draw | Valley | 34.577496 | -109.213616 | AZ |
| 3 | Arlington State Wildlife Area | Park | 33.248655 | -112.773505 | AZ |
| 4 | Bar X Wash | Stream | 32.470904 | -109.936185 | AZ |
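
The introduction describes using real-world POIs as context for generation. One way cleaned tables like these might feed that step is sketched below. This is an assumption about the approach, not the notebook's actual retrieval code: `build_context` is a hypothetical helper, it applies a plain state filter rather than any similarity search, and the frames are toy data with invented values.

```python
import pandas as pd


def build_context(pois: dict[str, pd.DataFrame], state: str, per_source: int = 3) -> str:
    """Format up to `per_source` POIs from each table as bullet lines.

    Tables that have a `state` column are filtered to `state`; tables
    without one (such as the metro stations) are passed through unfiltered.
    """
    lines = []
    for source, df in pois.items():
        if "state" in df.columns:
            df = df[df["state"] == state]
        for row in df.head(per_source).to_dict("records"):
            lines.append(
                f"- [{source}] {row['name']} "
                f"({row['latitude']:.4f}, {row['longitude']:.4f})"
            )
    return "\n".join(lines)


# Toy frames following the cleaned schema (hypothetical values, not real data).
toy_museums = pd.DataFrame({
    "name": ["Example Museum A", "Example Museum B"],
    "latitude": [33.45, 61.18],
    "longitude": [-112.07, -149.97],
    "state": ["AZ", "AK"],
})
toy_stations = pd.DataFrame({
    "name": ["Example Station"],
    "latitude": [33.567],
    "longitude": [-112.099],
})

context = build_context({"museum": toy_museums, "station": toy_stations}, state="AZ")
print(context)
```

The resulting string could be placed in a prompt as grounding text. A distance-based filter around a chosen city would be a natural refinement, since a state-level match is coarse.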
