# 🗺️ **AI-Venture: GenAI-Powered Travel Companion**
---

An intelligent trip planner that generates **personalized itineraries**, recommends accommodations and experiences, and answers user queries — enhanced with **Google Gemini**.

---

## ✅ _Problem Statement_

Planning a trip should be exciting, not exhausting. But without a travel guide or agent, organizing a vacation becomes time-consuming and frustrating. Hiring a guide can be costly, and hours of online searching rarely turn up options that match your preferences.

**AI-Venture** is here to simplify it all with a smart, AI-driven solution.

---

## 💡 _Solution Overview_

**AI-Venture** is an intelligent assistant that helps you:

- Create tailored itineraries  
- Stay within budget  
- Discover local food, culture, and attractions  
- Get answers to travel questions from uploaded brochures and guides  

Built with the **Google Gemini API**, it combines conversational AI, semantic search, and function calling.

---

## 🔮 _Gen AI Capabilities Demonstrated_

- 🧠 **Few-Shot Prompting**: Example-driven prompts that shape Gemini’s output format and behavior.  
- 📄 **Structured Output (JSON)**: Itineraries returned as JSON that the notebook parses and displays.  
- 🧭 **Function Calling**: Gemini picks the Python lookup (motels, restaurants, or attractions) that matches a user query.  
- 📚 **Retrieval-Augmented Generation (RAG)**: Answer from uploaded PDFs and brochures.  
- 🔍 **Embeddings + Vector Search (FAISS)**: Semantic search for relevant document chunks.

---

## 📦 Environment Setup:
Let's install all the necessary libraries for our project.

---

In [1]:
!pip install -q -U google-generativeai PyMuPDF faiss-cpu langchain gradio pypdf2 tiktoken


## 📚 Import Libraries:

---

In [2]:
import google.generativeai as genai
import os
import pandas as pd
import fitz  
import faiss
import json
import numpy as np
import re
from IPython.display import Markdown, display

## 🔐 API Key Setup:

---
Authenticate by passing your API key to `genai.configure(api_key=...)`.

Store the key as `GOOGLE_API_KEY` in an environment variable or a Kaggle secret instead of hard-coding it.

In [3]:
from kaggle_secrets import UserSecretsClient
GOOGLE_API_KEY = UserSecretsClient().get_secret("GOOGLE_API_KEY")

# Set your Gemini API key
genai.configure(api_key=GOOGLE_API_KEY)

# Create a model instance
model = genai.GenerativeModel("gemini-2.0-flash")
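
Outside Kaggle, `kaggle_secrets` is not available. Here is a minimal sketch of an environment-variable fallback, assuming the key has been exported as `GOOGLE_API_KEY`. The helper name `load_api_key` is hypothetical and is not part of the notebook or the Gemini SDK.

```python
import os

def load_api_key(env=None, name="GOOGLE_API_KEY"):
    """Return the API key from an environment mapping, or raise a clear error."""
    # Hypothetical helper: falls back to the process environment when no mapping is given
    env = os.environ if env is None else env
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(f"{name} is not set; export it or add it as a notebook secret.")
    return key

# Example with an explicit mapping so the sketch runs without a real key
print(load_api_key({"GOOGLE_API_KEY": "demo-key"}))
```

The returned value could then be passed to `genai.configure(api_key=...)` in place of the Kaggle secret.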

## 🗃️ Defining Dataframes/Database:
---
We’ll manually define sample `motels`, `restaurants`, and `attractions` DataFrames; the function-calling section later queries them.

In [4]:
# Sample motels, restaurants, attractions
motels = pd.DataFrame([
    # Paris
    {"name": "Cozy Inn", "location": "Paris", "price_per_night": 45, "rating": 4.2},
    {"name": "Budget Stay", "location": "Paris", "price_per_night": 30, "rating": 3.8},
    {"name": "Chic Rooms", "location": "Paris", "price_per_night": 65, "rating": 4.5},

    # Rome
    {"name": "Rome Comfort Hotel", "location": "Rome", "price_per_night": 50, "rating": 4.0},
    {"name": "Trastevere Budget Inn", "location": "Rome", "price_per_night": 35, "rating": 3.7},
    {"name": "Hotel Bella Roma", "location": "Rome", "price_per_night": 60, "rating": 4.3},

    # Berlin
    {"name": "Berlin Central Stay", "location": "Berlin", "price_per_night": 48, "rating": 4.1},
    {"name": "Budget Lodge Berlin", "location": "Berlin", "price_per_night": 32, "rating": 3.9},
    {"name": "Checkpoint Hostel", "location": "Berlin", "price_per_night": 27, "rating": 3.6},

    # Barcelona
    {"name": "Barcelona Breeze", "location": "Barcelona", "price_per_night": 55, "rating": 4.4},
    {"name": "La Rambla Inn", "location": "Barcelona", "price_per_night": 40, "rating": 4.0},
    {"name": "El Centro Lodge", "location": "Barcelona", "price_per_night": 33, "rating": 3.8}
])

restaurants = pd.DataFrame([
    # Paris
    {"name": "Veggie Delight", "location": "Paris", "type": "Vegetarian", "price_range": "Low"},
    {"name": "Cafe Royale", "location": "Paris", "type": "Cafe", "price_range": "Medium"},
    {"name": "Seafood Heaven", "location": "Paris", "type": "Seafood", "price_range": "High"},

    # Rome
    {"name": "Pasta Fresca", "location": "Rome", "type": "Italian", "price_range": "Medium"},
    {"name": "Vegan Vibe", "location": "Rome", "type": "Vegetarian", "price_range": "Low"},
    {"name": "Trattoria Romano", "location": "Rome", "type": "Traditional", "price_range": "High"},

    # Berlin
    {"name": "Berlin Bites", "location": "Berlin", "type": "Fast Food", "price_range": "Low"},
    {"name": "Wurst Haus", "location": "Berlin", "type": "German", "price_range": "Medium"},
    {"name": "Green Eats", "location": "Berlin", "type": "Vegan", "price_range": "Low"},

    # Barcelona
    {"name": "Tapas Town", "location": "Barcelona", "type": "Tapas", "price_range": "Medium"},
    {"name": "La Veggiera", "location": "Barcelona", "type": "Vegetarian", "price_range": "Low"},
    {"name": "Mar y Sol", "location": "Barcelona", "type": "Seafood", "price_range": "High"}
])

attractions = pd.DataFrame([
    # Paris
    {"name": "Eiffel Tower", "location": "Paris", "type": "Landmark", "duration": "2 hours"},
    {"name": "Louvre Museum", "location": "Paris", "type": "Museum", "duration": "3 hours"},
    {"name": "Seine River Cruise", "location": "Paris", "type": "Experience", "duration": "1.5 hours"},

    # Rome
    {"name": "Colosseum", "location": "Rome", "type": "Landmark", "duration": "2 hours"},
    {"name": "Vatican Museums", "location": "Rome", "type": "Museum", "duration": "3 hours"},
    {"name": "Trevi Fountain", "location": "Rome", "type": "Landmark", "duration": "1 hour"},

    # Berlin
    {"name": "Brandenburg Gate", "location": "Berlin", "type": "Landmark", "duration": "1 hour"},
    {"name": "Pergamon Museum", "location": "Berlin", "type": "Museum", "duration": "2.5 hours"},
    {"name": "Berlin Wall Memorial", "location": "Berlin", "type": "Historical", "duration": "1.5 hours"},

    # Barcelona
    {"name": "Sagrada Familia", "location": "Barcelona", "type": "Landmark", "duration": "2 hours"},
    {"name": "Park Güell", "location": "Barcelona", "type": "Experience", "duration": "1.5 hours"},
    {"name": "Picasso Museum", "location": "Barcelona", "type": "Museum", "duration": "2 hours"}
])

## 🧠 Teaching the AI with Examples (Few-Shot Prompting)
---
We’ll provide an example input together with its desired output.  
This helps Gemini understand what a “good itinerary” looks like.

The prompt contains:
- One example input/output pair
- The user's input at the end
- An instruction to respond **only in JSON**

In [5]:
# Function to build the few-shot prompt for the model
def build_few_shot_prompt(user):
    return f"""
You are a travel planner assistant. Based on the user's input, generate a {user["days"]}-day travel itinerary in JSON format.

Instructions:
- Recommend motels, attractions, and restaurants that match the user's preferences.
- Stay strictly within the total budget (in USD).
- Keep each day's total cost proportional to the total budget.
- Include estimated cost breakdowns (in USD) for:
  - Motel
  - Food
  - Attraction entrance fees (if applicable)
- Mention attraction durations where useful.
- **Do not use tildes (~) anywhere.**
- Respond in clean, readable JSON only (no explanations).
- Follow this structure exactly.

Example:

User: {{
  "name": "Aarav",
  "destination": "Tokyo",
  "days": 2,
  "budget": 150,
  "preferences": {{
    "food": "Vegetarian",
    "interests": ["Museum", "Cultural"]
  }}
}}

Response:
{{
  "day_1": {{
    "attractions": [
      "Tokyo National Museum (Entry approx. $7, Duration: 2 hrs)",
      "Asakusa Shrine (Free, Duration: 1 hr)"
    ],
    "restaurant": "Veggie Sora (Estimated around $14)",
    "motel": "Tokyo Budget Inn (Estimated around $28/night)",
    "total_cost": 70
  }},
  "day_2": {{
    "attractions": [
      "Meiji Shrine (Free, Duration: 1.5 hrs)",
      "Ueno Park (Free, Duration: 1 hr)"
    ],
    "restaurant": "Green Table Tokyo (Estimated around $14)",
    "motel": "Tokyo Budget Inn (Estimated around $28/night)",
    "total_cost": 70
  }},
  "notes": "Remaining $10 can be used for transport or snacks."
}}

Now generate for this user:

User: {json.dumps(user, indent=2)}
Response:
"""

## 👩🏼‍💻 Enter Your Travel Preferences (Structured JSON Output)
---
You can change the dictionary below 👇🏼 to create your own trip. Modify:
- `destination`
- `days`
- `budget`
- `food` type
- `interests`

Then run the cells below to generate your own travel plan!

In [6]:
# Example user travel preferences
user_input = {
    "name": "John",
    "destination": "Paris",
    "days": 2,
    "budget": 200,
    "preferences": {
        "food": "Vegetarian",
        "interests": ["Landmark", "Museum"]
    }
}
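
Since the prompt reads `days`, `budget`, and `preferences` directly from this dictionary, it can help to check the values before calling the model. Here is a minimal sketch; `validate_user_input` is a hypothetical helper and its rules are assumptions, not something the notebook enforces.

```python
def validate_user_input(user):
    """Return a list of problems found in the preferences dict (empty if none)."""
    problems = []
    for key in ("name", "destination", "days", "budget", "preferences"):
        if key not in user:
            problems.append(f"missing field: {key}")

    days = user.get("days")
    if days is not None and (not isinstance(days, int) or days < 1):
        problems.append("days must be a positive integer")

    budget = user.get("budget")
    if budget is not None and (not isinstance(budget, (int, float)) or budget <= 0):
        problems.append("budget must be a positive number")

    prefs = user.get("preferences")
    if prefs is not None:
        if not isinstance(prefs, dict):
            problems.append("preferences must be a dictionary")
        elif not isinstance(prefs.get("interests"), list):
            problems.append("preferences.interests must be a list")
    return problems

sample_user = {
    "name": "John",
    "destination": "Paris",
    "days": 2,
    "budget": 200,
    "preferences": {"food": "Vegetarian", "interests": ["Landmark", "Museum"]},
}
print(validate_user_input(sample_user))
```

An empty list means the dictionary has the shape the prompt builder expects.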

In [7]:
prompt = build_few_shot_prompt(user_input)

response = model.generate_content(prompt)

# Print raw response
print(response.text)

```json
{
  "day_1": {
    "attractions": [
      "Eiffel Tower (Entry approx. $30, Duration: 2 hrs)",
      "Louvre Museum (Entry approx. $20, Duration: 3 hrs)"
    ],
    "restaurant": "Le Grenier de Notre Dame (Vegetarian options, Estimated around $25)",
    "motel": "Hotel FIAP Paris (Estimated around $40/night)",
    "total_cost": 115
  },
  "day_2": {
    "attractions": [
      "Arc de Triomphe (Entry approx. $15, Duration: 1.5 hrs)",
      "Sainte-Chapelle (Entry approx. $13, Duration: 1 hr)"
    ],
    "restaurant": "Loving Hut (Vegan restaurant, Estimated around $20)",
    "motel": "Hotel FIAP Paris (Estimated around $40/night)",
    "total_cost": 88
  },
  "notes": "Remaining $3 can be used for transport or snacks."
}
```


## 🔎 View Your Personalized Itinerary
---
Gemini replies in JSON format. We’ll parse it and display:
- Attractions
- Motels
- Restaurants
- Estimated cost
- Notes / Tips
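
The raw reply shown earlier arrives wrapped in a Markdown code fence, so the fence has to be removed before `json.loads` can parse it. Here is a minimal sketch of that step as a reusable function; `parse_model_json` is a hypothetical helper, and the fence string is built at runtime only so that this example contains no literal fence.

```python
import json
import re

FENCE = "`" * 3  # three backticks, built at runtime

def parse_model_json(text):
    """Strip an optional Markdown code fence and parse the remaining JSON."""
    body = text.strip()
    pattern = rf"^{FENCE}(?:json)?\s*(.*?)\s*{FENCE}$"
    match = re.match(pattern, body, flags=re.DOTALL)
    if match:
        body = match.group(1)  # keep only the text between the fences
    return json.loads(body)

# A fenced reply similar in shape to the model output
sample = f'{FENCE}json\n{{"day_1": {{"total_cost": 115}}}}\n{FENCE}'
print(parse_model_json(sample))
```

Unfenced JSON passes through unchanged, so the same helper works whether or not the model adds a fence.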

In [8]:
# Escape Markdown-sensitive characters so itinerary text renders literally
def clean_for_markdown(text):
    text = text.replace("_", "\\_")
    text = text.replace("~", "\\~")
    text = text.replace("*", "\\*")
    text = text.replace("`", "\\`")
    # Insert a space wherever a lowercase letter is directly followed by an uppercase one
    text = re.sub(r"([a-z])([A-Z])", r"\1 \2", text)
    return text

try:
    # Remove ```json or ``` if present
    cleaned_text = re.sub(r"^```(?:json)?|```$", "", response.text.strip(), flags=re.MULTILINE).strip()
    cleaned_text = cleaned_text.replace("“", "\"").replace("”", "\"").replace("’", "'")

    itinerary = json.loads(cleaned_text)

    display(Markdown("## ✈️| Travel Itinerary"))

    total_cost = 0

    for day, plan in itinerary.items():
        if day.startswith("day"):
            # Replace underscores with spaces and title-case the label
            day_label = day.replace("_", " ").title()  # "day_1" → "Day 1"
            display(Markdown(f"### 🗓️ {day_label}"))

            # Clean each element to escape markdown characters
            parsed_attractions = [clean_for_markdown(attraction) for attraction in plan["attractions"]]
            parsed_restaurant = clean_for_markdown(plan["restaurant"])
            parsed_motel = clean_for_markdown(plan["motel"])

            # Display
            display(Markdown(f"- 🗽 **Attractions:** {', '.join(parsed_attractions)}"))
            display(Markdown(f"- 🍽️ **Restaurant:** {parsed_restaurant}"))
            display(Markdown(f"- 🛏️ **Motel:** {parsed_motel}"))
            display(Markdown(f"- 💰 **Estimated Cost:** ${plan['total_cost']}"))
            total_cost += plan["total_cost"]

        elif day == "notes":
            display(Markdown(f"**📝 Notes:** {clean_for_markdown(plan)}"))

    # Final budget summary
    display(Markdown(f"### 💸 **Total Estimated Trip Cost:** ${total_cost}"))
    remaining = user_input["budget"] - total_cost

    if remaining > 0:
        display(Markdown(f"✅ **Remaining Budget:** ${remaining}"))
    elif remaining < 0:
        display(Markdown(f"⚠️ **Over Budget by:** ${-remaining}"))
    else:
        display(Markdown("📎 **Used the entire budget perfectly!**"))

except Exception as e:
    print("❌ Could not parse response as JSON:", e)
    print("Raw response:\n", response.text)

## ✈️| Travel Itinerary

### 🗓️ Day 1

- 🗽 **Attractions:** Eiffel Tower (Entry approx. $30, Duration: 2 hrs), Louvre Museum (Entry approx. $20, Duration: 3 hrs)

- 🍽️ **Restaurant:** Le Grenier de Notre Dame (Vegetarian options, Estimated around $25)

- 🛏️ **Motel:** Hotel FIAP Paris (Estimated around $40/night)

- 💰 **Estimated Cost:** $115

### 🗓️ Day 2

- 🗽 **Attractions:** Arc de Triomphe (Entry approx. $15, Duration: 1.5 hrs), Sainte-Chapelle (Entry approx. $13, Duration: 1 hr)

- 🍽️ **Restaurant:** Loving Hut (Vegan restaurant, Estimated around $20)

- 🛏️ **Motel:** Hotel FIAP Paris (Estimated around $40/night)

- 💰 **Estimated Cost:** $88

**📝 Notes:** Remaining $3 can be used for transport or snacks.

### 💸 **Total Estimated Trip Cost:** $203

⚠️ **Over Budget by:** $3
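
In this run the model's own note reports $3 remaining, while the day totals add up to $203 against a $200 budget. That is why the notebook recomputes the total instead of trusting the note. Here is a minimal sketch of that check as a standalone function; `summarize_budget` is a hypothetical helper.

```python
def summarize_budget(itinerary, budget):
    """Sum each day's total_cost and compare the result with the budget."""
    total = sum(
        plan["total_cost"]
        for day, plan in itinerary.items()
        if day.startswith("day")  # skip non-day keys such as "notes"
    )
    return {"total": total, "remaining": budget - total, "within_budget": total <= budget}

# Day totals taken from the itinerary shown above
itinerary = {
    "day_1": {"total_cost": 115},
    "day_2": {"total_cost": 88},
    "notes": "Remaining $3 can be used for transport or snacks.",
}
print(summarize_budget(itinerary, 200))
```

A result with `within_budget` set to `False` could be used to re-prompt the model with a stricter budget instruction.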

## 🔊 Function Calling with Gemini
---
Gemini can now *intelligently decide* which Python function to call, based on user queries.

We’ll define:
- 🔍 A `find_motels` function for matching budget, location
- 🍽️ A `find_restaurants` function based on food type
- 🎢 A `find_attractions` function based on interests

Let’s empower Gemini to become a truly interactive travel assistant!

In [9]:
# Functions Gemini can call

def find_motels(location: str, max_price: float):
    return motels[(motels["location"] == location) & (motels["price_per_night"] <= max_price)].to_dict(orient="records")

def find_restaurants(location: str, food_type: str):
    return restaurants[(restaurants["location"] == location) & (restaurants["type"].str.contains(food_type, case=False))].to_dict(orient="records")

def find_attractions(location: str, interest_type: str):
    return attractions[(attractions["location"] == location) & (attractions["type"].str.contains(interest_type, case=False))].to_dict(orient="records")

## ✒️ Define the Function Schema for Gemini
---
Here we define the function names, descriptions, and parameters Gemini can use to choose what to call. These are sent to Gemini as metadata.

In [10]:
functions = [
    {
        "name": "find_motels",
        "description": "Find budget motels in a specific location.",
        "parameters": {
            "type": "object",
            "properties": {
                "location": {"type": "string"},
                "max_price": {"type": "number"}
            },
            "required": ["location", "max_price"]
        }
    },
    {
        "name": "find_restaurants",
        "description": "Find restaurants by location and food type.",
        "parameters": {
            "type": "object",
            "properties": {
                "location": {"type": "string"},
                "food_type": {"type": "string"}
            },
            "required": ["location", "food_type"]
        }
    },
    {
        "name": "find_attractions",
        "description": "Find attractions by location and interest type.",
        "parameters": {
            "type": "object",
            "properties": {
                "location": {"type": "string"},
                "interest_type": {"type": "string"}
            },
            "required": ["location", "interest_type"]
        }
    }
]
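
The schema and the Python functions are maintained separately, so a renamed parameter in one place can silently break the call. Here is a minimal sketch of a consistency check using only the standard library. `schema_mismatches` is a hypothetical helper, and the stub function stands in for the notebook's real `find_motels`.

```python
import inspect

def find_motels(location: str, max_price: float):
    return []  # stub standing in for the notebook's DataFrame lookup

def schema_mismatches(declarations, registry):
    """Compare each declaration's required parameters with the Python signature."""
    problems = []
    for decl in declarations:
        func = registry.get(decl["name"])
        if func is None:
            problems.append(f"{decl['name']}: no Python function registered")
            continue
        expected = list(inspect.signature(func).parameters)
        declared = decl["parameters"]["required"]
        if sorted(expected) != sorted(declared):
            problems.append(f"{decl['name']}: schema {declared} vs signature {expected}")
    return problems

declarations = [{
    "name": "find_motels",
    "parameters": {
        "type": "object",
        "properties": {"location": {"type": "string"}, "max_price": {"type": "number"}},
        "required": ["location", "max_price"],
    },
}]
print(schema_mismatches(declarations, {"find_motels": find_motels}))
```

An empty list means every declared function exists and its required parameters match the Python signature.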

## ⚙️ Loading Gemini with Function Calling Enabled
---

In [11]:
model_fc = genai.GenerativeModel(
    model_name="gemini-1.5-flash",
    tools=[{"function_declarations": functions}]
)

## 💬 Ask Your Custom Travel Query!
---
Try asking things like:
- “Show me motels in Paris under $50”
- “Find vegetarian restaurants in Paris”
- “What museums can I visit in Paris?”

Gemini will choose the right function automatically!

---
<h4>🔧| How to Make the Code Interactive</h4>

To make the query interactive, uncomment the line containing `input()` below. This will prompt you to enter your own travel query when you run the cell.

By default, it’s set to a sample query. If you want to test it with your own query, simply uncomment the line and run the cell again.

Example: `# user_query = input("💬 Enter your travel query: ")` - remove '#' to run and test the chatbot.

---
⚠️ **Note for Kaggle Users: Save Without Crashes**

The cell below can use `input()`, which throws an error when you click **"Save & Run All"** in Kaggle because nobody is there to type a response.

To avoid this:

- Before saving, keep the default line `user_query = "What museums can I visit in Paris?"  # Default query` active and leave the `input()` line commented out.
- To enter a query manually, comment out the default line and uncomment `# user_query = input("💬 Enter your travel query: ")  # Uncomment for interactive input`.

Only run the interactive version **manually**, not during "Save & Run All".

In [12]:
# 🔄 Gemini Chat Start
chat = model_fc.start_chat()

# 🔄 Interactive Query
user_query = "What museums can I visit in Paris?"  # Default query
# user_query = input("💬 Enter your travel query: ")  # Uncomment for interactive input

# Send it to Gemini
response = chat.send_message(user_query)

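# A plain-text reply also has parts, so check that a function name was actually returned
parts = response.candidates[0].content.parts
if parts and parts[0].function_call.name:
    result = parts[0].function_call
    print("🧠 Function to Call:", result.name)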

    # Convert args to dict (Safe fix for notebook)
    try:
        args_dict = {k: v for k, v in result.args.items()}
        print("📦 Arguments:", args_dict)
    except Exception as e:
        print(f"❌ Error converting arguments: {e}")
        args_dict = {}

    # Call the corresponding Python function
    if result.name == "find_motels":
        output = find_motels(**args_dict)
    elif result.name == "find_restaurants":
        output = find_restaurants(**args_dict)
    elif result.name == "find_attractions":
        output = find_attractions(**args_dict)
    else:
        output = []

    # Display results
    if output:
        print("🔎 Gemini's Results:")
        for i, entry in enumerate(output, 1):
            print(f"\nResult #{i}")
            for k, v in entry.items():
                print(f"- {k.capitalize()}: {v}")
    else:
        print("🤷 No matching results found.")
else:
    print("❌ Gemini didn’t suggest a function to call.")

🧠 Function to Call: find_attractions
📦 Arguments: {'location': 'Paris', 'interest_type': 'museum'}
🔎 Gemini's Results:

Result #1
- Name: Louvre Museum
- Location: Paris
- Type: Museum
- Duration: 3 hours
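
The `if`/`elif` chain in the cell above grows with every new function. One alternative is a dispatch table that maps function names to callables. Here is a minimal sketch, assuming the name and arguments have already been extracted from the model's reply; `HANDLERS` and `dispatch` are hypothetical names, and the stub replaces the notebook's DataFrame lookup.

```python
def find_attractions(location, interest_type):
    # Stub standing in for the notebook's DataFrame lookup
    return [{"name": "Louvre Museum", "location": location, "type": "Museum"}]

# Map each function name Gemini may return to the Python callable that handles it
HANDLERS = {"find_attractions": find_attractions}

def dispatch(name, args):
    """Look up the handler for a function call and invoke it with its arguments."""
    handler = HANDLERS.get(name)
    if handler is None:
        return []  # unknown or empty function name
    return handler(**args)

print(dispatch("find_attractions", {"location": "Paris", "interest_type": "museum"}))
```

Adding a new tool then only requires a new entry in `HANDLERS` plus its schema declaration.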


## 📄 Load Your Travel Brochures (PDFs)
---
📂 Upload PDFs such as `rome_guide.pdf` or `paris_brochure.pdf` to your Kaggle notebook directory, then replace the `filenames` list below 👇🏼 with your uploaded file names.

In [13]:
# List your travel brochure PDF files here
filenames = ["/kaggle/input/travel-brochures/Japan.pdf", "/kaggle/input/travel-brochures/Malaysia.pdf", "/kaggle/input/travel-brochures/Thailand.pdf"]  # Replace with your own uploaded PDFs

# Helper function to extract all text from PDFs using PyMuPDF
def extract_text_from_pdfs(pdf_files):
    text_chunks = []
    for file in pdf_files:
        with fitz.open(file) as doc:
            for page in doc:
                text = page.get_text()
                # Chunking: split each page into parts of up to 500 words
                words = text.split()
                for i in range(0, len(words), 500):
                    chunk = " ".join(words[i:i+500])
                    text_chunks.append(chunk)
    return text_chunks

# Extract text from the uploaded PDFs
pdf_chunks = extract_text_from_pdfs(filenames)

print(f"✅ Extracted {len(pdf_chunks)} chunks from {len(filenames)} files.")

✅ Extracted 75 chunks from 3 files.
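
The extraction above cuts each page into non-overlapping chunks, so a sentence that straddles a boundary ends up split across two chunks. A common variation is to let neighbouring chunks share a few words. Here is a minimal sketch; `chunk_words` is a hypothetical helper, and the `size` and `overlap` defaults are illustrative assumptions rather than tuned values.

```python
def chunk_words(text, size=500, overlap=50):
    """Split text into chunks of at most `size` words that share `overlap` words."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must be non-negative and smaller than size")
    words = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # this window already reached the end of the text
    return chunks

demo = " ".join(str(i) for i in range(10))
print(chunk_words(demo, size=4, overlap=1))
```

Overlap increases the number of chunks to embed, so it trades some extra API calls for better recall at chunk boundaries.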


## 📊 Generate Embeddings + Build FAISS Index
---
We’ll embed all chunks using Gemini’s embedding API, then build a FAISS index to enable similarity search.

In [14]:
# Name of the Gemini embedding model
embedding_model = "models/embedding-001"

# Embed all PDF text chunks
pdf_embeddings = []
for chunk in pdf_chunks:
    response = genai.embed_content(model=embedding_model, content=chunk)
    pdf_embeddings.append(response["embedding"])

# Convert to numpy array for FAISS
embedding_matrix = np.array(pdf_embeddings).astype("float32")

# Build the FAISS index
index = faiss.IndexFlatL2(embedding_matrix.shape[1])
index.add(embedding_matrix)

print("✅ FAISS index built with", index.ntotal, "chunks.")

✅ FAISS index built with 75 chunks.
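
`IndexFlatL2` does an exhaustive search and ranks vectors by squared Euclidean distance. The plain-Python sketch below illustrates that ranking on toy vectors. It is only meant to show the idea, not how FAISS is implemented, and `top_k_l2` is a hypothetical name.

```python
def top_k_l2(vectors, query, k):
    """Return (squared_distance, index) pairs for the k vectors closest to query."""
    scored = []
    for i, vec in enumerate(vectors):
        # Squared Euclidean distance between the stored vector and the query
        dist = sum((a - b) ** 2 for a, b in zip(vec, query))
        scored.append((dist, i))
    return sorted(scored)[:k]

vectors = [[0.0, 0.0], [1.0, 0.0], [3.0, 4.0]]
print(top_k_l2(vectors, [1.0, 1.0], k=2))
```

With only 75 chunks, an exhaustive search like this is fast; approximate indexes become worthwhile for much larger collections.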


## 🗣️ Ask Questions Based on Brochure PDFs (RAG)
---
You can now ask Gemini any question related to the brochure PDFs you uploaded.
We’ll search for the most relevant chunks using FAISS, then pass them to Gemini as context for its answer.

You can change the question below to test custom ones 👇🏼

In [15]:
# ✅ Change this to ask your own brochure-related question
question = "What are some historical places to visit in Japan?"

# 🔍 Step 1: Embed your question
query_embedding = genai.embed_content(model=embedding_model, content=question)["embedding"]
query_vector = np.array(query_embedding).astype("float32").reshape(1, -1)

# 🔍 Step 2: Perform similarity search
k = 3  # Retrieve top 3 similar chunks
_, indices = index.search(query_vector, k)

# 🔍 Step 3: Retrieve top chunks
retrieved_chunks = [pdf_chunks[i] for i in indices[0]]

# 🔍 Step 4: Construct prompt with relevant chunks
rag_prompt = f"""
You are a helpful travel assistant. Based on the travel brochure content below, answer the user's question.

Brochure Extracts:
---\n{retrieved_chunks[0]}\n---\n{retrieved_chunks[1]}\n---\n{retrieved_chunks[2]}\n---

Question: {question}

Answer:"""

# 🔍 Step 5: Ask Gemini using RAG context
rag_response = model.generate_content(rag_prompt)
print("💬 Gemini's Answer:\n")
print(rag_response.text)

💬 Gemini's Answer:

Based on the brochure extracts, here are some historical places to visit in Japan:

*   **Matsumoto Castle:** One of Japan’s oldest fortresses.
*   **Shirakawa-go (near Takayama):** A UNESCO World Heritage Site historic village.
*   **Nara Park (Nara):** Features age-old cultural values and freely roaming deer.
*   **Kyoto:** The former capital with numerous sacred temples and shrines and 17 UNESCO World Heritage Sites.
*   **Kanazawa:** Offers an abundance of intact historic treasures and Kenroku-en Garden.
*   **Hiroshima:** While not explicitly listed as a specific historical site, the brochure mentions the "haunting history of Hiroshima," implying it's a place to reflect on history.
