# 🎨 ArtExplorer: Your Personal AI Museum Docent (Application)

This notebook runs the interactive user interface for the AI Art Docent. It is the final, user-facing component of our submission for the BigQuery AI Hackathon.

### **1. The Challenge: Overcoming "Gallery Fatigue"**

Museums house humanity's greatest treasures, yet visitors often experience "gallery fatigue"—a feeling of being overwhelmed by vast collections with limited context. Traditional keyword searches fall short when trying to find art that matches a feeling or a mood, like "a quiet, lonely night." Our project solves this problem by transforming art discovery from a simple search into a personalized, guided conversation.

### **2. The Solution: A Conversation with Art**

ArtExplorer is a fully interactive prototype, built entirely within this Kaggle notebook, that allows you to:

1.  **Search by Meaning, Not Just Keywords**: Describe a feeling, a mood, or an abstract concept in natural language.
2.  **Discover Relevant Art**: Using the power of BigQuery's `VECTOR_SEARCH`, the application finds artworks that are semantically similar to your query.
3.  **Receive a Personalized Story**: Once an artwork is selected, our "AI Docent," powered by a Gemini model in BigQuery, generates a unique and engaging narrative, explaining the piece in the context of your original search.
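
To make "semantically similar" concrete, here is a minimal sketch of ranking by cosine distance, the metric this notebook asks `VECTOR_SEARCH` to use. The three-dimensional vectors and artwork titles are made-up toy values, not output from the real embedding model, and `rank_by_cosine_distance` is a hypothetical helper. BigQuery performs the actual search; this only illustrates the idea.

```python
import math

def cosine_distance(a, b):
    """Cosine distance = 1 - cosine similarity (0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

def rank_by_cosine_distance(query_vec, catalog, top_k=2):
    """Return the top_k (title, distance) pairs, closest first."""
    scored = [(title, cosine_distance(query_vec, vec)) for title, vec in catalog.items()]
    return sorted(scored, key=lambda pair: pair[1])[:top_k]

# Toy "embeddings" (assumption: invented for illustration only).
toy_catalog = {
    "Moonlit Harbor": [0.9, 0.1, 0.0],
    "Summer Picnic": [0.0, 0.2, 0.9],
    "Empty Street at Dusk": [0.8, 0.3, 0.1],
}
toy_query = [1.0, 0.2, 0.0]  # stands in for the embedding of "a quiet, lonely night"

ranking = rank_by_cosine_distance(toy_query, toy_catalog)
for title, distance in ranking:
    print(f"{title}: cosine distance = {distance:.3f}")
```

Artworks whose vectors point in nearly the same direction as the query vector rank first, even though no keyword is shared between the query and the titles.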

### **3. How to Use This Notebook**

Run the companion pipeline notebook first, then launch the application from this notebook.

* **Prerequisite**: The data and AI models must first be created by running the companion **[Data Pipeline Notebook](https://www.kaggle.com/code/oceanchoi/artexplorer-data-pipeline)** once.
* **To Launch**: Simply run all the cells in this notebook (`Run -> Run All`). The interactive UI will appear at the bottom. 

### **4. End-to-End Technical Architecture**

The full architecture, including the data pipeline and this interactive application, is as follows:

```
[MET Museum API] -> [Python Data Enrichment (Kaggle Notebook)] -> [Google Cloud Storage]
                                                                            |
                                                                            v
                                                                  [BigQuery AI Engine]
                                                                 /                    \
[ML.GENERATE_EMBEDDING] -> [Vector Table] -> [VECTOR_SEARCH] <---> [Interactive App (UI)] <---> [ML.GENERATE_TEXT (Gemini)]
```

### **5. Alignment with Hackathon Goals**

This project directly addresses the core themes of the BigQuery AI Hackathon:

-   **Innovation**: We combine vector search and generative AI in a novel way to create a new form of human-data interaction, moving beyond simple Q&A to generative, contextual storytelling.
-   **Utility & Impact**: ArtExplorer solves a real-world problem by making large art collections more accessible, personal, and educational for a global audience.
-   **Completeness**: This notebook contains a fully functional, end-to-end prototype that demonstrates a polished and meaningful user experience.

We invite you to run the cells below and experience your own personal museum tour.

**CELL 1: Setup - Install Libraries**

This cell installs the necessary libraries, ensuring compatibility with the Kaggle environment and Google Cloud services.

In [1]:
# ===================================================================
# CELL 1: SETUP - Install Libraries
# ===================================================================
print("--- [1/3] Setting up the environment: Installing libraries... ---")

# Upgrade core Google Cloud libraries to their latest versions.
!pip install --upgrade -q google-cloud-bigquery google-cloud-bigquery-storage google-cloud-aiplatform

# Downgrade protobuf to a compatible version to prevent conflicts.
!pip install -q protobuf==3.20.3

print("✅ Library setup complete.")
print("ℹ️ A kernel restart may be recommended for changes to take full effect.")

--- [1/3] Setting up the environment: Installing libraries... ---
ERROR: pip's dependency resolver does not currently take into account all the pa
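
Because the install step mixes an upgrade with a pinned `protobuf`, it can be useful to confirm which versions actually ended up in the environment. The sketch below uses only the standard library; `installed_versions` is a hypothetical helper name, and the package list simply mirrors the `pip install` lines above.

```python
from importlib.metadata import PackageNotFoundError, version

def installed_versions(packages):
    """Return {package name: installed version, or None if it is not installed}."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found

report = installed_versions([
    "google-cloud-bigquery",
    "google-cloud-bigquery-storage",
    "google-cloud-aiplatform",
    "protobuf",
])
for name, ver in report.items():
    print(f"{name}: {ver or 'not installed'}")
```

If a version is not what the install cell requested, restarting the kernel and re-running the check is a reasonable first step.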

**CELL 2: Authentication and Configuration**

This cell handles all authentication, library imports, and global configuration for the application.

In [2]:
# ===================================================================
# CELL 2: AUTHENTICATION & CONFIGURATION 
# ===================================================================
print("--- [2/3] Authenticating and setting configuration... ---")

# --- 1. Import Libraries ---
import os
import json
import pandas as pd
import requests
from google.cloud import bigquery
from google.oauth2 import service_account
from kaggle_secrets import UserSecretsClient
import ipywidgets as widgets
from IPython.display import display, HTML, clear_output

# --- 2. GCP Service Account Authentication ---
os.environ['KAGGLE_DISABLE_GCP_INTEGRATION'] = 'true'
user_secrets = UserSecretsClient()

# Build credentials from the service-account JSON stored in Kaggle Secrets.
credentials = service_account.Credentials.from_service_account_info(
    json.loads(user_secrets.get_secret("GCP_CREDENTIALS")),
    scopes=['https://www.googleapis.com/auth/cloud-platform']
)

# --- 3. Configuration & Client Initialization ---
PROJECT_ID = "semantic-art-explorer"
DATASET_ID = "art_dataset" # Must match the dataset from the pipeline notebook
REGION = "us-central1"
bq_client = bigquery.Client(project=PROJECT_ID, credentials=credentials)

print("✅ Environment ready. Clients initialized successfully.")

--- [2/3] Authenticating and setting configuration... ---
✅ Environment ready. Clients initialized successfully.
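
If the `GCP_CREDENTIALS` secret is empty or malformed, the failure surfaces deep inside the authentication call. As a hedged sketch, the secret could be checked before it is handed to `from_service_account_info`. `check_service_account_info` is a hypothetical helper, and the list of required fields is an assumption based on the fields a standard service-account key file contains. The sample values are placeholders, not real credentials.

```python
import json

# Assumption: fields normally present in a service-account key file.
REQUIRED_FIELDS = ("type", "project_id", "private_key", "client_email", "token_uri")

def check_service_account_info(raw_secret: str) -> dict:
    """Parse the secret and raise ValueError with a readable message if it looks wrong."""
    try:
        info = json.loads(raw_secret)
    except json.JSONDecodeError as exc:
        raise ValueError(f"Secret is not valid JSON: {exc}") from exc
    if not isinstance(info, dict):
        raise ValueError("Secret must be a JSON object.")
    missing = [field for field in REQUIRED_FIELDS if not info.get(field)]
    if missing:
        raise ValueError(f"Secret is missing fields: {', '.join(missing)}")
    return info

# Placeholder secret for demonstration only.
sample_secret = json.dumps({
    "type": "service_account",
    "project_id": "semantic-art-explorer",
    "private_key": "placeholder",
    "client_email": "docent@example.iam.gserviceaccount.com",
    "token_uri": "https://oauth2.googleapis.com/token",
})
info = check_service_account_info(sample_secret)
print(f"Secret looks well-formed for {info['client_email']}")
```

In the notebook, the returned dictionary would take the place of the inline `json.loads(...)` call.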


**CELL 3: Interactive AI Art Docent Application**

This final cell launches the interactive UI, allowing users to search for art and get AI-generated stories.

In [3]:
# ===================================================================
# CELL 3: PROTOTYPE - The Interactive ArtExplorer AI Docent
# ===================================================================
print("--- [3/3] Launching the Interactive AI Art Docent application... ---")

# --- 1. Define BQML Resource Names ---
TEXT_MODEL_ID = f"{PROJECT_ID}.{DATASET_ID}.art_embedding_model"
STORY_MODEL_ID = f"{PROJECT_ID}.{DATASET_ID}.art_storytelling_model"
VECTORS_TABLE_ID = f"{PROJECT_ID}.{DATASET_ID}.paintings_enriched_with_vectors"

# ===================================================================
# 2. Backend Functions - Logic for Interacting with GCP
# ===================================================================

def find_art_by_text(query_text: str) -> pd.DataFrame:
    """Finds artworks by performing a vector search in BigQuery."""
    # Pass the user's text as a query parameter instead of splicing it into the
    # SQL string, so quotes and backslashes in the input cannot break the query.
    job_config = bigquery.QueryJobConfig(
        query_parameters=[bigquery.ScalarQueryParameter("query_text", "STRING", query_text)]
    )
    sql = f"""
        WITH user_query AS (
            SELECT ml_generate_embedding_result FROM ML.GENERATE_EMBEDDING(
                MODEL `{TEXT_MODEL_ID}`, (SELECT @query_text AS content)
            )
        )
        SELECT v.base.Object_ID, v.base.Title, v.base.Artist_Display_Name, v.base.Tags, v.distance
        FROM VECTOR_SEARCH(
            TABLE `{VECTORS_TABLE_ID}`, 'enriched_text_embedding',
            (SELECT * FROM user_query), top_k => 9, distance_type => 'COSINE'
        ) AS v;
    """
    return bq_client.query(sql, job_config=job_config).to_dataframe()

def get_story_for_art(artwork_details: dict, search_query: str) -> str:
    """Generates a narrative for an artwork using the BQML Gemini model."""
    title = str(artwork_details.get('Title', 'Untitled'))
    artist = str(artwork_details.get('Artist_Display_Name', 'Unknown Artist'))
    prompt = f"""As an expert museum docent, your tone should be warm and engaging.
        1. Start with a welcoming greeting.
        2. Explain how this painting, '{title}' by {artist}, relates to the user's search for '{search_query}'.
        3. Describe the scene vividly, pointing out 1-2 interesting details.
        4. Conclude with an open-ended question to engage the viewer."""
    # Send the prompt as a query parameter. No manual quote escaping is needed,
    # so apostrophes in titles reach the model without stray backslashes.
    job_config = bigquery.QueryJobConfig(
        query_parameters=[bigquery.ScalarQueryParameter("prompt", "STRING", prompt)]
    )
    sql = f"""
        SELECT ml_generate_text_llm_result AS story FROM ML.GENERATE_TEXT(
            MODEL `{STORY_MODEL_ID}`,
            (SELECT @prompt AS prompt),
            STRUCT(0.4 AS temperature, 1024 AS max_output_tokens, TRUE AS flatten_json_output)
        );
    """
    df = bq_client.query(sql, job_config=job_config).to_dataframe()
    # Fall back to an apology if the query returns no row or an empty result.
    story = df['story'].iloc[0] if not df.empty else None
    return story or "I'm sorry, I couldn't come up with a story."

from typing import Optional  # Needed for the return annotation below

def get_direct_image_url(object_id: int) -> Optional[str]:
    """
    Fetches the direct image URL for a given artwork from the MET API.
    Prefers 'primaryImageSmall' and falls back to 'primaryImage'.
    Returns None if the request fails or no image URL is present.
    """
    try:
        api_url = f"https://collectionapi.metmuseum.org/public/collection/v1/objects/{object_id}"
        response = requests.get(api_url, timeout=5)
        response.raise_for_status()
        data = response.json()
        # Normalize an empty or missing value to None for the callers.
        return data.get('primaryImageSmall') or data.get('primaryImage') or None
    except requests.exceptions.RequestException as e:
        # Report the failure; the UI then shows its "Image not available" placeholder.
        print(f"Debug: API call failed for Object ID {object_id}. Error: {e}")
        return None

# ===================================================================
# 3. Frontend UI - Elements, Event Handlers, and Display Logic
# ===================================================================

# --- Define UI Elements ---
search_box = widgets.Text(placeholder='e.g., "a lonely night", "motherly love"', layout=widgets.Layout(width='400px'))
search_button = widgets.Button(description="Search by Text", icon="search", button_style='primary', layout=widgets.Layout(width='auto'))
back_button = widgets.Button(description="New Search", icon="arrow-left", layout=widgets.Layout(width='auto'))
output_area = widgets.Output()
main_interface = widgets.HBox([search_box, search_button])

# --- Define UI Display Functions ---
from html import escape  # Escapes artwork text before it is placed into HTML

def display_main_view(b=None):
    with output_area:
        clear_output(wait=True)
        display(main_interface)

def display_detail_view(artwork: pd.Series, query: str):
    with output_area:
        clear_output(wait=True)
        display(widgets.HTML("<h4>🎨 AI Docent is preparing your personalized tour...</h4>"))
        story = get_story_for_art(artwork.to_dict(), query)
        clear_output(wait=True)
        display(back_button)
        title = escape(str(artwork['Title']))
        artist = escape(str(artwork['Artist_Display_Name']))
        display(HTML(f"<h2>{title}</h2><h4><i>by {artist}</i></h4>"))
        image_url = get_direct_image_url(artwork['Object_ID'])
        if image_url:
            display(HTML(f"<img src='{escape(image_url)}' style='max-width:500px; height:auto; border-radius:8px;'>"))
        else:
            display(HTML("<p>[Image not available]</p>"))
        # Escape the generated text first, then turn newlines into line breaks.
        story_html = escape(str(story)).replace(chr(10), '<br>')
        display(HTML(f"<h3>AI Docent's Story</h3><p style='line-height:1.6;'>{story_html}</p>"))

def display_results_view(results_df: pd.DataFrame, query: str):
    def make_handler(details, q):
        # Bind the current row and query so each button opens its own artwork.
        return lambda b: display_detail_view(details, q)

    for _, row in results_df.iterrows():
        title = escape(str(row['Title']))
        artist = escape(str(row['Artist_Display_Name']))
        story_button = widgets.Button(description=f"Tell me more about '{row['Title']}'", layout=widgets.Layout(width='auto'))
        story_button.on_click(make_handler(row, query))
        image_url = get_direct_image_url(row['Object_ID'])
        image_html = f"<img src='{escape(image_url)}' style='width:200px; height:auto; border-radius:4px;'>" if image_url else "[Image not available]"
        # VECTOR_SEARCH returns a cosine distance; 1 - distance is the cosine similarity.
        similarity = (1 - row['distance']) * 100
        display(HTML(f"<div style='display:flex; align-items:center; gap:20px; border-bottom:1px solid #eee; padding:15px 0;'><div style='flex-shrink:0;'>{image_html}</div><div><b>{title}</b> by {artist}<br><b>Similarity Score:</b> {similarity:.1f}%</div></div>"))
        display(story_button)

# --- Define Event Handlers ---
def on_search_clicked(b):
    with output_area:
        clear_output(wait=True)
        display(main_interface)
        query = search_box.value.strip()
        if not query:
            print("Please enter a search term.")
            return
        print(f"Searching for artworks related to '{query}'...")
        try:
            results = find_art_by_text(query)
            clear_output(wait=True)
            display(main_interface)
            if results.empty:
                print(f"No results found for '{query}'.")
                return
            display_results_view(results, query)
        except Exception as e:
            print(f"An unexpected error occurred: {e}")

# --- Link Handlers to UI Events ---
search_button.on_click(on_search_clicked)
back_button.on_click(display_main_view)

# ===================================================================
# 4. Launch Application
# ===================================================================
display(HTML("<h1>🎨 ArtExplorer AI Docent</h1>"))
display(HTML("<p>Find art by describing a feeling, a scene, or a style. The AI will act as your personal guide, revealing the stories hidden in every masterpiece.</p>"))
display(output_area)
display_main_view()

--- [3/3] Launching the Interactive AI Art Docent application... ---


Output()
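
The similarity score in the results list is computed as `(1 - distance) * 100`. Cosine distance can range from 0 to 2, so a very dissimilar result would show a negative percentage. A minimal sketch of a clamped conversion follows; `similarity_percent` is a hypothetical helper, not part of the application above, and clamping at 0% is a presentation choice rather than anything BigQuery prescribes.

```python
def similarity_percent(cosine_distance: float) -> float:
    """Convert a cosine distance (0 to 2) into a display percentage clamped to 0-100."""
    raw = (1.0 - cosine_distance) * 100.0
    return max(0.0, min(100.0, raw))

for distance in (0.0, 0.25, 1.0, 1.5):
    print(f"distance={distance:.2f} -> similarity={similarity_percent(distance):.1f}%")
```

In `display_results_view`, the helper could replace the inline arithmetic without changing the scores of closely matching results.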