# Copyright 2025 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# Identify Lamp Make and Model Using Gemini 2.5 Flash and RAG

 This notebook demonstrates how to identify the make and model of a lamp using the Gemini 2.5 Flash model with a Retrieval Augmented Generation (RAG) approach.

 **Notebook Description:**

 The notebook uses the multimodal capabilities of Gemini 2.5 Flash to analyze images and identify the make and model of the lamp posts they show. The pipeline has four stages:

 1.  **Image Input:** GCS URIs of the images are fetched from a BigQuery table.
 2.  **Detection and Description:** Gemini 2.5 Flash checks whether each image contains a lamp post and, if so, generates a structured description of it.
 3.  **Information Retrieval (RAG):** The description is sent to Gemini 2.5 Flash grounded on a Vertex AI RAG corpus that holds data about lamp makes and models.
 4.  **Output:** The identified make and model are parsed from the response, collected in a pandas DataFrame, and optionally saved to BigQuery.

 **Prerequisites for First-Time Users:**

 To successfully run this notebook, please ensure the following:

 1.  **Google Cloud Project:** You need an active Google Cloud project.
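 2.  **Enable APIs:**
     *   **Vertex AI API:** This is essential for accessing and using Gemini models. You can enable it through the Google Cloud Console under "APIs & Services" > "Library".
     *   **BigQuery API:** The notebook reads image URIs from BigQuery and can write its results back to it.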
 3.  **Authentication:**
     *   **Google Cloud Authentication:** Ensure your Colab environment is authenticated to your Google Cloud project. You can do this by running the following command in a code cell:
         ```python
         from google.colab import auth
         auth.authenticate_user()
         ```
         Follow the prompts to log in with your Google account that has access to the project.
     *   **Service Account (Optional but Recommended for Production):** For more robust applications, consider setting up a service account with appropriate permissions (e.g., Vertex AI User role) and downloading its JSON key file. `auth.authenticate_user()` does not take a credentials object, so pass the credentials to the client you create instead:
         ```python
         from google.cloud import bigquery
         from google.oauth2 import service_account

         # Replace 'path/to/your/service_account.json' with the actual path to your key file
         credentials = service_account.Credentials.from_service_account_file(
             'path/to/your/service_account.json',
             scopes=['https://www.googleapis.com/auth/cloud-platform'],
         )
         # Replace 'your-project-id' with your Google Cloud project ID
         bigquery_client = bigquery.Client(project='your-project-id', credentials=credentials)
         ```
 4.  **Gemini API Key (if not using Vertex AI):** This notebook calls Gemini through Vertex AI (`genai.Client(vertexai=True, ...)`), so no API key is required. You only need an API key from Google AI Studio if you adapt the code to call the Gemini API directly.
 5.  **Input Data:**
     *   **Lamp Images:** The images must be stored in Google Cloud Storage and listed in a BigQuery table that has `asset_id`, `observation_id`, `gcs_uri`, and `asset_type` columns. The dataset and table names are set in the Configuration section.
     *   **RAG Data Source:** The make and model lookup relies on an existing Vertex AI RAG corpus. Set `RAG_CORPUS_PATH` to the resource name of a corpus that your project can access.
 6.  **Required Libraries:** The notebook installs the BigQuery and Google Gen AI SDKs; `pandas` is preinstalled in Colab:
     ```python
     !pip install --upgrade google-cloud-bigquery google-genai
     ```

 By following these steps, you should be well-equipped to run this notebook and explore the capabilities of Gemini 2.5 Flash for lamp identification.

## Install Required Libraries

In [None]:
!pip install --upgrade google-cloud-bigquery google-genai

## Configuration

**Important**: Replace the placeholder values below with your actual GCP Project ID and Region.

In [None]:
PROJECT_ID = ''  # @param {type:"string"}
REGION = 'us-central1'      # @param {type:"string"}
BIGQUERY_DATASET = 'imagery_insights___preview___us' # @param {type:"string"}
BIGQUERY_TABLE = 'latest_observations' # @param {type:"string"}
ASSET_TYPE = 'ASSET_CLASS_UTILITY_POLE' # @param {type:"string"}
LIMIT = 100 # @param {type:"integer"}

In [None]:
# Import necessary libraries
from google.cloud import bigquery
from google import genai
from google.genai.types import Part, Tool, Retrieval, VertexRagStore, VertexRagStoreRagResource

# The google.genai clients used in this notebook are created with vertexai=True,
# so no separate vertexai.init() call is needed.
print("Libraries imported.")

## Fetch Image URIs from BigQuery

Next, we'll query a BigQuery table to get the GCS URIs of the images we want to classify.

In [None]:
# Updated SQL to explicitly select the required columns
BIGQUERY_SQL_QUERY = f"""
SELECT
  asset_id,
  observation_id,
  gcs_uri
FROM
  `{PROJECT_ID}.{BIGQUERY_DATASET}.{BIGQUERY_TABLE}`
WHERE
  asset_type = "{ASSET_TYPE}"
  AND gcs_uri IS NOT NULL
LIMIT {LIMIT};
"""

# Execute BigQuery Query and store the full row data
try:
    bigquery_client = bigquery.Client(project=PROJECT_ID)
    query_job = bigquery_client.query(BIGQUERY_SQL_QUERY)
    # This new variable will hold all the data needed for the DataFrame
    image_data_from_bq = [dict(row) for row in query_job]

    # For compatibility with the existing loop, we also extract the URIs
    gcs_uris = [item.get("gcs_uri") for item in image_data_from_bq if item.get("gcs_uri")]

    print(f"Successfully fetched {len(image_data_from_bq)} items from BigQuery:")
    for item in image_data_from_bq:
        print(f"  - {item}")
except Exception as e:
    print(f"An error occurred while querying BigQuery: {e}")
    image_data_from_bq = []
    gcs_uris = []
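
The query above interpolates the configuration values directly into the SQL text. The following is a minimal sketch of one way to check those values before building the query. `build_observation_query` and both regular expressions are hypothetical helpers that are not part of the notebook, and the patterns are assumptions about what valid identifiers look like in your project.

```python
import re

# Hypothetical patterns (assumptions); adjust them to your own naming rules.
_IDENTIFIER = re.compile(r"[A-Za-z0-9_\-]+")
_ASSET_TYPE = re.compile(r"[A-Z0-9_]+")


def build_observation_query(project_id: str, dataset: str, table: str,
                            asset_type: str, limit: int) -> str:
    """Builds the image query after checking every interpolated value."""
    for name in (project_id, dataset, table):
        if not _IDENTIFIER.fullmatch(name):
            raise ValueError(f"Unexpected characters in identifier: {name!r}")
    if not _ASSET_TYPE.fullmatch(asset_type):
        raise ValueError(f"Unexpected asset type: {asset_type!r}")
    if not isinstance(limit, int) or limit <= 0:
        raise ValueError("limit must be a positive integer")
    return (
        "SELECT asset_id, observation_id, gcs_uri\n"
        f"FROM `{project_id}.{dataset}.{table}`\n"
        f'WHERE asset_type = "{asset_type}" AND gcs_uri IS NOT NULL\n'
        f"LIMIT {limit}"
    )
```

The returned string can be passed to `bigquery_client.query(...)` in place of `BIGQUERY_SQL_QUERY`.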

## Define Image Classification Function

This function takes a GCS URI and a prompt, then uses the Gemini 2.5 Flash model to generate a description of the image.

In [None]:
def classify_image_with_gemini(gcs_uri: str, prompt: str) -> str:
    """
    Classifies an image using the Gemini 2.5 Flash model by directly passing its GCS URI.
    """
    MODEL = "gemini-2.5-flash" # @param {type:"string"}

    try:
        # The google.genai SDK exposes models through a client object
        client = genai.Client(vertexai=True, project=PROJECT_ID, location=REGION)
        # Reference the image in GCS by URI instead of downloading it
        image_part = Part.from_uri(file_uri=gcs_uri, mime_type="image/jpeg")
        response = client.models.generate_content(model=MODEL, contents=[image_part, prompt])
        return response.text
    except Exception as e:
        print(f"Error classifying image from URI {gcs_uri}: {e}")
        return "Classification failed."
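
The function above always sends `image/jpeg` as the MIME type. If the table can also reference other image formats, the type could be derived from the file extension instead. This is a minimal sketch using only the standard library; `guess_image_mime_type` is a hypothetical helper, and falling back to `image/jpeg` is an assumption that mirrors the notebook's default.

```python
import mimetypes


def guess_image_mime_type(gcs_uri: str, default: str = "image/jpeg") -> str:
    """Guesses an image MIME type from the file extension of a GCS URI."""
    # Only the object name after the last slash is needed for the lookup.
    filename = gcs_uri.rsplit("/", 1)[-1]
    mime_type, _ = mimetypes.guess_type(filename)
    if mime_type and mime_type.startswith("image/"):
        return mime_type
    # Unknown or non-image extension: fall back to the default.
    return default
```

The result can be passed as the `mime_type` argument when the image part is created.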

## Setup for RAG

### Subtask:
Set the resource name of the Vertex AI RAG corpus that grounds the make and model lookup. The `Tool`, `Retrieval`, and `VertexRagStore` types that reference it were imported earlier.

In [None]:
# Define the path to the RAG Corpus
RAG_CORPUS_PATH = "projects/sarthaks-lab/locations/us-east4/ragCorpora/6838716034162098176" # @param {type:"string"}

print("Configuration and imports updated for the new RAG method.")
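
The corpus path above follows the pattern `projects/<project>/locations/<location>/ragCorpora/<id>`. A small sketch like the following can catch a malformed path before any model call is made and makes the corpus location easy to inspect. `parse_rag_corpus_path` is a hypothetical helper, and the pattern is inferred from the value shown above rather than taken from an official specification.

```python
import re
from typing import Dict

# Pattern inferred from the RAG_CORPUS_PATH value in this notebook (assumption).
_RAG_CORPUS_RE = re.compile(
    r"projects/(?P<project>[^/]+)"
    r"/locations/(?P<location>[^/]+)"
    r"/ragCorpora/(?P<corpus_id>[^/]+)"
)


def parse_rag_corpus_path(path: str) -> Dict[str, str]:
    """Splits a RAG corpus resource name into project, location, and corpus ID."""
    match = _RAG_CORPUS_RE.fullmatch(path)
    if match is None:
        raise ValueError(f"Not a valid RAG corpus path: {path!r}")
    return match.groupdict()
```

For example, `parse_rag_corpus_path(RAG_CORPUS_PATH)["location"]` returns the region the corpus lives in, which you can compare with `REGION`.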

## Create a Lamp Post Check Function

### Subtask:
Develop a function that takes an image URI and uses a simple prompt to quickly determine if the image contains a lamp post.

In [None]:
def is_lamp_post(gcs_uri: str) -> bool:
    """
    Uses the Gemini model to quickly check if an image contains a lamp post.
    """
    MODEL = "gemini-2.5-flash" # @param {type:"string"}

    try:
        # The google.genai SDK exposes models through a client object
        client = genai.Client(vertexai=True, project=PROJECT_ID, location=REGION)
        image_part = Part.from_uri(file_uri=gcs_uri, mime_type="image/jpeg")
        prompt = "Does this image contain a lamp post or a street light? Answer with only 'yes' or 'no'."

        response = client.models.generate_content(model=MODEL, contents=[image_part, prompt])

        # Clean up the response and check for a 'yes'
        return response.text.strip().lower() == 'yes'
    except Exception as e:
        print(f"Error checking image {gcs_uri}: {e}")
        return False

# Example usage with a placeholder URI (will not work without a valid URI)
# print(is_lamp_post("gs://your-bucket/path/to/image.jpg"))
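
The exact comparison with `'yes'` treats answers such as `Yes.` as a negative result. The sketch below shows a slightly more tolerant check that looks only at the first word of the answer. `parse_yes_no` is a hypothetical helper, and treating every answer that does not start with "yes" as `False` is an assumption.

```python
import re


def parse_yes_no(answer: str) -> bool:
    """Returns True when the first word of the model's answer is 'yes'."""
    # Keep only alphabetic words so punctuation and markdown markers are ignored.
    words = re.findall(r"[a-z]+", answer.lower())
    return bool(words) and words[0] == "yes"
```

Inside `is_lamp_post`, the return statement could then call `parse_yes_no` on the response text.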

## Create a Detailed Description Function

This cell defines `get_detailed_description`, which takes the Google Cloud Storage (GCS) URI of an image and asks the `gemini-2.5-flash` model for a structured description of the lamp post it shows. The prompt directs the model to cover the overall style, pole material and color, lamp head shape and design, light source, and any distinctive features. If the model call fails, the function prints the error and returns the string "Description generation failed."

In [None]:
def get_detailed_description(gcs_uri: str) -> str:
    """
    Uses the Gemini model to generate a detailed description of a lamp post from an image.
    """
    MODEL = "gemini-2.5-flash" # @param {type:"string"}

    try:
        # The google.genai SDK exposes models through a client object
        client = genai.Client(vertexai=True, project=PROJECT_ID, location=REGION)
        image_part = Part.from_uri(file_uri=gcs_uri, mime_type="image/jpeg")
        prompt = """
        Describe the lamp post in this image in detail. Focus on the following features:
        - **Overall style**: (e.g., modern, vintage, ornate, simple)
        - **Pole material and color**: (e.g., black metal, grey concrete, brown wood)
        - **Lamp head shape and design**: (e.g., lantern-style, cobra head, globe, multi-light fixture)
        - **Light source**: (e.g., LED, bulb type if visible)
        - **Any distinctive features**: (e.g., decorative elements, banners, bases, arms)
        Provide a concise, structured summary of these features.
        """

        response = client.models.generate_content(model=MODEL, contents=[image_part, prompt])
        return response.text
    except Exception as e:
        print(f"Error generating description for {gcs_uri}: {e}")
        return "Description generation failed."

# Example usage with a placeholder URI (will not work without a valid URI)
# print(get_detailed_description("gs://your-bucket/path/to/image.jpg"))

## Image Classification and Description with Gemini

The two functions defined so far use the Gemini 2.5 Flash model for image analysis:

1.  `is_lamp_post`: This function quickly determines if an image contains a lamp post or street light by sending the image and a specific "yes/no" prompt to the Gemini model.
2.  `get_detailed_description`: This function provides a more in-depth analysis, generating a structured and detailed description of a lamp post within an image, focusing on specific features like style, material, color, and design.

Both read the image directly from its GCS URI. The next cell defines `get_make_and_model`, which works on the text description rather than the image: it asks Gemini 2.5 Flash, grounded on the RAG corpus at `RAG_CORPUS_PATH`, for the closest matching make and model.

In [None]:
def get_make_and_model(client: genai.Client, description: str) -> str:
    """
    Uses the RAG-grounded Gemini model to identify the make and model of a lamp post
    based on its detailed description, using the google.genai SDK.
    """
    try:
        # The client is now passed as an argument to be reused.
        prompt = f"""        You are an expert asset inspector. Your task is to find the **closest possible match** for the lamp post described below using the provided dataset.
        Do not give up easily. It is crucial that you find a match if one exists, even if it's not perfect.

        Description to analyze:
        {description}

        Compare this description against the "Description/Key_Features" and "Dimensions" fields in the dataset. Focus on specific details like \"250W High Pressure Sodium\" or dimensions like \"31.5\"D x 14.75\"W x 14\"H\".

        Provide the answer in the format "Make: [Make], Model: [Model]".
        **Only as a last resort**, if there is absolutely no resemblance to any entry, respond with "Make: Unknown, Model: Unknown".
        """

        # Construct the tool configuration with the RAG corpus
        tools = [
            Tool(
                retrieval=Retrieval(
                    vertex_rag_store=VertexRagStore(
                        rag_resources=[
                            VertexRagStoreRagResource(
                                rag_corpus=RAG_CORPUS_PATH
                            )
                        ]
                    )
                )
            )
        ]

        # Create a config object to hold the tools
        generate_content_config = genai.types.GenerateContentConfig(
            tools=tools
        )
        # Generate the content using the provided client; the config object is passed via `config`
        response = client.models.generate_content(
            model="gemini-2.5-flash",
            contents=[prompt],
            config=generate_content_config,
        )

        return response.text
    except Exception as e:
        print(f"Error identifying make and model: {e}")
        return "Make and model identification failed."
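
The processing loops later in this notebook make up to three model calls per image, so transient errors such as rate limits are plausible. The following is a generic sketch of retrying a call with exponential backoff. `call_with_retries` is a hypothetical helper and the attempt count and delays are arbitrary assumptions. Note that the notebook's functions catch exceptions themselves, so a wrapper like this would have to go around the inner `generate_content` call.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_retries(fn: Callable[[], T], max_attempts: int = 3,
                      base_delay: float = 1.0,
                      sleep: Callable[[float], None] = time.sleep) -> T:
    """Calls fn, retrying on any exception with exponentially growing delays."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except Exception as exc:
            if attempt == max_attempts:
                raise  # Out of attempts: surface the last error.
            delay = base_delay * 2 ** (attempt - 1)
            print(f"Attempt {attempt} failed ({exc}); retrying in {delay}s")
            sleep(delay)
```

The `sleep` parameter exists only so the waiting can be replaced in tests.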

## Process Images for Lamp Post Detection and Description

This section iterates through a list of image URIs. For each image, it first checks if a lamp post is present using the `is_lamp_post` function. If a lamp post is detected, it then proceeds to generate a detailed description of the lamp post using the `get_detailed_description` function. Finally, it attempts to identify the make and model of the lamp post based on the detailed description using the `get_make_and_model` function, leveraging a RAG-grounded Gemini model. The results of these steps (whether a lamp post was detected, its description, and its identified make/model) are then printed.

In [None]:
import json

# PROJECT_ID and REGION are reused from the Configuration cell

# Initialize the google.genai client once so it can be reused for every image
final_results = []
try:
    # Use the Vertex AI backend with the configured project and location
    genai_client = genai.Client(
        vertexai=True, project=PROJECT_ID, location=REGION
    )
    print("google.genai client initialized successfully.")
except Exception as e:
    print(f"Error initializing google.genai client: {e}")
    genai_client = None

# Main processing loop to analyze each image
if genai_client:
    for uri in gcs_uris:
        print(f"Processing image: {uri}")

        # Use the first function to check if it's a lamp post
        if is_lamp_post(uri):
            print("  -> Lamp post detected. Generating detailed description...")

            # If it is, get the detailed description
            description = get_detailed_description(uri)
            print(f"  -> Generated Description: {description.strip()}")

            # Use the description to find the make and model with RAG
            print("  -> Identifying make and model using RAG...")
            make_and_model = get_make_and_model(genai_client, description)
            print(f"  -> Identified Make and Model: {make_and_model.strip()}")

            final_results.append({
                "uri": uri,
                "is_lamp_post": True,
                "description": description,
                "make_and_model": make_and_model
            })
        else:
            print("  -> Not a lamp post. Skipping analysis.")
            final_results.append({
                "uri": uri,
                "is_lamp_post": False,
                "description": "N/A",
                "make_and_model": "N/A"
            })

        print("-"*50)

    print("\n--- Analysis Complete: Final Results ---")
    print(json.dumps(final_results, indent=2))
else:
    print("google.genai client was not initialized. Cannot proceed with analysis.")
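
The JSON dump above can get long when many images are processed. As a minimal sketch, the counts per outcome could be summarized as follows. `summarize_results` is a hypothetical helper, and treating any answer that contains "unknown" or "failed" as unidentified is an assumption based on the strings this notebook returns.

```python
from collections import Counter
from typing import Dict, List


def summarize_results(results: List[Dict]) -> Dict[str, int]:
    """Counts identified, unidentified, and non-lamp-post images."""
    counts: Counter = Counter()
    for row in results:
        if not row["is_lamp_post"]:
            counts["not_lamp_post"] += 1
            continue
        answer = row["make_and_model"].lower()
        # "Unknown" answers and failure messages both count as unidentified.
        if "unknown" in answer or "failed" in answer:
            counts["unidentified"] += 1
        else:
            counts["identified"] += 1
    return {key: counts[key] for key in ("identified", "unidentified", "not_lamp_post")}
```

Calling `summarize_results(final_results)` after the loop returns a dictionary with the three counts.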

## Process Images and Create DataFrame

This section runs the analysis loop, stores the results, and then displays them in a DataFrame.

In [None]:
import re
import pandas as pd

# PROJECT_ID and REGION are reused from the Configuration cell

# This list will hold the structured data for the final DataFrame
dataframe_results = []

# Initialize the google.genai client
try:
    genai_client = genai.Client(
        vertexai=True, project=PROJECT_ID, location=REGION
    )
    print("google.genai client initialized successfully.")
except Exception as e:
    print(f"Error initializing google.genai client: {e}")
    genai_client = None

# Main processing loop
if 'genai_client' in locals() and genai_client:
    # Iterate through the full data fetched from BigQuery
    for item in image_data_from_bq:
        uri = item.get("gcs_uri")
        asset_id = item.get("asset_id")
        observation_id = item.get("observation_id")

        if not uri:
            print(f"Skipping item with missing gcs_uri: {item}")
            continue

        print(f"Processing image for asset {asset_id}: {uri}")

        if is_lamp_post(uri):
            print("  -> Lamp post detected. Generating detailed description...")
            description = get_detailed_description(uri)

            print("  -> Identifying make and model using RAG...")
            make_and_model_str = get_make_and_model(genai_client, description)
            print(f"  -> Identified: {make_and_model_str.strip()}")

            # Use regex to find the Make and Model anywhere in the response string,
            # so a short preamble before "Make:" does not break parsing
            match = re.search(r"Make:\s*(.*?),?\s*Model:\s*(.*)", make_and_model_str.strip(), re.IGNORECASE)

            if match:
                make = match.group(1).strip()
                model = match.group(2).strip()

                # Add to results list only if a specific make/model was found
                if make.lower() not in ["unknown", "n/a"] and model.lower() not in ["unknown", "n/a", "generic incandescent fixture", "generic mv cobra head"]:
                    print(f"  -> SUCCESS: Found Make: {make}, Model: {model}")
                    dataframe_results.append({
                        "asset_id": asset_id,
                        "observation_id": observation_id,
                        "make": make,
                        "model": model
                    })
                else:
                    print("  -> INFO: Generic or unknown make/model found. Skipping for DataFrame.")
            else:
                print("  -> INFO: Could not parse make and model from response. Skipping for DataFrame.")
        else:
            print("  -> Not a lamp post. Skipping analysis.")
        print("-" * 50)

    print("\n--- Analysis Complete --- ")
    if dataframe_results:
        results_df = pd.DataFrame(dataframe_results)
        print("\n--- Identified Lamp Posts --- ")
        display(results_df)
    else:
        print("\n--- No specific lamp post models were identified. ---")
else:
    print("google.genai client was not initialized. Cannot proceed with analysis.")
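
The loop above depends on the model answering in the exact form `Make: [Make], Model: [Model]`. The sketch below shows a more tolerant parser that also accepts markdown bold markers, a preamble sentence, or the two fields on separate lines. `parse_make_and_model` is a hypothetical helper, the handled variations are assumptions about how the model might deviate, and the names in the usage example are made up.

```python
import re
from typing import Optional, Tuple

_MAKE_MODEL_RE = re.compile(
    r"Make:\s*(?P<make>[^\n]+?)\s*,?\s*Model:\s*(?P<model>[^\n]+)",
    re.IGNORECASE,
)


def parse_make_and_model(text: str) -> Optional[Tuple[str, str]]:
    """Extracts (make, model) from a model response, or None if absent."""
    # Drop markdown emphasis markers such as **Make:** before matching.
    cleaned = text.replace("*", "")
    match = _MAKE_MODEL_RE.search(cleaned)
    if match is None:
        return None
    make = match.group("make").strip()
    # Remove a sentence-ending period that may follow the model name.
    model = match.group("model").strip().rstrip(" .")
    return make, model
```

For example, `parse_make_and_model("**Make:** Acme Lighting, **Model:** X-100.")` yields the pair `("Acme Lighting", "X-100")`.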

## Save the dataframe to BigQuery

This cell defines `save_dataframe_to_bigquery`, which appends the results DataFrame to a BigQuery table.
It takes the DataFrame, the Google Cloud Project ID, the dataset ID, and the table ID as parameters.
The write uses `DataFrame.to_gbq`, which requires the `pandas-gbq` package.
If the DataFrame is missing or empty, the function returns without writing anything.
Error handling is included in case the upload fails.

In [None]:
def save_dataframe_to_bigquery(df, project_id, dataset_id, table_id):
    """
    Saves a pandas DataFrame to a specified BigQuery table.

    Args:
        df (pd.DataFrame): The DataFrame to save.
        project_id (str): Your Google Cloud project ID.
        dataset_id (str): The BigQuery dataset ID.
        table_id (str): The BigQuery table ID to create or append to.
    """
    if df is None or df.empty:
        print("The DataFrame is empty or does not exist. Nothing to save to BigQuery.")
        return

    try:
        table_ref = f"{project_id}.{dataset_id}.{table_id}"
        print(f"Attempting to save DataFrame to BigQuery table: {table_ref}")
        df.to_gbq(destination_table=table_ref, project_id=project_id, if_exists='append')
        print(f"Successfully saved DataFrame to BigQuery table: {table_ref}")
    except Exception as e:
        print(f"An error occurred while saving to BigQuery: {e}")

# --- Example Usage (edit and uncomment to run) ---
# if 'results_df' in locals() and not results_df.empty:
#     BIGQUERY_DATASET = 'your_dataset'      # @param {type:"string"}
#     BIGQUERY_TABLE = 'identified_lamp_posts'   # @param {type:"string"}
#     save_dataframe_to_bigquery(results_df, PROJECT_ID, BIGQUERY_DATASET, BIGQUERY_TABLE)
# else:
#     print("DataFrame 'results_df' not found or is empty. Skipping BigQuery upload.")
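
Because the function appends to the table, a row that occurs twice in one run is written twice. As a minimal sketch, repeated rows could be removed from `dataframe_results` before the DataFrame is built. `dedupe_results` is a hypothetical helper, and using `asset_id` plus `observation_id` as the identity of a row is an assumption. It does not detect rows that are already stored in the BigQuery table from earlier runs.

```python
from typing import Dict, Iterable, List


def dedupe_results(rows: Iterable[Dict]) -> List[Dict]:
    """Keeps the first row for each (asset_id, observation_id) pair."""
    seen = set()
    unique: List[Dict] = []
    for row in rows:
        key = (row["asset_id"], row["observation_id"])
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique
```

The DataFrame can then be created with `pd.DataFrame(dedupe_results(dataframe_results))`.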