<a href="https://colab.research.google.com/github/sinajahangir/Flood-Resilience-Analysis/blob/main/FloodAgent.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

This notebook implements an AI agent that uses an LLM (Gemini) to extract a latitude/longitude pair from a natural-language user prompt. After extracting and validating the coordinates, the agent searches a set of flood pixel coordinates with a k-d tree, finds the flood pixel closest to the requested location, and reports that pixel's coordinates and its distance from the location.
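
As a rough illustration of the deterministic part of this flow, the sketch below parses a reply of the form `LATITUDE=value, LONGITUDE=value` and then finds the nearest flood pixel by a linear scan. It uses only the standard library so that it runs without an API key. The names `parse_reply` and `nearest_pixel` and the sample pixels are hypothetical and are not part of the agent defined in this notebook, which uses a k-d tree rather than a brute-force search.

```python
import math
import re

# Hypothetical helper pattern: matches "LATITUDE=40.7, LONGITUDE=-74.0".
_COORD_RE = re.compile(
    r"LATITUDE\s*=\s*(-?\d*\.?\d+)\s*,\s*LONGITUDE\s*=\s*(-?\d*\.?\d+)",
    re.IGNORECASE,
)


def parse_reply(reply: str):
    """Return (lat, lon) if the reply holds a valid pair, otherwise None."""
    match = _COORD_RE.search(reply)
    if match is None:
        return None
    lat, lon = float(match.group(1)), float(match.group(2))
    # Reject values outside the valid geographic ranges.
    if -90 <= lat <= 90 and -180 <= lon <= 180:
        return lat, lon
    return None


def nearest_pixel(lat, lon, pixels):
    """Linear scan over [lon, lat] pixels; returns (pixel, distance in degrees)."""
    if not pixels:
        return None, None
    best = min(pixels, key=lambda p: math.hypot(p[0] - lon, p[1] - lat))
    return best, math.hypot(best[0] - lon, best[1] - lat)


# Illustrative data only: pixels are stored as [lon, lat].
sample_pixels = [[-118.2430, 34.0515], [-118.2460, 34.0555], [-118.2400, 34.0500]]
coords = parse_reply("LATITUDE=34.0520, LONGITUDE=-118.2438")
pixel, dist = (None, None)
if coords is not None:
    pixel, dist = nearest_pixel(coords[0], coords[1], sample_pixels)
    print(pixel, dist)
```

A linear scan is adequate for a handful of pixels; a k-d tree becomes worthwhile once the flood raster contributes many thousands of points.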

In [None]:
# Step 1: Import Libraries
import pandas as pd
import numpy as np
from scipy.spatial import cKDTree # Efficient nearest neighbor search
import google.generativeai as genai
from google.colab import userdata
import os
import re # For parsing LLM output

In [None]:
# --- LLM Setup ---
try:
    # Using Colab Secrets for API key management
    api_key = userdata.get('GOOGLE_API_KEY')
    genai.configure(api_key=api_key)
    print("Gemini API Key configured.")
    # Initialize the generative model
    llm_model = genai.GenerativeModel('gemini-1.5-flash-latest') # Use a model good at instruction following/extraction
    print(f"LLM Model '{llm_model.model_name}' initialized.")
except userdata.SecretNotFoundError:
    print("ERROR: Gemini API key ('GOOGLE_API_KEY') not found in Colab Secrets.")
    print("Please add your API key via the 'Secrets' tab (key icon) on the left.")
    llm_model = None
except Exception as e:
    print(f"An error occurred during Gemini setup: {e}")
    llm_model = None

Gemini API Key configured.
LLM Model 'models/gemini-1.5-flash-latest' initialized.


In [None]:
class CoordinateFloodProximityAgent:
    """
    An agent that uses an LLM to extract latitude and longitude from a user prompt
    and finds the coordinates of the closest flood pixel to that location.
    It retains the building data for context but does not primarily search based on it.
    """

    def __init__(self, buildings_df: pd.DataFrame, building_id_col: str,
                 lat_col: str, lon_col: str,
                 flood_pixel_coords: np.ndarray):
        """
        Initializes the agent with building and flood location data.

        Args:
            buildings_df (pd.DataFrame): DataFrame containing building information.
                                         Used for context if needed, but not primary search key.
            building_id_col (str): Name of the column containing unique building IDs.
            lat_col (str): Name of the column containing the building latitude or Y-coordinate.
            lon_col (str): Name of the column containing the building longitude or X-coordinate.
            flood_pixel_coords (np.ndarray): A NumPy array of shape (N, 2) where N
                                             is the number of flood pixels, and each
                                             row is [x, y] or [lon, lat] coordinates.
        """
        if llm_model is None:
            raise ValueError("LLM Model (Gemini) is not initialized. Cannot create agent.")
        if not isinstance(buildings_df, pd.DataFrame):
            raise TypeError("buildings_df must be a pandas DataFrame.")
        if not isinstance(flood_pixel_coords, np.ndarray) or flood_pixel_coords.ndim != 2 or flood_pixel_coords.shape[1] != 2:
            raise ValueError("flood_pixel_coords must be a 2D NumPy array with shape (N, 2).")
        expected_cols = [building_id_col, lat_col, lon_col]
        available_cols = [col for col in expected_cols if col in buildings_df.columns]
        if len(available_cols) < len(expected_cols):
            print(f"Warning: Building DataFrame does not contain all expected columns ('{building_id_col}', '{lat_col}', '{lon_col}'). Contextual building info might be limited.")
            # Allow proceeding without full building info if only flood search is critical

        self.llm = llm_model
        # Store building info primarily for potential context lookup (optional).
        # Select only the columns that exist so a missing column does not raise a KeyError.
        self.buildings_data = buildings_df[available_cols].copy()
        self.building_id_col = building_id_col
        self.lat_col = lat_col
        self.lon_col = lon_col

        # Store flood pixel coordinates
        self.flood_pixels = flood_pixel_coords
        print(f"Stored {len(self.flood_pixels)} flood pixel coordinates.")

        # Build a k-d tree for *flood pixels* for efficient nearest neighbor search
        if len(self.flood_pixels) > 0:
            print("Building k-d tree for flood pixels...")
            # Assuming flood_pixel_coords are [Lon, Lat] or [X, Y]
            self.flood_kdtree = cKDTree(self.flood_pixels)
            print("Flood k-d tree built.")
        else:
            print("Warning: No flood pixels provided. Closest pixel search will not work.")
            self.flood_kdtree = None


    def _extract_lat_lon_from_prompt(self, prompt: str) -> tuple[float, float] | None:
        """Uses the LLM to extract latitude and longitude from the user prompt."""
        # (This function is identical to the one in the previous LocationMappingAgent)

        llm_prompt = f"""
        Analyze the following user query and extract the single pair of geographical coordinates (latitude and longitude) mentioned.
        Latitude must be between -90 and 90. Longitude must be between -180 and 180.
        Pay attention to signs (N/S, E/W) or negative values indicating direction.

        User query: "{prompt}"

        Respond ONLY with the extracted coordinates in the format:
        LATITUDE=value, LONGITUDE=value
        For example: LATITUDE=40.7128, LONGITUDE=-74.0060

        If you cannot reliably extract both a valid latitude and a valid longitude from the query, respond with the exact word "UNKNOWN".
        """
        print(f"Sending prompt to LLM for coordinate extraction:\n---\n{llm_prompt}\n---")

        try:
            response = self.llm.generate_content(llm_prompt)
            extracted_text = response.text.strip()
            print(f"LLM response for coordinate extraction: '{extracted_text}'")

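            # Tolerate case differences and trailing punctuation in the sentinel reply
            if extracted_text.upper().startswith("UNKNOWN"):
                print("LLM indicated coordinates could not be found.")
                return None

            # re.search tolerates stray text around the answer; each number may carry
            # an optional sign and at most one decimal point
            match = re.search(r"LATITUDE\s*=\s*(-?\d*\.?\d+)\s*,\s*LONGITUDE\s*=\s*(-?\d*\.?\d+)", extracted_text, re.IGNORECASE)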

            if match:
                lat_str, lon_str = match.groups()
                try:
                    latitude = float(lat_str)
                    longitude = float(lon_str)
                    if -90 <= latitude <= 90 and -180 <= longitude <= 180:
                        print(f"Successfully extracted and validated coordinates: Lat={latitude}, Lon={longitude}")
                        return latitude, longitude
                    else:
                        print(f"Extracted coordinates out of valid range: Lat={latitude}, Lon={longitude}")
                        return None
                except ValueError:
                    print(f"Could not convert extracted strings to float: '{lat_str}', '{lon_str}'")
                    return None
            else:
                print(f"LLM response did not match expected format 'LATITUDE=..., LONGITUDE=...': '{extracted_text}'")
                return None

        except Exception as e:
            print(f"Error during LLM call or parsing for coordinate extraction: {e}")
            return None


    def _find_closest_flood_pixel(self, target_lat: float, target_lon: float) -> tuple[np.ndarray | None, float | None]:
        """
        Finds the nearest flood pixel using the flood k-d tree.

        Args:
            target_lat: The target latitude.
            target_lon: The target longitude.

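        Returns:
            A tuple (closest_pixel_coords, distance) or (None, None) if error/no data.
            closest_pixel_coords is [longitude, latitude] or [x, y] as in the input array.
            distance is the Euclidean distance in the units of the input coordinates
            (degrees for [lon, lat] input, not metres or kilometres).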
        """
        if self.flood_kdtree is None or len(self.flood_pixels) == 0:
            print("No flood pixel data or k-d tree available for search.")
            return None, None

        # Ensure target coords are in the correct order for the tree ([Lon, Lat] or [X,Y])
        target_point = np.array([target_lon, target_lat])

        try:
            # Query the k-d tree: find the 1 nearest neighbor
            distance, index = self.flood_kdtree.query(target_point, k=1)
            closest_pixel_coords = self.flood_pixels[index]
            print(f"Closest flood pixel found: Coords={closest_pixel_coords}, Distance={distance:.4f}")
            return closest_pixel_coords, distance
        except Exception as e:
            print(f"Error during flood k-d tree query: {e}")
            return None, None


    def find_closest_flood_pixel_to_location(self, user_prompt: str) -> str:
        """
        Processes a user prompt to extract coordinates and find the nearest flood pixel.

        Args:
            user_prompt (str): The natural language query from the user containing coordinates.

        Returns:
            str: A natural language response summarizing the findings.
        """
        # 1. Extract Lat/Lon using LLM
        extracted_coords = self._extract_lat_lon_from_prompt(user_prompt)

        if extracted_coords is None:
            # Ask LLM to formulate a response about not finding coordinates
            try:
                clarification_prompt = (
                    f"The user asked: '{user_prompt}'. I could not extract valid latitude and longitude "
                    "coordinates from this query. Please formulate a polite response asking the user to "
                    "provide clear coordinates (e.g., 'latitude 40.7, longitude -74.0')."
                )
                response = self.llm.generate_content(clarification_prompt)
                return response.text.strip()
            except Exception as e:
                print(f"LLM failed to generate clarification response: {e}")
                return "I couldn't identify valid geographic coordinates (latitude and longitude) in your request. Please provide them clearly."

        target_lat, target_lon = extracted_coords

        # 2. Find the closest FLOOD PIXEL computationally
        closest_pixel_coords, distance = self._find_closest_flood_pixel(target_lat, target_lon)

        # 3. Formulate the final response (optionally using LLM)
        if closest_pixel_coords is not None and distance is not None:
            # Assuming coords are [Lon, Lat] for reporting
            summary = (f"For the location you provided (approx. Latitude={target_lat:.6f}, Longitude={target_lon:.6f}), "
                       f"the closest flood pixel recorded in the data is at Latitude={closest_pixel_coords[1]:.6f}, Longitude={closest_pixel_coords[0]:.6f}. "
                       f"The calculated distance is about {distance:.4f} units (based on the coordinate system).")

            # Optional: Use LLM to make the response more conversational
            try:
                response_prompt = f"""
                The user asked about a location: "{user_prompt}"
                I extracted the coordinates as: Latitude={target_lat:.6f}, Longitude={target_lon:.6f}.
                I searched the flood data and found the closest recorded flood pixel is at coordinates Latitude={closest_pixel_coords[1]:.6f}, Longitude={closest_pixel_coords[0]:.6f}.
                The distance between the user's point and this flood pixel is {distance:.4f} units (in the data's coordinate system).

                Generate a concise, natural language response for the user, incorporating this information. Mention the user's coordinates, the location of the closest flood pixel, and the distance.
                """
                final_response = self.llm.generate_content(response_prompt)
                return final_response.text.strip()
            except Exception as e:
                print(f"LLM failed to generate final response: {e}")
                # Fallback to the pre-formatted summary
                return summary

        elif self.flood_kdtree is None:
            # Case where no flood data was loaded
            return f"I understood the location (Lat={target_lat:.6f}, Lon={target_lon:.6f}), but I don't have any flood pixel data loaded to search against."
        else:
            # Case where search failed for other reasons
            return f"I extracted the coordinates (Lat={target_lat:.6f}, Lon={target_lon:.6f}), but encountered an error trying to find the closest flood pixel in the data."
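
The distance returned by the k-d tree query is a Euclidean distance in the units of the input coordinates, so for [lon, lat] data it is expressed in degrees. One possible way to report it in kilometres is the haversine formula, sketched below under the assumption of a spherical Earth with a mean radius of 6371.0088 km. The function name `haversine_km` is hypothetical and is not part of the agent above.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # Assumption: mean Earth radius, spherical approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# Query point from the first test prompt and the placeholder pixel [-118.2430, 34.0515].
d_km = haversine_km(34.0520, -118.2438, 34.0515, -118.2430)
print(f"{d_km:.3f} km")
```

This converts only the reported distance. The nearest neighbour found in degree space can differ from the nearest one by great-circle distance, particularly at high latitudes, so projecting the data to a metric CRS before building the tree may be preferable when that matters.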

In [None]:
# Step 3: Example Usage (requires placeholder data)

# --- Create Placeholder Data (REPLACE with your actual data loading) ---
print("\n--- Creating Placeholder Data ---")
# Placeholder Buildings DataFrame (still useful for potential context, though not primary search key)
building_data = {
    'Bldg_ID': ['Office_Tower_A', 'Main_Library', 'City_Hall', 'Warehouse_7'],
    'Lat': [34.0522, 34.0550, 34.0530, 34.0400],
    'Lon': [-118.2437, -118.2450, -118.2440, -118.2500] # Example: LA area WGS84
}
buildings_df = pd.DataFrame(building_data)
print("Placeholder Buildings DF:\n", buildings_df)

# Placeholder Flood Pixel Coordinates (NumPy array, shape=(N, 2))
# IMPORTANT: Assume these coordinates are in the same CRS as buildings, e.g., [Lon, Lat] for WGS84
flood_pixels = np.array([
    [-118.2430, 34.0515], # South of Office_Tower_A
    [-118.2460, 34.0555], # North of Main_Library
    [-118.2445, 34.0525], # Between City_Hall and Office_Tower_A
    [-118.2510, 34.0395], # Near Warehouse_7
    [-118.2400, 34.0500]  # East of the main group
])
print("\nPlaceholder Flood Pixel Coords (Lon, Lat):\n", flood_pixels)
print("--- End Placeholder Data ---\n")

# --- Run Example IF LLM is available and data is ready ---
if llm_model and not buildings_df.empty and len(flood_pixels) > 0:
    # Instantiate the agent
    try:
        agent = CoordinateFloodProximityAgent(
            buildings_df=buildings_df, # Provide buildings for context if needed later
            building_id_col='Bldg_ID',
            lat_col='Lat',
            lon_col='Lon',
            flood_pixel_coords=flood_pixels # Essential: Flood pixel data
        )

        # --- Test Queries ---
        print("\n--- Testing Queries ---")

        prompt1 = "What's the closest flood pixel to latitude 34.0520, longitude -118.2438?"
        print(f"\nUser Query 1: {prompt1}")
        response1 = agent.find_closest_flood_pixel_to_location(prompt1)
        print(f"Agent Response 1:\n{response1}")

        prompt2 = "find flood near lat: 34.0 N, lon: 118.25 W" # Different format, N/W
        print(f"\nUser Query 2: {prompt2}")
        response2 = agent.find_closest_flood_pixel_to_location(prompt2)
        print(f"Agent Response 2:\n{response2}")

        prompt3 = "Any flooding around 34.055 North, -118.246 West?" # Ambiguous: negative longitude combined with 'West'
        print(f"\nUser Query 3: {prompt3}")
        response3 = agent.find_closest_flood_pixel_to_location(prompt3)
        print(f"Agent Response 3:\n{response3}")

        prompt4 = "Where is the nearest water if I'm at Buckingham Palace?" # No coordinates
        print(f"\nUser Query 4: {prompt4}")
        response4 = agent.find_closest_flood_pixel_to_location(prompt4)
        print(f"Agent Response 4:\n{response4}")


        print("\n--- End Queries ---")

    except Exception as e:
        print(f"\nAn error occurred during agent instantiation or testing: {e}")

elif not llm_model:
    print("\nSkipping example usage: LLM model not initialized.")
else:
    print("\nSkipping example usage: Placeholder data (buildings or flood pixels) is empty or missing.")


--- Creating Placeholder Data ---
Placeholder Buildings DF:
           Bldg_ID      Lat       Lon
0  Office_Tower_A  34.0522 -118.2437
1    Main_Library  34.0550 -118.2450
2       City_Hall  34.0530 -118.2440
3     Warehouse_7  34.0400 -118.2500

Placeholder Flood Pixel Coords (Lon, Lat):
 [[-118.243    34.0515]
 [-118.246    34.0555]
 [-118.2445   34.0525]
 [-118.251    34.0395]
 [-118.24     34.05  ]]
--- End Placeholder Data ---

Stored 5 flood pixel coordinates.
Building k-d tree for flood pixels...
Flood k-d tree built.

--- Testing Queries ---

User Query 1: What's the closest flood pixel to latitude 34.0520, longitude -118.2438?
Sending prompt to LLM for coordinate extraction:
---

        Analyze the following user query and extract the single pair of geographical coordinates (latitude and longitude) mentioned.
        Latitude must be between -90 and 90. Longitude must be between -180 and 180.
        Pay attention to signs (N/S, E/W) or negative values indicating direction.
