# OpenAI-to-Z Challenge: Checkpoint 1 Analysis

This notebook covers the analysis part of Checkpoint 1. We will:
1. Load the Sentinel-2 and LiDAR data downloaded in the previous step.
2. Identify an overlapping region.
3. Extract a raster chunk from the Sentinel-2 scene.
4. Send the chunk to an OpenAI vision model to identify potential archaeological features. The checkpoint targets `o3`; this notebook uses `gpt-4o-mini` as a proxy.

## 1. Setup and Imports

Import necessary libraries and load the OpenAI API key from the `.env` file.

In [None]:
import os
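import rasterio
import rasterio.errors   # RasterioIOError, raised when a raster path cannot be opened
import rasterio.windows  # Window, used for chunk extraction in section 3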
import laspy
import numpy as np
import pandas as pd
import openai
from dotenv import load_dotenv
from PIL import Image
import base64
from io import BytesIO

# Load environment variables (ensure .env file has OPENAI_API_KEY)
load_dotenv()

# Initialize OpenAI client
client = openai.OpenAI()

print("Libraries imported and OpenAI client initialized.")
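If `OPENAI_API_KEY` is missing from `.env`, the failure message does not point back to this notebook's setup. The guard below is a minimal sketch of an earlier, more explicit check. The helper name `require_env` is hypothetical and is not part of `openai` or `python-dotenv`.

```python
import os


def require_env(name):
    """Return the value of environment variable `name`, or raise if unset or empty.

    Hypothetical helper for this notebook; not part of any library.
    """
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(
            f"{name} is not set. Add it to the .env file and re-run load_dotenv()."
        )
    return value
```

Calling `require_env("OPENAI_API_KEY")` right after `load_dotenv()` would stop the notebook before any data is loaded if the key is absent.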

## 2. Load Downloaded Data

Specify the paths to the downloaded Sentinel-2 and LiDAR files. **Note:** You may need to update these file paths to match the exact names of the files you downloaded.

In [None]:
# Update these paths based on the output from the data acquisition notebook
sentinel_tif_path = 'data/raw/sentinel2/COPERNICUS_S2_SR_2023...TIF' # Replace with actual filename
lidar_laz_path = 'data/raw/lidar/FLB_7006_20140728_131235.laz' # Replace with actual filename

try:
    # Load Sentinel-2 raster data
    with rasterio.open(sentinel_tif_path) as src:
        sentinel_bounds = src.bounds
        sentinel_crs = src.crs
        print("Sentinel-2 file loaded successfully.")
        print(f"  - Bounds: {sentinel_bounds}")
        print(f"  - CRS: {sentinel_crs}")

    # Load LiDAR point cloud data (reading .laz needs a LAZ backend such as lazrs or laszip)
    with laspy.open(lidar_laz_path) as lidar_file:
        lidar_header = lidar_file.header
        lidar_bounds = (lidar_header.x_min, lidar_header.y_min, lidar_header.x_max, lidar_header.y_max)
        print("\nLiDAR file loaded successfully.")
        print(f"  - Bounds: {lidar_bounds}")
        print(f"  - CRS: {lidar_header.parse_crs()}")
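except (FileNotFoundError, rasterio.errors.RasterioIOError) as e:
    # rasterio raises RasterioIOError (not FileNotFoundError) when a path cannot be opened
    print(f"Error: {e}. Please ensure the file paths are correct and data was downloaded.")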

## 3. Select Overlapping Area and Create Raster Chunk

To analyze an area covered by both datasets, we need the intersection of their bounding boxes. This example skips that step: it assumes the datasets overlap and simply reads a chunk from the center of the Sentinel-2 image. A full implementation would reproject one bounding box into the other dataset's CRS and then compute the precise intersection.
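The intersection step itself is small once both boxes are expressed in the same CRS. The sketch below assumes that reprojection has already happened (rasterio offers `rasterio.warp.transform_bounds` for this) and that both boxes use the `(min_x, min_y, max_x, max_y)` order of `sentinel_bounds` and `lidar_bounds`. The function name `intersect_bounds` and the coordinates are illustrative, not taken from the real files.

```python
def intersect_bounds(a, b):
    """Intersect two (min_x, min_y, max_x, max_y) boxes given in the same CRS.

    Returns the overlapping box, or None when the boxes do not overlap.
    """
    min_x = max(a[0], b[0])
    min_y = max(a[1], b[1])
    max_x = min(a[2], b[2])
    max_y = min(a[3], b[3])
    if min_x >= max_x or min_y >= max_y:
        return None
    return (min_x, min_y, max_x, max_y)


# Made-up coordinates for illustration only
sentinel_box = (0.0, 0.0, 100.0, 100.0)
lidar_box = (60.0, 40.0, 160.0, 90.0)
overlap = intersect_bounds(sentinel_box, lidar_box)
print(overlap)  # (60.0, 40.0, 100.0, 90.0)
```

A `None` result would indicate that the two files do not cover a common area, which is worth checking before sending anything to the model.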

In [None]:
def get_raster_chunk(tif_path, window_size=(512, 512)):
    """Reads a chunk from the center of a raster file."""
    with rasterio.open(tif_path) as src:
        center_x = src.width // 2
        center_y = src.height // 2
        
        window = rasterio.windows.Window(
            center_x - window_size[0] // 2,
            center_y - window_size[1] // 2,
            window_size[0],
            window_size[1]
        )
        
        # Read red, green, blue (Sentinel-2 B4, B3, B2). rasterio band indexes are
        # 1-based positions in this file, so adjust them if the export's band order differs.
        chunk = src.read([4, 3, 2], window=window)
    return chunk

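try:
    raster_chunk = get_raster_chunk(sentinel_tif_path)
    print(f"Successfully extracted a raster chunk of shape: {raster_chunk.shape}")
except rasterio.errors.RasterioIOError:
    # sentinel_tif_path is always defined, so a bad path raises RasterioIOError here, not NameError
    print("Skipping chunk extraction due to previous file loading error.")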
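If the raster is smaller than the requested window, the offsets computed in `get_raster_chunk` become negative. The sketch below shows one way to clamp a centered window to the raster size using plain integers. The helper name `centered_window` and the example dimensions are assumptions for illustration.

```python
def centered_window(width, height, win_w=512, win_h=512):
    """Return (col_off, row_off, w, h) for a centered window clamped to the raster."""
    w = min(win_w, width)
    h = min(win_h, height)
    col_off = (width - w) // 2
    row_off = (height - h) // 2
    return col_off, row_off, w, h


print(centered_window(10980, 10980))  # (5234, 5234, 512, 512)
print(centered_window(300, 400))      # (0, 0, 300, 400)
```

The four values map directly onto the arguments of `rasterio.windows.Window(col_off, row_off, width, height)`.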

## 4. Prepare Data for OpenAI Model

The OpenAI Vision API accepts images. We'll convert our raster chunk (a NumPy array) into a base64-encoded PNG image.

In [None]:
def numpy_to_base64(np_array):
    """Converts a NumPy array to a base64 encoded PNG image."""
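    # Move channels last: (bands, rows, cols) -> (rows, cols, bands)
    np_array = np.moveaxis(np_array, 0, -1).astype(np.float64)
    # Min-max normalize to [0, 1]; guard against a constant array (zero range)
    value_range = np_array.max() - np_array.min()
    if value_range == 0:
        normalized = np.zeros_like(np_array)
    else:
        normalized = (np_array - np_array.min()) / value_range
    # Scale to 8-bit for PNG encoding
    image_data = (normalized * 255).astype(np.uint8)

    img = Image.fromarray(image_data, 'RGB')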
    buffered = BytesIO()
    img.save(buffered, format="PNG")
    return base64.b64encode(buffered.getvalue()).decode('utf-8')

try:
    base64_image = numpy_to_base64(raster_chunk)
    print("Raster chunk successfully converted to base64 encoded image.")
except NameError:
    print("Skipping image conversion due to previous error.")
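Global min-max normalization can be dominated by a few very bright pixels (clouds, for example), which leaves the rest of the image dark. A per-band percentile stretch is a common alternative. The sketch below is one possible version; the function name `percentile_stretch` and the 2/98 cut-offs are assumptions, not values from the original workflow.

```python
import numpy as np


def percentile_stretch(chunk, low=2, high=98):
    """Stretch each band of a (bands, rows, cols) array to uint8 using percentiles."""
    out = np.zeros(chunk.shape, dtype=np.uint8)
    for i, band in enumerate(chunk.astype(np.float64)):
        lo, hi = np.percentile(band, [low, high])
        if hi <= lo:
            continue  # constant band: leave as zeros
        # Values below `lo` clip to 0 and values above `hi` clip to 1
        scaled = np.clip((band - lo) / (hi - lo), 0.0, 1.0)
        out[i] = (scaled * 255).astype(np.uint8)
    return out
```

The output keeps the `(bands, rows, cols)` layout, so it can be passed to `numpy_to_base64` in place of the raw chunk.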

## 5. Call the OpenAI API

Next, we send the image to the `gpt-4o-mini` model with a prompt that asks for large geometric features and their approximate positions within the image.

In [None]:
MODEL = "gpt-4o-mini"
PROMPT = """
Analyze this satellite image of the Amazon rainforest. Look for potential hidden archaeological features.
Specifically, identify any geometric shapes like rectangles, circles, or long straight ditches that are 80 meters or more across.
For each potential feature, describe its shape and provide its approximate center coordinates within the image (using a 0-1 scale for x and y, where 0,0 is the top-left corner).
If no such features are visible, state that clearly.
"""

try:
    response = client.chat.completions.create(
        model=MODEL,
        messages=[
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": PROMPT},
                    {
                        "type": "image_url",
                        "image_url": {
                            "url": f"data:image/png;base64,{base64_image}"
                        }
                    }
                ]
            }
        ],
        temperature=0.2, # Lower temperature for more deterministic output
    )
    print("API call successful.")
except NameError:
    print("Skipping API call due to previous error.")
except openai.APIError as e:
    print(f"OpenAI API Error: {e}")
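The prompt asks for features "80 meters or more across", but the model only sees pixels and has no scale information. The sketch below builds a sentence that could be appended to the prompt. It assumes 10 m per pixel, the native resolution of Sentinel-2 bands B2, B3 and B4; an exported file may use a different pixel size, which rasterio reports as `src.res`. The helper name `describe_scale` is hypothetical.

```python
def describe_scale(chunk_px, metres_per_px, min_feature_m):
    """Summarise image footprint and the pixel size of the smallest target feature."""
    footprint_m = chunk_px * metres_per_px
    min_feature_px = min_feature_m / metres_per_px
    return (
        f"The image is {chunk_px} x {chunk_px} pixels at about {metres_per_px:g} m per pixel, "
        f"covering roughly {footprint_m / 1000:g} km per side. "
        f"A feature {min_feature_m:g} m across spans about {min_feature_px:g} pixels."
    )


# Assumed values: 512-pixel chunk, 10 m pixels, 80 m minimum feature size
print(describe_scale(512, 10, 80))
```

At these assumed values an 80 m feature is only about 8 pixels wide, which is worth keeping in mind when judging the model's answers.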

## 6. Log Model Response and ID

For reproducibility, we'll print the model's full response and the exact model ID returned by the API.

In [None]:
try:
    model_response_content = response.choices[0].message.content
    model_id_used = response.model
    
    print("--- MODEL RESPONSE ---")
    print(model_response_content)
    print("\n--- REPRODUCIBILITY INFO ---")
    print(f"Model Used: {model_id_used}")
except NameError:
    print("No response to log.")
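Printed output disappears when the notebook is cleared, so a persistent record can help with reproducibility. The sketch below assembles one JSON-serialisable record per model call. The schema and the name `build_run_record` are assumptions for illustration, and the example strings are placeholders rather than real model output.

```python
import hashlib
import json
from datetime import datetime, timezone


def build_run_record(model_id, prompt, response_text):
    """Build a JSON-serialisable record of one model call (hypothetical schema)."""
    return {
        "timestamp_utc": datetime.now(timezone.utc).isoformat(),
        "model": model_id,
        # Hash of the prompt, so prompt changes are detectable without storing it twice
        "prompt_sha256": hashlib.sha256(prompt.encode("utf-8")).hexdigest(),
        "response": response_text,
    }


# Placeholder inputs; in the notebook these would be model_id_used, PROMPT
# and model_response_content
record = build_run_record("example-model-id", "example prompt", "example response")
print(json.dumps(record, indent=2))
```

Appending `json.dumps(record)` as one line per run to a log file would give a simple history of which model ID produced which answer.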