# OpenAI-to-Z Challenge: Checkpoint 1 Analysis

This notebook covers the analysis part of Checkpoint 1. We will:
1. Load the DTM and LiDAR data downloaded in the previous step.
2. Extract a raster chunk from the center of the DTM.
3. Encode the chunk as a PNG and send it to an OpenAI model (`gpt-4o-mini`, used here as a proxy for `o3`) to identify potential archaeological features.
4. Log the model's response and the model ID for reproducibility.

## 1. Setup and Imports

Import necessary libraries and load the OpenAI API key from the `.env` file.

In [1]:
import os
import rasterio
import laspy
import numpy as np
import openai
from dotenv import load_dotenv
from PIL import Image
import base64
from io import BytesIO

# Load environment variables (ensure .env file has OPENAI_API_KEY)
load_dotenv()

# Initialize OpenAI client
client = openai.OpenAI()

print("Libraries imported and OpenAI client initialized.")

Libraries imported and OpenAI client initialized.


## 2. Load Downloaded Data

Specify the paths to the downloaded DTM and LiDAR files, then open each file and print its bounds and CRS.

In [2]:
# Update these paths to match where the data was extracted
dtm_raster_path = '/Users/shg/Projects/openai-a-z-challenge/data/raw/TAL_A01_2018/TAL_A01_2018_DTM/TAL01L0001C0001.grd'
lidar_las_path = '/Users/shg/Projects/openai-a-z-challenge/data/raw/TAL_A01_2018/TAL_A01_2018_LAS/TAL01L0001C0001.las'

print(f"Attempting to load DTM from: {dtm_raster_path}")
print(f"Attempting to load LiDAR from: {lidar_las_path}")

try:
    # Open the DTM raster and read its georeferencing metadata
    with rasterio.open(dtm_raster_path) as src:
        dtm_bounds = src.bounds
        dtm_crs = src.crs
        print("DTM file loaded successfully.")
        print(f"  - Bounds: {dtm_bounds}")
        print(f"  - CRS: {dtm_crs}")

    # Open the LiDAR point cloud and read its extent from the header
    with laspy.open(lidar_las_path) as lidar_file:
        lidar_header = lidar_file.header
        lidar_bounds = (lidar_header.x_min, lidar_header.y_min, lidar_header.x_max, lidar_header.y_max)
        print("\nLiDAR file loaded successfully.")
        print(f"  - Bounds: {lidar_bounds}")
        print(f"  - CRS: {lidar_header.parse_crs()}")
except (FileNotFoundError, rasterio.errors.RasterioIOError) as e:
    print(f"Error: {e}. Please ensure the file paths are correct and data was downloaded and extracted.")

Attempting to load DTM from: /Users/shg/Projects/openai-a-z-challenge/data/raw/TAL_A01_2018/TAL_A01_2018_DTM/TAL01L0001C0001.grd
Attempting to load LiDAR from: /Users/shg/Projects/openai-a-z-challenge/data/raw/TAL_A01_2018/TAL_A01_2018_LAS/TAL01L0001C0001.las
DTM file loaded successfully.
  - Bounds: BoundingBox(left=610108.0, bottom=8866715.0, right=611109.0, top=8867716.0)
  - CRS: None

LiDAR file loaded successfully.
  - Bounds: (np.float64(610460.18), np.float64(8866715.790000001), np.float64(611108.6), np.float64(8867507.97))
  - CRS: None
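
Both files report `CRS: None`, so any overlap check has to rely on the raw coordinates. The sketch below intersects the two bounding boxes printed above. It assumes, without confirmation from the file metadata, that both files use the same projected coordinate system. The helper name `intersect_bounds` is illustrative and is not part of rasterio or laspy.

```python
from typing import Optional, Tuple

Bounds = Tuple[float, float, float, float]  # (left, bottom, right, top)


def intersect_bounds(a: Bounds, b: Bounds) -> Optional[Bounds]:
    """Return the intersection of two axis-aligned boxes, or None if disjoint."""
    left, bottom = max(a[0], b[0]), max(a[1], b[1])
    right, top = min(a[2], b[2]), min(a[3], b[3])
    if left >= right or bottom >= top:
        return None
    return (left, bottom, right, top)


# Bounds copied from the output above; assumes both share one CRS.
dtm_box = (610108.0, 8866715.0, 611109.0, 8867716.0)
lidar_box = (610460.18, 8866715.79, 611108.6, 8867507.97)

overlap = intersect_bounds(dtm_box, lidar_box)
print(overlap)
```

With these values the LiDAR extent lies entirely inside the DTM extent, so the overlap equals the LiDAR box.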


## 3. Select Overlapping Area and Create Raster Chunk

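We'll read a 512 × 512 pixel chunk from the center of the DTM raster for analysis. The code below takes the chunk from the DTM alone; it does not use the LiDAR bounds.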

In [3]:
def get_raster_chunk(raster_path, window_size=(512, 512)):
    """Read a window of `window_size` (width, height) pixels from the raster's center.

    Returns an array of shape (3, height, width): band 1 repeated three times.
    """
    with rasterio.open(raster_path) as src:
        center_x = src.width // 2
        center_y = src.height // 2

        # Window takes (col_off, row_off, width, height)
        window = rasterio.windows.Window(
            center_x - window_size[0] // 2,
            center_y - window_size[1] // 2,
            window_size[0],
            window_size[1]
        )

        # Read the single DTM band and replicate it to 3 channels so it can be saved as an RGB image
        band = src.read(1, window=window)
        chunk = np.stack([band] * 3, axis=0)
    return chunk

try:
    raster_chunk = get_raster_chunk(dtm_raster_path)
    print(f"Successfully extracted a raster chunk of shape: {raster_chunk.shape}")
except (NameError, FileNotFoundError, rasterio.errors.RasterioIOError) as e:
    print(f"Skipping chunk extraction due to previous error: {e}")

Successfully extracted a raster chunk of shape: (3, 512, 512)
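
`get_raster_chunk` assumes the raster is at least as large as the requested window; on a smaller raster the computed offsets become negative. One possible guard, sketched below, shrinks the window to fit. The helper `centered_window` and the example raster sizes are hypothetical, and the function returns plain integers rather than a rasterio `Window`.

```python
def centered_window(width, height, win_w=512, win_h=512):
    """Return (col_off, row_off, w, h) for a window centered in the raster.

    The window is shrunk when the raster is smaller than the requested size,
    so the offsets never go negative.
    """
    w = min(win_w, width)
    h = min(win_h, height)
    col_off = width // 2 - w // 2
    row_off = height // 2 - h // 2
    return col_off, row_off, w, h


# Hypothetical raster sizes, for illustration only.
large = centered_window(1001, 1001)
small = centered_window(300, 400)
print(large)
print(small)
```

The four values can be passed in the same order to `rasterio.windows.Window`, which takes the column offset, row offset, width, and height.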


## 4. Prepare Data for OpenAI Model

The OpenAI Vision API accepts images, including base64-encoded data URLs. We'll min-max normalize our raster chunk (a NumPy array) to 8-bit values and convert it into a base64-encoded PNG image.

In [4]:
def numpy_to_base64(np_array):
    """Convert a (3, H, W) NumPy array to a base64-encoded PNG image."""
    # Move channels to the last dimension: (3, H, W) -> (H, W, 3)
    np_array = np.moveaxis(np_array, 0, -1)
    # Min-max normalize to [0, 1]; a constant array maps to zeros to avoid division by zero
    min_val, max_val = np_array.min(), np_array.max()
    if max_val == min_val:
        normalized = np.zeros_like(np_array)
    else:
        normalized = (np_array - min_val) / (max_val - min_val)
    # Convert to 8-bit integers
    image_data = (normalized * 255).astype(np.uint8)

    # Pillow infers RGB mode from the (H, W, 3) uint8 array
    img = Image.fromarray(image_data)
    buffered = BytesIO()
    img.save(buffered, format="PNG")
    return base64.b64encode(buffered.getvalue()).decode('utf-8')

try:
    base64_image = numpy_to_base64(raster_chunk)
    print("Raster chunk successfully converted to base64 encoded image.")
except NameError:
    print("Skipping image conversion due to previous error.")

Raster chunk successfully converted to base64 encoded image.
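
Min-max scaling is sensitive to outliers: a single nodata sentinel or NaN in the chunk can compress the real terrain into a narrow band of gray values. The sketch below is one possible alternative, not what this notebook ran. It masks an assumed nodata value and stretches between the 2nd and 98th percentiles. The function name `stretch_to_uint8`, the percentile choices, and the `-9999.0` sentinel are all assumptions.

```python
import numpy as np


def stretch_to_uint8(band, nodata=None, low_pct=2.0, high_pct=98.0):
    """Scale a 2-D elevation array to 0-255 using a percentile stretch."""
    data = band.astype(np.float64)  # astype copies, so the input is untouched
    if nodata is not None:
        data[data == nodata] = np.nan  # treat the sentinel as missing
    valid = data[np.isfinite(data)]
    if valid.size == 0:
        return np.zeros(band.shape, dtype=np.uint8)
    lo, hi = np.percentile(valid, [low_pct, high_pct])
    if hi == lo:
        return np.zeros(band.shape, dtype=np.uint8)
    scaled = np.clip((data - lo) / (hi - lo), 0.0, 1.0)
    scaled = np.nan_to_num(scaled, nan=0.0)  # missing cells become black
    return (scaled * 255).astype(np.uint8)


# Synthetic example: a smooth ramp with one nodata cell (assumed -9999.0).
demo = np.linspace(100.0, 110.0, 64).reshape(8, 8)
demo[0, 0] = -9999.0
out = stretch_to_uint8(demo, nodata=-9999.0)
print(out.dtype, out.shape, out.min(), out.max())
```

The result is a single 2-D band; it would still need to be stacked to three channels, as `get_raster_chunk` does, before being saved as an RGB image.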


## 5. Call the OpenAI API

Now we'll send the image to the `gpt-4o-mini` model with a specific prompt to look for archaeological features.

In [5]:
MODEL = "gpt-4o-mini"
PROMPT = """
Analyze this digital terrain model (DTM) image from the Amazon rainforest. Look for potential hidden archaeological features.
Specifically, identify any geometric shapes like rectangles, circles, or long straight ditches that are 80 meters or more across.
For each potential feature, describe its shape and provide its approximate center coordinates within the image (using a 0-1 scale for x and y, where 0,0 is the top-left corner).
If no such features are visible, state that clearly.
"""

try:
    response = client.chat.completions.create(
        model=MODEL,
        messages=[
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": PROMPT},
                    {
                        "type": "image_url",
                        "image_url": {
                            "url": f"data:image/png;base64,{base64_image}"
                        }
                    }
                ]
            }
        ],
        temperature=0.2, # Lower temperature for more deterministic output
    )
    print("API call successful.")
except NameError:
    print("Skipping API call due to previous error.")
except openai.APIError as e:
    print(f"OpenAI API Error: {e}")

API call successful.
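
The prompt asks for feature centers on a 0-1 scale relative to the image. To place such a point on the map, the fractions have to be scaled by the chunk size and offset by the chunk's position in the raster. The sketch below shows that arithmetic, assuming a north-up raster with square pixels. The function name, the 1 m pixel size, and the window offsets in the example are illustrative assumptions rather than values read from the file.

```python
def fraction_to_map(x_frac, y_frac, raster_left, raster_top,
                    col_off, row_off, win_w, win_h, pixel_size):
    """Convert image fractions (origin top-left) to map coordinates.

    Assumes a north-up raster with square pixels of `pixel_size` map units.
    """
    col = col_off + x_frac * win_w          # column in the full raster
    row = row_off + y_frac * win_h          # row in the full raster
    map_x = raster_left + col * pixel_size  # x grows to the right
    map_y = raster_top - row * pixel_size   # y shrinks going down the image
    return map_x, map_y


# Example: center of a 512 x 512 chunk at offset (244, 244), assuming 1 m pixels.
x, y = fraction_to_map(0.5, 0.5, raster_left=610108.0, raster_top=8867716.0,
                       col_off=244, row_off=244, win_w=512, win_h=512,
                       pixel_size=1.0)
print(x, y)
```

In practice the pixel size and origin should come from the dataset's affine transform (`src.transform` in rasterio) rather than being hard-coded.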


## 6. Log Model Response and ID

For reproducibility, we'll print the model's full response and the exact model ID returned by the API.

In [6]:
try:
    model_response_content = response.choices[0].message.content
    model_id_used = response.model

    print("--- MODEL RESPONSE ---")
    print(model_response_content)
    print("\n--- REPRODUCIBILITY INFO ---")
    print(f"Model Used: {model_id_used}")
except NameError:
    print("No response to log.")

--- MODEL RESPONSE ---
I'm unable to analyze images directly. However, if you're looking for potential archaeological features in a digital terrain model (DTM) of the Amazon rainforest, consider using specialized software or tools designed for remote sensing and terrain analysis. Look for geometric patterns, variations in elevation, or anomalies that could indicate human-made structures. If you have specific features in mind, I can help guide you on how to identify them or what to look for!

--- REPRODUCIBILITY INFO ---
Model Used: gpt-4o-mini-2024-07-18
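
Printing the response is enough for a single run, but comparing runs is easier with a structured record, for example after changing the prompt because the model declined to analyze the image, as it did above. The sketch below shows one way to build such a record using only the standard library. The function name `build_run_record` and the field names are hypothetical, and the inputs in the example are placeholders.

```python
import hashlib
import json
from datetime import datetime, timezone


def build_run_record(model_id, prompt, image_b64, response_text):
    """Collect the fields needed to trace a model call back to its inputs."""
    return {
        "timestamp_utc": datetime.now(timezone.utc).isoformat(),
        "model_id": model_id,
        "prompt_sha256": hashlib.sha256(prompt.encode("utf-8")).hexdigest(),
        "image_sha256": hashlib.sha256(image_b64.encode("utf-8")).hexdigest(),
        "response_text": response_text,
    }


# Placeholder inputs for illustration; in the notebook these would be
# response.model, PROMPT, base64_image and model_response_content.
record = build_run_record("gpt-4o-mini-2024-07-18", "example prompt",
                          "aGVsbG8=", "example response")
print(json.dumps(record, indent=2))
```

Hashing the prompt and image keeps the record small while still showing whether two runs used identical inputs.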
