# EOPF-Zarr GDAL Driver Test Notebook

This notebook verifies that the EOPF-Zarr GDAL driver loads and can open a Zarr store inside the Docker environment.

## Overview

- **Environment**: Ubuntu 25.04 + GDAL 3.10 + Python 3.13
- **Driver**: Custom EOPF-Zarr GDAL driver
- **Use Case**: Reading and processing EOPF/Zarr geospatial data

In [1]:
# Import required libraries
import os
import sys
from osgeo import gdal
import xarray as xr
import zarr
import numpy as np
import pandas as pd

print("📦 Environment Information:")
print(f"   Python: {sys.version}")
print(f"   GDAL: {gdal.VersionInfo()}")
print(f"   NumPy: {np.__version__}")
print(f"   xarray: {xr.__version__}")
print(f"   zarr: {zarr.__version__}")

📦 Environment Information:
   Python: 3.13.3 (main, Jun 16 2025, 18:15:32) [GCC 14.2.0]
   GDAL: 3100200
   NumPy: 2.2.3
   xarray: 2025.7.1
   zarr: 2.18.7
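
The `GDAL: 3100200` line above is GDAL's packed version number, not a dotted version string. A minimal sketch of decoding it, assuming the usual `major * 1000000 + minor * 10000 + revision * 100 + build` packing; `decode_gdal_version_num` is a hypothetical helper written for this notebook, not part of the GDAL API:

```python
def decode_gdal_version_num(version_num):
    """Split GDAL's packed version number into (major, minor, revision, build)."""
    n = int(version_num)
    major, rest = divmod(n, 1_000_000)
    minor, rest = divmod(rest, 10_000)
    revision, build = divmod(rest, 100)
    return major, minor, revision, build


# "3100200" is the value printed by gdal.VersionInfo() in the cell above.
major, minor, revision, build = decode_gdal_version_num("3100200")
print(f"GDAL {major}.{minor}.{revision}")
```

In the notebook itself, `gdal.VersionInfo("RELEASE_NAME")` should return the dotted string directly, which avoids the manual decoding.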


In [2]:
# Check environment variables
print("🔧 Environment Variables:")
env_vars = ['GDAL_DRIVER_PATH', 'GDAL_DATA', 'PROJ_LIB', 'PYTHONPATH']
for var in env_vars:
    value = os.environ.get(var, 'Not set')
    print(f"   {var}: {value}")

🔧 Environment Variables:
   GDAL_DRIVER_PATH: /opt/eopf-zarr/drivers
   GDAL_DATA: /usr/share/gdal
   PROJ_LIB: /usr/share/proj
   PYTHONPATH: /usr/local/lib/python3.13/site-packages
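
Knowing that `GDAL_DRIVER_PATH` is set does not confirm that a plugin file is actually present there. The sketch below lists files that look like GDAL plugins in each configured directory. It assumes the common `gdal_*` / `ogr_*` plugin naming and that the variable may hold several directories separated by `os.pathsep`; `list_plugin_candidates` is a hypothetical helper, and the exact plugin filename for this driver is not shown in this notebook.

```python
import os
from pathlib import Path


def list_plugin_candidates(driver_path):
    """Return files named like GDAL plugins (gdal_* or ogr_*) in each listed directory."""
    found = []
    for entry in driver_path.split(os.pathsep):
        directory = Path(entry)
        if not entry or not directory.is_dir():
            continue  # skip empty or missing entries instead of raising
        for item in sorted(directory.iterdir()):
            if item.is_file() and item.name.lower().startswith(("gdal_", "ogr_")):
                found.append(item)
    return found


# Falls back to an empty string (no candidates) when the variable is unset.
driver_path = os.environ.get("GDAL_DRIVER_PATH", "")
for plugin in list_plugin_candidates(driver_path):
    print(f"   Plugin candidate: {plugin}")
```

An empty result here, combined with a "driver not found" message in the next cell, would point at the image build rather than at GDAL itself.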


In [3]:
# List available GDAL drivers
gdal.AllRegister()
print(f"📋 Total GDAL drivers available: {gdal.GetDriverCount()}")
print("\n🔍 Looking for EOPF-Zarr driver...")

# Try to find EOPF-Zarr driver
eopf_driver = gdal.GetDriverByName('EOPFZARR')
if eopf_driver:
    print(f"✅ EOPF-Zarr driver found: {eopf_driver.GetDescription()}")
    print(f"   Metadata: {eopf_driver.GetMetadata()}")
else:
    print("⚠️ EOPF-Zarr driver not found")
    print("\n📋 Available drivers containing 'zarr':")
    for i in range(gdal.GetDriverCount()):
        driver = gdal.GetDriver(i)
        name = driver.GetDescription()
        if 'zarr' in name.lower():
            print(f"   {name}")

📋 Total GDAL drivers available: 222

🔍 Looking for EOPF-Zarr driver...
✅ EOPF-Zarr driver found: EOPFZARR
   Metadata: {'DMD_LONGNAME': 'EOPF Zarr Wrapper Driver', 'DCAP_RASTER': 'YES', 'DCAP_VIRTUALIO': 'YES', 'DMD_HELPTOPIC': 'drivers/raster/eopfzarr.html', 'DMD_SUBDATASETS': 'YES', 'DMD_OPENOPTIONLIST': "<OpenOptionList>  <Option name='EOPF_PROCESS' type='boolean' default='NO' description='Enable EOPF features'>    <Value>YES</Value>    <Value>NO</Value>  </Option></OpenOptionList>", 'DCAP_OPEN': 'YES'}
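
The `DMD_OPENOPTIONLIST` entry in the metadata above is an XML fragment, which is hard to read in a printed dict. A minimal sketch of parsing it with the standard library follows; `parse_open_options` is a hypothetical helper, and the XML string is copied from the output above.

```python
import xml.etree.ElementTree as ET


def parse_open_options(option_list_xml):
    """Parse a GDAL OpenOptionList XML string into a list of plain dicts."""
    root = ET.fromstring(option_list_xml)
    options = []
    for opt in root.iter("Option"):
        options.append({
            "name": opt.get("name"),
            "type": opt.get("type"),
            "default": opt.get("default"),
            "description": opt.get("description"),
            "values": [value.text for value in opt.findall("Value")],
        })
    return options


# Copied verbatim from the DMD_OPENOPTIONLIST value printed above.
xml_text = (
    "<OpenOptionList>  <Option name='EOPF_PROCESS' type='boolean' default='NO' "
    "description='Enable EOPF features'>    <Value>YES</Value>    <Value>NO</Value>  "
    "</Option></OpenOptionList>"
)

for option in parse_open_options(xml_text):
    print(f"   {option['name']} ({option['type']}, default {option['default']}): "
          f"{option['description']}; allowed values: {option['values']}")
```

In the notebook, the string could be taken from `eopf_driver.GetMetadataItem('DMD_OPENOPTIONLIST')`. An option listed this way would normally be passed as `gdal.OpenEx(path, open_options=['EOPF_PROCESS=YES'])`; what that option changes in this driver is not demonstrated here.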


In [4]:
# Create a sample Zarr dataset for testing
print("🧪 Creating test Zarr dataset...")

# Standard-library helper for clearing old test output
import shutil

# Create temporary directory for test data
test_dir = '/tmp/test_zarr_data'
if os.path.exists(test_dir):
    shutil.rmtree(test_dir)
os.makedirs(test_dir)

# Create a simple xarray dataset
data = xr.Dataset({
    'temperature': (['x', 'y', 'time'], 
                   np.random.rand(100, 100, 10) * 30 + 273.15),  # Kelvin
    'pressure': (['x', 'y', 'time'], 
                np.random.rand(100, 100, 10) * 500 + 1000),  # hPa
}, coords={
    'x': np.linspace(0, 10, 100),
    'y': np.linspace(0, 10, 100),
    'time': pd.date_range('2023-01-01', periods=10, freq='D')
})

# Add attributes
data.attrs['title'] = 'Test EOPF Dataset'
data.attrs['description'] = 'Sample data for testing EOPF-Zarr driver'
data['temperature'].attrs['units'] = 'K'
data['pressure'].attrs['units'] = 'hPa'

# Save as Zarr
zarr_path = os.path.join(test_dir, 'test_data.zarr')
data.to_zarr(zarr_path)
print(f"✅ Test data saved to: {zarr_path}")
print(f"   Dataset dimensions: {dict(data.sizes)}")
print(f"   Variables: {list(data.data_vars)}")

🧪 Creating test Zarr dataset...
✅ Test data saved to: /tmp/test_zarr_data/test_data.zarr
   Variables: ['temperature', 'pressure']
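
The cell above writes to a fixed path under `/tmp` and deletes whatever was there before. A minimal sketch of an alternative that lets the standard library pick and clean up a unique scratch directory; the `os.makedirs` call is only a stand-in for `data.to_zarr(...)` so that the block runs without xarray, and the names `scratch_dir` and `zarr_store` are hypothetical:

```python
import os
import tempfile

# The directory and everything in it are removed when the block exits.
with tempfile.TemporaryDirectory(prefix="test_zarr_data_") as scratch_dir:
    zarr_store = os.path.join(scratch_dir, "test_data.zarr")
    os.makedirs(zarr_store)  # stand-in for data.to_zarr(zarr_store)
    existed_inside = os.path.isdir(zarr_store)
    print(f"Store exists inside the block: {existed_inside}")

exists_after = os.path.exists(scratch_dir)
print(f"Scratch directory exists afterwards: {exists_after}")
```

Because later cells reopen the store, a notebook would more likely call `tempfile.mkdtemp()` once and remove the directory with `shutil.rmtree` in a final cell, so the data outlives a single `with` block.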


In [7]:
# Test GDAL access to Zarr data
print("🔍 Testing GDAL access to Zarr data...")

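# Try to open with GDAL. gdal.Open() returns None on failure unless
# gdal.UseExceptions() has been enabled, so both failure paths are handled below.
try:
    dataset = gdal.Open("EOPFZARR:" + zarr_path)
    if dataset:
        print("✅ Successfully opened Zarr with GDAL")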
        print(f"   Driver: {dataset.GetDriver().GetDescription()}")
        print(f"   Size: {dataset.RasterXSize} x {dataset.RasterYSize}")
        print(f"   Bands: {dataset.RasterCount}")
        
        # Get some metadata
        metadata = dataset.GetMetadata()
        if metadata:
            print(f"   Metadata keys: {list(metadata.keys())[:5]}...")  # Show first 5 keys
        
        dataset = None  # Close dataset
    else:
        print("❌ Could not open Zarr with GDAL")
except Exception as e:
    print(f"❌ Error opening Zarr with GDAL: {e}")

🔍 Testing GDAL access to Zarr data...
✅ Successfully opened Zarr with GDAL
   Driver: EOPFZARR
   Size: 512 x 512
   Bands: 0
   Metadata keys: ['EOPF_PRODUCT', 'spatial_ref', 'EPSG', 'proj:epsg', 'geospatial_lon_min']...
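
The output above reports a 512 x 512 raster with zero bands, which does not match the 100 x 100 arrays written earlier. Together with `DMD_SUBDATASETS: YES` in the driver metadata, this suggests (but does not prove) that the root dataset is only a container and that the arrays are exposed as subdatasets. GDAL reports subdatasets as `SUBDATASET_n_NAME` / `SUBDATASET_n_DESC` pairs in the `SUBDATASETS` metadata domain; the sketch below pairs them up. `pair_subdatasets` is a hypothetical helper, and the sample dict uses placeholder strings because this notebook does not show the driver's real subdataset names.

```python
import re


def pair_subdatasets(metadata):
    """Turn a SUBDATASETS metadata dict into (name, description) pairs in index order."""
    name_key = re.compile(r"SUBDATASET_(\d+)_NAME$")
    indexed = []
    for key, name in metadata.items():
        match = name_key.match(key)
        if match:
            index = int(match.group(1))
            description = metadata.get(f"SUBDATASET_{index}_DESC", "")
            indexed.append((index, name, description))
    return [(name, description) for _, name, description in sorted(indexed)]


# Placeholder values only: the real names depend on the driver and the store.
sample_metadata = {
    "SUBDATASET_2_NAME": "<subdataset-name-2>",
    "SUBDATASET_2_DESC": "<description-2>",
    "SUBDATASET_1_NAME": "<subdataset-name-1>",
    "SUBDATASET_1_DESC": "<description-1>",
}

for name, description in pair_subdatasets(sample_metadata):
    print(f"   {name}: {description}")
```

In the notebook, the input would be `dataset.GetMetadata("SUBDATASETS")` (before `dataset = None`), and each returned name could be passed to `gdal.Open` to check its size and band count against the test arrays.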


In [8]:
# Test xarray access to confirm data integrity
print("🔍 Testing xarray access to verify data...")

try:
    # Read back with xarray
    loaded_data = xr.open_zarr(zarr_path)
    print(f"✅ Successfully loaded with xarray")
    print(f"   Variables: {list(loaded_data.data_vars)}")
    print(f"   Coordinates: {list(loaded_data.coords)}")
    print(f"   Temperature range: {loaded_data.temperature.min().values:.2f} - {loaded_data.temperature.max().values:.2f} K")
    print(f"   Pressure range: {loaded_data.pressure.min().values:.2f} - {loaded_data.pressure.max().values:.2f} hPa")
    
    # Basic computation
    temp_celsius = loaded_data.temperature - 273.15
    mean_temp = temp_celsius.mean()
    print(f"   Mean temperature: {mean_temp.values:.2f} °C")
    
except Exception as e:
    print(f"❌ Error with xarray: {e}")

🔍 Testing xarray access to verify data...
✅ Successfully loaded with xarray
   Variables: ['pressure', 'temperature']
   Coordinates: ['time', 'x', 'y']
   Temperature range: 273.15 - 303.15 K
   Pressure range: 1000.01 - 1500.00 hPa
   Mean temperature: 15.07 °C
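
The printed ranges can be checked against how the test data was generated: `np.random.rand` draws from [0, 1), so `rand * scale + offset` lies in [offset, offset + scale). A small sketch of that arithmetic, using the values printed above; `uniform_bounds`, the rounding tolerance, and the 0.5 °C margin on the mean are assumptions made for illustration:

```python
def uniform_bounds(scale, offset):
    """Bounds of rand() * scale + offset, given that rand() lies in [0, 1)."""
    return offset, offset + scale


temp_lo, temp_hi = uniform_bounds(30, 273.15)   # Kelvin
pres_lo, pres_hi = uniform_bounds(500, 1000)    # hPa
expected_mean_c = (temp_lo + temp_hi) / 2 - 273.15  # midpoint of the range, in °C

# Values copied from the output above, where they are rounded to two decimals.
observed = {"temp_min": 273.15, "temp_max": 303.15,
            "pres_min": 1000.01, "pres_max": 1500.00, "mean_c": 15.07}
tol = 0.005  # half of the last printed digit

checks = {
    "temperature within bounds": (temp_lo - tol <= observed["temp_min"]
                                  and observed["temp_max"] <= temp_hi + tol),
    "pressure within bounds": (pres_lo - tol <= observed["pres_min"]
                               and observed["pres_max"] <= pres_hi + tol),
    "mean near midpoint": abs(observed["mean_c"] - expected_mean_c) < 0.5,
}
for name, ok in checks.items():
    print(f"   {name}: {'OK' if ok else 'unexpected'}")
```

This only confirms that the values read back are plausible for the generator; comparing `loaded_data` against the in-memory `data` would be the stricter round-trip test.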


In [9]:
# Summary
print("📊 Test Summary:")
print("=================")
print(f"✅ Python environment: Working")
print(f"✅ GDAL installation: Working (v{gdal.VersionInfo()})")
print(f"✅ xarray/zarr: Working")
print(f"✅ Data creation/access: Working")

if eopf_driver:
    print(f"✅ EOPF-Zarr driver: Loaded")
else:
    print(f"⚠️ EOPF-Zarr driver: Not loaded (but standard Zarr works)")

print("\n🎉 Docker environment is ready for EOPF development!")
print("\n📝 Next steps:")
print("   1. Use this environment for EOPF data processing")
print("   2. Deploy to JupyterHub at https://jupyterhub.user.eopf.eodc.eu")
print("   3. Access your notebooks at http://localhost:8888 (local development)")

📊 Test Summary:
=================
✅ Python environment: Working
✅ GDAL installation: Working (v3100200)
✅ xarray/zarr: Working
✅ Data creation/access: Working
✅ EOPF-Zarr driver: Loaded

🎉 Docker environment is ready for EOPF development!

📝 Next steps:
   1. Use this environment for EOPF data processing
   2. Deploy to JupyterHub at https://jupyterhub.user.eopf.eodc.eu
   3. Access your notebooks at http://localhost:8888 (local development)
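
The summary cell prints ✅ for the first four checks unconditionally, so it would still report success if an earlier cell had failed. A minimal sketch of deriving the summary from recorded results instead; `format_summary` and the `results` dict are hypothetical, and in the notebook each flag would be set by the cell that performs the check:

```python
def format_summary(results):
    """Build summary lines from a mapping of check name to True/False."""
    lines = ["📊 Test Summary:", "================="]
    for name, ok in results.items():
        mark = "✅" if ok else "❌"
        status = "Working" if ok else "Failed"
        lines.append(f"{mark} {name}: {status}")
    return lines


# Hypothetical flags for illustration; these are not results from this notebook.
results = {
    "GDAL installation": True,
    "EOPF-Zarr driver": True,
    "GDAL open of test store": False,
}

print("\n".join(format_summary(results)))
all_passed = all(results.values())
print("\n🎉 All checks passed" if all_passed else "\n⚠️ Some checks failed")
```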
