### PACE_L3_basicplot

This notebook makes basic plots of PACE Level 3 (L3) products.

It uses the earthaccess library to search for and open data and the xarray library to read it, so it is intended for provisional and standard products available through Earthdata Search. All relevant settings are specified in the "Identify and search product, specify plotting parameters" cell.

If multiple granules are found, they are combined along a time dimension and plotted as one panel per time step; they can also be assembled into an animated GIF. Results are plotted to the screen and saved as a .png file.

NOTE:
 - For many granules, it may be better to use L3 products
 - Currently only works for 2D data products. 3D implementation to be specified
 - Interpolation is subject to error, use with caution

In [1]:
import cartopy.crs as ccrs
import cartopy.feature as cfeature
from scipy.interpolate import griddata
import earthaccess
import matplotlib.pyplot as plt
import numpy as np
import xarray as xr
import os
from sklearn.neighbors import KDTree
from PIL import Image

### Earthdata Authentication
We authenticate with our Earthdata Login credentials. Authentication is not needed to search publicly available collections in Earthdata, but it is always needed to access data. The login method from the earthaccess package creates an authenticated session when given a valid Earthdata Login username and password. earthaccess looks for credentials in environment variables or in a .netrc file saved in the home directory. If no credentials are found, an interactive prompt asks for them.
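
To see which of these sources would be picked up before calling login, a minimal sketch like the one below can help. It does not contact Earthdata. The helper name credential_source is hypothetical, and the variable names EARTHDATA_USERNAME and EARTHDATA_PASSWORD and the host urs.earthdata.nasa.gov are assumptions about what earthaccess looks for; check the earthaccess documentation for your version.

```python
import netrc
import os
from pathlib import Path


def credential_source(env=None, netrc_path=None, host="urs.earthdata.nasa.gov"):
    """Return 'environment', 'netrc' or 'prompt' (hypothetical helper)."""
    env = os.environ if env is None else env
    # Assumed variable names, not confirmed by this notebook
    if env.get("EARTHDATA_USERNAME") and env.get("EARTHDATA_PASSWORD"):
        return "environment"

    netrc_path = Path.home() / ".netrc" if netrc_path is None else Path(netrc_path)
    if netrc_path.is_file():
        try:
            if netrc.netrc(str(netrc_path)).authenticators(host) is not None:
                return "netrc"
        except (netrc.NetrcParseError, OSError):
            pass  # unreadable or malformed file: fall through to the prompt
    return "prompt"


print(credential_source())
```

If this prints "prompt", the login cell below will most likely ask for a username and password interactively.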

In [2]:
auth = earthaccess.login(persist=True)

### If desired, print all available PACE Level 3 short names

In [3]:
if False:
    results_oci = earthaccess.search_datasets(instrument="oci",processing_level_id='3')
    results_harp = earthaccess.search_datasets(instrument="harp2",processing_level_id='3')
    results_spex = earthaccess.search_datasets(instrument="spexone",processing_level_id='3')
    results=results_oci + results_harp + results_spex
    for item in results:
        summary = item.summary()
        print(summary["short-name"])

### Identify and search product, specify plotting parameters

In [4]:
tspan = ("2025-08-01", "2025-09-07")
bbox = (-35., -30., 20., 10.0)
granule_name="*.Day.*0p1deg*"  # Daily, 8-day or monthly: Day, 8D or MO | Resolution: 0p1deg or 0.4km

TAG='SE Atlantic smoke: PACE OCI smoke aerosol index'
short_name="PACE_OCI_L3M_AER_UAA_NRT"
variable_name='NUV_AerosolIndex'
plot_range = (0,5)
colormap='hot_r'
save_interpolation=True

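outdir='SEAtlantic_smoke_transport_202508_202509'
os.makedirs(outdir, exist_ok=True)  # plots and GIF frames are written here, so the directory must exist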
results = earthaccess.search_data(
    short_name=short_name,
    temporal=tspan,
    granule_name=granule_name
)
print('Number of granules: ',len(results))

Number of granules:  37
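
The granule_name wildcard is what selects the temporal binning and resolution. As a rough way to preview which file names a pattern would keep, the sketch below applies the same pattern locally with fnmatch. It assumes the search ignores case, which is consistent with a "Day" pattern returning "DAY" files. Only the first file name comes from this notebook; the other two are hypothetical names built by analogy, and matches is a hypothetical helper.

```python
from fnmatch import fnmatchcase


def matches(pattern, filename):
    """Shell-style wildcard match, ignoring case (an assumption about the search)."""
    return fnmatchcase(filename.lower(), pattern.lower())


names = [
    "PACE_OCI.20250801.L3m.DAY.AER_UAA.V3_1.0p1deg.NRT.nc",
    # Hypothetical names, for illustration only
    "PACE_OCI.20250801_20250808.L3m.8D.AER_UAA.V3_1.0p1deg.NRT.nc",
    "PACE_OCI.20250801.L3m.DAY.AER_UAA.V3_1.1deg.NRT.nc",
]

selected = [name for name in names if matches("*.Day.*0p1deg*", name)]
print(selected)
```

Only the daily 0p1deg name survives; the 8-day name fails on the binning part of the pattern and the 1deg name fails on the resolution part.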


In [5]:
paths = earthaccess.open(results)

### Open and read file(s)

In [6]:
def read_l3_files(file_paths, variable_name, bbox=None):
    """
    Read Level 3 files and combine them
    
    Parameters:
        file_paths: list of file paths from earthaccess.open()
        variable_name: name of the variable to extract
        bbox: tuple of (lon_min, lat_min, lon_max, lat_max) for subsetting
    
    Returns:
        combined_ds: xarray Dataset with the variable and coordinates
    """
    
    print(f"Reading {len(file_paths)} L3 files...")
    
    datasets = []
    
    for i, file_path in enumerate(file_paths):
        try:
            print(f"Processing file {i+1}/{len(file_paths)}")
            
            # Extract date from filename
            import os
            import re
            filename = os.path.basename(str(file_path))
            print(f"  Filename: {filename}")
            
            # Extract date from PACE L3 filename pattern: PACE_OCI.YYYYMMDD.L3m...
            date_match = re.search(r'PACE_OCI\.(\d{8})\.', filename)
            if date_match:
                date_str = date_match.group(1)
                filename_date = f"{date_str[:4]}-{date_str[4:6]}-{date_str[6:8]}"
                print(f"  Extracted date: {filename_date}")
            else:
                print(f"  Warning: Could not extract date from filename")
                # Fallback: try any 8-digit pattern
                date_match = re.search(r'(\d{8})', filename)
                if date_match:
                    date_str = date_match.group(1)
                    filename_date = f"{date_str[:4]}-{date_str[4:6]}-{date_str[6:8]}"
                    print(f"  Extracted date (fallback): {filename_date}")
                else:
                    print(f"  Could not extract date, skipping file")
                    continue
            
            # Read the L3 file - try different group structures
            ds = None
            
            # Try common L3 group structures
            for group in [None, 'geophysical_data', 'science_data']:
                try:
                    if group is None:
                        ds = xr.open_dataset(file_path)
                    else:
                        ds = xr.open_dataset(file_path, group=group)
                    
                    # Check if our variable is in this group
                    if variable_name in ds.variables:
                        print(f"  Found '{variable_name}' in {group or 'root'} group")
                        break
                    else:
                        ds = None
                        
                except Exception as e:
                    continue
            
            if ds is None:
                print(f"  Could not find variable '{variable_name}' in file {i+1}")
                continue
            
            # Create correct time coordinate from filename
            correct_time = np.datetime64(filename_date)
            
            # Add or replace time coordinate
            if 'time' in ds.dims:
                # Replace existing time coordinate
                ds = ds.assign_coords(time=[correct_time])
            else:
                # Add time dimension if it doesn't exist
                ds = ds.expand_dims('time')
                ds = ds.assign_coords(time=[correct_time])
            
            print(f"  Set time coordinate to: {correct_time}")
            
            # Get coordinate information
            # L3 files usually have lat/lon as 1D coordinate arrays
            if 'lat' in ds.coords:
                lat_coord = 'lat'
            elif 'latitude' in ds.coords:
                lat_coord = 'latitude'
            else:
                print(f"  Could not find latitude coordinate in file {i+1}")
                continue
                
            if 'lon' in ds.coords:
                lon_coord = 'lon'
            elif 'longitude' in ds.coords:
                lon_coord = 'longitude'
            else:
                print(f"  Could not find longitude coordinate in file {i+1}")
                continue
            
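            # Subset by bounding box if provided
            if bbox is not None:
                lon_min, lat_min, lon_max, lat_max = bbox
                
                # Latitude may be stored north-to-south in L3 mapped files;
                # a label slice must follow the stored order or it comes back empty
                lat_vals = ds[lat_coord].values
                if lat_vals[0] > lat_vals[-1]:
                    lat_slice = slice(lat_max, lat_min)
                else:
                    lat_slice = slice(lat_min, lat_max)
                
                # Select data within bounding box
                ds_subset = ds.sel(
                    {lon_coord: slice(lon_min, lon_max),
                     lat_coord: lat_slice}
                )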
                
                print(f"  Subset to bbox: {ds_subset[variable_name].shape}")
            else:
                ds_subset = ds
            
            # Standardize coordinate names
            if lat_coord != 'latitude':
                ds_subset = ds_subset.rename({lat_coord: 'latitude'})
            if lon_coord != 'longitude':
                ds_subset = ds_subset.rename({lon_coord: 'longitude'})
            
            datasets.append(ds_subset)
            
        except Exception as e:
            print(f"Error reading file {i+1}: {e}")
            continue
    
    if not datasets:
        print("No datasets successfully loaded!")
        return None
    
    print(f"Successfully loaded {len(datasets)} datasets")
    
    # Sort datasets by time before combining (in case files were not in order)
    datasets.sort(key=lambda x: x.time.values[0])
    
    # Combine datasets along time dimension if multiple files
    if len(datasets) == 1:
        combined_ds = datasets[0]
    else:
        try:
            # Concatenate along time dimension
            combined_ds = xr.concat(datasets, dim='time')
            print("Combined datasets along time dimension")
        except Exception:
            try:
                # If concatenation fails, try to merge
                combined_ds = xr.merge(datasets)
                print("Merged datasets")
            except Exception as e:
                print(f"Could not combine datasets: {e}")
                print("Using first dataset only")
                combined_ds = datasets[0]
    
    # Print final time coordinate information
    if 'time' in combined_ds.coords:
        print(f"Final time coordinates:")
        for i, time_val in enumerate(combined_ds.time.values):
            print(f"  {i}: {time_val}")
    
    # Print info about the combined dataset
    print(f"Final dataset shape for '{variable_name}': {combined_ds[variable_name].shape}")
    print(f"Coordinate ranges:")
    print(f"  Longitude: {combined_ds.longitude.min().values:.2f} to {combined_ds.longitude.max().values:.2f}")
    print(f"  Latitude: {combined_ds.latitude.min().values:.2f} to {combined_ds.latitude.max().values:.2f}")
    
    return combined_ds

In [None]:
dataset = read_l3_files(paths, variable_name)


Reading 37 L3 files...
Processing file 1/37
  Filename: PACE_OCI.20250801.L3m.DAY.AER_UAA.V3_1.0p1deg.NRT.nc>
  Extracted date: 2025-08-01
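
The date handling in read_l3_files can be tried on its own. The sketch below is a simplified, standalone version of that step using the same two regular expressions; date_from_filename is a hypothetical helper, and the 8-day file name is a hypothetical example built by analogy with the daily one.

```python
import re
from datetime import date


def date_from_filename(filename):
    """Return the first YYYYMMDD date found in a file name, or None."""
    # Preferred pattern first, then any 8-digit run as a fallback
    match = (re.search(r"PACE_OCI\.(\d{8})\.", filename)
             or re.search(r"(\d{8})", filename))
    if match is None:
        return None
    s = match.group(1)
    return date(int(s[:4]), int(s[4:6]), int(s[6:8]))


print(date_from_filename("PACE_OCI.20250801.L3m.DAY.AER_UAA.V3_1.0p1deg.NRT.nc"))
# Hypothetical 8-day name: the first pattern fails, the fallback picks the start date
print(date_from_filename("PACE_OCI.20250801_20250808.L3m.8D.AER_UAA.V3_1.0p1deg.NRT.nc"))
```

Under this logic a multi-day composite is stamped with its start date, which is worth keeping in mind when reading the panel titles.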


### Plot results

In [None]:
def plot_pace_l3_data_daily(dataset, variable_name, plot_range=None, colormap='viridis', 
                           outdir='.', title_tag='PACE L3', bbox=None, 
                           percentile_range=(2, 98), nan_color='black',
                           ncols=None, figsize_per_plot=(6, 5)):
    """
    Plot Level 3 data with separate subplot for each day
    
    Parameters:
        dataset: xarray Dataset from read_l3_files()
        variable_name: name of variable to plot
        plot_range: tuple of (vmin, vmax) or None for auto-range
        colormap: matplotlib colormap name
        outdir: output directory
        title_tag: tag for plot title
        bbox: tuple of (lon_min, lat_min, lon_max, lat_max) for plot extent only
        percentile_range: percentiles for auto-ranging
        nan_color: color for NaN values
        ncols: number of columns for subplot grid (auto-calculated if None)
        figsize_per_plot: size of each individual subplot
    """
    
    if dataset is None or variable_name not in dataset.variables:
        print("No data to plot!")
        return None, None
    
    # Get the data variable
    data_var = dataset[variable_name]
    
    print(f"Original data shape: {data_var.shape}")
    
    # Check if time dimension exists
    if 'time' not in data_var.dims:
        print("No time dimension found - creating single plot")
        return plot_pace_l3_data_single(dataset, variable_name, plot_range, colormap, 
                                       outdir, title_tag, bbox, percentile_range, nan_color)
    
    # Get number of time steps
    n_times = len(data_var.time)
    print(f"Found {n_times} time steps")
    
    if n_times == 1:
        print("Only one time step - creating single plot")
        return plot_pace_l3_data_single(dataset, variable_name, plot_range, colormap, 
                                       outdir, title_tag, bbox, percentile_range, nan_color)
    
    # Calculate subplot grid dimensions
    if ncols is None:
        ncols = min(4, n_times)  # Max 4 columns
    nrows = int(np.ceil(n_times / ncols))
    
    print(f"Creating {nrows}x{ncols} subplot grid")
    
    # Calculate figure size
    fig_width = ncols * figsize_per_plot[0]
    fig_height = nrows * figsize_per_plot[1]
    
    # Create figure with subplots
    fig = plt.figure(figsize=(fig_width, fig_height))
    
    # Subset data for plotting if bbox is specified
    if bbox is not None:
        lon_min, lat_min, lon_max, lat_max = bbox
        print(f"Subsetting data to bbox: lon {lon_min} to {lon_max}, lat {lat_min} to {lat_max}")
        
        try:
            lons = dataset.longitude.values
            lats = dataset.latitude.values
            
            if lon_max < lon_min:  # bbox crosses dateline
                lon_mask = (lons >= lon_min) | (lons <= lon_max)
            else:
                lon_mask = (lons >= lon_min) & (lons <= lon_max)
            
            lat_mask = (lats >= lat_min) & (lats <= lat_max)
            
            if np.sum(lon_mask) == 0 or np.sum(lat_mask) == 0:
                print("Warning: No data points within specified bbox, using full dataset")
                plot_extent = None
                plot_data = data_var
            else:
                plot_data = data_var.sel(
                    longitude=data_var.longitude[lon_mask],
                    latitude=data_var.latitude[lat_mask]
                )
                plot_extent = [lon_min, lon_max, lat_min, lat_max]
                
        except Exception as e:
            print(f"Error during subsetting: {e}")
            plot_extent = None
            plot_data = data_var
    else:
        plot_data = data_var
        lon_min, lon_max = float(dataset.longitude.min()), float(dataset.longitude.max())
        lat_min, lat_max = float(dataset.latitude.min()), float(dataset.latitude.max())
        plot_extent = [lon_min, lon_max, lat_min, lat_max]
    
    # Calculate plot range using all time steps
    if plot_range is None:
        valid_data = plot_data.values[~np.isnan(plot_data.values)]
        if len(valid_data) == 0:
            print("Warning: No valid data found!")
            vmin, vmax = 0, 1
        else:
            vmin = np.percentile(valid_data, percentile_range[0])
            vmax = np.percentile(valid_data, percentile_range[1])
            print(f"Auto-calculated range for all days: {vmin:.3f} to {vmax:.3f}")
    else:
        vmin, vmax = plot_range
        print(f"Using specified range: {vmin} to {vmax}")
    
    # Create colormap
    base_cmap = plt.colormaps.get_cmap(colormap)
    cmap_with_nan = base_cmap.copy()
    
    # Handle NaN colors for cartopy
    if nan_color.lower() in ['black', 'k']:
        nan_facecolor = 'black'
        cmap_with_nan.set_bad(alpha=0)
    elif nan_color.lower() in ['transparent', 'none']:
        nan_facecolor = 'white'
        cmap_with_nan.set_bad(alpha=0)
    else:
        nan_facecolor = nan_color
        cmap_with_nan.set_bad(alpha=0)
    
    # Create subplots for each day
    axes = []
    ims = []
    
    for i in range(n_times):
        # Create subplot
        ax = fig.add_subplot(nrows, ncols, i+1, projection=ccrs.PlateCarree())
        axes.append(ax)
        
        # Set background color for NaN values
        ax.set_facecolor(nan_facecolor)
        
        # Add map features with BLACK boundaries and land
        ax.add_feature(cfeature.COASTLINE, linewidth=0.5, edgecolor='black')
        ax.add_feature(cfeature.BORDERS, linewidth=0.5, edgecolor='black')
        ax.add_feature(cfeature.STATES, linewidth=0.3, edgecolor='black')
        ax.add_feature(cfeature.LAND, color='black', alpha=0.3)  # Black land
        ax.add_feature(cfeature.OCEAN, color='lightblue', alpha=0.2)
        
        # Set extent
        if plot_extent is not None:
            ax.set_extent(plot_extent, crs=ccrs.PlateCarree())
        
        # Get data for this time step
        day_data = plot_data.isel(time=i)
        
        # Plot the data
        im = ax.pcolormesh(
            day_data.longitude, 
            day_data.latitude, 
            day_data,
            transform=ccrs.PlateCarree(),
            cmap=cmap_with_nan,
            vmin=vmin,
            vmax=vmax,
            shading='auto'
        )
        ims.append(im)
        
        # Format date for title
        time_val = plot_data.time.values[i]
        try:
            # Handle different time formats
            if hasattr(time_val, 'strftime'):
                date_str = time_val.strftime('%Y-%m-%d')
            elif hasattr(time_val, 'astype'):
                # Handle numpy datetime64
                date_str = str(time_val.astype('datetime64[D]'))
            else:
                date_str = str(time_val)[:10]
        except Exception:
            date_str = str(time_val)[:10]
        
        # Set title with date
        ax.set_title(f'{title_tag}\n{date_str}', fontsize=10, pad=10)
        
        # Add gridlines for larger subplots
        if figsize_per_plot[0] >= 6:
            gl = ax.gridlines(draw_labels=True, linewidth=0.3, alpha=0.5)
            gl.top_labels = False
            gl.right_labels = False
            gl.xlabel_style = {'size': 8}
            gl.ylabel_style = {'size': 8}
    
    # Add a single colorbar for all subplots
    # Position colorbar on the right side of the figure
    cbar_ax = fig.add_axes([0.92, 0.15, 0.02, 0.7])  # [left, bottom, width, height]
    cbar = fig.colorbar(ims[0], cax=cbar_ax, orientation='vertical')
    cbar.set_label(variable_name, fontsize=12)
    
    # Add main title with date range
    time_range = f"{str(plot_data.time.values[0])[:10]} to {str(plot_data.time.values[-1])[:10]}"
    fig.suptitle(f'{title_tag} - Daily Views\n{time_range}', fontsize=16, y=0.95)
    
    # Adjust layout
    plt.tight_layout()
    plt.subplots_adjust(right=0.9, top=0.88)  # Make room for colorbar and title
    
    # Generate filename
    if bbox:
        bbox_str = f"{bbox[0]:.0f}_{bbox[1]:.0f}_{bbox[2]:.0f}_{bbox[3]:.0f}"
        region_tag = "subset"
    else:
        bbox_str = "global" 
        region_tag = "global"
    
    start_date = str(plot_data.time.values[0])[:10].replace('-', '')
    end_date = str(plot_data.time.values[-1])[:10].replace('-', '')
    filename = f"PACE_L3_{variable_name}_daily_{start_date}_{end_date}_{region_tag}_{bbox_str}.png"
    
    save_path = os.path.join(outdir, filename)
    plt.savefig(save_path, dpi=300, bbox_inches='tight', facecolor='white')
    print(f"Daily subplot plot saved to: {save_path}")
    
    plt.show()
    
    return fig, axes
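
The grid and figure size used by plot_pace_l3_data_daily follow directly from the number of time steps. The sketch below repeats that arithmetic in isolation so the layout can be checked before plotting; subplot_grid is a hypothetical helper, and 37 is the number of granules found above.

```python
import math


def subplot_grid(n_times, ncols=None, figsize_per_plot=(6, 5)):
    """Return (nrows, ncols, (fig_width, fig_height)) for n_times panels."""
    if ncols is None:
        ncols = min(4, n_times)  # at most 4 columns, as in the plotting function
    nrows = math.ceil(n_times / ncols)
    fig_size = (ncols * figsize_per_plot[0], nrows * figsize_per_plot[1])
    return nrows, ncols, fig_size


print(subplot_grid(37))                                     # (10, 4, (24, 50))
print(subplot_grid(37, ncols=2, figsize_per_plot=(10, 6)))  # (19, 2, (20, 114))
```

With ncols=2 and 10 x 6 inch panels, 37 days give a figure 114 inches tall, which may be slow to render and save at 300 dpi; more columns or smaller panels keep it manageable.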



In [None]:
def plot_pace_l3_data_single(dataset, variable_name, plot_range=None, colormap='viridis', 
                            outdir='.', title_tag='PACE L3', bbox=None, 
                            percentile_range=(2, 98), nan_color='black'):
    """Single plot version with black boundaries and date in title"""
    
    if dataset is None or variable_name not in dataset.variables:
        print("No data to plot!")
        return None, None
    
    data_var = dataset[variable_name]
    
    # Handle time dimension if present
    if 'time' in data_var.dims:
        if len(data_var.time) == 1:
            plot_data = data_var.isel(time=0)
            time_val = data_var.time.values[0]
        else:
            plot_data = data_var.mean(dim='time')
            time_val = data_var.time.values[0]  # Use first date for title
    else:
        plot_data = data_var
        time_val = None
    
    # Format date for title
    if time_val is not None:
        try:
            if hasattr(time_val, 'strftime'):
                date_str = time_val.strftime('%Y-%m-%d')
            elif hasattr(time_val, 'astype'):
                date_str = str(time_val.astype('datetime64[D]'))
            else:
                date_str = str(time_val)[:10]
        except Exception:
            date_str = str(time_val)[:10]
    else:
        date_str = "Unknown date"
    
    # Subset data for plotting if bbox is specified
    if bbox is not None:
        lon_min, lat_min, lon_max, lat_max = bbox
        try:
            lons = dataset.longitude.values
            lats = dataset.latitude.values
            
            if lon_max < lon_min:
                lon_mask = (lons >= lon_min) | (lons <= lon_max)
            else:
                lon_mask = (lons >= lon_min) & (lons <= lon_max)
            
            lat_mask = (lats >= lat_min) & (lats <= lat_max)
            
            if np.sum(lon_mask) == 0 or np.sum(lat_mask) == 0:
                plot_extent = None
            else:
                plot_data = plot_data.sel(
                    longitude=plot_data.longitude[lon_mask],
                    latitude=plot_data.latitude[lat_mask]
                )
                plot_extent = [lon_min, lon_max, lat_min, lat_max]
        except Exception:
            plot_extent = None
    else:
        lon_min, lon_max = float(dataset.longitude.min()), float(dataset.longitude.max())
        lat_min, lat_max = float(dataset.latitude.min()), float(dataset.latitude.max())
        plot_extent = [lon_min, lon_max, lat_min, lat_max]
    
    # Calculate plot range
    if plot_range is None:
        valid_data = plot_data.values[~np.isnan(plot_data.values)]
        if len(valid_data) == 0:
            vmin, vmax = 0, 1
        else:
            vmin = np.percentile(valid_data, percentile_range[0])
            vmax = np.percentile(valid_data, percentile_range[1])
    else:
        vmin, vmax = plot_range
    
    # Create the plot
    fig = plt.figure(figsize=(15, 10))
    ax = plt.axes(projection=ccrs.PlateCarree())
    
    # Handle NaN colors
    base_cmap = plt.colormaps.get_cmap(colormap)
    cmap_with_nan = base_cmap.copy()
    
    if nan_color.lower() in ['black', 'k']:
        ax.set_facecolor('black')
        cmap_with_nan.set_bad(alpha=0)
    elif nan_color.lower() in ['transparent', 'none']:
        ax.set_facecolor('white')
        cmap_with_nan.set_bad(alpha=0)
    else:
        ax.set_facecolor(nan_color)
        cmap_with_nan.set_bad(alpha=0)
    
    # Add map features with black boundaries, tan land and light blue ocean
    ax.add_feature(cfeature.COASTLINE, linewidth=0.5, edgecolor='black')
    ax.add_feature(cfeature.BORDERS, linewidth=0.5, edgecolor='black')
    ax.add_feature(cfeature.STATES, linewidth=0.3, edgecolor='black')
    ax.add_feature(cfeature.LAND, color='tan', alpha=0.3)  # 'lightbrown' is not a named matplotlib color
    ax.add_feature(cfeature.OCEAN, color='lightblue', alpha=0.3)
    
    # Set extent
    if plot_extent is not None:
        ax.set_extent(plot_extent, crs=ccrs.PlateCarree())
    
    # Plot the data
    im = ax.pcolormesh(
        plot_data.longitude, 
        plot_data.latitude, 
        plot_data,
        transform=ccrs.PlateCarree(),
        cmap=cmap_with_nan,
        vmin=vmin,
        vmax=vmax,
        shading='auto'
    )
    
    # Add colorbar
    cbar = plt.colorbar(im, ax=ax, orientation='vertical', shrink=0.8, pad=0.05)
    cbar.set_label(variable_name, fontsize=12)
    
    # Add title with date
    plt.title(f'{title_tag}\n{date_str}', fontsize=14)
    ax.set_xlabel('Longitude', fontsize=12)
    ax.set_ylabel('Latitude', fontsize=12)
    
    # Add gridlines
    gl = ax.gridlines(draw_labels=True, linewidth=0.5, alpha=0.5)
    gl.top_labels = False
    gl.right_labels = False
    
    plt.tight_layout()
    
    # Generate filename
    if bbox:
        bbox_str = f"{bbox[0]:.0f}_{bbox[1]:.0f}_{bbox[2]:.0f}_{bbox[3]:.0f}"
        region_tag = "subset"
    else:
        bbox_str = "global"
        region_tag = "global"
    
    date_str_clean = date_str.replace('-', '')
    filename = f"PACE_L3_{variable_name}_{date_str_clean}_{region_tag}_{bbox_str}.png"
    
    save_path = os.path.join(outdir, filename)
    plt.savefig(save_path, dpi=300, bbox_inches='tight', facecolor='white')
    print(f"Plot saved to: {save_path}")
    
    plt.show()
    
    return fig, ax
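
The plotting functions build the longitude mask differently when the bounding box crosses the dateline (lon_max < lon_min). The sketch below shows the same test on a plain list of longitudes; lon_in_bbox is a hypothetical helper and the longitude values are made up for illustration.

```python
def lon_in_bbox(lon, lon_min, lon_max):
    """True if lon lies inside [lon_min, lon_max], allowing a dateline-crossing box."""
    if lon_max < lon_min:  # box wraps across the dateline
        return lon >= lon_min or lon <= lon_max
    return lon_min <= lon <= lon_max


lons = [-179.5, -35.0, 0.0, 20.0, 150.0, 179.5]

# Box used in this notebook: -35 to 20 degrees east
print([lon for lon in lons if lon_in_bbox(lon, -35.0, 20.0)])    # [-35.0, 0.0, 20.0]
# Box crossing the dateline: 170 east to 170 west
print([lon for lon in lons if lon_in_bbox(lon, 170.0, -170.0)])  # [-179.5, 179.5]
```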

In [None]:
def create_l3_animation(dataset, variable_name, plot_range=None, colormap='viridis', 
                            outdir='.', title_tag='PACE L3', bbox=None, 
                            percentile_range=(2, 98), nan_color='black',
                            figsize=(12, 8), duration=1000, loop=True):
    """
    Create an animated GIF of PACE Level 3 data cycling through each day
    
    Parameters:
        dataset: xarray Dataset from read_l3_files()
        variable_name: name of variable to plot
        plot_range: tuple of (vmin, vmax) or None for auto-range
        colormap: matplotlib colormap name
        outdir: output directory
        title_tag: tag for plot title
        bbox: tuple of (lon_min, lat_min, lon_max, lat_max) for plot extent only
        percentile_range: percentiles for auto-ranging
        nan_color: color for NaN values
        figsize: figure size (width, height)
        duration: duration per frame in milliseconds
        loop: whether to loop the animation
    
    Returns:
        filename of saved GIF
    """
    
    if dataset is None or variable_name not in dataset.variables:
        print("No data to animate!")
        return None
    
    # Get the data variable
    data_var = dataset[variable_name]
    
    print(f"Original data shape: {data_var.shape}")
    
    # Check if time dimension exists
    if 'time' not in data_var.dims:
        print("No time dimension found - cannot create animation")
        return None
    
    # Get number of time steps
    n_times = len(data_var.time)
    print(f"Found {n_times} time steps for animation")
    
    if n_times == 1:
        print("Only one time step - cannot create animation")
        return None
    
    # Subset data for plotting if bbox is specified
    if bbox is not None:
        lon_min, lat_min, lon_max, lat_max = bbox
        print(f"Applying bbox constraint: lon {lon_min} to {lon_max}, lat {lat_min} to {lat_max}")
        
        try:
            lons = dataset.longitude.values
            lats = dataset.latitude.values
            
            if lon_max < lon_min:  # bbox crosses dateline
                lon_mask = (lons >= lon_min) | (lons <= lon_max)
            else:
                lon_mask = (lons >= lon_min) & (lons <= lon_max)
            
            lat_mask = (lats >= lat_min) & (lats <= lat_max)
            
            if np.sum(lon_mask) == 0 or np.sum(lat_mask) == 0:
                print("Warning: No data points within specified bbox, using full dataset")
                plot_extent = [lon_min, lon_max, lat_min, lat_max]  # Still use bbox for extent
                plot_data = data_var
            else:
                plot_data = data_var.sel(
                    longitude=data_var.longitude[lon_mask],
                    latitude=data_var.latitude[lat_mask]
                )
                plot_extent = [lon_min, lon_max, lat_min, lat_max]
                print(f"Data subset to shape: {plot_data.shape}")
                
        except Exception as e:
            print(f"Error during subsetting: {e}, using full dataset")
            plot_extent = [lon_min, lon_max, lat_min, lat_max]  # Still use bbox for extent
            plot_data = data_var
    else:
        plot_data = data_var
        lon_min, lon_max = float(dataset.longitude.min()), float(dataset.longitude.max())
        lat_min, lat_max = float(dataset.latitude.min()), float(dataset.latitude.max())
        plot_extent = [lon_min, lon_max, lat_min, lat_max]
        print("Using full dataset extent")
    
    # Calculate plot range using the (possibly subsetted) data
    if plot_range is None:
        valid_data = plot_data.values[~np.isnan(plot_data.values)]
        if len(valid_data) == 0:
            print("Warning: No valid data found!")
            vmin, vmax = 0, 1
        else:
            vmin = np.percentile(valid_data, percentile_range[0])
            vmax = np.percentile(valid_data, percentile_range[1])
            actual_min = np.min(valid_data)
            actual_max = np.max(valid_data)
            print(f"Data range: {actual_min:.4f} to {actual_max:.4f}")
            print(f"Auto-calculated plot range ({percentile_range[0]}-{percentile_range[1]}%): {vmin:.3f} to {vmax:.3f}")
    else:
        vmin, vmax = plot_range
        print(f"Using specified plot range: {vmin} to {vmax}")
    
    # Create colormap with NaN handling
    print(f"Using colormap: {colormap}")
    base_cmap = plt.colormaps.get_cmap(colormap)
    cmap_with_nan = base_cmap.copy()
    
    # Handle NaN colors for cartopy
    if nan_color.lower() in ['black', 'k']:
        nan_facecolor = 'black'
        cmap_with_nan.set_bad(alpha=0)
    elif nan_color.lower() in ['transparent', 'none']:
        nan_facecolor = 'white'
        cmap_with_nan.set_bad(alpha=0)
    else:
        nan_facecolor = nan_color
        cmap_with_nan.set_bad(alpha=0)
    
    print(f"NaN areas will appear as: {nan_facecolor}")
    
    # Create the figure (will be reused for each frame)
    fig = plt.figure(figsize=figsize)
    
    # Store temporary files for frames
    temp_files = []
    
    print("Creating animation frames...")
    
    for i in range(n_times):
        print(f"  Creating frame {i+1}/{n_times}")
        
        # Clear the figure
        fig.clear()
        
        # Create subplot with cartopy projection
        ax = fig.add_subplot(1, 1, 1, projection=ccrs.PlateCarree())
        
        # Set background color for NaN values
        ax.set_facecolor(nan_facecolor)
        
        # Add map features with BLACK boundaries and land (as specified)
        ax.add_feature(cfeature.COASTLINE, linewidth=0.5, edgecolor='black')
        ax.add_feature(cfeature.BORDERS, linewidth=0.5, edgecolor='black')
        ax.add_feature(cfeature.STATES, linewidth=0.3, edgecolor='black')
        ax.add_feature(cfeature.LAND, color='black', alpha=0.3)
        ax.add_feature(cfeature.OCEAN, color='lightblue', alpha=0.2)
        
        # Set map extent using bbox
        ax.set_extent(plot_extent, crs=ccrs.PlateCarree())
        
        # Get data for this time step
        day_data = plot_data.isel(time=i)
        
        # Plot the data using specified colormap and range
        im = ax.pcolormesh(
            day_data.longitude, 
            day_data.latitude, 
            day_data,
            transform=ccrs.PlateCarree(),
            cmap=cmap_with_nan,  # Use specified colormap
            vmin=vmin,           # Use specified/calculated range
            vmax=vmax,           # Use specified/calculated range
            shading='auto'
        )
        
        # Add colorbar
        cbar = plt.colorbar(im, ax=ax, orientation='vertical', 
                           shrink=0.8, pad=0.05)
        cbar.set_label(variable_name, fontsize=12)
        
        # Format date for title
        time_val = plot_data.time.values[i]
        try:
            if hasattr(time_val, 'strftime'):
                date_str = time_val.strftime('%Y-%m-%d')
            elif hasattr(time_val, 'astype'):
                date_str = str(time_val.astype('datetime64[D]'))
            else:
                date_str = str(time_val)[:10]
        except Exception:
            date_str = str(time_val)[:10]
        
        # Set title with date
        plt.title(f'{title_tag}\n{date_str}', fontsize=14, pad=20)
        ax.set_xlabel('Longitude', fontsize=12)
        ax.set_ylabel('Latitude', fontsize=12)
        
        # Add gridlines
        gl = ax.gridlines(draw_labels=True, linewidth=0.3, alpha=0.5)
        gl.top_labels = False
        gl.right_labels = False
        
        plt.tight_layout()
        
        # Save frame as temporary image
        temp_filename = f"temp_frame_{i:03d}.png"
        temp_path = os.path.join(outdir, temp_filename)
        plt.savefig(temp_path, dpi=150, bbox_inches='tight', facecolor='white')
        temp_files.append(temp_path)
    
    # Create GIF from frames
    print("Assembling GIF animation...")
    
    try:        
        # Load all frames
        images = []
        for temp_file in temp_files:
            img = Image.open(temp_file)
            images.append(img)
        
        # Generate output filename
        if bbox:
            bbox_str = f"{bbox[0]:.0f}_{bbox[1]:.0f}_{bbox[2]:.0f}_{bbox[3]:.0f}"
            region_tag = "subset"
        else:
            bbox_str = "global"
            region_tag = "global"
        
        start_date = str(plot_data.time.values[0])[:10].replace('-', '')
        end_date = str(plot_data.time.values[-1])[:10].replace('-', '')
        
        # Include colormap and range in filename for clarity
        range_str = f"{vmin:.2f}_{vmax:.2f}".replace('.', 'p').replace('-', 'neg')
        gif_filename = f"PACE_L3_{variable_name}_{colormap}_{range_str}_animation_{start_date}_{end_date}_{region_tag}_{bbox_str}.gif"
        gif_path = os.path.join(outdir, gif_filename)
        
        # Save as GIF
        images[0].save(
            gif_path,
            save_all=True,
            append_images=images[1:],
            duration=duration,
            loop=0 if loop else 1,
            optimize=True
        )
        
        print(f"GIF animation saved to: {gif_path}")
        print(f"  - Colormap: {colormap}")
        print(f"  - Range: {vmin:.3f} to {vmax:.3f}")
        print(f"  - Extent: {plot_extent}")
        print(f"  - Duration: {duration}ms per frame")
        
        # Clean up temporary files
        for temp_file in temp_files:
            try:
                os.remove(temp_file)
            except OSError:
                pass
        
        plt.close(fig)
        
        return gif_path
        
    except ImportError:
        print("PIL (Pillow) not installed. Install with: pip install Pillow")
        print("Temporary frame files saved for manual GIF creation:")
        for temp_file in temp_files:
            print(f"  {temp_file}")
        return None
    
    except Exception as e:
        print(f"Error creating GIF: {e}")
        print("Temporary frame files saved:")
        for temp_file in temp_files:
            print(f"  {temp_file}")
        return None
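
The GIF file name encodes the variable, colormap, plot range, dates and region. The sketch below reproduces only that naming step so the output name can be predicted in advance; animation_filename is a hypothetical helper, and the start and end dates passed in are assumed values, since the real ones come from the first and last time steps in the dataset.

```python
def animation_filename(variable_name, colormap, vmin, vmax,
                       start_date, end_date, bbox=None):
    """Build the GIF name the same way create_l3_animation does."""
    if bbox:
        bbox_str = "_".join(f"{v:.0f}" for v in bbox)
        region_tag = "subset"
    else:
        bbox_str = region_tag = "global"
    # Decimal points become 'p' and minus signs 'neg' in the range part
    range_str = f"{vmin:.2f}_{vmax:.2f}".replace(".", "p").replace("-", "neg")
    return (f"PACE_L3_{variable_name}_{colormap}_{range_str}_animation_"
            f"{start_date}_{end_date}_{region_tag}_{bbox_str}.gif")


# Dates here are assumptions for illustration
print(animation_filename("NUV_AerosolIndex", "hot_r", 0, 5,
                         "20250801", "20250907", (-35., -30., 20., 10.0)))
```

Note that the minus-sign replacement applies only to the range part; negative bbox values keep their "-" in the file name.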

In [None]:
# duration is the time per frame in milliseconds: 2000 gives a slow animation (2 s per frame), 500 a fast one (0.5 s per frame)
gif_path = create_l3_animation(
    dataset, 
    variable_name,
    plot_range=plot_range,
    colormap=colormap,
    outdir=outdir,
    title_tag=TAG,
    bbox=bbox,
    duration=250,            # 0.25 seconds per frame
    figsize=(16, 8)          # Larger figure
)

In [None]:
# Multi-day subplot plot (automatic grid)
fig, axes = plot_pace_l3_data_daily(
    dataset, 
    variable_name, 
    plot_range=plot_range,
    colormap=colormap,
    outdir=outdir,
    title_tag=TAG, 
    bbox=bbox,
    ncols=2,  
    figsize_per_plot=(10, 6),  # Larger subplots
)

In [None]:
data_var = dataset[variable_name]
print(data_var.time)