---
title: Loading bed picks -- Work in progress exploration
date: 2025-11-25
---

This notebook is meant as a starting point for a conversation about how to handle OPR bed picks and how to integrate other bed pick sources (primarily the BedMap3 dataset) into xOPR.

Right now, this is on a custom branch with some basic fixes to layer loading but not major changes to the structure. We may want to rethink bed picking more significantly, though.

There are really two somewhat distinct bed pick workflows:
1. Primarily working with bed picks and wanting to be able to trace back to the radar data when needed
2. Primarily working with radar data and wanting to use bed picks as context into that data

Serving both with the same interface may be a bit tricky, but I think it'll be worth the effort. See `demo_notebook.ipynb` for context on how the second use case works.

In [None]:
%load_ext autoreload
%autoreload 2

In [None]:
import xopr

import holoviews as hv
import xarray as xr
import hvplot
import hvplot.xarray
import hvplot.pandas
import geoviews.feature as gf
import cartopy.crs as ccrs
import rioxarray
from tqdm import tqdm
import numpy as np
import verde as vd

In [None]:
opr = xopr.OPRConnection(cache_dir='radar_cache')

In [None]:
epsg_3031 = ccrs.Stereographic(central_latitude=-90, true_scale_latitude=-71)
coastline = gf.coastline.options(scale='50m').opts(projection=epsg_3031)
velocity = rioxarray.open_rasterio(
    "https://its-live-data.s3.amazonaws.com/velocity_mosaic/v2/static/cog/ITS_LIVE_velocity_120m_RGI19A_0000_v02_v.tif",
    chunks='auto', overview_level=4, cache=False
).squeeze().drop_vars(['spatial_ref', 'band']).rename('velocity (m/year)')
velocity_map = velocity.hvplot.image(x='x', y='y', cmap='gray_r').opts(clim=(0,500))

In [None]:
region = xopr.geometry.get_antarctic_regions(name=["Vincennes_Bay", "Underwood"], merge_regions=True, simplify_tolerance=100)
region_projected = xopr.geometry.project_geojson(region, source_crs='EPSG:4326', target_crs="EPSG:3031")

region_hv = hv.Polygons([region_projected]).opts(
    color='green',
    line_color='black',
    fill_alpha=0.3)

(velocity_map * coastline * region_hv).opts(aspect='equal')

In [None]:
gdf = opr.query_frames(geometry=region).to_crs('EPSG:3031')
print(f"Found {len(gdf)} radar frames in the selected region.")
gdf.head()

In [None]:
radar_frames_hv = gdf.hvplot(by='collection', hover_cols=['id'])
(velocity_map * coastline * region_hv * radar_frames_hv).opts(aspect='equal', legend_position='top_left')

### Getting bed picks

Now that we've got our frames selected, we can load layer information for them. Layer information includes surface and bed and might come from the OPS API or from layerdata files hosted on OPR servers and indexed in the STAC catalog.

See https://gitlab.com/openpolarradar/opr/-/wikis/Layer-File-Guide

In [None]:
layer_ds_list = []

with tqdm(gdf.iterrows(), total=len(gdf)) as t:
    for id, frame in t:
        t.set_description(f"{id}")
        layers = opr.get_layers(frame)
        bed_layer_name = None
        if 'standard:bottom' in layers: # Generally, the picked bed should be in group "standard" with layer name "bottom"
            bed_layer_name = 'standard:bottom'
        elif ':bottom' in layers:
            bed_layer_name = ':bottom' # But occasionally it seems to be missing the group
        else:
            continue  # No bed layer found
        # Layers are stored in terms of two-way travel time to avoid any questions about travel speed within ice
        # This is different from how BedMap layers are stored, but it does make more sense to use TWTT when the radar data is available
        layer_wgs84 = xopr.radar_util.layer_twtt_to_range(layers[bed_layer_name], layers["standard:surface"], vertical_coordinate='wgs84').rename({'lat': 'Latitude', 'lon': 'Longitude'})
        layer_wgs84 = xopr.geometry.project_dataset(layer_wgs84, target_crs='EPSG:3031')
        layer_wgs84 = layer_wgs84.dropna('slow_time', subset=['wgs84'])
        layer_wgs84['source'] = id
        layer_ds_list.append(layer_wgs84)
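
The two-way travel time (TWTT) convention mentioned in the comments above can be illustrated with a minimal sketch. This is not necessarily how `xopr.radar_util.layer_twtt_to_range` works internally: the function name `twtt_to_bed_elevation`, the relative permittivity of 3.15, and the sample numbers are all assumptions for illustration.

```python
import numpy as np

C = 299_792_458.0  # speed of light in vacuum (m/s)
EPS_ICE = 3.15     # assumed relative permittivity of ice (not taken from xopr)

def twtt_to_bed_elevation(aircraft_elev, surface_twtt, bed_twtt):
    """Convert surface and bed two-way travel times (s) to a bed elevation (m).

    The wave travels at C through air down to the ice surface and at
    C / sqrt(EPS_ICE) through ice. Times are two-way, hence the factor of 2.
    """
    surface_twtt = np.asarray(surface_twtt, dtype=float)
    bed_twtt = np.asarray(bed_twtt, dtype=float)
    v_ice = C / np.sqrt(EPS_ICE)
    surface_range = surface_twtt * C / 2.0               # aircraft to ice surface
    thickness = (bed_twtt - surface_twtt) * v_ice / 2.0  # ice surface to bed
    return np.asarray(aircraft_elev, dtype=float) - surface_range - thickness

# Hypothetical pick: aircraft at 1500 m, 500 m of air, then 1000 m of ice
surface_twtt = 2 * 500.0 / C
bed_twtt = surface_twtt + 2 * 1000.0 / (C / np.sqrt(EPS_ICE))
bed_elev = twtt_to_bed_elevation(1500.0, surface_twtt, bed_twtt)
```

Storing picks as TWTT means the assumed permittivity only enters at this conversion step, so it can be changed later without re-picking.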

We can now combine all of the layers to get a pointwise list of bed picks:

In [None]:
bed_merged = xr.concat(layer_ds_list, dim='slow_time')

# Just for plots later
xlim = (bed_merged.x.min().item(), bed_merged.x.max().item())
ylim = (bed_merged.y.min().item(), bed_merged.y.max().item())

bed_merged

In [None]:
bed_hv = bed_merged.hvplot.scatter(x='x', y='y', c='wgs84', cmap='turbo', s=2).opts(clabel='Bed Elevation WGS84 (m)')
(velocity_map.opts(colorbar=False) * coastline * region_hv * radar_frames_hv * bed_hv).opts(aspect='equal', legend_position='top_left', xlim=xlim, ylim=ylim)

### Gridding

My understanding of what Michael and Mickey want to do is that they want to aggregate the picks onto a regular grid, keeping some summary statistics within each cell. This is pretty ugly here, but just to demonstrate the workflow:

In [None]:
def grid_dataarray(d: xr.DataArray, spacing=1000, aggregation_fns={'median': "median", 'std': 'std', 'count': "count"}):
    """
    Grid a DataArray with x,y coordinates into a regular grid using block aggregation.
    
    Parameters
    ----------
    d : xr.DataArray
        Input DataArray with 'x' and 'y' coordinates
    spacing : float
        Grid spacing in the same units as x,y coordinates
    aggregation_fns : dict
        Dictionary mapping a name suffix to the reduction passed to `vd.BlockReduce`: either a function
        (e.g., np.median) or a string such as "median", as in the defaults
    
    Returns
    -------
    xr.Dataset
        Dataset with variables named {d.name}_{fn_name} for each aggregation function
    """
    # Get data extent
    x_min = d['x'].min().values
    x_max = d['x'].max().values
    y_min = d['y'].min().values
    y_max = d['y'].max().values
    
    # Extract coordinate and data values
    x_data = d['x'].values
    y_data = d['y'].values
    data_values = d.values
    
    # Create grid coordinates
    grid_x, grid_y = vd.grid_coordinates(
        region=(x_min, x_max, y_min, y_max),
        spacing=spacing
    )
    
    # Dictionary to store gridded results for each aggregation function
    data_vars = {}
    
    for fn_name, fn in aggregation_fns.items():
        # Use Verde's BlockReduce with the specified aggregation function
        gridder = vd.BlockReduce(
            reduction=fn, 
            spacing=spacing, 
            region=(x_min, x_max, y_min, y_max),
            center_coordinates=True
        )
        block_coords, block_values = gridder.filter(
            coordinates=(x_data, y_data), 
            data=data_values
        )
        
        # Initialize grid with NaN
        grid_data = np.full(grid_x.shape, np.nan)
        
        # Compute grid indices directly from the block-center coordinates
        x_indices = np.floor((block_coords[0] - x_min) / spacing).astype(int)
        y_indices = np.floor((block_coords[1] - y_min) / spacing).astype(int)

        # Fill the grid in a single vectorized assignment
        grid_data[y_indices.ravel(), x_indices.ravel()] = np.ravel(block_values)
        
        # Store in dictionary with name pattern
        var_name = f"{d.name}_{fn_name}" if d.name else f"data_{fn_name}"
        data_vars[var_name] = (['y', 'x'], grid_data)
    
    # Create Dataset with all aggregated variables
    return xr.Dataset(
        data_vars=data_vars,
        coords={
            'y': grid_y[:, 0],
            'x': grid_x[0, :]
        }
    )

gridded = grid_dataarray(bed_merged['wgs84'], spacing=5000)

gridded_median_hv = hv.Image(gridded, kdims=['x', 'y'], vdims=['wgs84_median', 'wgs84_std', 'wgs84_count']).opts(
    cmap='turbo',
    aspect='equal',
    tools=['hover'],
    colorbar=True,
    clabel='WGS84 Elevation (m)'
)

gridded_std_hv = hv.Image(gridded, kdims=['x', 'y'], vdims=['wgs84_std', 'wgs84_median', 'wgs84_count']).opts(
    cmap='inferno',
    aspect='equal',
    tools=['hover'],
    colorbar=True,
    clabel='Std of WGS84 Elevation (m)'
)

(velocity_map * region_hv * coastline * gridded_median_hv).opts(width=500, aspect='equal', xlim=xlim, ylim=ylim) + \
    (velocity_map * region_hv * coastline * gridded_std_hv).opts(width=500, aspect='equal', xlim=xlim, ylim=ylim)
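
As a cross-check on the Verde-based function above, the same per-cell statistics can be sketched with a plain pandas `groupby`. This is a rough sketch rather than a replacement: the function name `block_stats` and the synthetic picks are made up for illustration, and cells are simply indexed from the minimum x and y instead of being laid out on a Verde grid.

```python
import numpy as np
import pandas as pd

def block_stats(x, y, values, spacing):
    """Aggregate scattered points into square cells of side `spacing`.

    Returns a DataFrame indexed by (iy, ix) cell indices with the median,
    standard deviation, and count of the values in each occupied cell.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    df = pd.DataFrame({
        "ix": np.floor((x - x.min()) / spacing).astype(int),
        "iy": np.floor((y - y.min()) / spacing).astype(int),
        "value": np.asarray(values, dtype=float),
    })
    # pandas' "std" uses ddof=1, so a cell holding a single pick gets NaN
    return df.groupby(["iy", "ix"])["value"].agg(["median", "std", "count"])

# Synthetic picks (hypothetical values): three in one cell, one in the next
x = [0.0, 100.0, 200.0, 1500.0]
y = [0.0, 50.0, 900.0, 100.0]
z = [-10.0, -20.0, -30.0, -100.0]
stats = block_stats(x, y, z, spacing=1000.0)
```

Only occupied cells appear in the result, so turning it into a dense grid for plotting would still need a reindexing step.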

### Discussion

OPR and BedMap(1/2/3) are overlapping sets of bed picks, but neither fully encompasses the other. OPR is probably the preferable source when both have the same bed picks, because OPR facilitates linking back to the source radar data.

(I picked this particular region because there's data from the 2017 UTIG season that we've recently made available through OPR but is missing from BedMap as far as I can tell.)

Mathieu has told me that his workflow for resolving discrepancies involves checking the source radar data when it's available to try to confirm which pick is right. It seems pretty high value to maintain the links back to the radar source and make it really easy to load the source data when something needs to be checked out.

I think it's important that the same basic workflow can be used to fetch either OPR bed picks, BedMap3 bed picks, or a unified set of both (with conflicts resolved). The question is what this interface should actually look like.
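
To make the idea of a unified set concrete, here is one rough sketch of what a combined pick table could look like. Everything in it is hypothetical: the `unify_picks` function, the column names, the sample values, and especially the rule that a BedMap3 pick counts as a duplicate when it falls in the same `tolerance`-sized cell as an OPR pick. The only idea taken from the discussion above is that OPR picks win when both sources cover the same spot, because they keep a link (`frame_id` here) back to the radar frame.

```python
import pandas as pd

def unify_picks(opr_picks, bedmap_picks, tolerance=100.0):
    """Combine two pick tables, dropping BedMap3 picks that share a cell with an OPR pick."""
    def with_cell(df, source):
        out = df.copy()
        out["source"] = source
        # Integer cell indices act as a crude "same location" key (an assumption)
        out["cx"] = (out["x"] // tolerance).astype(int)
        out["cy"] = (out["y"] // tolerance).astype(int)
        return out

    opr_cells = with_cell(opr_picks, "OPR")
    bedmap_cells = with_cell(bedmap_picks, "BedMap3")

    # Keep only the BedMap3 picks whose cell holds no OPR pick
    occupied = opr_cells[["cx", "cy"]].drop_duplicates()
    flagged = bedmap_cells.merge(occupied, on=["cx", "cy"], how="left", indicator=True)
    bedmap_only = flagged[flagged["_merge"] == "left_only"].drop(columns="_merge")

    combined = pd.concat([opr_cells, bedmap_only], ignore_index=True)
    return combined.drop(columns=["cx", "cy"])

# Hypothetical picks in projected coordinates (m); only OPR carries a frame link
opr_picks = pd.DataFrame({"x": [0.0, 1000.0], "y": [0.0, 0.0],
                          "bed": [-100.0, -200.0], "frame_id": ["frame_a", "frame_a"]})
bedmap_picks = pd.DataFrame({"x": [20.0, 5000.0], "y": [30.0, 0.0],
                             "bed": [-110.0, -300.0]})
unified = unify_picks(opr_picks, bedmap_picks)
```

In this sketch the `source` column records where each pick came from, and `frame_id` is left empty for BedMap3 picks.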

It's appealing to me to keep the concept of a STAC catalog that indexes flight paths, with data products carrying the layer pick information attached to them. It's not necessarily clear we should follow the OPR standard for what that data product looks like, though. And there might be altogether better ways to deal with all of this.