---
title: xopr Demo Notebook
description: Basic demonstration of the core data loading functions of xopr
date: 2025-08-06
---

This is a basic demonstration of the core features of xopr for loading and plotting radar data.

In [None]:
%load_ext autoreload
%autoreload 2

import numpy as np
import xarray as xr
import geoviews as gv
import geoviews.feature as gf
import cartopy.crs as ccrs
import matplotlib.pyplot as plt

import xopr

import holoviews as hv
import hvplot.xarray
import hvplot.pandas
hvplot.extension('bokeh')

You'll first establish an OPR session. This object serves to contain any needed information about how to connect to OPR and to the STAC API.

Generally, you can just use `opr = xopr.OPRConnection()`, but you may want to customize other options. The most common would be to create a local radar cache, which you can do as shown in the cell below.

If you specify a `cache_dir`, then [fsspec](https://filesystem-spec.readthedocs.io/en/latest/features.html) will automatically manage a cache of any radar data that you need to download. This makes re-running things fast.

In [None]:
# Establish an OPR session
# You'll probably want to set a cache directory if you're running this locally to speed
# up subsequent requests. You can do other things like customize the STAC API endpoint,
# but you shouldn't need to do that for most use cases.
opr = xopr.OPRConnection(cache_dir="/tmp")

# Or you can open a connection without a cache directory (for example, if you're parallelizing
# this on a cloud cluster without persistent storage).
#opr = xopr.OPRConnection()

In the STAC catalog, every season (an entity such as `2022_Antarctica_BaslerMKB`) is a distinct collection. You can list the available collections.

In [None]:
# List the available OPR datasets
collections = opr.get_collections()
print([c['id'] for c in collections])
selected_collection = '2022_Antarctica_BaslerMKB' # Select a collection for demonstration
print(f"Selected collection: {selected_collection}")

Similarly, you can list available segments for a given season. We use the OPR terminology of `segment` to refer to a flight segment. In most cases, one segment corresponds to a single flight, but sometimes flights are broken into multiple segments for various reasons. A segment is uniquely defined by a collection (e.g. `2022_Antarctica_BaslerMKB`) and a segment path (e.g. `20221212_01`).
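
As a small illustration of that two-part key, here is a sketch that splits a segment path into a date and a segment number. It assumes the `YYYYMMDD_NN` layout seen in the example above; `SegmentKey` and `parse_segment` are hypothetical helpers for this notebook, not part of xopr.

```python
from datetime import date, datetime
from typing import NamedTuple


class SegmentKey(NamedTuple):
    """Illustrative container for a segment identifier; not an xopr type."""
    collection: str
    flight_date: date
    segment_number: int


def parse_segment(collection: str, segment_path: str) -> SegmentKey:
    # Assumption: segment paths look like 'YYYYMMDD_NN', as in '20221212_01'
    day, number = segment_path.split("_")
    flight_date = datetime.strptime(day, "%Y%m%d").date()
    return SegmentKey(collection, flight_date, int(number))


key = parse_segment("2022_Antarctica_BaslerMKB", "20221212_01")
print(key.flight_date, key.segment_number)
```

Keeping the collection alongside the parsed path matters because the same segment path could, in principle, appear in more than one collection.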

When we add a season into the STAC catalog, we add every segment for which a `CSARP_standard` product is available. (And we also link to other data products, if those are available.)

In [None]:
# List segments in the selected collection
segments = opr.get_segments(selected_collection)
print(f"Found {len(segments)} segments in collection {selected_collection}")
print(f"The first 3 segments are: {[s['segment_path'] for s in segments[:3]]}")

Once you pick a flight, we can actually start loading data. This happens in two steps. First, we get the STAC items, which are a description of the geometry and properties of each frame of data. They are returned from `query_frames()` as a `GeoPandas` `GeoDataFrame`.

In [None]:
selected_segment = '20230109_01' # Or from the list of segments like this: segments[0]['segment_path']
print(f"Selected segment: {selected_segment}")

stac_items = opr.query_frames(collections=[selected_collection], segment_paths=[selected_segment])
stac_items.head(3)

We can take a look at where our flight line is by using the hvPlot accessor (`.hvplot`) on the `GeoDataFrame`.

In [None]:
background_map = gf.ocean.opts(projection=ccrs.SouthPolarStereo(), scale='50m') * gf.coastline.opts(projection=ccrs.SouthPolarStereo(), scale='50m')
background_map * stac_items.to_crs('EPSG:3031').hvplot(aspect='equal')

Now we'll use the STAC items to actually load radar data.

How much data this needs to transfer will vary depending on both how long the flight is and how the underlying data is stored. We're working on migrating OPR data files to be cloud-optimized; however, most of them aren't yet, and some of them are old-school MATLAB v5 files. xopr is designed to hide these differences from you as much as possible, but that can only go so far.

In [None]:
frames = opr.load_frames(stac_items)

Let's look at a single frame. This corresponds to a single `.mat` file that you might download from the OPR website. The structure of this should look familiar to you.

If you've tried directly loading one of these files in Python, you'll probably know that there are some quirks. We try to handle those behind the scenes and give you a nicely-formatted [xarray Dataset](https://docs.xarray.dev/en/stable/generated/xarray.Dataset.html).

If you've never heard of xarray, this might be a good time to go read the [xarray overview doc](https://docs.xarray.dev/en/stable/getting-started-guide/why-xarray.html).

```{tip}
xopr produces datasets with very long nested attributes. To provide a nicer notebook preview of them, we've included an xarray accessor that provides an improved `_repr_html_` function. If you want nice previews of nested dictionaries of attributes, you can use:

`radar_ds.xopr`
```

In [None]:
# Inspect an individual frame
frames[0].xopr

You can manually merge the `Dataset`s if you like, but `xopr.merge_frames()` will do it for you. This helper function will return a list with one `xr.Dataset` per segment path or a single `xr.Dataset` if there is only one segment represented.

:::{tip}
You can also pass `merge_flights=True` to `load_frames` and you'll get back a list with one dataset per flight.
:::
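
If you do want to merge by hand, the idea is to concatenate along `slow_time` and then sort. The sketch below uses two small synthetic frames rather than real OPR data, and `make_frame` is a hypothetical helper. It is not how `xopr.merge_frames()` is implemented; it only illustrates the underlying xarray calls.

```python
import numpy as np
import xarray as xr


def make_frame(start_s: int, n_traces: int) -> xr.Dataset:
    """Build a tiny synthetic frame; real frames carry many more variables."""
    offsets = np.arange(start_s, start_s + n_traces).astype("timedelta64[s]")
    slow_time = np.datetime64("2023-01-09T00:00:00") + offsets
    return xr.Dataset(
        {"Data": (("twtt", "slow_time"), np.ones((4, n_traces)))},
        coords={"twtt": np.linspace(0.0, 3e-5, 4), "slow_time": slow_time},
        attrs={"segment_path": "20230109_01"},
    )


# Frames deliberately listed out of order to show why sortby is needed
demo_frames = [make_frame(10, 5), make_frame(0, 10)]

merged = xr.concat(
    demo_frames, dim="slow_time", combine_attrs="drop_conflicts"
).sortby("slow_time")
print(merged.sizes["slow_time"])
```

`combine_attrs="drop_conflicts"` keeps attributes that agree across frames and drops the ones that differ, which is a simple default when frames share most of their metadata.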

In [None]:
flight_line = xopr.merge_frames(frames)

## Combine the frames into a single xarray Dataset representing the flight line
#flight_line = xr.concat(frames, dim='slow_time', combine_attrs=merge_dicts_no_conflicts).sortby('slow_time')
flight_line.xopr

For visualizing radargrams, it's helpful (i.e. we can make the plotting more efficient) if the slow time spacing is even. We can ensure this is true by doing some stacking along a fixed-time window:

```python
stacked = flight_line.resample(slow_time='2s').mean()
```

This gives us uniform spacing in the `slow_time` dimension. This allows for more efficient plotting (`imshow` can only be used with fixed spacing -- try `pcolormesh` if your spacing is variable).
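
To see the effect in isolation, here is a sketch on synthetic data with deliberately irregular pulse times. The dataset and variable names (`demo`, `demo_stacked`) are made up for illustration; only the `resample(...).mean()` call mirrors what we do with the real flight line.

```python
import numpy as np
import xarray as xr

rng = np.random.default_rng(0)

# Irregular trace times: jittered intervals of 0.3 to 0.7 s, in milliseconds
offsets_ms = np.cumsum(rng.integers(300, 700, size=200))
slow_time = np.datetime64("2023-01-09T00:00:00") + offsets_ms.astype("timedelta64[ms]")

demo = xr.Dataset(
    {"Data": (("twtt", "slow_time"), rng.random((50, 200)))},
    coords={"twtt": np.linspace(0.0, 5e-5, 50), "slow_time": slow_time},
)

# Average all traces that fall into each 2-second window
demo_stacked = demo.resample(slow_time="2s").mean()

# After resampling, every step along slow_time is the same length
spacing = np.diff(demo_stacked["slow_time"].values)
print(np.unique(spacing))
```

Note that a window containing no traces would come out as NaN, so long data gaps show up as blank columns rather than being silently closed.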

In [None]:
stacked = flight_line.resample(slow_time='2s').mean()

stacked['Data'] = 10*np.log10(np.abs(stacked['Data']))

stacked.xopr

OPR data often also includes traced layers (surface, bed, and occasionally internal layers). There are two distinct formats that OPR uses, which you may be familiar with from the CReSIS `imb.picker` tool. One is a database that stores layer picks. The other is a layer file that is available as a separate data product.

xopr allows you to fetch the relevant layer information from either source. To select only database layers, you can call `get_layers(ds, source='db')`. For layer files, you can call `get_layers(ds, source='files')`. By default, `get_layers()` will first look for a layer file and fall back to the database if a file cannot be found.
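
The default behavior described above is a file-first lookup with a database fallback. As a generic sketch of that pattern (not xopr's actual code), the example below uses two hypothetical stand-in loaders, `from_files` and `from_db`, and placeholder strings instead of real layer datasets.

```python
from typing import Callable, Optional


def load_with_fallback(
    primary: Callable[[], Optional[dict]],
    fallback: Callable[[], Optional[dict]],
) -> Optional[dict]:
    """Try the primary source; use the fallback if it is missing or empty."""
    try:
        result = primary()
    except FileNotFoundError:
        result = None
    return result if result else fallback()


def from_files() -> dict:
    # Hypothetical loader that finds no layer file for this segment
    raise FileNotFoundError("no layer file for this segment")


def from_db() -> dict:
    # Hypothetical loader returning placeholder picks keyed by layer ID
    return {1: "surface picks", 2: "bed picks"}


layers_demo = load_with_fallback(from_files, from_db)
print(sorted(layers_demo))
```

If you need to know which source your picks came from, requesting `source='files'` or `source='db'` explicitly avoids the ambiguity of the fallback.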

Once again, we do our best to hide the different formats. Once you've loaded the layers either way, you'll get a dictionary mapping layer IDs to xarray Datasets of the same basic structure.

:::{tip}
How to handle layers is still very much under development. We would love feedback on this topic.
:::

In [None]:
layers = opr.get_layers(stacked)

In [None]:
layers[1] # Display the surface layer as an example

We can also add vertical coordinates to the layers. By default, the layer information is provided in `twtt` (two-way travel time), but it can be helpful to transform this to range or WGS84 elevation.
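
For intuition about what a travel-time-to-range conversion involves, here is a minimal sketch of the usual two-medium calculation: the signal travels at the speed of light in air down to the surface and more slowly in ice below it. The function `twtt_to_range` and the ice permittivity of 3.15 are assumptions for illustration; this is not necessarily what `xopr.layer_twtt_to_range` does internally.

```python
import numpy as np

C = 299_792_458.0  # speed of light in vacuum, m/s
EPS_ICE = 3.15     # assumed relative permittivity of ice


def twtt_to_range(twtt, surface_twtt, eps_ice=EPS_ICE):
    """Convert two-way travel time (s) to range from the radar (m)."""
    twtt = np.asarray(twtt, dtype=float)
    surface_twtt = np.asarray(surface_twtt, dtype=float)
    # Split the travel time into the part spent in air and the part in ice
    time_in_air = np.minimum(twtt, surface_twtt)
    time_in_ice = np.maximum(twtt - surface_twtt, 0.0)
    # Halve each term because twtt covers the trip down and back
    return C * time_in_air / 2 + (C / np.sqrt(eps_ice)) * time_in_ice / 2


surface_range = twtt_to_range(4e-6, 4e-6)
bed_range = twtt_to_range(24e-6, 4e-6)
print(f"surface: {surface_range:.0f} m, bed: {bed_range:.0f} m")
```

The surface pick is needed for every layer because it marks where the wave speed changes, which is why the surface layer is passed alongside each layer being converted.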

Remember that `layers` is a `dict` where the keys are OPR layer identifiers. As such, `1` is always the surface and `2` is always the bed.

In [None]:
for layer_idx in layers:
    layers[layer_idx] = xopr.layer_twtt_to_range(layers[layer_idx], layers[1], vertical_coordinate='wgs84')
    layers[layer_idx] = xopr.layer_twtt_to_range(layers[layer_idx], layers[1], vertical_coordinate='range')

Finally, let's make a radargram. If we were successful in loading layers, we will also plot the surface and bed layers here.

:::{tip}
`imshow` is a very fast way of plotting regularly spaced 2D data. It works great if your `slow_time` spacing is uniform, as it is here because we already resampled during stacking. If you have non-uniformly spaced data, you must use `pcolormesh` or another plotting tool that can handle non-uniformly spaced data.
:::
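
If you're not sure whether a coordinate is evenly spaced, you can check before choosing a plotting method. The helper below, `is_uniform`, is a hypothetical convenience function written for this notebook, shown on synthetic timestamps.

```python
import numpy as np


def is_uniform(coord, rtol=1e-6):
    """Return True if consecutive coordinate steps are all (nearly) equal."""
    steps = np.diff(np.asarray(coord))
    if np.issubdtype(steps.dtype, np.timedelta64):
        # Convert time steps to float seconds so they can be compared
        steps = steps / np.timedelta64(1, "s")
    return bool(np.allclose(steps, steps[0], rtol=rtol, atol=0.0))


start = np.datetime64("2023-01-09T00:00:00")
regular = start + np.arange(0, 20, 2).astype("timedelta64[s]")
irregular = start + np.array([0, 1, 3, 4, 8]).astype("timedelta64[s]")

for name, coord in [("regular", regular), ("irregular", irregular)]:
    method = "imshow" if is_uniform(coord) else "pcolormesh"
    print(f"{name}: use {method}")
```

With real data you would pass something like `stacked['slow_time'].values` as the coordinate.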

In [None]:
fig, ax = plt.subplots(figsize=(15, 4))
stacked['Data'].plot.imshow(x='slow_time', cmap='gray', ax=ax)
ax.invert_yaxis()

if layers:
    layers[1]['twtt'].plot(ax=ax, x='slow_time', linestyle=':', label='Surface')
    layers[2]['twtt'].plot(ax=ax, x='slow_time', linestyle='--', label='Bed')
    ax.legend()

ax.set_title(f"{stacked.attrs['collection']} - {stacked.attrs['segment_path']}")
plt.show()

For display purposes, we can also transform the Y axis to show range (distance from the radar instrument, usually positive down) or altitude in WGS84 (negative down):
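
Plotting against elevation requires more than relabeling the axis: the aircraft height changes from trace to trace, so each trace has to be resampled onto a shared vertical grid. The sketch below shows that idea for two synthetic traces using `np.interp`. The helper `trace_to_elevation_grid` is hypothetical and assumes the range of each sample is already known; it is not the implementation of `xopr.interpolate_to_vertical_grid`.

```python
import numpy as np


def trace_to_elevation_grid(power, sample_range, aircraft_elevation, grid):
    """Resample one trace from range-below-aircraft onto a shared elevation grid."""
    sample_elevation = aircraft_elevation - np.asarray(sample_range, dtype=float)
    # np.interp needs increasing x values; elevation decreases with range,
    # so both arrays are reversed. Points outside the trace become NaN.
    return np.interp(
        grid,
        sample_elevation[::-1],
        np.asarray(power, dtype=float)[::-1],
        left=np.nan,
        right=np.nan,
    )


sample_range = np.linspace(0.0, 1000.0, 11)  # m below the aircraft
power = np.arange(11, dtype=float)           # placeholder values, one per sample
grid = np.linspace(500.0, 2000.0, 16)        # shared elevation axis, 100 m steps

# The same trace recorded from two different aircraft elevations
low = trace_to_elevation_grid(power, sample_range, 1500.0, grid)
high = trace_to_elevation_grid(power, sample_range, 1600.0, grid)
print(np.vstack([low, high]))
```

The two rows are shifted by one grid cell relative to each other, which is the per-trace vertical shift that makes terrain and aircraft motion visible in an elevation-referenced radargram.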

In [None]:
vcoord = 'wgs84' # Options are 'range' or 'wgs84'

tmp = xopr.interpolate_to_vertical_grid(stacked, vertical_coordinate=vcoord)

fig, ax = plt.subplots(figsize=(15, 4))
tmp['Data'].plot.imshow(x='slow_time', cmap='gray', ax=ax)
if vcoord == 'range':
    ax.invert_yaxis()

if layers:
    layers[1][vcoord].plot(ax=ax, x='slow_time', linestyle=':', label='Surface')
    layers[2][vcoord].plot(ax=ax, x='slow_time', linestyle='--', label='Bed')
    ax.legend()

ax.set_title(f"{stacked.attrs['collection']} - {stacked.attrs['segment_path']}")
plt.show()

OPR data comes from many sources. Collecting and processing that data has required the contributions of many people over multiple decades. We want to make it easy for you, as a user, to figure out how to appropriately cite the data you've used. This is still just an early prototype, but here's an idea of what it might look like to generate a report on how to cite your data:

In [None]:
print(flight_line.xopr.citation)