# Inspect Oceanum OCC CMIP6 Wave Datasources

This notebook introduces how to search for and inspect wave model outputs from
Oceanum's Our Changing Coast (OCC) wave climate projections, available through Datamesh.

## What You'll Learn

By the end of this tutorial, you will understand how to:
- Connect to the Datamesh API using the Oceanum Python client
- Search for wave datasets using different strategies (keywords, tags, filters)
- Understand the different types of wave data available (parameters, spectra, statistics)
- Inspect dataset metadata to understand data structure and content
- Navigate the CMIP6 climate projection datasets from Oceanum

## Background: Oceanum's OCC CMIP6 Wave Model Data

Oceanum provides high-resolution wave model outputs based on CMIP6 climate projections.
The data includes:

- **WW3 (WAVEWATCH III)**: Global wave model outputs on a 1-degree grid
- **SWAN (Simulating WAves Nearshore)**: High-resolution regional wave model for New Zealand on a 5km grid
- **Wind forcing**: Global and NIWA downscaled (CCAM) wind forcing data from the CMIP6 ACCESS-CM2 and EC-Earth3 global climate models
- **Time Periods**: Historical (1985-2015) and future projections (2015-2100) under different emission scenarios
- **Shared Socioeconomic Pathways**: SSP245, SSP370
- **Data Types**:
  - **Parameters**: Three-hourly integrated wave parameters over the full computational grid (height, period, direction, etc.)
  - **Spectra**: Three-hourly frequency-direction 2D wave energy spectra at a large subset of grid points
  - **Gridstats**: Statistical summaries and derived metrics

### Required Python Libraries

- [oceanum](https://oceanum-python.readthedocs.io/en/latest/): Python client for accessing Datamesh


In [1]:
import textwrap
from oceanum.datamesh import Connector

import warnings
warnings.filterwarnings("ignore")


## 1. Setting Up the Connection

The [Connector](https://oceanum-python.readthedocs.io/en/latest/classes/datamesh/oceanum.datamesh.Connector.html) class provides access to methods for querying data and metadata from Datamesh.

**Authentication**: You can provide your Datamesh token directly or set it as an environment variable `DATAMESH_TOKEN`.
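A minimal sketch of the token lookup described above, assuming you want a clear error before any network call is made. `resolve_token` is a hypothetical helper written for this notebook; it is not part of the `oceanum` client, which already reads `DATAMESH_TOKEN` when no token is passed.

```python
import os


def resolve_token(explicit=None, env_var="DATAMESH_TOKEN"):
    """Return an explicit token if given, otherwise the environment variable.

    Hypothetical helper for illustration; it is not part of the oceanum client.
    """
    token = explicit or os.environ.get(env_var)
    if not token:
        raise RuntimeError(
            f"No Datamesh token found: pass one explicitly or set {env_var}"
        )
    return token


# Usage (needs a valid token, so it is left commented out here):
# from oceanum.datamesh import Connector
# conn = Connector(token=resolve_token())
```

Failing early this way makes a missing token easier to diagnose than an authentication error returned by the first catalog request.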


In [2]:
conn = Connector(token=None)

Using datamesh API version 0
You are using version 1.0.8 of oceanum_python. A new version is available: 1.0.9. Please update your client to benefit from the latest features and updates.


## 2. Searching for Wave Datasets

The `get_catalog` method allows you to search for datasets using keywords that match against dataset names, descriptions, and tags.


### 2.1 Basic Keyword Search

Let's start with a broad search to see all available Oceanum CMIP6 wave datasources:


In [3]:
cat = conn.get_catalog("occ cmip6 wave")

# List the matching datasources. Each entry is a Datasource object carrying all the metadata

print(f"Found {len(list(cat))} datasets matching 'occ cmip6 wave'")

list(cat)


Found 36 datasets matching 'occ cmip6 wave'


[
         Oceanum CMIP6 WW3 EC-Earth3 global historical gridded wave stats [oceanum_cmip6_ww3_global_ec_earth3_historical_r1i1p1f1_gridstats]
             Extent: (0.0, -78.0, 360.0, 78.0)
             Timerange: 1985-01-01 00:00:00+00:00 to 2015-01-01 00:00:00+00:00
         ,
 
         Oceanum CMIP6 SWAN ACCESS-CM2 New Zealand historical wave spectra [oceanum_cmip6_swan_nz_access_cm2_historical_r1i1p1f1_spectra]
             Extent: (165.8000030517578, -47.79999923706055, 179.8000030517578, -34.0)
             Timerange: 1985-01-01 00:00:00+00:00 to 2015-01-01 00:00:00+00:00
         ,
 
         Oceanum CMIP6 WW3 EC-Earth3 global projections ssp370 gridded wave stats [oceanum_cmip6_ww3_global_ec_earth3_ssp370_r1i1p1f1_gridstats]
             Extent: (0.0, -78.0, 360.0, 78.0)
             Timerange: 2015-01-01 00:00:00+00:00 to 2100-01-01 00:00:00+00:00
         ,
 
         Oceanum CMIP6 WW3 ACCESS-CM2 global projections ssp370 gridded wave stats [oceanum_cmip6_ww3_global_access_c
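The IDs in this listing appear to follow a regular pattern of wave model, domain, climate model, experiment, ensemble member and data type. The sketch below enumerates that pattern to show where a count of 36 could come from. The pattern is inferred from the printed IDs rather than taken from an official schema, and the `_grid` suffix for the global WW3 parameters is an assumption because no such ID appears in the visible output.

```python
from itertools import product

# Components inferred from the IDs printed above (assumed, not an official schema)
MODEL_DOMAINS = [("ww3", "global"), ("swan", "nz")]
GCMS = ["access_cm2", "ec_earth3"]
EXPERIMENTS = ["historical", "ssp245", "ssp370"]
KINDS = ["grid", "spectra", "gridstats"]

expected_ids = [
    f"oceanum_cmip6_{model}_{domain}_{gcm}_{experiment}_r1i1p1f1_{kind}"
    for (model, domain), gcm, experiment, kind in product(
        MODEL_DOMAINS, GCMS, EXPERIMENTS, KINDS
    )
]

print(len(expected_ids))  # 2 model/domain pairs x 2 GCMs x 3 experiments x 3 types = 36
```

Comparing `expected_ids` against `[d.id for d in cat]` would be one way to confirm the pattern, or to spot datasets that do not follow it.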

### 2.2 Refined Search by Model and Time Period

The search results show many datasets. Let's refine our search to focus on historical runs driven by the EC-Earth3 climate model:


In [4]:
cat = conn.get_catalog("occ cmip6 wave historical EC-Earth3")

print(f"Found {len(list(cat))} historical EC-Earth3 datasets")

list(cat)


Found 6 historical EC-Earth3 datasets


[
         Oceanum CMIP6 SWAN EC-Earth3 New Zealand historical gridded wave stats [oceanum_cmip6_swan_nz_ec_earth3_historical_r1i1p1f1_gridstats]
             Extent: (165.0, -48.0, 180.0, -34.0)
             Timerange: 1985-01-01 00:00:00+00:00 to 2015-01-01 00:00:00+00:00
         ,
 
         Oceanum CMIP6 WW3 EC-Earth3 global historical wave spectra [oceanum_cmip6_ww3_global_ec_earth3_historical_r1i1p1f1_spectra]
             Extent: (0.0, -60.0, 359.0, 69.0)
             Timerange: 1985-01-01 00:00:00+00:00 to 2015-01-01 00:00:00+00:00
         ,
 
         Oceanum CMIP6 SWAN EC-Earth3 New Zealand historical wave parameters [oceanum_cmip6_swan_nz_ec_earth3_historical_r1i1p1f1_grid]
             Extent: (165.0, -48.0, 180.0, -34.0)
             Timerange: 1985-01-01 00:00:00+00:00 to 2015-01-01 00:00:00+00:00
         ,
 
         Oceanum CMIP6 WW3 EC-Earth3 global historical gridded wave stats [oceanum_cmip6_ww3_global_ec_earth3_historical_r1i1p1f1_gridstats]
             Extent: 
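Keyword search is convenient, but once a catalog has been fetched you can also narrow it down locally. The sketch below filters entries by fragments of their ID. `Entry` is a stand-in class so the example runs without a token, and `filter_by_id` is a hypothetical helper; it relies only on the `id` attribute, which real `Datasource` objects also expose.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    """Stand-in for a catalog entry; only the fields used here."""
    id: str
    name: str


def filter_by_id(entries, *fragments):
    """Keep entries whose ID contains every fragment (case-insensitive)."""
    wanted = [f.lower() for f in fragments]
    return [e for e in entries if all(f in e.id.lower() for f in wanted)]


entries = [
    Entry("oceanum_cmip6_ww3_global_ec_earth3_historical_r1i1p1f1_spectra",
          "Oceanum CMIP6 WW3 EC-Earth3 global historical wave spectra"),
    Entry("oceanum_cmip6_swan_nz_ec_earth3_historical_r1i1p1f1_grid",
          "Oceanum CMIP6 SWAN EC-Earth3 New Zealand historical wave parameters"),
    Entry("oceanum_cmip6_swan_nz_access_cm2_ssp370_r1i1p1f1_gridstats",
          "Oceanum CMIP6 SWAN ACCESS-CM2 New Zealand projections ssp370 gridded wave stats"),
]

matches = filter_by_id(entries, "ec_earth3", "historical")
print([e.id for e in matches])
```

In the notebook you would pass `list(cat)` instead of the stand-in `entries`, materialising the catalog once and reusing the list.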

### 2.3 Tag-Based Search

For more precise searches, you can use tags with the format `tags:tag1&tag2&tag3`. This is particularly useful when you know exactly what type of data you need.
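Building the tag string by hand is error-prone when the tag list changes. A small sketch of a builder follows; `tag_query` is a hypothetical helper, and the only format it assumes is the `tags:tag1&tag2&tag3` form given above.

```python
def tag_query(*tags):
    """Join tags into the `tags:tag1&tag2` search string used in this notebook."""
    cleaned = [t.strip() for t in tags if t and t.strip()]
    if not cleaned:
        raise ValueError("at least one tag is required")
    return "tags:" + "&".join(cleaned)


print(tag_query("occ", "cmip6", "ww3", "global", "spectra"))
# tags:occ&cmip6&ww3&global&spectra
```

The result can be passed straight to `conn.get_catalog(...)`.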

**Example 1**: Search for all global wave spectra datasets


In [5]:
cat = conn.get_catalog("tags:occ&cmip6&ww3&global&spectra")

print(f"Found {len(list(cat))} global spectra datasets")

list(cat)


Found 6 global spectra datasets


[
         Oceanum CMIP6 WW3 ACCESS-CM2 global historical wave spectra [oceanum_cmip6_ww3_global_access_cm2_historical_r1i1p1f1_spectra]
             Extent: (0.0, -60.0, 359.0, 69.0)
             Timerange: 1985-01-01 00:00:00+00:00 to 2015-01-01 00:00:00+00:00
         ,
 
         Oceanum CMIP6 WW3 ACCESS-CM2 global projections ssp245 wave spectra [oceanum_cmip6_ww3_global_access_cm2_ssp245_r1i1p1f1_spectra]
             Extent: (0.0, -60.0, 359.0, 69.0)
             Timerange: 2015-01-01 00:00:00+00:00 to 2100-01-01 00:00:00+00:00
         ,
 
         Oceanum CMIP6 WW3 ACCESS-CM2 global projections ssp370 wave spectra [oceanum_cmip6_ww3_global_access_cm2_ssp370_r1i1p1f1_spectra]
             Extent: (0.0, -60.0, 359.0, 69.0)
             Timerange: 2015-01-01 00:00:00+00:00 to 2100-01-01 00:00:00+00:00
         ,
 
         Oceanum CMIP6 WW3 EC-Earth3 global historical wave spectra [oceanum_cmip6_ww3_global_ec_earth3_historical_r1i1p1f1_spectra]
             Extent: (0.0, -60.0, 3

**Example 2**: Search for SWAN New Zealand gridded parameter datasets


In [6]:
cat = conn.get_catalog("tags:occ&cmip6&swan&nz&parameters")

print(f"Found {len(list(cat))} SWAN NZ parameter datasets")

list(cat)

Found 6 SWAN NZ parameter datasets


[
         Oceanum CMIP6 SWAN ACCESS-CM2 New Zealand historical wave parameters [oceanum_cmip6_swan_nz_access_cm2_historical_r1i1p1f1_grid]
             Extent: (165.0, -48.0, 180.0, -34.0)
             Timerange: 1985-01-01 00:00:00+00:00 to 2015-01-01 00:00:00+00:00
         ,
 
         Oceanum CMIP6 SWAN ACCESS-CM2 New Zealand projections ssp245 wave parameters [oceanum_cmip6_swan_nz_access_cm2_ssp245_r1i1p1f1_grid]
             Extent: (165.0, -48.0, 180.0, -34.0)
             Timerange: 2015-01-01 00:00:00+00:00 to 2100-01-01 00:00:00+00:00
         ,
 
         Oceanum CMIP6 SWAN ACCESS-CM2 New Zealand projections ssp370 wave parameters [oceanum_cmip6_swan_nz_access_cm2_ssp370_r1i1p1f1_grid]
             Extent: (165.0, -48.0, 180.0, -34.0)
             Timerange: 2015-01-01 00:00:00+00:00 to 2100-01-01 00:00:00+00:00
         ,
 
         Oceanum CMIP6 SWAN EC-Earth3 New Zealand historical wave parameters [oceanum_cmip6_swan_nz_ec_earth3_historical_r1i1p1f1_grid]
             E

**Example 3**: Search for all gridded wave statistics datasets


In [7]:
cat = conn.get_catalog("tags:occ&cmip6&wave&gridstats")

print(f"Found {len(list(cat))} gridded wave statistics datasets")

list(cat)

Found 12 gridded wave statistics datasets


[
         Oceanum CMIP6 SWAN EC-Earth3 New Zealand projections ssp370 gridded wave stats [oceanum_cmip6_swan_nz_ec_earth3_ssp370_r1i1p1f1_gridstats]
             Extent: (165.0, -48.0, 180.0, -34.0)
             Timerange: 2015-01-01 00:00:00+00:00 to 2100-01-01 00:00:00+00:00
         ,
 
         Oceanum CMIP6 SWAN ACCESS-CM2 New Zealand historical gridded wave stats [oceanum_cmip6_swan_nz_access_cm2_historical_r1i1p1f1_gridstats]
             Extent: (165.0, -48.0, 180.0, -34.0)
             Timerange: 1985-01-01 00:00:00+00:00 to 2015-01-01 00:00:00+00:00
         ,
 
         Oceanum CMIP6 SWAN ACCESS-CM2 New Zealand projections ssp245 gridded wave stats [oceanum_cmip6_swan_nz_access_cm2_ssp245_r1i1p1f1_gridstats]
             Extent: (165.0, -48.0, 180.0, -34.0)
             Timerange: 2015-01-01 00:00:00+00:00 to 2100-01-01 00:00:00+00:00
         ,
 
         Oceanum CMIP6 SWAN ACCESS-CM2 New Zealand projections ssp370 gridded wave stats [oceanum_cmip6_swan_nz_access_cm2_ssp37

## 3. Understanding Dataset Types

From the search results, you can see three main types of wave datasets:

1. **Parameters** (`_grid`): Integrated wave parameters such as significant wave height, peak period, and mean direction
2. **Spectra** (`_spectra`): Full 2D wave energy spectra (frequency × direction)
3. **Gridstats** (`_gridstats`): Statistical summaries and derived metrics

### Key Differences:
- **Parameters**: Best for general wave climate analysis, coastal engineering applications
- **Spectra**: Required for detailed wave energy analysis, wave transformation studies
- **Gridstats**: Useful for climate change impact assessments, statistical analysis
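Since the dataset type is encoded in the last token of each ID, it can be recovered with a simple lookup. This is a sketch based on the three suffixes listed above; `dataset_type` is a hypothetical helper and any other suffix is reported as unknown.

```python
SUFFIX_TO_TYPE = {
    "grid": "Parameters",
    "spectra": "Spectra",
    "gridstats": "Gridstats",
}


def dataset_type(datasource_id):
    """Classify a datasource ID by its final underscore-separated token."""
    suffix = datasource_id.rsplit("_", 1)[-1]
    return SUFFIX_TO_TYPE.get(suffix, "Unknown")


print(dataset_type("oceanum_cmip6_swan_nz_access_cm2_historical_r1i1p1f1_grid"))
# Parameters
```

Splitting on the last underscore, rather than testing with `endswith("grid")`, avoids any ambiguity between the `_grid` and `_gridstats` suffixes.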


## 4. Inspecting Dataset Metadata

Each dataset in the catalog is represented by a [Datasource](https://oceanum-python.readthedocs.io/en/latest/classes/datamesh/oceanum.datamesh.Datasource.html#oceanum.datamesh.Datasource) object containing comprehensive metadata.


### 4.1 Getting a Specific Dataset

You can retrieve a specific dataset either from search results:


In [8]:
# Get the first dataset from our search results

ds = list(cat)[0]

print("Dataset from search results:")

print(ds)

Dataset from search results:

        Oceanum CMIP6 SWAN EC-Earth3 New Zealand projections ssp370 gridded wave stats [oceanum_cmip6_swan_nz_ec_earth3_ssp370_r1i1p1f1_gridstats]
            Extent: (165.0, -48.0, 180.0, -34.0)
            Timerange: 2015-01-01 00:00:00+00:00 to 2100-01-01 00:00:00+00:00
        


Or get a dataset directly by its ID. In this example, we'll examine a **SWAN gridded parameters** dataset in detail.


In [10]:
ds = conn.get_datasource("oceanum_cmip6_swan_nz_access_cm2_historical_r1i1p1f1_grid")

print("Dataset retrieved by ID:")

print(ds)

Dataset retrieved by ID:

        Oceanum CMIP6 SWAN ACCESS-CM2 New Zealand historical wave parameters [oceanum_cmip6_swan_nz_access_cm2_historical_r1i1p1f1_grid]
            Extent: (165.0, -48.0, 180.0, -34.0)
            Timerange: 1985-01-01 00:00:00+00:00 to 2015-01-01 00:00:00+00:00
            8 attributes
            28 variables
        


### 4.2 Essential Dataset Attributes

Let's explore the key metadata attributes that help you understand and work with the dataset:


**Dataset ID** (unique identifier):


In [11]:
print("Dataset ID:", ds.id)

Dataset ID: oceanum_cmip6_swan_nz_access_cm2_historical_r1i1p1f1_grid


**Human-readable name**:


In [12]:
print("Dataset Name:", ds.name)

Dataset Name: Oceanum CMIP6 SWAN ACCESS-CM2 New Zealand historical wave parameters


**Detailed description**:


In [13]:
print("Description:")

print(textwrap.fill(ds.description, width=120))

Description:
New Zealand gridded wave parameters from Oceanum SWAN driven by NIWA CCAM downscaled ACCESS-CM2 winds and WW3 global
boundary spectra under CMIP6 historical simulations (r1i1p1f1 ensemble member). This dataset provides comprehensive wave
parameters for the full and partitioned spectra on a 5km New Zealand grid for climate research and oceanographic
applications as part of the Our Changing Coast project.


**Available variables** (wave parameters):


In [14]:
print("Available Variables:")

variables = list(ds.variables.keys())

print(f"Total: {len(variables)} variables")

print("Variables:", variables)

Available Variables:
Total: 28 variables
Variables: ['hs', 'dpm', 'tps', 'botl', 'dspr', 'fspr', 'hsea', 'hswe', 'phs0', 'phs1', 'ptp0', 'ptp1', 'tm01', 'tm02', 'xcur', 'xwnd', 'ycur', 'ywnd', 'pdir0', 'pdir1', 'dpmsea', 'dpmswe', 'pdspr0', 'pdspr1', 'pwlen0', 'pwlen1', 'tpssea', 'tpsswe']


### 4.3 Temporal Coverage


In [15]:
print("Time Coverage:")

print(f"Start: {ds.tstart}")

print(f"End: {ds.tend}")

print(f"Duration: {ds.tend.year - ds.tstart.year} years")

Time Coverage:
Start: 1985-01-01 00:00:00+00:00
End: 2015-01-01 00:00:00+00:00
Duration: 30 years


### 4.4 Spatial Coverage


In [16]:
print("Spatial Extent (West, South, East, North):")

bounds = ds.geom.bounds

print(f"Longitude: {bounds[0]}° to {bounds[2]}°")

print(f"Latitude: {bounds[1]}° to {bounds[3]}°")

Spatial Extent (West, South, East, North):
Longitude: 165.0° to 180.0°
Latitude: -48.0° to -34.0°
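The extents printed in this notebook use longitudes in the 0 to 360 range, so a site given as a negative (western) longitude needs wrapping before it is compared with the bounds. The sketch below shows one way to do that. Both functions are hypothetical helpers, and the check assumes the bounds themselves do not wrap past 360.

```python
def to_0_360(lon):
    """Wrap a longitude in degrees into the [0, 360) range."""
    return lon % 360.0


def in_bounds(lon, lat, bounds):
    """Check whether a point lies inside (west, south, east, north) bounds.

    Assumes the bounds use 0-360 longitudes and do not wrap past 360.
    """
    west, south, east, north = bounds
    return west <= to_0_360(lon) <= east and south <= lat <= north


nz_bounds = (165.0, -48.0, 180.0, -34.0)  # from the output above

print(in_bounds(174.8, -41.3, nz_bounds))   # near Wellington: True
print(in_bounds(-176.5, -44.0, nz_bounds))  # near the Chatham Islands (183.5 E): False
```

In the notebook, `ds.geom.bounds` can be passed in place of `nz_bounds`.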


### 4.5 Coordinate System


In [17]:
print("Coordinate Mapping:")

for key, value in ds.coordinates.items():
    print(f"  {key} → {value}")

Coordinate Mapping:
  t → time
  x → longitude
  y → latitude


## 5. Detailed Data Schema Inspection

The data schema provides complete information about the dataset structure, including dimensions, coordinates, and variable attributes:


In [18]:
schema = ds.dataschema.model_dump()

print("Dataset Dimensions:")

for dim, size in schema['dims'].items():
    print(f"  {dim}: {size}")

print(f"\nGrid points per variable: {schema['dims']['time'] * schema['dims']['latitude'] * schema['dims']['longitude']:,}")


Dataset Dimensions:
  time: 87657
  latitude: 281
  longitude: 301

Grid points per variable: 7,414,116,717
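These dimensions can be cross-checked against the metadata inspected earlier. The sketch below recomputes the number of three-hourly time steps between the start and end dates, derives the grid spacing implied by the spatial bounds, and estimates the uncompressed size of a single variable. The 4-byte value size is an assumption; the actual storage type is not shown in this notebook.

```python
from datetime import datetime, timedelta, timezone

tstart = datetime(1985, 1, 1, tzinfo=timezone.utc)
tend = datetime(2015, 1, 1, tzinfo=timezone.utc)

# Three-hourly steps, counting both endpoints
n_steps = (tend - tstart) // timedelta(hours=3) + 1

# Grid spacing implied by the bounds and the number of grid nodes
dlon = (180.0 - 165.0) / (301 - 1)
dlat = (-34.0 - -48.0) / (281 - 1)

# Rough uncompressed size of one variable, assuming 4-byte floats (an assumption)
gib_per_variable = 87657 * 281 * 301 * 4 / 1024**3

print(n_steps)                          # 87657, matching the time dimension
print(round(dlon, 3), round(dlat, 3))   # 0.05 0.05 (degrees)
print(f"{gib_per_variable:.1f} GiB per variable")
```

The 0.05 degree spacing is consistent with the roughly 5 km grid mentioned in the dataset description, and the size estimate shows why subsetting in space or time is advisable before requesting data.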


### 5.1 Variable Details

Let's examine a few key wave variables in detail:


In [19]:
# Key wave parameters to highlight
key_variables = ['hs', 'dpm', 'tps', 'tm01', 'tm02']

print("Key Wave Variables:")
print("-" * 80)

for var in key_variables:
    if var in schema['data_vars']:
        var_info = schema['data_vars'][var]
        print(f"\n{var.upper()}:")
        print(f"  Name: {var_info['attrs']['long_name']}")
        print(f"  Units: {var_info['attrs']['units']}")
        print(f"  Range: {var_info['attrs']['valid_min']} - {var_info['attrs']['valid_max']}")
        if 'standard_name' in var_info['attrs']:
            print(f"  CF Standard Name: {var_info['attrs']['standard_name']}")


Key Wave Variables:
--------------------------------------------------------------------------------

HS:
  Name: significant height of wind and swell waves
  Units: m
  Range: 0.0 - 50.0
  CF Standard Name: sea_surface_wave_significant_height

DPM:
  Name: mean direction at the spectral peak of wind and swell waves
  Units: degree
  Range: 0.0 - 360.0
  CF Standard Name: sea_surface_wave_from_direction_at_variance_spectral_density_maximum

TPS:
  Name: smooth relative peak wave period of wind and swell waves
  Units: s
  Range: 0.0 - 50.0
  CF Standard Name: sea_surface_wave_period_at_variance_spectral_density_maximum

TM01:
  Name: mean absolute wave period of wind and swell waves from the first frequency moment
  Units: s
  Range: 0.0 - 50.0
  CF Standard Name: sea_surface_wave_mean_period_from_variance_spectral_density_first_frequency_moment

TM02:
  Name: mean absolute wave period of wind and swell waves from the second frequency moment
  Units: s
  Range: 0.0 - 50.0
  CF Standard
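The loop above indexes `long_name`, `units`, `valid_min` and `valid_max` directly, which raises a `KeyError` if a variable lacks one of them. The sketch below is a more defensive variant. `describe_variable` is a hypothetical helper, and the `sample` dictionary is illustrative: the `hs` entry mirrors the output above, while the sparse `botl` entry is made up to exercise the fallback.

```python
def describe_variable(name, var_info):
    """Format a variable summary, tolerating missing attributes."""
    attrs = var_info.get("attrs", {})
    lines = [
        f"{name.upper()}:",
        f"  Name: {attrs.get('long_name', 'n/a')}",
        f"  Units: {attrs.get('units', 'n/a')}",
    ]
    if "valid_min" in attrs and "valid_max" in attrs:
        lines.append(f"  Range: {attrs['valid_min']} - {attrs['valid_max']}")
    if "standard_name" in attrs:
        lines.append(f"  CF Standard Name: {attrs['standard_name']}")
    return "\n".join(lines)


# Illustrative entries: `hs` mirrors the output above, `botl` is a made-up sparse case
sample = {
    "hs": {"attrs": {"long_name": "significant height of wind and swell waves",
                     "units": "m", "valid_min": 0.0, "valid_max": 50.0}},
    "botl": {"attrs": {"units": "m"}},
}
for name, info in sample.items():
    print(describe_variable(name, info))
```

In the notebook, `schema['data_vars']` would take the place of `sample`.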

### 5.2 Understanding Wave Variable Categories

The dataset contains several categories of wave variables:


In [20]:
# Categorize variables for better understanding
wave_categories = {
    'Total Wave Parameters': ['hs', 'dpm', 'tps', 'dspr', 'fspr', 'tm01', 'tm02'],
    'Wind Wave Parameters': ['hsea', 'dpmsea', 'tpssea', 'phs0', 'ptp0', 'pdir0', 'pdspr0', 'pwlen0'],
    'Swell Wave Parameters': ['hswe', 'dpmswe', 'tpsswe', 'phs1', 'ptp1', 'pdir1', 'pdspr1', 'pwlen1'],
    'Environmental Forcing': ['xwnd', 'ywnd', 'xcur', 'ycur', 'botl']
}

print("Wave Variable Categories:")
print("=" * 50)

for category, vars_list in wave_categories.items():
    print(f"\n{category}:")
    available_vars = [v for v in vars_list if v in ds.variables.keys()]
    for var in available_vars:
        if var in schema['data_vars']:
            long_name = schema['data_vars'][var]['attrs']['long_name']
            units = schema['data_vars'][var]['attrs']['units']
            print(f"  • {var}: {long_name} ({units})")


Wave Variable Categories:

Total Wave Parameters:
  • hs: significant height of wind and swell waves (m)
  • dpm: mean direction at the spectral peak of wind and swell waves (degree)
  • tps: smooth relative peak wave period of wind and swell waves (s)
  • dspr: directional spreading of wind and swell waves (degree)
  • fspr: normalized width of the frequency spectrum of wind and swell waves (1)
  • tm01: mean absolute wave period of wind and swell waves from the first frequency moment (s)
  • tm02: mean absolute wave period of wind and swell waves from the second frequency moment (s)

Wind Wave Parameters:
  • hsea: significant height of wind waves under 8 seconds period (m)
  • dpmsea: mean direction at the spectral peak of wind waves below 8 seconds period (degree)
  • tpssea: smooth relative peak wave period of wind waves below 8 seconds period (s)
  • phs0: sea surface wind wave significant height (m)
  • ptp0: sea surface wind wave period at variance spectral density maximum (s)
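Hand-written category lists can drift out of step with the dataset, so it is worth checking that every variable has been assigned somewhere. A minimal sketch of such a check follows; `uncategorised` is a hypothetical helper, and the inputs are a small illustrative subset rather than the full 28-variable list.

```python
def uncategorised(variables, categories):
    """Return variables that do not appear in any category list, sorted."""
    assigned = {v for names in categories.values() for v in names}
    return sorted(set(variables) - assigned)


# Small illustrative subset, not the full 28-variable list
variables = ["hs", "tps", "hsea", "hswe", "xwnd", "ywnd"]
categories = {
    "Total": ["hs", "tps"],
    "Wind sea": ["hsea"],
    "Swell": ["hswe"],
}

print(uncategorised(variables, categories))
# ['xwnd', 'ywnd']
```

In the notebook, calling `uncategorised(ds.variables.keys(), wave_categories)` would list any variable that the category dictionary has missed.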


## 6. Summary and Next Steps

### What We've Covered

1. **Connection Setup**: How to connect to Datamesh using the Oceanum Python client
2. **Search Strategies**: Different approaches to find relevant wave datasets
3. **Dataset Types**: Understanding parameters, spectra, and gridstats datasets
4. **Metadata Inspection**: Exploring dataset attributes, coverage, and structure
5. **Variable Understanding**: Categorizing and interpreting wave parameters

### Recommended Next Steps

1. **Data Access**: Learn how to download and work with the data using `conn.query()` or `conn.load_datasource()`
2. **Spatial/Temporal Filtering**: Practice subsetting data for specific regions and time periods
3. **Data Analysis**: Explore wave climate patterns, trends, and statistics
4. **Visualization**: Create maps and time series plots of wave parameters

### Key Takeaways for New Users

- Start with **parameter datasets** (`_grid`) for general wave analysis
- Use **tag-based searches** for precise dataset discovery
- Always inspect **metadata** before downloading large datasets
- Consider **spatial and temporal coverage** when selecting datasets
- Understand the difference between **historical** and **projection** scenarios

### Getting Help

- [Oceanum Python Documentation](https://oceanum-python.readthedocs.io/en/latest/)
- [Datamesh API Reference](https://oceanum-python.readthedocs.io/en/latest/classes/datamesh/)
- Contact Oceanum support for specific questions about datasets
