# MDIO Template Usage

```{article-info}
:author: Altay Sansal
:date: "{sub-ref}`today`"
:read-time: "{sub-ref}`wordcount-minutes` min read"
:class-container: sd-p-0 sd-outline-muted sd-rounded-3 sd-font-weight-light
```

```{warning}
Most SEG-Y files correspond to standard seismic data types or field configurations. We recommend using
the built-in templates from the registry whenever possible. Create a custom template only when your file
is unusual and cannot be represented by existing templates. In many cases, you can simply customize the
SEG-Y header byte mapping during ingestion without defining a new template.
```

In this tutorial we will walk through the Template Registry and show how to:

- Discover available templates in the registry
- Define and register your own template
- Build a dataset model and convert it to an Xarray Dataset using your custom template

If this is your first time with MDIO, you may want to skim the Quickstart first.

## What is a Template and a Template Registry?

A template defines how an MDIO dataset is structured: names of dimensions and coordinates, the default variable name, chunking hints, and attributes to be stored. Since many seismic datasets share common structures (e.g., 3D post-stack, 2D post-stack, pre-stack CDP/shot), MDIO ships with a pre-populated template registry and APIs to fetch or register templates.

Fetching a template from the registry returns a copy that you can freely customize without affecting the registered template or other users of it.
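
The copy-on-fetch behavior can be illustrated with a small standard-library sketch. `ToyTemplate` and `ToyRegistry` are hypothetical stand-ins written for this illustration; they are not MDIO classes and make no claim about how MDIO implements its registry.

```python
import copy


class ToyTemplate:
    """Stand-in for a dataset template (hypothetical, not an MDIO class)."""

    def __init__(self, name: str, chunk_shape: tuple[int, ...]) -> None:
        self.name = name
        self.chunk_shape = chunk_shape


class ToyRegistry:
    """Hypothetical registry that hands out a deep copy on every fetch."""

    def __init__(self) -> None:
        self._templates: dict[str, ToyTemplate] = {}

    def register(self, template: ToyTemplate) -> None:
        self._templates[template.name] = template

    def get(self, name: str) -> ToyTemplate:
        # Return a deep copy so callers cannot mutate the stored template.
        return copy.deepcopy(self._templates[name])


toy_registry = ToyRegistry()
toy_registry.register(ToyTemplate("Toy3D", chunk_shape=(128, 128, 128)))

first = toy_registry.get("Toy3D")
first.chunk_shape = (64, 64, 64)  # customize only this copy

second = toy_registry.get("Toy3D")
print(second.chunk_shape)  # (128, 128, 128): the stored template is unchanged
```

Because each fetch is an independent object, per-dataset tweaks such as a different chunk shape never leak into later fetches.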

In [None]:
from mdio.builder.template_registry import get_template
from mdio.builder.template_registry import get_template_registry
from mdio.builder.template_registry import list_templates

registry = get_template_registry()
registry  # pretty HTML in notebooks

We can also list the names of all registered templates.

In [None]:
list_templates()

## Defining a Minimal Custom Template

To define a custom template, subclass `AbstractDatasetTemplate` and set:

- `_name`: a public name for the template
- `_dim_names`: names for each axis of your data variable (the last axis is the trace/time or trace/depth axis)
- `_physical_coord_names` and `_logical_coord_names`: optional additional coordinate variables to store along the spatial grid
- `_load_dataset_attributes()`: optional attributes stored at the dataset level

Below we create a special template that can hold an interval velocity field with multiple anisotropy parameters for a depth seismic volume.

The dimensions, dimension coordinates, and non-dimension coordinates are created automatically by the base class.
However, since we want more than one data variable, we override `_add_variables` to add them.

In [None]:
from mdio.builder.schemas import compressors
from mdio.builder.schemas.chunk_grid import RegularChunkGrid
from mdio.builder.schemas.chunk_grid import RegularChunkShape
from mdio.builder.schemas.dtype import ScalarType
from mdio.builder.schemas.v1.variable import VariableMetadata
from mdio.builder.templates.base import AbstractDatasetTemplate


class AnisotropicVelocityTemplate(AbstractDatasetTemplate):
    """A custom template that has unusual dimensions and coordinates."""

    def __init__(self, data_domain: str = "depth") -> None:
        super().__init__(data_domain)
        # Dimension order matters; the last dimension is the depth
        self._dim_names = ("inline", "crossline", self.trace_domain)
        # Additional coordinates: these are added on top of dimension coordinates
        self._physical_coord_names = ("cdp_x", "cdp_y")
        self._var_chunk_shape = (128, 128, 128)
        self._units = {}

    @property
    def _name(self) -> str:  # public name for the registry
        return "AnisotropicVelocity3DDepth"

    @property
    def _default_variable_name(self) -> str:  # name of the default data variable
        return "velocity"

    def _load_dataset_attributes(self) -> dict:
        return {"surveyType": "3D", "gatherType": "line"}

    def _add_variables(self) -> None:
        """Add the variables including default and extra."""
        for name in ["velocity", "epsilon", "delta"]:
            chunk_grid = RegularChunkGrid(configuration=RegularChunkShape(chunk_shape=self.full_chunk_shape))
            unit = self.get_unit_by_key(name)
            self._builder.add_variable(
                name=name,
                dimensions=self._dim_names,
                data_type=ScalarType.FLOAT32,
                compressor=compressors.Blosc(cname=compressors.BloscCname.zstd),
                coordinates=self.physical_coordinate_names,
                metadata=VariableMetadata(chunk_grid=chunk_grid, units_v1=unit),
            )


AnisotropicVelocityTemplate()

## Registering the Custom Template

The registry returns a deep copy of the template on every fetch. To make the template discoverable by name, register it first, then retrieve it with `get_template`.

In [None]:
from mdio.builder.template_registry import register_template

register_template(AnisotropicVelocityTemplate())
print("Registered:", "AnisotropicVelocity3DDepth" in list_templates())

custom_template = get_template("AnisotropicVelocity3DDepth")
custom_template

You can also set units at any time. For this demo we’ll set metric units. The spatial units are normally inferred from the SEG-Y binary header during ingestion, but we can override them here, and ingestion will honor whatever is set in the template.

In [None]:
from mdio.builder.schemas.v1.units import LengthUnitModel
from mdio.builder.schemas.v1.units import SpeedUnitModel

custom_template.add_units(
    {
        "depth": LengthUnitModel(length="m"),
        "cdp_x": LengthUnitModel(length="m"),
        "cdp_y": LengthUnitModel(length="m"),
        "velocity": SpeedUnitModel(speed="m/s"),
    }
)
custom_template
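
The precedence rule (units set on the template win over units inferred from the file) can be sketched with plain dictionaries. This is only an illustration: `resolve_units` is a hypothetical helper, plain strings stand in for the unit models, and applying the inferred unit to every listed coordinate is an assumption rather than MDIO's documented behavior. The code values 1 (meters) and 2 (feet) are the SEG-Y binary header measurement system codes.

```python
# SEG-Y binary header "measurement system" code: 1 = meters, 2 = feet.
MEASUREMENT_SYSTEM_TO_UNIT = {1: "m", 2: "ft"}


def resolve_units(measurement_system: int, coord_names, template_units: dict) -> dict:
    """Hypothetical helper: units defined on the template override inferred ones."""
    inferred_unit = MEASUREMENT_SYSTEM_TO_UNIT.get(measurement_system)
    inferred = {name: inferred_unit for name in coord_names if inferred_unit is not None}
    # Later keys win in a dict merge, so the template takes precedence.
    return {**inferred, **template_units}


resolved = resolve_units(
    measurement_system=2,  # the file header says feet
    coord_names=("depth", "cdp_x", "cdp_y"),
    template_units={"depth": "m", "velocity": "m/s"},
)
print(resolved)
# {'depth': 'm', 'cdp_x': 'ft', 'cdp_y': 'ft', 'velocity': 'm/s'}
```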

## Changing the Chunk Size on an Existing Template

Often you will want to tweak the chunking strategy for performance. You can do this in two ways:

- When defining a subclass, set a default in the constructor (e.g., `self._var_chunk_shape = (...)`).
- On an existing template instance, assign to the `full_chunk_shape` property once you know your final
  dataset sizes (the tuple length must match the number of data dimensions).

Below is a tiny demo showing how to modify the chunk shape on a fetched template. We first build the
template with known sizes to satisfy validation, then update `full_chunk_shape`.

```{note}
In the SEG-Y to MDIO conversion workflow, MDIO infers the final grid shape from the SEG-Y headers. It’s
common to set or adjust `full_chunk_shape` right before calling `segy_to_mdio`, using the same sizes
you expect for the final array.
```

In [None]:
mdio_ds = custom_template.build_dataset(name="demo-only", sizes=(300, 500, 1001))
# pick smaller chunks than the full array for better parallelism and IO
custom_template.full_chunk_shape = (64, 64, 64)
print("Chunk shape set to:", custom_template.full_chunk_shape)

custom_template
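
To get a feel for what a chunk shape implies, the following standard-library sketch counts the chunks and the uncompressed size of one chunk for the demo sizes used above. `chunk_stats` is a hypothetical helper written for this illustration, not part of MDIO, and the 4-byte item size assumes `float32` samples as in the template.

```python
import math

sizes = (300, 500, 1001)  # same demo sizes as in the cell above
itemsize = 4  # bytes per float32 sample


def chunk_stats(sizes, chunk_shape, itemsize):
    """Return (chunks per dimension, total chunks, uncompressed bytes per chunk)."""
    if len(sizes) != len(chunk_shape):
        raise ValueError("chunk shape must have one entry per data dimension")
    # Partial chunks at the array edges still count as chunks, hence the ceiling.
    per_dim = tuple(math.ceil(n / c) for n, c in zip(sizes, chunk_shape))
    return per_dim, math.prod(per_dim), math.prod(chunk_shape) * itemsize


for shape in [(128, 128, 128), (64, 64, 64)]:
    per_dim, total, nbytes = chunk_stats(sizes, shape, itemsize)
    print(shape, per_dim, total, f"{nbytes / 2**20:.0f} MiB")
# (128, 128, 128) (3, 4, 8) 96 8 MiB
# (64, 64, 64) (5, 8, 16) 640 1 MiB
```

Halving each chunk edge gives roughly eight times as many chunks per variable, each one eighth the size, which is the trade-off between parallelism and per-chunk overhead.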

## Making a Dummy Xarray Dataset

We can now take the MDIO Dataset model and convert it to an Xarray Dataset with our configuration. When ingesting from SEG-Y,
the converter runs this step automatically before populating the data.

Note that the whole dataset will be populated with the fill values.

In [None]:
from mdio.builder.xarray_builder import to_xarray_dataset

to_xarray_dataset(mdio_ds)

## Recap: Key APIs Used

- Template registry helpers: `get_template_registry`, `list_templates`, `register_template`, `get_template`
- Base template to subclass: `AbstractDatasetTemplate`
- Convert an MDIO Dataset model to an Xarray Dataset: `to_xarray_dataset`

With these pieces, you can standardize how your seismic data is represented in MDIO and keep ingestion code concise and repeatable.
