Ergonomic way to apply multiple region extracting templates and merge the results #32

@brews

Projection workflows often have multiple isku.ExtractionTemplates. The templates are applied one by one with isku.extract_regions(...). It's common to apply multiple extractions and merge the results together, like

templates = [make_climtas, make_tas_20yrmean_annual_histogram]

transformed = xr.merge(
    [
        isku.extract_regions(ds_in, template=t, regions=basic_segment_weights)
        for t in templates
    ]
)

We make users do the merge because the transformations around region extraction are often among the most expensive steps of a projection workflow. The strategy used for merging can have dramatic, seemingly magical, and sometimes unexpected impacts on the output dataset. Also, the default behavior of xr.merge changes between versions of xarray. So, we make users merge the transformed data themselves to be explicit about what's happening.

All that said, there is a lot of repetition. Is there a more ergonomic way to handle this while still being honest about what's happening? Are users really feeling the friction on this?

In earlier prototypes we used something like

isku.apply_extractions(
    ds_in,
    templates=[make_climtas, make_tas_20yrmean_annual_histogram],
    regions=basic_segment_weights,
)

There was also an optional merge_fn: Callable[[Sequence[xr.Dataset]], xr.Dataset] | None = None argument. When merge_fn=None, it fell back to the default xr.merge from whatever version of xarray was installed.
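
To make the discussion concrete, here is a rough sketch of what such a helper could look like. It is not the prototype's actual code. The extract_fn parameter is a hypothetical stand-in for isku.extract_regions, passed in explicitly so the sketch does not assume anything about isku beyond the call shape shown above, and the empty-templates check is my own assumption.

```python
from collections.abc import Callable, Sequence
from typing import Any, Optional


def apply_extractions(
    ds_in: Any,
    templates: Sequence[Any],
    regions: Any,
    *,
    extract_fn: Callable[..., Any],
    merge_fn: Optional[Callable[[Sequence[Any]], Any]] = None,
) -> Any:
    """Apply each template with extract_fn, then merge the results.

    extract_fn is a hypothetical stand-in for isku.extract_regions and is
    called as extract_fn(ds_in, template=t, regions=regions).
    """
    if not templates:
        # Assumption: an empty template list is treated as a caller error.
        raise ValueError("templates must contain at least one template")

    if merge_fn is None:
        # Fallback described in the issue: the default xr.merge of whatever
        # xarray version is installed. Imported lazily so callers that pass
        # their own merge_fn do not need xarray here.
        import xarray as xr

        merge_fn = xr.merge

    # Extraction is the expensive step; run it once per template.
    extracted = [
        extract_fn(ds_in, template=t, regions=regions) for t in templates
    ]
    return merge_fn(extracted)
```

One way to keep this honest would be to encourage callers to pin the merge strategy, e.g. merge_fn=lambda parts: xr.merge(parts, join="exact", compat="no_conflicts"), so the behavior does not drift with xarray's defaults. Whether merge_fn should be required rather than optional is part of the open question.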

Labels: enhancement (New feature or request)
