Projection workflows often have multiple isku.ExtractionTemplates. The templates are applied one by one with isku.extract_regions(...). It's common to apply several extractions and merge the results together, like
templates = [make_climtas, make_tas_20yrmean_annual_histogram]
transformed = xr.merge(
    [
        isku.extract_regions(ds_in, template=t, regions=basic_segment_weights)
        for t in templates
    ]
)
We make users do the merge because the transformations around region extraction are often one of the most expensive steps of a projection workflow. The merge strategy can have dramatic, seemingly magical, and sometimes unexpected effects on the output dataset. Also, the default behavior of xr.merge changes between versions of xarray. So we make users merge the transformed data themselves, to be explicit about what's happening.
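To make the point about merge strategy concrete, here is a minimal sketch using only xarray and numpy. The two datasets are made-up stand-ins for extraction outputs (their variable names and values are assumptions, not real isku output). It shows how the choice of join changes the merged result, and how passing join and compat explicitly avoids depending on whatever defaults the installed xarray uses.

```python
import numpy as np
import xarray as xr

# Two stand-in extraction outputs (illustrative only). They share a "region"
# dimension but cover only partially overlapping regions.
climtas = xr.Dataset(
    {"climtas": ("region", np.array([280.0, 285.0, 290.0]))},
    coords={"region": ["a", "b", "c"]},
)
histogram = xr.Dataset(
    {"tas_count": ("region", np.array([10.0, 20.0]))},
    coords={"region": ["b", "c"]},
)

# Spell out the merge strategy rather than relying on version-dependent defaults.
outer = xr.merge([climtas, histogram], join="outer", compat="no_conflicts")
inner = xr.merge([climtas, histogram], join="inner", compat="no_conflicts")

# The outer join keeps all three regions and fills the gap with NaN;
# the inner join silently drops region "a".
print(outer.sizes["region"])  # 3
print(inner.sizes["region"])  # 2
```

Neither result is wrong, but they are different datasets, which is the reason for keeping the merge visible in user code.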
All that said, there is a lot of repetition. Is there a more ergonomic way to handle this while still being honest about what's happening? Are users really feeling the friction here?
In earlier prototypes we used something like
isku.apply_extractions(
    ds_in,
    templates=[make_climtas, make_tas_20yrmean_annual_histogram],
    regions=basic_segment_weights,
)
There was also an optional merge_fn: Callable[[Sequence[xr.Dataset]], xr.Dataset] | None = None parameter. When merge_fn=None, it fell back to the default xr.merge from whatever version of xarray was installed.
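For discussion, here is a rough sketch of what a helper along the lines of that prototype might look like. This is not the original implementation. The extract_fn parameter is a hypothetical addition so the sketch runs without isku installed; in practice it would presumably be isku.extract_regions. The keyword-only arguments are also an assumption.

```python
from __future__ import annotations

from typing import TYPE_CHECKING, Any, Callable, Sequence

if TYPE_CHECKING:
    import xarray as xr


def apply_extractions(
    ds_in: Any,
    templates: Sequence[Any],
    regions: Any,
    *,
    # Hypothetical parameter: stands in for isku.extract_regions.
    extract_fn: Callable[..., xr.Dataset],
    merge_fn: Callable[[Sequence[xr.Dataset]], xr.Dataset] | None = None,
) -> xr.Dataset:
    """Apply each template in order, then merge the results."""
    if merge_fn is None:
        # Fall back to the installed xarray's default merge behavior.
        import xarray as xr

        merge_fn = xr.merge

    extracted = [
        extract_fn(ds_in, template=t, regions=regions) for t in templates
    ]
    return merge_fn(extracted)
```

A caller who wants to stay explicit could pass something like merge_fn=functools.partial(xr.merge, join="exact", compat="no_conflicts"), which removes the repetition while keeping the merge strategy visible at the call site. Whether the None default is worth keeping is part of the open question, since it reintroduces the version-dependent behavior.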