Merged
12 changes: 6 additions & 6 deletions doc/adr/ADR-001_seperate-validation-and-loading.md
@@ -6,19 +6,19 @@ decision-makers: Denby L. Van Ginderachter M.
informed: Francois B., Buurman S.
---

-# Seperation of concerns between dataset-loading, -validation and -alignment
+# Separation of concerns between dataset-loading, -validation and -alignment

## Context

Currently `mxalign` implements dataset-loading, validation (checking whether the loaded dataset has all the correct metadata) and alignment functionality. However, dataset-loading and -validation fall outside the main scope of the `mxalign` package.

<!-- This is an optional element. Feel free to remove. -->
## Decision Drivers

-* Provide clarity on what the package does by seperating the different concerns
+* Provide clarity on what the package does by separating the different concerns
* Provide a clear interface for users who want to bring their own dataset
* Provide flexibility for users
-* Ease of maintainance by decoupling concerns
+* Ease of maintenance by decoupling concerns

## Considered Options

@@ -27,7 +27,7 @@ Currently `mxalign` implements dataset loading functionality, validation functio

## Decision Outcome

Chosen option 1, because it allows for the most flexibility and provides a clear entry point for users who want to bring their own loader.

<!-- This is an optional element. Feel free to remove. -->
### Consequences
@@ -36,4 +36,4 @@ Chosen option 1., because it allows for most flexibility and provides a clear en
* `mxalign` is now only responsible for the alignment tasks

## More Information
Currently the interface between a dataset loaded with an `mlwp-data-loaders` loader and `mxalign` is not defined. Ideally `mxalign` should know the traits of the dataset in order to align it correctly. How do we inform `mxalign` of these traits? See [ADR-002](./ADR-002_mxalign-loader-interface.md) for possible options.
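One conceivable shape for such a traits handoff is a small metadata object passed alongside the dataset. Purely illustrative — none of these names exist in mxalign today, and ADR-002 is where the actual interface will be decided:

```python
from dataclasses import dataclass


# Hypothetical traits container; field names are assumptions, not mxalign API.
@dataclass
class DatasetTraits:
    time_dim: str              # e.g. "lead_time"
    grid_dim: str              # e.g. "grid_index"
    is_ensemble: bool = False  # whether an ensemble/member dimension exists


# A loader could return (dataset, traits) so mxalign never has to guess.
traits = DatasetTraits(time_dim="lead_time", grid_dim="grid_index")
```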
2 changes: 1 addition & 1 deletion doc/adr/ADR-002_mxalign-loader-interface.md
@@ -71,4 +71,4 @@ Chosen option: "{title of option 1}", because {justification. e.g., only option,
<!-- This is an optional element. Feel free to remove. -->
## More Information

{You might want to provide additional evidence/confidence for the decision outcome here and/or document the team agreement on the decision and/or define when/how the decision should be realized and if/when it should be revisited. Links to other decisions and resources might appear here as well.}
2 changes: 1 addition & 1 deletion doc/adr/template.md
@@ -71,4 +71,4 @@ Chosen option: "{title of option 1}", because {justification. e.g., only option,
<!-- This is an optional element. Feel free to remove. -->
## More Information

{You might want to provide additional evidence/confidence for the decision outcome here and/or document the team agreement on the decision and/or define when/how the decision should be realized and if/when it should be revisited. Links to other decisions and resources might appear here as well.}
67 changes: 31 additions & 36 deletions src/mxalign/loaders/anemoi_inference.py
@@ -5,17 +5,14 @@
from ..properties.properties import Space, Time, Uncertainty
from .base import BaseLoader

-DEFAULTS_NETCDF = {
-    "chunks": "auto",
-    "engine": "h5netcdf",
-    "parallel": True
-}
+DEFAULTS_NETCDF = {"chunks": "auto", "engine": "h5netcdf", "parallel": True}

 DEFAULTS_ZARR = {
     "chunks": "auto",
     "storage_options": {"anon": True},
 }

+
 @register_loader
 class AnemoiInferenceLoader(BaseLoader):
     name = "anemoi-inference"
@@ -26,75 +23,73 @@ class AnemoiInferenceLoader(BaseLoader):

     def _load(self):

-
         kwargs = self.kwargs.copy()
-        if isinstance(self.files,str):
+
+        if isinstance(self.files, str):
             if Path(self.files).suffix.lower() == ".zarr":
                 files = self.files
-
-
+
                 for k, v in DEFAULTS_ZARR.items():
-                    kwargs[k] = self.kwargs.get(k,v)
+                    kwargs[k] = self.kwargs.get(k, v)

                 loader = _open_zarr
             else:
                 files = [self.files]

                 for k, v in DEFAULTS_NETCDF.items():
-                    kwargs[k] = self.kwargs.get(k,v)
+                    kwargs[k] = self.kwargs.get(k, v)

                 loader = _open_mf_dataset
         else:
             files = self.files
             if Path(files[0]).suffix.lower() == ".zarr":
                 for k, v in DEFAULTS_ZARR.items():
-                    kwargs[k] = self.kwargs.get(k,v)
+                    kwargs[k] = self.kwargs.get(k, v)
                 kwargs["engine"] = "zarr"

             else:
                 for k, v in DEFAULTS_NETCDF.items():
-                    kwargs[k] = self.kwargs.get(k,v)
+                    kwargs[k] = self.kwargs.get(k, v)

             loader = _open_mf_dataset

-
         ds = loader(files, **kwargs)
         return ds


 def _open_mf_dataset(files, **kwargs):
-
-    times = xr.open_dataset(files[0], engine=kwargs["engine"], chunks=kwargs["chunks"])["time"].values
+    times = xr.open_dataset(files[0], engine=kwargs["engine"], chunks=kwargs["chunks"])[
+        "time"
+    ].values
     lead_times = times - times[0]

-    ds = xr.open_mfdataset(
-        files,
-        preprocess=_preprocess,
-        **kwargs
-    )
+    ds = xr.open_mfdataset(files, preprocess=_preprocess, **kwargs)

-    ds_out = ds.\
-        assign_coords({"lead_time": ("time", lead_times)}).\
-        rename_dims({"values": "grid_index"}).\
-        swap_dims({"time": "lead_time"})
+    ds_out = (
+        ds.assign_coords({"lead_time": ("time", lead_times)})
+        .rename_dims({"values": "grid_index"})
+        .swap_dims({"time": "lead_time"})
+    )

     return ds_out


 def _open_zarr(files, **kwargs):
-
     ds = xr.open_zarr(files, **kwargs)
     times = ds["time"].values
     lead_times = times - times[0]

     ds_out = _preprocess(ds)

-    ds_out = ds_out.\
-        assign_coords({"lead_time": ("time", lead_times)}).\
-        rename_dims({"values": "grid_index"}).\
-        swap_dims({"time": "lead_time"})
-
-    return ds_out
+    ds_out = (
+        ds_out.assign_coords({"lead_time": ("time", lead_times)})
+        .rename_dims({"values": "grid_index"})
+        .swap_dims({"time": "lead_time"})
+    )
+
+    return ds_out


def _preprocess(ds):
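The transformation both `_open_mf_dataset` and `_open_zarr` apply — deriving lead times as offsets from the first timestamp, renaming the flat `values` dimension to `grid_index`, and re-indexing on `lead_time` — can be demonstrated on a toy dataset. The variable name `t2m`, the timestamps, and the grid size here are made up for illustration:

```python
import numpy as np
import xarray as xr

# Toy stand-in for an anemoi-inference output: absolute "time" stamps
# over a flattened "values" grid dimension.
times = np.array(
    ["2024-01-01T00", "2024-01-01T06", "2024-01-01T12"], dtype="datetime64[ns]"
)
ds = xr.Dataset(
    {"t2m": (("time", "values"), np.zeros((3, 4)))},
    coords={"time": times},
)

# Same chain as in the loaders: lead times are offsets from the first step.
lead_times = times - times[0]
ds_out = (
    ds.assign_coords({"lead_time": ("time", lead_times)})
    .rename_dims({"values": "grid_index"})
    .swap_dims({"time": "lead_time"})
)

print(ds_out["t2m"].dims)  # → ('lead_time', 'grid_index')
```

Swapping to `lead_time` is what makes forecasts started at different base times directly comparable, which is the property the alignment step relies on.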