For phase 2, we'd like to produce surface and atmospheric Zarrs that can be updated with preliminary data. Specifically, we intend to backfill the raw data covering 1959 to 1978. It's possible that in the future, ECMWF will produce an even earlier backfill. As I understand it, the standard structure of Zarr only allows appending to the end.
The aim of this issue is to devise a way to avoid recomputing our Zarr datasets whenever we want to include new data at earlier times.
For overall pipeline structure, I have the following sketch in mind:
For each epoch of preliminary data we ingest from ECMWF, we manually produce a new cloud-optimized dataset. This can use the scripts we've already developed in this project.
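To make the per-epoch idea concrete, here is a minimal sketch of a catalog that tracks independently produced stores by the time range they cover. Everything in it is an assumption for illustration: the `Epoch` record, the `add_epoch` and `stores_for` helpers, the store names, and the end date of the main dataset are hypothetical and not part of this project's scripts.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Epoch:
    """One independently produced Zarr store (hypothetical record)."""
    store: str   # placeholder path, not a real bucket
    start: date  # first day covered, inclusive
    end: date    # last day covered, inclusive


def add_epoch(catalog: list[Epoch], new: Epoch) -> list[Epoch]:
    """Return a new catalog that includes `new`, sorted by start date.

    Raises ValueError if `new` overlaps an epoch already in the catalog.
    """
    for existing in catalog:
        if new.start <= existing.end and existing.start <= new.end:
            raise ValueError(f"{new.store} overlaps {existing.store}")
    return sorted([*catalog, new], key=lambda e: e.start)


def stores_for(catalog: list[Epoch], start: date, end: date) -> list[str]:
    """List the stores, in time order, that intersect [start, end]."""
    return [e.store for e in catalog if e.start <= end and start <= e.end]


# Illustrative dates: the main dataset's range is an assumption.
catalog = add_epoch([], Epoch("main.zarr", date(1979, 1, 1), date(2022, 12, 31)))
# Registering the backfill touches no existing store; it only adds an entry.
catalog = add_epoch(
    catalog, Epoch("backfill-1959-1978.zarr", date(1959, 1, 1), date(1978, 12, 31))
)
```

With a catalog like this, adding an earlier epoch never rewrites existing data. A reader could open the stores returned by `stores_for` and concatenate them lazily along the time dimension, for example with xarray's `open_zarr` and `concat`, though whether that performs well enough for these datasets is untested here.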
@shoyer @rabernat: Do either of you have any thoughts on how we could structure our Zarr to these ends?