What version of Parcels are you running?
main
Is your feature request related to a problem?
Currently I'm looking at building out tooling for #2570 so that we can add these items easily to the test suite. I will end up needing quite similar tooling to that already in
https://github.com/Parcels-code/Parcels/blob/1c6369438d8623d67722bc34441dde2dd9180041/src/parcels/_tutorial.py
except that _tutorial.py assumes the underlying data are netCDF files laid out in some implicit folder structure. This isn't really ideal:
- download_example_dataset(...) returns the folder into which the data is downloaded. Users then have to open it themselves, which is cumbersome: they have to be familiar with the structure of the files and call open_dataset/open_mfdataset etc. with the right paths and globs
- users mainly just want to open an xarray dataset straight after downloading it (I don't see a use case for users needing access to the files themselves)
- adding separate tooling similar to _tutorial.py in tests/utils/_datasets.py seems like unnecessary duplication
Describe the solution you'd like
To simplify things, I wonder whether we should migrate all our example datasets to be these zipped zarr stores like in #2570
Changes:
- make a new branch v4 in the parcels-examples repo
  - unfortunately we can't just use main, as that would break things for people still running v3 code
- remove download_example_dataset(...) and replace it with open_example_dataset(...) (the latter returning a dataset object)
- optionally (but I think it would be a good idea), we could delineate between testing and tutorial datasets by providing open_tutorial_dataset and open_testing_dataset (similarly, list_tutorial_datasets and list_testing_datasets). This is solely to delineate stability: as devs, we can confidently add or remove testing datasets. Tutorial datasets can also be used in testing, but shouldn't be removed or changed in breaking ways, as that would negatively affect users following tutorials. Testing datasets shouldn't be used in tutorials.
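To make the proposed split concrete, here is a rough sketch of how the four functions could fit together. It is only an illustration under stated assumptions: the dataset names and file names are placeholders, the fetch argument is a hypothetical callable that maps a file name to a local path (however the download ends up being done), and opening a zipped zarr store through zarr's ZipStore plus xr.open_zarr is one possible way to read the files, not necessarily what #2570 does.

```python
from pathlib import Path
from typing import Callable, Dict, List

# Placeholder registries: these names and file names are hypothetical,
# not real entries in the parcels-examples repo.
_TUTORIAL_DATASETS: Dict[str, str] = {"tutorial-example": "tutorial-example.zarr.zip"}
_TESTING_DATASETS: Dict[str, str] = {"testing-example": "testing-example.zarr.zip"}

# Hypothetical download hook: maps a file name to a local path.
Fetch = Callable[[str], Path]


def list_tutorial_datasets() -> List[str]:
    """Names of the stable datasets that tutorials may rely on."""
    return sorted(_TUTORIAL_DATASETS)


def list_testing_datasets() -> List[str]:
    """Names of the datasets that devs may add or remove freely."""
    return sorted(_TESTING_DATASETS)


def _open(registry: Dict[str, str], name: str, fetch: Fetch):
    if name not in registry:
        raise ValueError(f"Unknown dataset {name!r}; available: {sorted(registry)}")
    # Imported lazily so that listing datasets does not need xarray or zarr.
    import xarray as xr
    from zarr.storage import ZipStore

    # Assumption: each dataset is a single zipped zarr store.
    store = ZipStore(fetch(registry[name]), mode="r")
    return xr.open_zarr(store)


def open_tutorial_dataset(name: str, fetch: Fetch):
    """Open a tutorial dataset and return it as an xarray Dataset."""
    return _open(_TUTORIAL_DATASETS, name, fetch)


def open_testing_dataset(name: str, fetch: Fetch):
    """Open a testing dataset and return it as an xarray Dataset."""
    return _open(_TESTING_DATASETS, name, fetch)
```

Keeping two registries behind one private helper means the stability guarantee lives in which dictionary a dataset is listed in, so the test suite would not need its own copy of the download and open logic.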
cc @erikvansebille @fluidnumerics-joe keen on your thoughts as this will change how we approach datasets in tutorials and testing
Describe alternatives you've considered
Two separate files. This would be manageable in the short term, but I think the approach in this issue is best for the long-term maintenance of Parcels
The main disadvantage of this approach:
- if users are working from netCDF files, this will impact our ability to highlight "if you want to open multiple netCDF files in xarray, you have to do xr.open_mfdataset("something-*.nc")"
  - I think this is a completely acceptable disadvantage. There are a million ways to open xarray datasets depending on storage, and I don't think it's necessarily up to us to teach the xarray API through our code examples (we can have a note block somewhere if we really want to mention it). The main thing, I think, is that we show how users can get from their model example datasets to S/Ugrid compliant data, and pass that to the rest of Parcels
Additional context
No response