open_datatree performance improvement on NetCDF and Zarr files #9014
Thank you for opening this pull request! It may take us a few days to respond here, so thank you for being patient.
@@ -416,6 +415,104 @@ class ZarrStore(AbstractWritableDataStore):
"_close_store_on_close",
)

@classmethod
def open_store(
Could you rewrite `open_group` to call `open_store()` internally? That would reduce the amount of duplicated code and make this easier to maintain going forward.
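The delegation suggested here could look roughly like the sketch below. It is only an illustration under stated assumptions: `ToyStore` is a hypothetical stand-in and not xarray's `ZarrStore`, the plain `dict` stands in for an opened Zarr hierarchy, and the signatures of `open_store` and `open_group` are invented for the example rather than taken from this PR.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class ToyStore:
    """Hypothetical stand-in for ZarrStore; not xarray's real class."""

    root: dict  # assumed: maps group paths to their contents
    group: str

    @classmethod
    def open_store(cls, root: dict, group: str = "/") -> dict[str, ToyStore]:
        """Open the requested group and every group nested below it."""
        prefix = group.rstrip("/") + "/"
        paths = [p for p in root if p == group or p.startswith(prefix)]
        return {p: cls(root, p) for p in sorted(paths)}

    @classmethod
    def open_group(cls, root: dict, group: str = "/") -> ToyStore:
        """Open a single group by reusing open_store instead of repeating its setup."""
        return cls.open_store(root, group)[group]


hierarchy = {"/": {}, "/a": {}, "/a/b": {}, "/ab": {}}
stores = ToyStore.open_store(hierarchy, "/a")
single = ToyStore.open_group(hierarchy, "/a")
```

With this shape, any change to how a store is constructed only has to be made in `open_store`. In a real backend it may be cheaper to factor the shared setup into a private helper that both methods call, so that `open_group` does not walk every subgroup.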
…to datatree-zarr merging branches
I had thoughts about the legacyhdf5 api and how it might be incorporated.
@@ -16,7 +16,6 @@
BackendEntrypoint,
WritableCFDataStore,
_normalize_path,
_open_datatree_netcdf,
Before this PR, `_open_datatree_netcdf` was used by both the netCDF4_.py and h5netcdf_.py backends. Would it be possible to move these changes back into the backends/common location and rewrite the `_open_datatree_netcdf` function there instead of removing it?
I think if you move these changes back to `backends/common` and leave them in `_open_datatree_netcdf`, you might be able to import the store for both the legacyhdf5 and the netcdf4 libraries, and then call `_open_datatree_netcdf` for both legacyhdf5 and netcdf4.
Currently `_open_datatree_netcdf` takes `ncDataset: ncDataset | ncDatasetLegacyH5`. You might be able to add a new parameter typed `cdfDataStore: NetCDF4DataStore | H5NetCDFStore` and pass the appropriate one from both the h5netcdf_.py and netCDF4_.py backends.
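One way to picture this suggestion is a single shared helper that receives the store class as an argument. The sketch below is an assumption-laden illustration, not the PR's code: `ToyNetCDF4Store`, `ToyH5NetCDFStore`, `iter_groups`, and `open_datatree_shared` are hypothetical names, and nested dicts stand in for an opened netCDF file and its groups.

```python
from __future__ import annotations

from typing import Iterator


class ToyNetCDF4Store:
    """Hypothetical stand-in for NetCDF4DataStore."""

    def __init__(self, group: dict, path: str) -> None:
        self.group, self.path = group, path


class ToyH5NetCDFStore:
    """Hypothetical stand-in for H5NetCDFStore."""

    def __init__(self, group: dict, path: str) -> None:
        self.group, self.path = group, path


def iter_groups(group: dict, path: str = "/") -> Iterator[tuple[str, dict]]:
    """Yield (path, group) for a group and all of its descendants, parents first."""
    yield path, group
    for name, child in group.get("groups", {}).items():
        yield from iter_groups(child, path.rstrip("/") + "/" + name)


def open_datatree_shared(
    root: dict,
    store_cls: type[ToyNetCDF4Store] | type[ToyH5NetCDFStore],
) -> dict[str, ToyNetCDF4Store | ToyH5NetCDFStore]:
    """Shared helper: each backend passes in its own store class."""
    return {path: store_cls(grp, path) for path, grp in iter_groups(root)}


toy_file = {"groups": {"a": {"groups": {"b": {}}}}}
tree_h5 = open_datatree_shared(toy_file, ToyH5NetCDFStore)
tree_nc = open_datatree_shared(toy_file, ToyNetCDF4Store)
```

Each backend module would then keep only a thin wrapper that supplies its own store class, while the group traversal lives in one place.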
open_datatree performance improvement on NetCDF files
whats-new.rst