Automatically close files using open_datatree context manager #93

Open
TomNicholas opened this issue May 18, 2022 · 6 comments · May be fixed by #114
Labels
enhancement New feature or request IO Representation of particular file formats as trees

Comments

@TomNicholas
Collaborator

In xarray it's possible to automatically close a dataset after opening by opening it using a context manager. From the documentation:

Datasets have a Dataset.close() method to close the associated netCDF file. However, it’s often cleaner to use a with statement:

# this automatically closes the dataset after use
In [5]: with xr.open_dataset("saved_on_disk.nc") as ds:
   ...:     print(ds.keys())
   ...: 

We currently don't have a DataTree.close() method, or any context manager behaviour for open_datatree. To add them, we would presumably need to iterate over all file handles (i.e. groups) and close them one by one.
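As a rough illustration of that idea, here is a minimal sketch using a hypothetical `Node` class, not the real DataTree. It assumes each node may hold a callable in `_close` that releases its file handle; `close()` walks the subtree and calls each one, and `__enter__`/`__exit__` give the `with` behaviour. All names here are assumptions made for the example.

```python
class Node:
    """Hypothetical stand-in for a tree node; not the real DataTree class."""

    def __init__(self, name, closer=None):
        self.name = name
        self.children = {}
        # Assumed: a callable that releases this node's file handle, or None.
        self._close = closer

    def add_child(self, child):
        self.children[child.name] = child
        return child

    def subtree(self):
        # Depth-first walk: this node first, then each child's subtree.
        yield self
        for child in self.children.values():
            yield from child.subtree()

    def close(self):
        # Close every handle in the tree, one node at a time.
        for node in self.subtree():
            if node._close is not None:
                node._close()
                node._close = None  # a second close() becomes a no-op

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        self.close()


# Usage: the closers only record which node was closed.
closed = []
root = Node("/", closer=lambda: closed.append("/"))
root.add_child(Node("a", closer=lambda: closed.append("a")))

with root as tree:
    names = [node.name for node in tree.subtree()]
# On leaving the block, every node's closer has been called once.
```

A real implementation would probably also need to decide what happens if one closer raises partway through the walk.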

Related to #90 @jhamman @thewtex

@TomNicholas TomNicholas added enhancement New feature or request IO Representation of particular file formats as trees labels May 18, 2022
@TomNicholas TomNicholas changed the title open_datatree context manager Automatically close files using open_datatree context manager May 18, 2022
@jrmagers

Could there be a load_datatree() method to be consistent with xr.load_dataset()?

@TomNicholas
Collaborator Author

Could there be a load_datatree() method

Sure. Once we have a .load() method too, writing a load_datatree() function would be simple, just like the code for xr.load_dataset().

Note that we haven't implemented dask-specific methods yet, though.
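To show the shape such a function might take, here is a minimal sketch that assumes both pieces already exist: a context-manager-capable opener and a `.load()` that returns the loaded object. `FakeTree`, `open_tree_stub` and `load_tree_sketch` are hypothetical stand-ins written for this example; none of them is datatree or xarray API.

```python
class FakeTree:
    """Hypothetical stand-in for a file-backed tree; not the real DataTree."""

    def __init__(self, path):
        self.path = path
        self.is_open = True
        self.in_memory = False

    def load(self):
        # Pretend to read all lazy data into memory, then return self.
        self.in_memory = True
        return self

    def close(self):
        self.is_open = False

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        self.close()


def open_tree_stub(path, **kwargs):
    # Stand-in opener; a real one would open the file and build the tree.
    return FakeTree(path)


def load_tree_sketch(path, **kwargs):
    # Open, load everything into memory, and close the file on the way out.
    with open_tree_stub(path, **kwargs) as tree:
        return tree.load()


tree = load_tree_sketch("example.nc")
```

The returned object keeps its in-memory data while the underlying file is already closed, which is the point of a load-style helper.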

@TomNicholas
Collaborator Author

@aurghs, @alexamici and @malmans2 - this issue and related backends issues seem like a good place for you guys to contribute if you wanted. You have expertise on xarray's backends, I don't, and they are pretty separable.

There are likely to be subtleties with respect to tracking multiple open file handles, and be aware that this will need to be done explicitly via a ._close attribute on DataTree after #41 moves that responsibility away from xarray.Dataset.

@malmans2
Member

Sure - we are on it!

@malmans2 malmans2 linked a pull request Jun 17, 2022 that will close this issue
5 tasks
@wohenbushuang

Is there any update on this issue or context manager? The file becomes occupied after open_datatree, which is so annoying.

@ghiggi

ghiggi commented Jul 31, 2023

I am also interested in having this fixed. Could we reuse the logic in xr.open_mfdataset? We could collect the closers of all nodes and then assign a partial function to _close, as is done there. Or would you prefer to design a multi-closer class?
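A minimal sketch of the collect-the-closers idea, assuming each node stores its own closer callable in `_close`. The `Node` class and `close_all` helper are hypothetical and written for this example; this is not the code from xr.open_mfdataset itself.

```python
from functools import partial


def close_all(closers):
    """Call every collected closer in order."""
    for closer in closers:
        closer()


class Node:
    """Hypothetical stand-in for a tree node; not the real DataTree class."""

    def __init__(self, name, closer=None):
        self.name = name
        self._close = closer

    def close(self):
        if self._close is not None:
            self._close()
        self._close = None  # a second close() does nothing


# Each node's closer just records its name, so the effect is visible.
log = []
nodes = [Node(name, closer=partial(log.append, name)) for name in ("/", "/a", "/a/b")]
root = nodes[0]

# Collect the per-node closers first, then give the root a single closer
# that calls all of them (including the root's own original closer).
closers = [node._close for node in nodes if node._close is not None]
root._close = partial(close_all, closers)

root.close()
root.close()  # no-op: the combined closer was cleared after the first call
```

Compared with a dedicated multi-closer class, the partial keeps the change small, at the cost of making it harder to inspect which handles are still open.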
