Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Load selected variables instead of making them virtual #69

Merged
merged 7 commits into from
Apr 5, 2024

Conversation

TomNicholas
Copy link
Collaborator

@TomNicholas TomNicholas commented Apr 1, 2024

Towards implementing case (1) of #62, at least the opening step. We would still need to be able to save these loaded variables out (to kerchunk and zarr ideally).

Mostly works but there is a subtlety with the difference between a Variable and an IndexVariable causing the test to fail.

@@ -21,6 +21,7 @@ def open_virtual_dataset(
filepath: str,
filetype: Optional[str] = None,
drop_variables: Optional[List[str]] = None,
load_variables: Optional[List[str]] = None,
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure "load" is the best name for this kwarg, because these variables will still be lazy. But what else is the opposite of "virtual"? "Real" also has incorrect connotations (and "inline" makes no sense here).

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Renamed to "loadable", which is marginally better because it doesn't imply that they have already been loaded.




def dataset_from_kerchunk_refs(
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We're now only using this function in internal tests

Comment on lines +77 to +78
# TODO really we probably want a dedicated xarray backend that iterates over all variables only once
ds = xr.open_dataset(filepath, drop_variables=drop_variables)
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At some point it might be nice to have a dedicated function / xarray backend that only opens the legacy file once, instead of opening it once with kerchunk and once with with xarray.

@TomNicholas TomNicholas added the enhancement New feature or request label Apr 4, 2024
Copy link

codecov bot commented Apr 4, 2024

Codecov Report

Attention: Patch coverage is 85.71429% with 8 lines in your changes are missing coverage. Please review.

Project coverage is 90.18%. Comparing base (817f63b) to head (3c8b131).

Files Patch % Lines
virtualizarr/xarray.py 81.39% 8 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main      #69      +/-   ##
==========================================
- Coverage   90.57%   90.18%   -0.40%     
==========================================
  Files          14       14              
  Lines         955      998      +43     
==========================================
+ Hits          865      900      +35     
- Misses         90       98       +8     
Flag Coverage Δ
unittests 90.18% <85.71%> (-0.40%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@TomNicholas TomNicholas merged commit 6fef91c into main Apr 5, 2024
5 of 8 checks passed
@TomNicholas TomNicholas deleted the load_variables branch April 5, 2024 00:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

1 participant