Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support netcdf variable emulation #5212

Merged
merged 4 commits into from May 19, 2023
Merged

Conversation

pp-mo
Copy link
Member

@pp-mo pp-mo commented Mar 27, 2023

The basic idea : when loading from / saving to a dataset-like object (as provided by #5214), Iris can spot objects which "emulate" a netCDF4.Variable and provide direct access to a content array (i.e. not via file access).
This allows us to directly transfer array data (real or lazy) without copying.

TODO: in retrospect, _in_memory_array is really not a good choice, as it seems to imply numpy data, whereas crucially it will often be dask. We should change the name to something better.
Maybe "_data_array".

This, along with #5214 (formerly #5024), should provide all we need to support the intended "Xarray bridge" - see #4994 .
Scheduled for Iris 3.6

Currently draft/WIP, because this will need adjustment after #5191, and it's far easier to resolve if that goes in first !

Also note comment here

we may have a bit more work to successfully integrate these changes with those from #5095

@pp-mo
Copy link
Member Author

pp-mo commented Mar 27, 2023

Ping @ESadek-MO @trexfeathers. Ideally I'd like this to squeeze into Iris 3.5.

Included in Iris 3.6 project

Although we've now bumped #4994 to Iris 3.7, the remainder of that work will happen in a separate "ncdata" package.
So if we can get these preliminaries into Iris itself, we can go live with that, without requiring a further Iris release to support it (I think 🤞 !)

@codecov
Copy link

codecov bot commented Mar 27, 2023

Codecov Report

Patch coverage: 100.00% and no project coverage change.

Comparison is base (a3931f6) 89.31% compared to head (1518756) 89.32%.

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #5212   +/-   ##
=======================================
  Coverage   89.31%   89.32%           
=======================================
  Files          89       89           
  Lines       22375    22379    +4     
  Branches     5368     5370    +2     
=======================================
+ Hits        19985    19989    +4     
  Misses       1640     1640           
  Partials      750      750           
Impacted Files Coverage Δ
lib/iris/fileformats/netcdf/loader.py 80.51% <100.00%> (+0.14%) ⬆️
lib/iris/fileformats/netcdf/saver.py 89.02% <100.00%> (+0.02%) ⬆️

☔ View full report in Codecov by Sentry.
📢 Do you have feedback about the report comment? Let us know in this issue.

@trexfeathers
Copy link
Contributor

Ping @ESadek-MO @trexfeathers. Ideally I'd like this to squeeze into Iris 3.5.

IMHO this should be a stretch goal, since we've already committed to the work that we will do for 3.5.

@pp-mo
Copy link
Member Author

pp-mo commented Mar 27, 2023

Ping @ESadek-MO @trexfeathers. Ideally I'd like this to squeeze into Iris 3.5.

IMHO this should be a stretch goal, since we've already committed to the work that we will do for 3.5.

It was a mistake ! now in Iris 3.6

@pp-mo pp-mo mentioned this pull request Mar 27, 2023
@pp-mo pp-mo changed the title Ncvar emulation Support netcdf variable emulation Apr 11, 2023
@pp-mo pp-mo self-assigned this May 15, 2023
@pp-mo
Copy link
Member Author

pp-mo commented May 16, 2023

Update : rebased

Actually more re-written, but it parallels what went before.
Been out rather too long 😬 🕚

BTW still in draft.

  1. may need more work to get tests passing
  2. want to get this working with prototype ncdata (which now needs some naming updates) before committing to it.

@pp-mo pp-mo marked this pull request as ready for review May 16, 2023 14:49
@pp-mo
Copy link
Member Author

pp-mo commented May 16, 2023

Update : Ready for Review

Existing tests now passing.
Tested against a revised WIP 'ncdata' + seems to function.

Ping @ESadek-MO is this for you to review also ?
( when done with #5214 !! )

Copy link
Contributor

@ESadek-MO ESadek-MO left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me in it's current state, is this waiting on threadsafety as in #5214?

@pp-mo
Copy link
Member Author

pp-mo commented May 19, 2023

Looks good to me in it's current state, is this waiting on threadsafety as in #5214?

These changes are supposedly ~independent, but yes I think it's probably best to get #5214 in first.
I'm just making a merge of the 2 for re-testing against ncdata, and I'm finding conflicts.

Thanks for looking!

@pp-mo
Copy link
Member Author

pp-mo commented May 19, 2023

I think I was wrong about the conflicts -- using an older version.
I am just re-spinning the tests to be sure it's good ...

@pp-mo
Copy link
Member Author

pp-mo commented May 19, 2023

Ok thanks @ESadek-MO , I'm happy this is good to go !

( btw I tested a combination of this+#5414 against prototype ncdata, and it seems good 👍 )

Copy link
Contributor

@ESadek-MO ESadek-MO left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @pp-mo, looks good to me!

@ESadek-MO ESadek-MO merged commit a21bfc9 into SciTools:main May 19, 2023
17 checks passed
tkknight added a commit to tkknight/iris that referenced this pull request May 21, 2023
* upstream/main:
  Updated environment lockfiles (SciTools#5332)
  Support netcdf variable emulation (SciTools#5212)
  Support netCDF load+save on dataset-like objects as well as filepaths. (SciTools#5214)
  minor refinement to release do-nothing script (SciTools#5326)
  update bibtex citation for v3.6.0 release (SciTools#5324)
  Whats new updates for v3.6.0 (SciTools#5323)
  Added whatsnew notes on netcdf delayed saving and Distributed support. (SciTools#5322)
  Updated environment lockfiles (SciTools#5320)
  Bugfix 4566 (SciTools#4569)
  Simpler/faster data aggregation code in `aggregated_by` (SciTools#4970)
@pp-mo pp-mo mentioned this pull request Jul 3, 2023
tkknight added a commit to tkknight/iris that referenced this pull request Nov 9, 2023
* upstream/main:
  Updated environment lockfiles (SciTools#5334)
  Iris Docs: Dark theme ready (SciTools#5299)
  Updated environment lockfiles (SciTools#5332)
  Support netcdf variable emulation (SciTools#5212)
  Support netCDF load+save on dataset-like objects as well as filepaths. (SciTools#5214)
  minor refinement to release do-nothing script (SciTools#5326)
  update bibtex citation for v3.6.0 release (SciTools#5324)
  Whats new updates for v3.6.0 (SciTools#5323)
  Added whatsnew notes on netcdf delayed saving and Distributed support. (SciTools#5322)
  Updated environment lockfiles (SciTools#5320)
  Bugfix 4566 (SciTools#4569)
  Simpler/faster data aggregation code in `aggregated_by` (SciTools#4970)
  update release_do_nothing script (SciTools#5311)
  Restore latest Whats New files.
  Updated environment lockfiles (SciTools#5308)
  release_do_nothing script updates from v3.6.0rc0 (SciTools#5306)
  Whats new updates for v3.6.0rc0 (SciTools#5300)
  Bump scitools/workflows from 2023.04.3 to 2023.05.0 (SciTools#5303)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Status: 🏁 Done
Development

Successfully merging this pull request may close these issues.

None yet

3 participants