Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement split cube attributes. #5040

Merged
merged 29 commits into from
Jul 20, 2023

Conversation

pp-mo
Copy link
Member

@pp-mo pp-mo commented Oct 31, 2022

Initial code implementation
This is an alternative to the #5038 approach, which has hit some problems (a lot of consequential knock-ons).
See closing comment of #4982 for details

This introduces a new "split dictionary" type, CubeAttrsDict, and ensures that a Cube.attributes is always one of those.
The new type has special behaviours to make it inter-compatible with a 'normal' dictionary -- or more precisely, a iris.common.mixin.LimitedAttributeDict which is what all other .attributes properties are.

This means that now, a Cube.attributes is a rather different kind of a thing from a Coord.attributes.
We hope this makes few practical differences, compared to a LimitedAttributeDict, but there are a few, such as :

  • the object type and class inheritance has changed
    -- it is still a MutableMapping, but no longer a subtype of dict.
  • ordering of keys is affected
  • 'repr' form has changed
    I thought it was no longer appropriate to make it "look like" a standard dict, as LimitedAttributesDict does, since that style can no longer fully represent the content state. Unfortunately that complicates some of the existing docs-examples, so I've tried to adjust those to cope.

@trexfeathers trexfeathers added the Type: Feature Branch Highlight this for a feature branch label Nov 2, 2022
@pp-mo pp-mo mentioned this pull request Nov 3, 2022
@pp-mo pp-mo linked an issue Nov 3, 2022 that may be closed by this pull request
@pp-mo pp-mo marked this pull request as ready for review November 4, 2022 00:10
@pp-mo
Copy link
Member Author

pp-mo commented Nov 4, 2022

Ping @lbdreyer
I reckon this is now ready to go live.

Since I changed tack (from #5038), there are no longer separate changes to cube metadata #4982 and cube attributes handling #4983 : This is effectively doing both, and no change to CubeMetadata is required after all.

In itself, this changes almost nothing in behaviour, since all the existing code (including netcdf load+save) reads+writes the new split-type attributes dictionary as if it were an ordinary dictionary. So, real behaviour change can only occur when we change the load+save code.
However, there are some odd corner-cases which will not behave quite as before (see list in heading para). Most notably, the ordering of attributes (keys) may be changed -- though NB this does not affect dictionary comparison/equality. And of course, any type-dependent usages may fail.

This was referenced Nov 4, 2022
Copy link
Member

@lbdreyer lbdreyer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is looking good so far! Most of my comments are minor, although the two main points I am unsure about are the behaviour of setitem and whether this needs to part of the public API

lib/iris/cube.py Show resolved Hide resolved
lib/iris/cube.py Outdated Show resolved Hide resolved
lib/iris/cube.py Show resolved Hide resolved
lib/iris/tests/unit/cube/test_CubeAttrsDict.py Outdated Show resolved Hide resolved
lib/iris/cube.py Outdated Show resolved Hide resolved
lib/iris/cube.py Outdated Show resolved Hide resolved
lib/iris/cube.py Outdated Show resolved Hide resolved
@pp-mo
Copy link
Member Author

pp-mo commented Feb 20, 2023

Coming up : I'm intending now to repsond to all the above comments.
But I'll also re-base this onto the latest feature-branch version, which will probably introduce some errors/changes in the 'status quo' tests ...

lib/iris/cube.py Outdated Show resolved Hide resolved
Copy link
Contributor

@trexfeathers trexfeathers left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My review is in progress, still need to look at test_CubeAttrsDict.py. Here are some comments in the meantime 😊

docs/src/further_topics/metadata.rst Show resolved Hide resolved
lib/iris/cube.py Show resolved Hide resolved
lib/iris/cube.py Outdated Show resolved Hide resolved
lib/iris/cube.py Outdated Show resolved Hide resolved
lib/iris/cube.py Outdated Show resolved Hide resolved
lib/iris/cube.py Outdated Show resolved Hide resolved
lib/iris/cube.py Show resolved Hide resolved
lib/iris/cube.py Show resolved Hide resolved
lib/iris/cube.py Show resolved Hide resolved
lib/iris/cube.py Outdated Show resolved Hide resolved
Copy link
Contributor

@trexfeathers trexfeathers left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here's the rest of my review

lib/iris/cube.py Outdated Show resolved Hide resolved
lib/iris/cube.py Outdated Show resolved Hide resolved
lib/iris/cube.py Show resolved Hide resolved
Copy link
Contributor

@trexfeathers trexfeathers left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for getting the docs rendering nicely! I have therefore been able to finish reviewing the docstrings.

lib/iris/cube.py Outdated Show resolved Hide resolved
lib/iris/cube.py Outdated Show resolved Hide resolved
lib/iris/cube.py Outdated Show resolved Hide resolved
lib/iris/cube.py Outdated Show resolved Hide resolved
lib/iris/cube.py Outdated Show resolved Hide resolved
lib/iris/cube.py Outdated Show resolved Hide resolved
lib/iris/cube.py Show resolved Hide resolved
lib/iris/cube.py Outdated Show resolved Hide resolved
lib/iris/cube.py Outdated Show resolved Hide resolved
@codecov
Copy link

codecov bot commented Jul 19, 2023

Codecov Report

Patch coverage: 95.59% and project coverage change: +0.16 🎉

Comparison is base (0b54d58) 89.25% compared to head (96792e3) 89.41%.

❗ Current head 96792e3 differs from pull request most recent head 9efe4f6. Consider uploading reports for the commit 9efe4f6 to get more accurate results

Additional details and impacted files
@@                   Coverage Diff                   @@
##           FEATURE_split_attrs    #5040      +/-   ##
=======================================================
+ Coverage                89.25%   89.41%   +0.16%     
=======================================================
  Files                       88       89       +1     
  Lines                    22197    22499     +302     
  Branches                  4858     5395     +537     
=======================================================
+ Hits                     19811    20118     +307     
+ Misses                    1641     1636       -5     
  Partials                   745      745              
Impacted Files Coverage Δ
lib/iris/_merge.py 97.88% <ø> (ø)
lib/iris/analysis/_area_weighted.py 86.89% <ø> (ø)
lib/iris/analysis/_interpolation.py 95.80% <ø> (ø)
lib/iris/analysis/_regrid.py 83.33% <ø> (ø)
lib/iris/analysis/trajectory.py 94.44% <ø> (ø)
lib/iris/common/resolve.py 92.88% <ø> (ø)
lib/iris/config.py 81.17% <ø> (ø)
lib/iris/experimental/regrid.py 77.38% <ø> (ø)
lib/iris/experimental/ugrid/mesh.py 92.18% <ø> (ø)
lib/iris/experimental/ugrid/utils.py 100.00% <ø> (ø)
... and 31 more

... and 2 files with indirect coverage changes

☔ View full report in Codecov by Sentry.
📢 Do you have feedback about the report comment? Let us know in this issue.

pp-mo and others added 8 commits July 19, 2023 17:41
Co-authored-by: Martin Yeo <40734014+trexfeathers@users.noreply.github.com>
Co-authored-by: Martin Yeo <40734014+trexfeathers@users.noreply.github.com>
Co-authored-by: Martin Yeo <40734014+trexfeathers@users.noreply.github.com>
Co-authored-by: Martin Yeo <40734014+trexfeathers@users.noreply.github.com>
@pp-mo
Copy link
Member Author

pp-mo commented Jul 20, 2023

Ok, I've tried to make the docs a bit shorter, and hopefully tidier.
N.B. also merged from truck again to fix the linkcheck breakage.
@trexfeathers please re-review !

@trexfeathers
Copy link
Contributor

Love the updated documentation @pp-mo 💯

Three unresolved conversations that are hidden:

@pp-mo
Copy link
Member Author

pp-mo commented Jul 20, 2023

Love the updated documentation @pp-mo 💯

Three unresolved conversations that are hidden:

Oops, yes.
Coming shortly ..

@pp-mo
Copy link
Member Author

pp-mo commented Jul 20, 2023

Thanks @trexfeathers hope that fixes the outstanding.

Copy link
Contributor

@trexfeathers trexfeathers left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your hard work on this @pp-mo! Ready to continue the adventure 🌄

@trexfeathers trexfeathers dismissed lbdreyer’s stale review July 20, 2023 13:48

lbdreyer has been pulled onto other things, I have done my best to pick up the comments they raised during my review.

@trexfeathers trexfeathers merged commit 8b751c5 into SciTools:FEATURE_split_attrs Jul 20, 2023
@pp-mo pp-mo deleted the splitattrs_cubeattrs branch August 18, 2023 10:15
ESadek-MO pushed a commit that referenced this pull request Nov 21, 2023
* Split-attrs: Cube metadata refactortests (#4993)

* Convert Test___eq__ to pytest.

* Convert Test_combine to pytest.

* Convert Test_difference to pytest.

* Review changes.

* Split attrs - tests for status quo (#4960)

* Tests for attribute handling in netcdf load/save.

* Tidy test functions.

* Fix import order exception.

* Add cf-global attributes test.

* Towards more pytest-y implemenation.

* Replace 'create_testcase' with fixture which also handles temporary directory.

* Much tidy; use fixtures to parametrise over multiple attributes.

* Fix warnings; begin data-style attrs tests.

* Tests for data-style attributes.

* Simplify setup fixture + improve docstring.

* No parallel test runner, to avoid error for Python>3.8.

* Fixed for new-style netcdf module.

* Small review changes.

* Rename attributes set 'data-style' as 'local-style'.

* Simplify use of fixtures; clarify docstrings/comments and improve argument names.

* Clarify testing sections for different attribute 'styles'.

* Re-enable parallel testing.

* Sorted params to avoid parallel testing bug - pytest#432.

* Rename test functions to make alpha-order match order in class.

* Split netcdf load/save attribute testing into separate sourcefile.

* Add tests for loaded cube attributes; refactor to share code between Load and Roundtrip tests.

* Add tests for attribute saving.

* Fix method names in comments.

* Clarify source of Conventions attributes.

* Explain the test numbering in TestRoundtrip/TestLoad.

* Remove obsolete test helper method.

* Fix small typo; Fix numbering of testcases in TestSave.

* Implement split cube attributes. (#5040)

* Implement split cube attributes.

* Test fixes.

* Modify examples for simpler metadata printouts.

* Added tests, small behaviour fixes.

* Simplify copy.

* Fix doctests.

* Skip doctests with non-replicable outputs (from use of sets).

* Tidy test comments, and add extra test.

* Tiny typo.

* Remove redundant redefinition of Cube.attributes.

* Add CubeAttrsDict in module __all__ + improve docs coverage.

* Review changes - small test changes.

* More review changes.

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Fix CubeAttrsDict example docstrings.

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Odd small fixes.

* Improved docstrings and comments; fix doctests.

* Don't sidestep netcdf4 thread-safety.

* Publicise LimitedAttributeDict, so CubeAttrsDict can refer to it.

* Fix various internal + external links.

* Update lib/iris/cube.py

Co-authored-by: Martin Yeo <40734014+trexfeathers@users.noreply.github.com>

* Update lib/iris/cube.py

Co-authored-by: Martin Yeo <40734014+trexfeathers@users.noreply.github.com>

* Update lib/iris/cube.py

Co-authored-by: Martin Yeo <40734014+trexfeathers@users.noreply.github.com>

* Update lib/iris/cube.py

Co-authored-by: Martin Yeo <40734014+trexfeathers@users.noreply.github.com>

* Streamline docs.

* Review changes.

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Martin Yeo <40734014+trexfeathers@users.noreply.github.com>

* Splitattrs ncload (#5384)

* Distinguish local+global attributes in netcdf loads.

* Small test fixes.

* Small doctest fix.

* Fix attribute load-save tests for new behaviour, and old-behaviour equivalence.

* Split attrs docs (#5418)

* Clarification in CubeAttrsDict examples.

* CubeAttrsDict fix docstring typo.

* Raise awareness of split attributes in user guide.

* What's New entry.

* Changes to metadata documentation.

* Splitattrs ncsave redo (#5410)

* Add docs and future switch, no function yet.

* Typing enables code completion for Cube.attributes.

* Make roundtrip checking more precise + improve some tests accordingly (cf. #5403).

* Rework all tests to use common setup + results-checking code.

* Saver supports split-attributes saving (no tests yet).

* Tiny docs fix.

* Explain test routines better.

* Fix init of FUTURE object.

* Remove spurious re-test of FUTURE.save_split_attrs.

* Don't create Cube attrs of 'None' (n.b. but no effect as currently used).

* Remove/repair refs to obsolete routines.

* Check all warnings from save operations.

* Remove TestSave test numbers.

* More save cases: no match with missing, and different cube attribute types.

* Run save/roundtrip tests both with+without split saves.

* Fix.

* Review changes.

* Fix changed warning messages.

* Move warnings checking from 'run' to 'check' phase.

* Simplify and improve warnings checking code.

* Fix wrong testcase.

* Minor review changes.

* Fix reverted code.

* Use sets to simplify demoted-attributes code.

* WIP

* Working with iris 3.6.1, no errors TestSave or TestRoundtrip.

* Interim save (incomplete?).

* Different results form for split tests; working for roundtrip.

* Check that all param lists are sorted.

* Check matrix result-files compatibility; add test_save_matrix.

* test_load_matrix added; two types of load result.

* Finalise special-case attributes.

* Small docs tweaks.

* Add some more testcases,

* Ensure valid sort-order for globals of possibly different types.

* Initialise matrix results with legacy values from v3.6.1 -- all matching.

* Add full current matrix results, i.e. snapshot current behaviours.

* Review changes : rename some matrix testcases, for clarity.

* Splitattrs ncsave redo commonmeta (#5538)

* Define common-metadata operartions on split attribute dictionaries.

* Tests for split-attributes handling in CubeMetadata operations.

* Small tidy and clarify.

* Common metadata ops support mixed split/unsplit attribute dicts.

* Clarify with better naming, comments, docstrings.

* Remove split-attrs handling to own sourcefile, and implement as a decorator.

* Remove redundant tests duplicated by matrix testcases.

* Newstyle split-attrs matrix testing, with fewer testcases.

* Small improvements to comments + docstrings.

* Fix logic for equals expectation; expand primary/secondary independence test.

* Clarify result testing in metadata operations decorator.

* Splitattrs equalise (#5586)

* Add tests in advance for split-attributes handling cases.

* Move dict conversion inside utility, for use elsewhere.

* Add support for split-attributes to equalise_attributes.

* Update lib/iris/util.py

Co-authored-by: Martin Yeo <40734014+trexfeathers@users.noreply.github.com>

* Update lib/iris/tests/unit/util/test_equalise_attributes.py

Co-authored-by: Martin Yeo <40734014+trexfeathers@users.noreply.github.com>

* Simplify and clarify equalise_attributes code.

---------

Co-authored-by: Martin Yeo <40734014+trexfeathers@users.noreply.github.com>

* Fix merge-fail messaging for attribute mismatches. (#5590)

* Extra CubeAttrsDict methods to emulate dictionary behaviours. (#5592)

* Extra CubeAttrsDict methods to emulate dictionary behaviours.

* Don't use staticmethod on fixture.

* Add Iris warning categories to saver warnings.

* Type equality fixes for new flake8.

* Licence header fixes.

* Splitattrs ncsave deprecation (#5595)

* Small improvement to split-attrs whatsnew.

* Emit deprecation warning when saving without split-attrs enabled.

* Stop legacy-split-attribute warnings from upsetting delayed-saving tests.

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Martin Yeo <40734014+trexfeathers@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Type: Feature Branch Highlight this for a feature branch
Projects
No open projects
Status: 🏁 Done
Development

Successfully merging this pull request may close these issues.

split-attrs: cube metadata
3 participants