-
Notifications
You must be signed in to change notification settings - Fork 284
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[FB] [PI-3478] Common metadata API #3583
[FB] [PI-3478] Common metadata API #3583
Conversation
8e571b0
to
14a9556
Compare
Meta-class to support the convenience of creating a namedtuple from | ||
names/members of the metadata class hierarchy. | ||
|
||
""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"attributes", | ||
) | ||
|
||
__slots__ = () |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lib/iris/common/metadata.py
Outdated
return (standard_name, long_name, var_name, stash_name) | ||
|
||
|
||
def MetadataFactory(cls, **kwargs): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that, this is a function, not a class...
match = self.values == other.values | ||
return match | ||
|
||
def __getstate__(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
match = not match | ||
return match | ||
|
||
def __reduce__(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
) | ||
return "{}({})".format(self.__class__.__name__, args) | ||
|
||
def __setstate__(self, state): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
namespace["_names"] = cls._names | ||
|
||
# Dynamically create the class. | ||
Metadata = type(name, bases, namespace) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Dynamically created classes at runtime cannot be pickled, due to not | ||
being defined at the top level of a module. As a result, we require to | ||
use the __reduce__ interface to allow 'pickle' to recreate this class | ||
instance, and dump and load instance state successfully. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
7e75975
to
8068848
Compare
8068848
to
137d4a2
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What a big PR! But definitely an improvement to what we had before!
Overall I'm very much in support of the changes. Most of my comments relate to a few (mostly) minor points:
-
Naming
When reviewing this, I did find I was getting confused between what was returned by theMetadataFactory
(returned bycube._metadata
) and the instances of, for example,CubeMetadata
(returned bycube.metadata
). I think to provide more clarity between these I would suggest we rename a few things. (see individual comments below) -
Test names
There was a bit of inconsistency with the naming of tests when the name of class/function being tests starts with an underscore. -
The number of asserts per test
A lot of the tests grouped multiple self.assert*. I would prefer these are split up such that each independent test (i.e.test_standard_name
) tests a single scenario -
I believe we are missing some test coverage for
_BaseMeta.__lt__
lib/iris/common/metadata.py
Outdated
@property | ||
def _names(self): | ||
""" | ||
A tuple containing the value of each name participating in the identify |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this a typo? I would expect it to say identity
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yup, good spot 👀
@@ -4,15 +4,15 @@ | |||
# See COPYING and COPYING.LESSER in the root of the repository for full | |||
# licensing details. | |||
""" | |||
Unit tests for the :func:`iris._cube_coord_common.get_valid_standard_name`. | |||
Unit tests for the :func:`iris.common.mixin.get_valid_standard_name`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unit tests for the :func:`iris.common.mixin.get_valid_standard_name`. | |
Unit tests for the :func:`iris.common.mixin._get_valid_standard_name`. |
And shouldn't this file be called .../common/mixin/test__get_valid_standard_name.py
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@lbdreyer Yup. Originally it wasn't a private function. I ported it across into iris.common.mixin
wrote the tests, then later changed it to be a private function. Forgot to also change the associated comment and test file name. Another good spot!
self.item = CFVariableMixin() | ||
self.item._metadata = metadata | ||
|
||
def test(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you split this into two tests; one for renaming to air_temperature
and one for renaming to nope nope
self.item = CFVariableMixin() | ||
self.item._metadata = metadata | ||
|
||
def test_standard_name(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you split this up into three tests?
For readability it is nicer this way, but you are testing different scenarios here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you also do this for the other tests in this test class?
lib/iris/common/metadata.py
Outdated
_TOKEN_PARSE = re.compile(r"""^[a-zA-Z0-9][\w\.\+\-@]*$""") | ||
|
||
|
||
class _BaseMeta(ABCMeta): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I find this name a bit confusing as we have both METAdata
and METAclass
so it's unclear what the Meta
refers to.
But I don't have a suggestion just yet. I will keep thinking....
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@lbdreyer Yup, on reflection I do agree. It is confusing 😕
After a bit of thought (and mutual agreement) we're going with _NamedTupleMeta
instead.
from iris.common.metadata import BaseMetadata | ||
|
||
|
||
class Test(tests.IrisTest): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be helpful to include tests for __lt__
on BaseMetadata
@@ -5,43 +5,20 @@ | |||
# licensing details. | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not so sure about the name of this module. mixin
doesn't tell you much about what this module is supposed to contain. Unless you intend all mixin classes to live here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@lbdreyer My intent here was simple, as in, it's a namespace for common mixin classes to live.
It might just be the case that CFVariableMixin
will be our only mixin class, but at least we have scope for collecting such classes together in one place, rather than having them sprinkled around the codebase. HTH 😉
lib/iris/common/metadata.py
Outdated
raise ValueError(emsg.format(cls.__name__, bad)) | ||
|
||
# Define the name, (inheritance) bases and namespace of the dynamic class. | ||
name = "Metadata" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we call this MetadataManager
instead?
And perhaps call this function MetadataManagerFactory
instead of MetadataFactory
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@lbdreyer Agreed. Let's do both for clarity.
lib/iris/cube.py
Outdated
@@ -775,6 +747,9 @@ def __init__( | |||
if isinstance(data, str): | |||
raise TypeError("Invalid data type: {!r}.".format(data)) | |||
|
|||
# Configure the metadata manager. | |||
self._metadata = MetadataFactory(CubeMetadata) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we call these self._metadata_manager
?
Currently we have:
self.metadata
- returns an instance of BaseMetadata which is a souped up namedtuple
self._metadata
- returns the instance that MetadataFactory returns.
I find that a bit confusing.
Another good thing about self._metadata_manager
is follows a similar naming pattern to self._data_manager
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @bjlittle !
I have now fully completed reviewing this PR so it's ready for you to get started on making changes
lib/iris/coords.py
Outdated
self._metadata.climatological = False | ||
if not self.units.is_time_reference(): | ||
self._metadata.climatological = False | ||
return self._metadata.climatological |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have come around to the idea of this.
We do a is_time_reference
check in the setter, so we are being careful with what is being set as having climatological bounds.
There is a risk is if someone modifies an already existing climatological coordinate to something like a latitude coordinate, without updating the climatological property. In these cases, I would prefer we ensure we save things out correctly (i.e. pick up that this coordinate shouldn't be climatological). It would only be a problem if they were to access coord._metadata.climatological directly, but they shouldn't be doing that
.
So this just needs some testing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@lbdreyer Thanks for taking the time to think about this one, much appreciated.
It was a bit cheeky of me to include this in this PR, but since I was in the space, I thought I'd try to address something that reeked of a potential issue for users.
I'll add some tests 😄
5db15a5
to
59dba91
Compare
Co-Authored-By: lbdreyer <lbdreyer@users.noreply.github.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 Happy with the changes. Now just to wait for travis then this can be merged!
@lbdreyer and @alastair-gemmell Awesome, thanks! 🍻 🎉 |
* common metadata api * rationalise _cube_coord_common into common * move state into metadata * MetadataFactory test coverage * temporarily pin back iris-grib * test coverage for iris.common.metadata._BaseMeta * test coverage for iris.common.mixin.LimitedAttributeDict * remove temporary iris-grib pin * review actions * Update lib/iris/tests/unit/common/metadata/test_BaseMetadata.py Co-Authored-By: lbdreyer <lbdreyer@users.noreply.github.com>
* common metadata api * rationalise _cube_coord_common into common * move state into metadata * MetadataFactory test coverage * temporarily pin back iris-grib * test coverage for iris.common.metadata._BaseMeta * test coverage for iris.common.mixin.LimitedAttributeDict * remove temporary iris-grib pin * review actions * Update lib/iris/tests/unit/common/metadata/test_BaseMetadata.py Co-Authored-By: lbdreyer <lbdreyer@users.noreply.github.com>
* PI-3478: Common metadata API (#3583) * common metadata api * rationalise _cube_coord_common into common * move state into metadata * MetadataFactory test coverage * temporarily pin back iris-grib * test coverage for iris.common.metadata._BaseMeta * test coverage for iris.common.mixin.LimitedAttributeDict * remove temporary iris-grib pin * review actions * Update lib/iris/tests/unit/common/metadata/test_BaseMetadata.py Co-Authored-By: lbdreyer <lbdreyer@users.noreply.github.com> * [FB] [PI-3478] Lenient metadata (#3739) * add lenient infra-structure * add metadata lenient __eq__ support * complete __eq__, combine and difference support * explicit inherited lenient_service + support equal convenience * fix attributes difference + lenient kwargs * make lenient public + minor tidy * rename MetadataManagerFactory to metadata_manager_factory * extend lenient_client decorator to support services registration * add lenient test coverage * purge qualname usage in metadata.py * support global enable for lenient services * support partial mapping metadata assignment * purge Lenient.__setattr__ from api * add BaseMetadata compare test coverage * metadata rationalisation * add BaseMetadata difference test coverage * added context manager ephemeral comment clarification * add BaseMetadata __ne__ test coverage * standardise lenient decorator closure names * add BaseMetadata equal test coverage * half dunder context * add AncillaryVariableMetadata test coverage * add additional AncillaryVariableMetadata test coverage * add CellMeasureMetadata test coverage * Clarify lenient_service operation + simplify code. * add CoordMetadata test coverage * add CubeMetadata test coverage * metadata tests use self.cls * fix typo * fix context manager ephemeral services * add logging * Pin pillow to make graphics tests work again. (#3630) * Fixed tests since Numpy 1.18 deprecation of non-int num arguments for linspace. (#3655) * Switched use of datetime.weekday() to datetime.dayofwk. (#3687) * New image hashes for mpl 3x2 (#3682) * New image hash for iris.test.test_plot.TestSymbols.test_cloud_cover with matplotlib 3.2.0. * Further images changes for mpl3x2. * Yet more updated image results. * fix sentinel uniqueness test failure * remove redundant cdm mapping test * difference returns None for no difference * protect Lenient and LENIENT private * privitise lenient framework and add API veneer * add explicit maths feature default * review actions * review actions * trexfeathers review actions * stephenworsley review actions Co-authored-by: Patrick Peglar <patrick.peglar@metoffice.gov.uk> Co-authored-by: Martin Yeo <40734014+trexfeathers@users.noreply.github.com> * [FB] [PI-3478] Lenient cube arithmetic (#3774) * initial cube arithmetic * support in-place cube resolve * fix non in-place broadcasting * remove temporary resolve scenario test * lenient/strict support for attributes dicts with numpy arrays * lenient/strict treatment of scalar coordinates * strict points/bounds matching * lenient/strict prepare local dim/aux/scalar coordinates * support extended broadcasting * always raise exception on points/bounds mismatch * ignore scalar points/bounds mismatches, lenient only * remove todos * tidy logger debugs * qualify src/tgt cube references in debug * Numpy rounding fix (#3758) ensure rounding is numpy like (maintains type) * avoid unittest.mock.sentinel copy issue * fast load np.int32 * fix cube maths doctest * fix iris.common.resolve logging configuration * fix prepare points/bounds + extra metadata cml * support mapping reversal based on free dims * var_name fix for lenient equality * add support for DimCoordMetadata * fix circular flag + support CoordMetadata and DimCoordMetadata exchange * fix circular issue for concatenate DimCoord->AuxCoord demotion * fix concatenate._CubeSignature sorted * minor tweaks * keep lenient_client private in maths * tidy maths * tidy iris.analysis.maths.IFunc * refactor IFunc test * polish in-place support * tidy metadata_resolve Co-authored-by: stephenworsley <49274989+stephenworsley@users.noreply.github.com> * rebase master fix-up for cube arithmetic * add missing new dependency to readthedocs.yml requirements Co-authored-by: lbdreyer <lbdreyer@users.noreply.github.com> Co-authored-by: Patrick Peglar <patrick.peglar@metoffice.gov.uk> Co-authored-by: Martin Yeo <40734014+trexfeathers@users.noreply.github.com> Co-authored-by: stephenworsley <49274989+stephenworsley@users.noreply.github.com>
This PR introduces a common metadata API for cubes, coordinates and auxiliary factories, and also addresses a lot of connective tissue technical debt. This PR underpins the forthcoming changes for cube arithmetic.
Essentially, this PR proposes a
metadata
interface, which provides a unified, consistent treatment of metadata withiniris
:_as_defn
interface has been replaced with themetadata
interfaceMetadataManager
class:MetadataManager
classes at runtime, seeiris.common.metadata.MetadataManagerFactory
namedtuple
metadata subclasses, which specifies the behaviour and state associated with the runtimeMetadataManager
:iris.common.metadata.AncillaryVariableMetadata
iris.common.metadata.BaseMetadata
iris.common.metadata.CellMeasureMetadata
iris.common.metadata.CoordMetadata
metadata
interface has been added to:iris.aux_factory.AuxCoordFactory
iris.coords._DimensionalMetadata
iris.coords.AncillaryVariable
iris.coords.CellMeasure
iris.coords.Coord
iris.cube.Cube
CFVariableMixin
methods have been rationalisedmetadata
interfacemetadata
member setter and gettermetadata
may be set either with:iris.common.metadata
iris.common
module, for common, non-coreiris
behaviour:_cube_coord_common
has been rationalised here, seeiris.common.mixin
metadata
has been rationalised here, seeiris.common.metadata
name
method behaviour is now appropriate and specific to its container i.e., theSTASH
attribute used byname
is now only applicable to aniris.cube.Cube
names
property behaviour is now private and specific to aniris.cube.Cube
LimitedAttributeDict
, which has always been missingiris.aux_factory.AuxCoordFactory
:units
are no longer a read-only property, but part of the containersmetadata
climatological
is now a read-only property ofFalse
Still to do:
iris.common.metadata.MetadataFactory
iris.common.metadata._BaseMeta
iris.common.mixin.LimitedAttributeDict