Ensure cubelists only contain cubes #3238
I scanned the methods, and if we want to be strict about this I think we also have to consider the special methods Read on ... |
... ASIDE : In the process, I have found that the existing code in the The reason you can say, e.g.
That is wrong, because the So, the In fact, I don't think we need a 'new' at all -- an 'init' will work just fine for this.
But I also found that making this work as (apparently) intended does break some other tests. |
... MEANWHILE ... |
😮 It’s never as simple as it first appears is it? 😆 Looks like Apart from the fact that the check in |
!never! @rcomer 😭
I guess it's a question of intent + interpretation.
So I think in this case the strict interpretation would be that TypeError is only appropriate if the arg is not iterable, whereas bad contents should cause a ValueError?
So I think in this case the strict interpretation would be that TypeError is only appropriate if the arg is not iterable, whereas bad contents should cause a ValueError
OK thanks. So in that case I think my last TypeError in extend should be a ValueError, but all the others are fine.
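The convention agreed here could be sketched roughly as follows. This is a minimal illustration rather than the PR's code: `Cube` is a hypothetical stand-in class and `check_cubes` is an invented helper name.

```python
class Cube:
    """Hypothetical stand-in for iris.cube.Cube, for illustration only."""


def check_cubes(items):
    """Return the items as a list, or raise following the convention above."""
    try:
        iterator = iter(items)
    except TypeError:
        # The argument itself is the wrong kind of thing: not iterable.
        raise TypeError(
            "expected an iterable of cubes, got {}".format(type(items).__name__)
        )
    checked = list(iterator)
    bad = [type(obj).__name__ for obj in checked if not isinstance(obj, Cube)]
    if bad:
        # The argument is iterable, but its contents are unacceptable.
        raise ValueError("non-cube items found: {}".format(", ".join(bad)))
    return checked
```

With this sketch, `check_cubes(5)` raises a TypeError, while `check_cubes([Cube(), None])` raises a ValueError.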
(branch force-pushed from 4e6b744 to 0ff3183)
I now have a bunch of methods that follow very similar patterns, and repeated code makes me feel slightly queasy. Is there a more elegant way to do this?
This TODO has been there since the initial commit to GitHub. I'm not clear what "overload" means in this context. Does it mean what we are currently doing? If so, I can make a note to take it out.
I've added tests for setting more than one item in a slice, e.g. |
@@ -171,7 +172,7 @@ def setUp(self):
         self.cubelist2 = iris.cube.CubeList([self.cube2])

     def test_pass(self):
-        cubelist = self.cubelist1.copy()
+        cubelist = copy.copy(self.cubelist1)
         cubelist += self.cubelist2
Changed copy method to copy.copy function for Python2 compatibility, but now cubelist becomes None after the add. 🤷‍♀️
Travis output showing the above:
Python2.7
FAIL: test_pass (iris.tests.unit.cube.test_CubeList.Test_iadd)
----------------------------------------------------------------------
Traceback (most recent call last):
File "/home/travis/miniconda/envs/test-environment/lib/python2.7/site-packages/scitools_iris-2.3.0.dev0-py2.7.egg/iris/tests/unit/cube/test_CubeList.py", line 177, in test_pass
self.assertEqual(cubelist, self.cubelist1 + self.cubelist2)
AssertionError: None != [<iris 'Cube' of foo / (unknown) (scalar cube)>,
<iris 'Cube' of bar / (unknown) (scalar cube)>]
Python3.6
FAIL: test_pass (iris.tests.unit.cube.test_CubeList.Test_iadd)
----------------------------------------------------------------------
Traceback (most recent call last):
File "/home/travis/miniconda/envs/test-environment/lib/python3.6/site-packages/scitools_iris-2.3.0.dev0-py3.6.egg/iris/tests/unit/cube/test_CubeList.py", line 177, in test_pass
self.assertEqual(cubelist, self.cubelist1 + self.cubelist2)
AssertionError: None != [<iris 'Cube' of foo / (unknown) (scalar [50 chars]be)>]
This test passes if I point it at the master branch. The equivalent test for extend passes on this branch. So this appearance of None is specific to:
- Use of copy.copy function (or copying via cubelist = self.cubelist1[:]) rather than copy method.
- Using __iadd__ with the changes I've made.
I'm officially confused.
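One general Python behaviour that can produce exactly this symptom, though it is not confirmed in this thread as the cause: `x += y` rebinds `x` to whatever `__iadd__` returns, so an override that forgets to `return self` leaves the name bound to None. On Python 3, `list.copy()` called on a list subclass returns a plain list, which bypasses any overridden `__iadd__`, whereas `copy.copy` keeps the subclass. The class names below are invented for the sketch.

```python
import copy


class ForgetfulList(list):
    def __iadd__(self, other):
        self.extend(other)  # mutates in place, but implicitly returns None


class CarefulList(list):
    def __iadd__(self, other):
        self.extend(other)
        return self  # the name on the left of += is rebound to this


original = ForgetfulList([1])

via_function = copy.copy(original)  # still a ForgetfulList
via_function += [2]                 # name is rebound to None

via_method = original.copy()        # Python 3: a plain list
via_method += [2]                   # uses list.__iadd__, so it works

careful = CarefulList([1])
careful += [2]                      # still a CarefulList holding [1, 2]
```

If something like this is happening, it would also explain why the equivalent `extend` test passes: `extend` is called for its side effect and its return value is never assigned.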
I think I've fixed the |
Addressed |
(branch force-pushed from cac1f49 to 4187162)
Similar question in general regarding ensuring types within a list: https://stackoverflow.com/questions/12201811/subclass-python-list-to-validate-new-items/12203829#12203829 Personally, I'd implement this as a warning rather than an error - we don't want to completely prevent duck typed Cubes going in (if it acts like a duck, and quacks like a duck, then treat it as a duck). |
Thanks @pelson, yes I'd seen that or similar advice to inherit from The warning seems sensible to me, as it would still provide some information to help debugging when you try to print/extract/save your cubelist. |
I've changed the exceptions to warnings following @pelson's suggestion. I also rationalised the message, which allowed me to make everything a lot cleaner! I have modified some of the Still having problems with a copied cubelist becoming None when I try to |
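A minimal sketch of the warning-based approach, assuming Python 3. `Cube` is a hypothetical stand-in for the real class and `WarningList` is an invented name; the PR's actual implementation may differ.

```python
import warnings


class Cube:
    """Hypothetical stand-in for iris.cube.Cube."""


class WarningList(list):
    """Illustrative list subclass: warns about non-Cube items but keeps them."""

    def append(self, obj):
        if not isinstance(obj, Cube):
            warnings.warn(
                "object of type {} is not a Cube".format(type(obj).__name__),
                UserWarning,
            )
        super().append(obj)


cubes = WarningList()
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    cubes.append(Cube())  # no warning
    cubes.append(None)    # warning issued, but the item is still stored
```

The trade-off is visible here: duck-typed objects are still accepted, but so is the stray None, which will only cause a failure later on.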
lib/iris/cube.py
@@ -216,7 +216,30 @@ def __repr__(self):
         """Runs repr on every cube."""
         return '[%s]' % ',\n'.join([repr(cube) for cube in self])

+    def _check_iscube(self, obj):
Style question: Is this the right place for these checking functions, or should they be defined outside the class?
My take : these don't use instance properties, so they could be static methods.
... They don't use any class properties either, so they don't really need to be in the class at all.
In fact, they don't use private properties of Cube, so they don't really need to be in the module.
At that point (they are just functions), they could go somewhere else.
However, for personal preference, I'd remove them from the class but keep them as private methods in cube.py, just in case they might need to use 'private' cube concepts in future.
Also ... the use of isinstance prevents any duck typing (lookalike objects can't masquerade as Cubes), which is arguably un-Pythonic.
We have previously used hasattr(obj, 'add_aux_coord') for this elsewhere.
OK, so if we check whether the object quacks with an auxcoord instead of checking the type, is there any reason not to revert to raising an exception rather than a warning?
Oh dear, I had somehow skimmed over that latest discussion without taking it in.
Now I see that @pelson and I are just advocating different approaches, and I honestly don't know how to choose between.
Personally though, I must say I do hate all the warnings in Iris. There are still far too many, most occurrences are a pointless nuisance, and on the rare occasions when they aren't no-one is listening any more.
No problem: I'm partly using this as a learning exercise, so exploring different solutions to the same issue is fine 👍
I think I prefer to have an exception on the grounds that, if my code is going to fail, it's better for it to fail sooner rather than later. Also, having the failure at the point that the object is included into the cubelist means that the traceback is going to point me a lot closer to where I made the mistake. Which so far has always been
cubelist.append(some_function_i_forgot_to_put_a_return_statement_in())
we should get our ducks in a row 🦆 🦆 🦆 before sending @rcomer on a wild goose chase
nicely done! 😆
As a user, I'd still rather have an exception if possible, for the reasons I gave above.
If something does go wrong with my cubelist, the first thing I'm going to do in an attempt to debug is print it. So if we're looking for a minimal set of cube-like attributes, summary ought to be up there.
Are we thinking about this the wrong way round? Rather than trying to define which types should be allowed in a cubelist, it might be easier to focus on which types should definitely be rejected.
This started because a rogue None in a cubelist caused problems, and I wanted a more informative error message. So far I’m not aware of any other types that have caused issues. So my case would be solved by simply throwing an exception if object is None. We could generalise that a bit if we decide that, at minimum, the cubelist should be printable, so reject any objects that don’t have a summary attribute.
The exception message needn’t say anything about how similar to a Cube the object is, but could just say “object of type [whichever] does not belong in a cubelist”.
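A rough sketch of this reject-only idea, including the forgotten-return mistake that started the discussion. All names here (`Thing`, `make_thing`, `reject_unprintable`) are invented, and the choice of ValueError is an assumption based on the earlier TypeError/ValueError exchange.

```python
class Thing:
    """Hypothetical printable object with a cube-style summary method."""

    def summary(self):
        return "a thing"


def make_thing():
    thing = Thing()  # bug: forgot "return thing", so the function returns None


def reject_unprintable(obj):
    """Return obj unchanged, or raise if it could not be printed in a cubelist."""
    if obj is None or not hasattr(obj, "summary"):
        raise ValueError(
            "object of type {} does not belong in a cubelist".format(
                type(obj).__name__
            )
        )
    return obj


accepted = [reject_unprintable(Thing())]
try:
    accepted.append(reject_unprintable(make_thing()))
    message = None
except ValueError as err:
    message = str(err)
# message is "object of type NoneType does not belong in a cubelist"
```

The failure now happens on the line that tries to add the None, rather than at some later print, extract or save.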
They don't have to be cube-like in all respects.
Are we thinking about this the wrong way round?
I don't think so in this case. Because users can create and use their own CubeList instances, the minimum set of behaviour required for an entry into a CubeList is precisely the Cube's behaviour (and no less).
in fact I was strongly opposed to creating another warning here
Useful to know, thank you. So my biggest concern is that we are essentially introducing a breaking change if we do this as an exception - if @rcomer has been adding None into a cube list by accident, just think of all the wild things that some of our less educated users have been doing! 😭
I guess there is a workaround though... if users really want to do this they can still do list.append(cube_list_instance, thing_that_isnt_a_cube) until they sort their 💩 out.
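The workaround relies on ordinary Python method lookup: calling `list.append` directly on the instance skips any `append` override in the subclass. A small sketch, with an invented `StrictList` class standing in for a checking CubeList:

```python
class StrictList(list):
    """Illustrative list subclass whose append rejects None."""

    def append(self, obj):
        if obj is None:
            raise ValueError("None does not belong in this list")
        super().append(obj)


strict = StrictList()

try:
    strict.append(None)  # goes through the override and is rejected
    rejected = False
except ValueError:
    rejected = True

list.append(strict, None)  # calls the base-class method, bypassing the check
```

After this runs, `strict` contains a single None even though the normal `append` refused it.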
In an attempt to get consensus and prevent this conversation from being open-ended, my refined suggestion:
- CubeList._assert_is_cube - raise a ValueError if not isinstance of cube.
- CubeList._assert_is_iterable_of_cubes -> just construct a CubeList of the subset - that way you can honour iterators, and then add the constructed CubeList as necessary.
- Update the existing call in CubeList.__new__ to use _assert_is_cube.
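The suggestion above might look something like the following sketch. It is an interpretation, not the PR's code: `Cube` is a hypothetical stand-in, `StrictCubeList` is an invented name, and the check is wired into `__init__` here purely for simplicity.

```python
class Cube:
    """Hypothetical stand-in for iris.cube.Cube."""


class StrictCubeList(list):
    """Illustrative only; not the real iris.cube.CubeList."""

    def __init__(self, iterable=()):
        super().__init__()
        for obj in iterable:  # also works for one-shot iterators
            self.append(obj)

    @staticmethod
    def _assert_is_cube(obj):
        if not isinstance(obj, Cube):
            raise ValueError(
                "object of type {} is not a Cube".format(type(obj).__name__)
            )

    def append(self, obj):
        self._assert_is_cube(obj)
        super().append(obj)

    def extend(self, iterable):
        # Build a fully validated list first, then add it in one step,
        # so a bad item leaves the existing contents untouched.
        super().extend(StrictCubeList(iterable))

    def __iadd__(self, other):
        self.extend(other)
        return self


cubes = StrictCubeList([Cube()])
cubes.extend(Cube() for _ in range(2))  # a generator is consumed exactly once
cubes += [Cube()]
```

Because every entry point funnels through `_assert_is_cube`, there is a single place where the acceptance rule is defined.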
So my biggest concern is that we are essentially introducing a breaking change if we do this as an exception... just think of all the wild things that some of our less educated users have been doing!
I hadn't considered the possibility of cubelists being used to store random types 😮 . Semi-serious question: how far away is Iris 3?
refined suggestion
Just to check I've understood: we make it strict so only Cube instances are allowed. Because the check is restricted to one method, someone who wants to include ducks in the cubelist just needs to replace that one method?
Points 1 and 2 sound good to me.
Point 3: I think I need to wait for #3264 to be merged, and then update __init__!
@pelson my biggest concern is that we are essentially introducing a breaking change if we do this as an exception - if @rcomer has been adding None into a cube list by accident, just think of all the wild things that some of our less educated users have been doing! sob
Not sure if it helps, but..
In my mind, even if it was previously possible to put non-cubes into a CubeList, that was never intended behaviour -- evidence the code covered by #3264.
So that is a bug, and fixing a bug is not a "breaking" change.
Weaselly, but we've accepted that principle before.
Still to do:
|
…in-only-cubes * 'master' of github.com:SciTools/iris: Test with newest cf-units: no longer called cf_units. (SciTools#3265)
Hi @pp-mo, do you think it's worth persevering with this? If yes, given the discussion above about it possibly being a breaking change, should we target it for Iris3.0? I'm happy to carry on if you think it's worth it, but the issue has only come up for me twice (3 years apart) so I wouldn't say it's top priority. |
I’m now leaning towards the conclusion that this isn’t really worth it: the amount of new code here seems disproportionate to the size of the problem I was originally trying to solve, particularly given the stated desire to have less in this module/class. And then there’s the ducks 🦆 🦆🦆😳 Should this PR be put out of its misery? |
@rcomer I've been racking my brains for a single killer answer to this one ! It probably is the case that, without this, certain awkward + confusing bugs are easy to come by. But as you say, it's quite a lot of fuss to completely resolve it. TBH I've always been a wee bit ambivalent about CubeList myself, as it doesn't seem to serve much purpose except as a home for merge+concatenate, and a specialist printing format -- and only the print format really needs a class object. |
Thanks @pp-mo for giving this some thought. Some operations that fail were listed further down the issue (comment and comment) . Specifically: [cube] = cubelist.extract(name) gives
If you try to print a cubelist with None in it, you get
If you try to save a cubelist with a None to NetCDF you get
|
@bjlittle basically we need a decision whether it’s actually worth all the new code (see our last 3 comments). I did not realise how complicated it would get when I started! 😬 If it is worth it, there is also a question of how you could allow cube-like objects (duck types) to exist in the CubeList. The current implementation uses |
Hi @rcomer @bjlittle .
Sounds to me like this person has a CubeList with another CubeList inside it ! |
Right. I think my enthusiasm for this one has waned somewhat. Due to staff changes in my team I don’t currently have the spare capacity I did so, even if we knew where we wanted to go with it, I’d struggle to justify the time any time soon. So I think it’s time to call it a day. Thanks @pp-mo for all your advice. I learned a lot, so you can still chalk it up against your teaching objectives 👍 |
See #1897.
For completeness, I have addressed insert and extend as well as append (have I missed any?), although insert … extend, particularly in the case where a cube is passed. Would the distinction between append and extend seem a bit arbitrary to a new Python user?