Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Units: Quantity always thinks it is iterable #878

Closed
wants to merge 3 commits into from

Conversation

embray
Copy link
Member

@embray embray commented Apr 18, 2013

Don't have time to look at this today, but probably can tackle over the weekend.

>>> astropy.utils.isiterable([15.,17.]*u.yr)
True
>>> astropy.utils.isiterable(15.*u.yr)
True

@adrn
Copy link
Member Author

adrn commented Mar 27, 2013

It looks like the root of the problem here is that in isiterable(), it has this block:

try:
    iter(obj)
    return True
except TypeError:
    return False

But because Quantity has __getitem__() defined, it can always return an iterator.

How about changing it to:

try:
    iter(obj).next()
    return True
except TypeError:
    return False

I have a branch ready to go with this in a PR if it seems ok...tests still pass.

@mdboom
Copy link
Contributor

mdboom commented Mar 28, 2013

Seems reasonable.

@taldcroft
Copy link
Member

Is there any case where iterating over an iterable (via next()) might have side effects? It's easy enough to contrive such an example, but I don't know if there are realistic cases.

@adrn
Copy link
Member Author

adrn commented Mar 28, 2013

@taldcroft Hm, I guess it could be an issue for file-like objects?

@taldcroft
Copy link
Member

For file-like objects the test of isinstance(obj, collections.Iterable) passes so the iter(obj).next() never gets hit.

Just getting nitpicky, an iterable object can have zero length (like []), in which case the next() call will produce a StopIterationError:

>>> iter([]).next()
ERROR: StopIteration [IPython.core.interactiveshell]

So to me the next() addition seems like not the best strategy here, though I don't have a better alternative...

@adrn
Copy link
Member Author

adrn commented Mar 28, 2013

Good point, but that too would get caught by isinstance(obj, collections.Iterable), right?

adrn added a commit to adrn/astropy that referenced this pull request Apr 1, 2013
@alexrudy
Copy link
Contributor

This issue probably runs a bit deeper than astropy's isiterable() test.

I've hit it in my own code (in which I often use things like collections.Sized and collections.Iterable from python's collections module).

I have the vague idea of hiding the irrelevant attribute methods behind __getattr__ and __hasattr__ functions which would only expose functions like __len__ and other collection methods when that is appropriate (based on .isscalar). Probably smarter would be to separate Quantity in the scalar and non-scalar case, although that drags in the whole question from #929. If Quantity is going to be useful data type for me, then duck typing and ABC typing need to work without manually checking for .isscalar.

By the way, collections.Iterable checks for the presence of an __iter__ method and collections.Iterator checks for __next__, both of which duck-type check correctly.

@embray
Copy link
Member

embray commented Apr 12, 2013

I think that if we are successful in making Quantity an ndarray subtype then that should resolve this issue. isiterable seems to work for "array scalars":

In [9]: isiterable(np.array(1))
Out[9]: False

@alexrudy
Copy link
Contributor

@iguananaut I suspected this might well be part of the solution, however, it is still a little unsatisfying. I suspect that the behavior of "array scalars" are well defined upstream. Although they pass the astropy isiterable test, ArrayScalars don't play nice with python duck-typing in other situations. Really, if we all wrote our python "correctly", we wouldn't need the special isiterable method, and we'd be able to tell just by checking for the presence of the __iter__ method, like built-in python does in collections etc.

Its unfortunate that Numpy breaks this with scalar arrays, and an argument which suggests we might not want to move entirely to scalar arrays for Quantity. (Although there are plenty of great arguments in the other direction)

Do you know if the Numpy crowd has ever considered fixing this?

@embray
Copy link
Member

embray commented Apr 12, 2013

I agree that Numpy scalar arrays are broken in this sense. I think I'm just okay with it because everyone who's used Numpy has just learned that it's something we have to accept. I'd be all for finding a way to fix this from the Numpy end though.

@eteq
Copy link
Member

eteq commented Apr 18, 2013

Yeah, the borken Numpy scalar arrays were the reason for our own isiterable in the first place.

Two alternatives for @adrn's approach:

  1. implement an __iter__ method in Quantity that checks if self.isscalar, and if so issues a TypeError, and if not, does whatever the current behavior is. I think that will fix this
  2. Just short-circuit the problem by adding this at the top if isiterable:
if isinstance(obj, Qunatity):
    return not obj.isscalar

@embray
Copy link
Member

embray commented Apr 18, 2013

I like alternative 1️⃣ I'll try it out and attach a PR if it works.

@embray
Copy link
Member

embray commented Apr 18, 2013

Well, turns out that doesn't work because as soon as you define __iter__ the object becomes an instance of collections.Iterable. That said, defining __iter__ is still a good idea so that we can raise a TypeError as soon as one tries iter(q) on a scalar quantity. So I will use a combination of 1️⃣ and 2️⃣.

@embray
Copy link
Member

embray commented Apr 18, 2013

Nope: That doesn't quite do it either. Putting 'import Quantity' anywhere in utils.misc (even inside the function call) leads to a nasty import loop. This isn't too surprising actually, and in general stuff in utils shouldn't depend on anything else in Astropy--it should be a one way street. So finally I just removed the isinstance(obj, collections.Iterable) and let it fall back on checking if iter(obj) works, which can be slower but generally won't be.

@embray
Copy link
Member

embray commented Apr 18, 2013

Compromise: I changed it to if not isinstance(obj, collections.Iterable). That will work fine for any negatives. But then it checks iter(obj) to catch any false positives.

@eteq
Copy link
Member

eteq commented Apr 21, 2013

whoops, looks like this needs a rebase. Other than that, looks good to me!

embray added a commit to embray/astropy that referenced this pull request Apr 22, 2013
…ator

which is nice, and a little preferable over relying on ``__getitem__`.
Also fixed ``isiterable()`` to better catch false positives in cases
like Numpy arrays and Quantities, where they implement ``__iter__``, but
it can raise a ``TypeError`` in certain cases.
@embray
Copy link
Member

embray commented Apr 22, 2013

Rebased.

@@ -10,14 +10,14 @@
import pytest
import numpy as np

from ...tests.helper import raises
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it appears this line is necessary - raises is used below.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I noticed that too, just forgot to delete it.

@eteq
Copy link
Member

eteq commented Apr 22, 2013

Looks like it needs another rebase (changes.rst?) or manual merge, but more importantly, the tests are failing. Looks like this changed some behavior that table and parts of fits depend on - any idea what's going on here, @iguananaut ?

@embray
Copy link
Member

embray commented Apr 23, 2013

All those failures are surprising...I'd think I'd have run the tests before pushing but maybe I didn't?

@embray
Copy link
Member

embray commented Apr 23, 2013

The problem here is a bit tricky. ABCs can implement a magic classmethod call __subclasshook__ that it uses to modify the behavior of issubclass (and by extension isinstance). For example collections.Iterable defines a __subclasshook__ which returns True on any class that implements an __iter__ method. That way anything with an __iter__ is automatically a subclass of collections.Iterable even if it doesn't explicitly inherit from this.

According to http://docs.python.org/2/library/collections.html, a class that implements the __getitem__ and __len__ methods can be used as a collections.Sequence implementation (which is also automatically a subclass of Iterable). I assumed (wrongly) that Sequence would also define a __subclasshook__ for this but apparently it does not. This fact is not at all documented, and all I could find were some mailinglist discussions about how it's too hard to determine by introspection that a class with __getitem__ and __len__ will truly behave correctly as a sequence. This fact is the source of the test failures.

I still think this is a bug in Python--both in the documentation and the behavior, because even if a class is not properly identifiable as a sequence, calling iter() on it will still succeed at least up to a point. It's really inconsistent on Python's part that you can call iter() on something that the ABC system can't recognized as Iterable. So until this is fixed I'm removing the check for collections.Iterable completely until it can be relied on to behave more consistently.

I also went ahead and registered a few classes as implementing existing collection types where appropriate.

@embray
Copy link
Member

embray commented Apr 23, 2013

Indicentally, registering the mentioned classes as Sequences fixed all the tests without having to make any changes to isiterable. But I'm still changing isiterable anyways since the behavior of the Sequence ABC is too unreliable here.

…ator

which is nice, and a little preferable over relying on ``__getitem__`.
Also fixed ``isiterable()`` to better catch false positives in cases
like Numpy arrays and Quantities, where they implement ``__iter__``, but
it can raise a ``TypeError`` in certain cases.
@eteq
Copy link
Member

eteq commented Apr 26, 2013

Wow that's surprisingly subtle. I wonder if you should report it to the python core?

Anyway, it's working as expected for us now, so I'll go ahead and merge (I'll just manually fix up CHANGES.rst because that's all that's conflicting).

@embray embray closed this in 3b4223a Apr 26, 2013
eteq added a commit that referenced this pull request Apr 26, 2013
@eteq
Copy link
Member

eteq commented Apr 26, 2013

Weird that this is "closed" instead of "merged". Ah well, nevertheless, it's in.

@embray
Copy link
Member

embray commented May 14, 2013

And as such this got missed by my backport script, so good thing I double-checked. Usually I've seen GitHub be pretty good about attaching a manual merge to a pull request but as you said that didn't happen this time :/

eteq added a commit that referenced this pull request May 14, 2013
Conflicts:
	CHANGES.rst

Conflicts:

	astropy/table/table.py
	astropy/units/tests/test_quantity.py
@embray embray deleted the issue-878 branch September 30, 2014 15:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants