deep_type throws error on list of numpy ints #27

kyao · 2018-01-09T00:33:05Z

The following code causes deep_type to throw a ValueError:

import pytypes
import numpy as np
pytypes.deep_type([np.int64(1), np.int64(1)])

The text was updated successfully, but these errors were encountered:

Stewori · 2018-01-09T02:51:54Z

Are you using the latest release or the repository version?
I think this was fixed with 15ec80c, but there was not yet a release after that. A repeated element in a list caused a false-positive recursion detection.

kyao · 2018-01-09T04:39:51Z

I am using the latest from the repository. The error occurs even if there is no repeated element.

>>> pytypes.deep_type([np.int64(1), np.int64(2)])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/ktyao/miniconda3/envs/d3m/lib/python3.6/site-packages/pytypes/type_util.py", line 282, in deep_type
    return _deep_type(obj, [], 0, depth, max_sample)
  File "/home/ktyao/miniconda3/envs/d3m/lib/python3.6/site-packages/pytypes/type_util.py", line 317, in _deep_type
    tpl = tuple(_deep_type(t, checked, checked_len2, depth-1) for t in obj)
  File "/home/ktyao/miniconda3/envs/d3m/lib/python3.6/site-packages/pytypes/type_util.py", line 317, in <genexpr>
    tpl = tuple(_deep_type(t, checked, checked_len2, depth-1) for t in obj)
  File "/home/ktyao/miniconda3/envs/d3m/lib/python3.6/site-packages/pytypes/type_util.py", line 301, in _deep_type
    if depth == 0 or obj in checked[:checked_len]:
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

Stewori · 2018-01-09T19:27:15Z

Okay, confirmed. However this boils down to strange behavior with np.int64 and list:

one64 = np.int64(1)
a = [one64, one64]
one64 in a # True
one64 is a # False
one64 in [a] # ValueError

yields the same error, completely independent of pytypes. Is this intended/sane behaviour? I suppose it should maybe filed as a numpy bug. Do know if this is a known issue?

We can add a workaround like this:

def _obj_in_list(obj, lst):
	"""Simply using ``in`` is sometimes not reliable, e.g. for ``numpy.int64``,
	see https://github.com/Stewori/pytypes/issues/27.
	"""
	try:
		return obj in lst
	except ValueError:
		# As a fallback we use the index method
		try:
			lst.index(obj)
			return True
		except ValueError:
			return False

I'm a bit concerned to become bug-compatible with numpy. However in favor of workability and robustness I would add this workaround. Opinions?

kyao · 2018-01-10T00:46:17Z

It is the intended behavior, however I am not sure if it is sane or not. With numpy you can compare a number against a list.

one64 == [one64,  one64]  # Returns an array of True, instead of False

Having a workaround for robustness is good.

With numpy the index method behaves the like in. So, it also fails.

[one64, [a]].index(one64)  # Returns 0
[[a], one64].index(one64)  # ValueError
_obj_in_list(one64, [one64, [a]]) # Returns True
_obj_in_list(one64, [[a], one64]) # Returns False

Perhaps, something like this:

def _obj_in_list2(obj, lst):
    """Simply using ``in`` is sometimes not reliable, e.g. for ``numpy.int64``,
    see https://github.com/Stewori/pytypes/issues/27.
    """
    try:
        return obj in lst
    except ValueError:
        # As a fallback we check each element individually
        for elt in lst:
            result = elt == obj
            if not hasattr(result, '__iter__') and result:
                return True
        return False

Stewori · 2018-01-10T01:08:22Z

one64 == [one64, one64] # Returns an array of True, instead of False

Then maybe an actual numpy-specific workaround would be best:

def _obj_in_list3(obj, lst):
    """Simply using ``in`` is sometimes not reliable, e.g. for ``numpy.int64``,
    see https://github.com/Stewori/pytypes/issues/27.
    """
    try:
        return obj in lst
    except ValueError:
        # NumPy-specific fallback
        return any(obj == lst) # for non-NumPy behavior raises TypeError: 'bool' object is not iterable

I'm not sure if we should also catch the TypeError that occurs if neither in nor NumPy behavior applies. We could add your suggestion as a third fallback or just deal with further issues as they occur.

kyao · 2018-01-10T01:18:01Z

This third version _obj_in_list3 looks good to me.

mitar · 2018-01-13T21:15:02Z

There is another similar issue with numpy and type checking with this package.

import numpy
from pytypes import type_util

type_util._isinstance(numpy.array([1,2,3]), numpy.ndarray)

  File ".../pytypes/type_util.py", line 1489, in _isinstance
    if obj == {}:
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

Why is there obj == {} check at all? I mean, why a specific check for an empty dict?

Stewori · 2018-01-14T01:43:08Z

Empty lists, sets, dicts and other containers are specially problematic for deep_type, see python/typing#157. And since pytypes' main way of typechecking is combining deep_type and is_subtype, this can cause trouble. Given that there is not satisfying answer in python/typing#157 we model this with a pytypes-specific Empty[T] generic.
I think the if obj == {}: is a legacy hack that I forgot to remove. Maybe it was supposed to speed up _isinstance for empty dicts, but I don't see the usecase any more / cannot remember it (yes, code comments are great). It looks like we can safely remove this block, at least it causes no test failures.

That said, I did not test with numpy types so far. I didn't expect them to be specifically problematic.
Does it work if you remove the if obj == {}: block?

Stewori · 2018-01-14T01:46:25Z

Also: Sorry that I currently have no time to work much on these issues. I'm rather bound by another project. A soon as I can I will wrap things up for a new release, including this issue.

mitar · 2018-01-14T02:19:43Z

I opened: numpy/numpy#10399 and numpy/numpy#10400

I think that the crazy conversion of int64s to numpy array is really a bug in numpy. But probably it is something magical they like and will probably not change. So we should think about how to support that.

mitar · 2018-01-14T02:21:33Z

That said, I did not test with numpy types so far. I didn't expect them to be specifically problematic.
Does it work if you remove the if obj == {}: block?

Yes, this fixes the issue for me.

mitar · 2018-01-14T02:28:49Z

The issue @kyao reported above, is in fact coming from type checking as well:

import numpy
import typing
from pytypes import type_util

type_util._isinstance({'mapping': [numpy.int64(1), numpy.int64(1)]}, typing.Dict)

ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

It seems _isinstance first tries to construct a type and it fails in that. But it is also interesting that typing.Dict does not really require such complex type to be constructed. But of course code cannot know that in advance in a general case.

mitar · 2018-01-14T06:23:16Z

One idea. In obj in checked[:checked_len], do we really want equality defined by == or do we want equality defined by is? I think we in fact want the latter, to detect loops. So fix is simple, we should just iterate and instead of == use is.

mitar · 2018-01-14T06:29:08Z

I made #29 which fixes all of the issues mentioned here.

Stewori · 2018-01-14T18:14:54Z

Hey, thanks a lot for the commits. I am currently focused on fixing #22 and somehow want to keep my queue clean, that's why I do not merge it right away. But I will accept it after committing a fix for #22.

Fixes #27 to support numpy values

Stewori added the fixed? label Jan 9, 2018

Stewori added bug and removed fixed? labels Jan 9, 2018

mitar mentioned this issue Jan 14, 2018

Fixes to support numpy values #29

Merged

Stewori closed this as completed in #29 Jan 16, 2018

Stewori added a commit that referenced this issue Jan 16, 2018

Merge pull request #29 from mitar/numpy-fixes

2f1b6d2

Fixes #27 to support numpy values

mitar mentioned this issue May 20, 2020

Equality of numpy scalars and lists fails numpy/numpy#10399

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

deep_type throws error on list of numpy ints #27

deep_type throws error on list of numpy ints #27

kyao commented Jan 9, 2018

Stewori commented Jan 9, 2018

kyao commented Jan 9, 2018

Stewori commented Jan 9, 2018

kyao commented Jan 10, 2018

Stewori commented Jan 10, 2018

kyao commented Jan 10, 2018

mitar commented Jan 13, 2018 •

edited

Loading

Stewori commented Jan 14, 2018

Stewori commented Jan 14, 2018

mitar commented Jan 14, 2018 •

edited

Loading

mitar commented Jan 14, 2018

mitar commented Jan 14, 2018

mitar commented Jan 14, 2018 •

edited

Loading

mitar commented Jan 14, 2018

Stewori commented Jan 14, 2018

deep_type throws error on list of numpy ints #27

deep_type throws error on list of numpy ints #27

Comments

kyao commented Jan 9, 2018

Stewori commented Jan 9, 2018

kyao commented Jan 9, 2018

Stewori commented Jan 9, 2018

kyao commented Jan 10, 2018

Stewori commented Jan 10, 2018

kyao commented Jan 10, 2018

mitar commented Jan 13, 2018 • edited Loading

Stewori commented Jan 14, 2018

Stewori commented Jan 14, 2018

mitar commented Jan 14, 2018 • edited Loading

mitar commented Jan 14, 2018

mitar commented Jan 14, 2018

mitar commented Jan 14, 2018 • edited Loading

mitar commented Jan 14, 2018

Stewori commented Jan 14, 2018

mitar commented Jan 13, 2018 •

edited

Loading

mitar commented Jan 14, 2018 •

edited

Loading

mitar commented Jan 14, 2018 •

edited

Loading