Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

isclose() causes infinite recursion with __array_function__ #2029

Closed
pentschev opened this issue Feb 13, 2019 · 15 comments

Comments

Projects
None yet
4 participants
@pentschev
Copy link
Contributor

commented Feb 13, 2019

Snippet to reproduce and traceback follow.

import cupy
x = cupy.random.random((5000, 1000))
cupy.isclose(x, x)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/nfs/pentschev/src/cupy/cupy/logic/comparison.py", line 100, in isclose
    return _is_close(a, b, rtol, atol, equal_nan)
  File "cupy/core/_kernel.pyx", line 805, in cupy.core._kernel.ufunc.__call__
    "a positional and keyword argument")
  File "cupy/core/_kernel.pyx", line 682, in cupy.core._kernel._guess_routine
    use_raw_value = _check_should_use_min_scalar(in_args)
  File "cupy/core/_kernel.pyx", line 634, in cupy.core._kernel._guess_routine_from_in_types

  File "/home/nfs/pentschev/.local/lib/python3.5/site-packages/numpy/core/overrides.py", line 165, in public_api
    implementation, public_api, relevant_args, args, kwargs)
  File "cupy/core/core.pyx", line 1252, in cupy.core.core.ndarray.__array_function__
    # Conversion:
  File "/home/nfs/pentschev/.local/lib/python3.5/site-packages/numpy/core/overrides.py", line 165, in public_api
    implementation, public_api, relevant_args, args, kwargs)
  File "cupy/core/core.pyx", line 1252, in cupy.core.core.ndarray.__array_function__
    # Conversion:

...

  File "/home/nfs/pentschev/.local/lib/python3.5/site-packages/numpy/core/overrides.py", line 165, in public_api
    implementation, public_api, relevant_args, args, kwargs)
  File "cupy/core/core.pyx", line 1252, in cupy.core.core.ndarray.__array_function__
    # Conversion:
  File "/home/nfs/pentschev/.local/lib/python3.5/site-packages/numpy/core/overrides.py", line 163, in public_api
    relevant_args = dispatcher(*args, **kwargs)
RecursionError: maximum recursion depth exceeded
@pentschev

This comment has been minimized.

Copy link
Contributor Author

commented Feb 13, 2019

@mrocklin FYI

@mrocklin

This comment has been minimized.

Copy link
Contributor

commented Feb 13, 2019

This requires the __array_function__ interface to be active I assume?

cc @shoyer , this may interest you.

@shoyer

This comment has been minimized.

Copy link

commented Feb 13, 2019

@pentschev

This comment has been minimized.

Copy link
Contributor Author

commented Feb 14, 2019

A little more debugging has shown that cupy.core._kernel._guess_routine_from_in_types() calls numpy.can_cast(), that's the function that is resulting in an infinite loop. Here's a simple example and traceback:

import cupy

x = cupy.random.random((5000, 1000))

cupy.can_cast(x, cupy.float16, 'safe')
Traceback (most recent call last):
  File "can_cast.py", line 5, in <module>
    cupy.can_cast(x, cupy.float16, 'safe')
  File "/home/nfs/pentschev/.local/lib/python3.5/site-packages/numpy/core/overrides.py", line 165, in public_api
    implementation, public_api, relevant_args, args, kwargs)
  File "cupy/core/core.pyx", line 1256, in cupy.core.core.ndarray.__array_function__
  File "/home/nfs/pentschev/.local/lib/python3.5/site-packages/numpy/core/overrides.py", line 165, in public_api
    implementation, public_api, relevant_args, args, kwargs)
  File "cupy/core/core.pyx", line 1256, in cupy.core.core.ndarray.__array_function__
...
  File "/home/nfs/pentschev/.local/lib/python3.5/site-packages/numpy/core/overrides.py", line 165, in public_api
    implementation, public_api, relevant_args, args, kwargs)
  File "cupy/core/core.pyx", line 1246, in cupy.core.core.ndarray.__array_function__
RecursionError: maximum recursion depth exceeded while calling a Python object

I guess this is in part caused by CuPy functions that are aliases of NumPy's. I also verified the same issue with numpy.common_type(), but this is not the case for numpy.issctype(). The difference is that numpy.issctype() doesn't use the array function dispatcher, whereas the other two do.

@shoyer

This comment has been minimized.

Copy link

commented Feb 14, 2019

Probably cupy's implementation of np.can_cast should be adjusted to pass a numpy dtype into np.can_cast, rather than passing in a cupy array which ends up in the recursive loop.

@pentschev

This comment has been minimized.

Copy link
Contributor Author

commented Feb 14, 2019

Sorry for not being clearer, but a numpy dtype is indeed being passed, yet the same error occurs. I thought having cupy.float16 or numpy.float16 wouldn't be of much difference.

Calling cupy.isclose(x, x) on the example above, shows us the following on pdb:

> /home/nfs/pentschev/.local/lib/python3.5/site-packages/numpy/core/overrides.py(165)public_api()
-> implementation, public_api, relevant_args, args, kwargs)
(Pdb) p implementation
<built-in function can_cast>
(Pdb) p public_api
<function can_cast at 0x7efc1e57e9d8>
(Pdb) import numpy
(Pdb) p numpy.can_cast
<function can_cast at 0x7efc1e57e9d8>
(Pdb) import cupy
(Pdb) p cupy.can_cast
<function can_cast at 0x7efc1e57e9d8>
(Pdb) p args
(array([[0.62904646, 0.06241646, 0.20936255, ..., 0.18664336, 0.82792624,
        0.73082611],
       [0.23343974, 0.88243105, 0.41403497, ..., 0.22116572, 0.9613832 ,
        0.95903791],
       [0.01802263, 0.38058099, 0.62365991, ..., 0.68608194, 0.70088272,
        0.62946398],
       ...,
       [0.04541006, 0.59005542, 0.78726097, ..., 0.61837188, 0.96839507,
        0.6721169 ],
       [0.32880355, 0.69907203, 0.34856093, ..., 0.59101406, 0.12200008,
        0.28198768],
       [0.96109902, 0.52462348, 0.77232284, ..., 0.69617867, 0.7905448 ,
        0.72705252]]), <class 'numpy.float16'>)

As seen above, all can_cast() instances have the same address, therefore are the exact same, and args shows that the type being passed is numpy.float16, and not cupy.float16, as per my previous example.

@shoyer

This comment has been minimized.

Copy link

commented Feb 14, 2019

OK, so it looks like the trouble is that cupy.can_cast is exactly the same function as numpy.can_cast. The override for numpy.can_cast is defined as numpy.can_cast -- no wonder you get an infinite loop!

@grlee77

This comment has been minimized.

Copy link
Contributor

commented Feb 14, 2019

Yeah, there are several of these that are just reused from numpy (also dtypes, etc elsewhere in that file):

cupy/cupy/__init__.py

Lines 341 to 366 in 155228f

# -----------------------------------------------------------------------------
# Data type routines (borrowed from NumPy)
# -----------------------------------------------------------------------------
from numpy import can_cast # NOQA
from numpy import common_type # NOQA
from numpy import min_scalar_type # NOQA
from numpy import obj2sctype # NOQA
from numpy import promote_types # NOQA
from numpy import result_type # NOQA
from numpy import dtype # NOQA
from numpy import format_parser # NOQA
from numpy import finfo # NOQA
from numpy import iinfo # NOQA
from numpy import MachAr # NOQA
from numpy import find_common_type # NOQA
from numpy import issctype # NOQA
from numpy import issubclass_ # NOQA
from numpy import issubdtype # NOQA
from numpy import issubsctype # NOQA
from numpy import mintypecode # NOQA
from numpy import sctype2char # NOQA
from numpy import typename # NOQA

@shoyer

This comment has been minimized.

Copy link

commented Feb 14, 2019

@pentschev

This comment has been minimized.

Copy link
Contributor Author

commented Feb 15, 2019

Correct me if I'm wrong, but I believe we'll require changes in both NumPy and CuPy. One fix I can think of is to do something like this:

  • In CuPy's __array_function__ (I'm ignoring submodule handling here for simplicity):
if getattr(cupy, func.__name__) is getattr(numpy, func.__name__):
    return None
  • In NumPy's array_function_dispatch.decorator.public_api():
func = implement_array_function(implementation, public_api, relevant_args, args, kwargs)
if func is None:
    return implementation(*args, **kwargs)
return func

If only CuPy was changed to return getattr(numpy, func.__name__)(*args, **kwargs) instead, it would continue the infinite loop by calling NumPy's function again.

@shoyer

This comment has been minimized.

Copy link

commented Feb 15, 2019

Using None as a sentinel value for "fall back to using the NumPy function" looks quite similar to this (deferred) proposal from the NEP:
http://www.numpy.org/neps/nep-0018-array-function-protocol.html#coercion-to-a-numpy-array-as-a-catch-all-fallback

If we want to change this on the NumPy side we would definitely want to put it inside implement_array_function in C, to avoid the overhead in the case where there is no dispatching to be done. It would appear as another case around these lines:
https://github.com/numpy/numpy/blob/806a9453979d0435073a5b343adc16cdc29a9afb/numpy/core/src/multiarray/arrayfunction_override.c#L291-L299

That said, I'm still not sure it's necessary to expand the protocol on the NumPy side (though this is definitely an option). Within CuPy, there could be special case logic for these wrapped functions. I think only four of these that support overrides (can_cast, common_type, min_scalar_type, result_type). Probably the easiest way to do this would be to redefine these functions in cupy/cupy/__init__.py to convert cupy arrays into numpy dtypes, e.g.,

@functools.wraps(np.can_cast)
def can_cast(from_, to, casting='safe'):
    if isinstance(from_, CupyArray):
        from_ = from_.dtype
    return np.can_cast(from_, to, casting)
@pentschev

This comment has been minimized.

Copy link
Contributor Author

commented Feb 15, 2019

I agree redefining those functions is probably the easiest way now. But there are two things that could be more error-prone:

  1. Libraries other than CuPy would need to identify and handle all such cases
  2. Inclusion/modification of NumPy wrapped functions would require changes in all libraries using the array_function interface

For these two reasons, I feel a better solution would be that NumPy provides a more generic interface, without need to handle individual functions on a per-case basis, but I do not oppose other solutions.

@shoyer

This comment has been minimized.

Copy link

commented Feb 15, 2019

@pentschev Could you please raise an issue in the NumPy Github page to discuss possible changes?

@pentschev

This comment has been minimized.

Copy link
Contributor Author

commented Feb 15, 2019

@shoyer Absolutely.

@pentschev

This comment has been minimized.

Copy link
Contributor Author

commented Feb 27, 2019

This issue has been resolved by #2024, but there's a much broader discussion still going on in numpy/numpy#12974.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.