Inserting global njit, numpy function into an njit function fails #4602

Closed
2 tasks done
mroeschke opened this issue Sep 22, 2019 · 10 comments
Labels: no action required (No action was needed to resolve.)

Comments

@mroeschke

Reporting a bug

In [6]: from numba import njit, __version__

In [7]: __version__
Out[7]: '0.45.1'

In [8]: numba_func = njit(np.sum)

In [9]: @njit
   ...: def run_func(values):
   ...:     return numba_func(values)
   ...:

In [10]: run_func(np.arange(10))
---------------------------------------------------------------------------
TypingError                               Traceback (most recent call last)
<ipython-input-10-867344b87300> in <module>
----> 1 run_func(np.arange(10))

/numba/dispatcher.py in _compile_for_args(self, *args, **kws)
    374                 e.patch_message(msg)
    375
--> 376             error_rewrite(e, 'typing')
    377         except errors.UnsupportedError as e:
    378             # Something unsupported is present in the user code, add help info

/numba/dispatcher.py in error_rewrite(e, issue_type)
    341                 raise e
    342             else:
--> 343                 reraise(type(e), e, None)
    344
    345         argtypes = []

/numba/six.py in reraise(tp, value, tb)
    656             value = tp()
    657         if value.__traceback__ is not tb:
--> 658             raise value.with_traceback(tb)
    659         raise value
    660

TypingError: Failed in nopython mode pipeline (step: nopython frontend)
Failed in nopython mode pipeline (step: nopython frontend)
Untyped global name 'isinstance': cannot determine Numba type of <class 'builtin_function_or_method'>

File "../../../../anaconda3/envs/pandas-2s-dev/lib/python3.7/site-packages/numpy/core/fromnumeric.py", line 2062:
def sum(a, axis=None, dtype=None, out=None, keepdims=np._NoValue, initial=np._NoValue):
    <source elided>
    """
    if isinstance(a, _gentype):
    ^

[1] During: resolving callee type: type(CPUDispatcher(<function sum at 0x10c79c830>))
[2] During: typing of call at <ipython-input-9-ad952396009d> (3)

[3] During: resolving callee type: type(CPUDispatcher(<function sum at 0x10c79c830>))
[4] During: typing of call at <ipython-input-9-ad952396009d> (3)


File "<ipython-input-9-ad952396009d>", line 3:
def run_func(values):
    return numba_func(values)
    ^

This is not usually a problem with Numba itself but instead often caused by
the use of unsupported features or an issue in resolving types.

To see Python/NumPy features supported by the latest release of Numba visit:
http://numba.pydata.org/numba-doc/latest/reference/pysupported.html
and
http://numba.pydata.org/numba-doc/latest/reference/numpysupported.html

For more information about typing errors and how to debug them visit:
http://numba.pydata.org/numba-doc/latest/user/troubleshoot.html#my-code-doesn-t-compile

If you think your code should work with Numba, please report the error message
and traceback, along with a minimal reproducer at:
https://github.com/numba/numba/issues/new
@mroeschke
Author

Here's the response when njit(np.mean) is used instead.

---------------------------------------------------------------------------
TypingError                               Traceback (most recent call last)
<ipython-input-4-867344b87300> in <module>
----> 1 run_func(np.arange(10))

numba/dispatcher.py in _compile_for_args(self, *args, **kws)
    374                 e.patch_message(msg)
    375
--> 376             error_rewrite(e, 'typing')
    377         except errors.UnsupportedError as e:
    378             # Something unsupported is present in the user code, add help info

numba/dispatcher.py in error_rewrite(e, issue_type)
    341                 raise e
    342             else:
--> 343                 reraise(type(e), e, None)
    344
    345         argtypes = []

numba/six.py in reraise(tp, value, tb)
    656             value = tp()
    657         if value.__traceback__ is not tb:
--> 658             raise value.with_traceback(tb)
    659         raise value
    660

TypingError: Failed in nopython mode pipeline (step: nopython frontend)
Internal error at <numba.typeinfer.CallConstraint object at 0x61e368ed0>.
Failed in nopython mode pipeline (step: analyzing bytecode)
Use of unsupported opcode (SETUP_EXCEPT) found

File "../../../../anaconda3/envs/pandas-2s-dev/lib/python3.7/site-packages/numpy/core/fromnumeric.py", line 3110:
def mean(a, axis=None, dtype=None, out=None, keepdims=np._NoValue):
    <source elided>
    if type(a) is not mu.ndarray:
        try:
        ^

[1] During: resolving callee type: type(CPUDispatcher(<function mean at 0x10a35d680>))
[2] During: typing of call at <ipython-input-3-ad952396009d> (3)

Enable logging at debug level for details.

File "<ipython-input-3-ad952396009d>", line 3:
def run_func(values):
    return numba_func(values)
    ^

This is not usually a problem with Numba itself but instead often caused by
the use of unsupported features or an issue in resolving types.

To see Python/NumPy features supported by the latest release of Numba visit:
http://numba.pydata.org/numba-doc/latest/reference/pysupported.html
and
http://numba.pydata.org/numba-doc/latest/reference/numpysupported.html

For more information about typing errors and how to debug them visit:
http://numba.pydata.org/numba-doc/latest/user/troubleshoot.html#my-code-doesn-t-compile

If you think your code should work with Numba, please report the error message
and traceback, along with a minimal reproducer at:
https://github.com/numba/numba/issues/new

@esc
Member

esc commented Sep 23, 2019

@mroeschke thank you for asking this question about Numba. Could you possibly expand a bit on what larger problem you are attempting to solve? I.e. is anything stopping you from using the following construct:

In [8]: @njit
   ...: def f(a):
   ...:     return np.sum(a)
   ...:

In [9]: f(np.arange(10))
Out[9]: 45

In [10]: %timeit f.py_func(np.arange(10))
2.74 µs ± 34 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

In [11]: %timeit f(np.arange(10))
614 ns ± 6.43 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

@esc added the needtriage label on Sep 23, 2019
@stuartarchibald
Contributor

As hinted at by @esc, the issue is that you are trying to jit the np.FOO function from NumPy directly, that is, to jit the actual NumPy implementation. This is unlikely to work because the NumPy implementation probably contains all sorts of unsupported things (like isinstance). I would recommend doing what @esc suggested and wrapping the function. If you have a bunch of these where you want to pass some NumPy API function to a jitted function, perhaps consider using a factory function to do the jitting.

@stuartarchibald added the "no action required" label (No action was needed to resolve.) and removed the needtriage label on Sep 23, 2019
@mroeschke
Author

I'm currently trying to implement rolling apply in pandas, in which a user can pass an arbitrary function that can be applied to a window of data.

My exact implementation can be found in https://github.com/twosigma/pandas/pull/29/files#diff-0de5c5d9abfcdd141e83701eaaec4358R1145, but posting the relevant part:

# I have a factory function to generate a "roll_apply" function 
# given a passed function.
def make_rolling_apply(func, args=()):

    numba_func = numba.njit(func)

    # I'd like this function signature to remain fixed 
    # (consistent with other roll functions)
    @numba.njit
    def roll_apply(
        values: np.ndarray,
        begin: np.ndarray,
        end: np.ndarray,
        minimum_periods: int,
    ):
        result = np.empty(len(begin))
        for i, (start, stop) in enumerate(zip(begin, end)):
            window = values[start:stop]
            count_nan = np.sum(np.isnan(window))
            if len(window) - count_nan >= minimum_periods:
                result[i] = numba_func(window, *args)
            else:
                result[i] = np.nan
        return result

    return roll_apply

If I change numba_func to:

@numba.njit
def numba_func(window):
    return func(window)

this fails in nopython mode (which I would really like to keep here for performance):

E           Failed in nopython mode pipeline (step: nopython frontend)
E           Untyped global name 'func': cannot determine Numba type of <class 'function'>

@stuartarchibald
Contributor

@mroeschke Thanks for the update. This might help? It's unfortunately not particularly elegant.

import numba
import numpy as np

def make_rolling_apply(func, args=(), kwargs=()):

    @numba.generated_jit(nopython=True)
    def numba_func(window, *_args):
        if getattr(np, func.__name__, False):
            def impl(window, *_args):
                return func(window, *_args)
            return impl
        else:
            jf = numba.njit(func)
            def impl(window, *_args):
                return jf(window, *_args)
            return impl

    # I'd like this function signature to remain fixed 
    # (consistent with other roll functions)
    @numba.njit
    def roll_apply(
        values: np.ndarray,
        begin: np.ndarray,
        end: np.ndarray,
        minimum_periods: int,
    ):
        result = np.empty(len(begin))
        for i, (start, stop) in enumerate(zip(begin, end)):
            window = values[start:stop]
            count_nan = np.sum(np.isnan(window))
            if len(window) - count_nan >= minimum_periods:
                result[i] = numba_func(window, *args)
            else:
                result[i] = np.nan
        return result

    return roll_apply


def _apply(the_func, args, kwargs):
    # this stuff would come from self?
    N = 10
    values = np.ones((N),)
    begin = np.arange(N)
    end = begin + 1
    minimum_periods = 1

    impl = make_rolling_apply(the_func, args=args, kwargs=kwargs)

    return impl(values, begin, end, minimum_periods)

print(_apply(np.sum, (), {}))

def foo(window, *args):
    arg1, arg2 = args
    return (window[0] + arg1) / arg2

print(_apply(foo, args=(10, 20), kwargs={}))

def bar(window, *args):
    arg1, arg2 = args
    return (window[0] + arg1) / arg2[1]

print(_apply(bar, args=(10, np.ones(5)), kwargs={}))

@mroeschke
Author

Thanks @stuartarchibald! I'll try this solution tonight. Special handling of NumPy functions may be the right workaround here.

@mroeschke
Author

Your solution worked @stuartarchibald, thanks!

So should I assume njit(np.<function>) won't be supported?

@stuartarchibald
Contributor

@mroeschke great! It might also be worth adding a check in the NumPy function identification part to make sure it is indeed the NumPy function, i.e. this breaks:

In [4]: import numpy as np

In [5]: def sum(x):
   ...:     pass
   ...:

In [6]: func = sum

In [7]: getattr(np, func.__name__, False)
Out[7]: <function numpy.sum(a, axis=None, dtype=None, out=None, keepdims=<no value>, initial=<no value>)>

which probably needs something like:

In [11]: possible_np_func = getattr(np, func.__name__, False)

In [12]: if func is possible_np_func:
    ...:     print("is NumPy func")
    ...: else:
    ...:     print("is not NumPy func")
    ...:
is not NumPy func

to make sure only NumPy functions are treated this way.
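The identity check above could be packaged as a small helper, sketched here under the assumption that only top-level attributes of the numpy module matter. The names is_numpy_function and make_lookalike are hypothetical:

```python
import numpy as np


def is_numpy_function(func):
    """Return True only if func is the very object exposed as np.<name>."""
    name = getattr(func, "__name__", None)
    if name is None:
        # Objects such as functools.partial instances have no __name__.
        return False
    # Compare identity, not just the name, so that a user function that
    # happens to be called "sum" is not mistaken for np.sum.
    return getattr(np, name, None) is func


def make_lookalike():
    """Build a user function whose __name__ collides with np.sum."""
    def sum(x):
        return 0
    return sum
```

With this helper, is_numpy_function(np.sum) is true, while the function returned by make_lookalike() is rejected even though its __name__ is also "sum".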

> So should I assume njit(np.<function>) won't be supported?

This is up for discussion at the Numba core developer meeting today. With #4599 in the works, use cases similar to the one presented here are more likely. We also briefly discussed at an issue triage session yesterday how such a feature could be implemented, and it seems technically possible.

@stuartarchibald
Contributor

Outcome of discussion was #4608

@mroeschke
Author

Thanks. I'll close this issue in favor of #4608
