I found a case where a getitem implementation for a new extension type (pandas.RangeIndex) cannot be compiled in a parfor loop if the getitem result is used as a key in a typed.Dict lookup.
The likely cause is the unsafe NumPy-like type unification rules in Number.unify(), which unify two variables (temporary phi variables from different branches inside a loop) of types int64 and uint64 to float64. This changes the result type of the getitem operation and ultimately causes a failure, because the result is used as a key in a typed.Dict whose key_type is an integer type. Below is a test that reproduces the problem:
import numba
from numba import types
from numba.extending import overload
import unittest
import numpy as np


class TestSuite(unittest.TestCase):

    def test_reproducer1(self):

        def range_getitem(start, stop, step, idx):
            pass

        @overload(range_getitem)
        def range_getitem_ovld(start, stop, step, idx):
            if not (isinstance(idx, types.Integer)
                    and isinstance(start, types.Number)
                    and isinstance(stop, types.Number)
                    and isinstance(step, types.Number)):
                return

            def range_getitem_impl(start, stop, step, idx):
                range_len = len(range(start, stop, step))
                # this line causes type unification problem between int64 (range_len + idx) and uint64 (idx)
                idx = (range_len + idx) if idx < 0 else idx
                return start + step * idx  # with this line it fails
                # return start + step * types.int64(idx)  # this can be a workaround
            return range_getitem_impl

        @numba.njit(parallel=True)
        def test_impl(n, values, start, stop, step):
            values_dict = {}
            for x in values:
                values_dict[x] = True
            res = np.empty(n, dtype=types.bool_)
            for i in numba.prange(n):
                res[i] = range_getitem(start, stop, step, i) in values_dict
            return res

        start, stop, step = 1, 10, 2
        range_len = len(range(start, stop, step))
        values = np.asarray([1, 3, 4, 2, 9], dtype=np.int64)
        print(test_impl(range_len, values, start, stop, step))


if __name__ == "__main__":
    unittest.main()
And here's the IR of range_getitem_impl and the typevars dict after constraint propagation has finished:
The error is as follows:
======================================================================
ERROR: test_reproducer1 (__main__.TestSuite)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "example_test.py", line 48, in test_reproducer1
    print(test_impl(range_len, values, start, stop, step))
  File "C:\Users\akozlov\AppData\Local\Continuum\anaconda3\numba_master\numba\numba\core\dispatcher.py", line 415, in _compile_for_args
    error_rewrite(e, 'typing')
  File "C:\Users\akozlov\AppData\Local\Continuum\anaconda3\numba_master\numba\numba\core\dispatcher.py", line 358, in error_rewrite
    reraise(type(e), e, None)
  File "C:\Users\akozlov\AppData\Local\Continuum\anaconda3\numba_master\numba\numba\core\utils.py", line 80, in reraise
    raise value.with_traceback(tb)
numba.core.errors.TypingError: Failed in nopython mode pipeline (step: nopython mode backend)
Failed in nopython mode pipeline (step: nopython frontend)
No implementation of function Function(<built-in function contains>) found for signature:
 >>> contains(DictType[int64,bool], float64)
There are 14 candidate implementations:
  - Of which 12 did not match due to:
  Overload in function 'contains': File: <built-in>: Line <N/A>.
    With argument(s): '(DictType[int64,bool], float64)':
   No match.
  - Of which 2 did not match due to:
  Overload in function 'contains': File: <built-in>: Line <N/A>.
    With argument(s): '(DictType[int64,bool], float64)':
   Rejected as the implementation raised a specific error:
     TypingError: Failed in nopython mode pipeline (step: nopython frontend)
   No implementation of function Function(<intrinsic _cast>) found for signature:
    >>> <unknown function>(float64, class(int64))
   There are 2 candidate implementations:
     - Of which 2 did not match due to:
     Overload in function '_cast': File: numba\typed\typedobjectutils.py: Line 23.
       With argument(s): '(float64, class(int64))':
      Rejected as the implementation raised a specific error:
        TypingError: cannot safely cast float64 to int64. Please cast explicitly.
     raised from C:\Users\akozlov\AppData\Local\Continuum\anaconda3\numba_master\numba\numba\typed\typedobjectutils.py:73
   During: resolving callee type: Function(<intrinsic _cast>)
   During: typing of call at C:\Users\akozlov\AppData\Local\Continuum\anaconda3\numba_master\numba\numba\typed\dictobject.py (807)
   File "numba\typed\dictobject.py", line 807:
       def impl(d, k):
           k = _cast(k, keyty)
           ^
  raised from C:\Users\akozlov\AppData\Local\Continuum\anaconda3\numba_master\numba\numba\core\typeinfer.py:994
During: typing of intrinsic-call at example_test.py (42)
File "example_test.py", line 42:
        def test_impl(n, values, start, stop, step):
            <source elided>
            for i in numba.prange(n):
                res[i] = range_getitem(start, stop, step, i) in values_dict
                ^
During: lowering "id=0[LoopNest(index_variable = parfor_index.3, range = (0, n, 1))]{52: <ir.Block at example_test.py (41)>}Var(parfor_index.3, example_test.py:41)" at example_test.py (41)
I verified that the problem still occurs on master: 0.50.0.dev0+378.g079a70274
Best Regards,
Alexey.
Thanks for the report. I think that the prange induction variable is unsigned where possible, to ensure that various vectorizations occur based on the use of an unsigned scheduler. What you are seeing is indeed due to the NumPy type unification rules between a signed and an unsigned int, under which the two unify to float. @DrTodd13 any ideas if there's something that can be done here? I'm not sure that there is a general solution.
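For illustration, the NumPy promotion rule mentioned here can be checked directly with `np.promote_types`. This is a minimal sketch in plain NumPy, not Numba's own type inference; it only assumes that Number.unify() follows the same promotion table for these integer types, as the discussion above suggests.

```python
import numpy as np

# int64 and uint64 have no common 64-bit integer type that holds both
# ranges, so NumPy promotes the pair to float64.
mixed = np.promote_types(np.int64, np.uint64)

# A narrower unsigned type still fits inside int64, so the result stays integral.
narrow = np.promote_types(np.int64, np.uint32)

# Two signed 64-bit integers trivially stay int64.
same = np.promote_types(np.int64, np.int64)

print(mixed, narrow, same)
```

Only the int64/uint64 pairing leaves the integer domain, which is the combination produced when a signed length meets an unsigned prange index.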
@stuartarchibald @kozlov-alexey I am going to remove the ParallelAccelerator tag here because this is really about type unification, and I'm going to turn this into a feature request. The parallel=True docs should perhaps be more upfront with users that if the range index is known to be non-negative, then the index is typed as unsigned. I agree with @sklam that the long-term solution is to separate Python type unification from NumPy type unification, but in the short term the workaround you found is the best you can do.
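A rough NumPy analogy of why the explicit cast helps, assuming the same promotion rules apply: `np.where` stands in for the two-branch conditional in range_getitem_impl, and the array names (`wrapped`, `idx`, `take_wrapped`) are hypothetical stand-ins rather than anything from Numba itself.

```python
import numpy as np

start, step = 1, 2

# Signed values, standing in for the int64 branch (range_len + idx).
wrapped = np.array([5, 6, 7], dtype=np.int64)
# Unsigned values, standing in for a non-negative prange index.
idx = np.array([0, 1, 2], dtype=np.uint64)
# No index is negative here, so the unsigned branch is always selected.
take_wrapped = np.zeros(3, dtype=bool)

# Selecting between int64 and uint64 operands promotes the result to float64.
unsafe = start + step * np.where(take_wrapped, wrapped, idx)

# Casting the index to int64 first keeps both branches signed, so the
# result stays int64 and remains usable as an integer key.
safe = start + step * np.where(take_wrapped, wrapped, idx.astype(np.int64))

print(unsafe.dtype, safe.dtype)
```

The cast costs nothing when the index is known to fit in int64, which holds for any realistic loop bound.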