Type unification problem of integer temp variables in a parfor loop #5801

Open
2 tasks done
kozlov-alexey opened this issue Jun 2, 2020 · 2 comments

@kozlov-alexey
Contributor

Hi,

I found a case where a getitem implementation for a new extension type (pandas.RangeIndex) fails to compile inside a parfor loop when the getitem result is used as a key in a typed.Dict lookup.

The likely cause is the unsafe NumPy-like type unification rules in Number.unify(). They unify two temporary phi variables from different branches inside the loop, typed int64 and uint64, to float64. That changes the result type of the getitem operation, and compilation then fails because the result is used as a key in a typed.Dict whose key_type is an integer. The test below reproduces the problem:

import numba
from numba import types
from numba.extending import overload

import unittest
import numpy as np

class TestSuite(unittest.TestCase):

    def test_reproducer1(self):

        def range_getitem(start, stop, step, idx):
            pass

        @overload(range_getitem)
        def range_getitem_ovld(start, stop, step, idx):
            if not (isinstance(idx, types.Integer)
                    and isinstance(start, types.Number)
                    and isinstance(stop, types.Number)
                    and isinstance(step, types.Number)):
                return

            def range_getitem_impl(start, stop, step, idx):
                range_len = len(range(start, stop, step))

                # this line causes type unification problem between int64 (range_len + idx) and uint64 (idx)             
                idx = (range_len + idx) if idx < 0 else idx
                return start + step * idx                   # with this line it fails
                # return start + step * types.int64(idx)    # this can be a workaround
            return range_getitem_impl

        @numba.njit(parallel=True)
        def test_impl(n, values, start, stop, step):
            values_dict = {}
            for x in values:
                values_dict[x] = True

            res = np.empty(n, dtype=types.bool_)
            for i in numba.prange(n):
                res[i] = range_getitem(start, stop, step, i) in values_dict
            return res

        start, stop, step = 1, 10, 2
        range_len = len(range(start, stop, step))
        values = np.asarray([1, 3, 4, 2, 9], dtype=np.int64)
        print(test_impl(range_len, values, start, stop, step))


if __name__ == "__main__":
    unittest.main()

And here's the IR of range_getitem_impl, followed by the typevars dict after constraint propagation has finished:

label 0:
    start = arg(0, name=start)               ['start']
    stop = arg(1, name=stop)                 ['stop']
    step = arg(2, name=step)                 ['step']
    idx = arg(3, name=idx)                   ['idx']
    $2load_global.0 = global(len: <built-in function len>) ['$2load_global.0']
    $4load_global.1 = global(range: <class 'range'>) ['$4load_global.1']
    $12call_function.5 = call $4load_global.1(start, stop, step, func=$4load_global.1, args=[Var(start, example_test.py:24), Var(stop, example_test.py:24), Var(step, example_test.py:24)], kws=(), vararg=None) ['$12call_function.5', '$4load_global.1', 'start', 'step', 'stop']
    $14call_function.6 = call $2load_global.0($12call_function.5, func=$2load_global.0, args=[Var($12call_function.5, example_test.py:24)], kws=(), vararg=None) ['$12call_function.5', '$14call_function.6', '$2load_global.0']
    range_len = $14call_function.6           ['$14call_function.6', 'range_len']
    $const20.8 = const(int, 0)               ['$const20.8']
    $22compare_op.9 = idx < $const20.8       ['$22compare_op.9', '$const20.8', 'idx']
    bool24 = global(bool: <class 'bool'>)    ['bool24']
    $24pred = call bool24($22compare_op.9, func=bool24, args=(Var($22compare_op.9, example_test.py:29),), kws=(), vararg=None) ['$22compare_op.9', '$24pred', 'bool24']
    branch $24pred, 26, 34                   ['$24pred']
label 26:
    $30binary_add.2 = range_len + idx        ['$30binary_add.2', 'idx', 'range_len']
    $phi36.0 = $30binary_add.2               ['$30binary_add.2', '$phi36.0']        # this will be typed as int64
    jump 36                                  []
label 34:
    $phi36.0.1 = idx                         ['$phi36.0.1', 'idx']                # this will be typed uint64 (same as arg.idx)
    jump 36                                  []
label 36:
    $phi36.0.2 = phi(incoming_values=[Var($phi36.0, example_test.py:29), Var($phi36.0.1, example_test.py:29)], incoming_blocks=[26, 34]) ['$phi36.0', '$phi36.0.1', '$phi36.0.2']    # this will unify types for $phi36.0 and $phi36.0.1
    idx.1 = $phi36.0.2                       ['$phi36.0.2', 'idx.1']
    $44binary_multiply.4 = step * idx.1      ['$44binary_multiply.4', 'idx.1', 'step']
    $46binary_add.5 = start + $44binary_multiply.4 ['$44binary_multiply.4', '$46binary_add.5', 'start']
    $48return_value.6 = cast(value=$46binary_add.5) ['$46binary_add.5', '$48return_value.6']
    return $48return_value.6                 ['$48return_value.6']
[Current context]: File "C:\Users\akozlov\AppData\Local\Continuum\anaconda3\numba_master\numba\numba\core\typed_passes.py", line 71, in type_inference_stage
>>> infer.typevars
{'arg.start': arg.start := int64
'arg.stop': arg.stop := int64
'arg.step': arg.step := int64
'arg.idx': arg.idx := uint64
'$2load_global.0': $2load_global.0 := Function(<built-in function len>)
'$4load_global.1': $4load_global.1 := Function(<class 'range'>)
'$const20.8': $const20.8 := Literal[int](0)
'bool24': bool24 := Function(<class 'bool'>)
'start': start := int64
'stop': stop := int64
'step': step := int64
'idx': idx := uint64
'$12call_function.5': $12call_function.5 := range_state_int64
'$14call_function.6': $14call_function.6 := int64
'range_len': range_len := int64
'$22compare_op.9': $22compare_op.9 := bool
'$24pred': $24pred := bool
'$30binary_add.2': $30binary_add.2 := int64
'$phi36.0': $phi36.0 := int64
'$phi36.0.1': $phi36.0.1 := uint64
'$phi36.0.2': $phi36.0.2 := float64
'idx.1': idx.1 := float64
'$44binary_multiply.4': $44binary_multiply.4 := float64
'$46binary_add.5': $46binary_add.5 := float64
'$48return_value.6': $48return_value.6 := float64}
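
The float64 assigned to $phi36.0.2 above matches what NumPy's own promotion rules give for an int64/uint64 pair. The snippet below is only an illustration using NumPy as a stand-in: it does not call Numba's Number.unify(), and the names `pairs`, `promoted` and `mixed` are made up for this sketch. It shows that smaller signed/unsigned pairs still promote to a wider integer, while int64 with uint64 has no wider integer available and falls back to float64.

```python
import numpy as np

# A few signed/unsigned integer pairs (chosen for illustration only).
pairs = [
    (np.int32, np.uint32),
    (np.int64, np.uint32),
    (np.int64, np.uint64),
]

# Map each pair of dtype names to the name of the promoted dtype.
promoted = {
    (np.dtype(a).name, np.dtype(b).name): np.promote_types(a, b).name
    for a, b in pairs
}

for (a, b), result in promoted.items():
    print(f"{a} with {b} -> {result}")

# The same rule applies to NumPy scalars of these two types.
mixed = np.int64(5) + np.uint64(3)
print(type(mixed).__name__)
```

Only the last pair leaves the integer domain, which is the case hit here when the unsigned prange index meets the signed `range_len + idx` value.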

The error is as follows:

======================================================================
ERROR: test_reproducer1 (__main__.TestSuite)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "example_test.py", line 48, in test_reproducer1
    print(test_impl(range_len, values, start, stop, step))
  File "C:\Users\akozlov\AppData\Local\Continuum\anaconda3\numba_master\numba\numba\core\dispatcher.py", line 415, in _compile_for_args
    error_rewrite(e, 'typing')
  File "C:\Users\akozlov\AppData\Local\Continuum\anaconda3\numba_master\numba\numba\core\dispatcher.py", line 358, in error_rewrite
    reraise(type(e), e, None)
  File "C:\Users\akozlov\AppData\Local\Continuum\anaconda3\numba_master\numba\numba\core\utils.py", line 80, in reraise
    raise value.with_traceback(tb)
numba.core.errors.TypingError: Failed in nopython mode pipeline (step: nopython mode backend)
Failed in nopython mode pipeline (step: nopython frontend)
No implementation of function Function(<built-in function contains>) found for signature:

 >>> contains(DictType[int64,bool], float64)

There are 14 candidate implementations:
  - Of which 12 did not match due to:
  Overload in function 'contains': File: <built-in>: Line <N/A>.
    With argument(s): '(DictType[int64,bool], float64)':
   No match.
  - Of which 2 did not match due to:
  Overload in function 'contains': File: <built-in>: Line <N/A>.
    With argument(s): '(DictType[int64,bool], float64)':
   Rejected as the implementation raised a specific error:
     TypingError: Failed in nopython mode pipeline (step: nopython frontend)
   No implementation of function Function(<intrinsic _cast>) found for signature:


    >>> <unknown function>(float64, class(int64))

   There are 2 candidate implementations:
     - Of which 2 did not match due to:
     Overload in function '_cast': File: numba\typed\typedobjectutils.py: Line 23.
       With argument(s): '(float64, class(int64))':
      Rejected as the implementation raised a specific error:
        TypingError: cannot safely cast float64 to int64. Please cast explicitly.

     raised from C:\Users\akozlov\AppData\Local\Continuum\anaconda3\numba_master\numba\numba\typed\typedobjectutils.py:73

   During: resolving callee type: Function(<intrinsic _cast>)
   During: typing of call at C:\Users\akozlov\AppData\Local\Continuum\anaconda3\numba_master\numba\numba\typed\dictobject.py (807)


   File "numba\typed\dictobject.py", line 807:
       def impl(d, k):
           k = _cast(k, keyty)
           ^

  raised from C:\Users\akozlov\AppData\Local\Continuum\anaconda3\numba_master\numba\numba\core\typeinfer.py:994

During: typing of intrinsic-call at example_test.py (42)

File "example_test.py", line 42:
        def test_impl(n, values, start, stop, step):
            <source elided>
            for i in numba.prange(n):
                res[i] = range_getitem(start, stop, step, i) in values_dict
                ^

During: lowering "id=0[LoopNest(index_variable = parfor_index.3, range = (0, n, 1))]{52: <ir.Block at example_test.py (41)>}Var(parfor_index.3, example_test.py:41)" at example_test.py (41)

I verified that the problem still reproduces on master: 0.50.0.dev0+378.g079a70274

Best Regards,
Alexey.

@stuartarchibald
Contributor

Thanks for the report. I think the prange induction variable is made unsigned where possible to ensure that various vectorizations occur, based on the use of an unsigned scheduler. What you are seeing is indeed due to the NumPy type unification rules, under which a signed and an unsigned int unify to float. @DrTodd13, any ideas whether something can be done here? I'm not sure that there is a general solution.

@DrTodd13
Collaborator

@stuartarchibald @kozlov-alexey I am going to remove the ParallelAccelerator tag here because this is really about type unification, and I'm going to turn this into a feature request. The parallel=True docs should perhaps be more upfront with users that if the range index is known to be non-negative, the index is typed as unsigned. I agree with @sklam that the long-term solution is to separate Python from NumPy type unification, but in the short term the workaround you found is the best you can do.
