Inconsistent error reporting for out of range indexing with parallel vs serial and lists vs arrays #4600

rborder · 2019-09-20T20:01:24Z

Reporting a bug

I am using the master version of Numba (most recent is visible in
the change log (https://github.com/numba/numba/blob/master/CHANGE_LOG).
I have included below a minimal working reproducer (if you are unsure how
to write one see http://matthewrocklin.com/blog/work/2018/02/28/minimal-bug-reports).

I'm observing inconsistent error behavior with parallel and serial out-of-range indexing dependent on whether operating on reflected list vs numpy array:

Minimal reproducer

import numpy as np
from numba import njit, prange

@njit(parallel=False)
def out_of_bounds_serial(array):
    for i in prange(len(array)+1):
        array[i]

@njit(parallel=True)
def out_of_bounds_parallel(array):
    for i in prange(len(array)+1):
        array[i]

out_of_bounds_serial([1,2,3])                ## IndexError: getitem out of range
out_of_bounds_serial(np.array([1,2,3]))      ## No error
out_of_bounds_parallel([1,2,3])              ## No error
out_of_bounds_parallel(np.array([1,2,3]))    ## No error

The text was updated successfully, but these errors were encountered:

esc · 2019-09-23T09:12:13Z

@rborder thanks for reporting this, I can reproduce it. The last three cases should probably raise the same kind of error that the first one raises?

stuartarchibald · 2019-09-23T09:40:28Z

Bounds checking of array access is not implemented, in part because it would harm performance. There's a WIP PR here to add it as an option #4432. The issue in the case of list access under the parallel target is probably a bug/oversight.

DrTodd13 · 2019-10-01T17:44:42Z

I don't think the bounds check needs to be inside the loop for parfors. We could look at the arrays accessed and put a check before the loop (in the parfor init block actually) comparing the size of the container with the parfor stop index. If the stop index is greater than the container size the the loop would eventually access out-of-bounds so we could assert early. It's a bit weird that the assertion would happen before any iteration of the loop is run. Does that matter? @stuartarchibald

stuartarchibald · 2019-10-03T08:55:34Z

I don't think the bounds check should be on by default as it adds work, #4432 enables it via an option and were this carried into parfors the same option should be used.

As to determining accesses before a loop runs, I'm not sure that's always possible? Consider this case:

from numba import njit, prange
import numpy as np

for _range in (njit(lambda x: range(x)), njit(inline='always')(lambda x: prange(x))):
    @njit(parallel=True)
    def foo(v):
        N = len(v)
        for i in _range(N):
            if i > v[i] // 2:
                b = "branch 1"
                idx = i
            else:
                b = "branch 2"
                idx = v[i] # the index is a function of the runtime values
            print("i =", i, b, "idx= ", idx)
            v[int(idx)] += 1
        return v

    x = np.arange(10)
    print(foo(x))

I think focusing on the GEP based validation might be easier to reason about?

. It's a bit weird that the assertion would happen before any iteration of the loop is run. Does that matter? @stuartarchibald

If it is possible to do this reliably via static analysis I think that this would be fine. It is "weird" but I don't think it would interrupt the user experience on the basis that no guarantee is made about what parallel transforms take place or what they actually do. Most important would be to ensure that the out of bounds location is tracked correctly so as to provide useful feedback to the user.

esc added the needtriage label Sep 23, 2019

stuartarchibald added bug ParallelAccelerator and removed needtriage labels Sep 23, 2019

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Inconsistent error reporting for out of range indexing with parallel vs serial and lists vs arrays #4600

Inconsistent error reporting for out of range indexing with parallel vs serial and lists vs arrays #4600

rborder commented Sep 20, 2019 •

edited

esc commented Sep 23, 2019

stuartarchibald commented Sep 23, 2019

DrTodd13 commented Oct 1, 2019

stuartarchibald commented Oct 3, 2019

Inconsistent error reporting for out of range indexing with parallel vs serial and lists vs arrays #4600

Inconsistent error reporting for out of range indexing with parallel vs serial and lists vs arrays #4600

Comments

rborder commented Sep 20, 2019 • edited

Reporting a bug

Minimal reproducer

esc commented Sep 23, 2019

stuartarchibald commented Sep 23, 2019

DrTodd13 commented Oct 1, 2019

stuartarchibald commented Oct 3, 2019

rborder commented Sep 20, 2019 •

edited