Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Inconsistent error reporting for out of range indexing with parallel vs serial and lists vs arrays #4600

Open
2 tasks done
rborder opened this issue Sep 20, 2019 · 4 comments

Comments

@rborder
Copy link

rborder commented Sep 20, 2019

Reporting a bug

I'm observing inconsistent error behavior with parallel and serial out-of-range indexing dependent on whether operating on reflected list vs numpy array:

Minimal reproducer

import numpy as np
from numba import njit, prange

@njit(parallel=False)
def out_of_bounds_serial(array):
    for i in prange(len(array)+1):
        array[i]

@njit(parallel=True)
def out_of_bounds_parallel(array):
    for i in prange(len(array)+1):
        array[i]

out_of_bounds_serial([1,2,3])                ## IndexError: getitem out of range
out_of_bounds_serial(np.array([1,2,3]))      ## No error
out_of_bounds_parallel([1,2,3])              ## No error
out_of_bounds_parallel(np.array([1,2,3]))    ## No error
@esc
Copy link
Member

esc commented Sep 23, 2019

@rborder thanks for reporting this, I can reproduce it. The last three cases should probably raise the same kind of error that the first one raises?

@esc esc added the needtriage label Sep 23, 2019
@stuartarchibald
Copy link
Contributor

Bounds checking of array access is not implemented, in part because it would harm performance. There's a WIP PR here to add it as an option #4432. The issue in the case of list access under the parallel target is probably a bug/oversight.

@DrTodd13
Copy link
Collaborator

DrTodd13 commented Oct 1, 2019

I don't think the bounds check needs to be inside the loop for parfors. We could look at the arrays accessed and put a check before the loop (in the parfor init block actually) comparing the size of the container with the parfor stop index. If the stop index is greater than the container size the the loop would eventually access out-of-bounds so we could assert early. It's a bit weird that the assertion would happen before any iteration of the loop is run. Does that matter? @stuartarchibald

@stuartarchibald
Copy link
Contributor

I don't think the bounds check should be on by default as it adds work, #4432 enables it via an option and were this carried into parfors the same option should be used.

As to determining accesses before a loop runs, I'm not sure that's always possible? Consider this case:

from numba import njit, prange
import numpy as np

for _range in (njit(lambda x: range(x)), njit(inline='always')(lambda x: prange(x))):
    @njit(parallel=True)
    def foo(v):
        N = len(v)
        for i in _range(N):
            if i > v[i] // 2:
                b = "branch 1"
                idx = i
            else:
                b = "branch 2"
                idx = v[i] # the index is a function of the runtime values
            print("i =", i, b, "idx= ", idx)
            v[int(idx)] += 1
        return v

    x = np.arange(10)
    print(foo(x))

I think focusing on the GEP based validation might be easier to reason about?

. It's a bit weird that the assertion would happen before any iteration of the loop is run. Does that matter? @stuartarchibald

If it is possible to do this reliably via static analysis I think that this would be fine. It is "weird" but I don't think it would interrupt the user experience on the basis that no guarantee is made about what parallel transforms take place or what they actually do. Most important would be to ensure that the out of bounds location is tracked correctly so as to provide useful feedback to the user.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants