storage variables inside functions do not work in nopython mode #4246

SumNeuron · 2019-07-01T11:57:43Z

A common coding convention is to use a "storage" variable to collect results inside of a function, e.g.

def foo(some_var):
    results = []
    for some_el in some_var:
        results.append(some_el)
    return results

Numba supports NumPy arrays, but this pattern does not seem to work when the storage variable is a multidimensional array.

# this does not work
@njit(int64[:,:](int64))
def bar(number):
    results = np.array([], dtype=np.int64).reshape(-1, 2)
    for i in range(number):
        results = np.concatenate(results, np.array([[number, number + 1]]))
    return results

# this works
@njit(int64[:,:](int64, int64[:,:]))
def baz(number, results):
    # results = np.array([], dtype=np.int64).reshape(-1, 2)
    for i in range(number):
        results = np.concatenate(results, np.array([[number, number + 1]]))
    return results

This forces an awkward wrapper like:

@jit(forceobj=True)
def fuzz(number):
    # make this here because numba no like
    results = np.array([], dtype=np.int64).reshape(-1, 2)
    return baz(number, results)

because, although NumPy arrays are allowed as default arguments, Numba does not accept a call that leaves out the optional argument:

@njit(int64[:,:](int64, int64[:,:]))
def buzz(number, results=np.array([], dtype=np.int64).reshape(-1, 2)):
    for i in range(number):
        results = np.concatenate(results, np.array([[number, number + 1]]))
    return results

buzz(10) #<--- throws error

please fix


sklam · 2019-07-01T12:57:18Z

The problem in

# this does not work
@njit(int64[:,:](int64))
def bar(number):
    results = np.array([], dtype=np.int64).reshape(-1, 2)

is due to the empty list [] being untyped. We should consider giving the empty list a special type, since it is often used on its own. A weird but working alternative is results = np.array([0][:0]).reshape(-1, 2). This works around the untyped [] by forcing the element type to be typeof(0) and then slicing the list down to an empty list. A more sensible workaround is to use a different array constructor, e.g. results = np.empty(0, dtype=int64).reshape(-1, 2).
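A minimal sketch of the second workaround, assuming a two-column int64 storage array. It creates the empty array directly with np.empty((0, 2), ...) instead of reshaping, which is a simplification of mine rather than something stated above, and the row contents are only illustrative. The import fallback is there so the snippet still runs (uncompiled) where Numba is not installed.

```python
import numpy as np

try:
    from numba import njit
except ImportError:
    # Fallback so the sketch still runs (uncompiled) without Numba.
    def njit(func):
        return func


@njit
def bar(number):
    # A typed, empty (0, 2) array: no untyped [] is involved.
    results = np.empty((0, 2), dtype=np.int64)
    for i in range(number):
        # Illustrative row; the issue's example used [number, number + 1].
        row = np.array([[i, i + 1]], dtype=np.int64)
        # np.concatenate takes a tuple of arrays as its first argument.
        results = np.concatenate((results, row))
    return results
```

Calling bar(3) should give a (3, 2) array, and bar(0) an empty (0, 2) array.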

The other reported error:

@njit(int64[:,:](int64, int64[:,:]))
def buzz(number, results=np.array([], dtype=np.int64).reshape(-1, 2)):
    for i in range(number):
        # slightly modified to add the missing parentheses so that `np.array([[number...` is not
        # treated as the `axis` arg
        results = np.concatenate((results, np.array([[number, number + 1]])))
    return results

buzz(10) #<--- throws error

is caused by the signature. The provided type signature does not account for the second argument being optional with a default value. It works when the user-provided signature is omitted; Numba then infers the signature to be (int64, omitted(default=array([], shape=(0, 2), dtype=int64))).
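A minimal sketch of that suggestion, assuming the same buzz function with the decorator signature removed so that Numba infers the types on each call. The default array is built with np.empty((0, 2), ...) here, which is my own choice for the sketch. The import fallback only lets the snippet run (uncompiled) where Numba is not installed.

```python
import numpy as np

try:
    from numba import njit
except ImportError:
    # Fallback so the sketch still runs (uncompiled) without Numba.
    def njit(func):
        return func


@njit  # no explicit signature, so the default argument can be left out
def buzz(number, results=np.empty((0, 2), dtype=np.int64)):
    for i in range(number):
        row = np.array([[number, number + 1]], dtype=np.int64)
        results = np.concatenate((results, row))
    return results


without_default = buzz(2)  # relies on the default argument
with_explicit = buzz(2, np.zeros((1, 2), dtype=np.int64))  # passes it explicitly
```

Both calls go through the same function; the second one simply starts from a one-row array instead of an empty one.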

SumNeuron · 2019-07-01T13:03:09Z

@sklam

re: empty list
Whatever works to get around this would be nice. Until it is implemented, perhaps add this as an example in the docs.

re: function signature
Is there a way to tell Numba in the signature both what the argument's type will be and that it is optional? To my knowledge, omitted is not a Numba type.

@njit([
    int64[:, :](int64, omitted(int64[:, :]))
])

# or

@njit([
    int64[:, :](int64, ?int64[:, :])  # <--- use ? to denote optional / omitted value
])

Or shouldn't Numba be indifferent to how the array is initialized, since I return it and the user-provided function signature already states what it will be?

sklam · 2019-07-01T13:24:56Z

It's best to just leave out the signature. Just a bare @njit:

@njit
def buzz(number, results=np.array([], dtype=np.int64).reshape(-1, 2)):
    ...

SumNeuron · 2019-07-04T08:47:54Z

@sklam Following on from this discussion, what is the most performant way to use this convention with prange? E.g.

@njit
def foo(iterable):
    # empty list with shape of thing we want to store
    results = np.array([0][:0]).reshape(-1, 3)
    for i in prange(len(iterable)):
        # some things happen here
        # maybe a call to a helper njit function
        partial = _helper(iterable[i]) 
        
        # this probably has a cleaner solution
        results = np.concatenate((results, np.array([partial]))) # <--- this feels wrong
        # because np.append is not supported - as well as axis - we have to remake 
        # an array rather than just to stack the results
    return results
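One common pattern, not confirmed anywhere in this thread, is to preallocate the storage array and let each prange iteration write only its own row, rather than concatenating inside the loop. A minimal sketch, assuming each partial result has a fixed length of 3 and float64 values. The body of _helper is hypothetical, and the import fallback only lets the snippet run (serially, uncompiled) where Numba is not installed.

```python
import numpy as np

try:
    from numba import njit, prange
except ImportError:
    # Fallback so the sketch still runs (serially, uncompiled) without Numba.
    prange = range

    def njit(*args, **kwargs):
        if len(args) == 1 and callable(args[0]) and not kwargs:
            return args[0]
        return lambda func: func


@njit
def _helper(x):
    # Hypothetical helper: three values derived from one input element.
    return np.array([x, 2.0 * x, x * x])


@njit(parallel=True)
def foo(iterable):
    n = len(iterable)
    # Preallocate the full result; row i belongs to iteration i only.
    results = np.empty((n, 3), dtype=np.float64)
    for i in prange(n):
        results[i, :] = _helper(iterable[i])
    return results
```

Because every iteration touches a different row, the iterations do not depend on each other, and no array is rebuilt on each pass. This only applies when the number and size of the partial results are known before the loop starts.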

sklam added the "discussion" label (An issue requiring discussion) on Jul 1, 2019
