storage variables inside functions do not work in nopython mode #4246

Open
SumNeuron opened this issue Jul 1, 2019 · 4 comments
Labels
discussion An issue requiring discussion

Comments

@SumNeuron

A common coding convention is to use a "storage" variable to collect results inside of a function, e.g.

def foo(some_var):
    results = []
    for some_el in some_var:
        results.append(some_el)
    return results

Numba supports numpy arrays, but it seems this is not the case for multidimensional arrays.

# this does not work
@njit(int64[:,:](int64))
def bar(number):
    results = np.array([], dtype=np.int64).reshape(-1, 2)
    for i in range(number):
        results = np.concatenate(results, np.array([[number, number + 1]]))
    return results

# this works
@njit(int64[:,:](int64, int64[:,:]))
def baz(number, results):
    # results = np.array([], dtype=np.int64).reshape(-1, 2)
    for i in range(number):
        results = np.concatenate(results, np.array([[number, number + 1]]))
    return results

this forces a weird wrapper requirement like:

@jit(forceobj=True)
def fuzz(number):
    # make this here because numba no like
    results = np.array([], dtype=np.int64).reshape(-1, 2)
    return baz(number, results)

because although numpy arrays are allowed as default arguments, numba does not like not receiving the optional argument

@njit(int64[:,:](int64, int64[:,:]))
def buzz(number, results=np.array([], dtype=np.int64).reshape(-1, 2)):
    for i in range(number):
        results = np.concatenate(results, np.array([[number, number + 1]]))
    return results

buzz(10) #<--- throws error

please fix

@sklam
Member

sklam commented Jul 1, 2019

The problem in

# this does not work
@njit(int64[:,:](int64))
def bar(number):
    results = np.array([], dtype=np.int64).reshape(-1, 2)

is due to the empty list [] being untyped. We should consider giving the empty list a special type, as it is often used by itself. A weird but working alternative is results = np.array([0][:0]).reshape(-1, 2). This directly works around the issue of the untyped [] by forcing the type to be typeof(0) and then slicing it down to an empty list. A more sensible workaround is to use a different array constructor, i.e. results = np.empty(0, dtype=int64).reshape(-1, 2).
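As a minimal sketch of the "different array constructor" workaround, the version of bar below builds the empty (0, 2) array with np.empty and an explicit shape, so no untyped [] literal is involved. It assumes that passing a shape tuple to np.empty is acceptable here (the comment above uses np.empty(0, ...).reshape(-1, 2) instead), it drops the explicit signature, and it includes a plain-Python fallback for njit purely so the snippet still runs where Numba is not installed.

```python
import numpy as np

try:
    from numba import njit
except ImportError:
    # Fallback (assumption for illustration): run as plain Python without Numba.
    def njit(func):
        return func


@njit
def bar(number):
    # An explicit shape and dtype give the array a concrete type up front.
    results = np.empty((0, 2), dtype=np.int64)
    for i in range(number):
        # Build one (1, 2) row without going through a list literal.
        row = np.empty((1, 2), dtype=np.int64)
        row[0, 0] = number
        row[0, 1] = number + 1
        # np.concatenate takes a tuple of arrays as its first argument.
        results = np.concatenate((results, row))
    return results
```

Calling bar(0) should return an array of shape (0, 2), and bar(3) an array of shape (3, 2).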

The other reported error:

@njit(int64[:,:](int64, int64[:,:]))
def buzz(number, results=np.array([], dtype=np.int64).reshape(-1, 2)):
    for i in range(number):
        # slightly modified to add the missing parentheses so that `np.array([[number...` is not treated
        # as the `axis` arg
        results = np.concatenate((results, np.array([[number, number + 1]])))
    return results

buzz(10) #<--- throws error

is caused by the signature. The provided type signature does not account for the second argument being optional with a default value. It works when the user-provided signature is omitted; Numba will then infer the signature to be (int64, omitted(default=array([], shape=(0, 2), dtype=int64))).

@SumNeuron
Author

SumNeuron commented Jul 1, 2019

@sklam

re: empty list
whatever works to get around this would be nice. Until implemented, perhaps add this as an example in docs.

re: function signature
is there then a way to tell numba in the signature both what it will be as well as that it is optional? To my knowledge omitted is not a numba type.

@njit([
    int64[:, :](int64, omitted(int64[:,:]))
])

# or

@njit([
    int64[:, :](int64, ?int64[:,:])  # <--- use ? to denote optional / omitted value
])

or shouldn't numba not care how the array is initialized because I return it, and in the user provided function signature I state what it will be?

@sklam
Member

sklam commented Jul 1, 2019

It's best to just leave out the signature. Just a bare @njit:

@njit
def buzz(number, results=np.array([], dtype=np.int64).reshape(-1, 2)):
    ...

sklam added the "discussion" label Jul 1, 2019
@SumNeuron
Author

@sklam following on from this discussion, what is the most performant way to use this convention with prange? E.g.

@njit
def foo(iterable):
    # empty list with shape of thing we want to store
    results = np.array([0][:0]).reshape(-1, 3)
    for i in prange(len(iterable)):
        # some things happen here
        # maybe a call to a helper njit function
        partial = _helper(iterable[i]) 
        
        # this probably has a cleaner solution
        results = np.concatenate((results, np.array([partial]))) # <--- this feels wrong
        # because np.append is not supported - as well as axis - we have to remake 
        # an array rather than just to stack the results
    return results
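One possible alternative to repeated concatenation, offered as a sketch and not as an answer from the maintainers: when the number of rows is known in advance, the output can be preallocated and each prange iteration can write only to its own row. The _helper below is a hypothetical stand-in that returns three integers, the input is assumed to be a 1-D int64 array, and the fallback for njit and prange exists only so the snippet runs where Numba is not installed.

```python
import numpy as np

try:
    from numba import njit, prange
except ImportError:
    # Fallback (assumption for illustration): run as plain Python without Numba.
    prange = range

    def njit(*args, **kwargs):
        if len(args) == 1 and callable(args[0]) and not kwargs:
            return args[0]
        return lambda func: func


@njit
def _helper(x):
    # Hypothetical helper: returns three values derived from one element.
    return x, x + 1, 2 * x


@njit(parallel=True)
def foo(iterable):
    n = len(iterable)
    # Preallocate one row per input element instead of growing the array.
    results = np.empty((n, 3), dtype=np.int64)
    for i in prange(n):
        a, b, c = _helper(iterable[i])
        # Each iteration writes only to row i, so iterations do not conflict.
        results[i, 0] = a
        results[i, 1] = b
        results[i, 2] = c
    return results
```

For example, foo(np.arange(4, dtype=np.int64)) fills a (4, 3) array whose row i holds the three values computed from element i. This only applies when every iteration yields a fixed-size result; a variable number of rows per iteration would need a different approach.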
