Nested jit annotations #3332

ghost · 2018-09-23T18:26:03Z

I have code with a structure like the example below:

import numba

@numba.jit()
def test():
  state = 0

  @numba.jit(nopython=True)
  def sub(param):
    nonlocal state
    state += param

  for i in range(100):
    param = i # Assume this line is nopython incompatible code
    sub(param)

  print(state)

Because of the nonlocal state variable, the test function has to be jitted. Otherwise, I would just remove the first annotation, because what really matters is the sub function. Therefore, I wanted to enable nopython mode for it, which unfortunately fails with an error:

  @numba.jit(nopython=True)
  ^
[1] During: lowering "$0.11 = make_function(name=$const0.10, code=<code object fast_sub at 0x0000028ABF11E6F0, file ".../scratches/scratch.py", line 89>, closure=$0.8, defaults=None)" at .../scratches/scratch.py (89)
-------------------------------------------------------------------------------

In the real code, there are lots of state variables. If there is no better solution, I will probably just use a list which I can pass around to share the state. Nested numba.jit annotations would be much more convenient though. What do you think?

stuartarchibald · 2018-09-24T11:46:21Z

Thanks for the report. On Numba 0.40.0, this seems to just work under nopython mode as closures are simply inlined:

from numba import njit

def test():
  state = 0

  def sub(param):
    nonlocal state
    state += param

  for i in range(100):
    param = i
    sub(param)

  print(state)

print(test())
print(njit()(test)())

output:

$ python issue3332.py
4950
None
4950
None

If objectmode is forced it does fail with:

Numba encountered the use of a language feature it does not support in this context: <creating a function from a closure> (op code: make_function not supported). If the feature is explicitly supported it is likely that the result of the expression is being used in an unsupported manner.

File "issue3332.py", line 6:

  def sub(param):
  ^

I think this happens because the looplifting pass runs before the closure inlining pass. If a closure (make_function opcode) is still present at looplift time, the pipeline is re-entered with the IR mutated by looplifting but with the illegal make_function still in it, so IR legalization fails. Switching the order of these two passes in the pipeline seems to fix it.

stuartarchibald · 2018-09-24T11:48:48Z

As to your actual query about state etc.: Numba 0.40.0 has a new feature, objectmode contexts, which run bits of nopython mode functions in objectmode. Does this help your situation? http://numba.pydata.org/numba-doc/latest/user/withobjmode.html

ghost · 2018-09-24T12:15:10Z

@stuartarchibald I have just upgraded to 0.40.0, but still have the same problem. Have you seen that there is a second annotation on the sub function? Also, I cannot use nopython mode for the test function itself. I will have a look at objectmode later.

stuartarchibald · 2018-09-24T12:48:44Z

Yes, I saw that. It's currently illegal/unsupported behaviour, so I removed it, as it looked like it would inline fine under njit via the closure inlining pass. I guess your actual function won't do this, hence the problem? It sounds like you want the opposite of what with objmode does in nopython mode, i.e. to run the function in objectmode but use nopython mode in a couple of places? If most of your function would compile under nopython mode, then with objmode will probably help get the bits that won't work to run.

ghost · 2018-09-27T14:42:54Z

I was able to solve the problem with objmode. Just curious, why is it necessary to provide the return types when they can be figured out automatically for function arguments?

Also, I ran into a limitation. I have a pandas data frame with different data types (floats and bools) for which I want to use the numpy array in the numba optimised code. numba.typeof isn't able to return a data type string for it. I guess mixed arrays are not supported yet, right?

ghost · 2018-09-27T15:28:36Z

One more idea. For my use case, the following pattern would be very useful:

from numba import njit, objmode

def gen():
  while True:
    yield 1

@njit
def test():
  with objmode(g="object"):
    g = gen()

  while True:
    with objmode(val="int"):
      val = next(g)

    # Do something with val

The variable g could not be used inside the nopython code, but in other objmode sections.

ghost · 2018-09-27T15:39:39Z

One last question: Is it possible to return a list or tuple of arrays from objmode? I have tried list(array(float64, 1d, C)), tuple(array(float64, 1d, C)), list(float64[:]), tuple(float64[:]) but none worked.

stuartarchibald · 2018-09-27T16:59:03Z

Just curious, why is it necessary to provide the return types when they can be figured out automatically for function arguments?

Perhaps you may be running some function in object mode where type inference can't follow what would be returned? This is also an experimental feature, things may change :)

Also, I ran into a limitation. I have a pandas data frame with different data types (floats and bools) for which I want to use the numpy array in the numba optimised code. numba.typeof isn't able to return a data type string for it. I guess mixed arrays are not supported yet, right?

Think this comes out as a NumPy array of dtype object, which is not supported. You could coerce the dataframe backing array into a dtype:

In [16]: d = pd.DataFrame(data = {'col1':[1., 2.], 'col2':[np.bool(1), np.bool(0)]})

In [17]: d.values.dtype
Out[17]: dtype('O')

In [18]: pd.DataFrame(d, dtype=np.float64).values.dtype
Out[18]: dtype('float64')

but this obviously incurs cost. Another option is to use the to_records() method on the DataFrame to get a NumPy recarray which may be recognised by Numba, again, cost incurred, not sure how efficient it'd be. It'd probably be most efficient to just partition your columns by type so that e.g. Numba would just see primitive types through the use of multiple args (one for each column/homogeneously typed data set).

One more idea. For my use case, the following pattern would be very useful:

Thanks, IIRC there are some plans on the horizon for thinking about pass through cases.

One last question: Is it possible to return a list or tuple of arrays from objmode? I have tried list(array(float64, 1d, C)), tuple(array(float64, 1d, C)), list(float64[:]), tuple(float64[:]) but none worked.

Is this the sort of thing you are after? :

from numba import njit, objmode
import numpy as np

@njit
def test():
    with objmode(val='List(float64[:])'):
        val = [np.arange(10.), np.ones(4)]
    return val

print(test())

or do you mean you want the return statement in the objmode block (not supported!)?

ghost · 2018-09-27T17:29:40Z

That all makes sense.

or do you mean you want the return statement in the objmode block (not supported!)?

No, this is exactly what I wanted. List(...) does not seem to be documented here. Therefore, I just tried to use whatever numba.typeof returned but this didn't work either. How can I figure out the type string which I have to use when it is missing in the documentation? Tuples, for example, would be interesting too.

stuartarchibald · 2018-09-27T18:33:46Z

hmmm, that should probably be documented, thanks for raising it, I've opened a ticket #3349. Basically, whatever string you write gets eval'd with numba.types as globals. So if you are looking for a type, that's the place to look. Here's a homogeneous 2-tuple of float64 1D arrays.

from numba import njit, objmode
import numpy as np

@njit
def test():
    with objmode(val='UniTuple(float64[:], 2)'):
        val = (np.arange(10.), np.ones(4))
    return val

print(test())

ghost · 2018-09-27T18:43:44Z

I see. Thank you very much for your excellent help!

stuartarchibald · 2018-09-27T18:48:50Z

No problem, thanks for using Numba :)

ghost · 2018-09-27T20:17:13Z

Sorry, I have one more question: It seems like to_records() is supported for my mixed array. At least I can pass such an array to a nopython function. The following should describe the array:

print(vals.shape)  # (10,)
print(vals.dtype)  # (numpy.record, [('index', '<i8'), ('a', '<f8'), ('b', '?')])
print(numba.typeof(vals))  # unaligned array(Record([('index', '<i8'), ('a', '<f8'), ('b', '|b1')]), 1d, C)
print(numba.from_dtype(vals.dtype))  # Record([('index', '<i8'), ('a', '<f8'), ('b', '|b1')])

I have again tried various type string combinations for objmode, but could not get it running. Would you mind explaining in more detail how I can determine the type string from the information above?

stuartarchibald · 2018-09-28T12:28:42Z

hmmm, this was hard. I'm not hugely familiar with the recarray impl in Numba so there may be a better way. Independent of this, the str const constraint makes it hard to deal with more advanced types, I'll raise this at the next core developer meeting (but also acknowledge that this is a new, under development and generally experimental feature).

from numba import njit, objmode, typeof, from_dtype, types, numpy_support
import numpy as np
from pandas import DataFrame

df = DataFrame(data = {'col1':[1., 2.], 'col2':[np.bool(1), np.bool(0)]})

pdrec = df.to_records()
dt = numpy_support.from_struct_dtype(pdrec.dtype)

def rec2str(rec):
    attrs = ['descr', 'fields', 'size', 'aligned']
    subsmap = {}
    for x in attrs:
        subsmap[x] = str(getattr(rec, x))
    subsmap['dtype'] = rec.dtype.descr
    template = "Record(\"{descr}\", {fields}, {size}, {aligned}, {dtype})"
    ret = template.format(**subsmap)
    # make sure it's valid
    eval(ret, {}, types.__dict__)
    return ret.replace('"','\\"')

# This gives the record type to paste in the `objmode` type annotation.
print("Formatted str const: %s" % rec2str(dt))

@njit
def test_record_get(recarr):
    with objmode(f="Record(\"[('index', '<i8'), ('col1', '<f8'), ('col2', '|b1')]\", {'index': (int64, 0), 'col1': (float64, 8), 'col2': (bool, 16)}, 17, False, [('index', '<i8'), ('col1', '<f8'), ('col2', '|b1')])"):
        f = recarr[1]
    return f

@njit
def test_record_slice(recarr):

    with objmode(g= "Array(Record(\"[('index', '<i8'), ('col1', '<f8'), ('col2', '|b1')]\", {'index': (int64, 0), 'col1': (float64, 8), 'col2': (bool, 16)}, 17, False, [('index', '<i8'), ('col1', '<f8'), ('col2', '|b1')]), 1, 'C')"):
        g = recarr[1:]
    return g

print(test_record_get(pdrec))
print(test_record_slice(pdrec))

ping @sklam any ideas for a better way?

sklam · 2018-09-28T13:41:33Z

(replying to #3332 (comment))

It's definitely too difficult to use. This is where we need to do something like with objmode(g=typeof(recarra))

ghost · 2018-09-28T14:16:12Z

@sklam Yes, this would be useful (at first, I even thought that I could use the current numba.typeof for this). However, I still think numba should just figure out the type by itself, as it does for function arguments. Then users don't have to deal with it at all.

stuartarchibald added the bug label Sep 24, 2018

sklam mentioned this issue Sep 28, 2018

With-objmode usability issue for long type string. #3356

Closed

stuartarchibald added the objectmode object mode related issue label Dec 15, 2021
