Generator iterator fails from within NPM function if created outside the NPM function #3061

Open
jllanfranchi opened this issue Jun 27, 2018 · 11 comments

Comments

@jllanfranchi

jllanfranchi commented Jun 27, 2018

Feature request (but possibly a bug?)

Note: I'm using numba 0.38.0 in Python 2.7.12

Creating the generator iterator from a generator within the nopython function, then calling it within that function, works:

import numpy as np
from numba import njit

@njit
def gen():
    for _ in range(10):
        yield np.random.rand(5)

@njit
def callit():
    g = gen()
    return next(g)

callit() # succeeds

But creating the generator iterator from the generator outside the nopython function that calls it fails:

@njit
def gen():
    for _ in range(10):
        yield np.random.rand(5)

g = gen()

@njit
def callit():
    return next(g)

callit() # fails

With error message:

---------------------------------------------------------------------------
TypingError                               Traceback (most recent call last)
<ipython-input-68-05f5a5c324b2> in <module>()
     10     return next(g)
     11 
---> 12 callit()

/home/justin/anaconda2/lib/python2.7/site-packages/numba/dispatcher.pyc in _compile_for_args(self, *args, **kws)
    342                 raise e
    343             else:
--> 344                 reraise(type(e), e, None)
    345         except errors.UnsupportedError as e:
    346             # Something unsupported is present in the user code, add help info

/home/justin/anaconda2/lib/python2.7/site-packages/numba/six.pyc in reraise(tp, value, tb)

TypingError: Failed at nopython (nopython frontend)
Untyped global name 'g': cannot determine Numba type of <type '_dynfunc._Generator'>

File "<ipython-input-68-05f5a5c324b2>", line 10:
def callit():
    return next(g)
    ^

This use case would help when a generator needs costly (CPU and/or memory) initialization that uses arguments Numba has difficulty handling, and I think it keeps the code quite clean.

My specific use case is that I want to initialize a fixed number of nopython-mode generators where the user can provide these to my function (note that each generator requires different arguments for its initialization, so their arguments have to be handled outside of nopython code). These generators are then handed to my nopython function that just calls next(gen0), next(gen1), etc.

The best alternative I see would be to use a factory function to define a nopython mode function after the initialization is performed in "regular" Python that then bakes-in the (potentially large) results of the initialization at the time it is compiled. While this works, it would be nice to be able to achieve this the more "straightforward" way.
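A minimal sketch of the factory-function alternative described above, assuming the result of the initialization is a NumPy array. The names `make_lookup`, `table`, and `lookup` are hypothetical and not from the original report. It relies on Numba treating closure variables as compile-time constants; the `try`/`except` fallback only lets the sketch run where Numba is not installed.

```python
import numpy as np

try:
    from numba import njit
except ImportError:
    # Fallback so the sketch still runs without Numba (no compilation happens).
    def njit(func):
        return func


def make_lookup(table):
    """Hypothetical factory: build a nopython function around `table`."""
    # `table` is computed in regular Python. The jitted closure below
    # captures it, and Numba freezes it as a constant at compile time.
    @njit
    def lookup(i):
        return table[i] * 2.0

    return lookup


# The "expensive" initialization happens here, outside nopython code.
table = np.arange(5, dtype=np.float64)
lookup = make_lookup(table)
result = lookup(3)
```

Because the array is baked in when `lookup` is compiled, a new call to `make_lookup` is needed whenever the initialization result changes.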

@stuartarchibald
Contributor

Thanks for the report. This isn't yet supported. Also, of concern is that it segfaults for me.

$ gdb -ex=r --args python issue3061.py 
GNU gdb (GDB) Red Hat Enterprise Linux 7.6.1-110.el7
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from <numba_env>/numba_latest/bin/python3.6...done.
Starting program: <numba_env>/numba_latest/bin/python issue3061.py
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Missing separate debuginfo for <numba_env>/numba_latest/lib/python3.6/site-packages/numpy/../../../libiomp5.so
Detaching after fork from child process 9615.
Detaching after fork from child process 9617.
Detaching after fork from child process 9618.
Detaching after fork from child process 9619.
Traceback (most recent call last):
  File "issue3061.py", line 15, in <module>
    callit() # fails
  File "<path>numba/dispatcher.py", line 349, in _compile_for_args
    error_rewrite(e, 'typing')
  File "<path>numba/dispatcher.py", line 314, in error_rewrite
    raise e
  File "<path>numba/dispatcher.py", line 325, in _compile_for_args
    return self.compile(tuple(argtypes))
  File "<path>numba/dispatcher.py", line 653, in compile
    cres = self._compiler.compile(args, return_type)
  File "<path>numba/dispatcher.py", line 83, in compile
    pipeline_class=self.pipeline_class)
  File "<path>numba/compiler.py", line 873, in compile_extra
    return pipeline.compile_extra(func)
  File "<path>numba/compiler.py", line 367, in compile_extra
    return self._compile_bytecode()
  File "<path>numba/compiler.py", line 804, in _compile_bytecode
    return self._compile_core()
  File "<path>numba/compiler.py", line 791, in _compile_core
    res = pm.run(self.status)
  File "<path>numba/compiler.py", line 253, in run
    raise patched_exception
  File "<path>numba/compiler.py", line 245, in run
    stage()
  File "<path>numba/compiler.py", line 459, in stage_nopython_frontend
    self.locals)
  File "<path>numba/compiler.py", line 974, in type_inference_stage
    infer.build_constraint()
  File "<path>numba/typeinfer.py", line 816, in build_constraint
    self.constrain_statement(inst)
  File "<path>numba/typeinfer.py", line 1016, in constrain_statement
    self.typeof_assign(inst)
  File "<path>numba/typeinfer.py", line 1079, in typeof_assign
    self.typeof_global(inst, inst.target, value)
  File "<path>numba/typeinfer.py", line 1177, in typeof_global
    typ = self.resolve_value_type(inst, gvar.value)
  File "<path>numba/typeinfer.py", line 1100, in resolve_value_type
    raise TypingError(msg, loc=inst.loc)
numba.errors.TypingError: Failed at nopython (nopython frontend)
Untyped global name 'g': cannot determine Numba type of <class '_dynfunc._Generator'>

File "issue3061.py", line 13:
def callit():
    return next(g)
    ^


Program received signal SIGSEGV, Segmentation fault.
0x00007ffff7fe9230 in ?? ()
Missing separate debuginfos, use: debuginfo-install libuuid-2.23.2-52.el7.x86_64
(gdb) bt
#0  0x00007ffff7fe9230 in ?? ()
#1  0x00007fffdf690293 in generator_clear ()
   from <path>numba/_dynfunc.cpython-36m-x86_64-linux-gnu.so
#2  0x00005555556bcf32 in collect ()
#3  0x000055555574762a in _PyGC_CollectNoFail ()
#4  0x00005555556fdc16 in PyImport_Cleanup ()
#5  0x0000555555762411 in Py_FinalizeEx ()
#6  0x000055555576c7d6 in Py_Main ()
#7  0x00005555556344be in main ()

@Jul3k

Jul3k commented Mar 2, 2020

@stuartarchibald, has there been an update on this, or is there a workaround so that a generator can be passed to an njit function?

@stuartarchibald
Contributor

@Jul3k just from memory, I'm pretty sure this still won't work.

@Jul3k

Jul3k commented Mar 2, 2020

@stuartarchibald this explains why my code also fails with "cannot determine Numba type of <class '_dynfunc._Generator'>". I was just wondering whether this can be solved by manually specifying the function argument type, or whether it is a deeper problem?

from numba import njit

@njit
def gen(N):
    for i in range(N):
        yield i

@njit
def gen2(g):
    for i in g:
        yield i

list(gen2(gen(10)))  # fails: cannot determine Numba type of the generator object

@stuartarchibald
Contributor

It's because Numba doesn't know about generator objects: it has no way to translate them from Python into a native representation. Even if you spelled out the type, Numba wouldn't know what to do with it.

Is there a specific need for this? Or is it just convenient?

@Jul3k

Jul3k commented Mar 2, 2020

Well, I am generating pulse sequences for stepper motors on a Raspberry Pi. Those pulse sequences are very long (>200,000 pulses), so I used generators to calculate chunks of 20 ms sequences with around 2000 pulses each and write them to a buffer. The first type of generator calculates the pulse times and directions for different types of movements. The second type of generator calculates the delays and breaks them into chunks. Not being able to pass a generator means that I have to write an implementation of the second type of generator for every generator of the first type.

I would say passing a generator makes sense in cases where a generator acts on another group of generators. I also found an unsolved Stack Overflow question on that topic: "generator argument in numba".

@stuartarchibald
Contributor

I see, thanks. This would be useful to have working for the use case above. As an interim measure, does writing factory functions help, i.e. generating the generator?

@Jul3k

Jul3k commented Mar 3, 2020

I am not quite sure what you mean by a factory function. Can you please explain how that would look? Since generators work when they are created inside the function, this worked for me as a workaround:

from numba import njit

@njit
def gen(N):
    for i in range(N):
        yield i

@njit
def gen2(gen, gen_args):
    # Create the generator inside nopython code from the function and its arguments.
    g = gen(*gen_args)
    for i in g:
        yield i*2

list(gen2(gen, (10,)))

@Jul3k

Jul3k commented Mar 3, 2020

Do you mean something like the following?

from numba import njit

@njit
def fgen(N):
    def gen():
        for i in range(N):
            yield i
    return gen

@njit
def gen2(gen):
    g = gen()
    for i in g:
        yield i*2

list(gen2(fgen(10)))

This code fails with: Cannot capture the non-constant value associated with variable 'N' in a function that will escape.

@stuartarchibald
Contributor

I was thinking along the lines of basically creating specialisations via some "factory" to avoid some issues/unimplemented things in Numba. I had a play about with this and I've found at least two new issues with creating escaping functions from closures, but no way better than the workaround you describe, so I think that's the best option for now.

@Jul3k

Jul3k commented Mar 4, 2020

The workaround is fine for me. I just want to point out that it will not work for other use cases, e.g. when one generator should be passed to two subsequent generators acting on it, because the generator is created inside the consuming generator and its state cannot be passed on to the next one.
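One way to picture sharing state between two consumers, as a rough sketch only and not something proposed in this thread: replace the generator with an explicit state array that each nopython function advances. The names `next_value`, `stage_a`, `stage_b`, and `state` are hypothetical, and the `try`/`except` fallback only lets the sketch run where Numba is not installed.

```python
import numpy as np

try:
    from numba import njit
except ImportError:
    # Fallback so the sketch still runs without Numba (no compilation happens).
    def njit(func):
        return func


@njit
def next_value(state):
    # state[0] holds the next counter value; advancing it in place
    # plays the role of calling next() on a shared generator.
    value = state[0]
    state[0] += 1
    return value


@njit
def stage_a(state):
    # First consumer: doubles the value it pulls from the shared state.
    return next_value(state) * 2


@njit
def stage_b(state):
    # Second consumer: continues where stage_a left off.
    return next_value(state) + 100


state = np.zeros(1, dtype=np.int64)
first = stage_a(state)
second = stage_b(state)
```

The array is passed by reference, so both consumers see the same position. The trade-off is that the iteration logic has to be written by hand instead of with `yield`.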
