-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Accept Numpy (1.17+) Generator and BitGenerator objects in jitted code #4499
Comments
I've looked into this issue and it might require #2823 to be able to interact with via cffi with the new code. |
Any updates on this issue? |
Friendly ping that this is an important issue for aesara/aeppl. |
I managed to wire this up (demo is at the bottom), the outcome of all this is that from numba import njit
import numpy as np
from numba.core import types, cgutils
from numba.core.pythonapi import unbox, NativeValue
from numba.core.extending import (register_model, models, typeof_impl,
overload_method, intrinsic, overload,
make_attribute_wrapper, register_jitable)
from numba.np.numpy_support import from_dtype, as_dtype
from llvmlite import ir
class NumPyRandomBitGeneratorType(types.Type):
def __init__(self, *args, **kwargs):
super(NumPyRandomBitGeneratorType, self).__init__(*args, **kwargs)
self.name = 'NumPyRandomBitGeneratorType'
class NumPyRandomGeneratorType(types.Type):
def __init__(self, *args, **kwargs):
super(NumPyRandomGeneratorType, self).__init__(*args, **kwargs)
self.name = 'NumPyRandomGeneratorType'
@typeof_impl.register(np.random.bit_generator.BitGenerator)
def typeof_numpy_random_bitgen(val, c):
return NumPyRandomBitGeneratorType(val)
@typeof_impl.register(np.random.Generator)
def typeof_random_generator(val, c):
return NumPyRandomGeneratorType(val)
@register_model(NumPyRandomBitGeneratorType)
class NumPyRngBitGeneratorModel(models.StructModel):
def __init__(self, dmm, fe_type):
members = [
('parent', types.pyobject),
('state_address', types.uintp),
('state', types.uintp),
('fnptr_next_uint64', types.uintp),
('fnptr_next_uint32', types.uintp),
('fnptr_next_double', types.uintp),
('bit_generator', types.uintp),
]
super(NumPyRngBitGeneratorModel, self).__init__(dmm, fe_type, members)
_bit_gen_type = NumPyRandomBitGeneratorType('bit_generator')
@register_model(NumPyRandomGeneratorType)
class NumPyRandomGeneratorTypeModel(models.StructModel):
def __init__(self, dmm, fe_type):
members = [
('bit_generator', _bit_gen_type),
]
super(
NumPyRandomGeneratorTypeModel,
self).__init__(
dmm,
fe_type,
members)
@unbox(NumPyRandomBitGeneratorType)
def unbox_numpy_random_bitgenerator(typ, obj, c):
# The bit_generator instance has a `.ctypes` attr which is a namedtuple
# with the following members (types):
# * state_address (Python int)
# * state (ctypes.c_void_p)
# * next_uint64 (ctypes.CFunctionType instance)
# * next_uint32 (ctypes.CFunctionType instance)
# * next_double (ctypes.CFunctionType instance)
# * bit_generator (ctypes.c_void_p)
struct_ptr = cgutils.create_struct_proxy(typ)(c.context, c.builder)
struct_ptr.parent = obj
c.pyapi.incref(obj) # ? need to hold ref to the underlying python obj
# Get the .ctypes attr
ctypes_binding = c.pyapi.object_getattr_string(obj, 'ctypes')
# Look up the "state_address" member and wire it into the struct
interface_state_address = c.pyapi.object_getattr_string(
ctypes_binding, 'state_address')
setattr(struct_ptr, 'state_address',
c.unbox(types.uintp, interface_state_address).value)
# Look up the "state" member and wire it into the struct
interface_state = c.pyapi.object_getattr_string(ctypes_binding, 'state')
interface_state_value = c.pyapi.object_getattr_string(
interface_state, 'value')
setattr(
struct_ptr,
'state',
c.unbox(
types.uintp,
interface_state_value).value)
# Want to store callable function pointers to these CFunctionTypes, so
# import ctypes and use it to cast the CFunctionTypes to c_void_p and
# store the results.
# First find ctypes.cast, and ctypes.c_void_p
ctypes_name = c.context.insert_const_string(c.builder.module, 'ctypes')
ctypes_module = c.pyapi.import_module_noblock(ctypes_name)
ct_cast = c.pyapi.object_getattr_string(ctypes_module, 'cast')
ct_voidptr_ty = c.pyapi.object_getattr_string(ctypes_module, 'c_void_p')
# This wires in the fnptrs refered to by name
def wire_in_fnptrs(name):
# Find the CFunctionType function
interface_next_fn = c.pyapi.object_getattr_string(
ctypes_binding, name)
# Want to do ctypes.cast(CFunctionType, ctypes.c_void_p), create an
# args tuple for that.
args = c.pyapi.tuple_pack([interface_next_fn, ct_voidptr_ty])
# Call ctypes.cast()
interface_next_fn_casted = c.pyapi.call(ct_cast, args)
# Fetch the .value attr on the resulting ctypes.c_void_p for storage
# in the function pointer slot.
interface_next_fn_casted_value = c.pyapi.object_getattr_string(
interface_next_fn_casted, 'value')
# Wire up
setattr(struct_ptr, f'fnptr_{name}',
c.unbox(types.uintp, interface_next_fn_casted_value).value)
wire_in_fnptrs('next_double')
wire_in_fnptrs('next_uint64')
wire_in_fnptrs('next_uint32')
# This is the same as the `state` member but its the bit_generator address,
# it's probably never needed.
interface_bit_generator = c.pyapi.object_getattr_string(
ctypes_binding, 'bit_generator')
interface_bit_generator_value = c.pyapi.object_getattr_string(
interface_bit_generator, 'value')
setattr(
struct_ptr,
'bit_generator',
c.unbox(
types.uintp,
interface_bit_generator_value).value)
return NativeValue(struct_ptr._getvalue())
@unbox(NumPyRandomGeneratorType)
def unbox_numpy_random_generator(typ, obj, c):
struct_ptr = cgutils.create_struct_proxy(typ)(c.context, c.builder)
bit_gen_inst = c.pyapi.object_getattr_string(obj, 'bit_generator')
unboxed = c.unbox(_bit_gen_type, bit_gen_inst).value
struct_ptr.bit_generator = unboxed
return NativeValue(struct_ptr._getvalue())
# The Generator instances have a bit_generator attr
make_attribute_wrapper(
NumPyRandomGeneratorType,
'bit_generator',
'bit_generator')
# Generate the overloads for "next_(some type)" functions
def generate_next_binding(overloadable_function, return_type):
@intrinsic
def intrin_NumPyRandomBitGeneratorType_next_ty(tyctx, inst):
sig = return_type(inst)
def codegen(cgctx, builder, sig, llargs):
name = overloadable_function.__name__
struct_ptr = cgutils.create_struct_proxy(inst)(cgctx, builder,
value=llargs[0])
# Get the 'state' and 'fnptr_next_(type)' members of the struct
state = struct_ptr.state
next_double_addr = getattr(struct_ptr, f'fnptr_{name}')
# LLVM IR types needed
ll_void_ptr_t = cgctx.get_value_type(types.voidptr)
ll_return_t = cgctx.get_value_type(return_type)
ll_uintp_t = cgctx.get_value_type(types.uintp)
# Convert the stored Generator function address to a pointer
next_fn_fnptr = builder.inttoptr(
next_double_addr, ll_void_ptr_t)
# Add the function to the module
fnty = ir.FunctionType(ll_return_t, (ll_uintp_t,))
next_fn = cgutils.get_or_insert_function(
builder.module, fnty, name)
# Bit cast the function pointer to the function type
hack = builder.bitcast(next_fn_fnptr, next_fn.type)
# call it with the "state" address as the arg
ret = builder.call(hack, (state,))
return ret
return sig, codegen
@overload(overloadable_function)
def ol_next_ty(bitgen):
if isinstance(bitgen, NumPyRandomBitGeneratorType):
def impl(bitgen):
return intrin_NumPyRandomBitGeneratorType_next_ty(bitgen)
return impl
# Some function stubs for "next(some type)", these will be overloaded
def next_double(bitgen):
return bitgen.ctypes.next_double(bitgen.ctypes.state)
def next_uint32(bitgen):
return bitgen.ctypes.next_uint32(bitgen.ctypes.state)
def next_uint64(bitgen):
return bitgen.ctypes.next_uint64(bitgen.ctypes.state)
generate_next_binding(next_double, types.double)
generate_next_binding(next_uint32, types.uint32)
generate_next_binding(next_uint64, types.uint64)
# From:
# https://github.com/numpy/numpy/blob/7cfef93c77599bd387ecc6a15d186c5a46024dac/numpy/random/src/distributions/distributions.c#L20
@register_jitable
def next_float(bitgen):
return (next_uint32(bitgen) >> 9) * (1.0 / 8388608.0)
# Overload the Generator().random() method
@overload_method(NumPyRandomGeneratorType, 'random')
def ol_NumPyRandomGeneratorType_random(inst, size=None, dtype=np.float64):
if isinstance(size, types.Omitted):
size = size.value
if isinstance(dtype, types.Omitted):
dtype = dtype.value
if not isinstance(dtype, types.Type):
dt = np.dtype(dtype)
nb_dt = from_dtype(dt)
np_dt = dtype
else:
nb_dt = dtype
np_dt = as_dtype(nb_dt)
np_dt = np.dtype(np_dt)
if np_dt == np.float32:
next_func = next_float
else:
next_func = next_double
if isinstance(size, (types.NoneType,)) or size is None:
def impl(inst, size=None, dtype=np.float64):
return nb_dt(next_func(inst.bit_generator))
return impl
else:
def impl(inst, size=None, dtype=np.float64):
out = np.empty(size, dtype=dtype)
for i in range(out.size):
out[i] = next_func(inst.bit_generator)
return out
return impl
# ------------------------------------------------------------------------------
# DEMO
# ------------------------------------------------------------------------------
@njit
def foo(x, y):
print(y.random())
print(y.random(size=(5,)))
print(y.random(size=(5,), dtype=np.float32))
# Const seed
seed = 1
print("Numba")
rng_instance = np.random.default_rng(seed=seed)
foo(rng_instance.bit_generator, rng_instance)
print("Python")
rng_instance = np.random.default_rng(seed=seed)
foo.py_func(rng_instance.bit_generator, rng_instance) Questions:
CC @luk-f-a @brandonwillard @kc611 (and feel free to ping anyone else interested!). Hope this helps folks get started on implementing this. Also, I have to thank NumPy devs for creating the |
@stuartarchibald, this looks great!
As far as I can tell, yes, definitely!
We've only recently added support for "in-graph" construction of RNG states (see aesara-devs/aesara#789), but we currently make use of "shared variables" that hold the underlying RNG objects (i.e. NumPy's
Yes.
I still need to look over this and make sure I understand it, but with a small number of implementations like |
@stuartarchibald, This particular implementation is definitely what's needed to get started on the newer
If we are planning to implement all of the random distributions within Numba itself, I'm pretty sure we can use some of the implementations from Another great feature we'd want to add down the line would be support for the |
Just to make sure I've understood correctly. The implementation I managed to patch in yesterday (#4499 (comment)) for
Unfortunately, I don't think there's a choice about this, the implementation of the distributions in NumPy is in the C language with no bindings exposed:
To which function is |
Thanks, am glad this is along the right path.
Great, thanks for confirming.
OK, working around this for a while might be a good idea. I suspect creating a
Thanks for confirming.
I think I need to perhaps go and read the documentation/implementation of this further. I think I understand a bit more following @kc611's comments above, in that if a |
That's exactly what I meant, but anyways that can just be treated as extra convenience for having
Though I haven't looked into the internals of it yet, I'm pretty sure we can emulate above linked Anyways, for now we can focus our efforts on getting the
It's basically an argument to any of the random methods to determine the size of output array. In fact, I just noticed now, you already have that in your above given implementation: |
@luk-f-a Recent work by @kc611 has been merged with #8031 which implements initial support, #8038 adding some standard distributions, #8040 adding more more advanced distributions, and there's a few more PRs opened to implement a lot of the rest of the functionality (#8041, #8042). What do you think to closing this issue and opening new ones asking for specific features etc in its place? Thanks! |
Is there an issue that tracks the dispatch performance overhead problem (e.g. mentioned here)? While the underlying problem might not be specific to this functionality, it might still be good to have an issue for its realization in the I ask because this is the next big thing I would like to look into. |
I don't think so, please feel free to open one if it's important to your use case.
I suspect the reason the dispatch cost is high is because there's no fast path in the dispatcher for the |
Sorry, I lost track of this, but is general For example, on 0.57.0dev0 I get the following: import numba
import numpy as np
rng = np.random.default_rng(23)
@numba.njit
def sample():
return rng.normal()
sample()
# NumbaNotImplementedError: Failed in nopython mode pipeline (step: native lowering)
# <numba.core.base.OverloadSelector object at 0x7ff03d61f640>, (NumPyRandomGeneratorType,)
# During: lowering "$2load_global.0 = global(rng: Generator(PCG64))" at <ipython-input-6-791ffc0ac1fc> (3) |
Ah, I think Numba can't lower it directly yet. import numba
import numpy as np
rng = np.random.default_rng(23)
@numba.njit
def sample(rng):
return rng.normal()
sample(rng)
# 0.5532605888887387 Also in the current public release (Numba 0.56.0) support for |
I think the general progress is being tracked at: https://github.com/numba/numba/projects/19 |
On a related note, is there an "excluded"-like option for I'm trying to figure out how best to broadcast the scalar |
Nevermind, I checked; there doesn't appear to be one. |
RE: #4499 (comment) @kc611 I think this needs a |
In other words, the internal state of the |
Have been thinking about this a bit more. I think the "state" would have to be "copied" and the working parts of the state are:
As a result, I think that these would survive lowering as a "frozen" global and calls to the RNG etc would still correctly advance the state of the RNG. Essentially the "frozeness" is related to the native representation, which is primarily addresses, not the contents of these addresses. There are a couple of issues though to consider:
|
If the global
Could Numba keep a reference to the Python RNG object? I imagine that a user could just add such a reference to their
A deep copy of the globals might be necessary for caching. Regardless, it sounds like a |
I don't think it's quite a shallow copy in the traditional sense due to the way Numba represents these objects, but I think the semantics are the sufficiently similar, so it's definitely worth a try.
This could potentially work, IIRC there was a report a while back of a similar issue with large arrays in globals.
I think the technique of "finding the function pointers" on deserialisation will be needed, as well as working out how to deep copy the RNG state itself (of which there are a few types IIRC).
I think so too, @kc611 perhaps you'd like to take a look at this if you have some time? Having written the rest of the related code your knowledge of this is ideal! |
Numpy 1.17 introduced a new
random
module, with newGenerator
andBitGenerator
objects. The oldRandomState
is preserved for backwards compatibility. As far as I can tell, the new module does not have any global state, but rather relies on explicit instantiation and passing ofGenerator
instances.Therefore it would be very useful if these instance could be passed to a jitted function. This would not only allow jitted functions to access the faster random number generators, and preserve Numpy overall compatibility but also address #3249.
The text was updated successfully, but these errors were encountered: