
Accept Numpy (1.17+) Generator and BitGenerator objects in jitted code #4499

Open
luk-f-a opened this issue Aug 30, 2019 · 23 comments
@luk-f-a
Contributor

luk-f-a commented Aug 30, 2019

Numpy 1.17 introduced a new random module, with new Generator and BitGenerator objects. The old RandomState is preserved for backwards compatibility. As far as I can tell, the new module does not have any global state, but rather relies on explicit instantiation and passing of Generator instances.
Therefore it would be very useful if these instances could be passed to a jitted function. This would not only give jitted functions access to the faster random number generators and preserve overall NumPy compatibility, but would also address #3249.
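For reference, a minimal sketch of the new-style API: generators are instantiated explicitly and carry their own state, so reproducibility comes from the seed rather than from a global np.random.seed() call.

```python
import numpy as np

# Explicit instantiation: no global state is touched.
rng = np.random.default_rng(seed=42)

x = rng.random()            # scalar uniform draw from this instance
v = rng.standard_normal(3)  # vector draw from the same instance

# Re-creating the Generator with the same seed reproduces the stream.
rng2 = np.random.default_rng(seed=42)
assert rng2.random() == x
```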

@luk-f-a
Contributor Author

luk-f-a commented Jun 16, 2020

I've looked into this issue and it might require #2823 to be able to interact with the new code via cffi.

@Daples

Daples commented Sep 1, 2021

Any updates on this issue?

@luk-f-a
Contributor Author

luk-f-a commented Sep 5, 2021

nothing beyond this: https://numba-how-to.readthedocs.io/en/latest/numpy.html#using-the-new-1-17-numpy-random-generator

@twiecki

twiecki commented Feb 2, 2022

Friendly ping that this is an important issue for aesara/aeppl.

@stuartarchibald
Contributor

I managed to wire this up (demo is at the bottom). The outcome is that numpy.random.Generator and numpy.random.bit_generator.BitGenerator objects can cross the JIT boundary, and their functions can be used to implement @overload(s) for their API within Numba:

from numba import njit
import numpy as np

from numba.core import types, cgutils
from numba.core.pythonapi import unbox, NativeValue
from numba.core.extending import (register_model, models, typeof_impl,
                                  overload_method, intrinsic, overload,
                                  make_attribute_wrapper, register_jitable)
from numba.np.numpy_support import from_dtype, as_dtype
from llvmlite import ir


class NumPyRandomBitGeneratorType(types.Type):
    def __init__(self, *args, **kwargs):
        super(NumPyRandomBitGeneratorType, self).__init__(*args, **kwargs)
        self.name = 'NumPyRandomBitGeneratorType'


class NumPyRandomGeneratorType(types.Type):
    def __init__(self, *args, **kwargs):
        super(NumPyRandomGeneratorType, self).__init__(*args, **kwargs)
        self.name = 'NumPyRandomGeneratorType'


@typeof_impl.register(np.random.bit_generator.BitGenerator)
def typeof_numpy_random_bitgen(val, c):
    return NumPyRandomBitGeneratorType(val)


@typeof_impl.register(np.random.Generator)
def typeof_random_generator(val, c):
    return NumPyRandomGeneratorType(val)


@register_model(NumPyRandomBitGeneratorType)
class NumPyRngBitGeneratorModel(models.StructModel):
    def __init__(self, dmm, fe_type):
        members = [
            ('parent', types.pyobject),
            ('state_address', types.uintp),
            ('state', types.uintp),
            ('fnptr_next_uint64', types.uintp),
            ('fnptr_next_uint32', types.uintp),
            ('fnptr_next_double', types.uintp),
            ('bit_generator', types.uintp),
        ]
        super(NumPyRngBitGeneratorModel, self).__init__(dmm, fe_type, members)


_bit_gen_type = NumPyRandomBitGeneratorType('bit_generator')


@register_model(NumPyRandomGeneratorType)
class NumPyRandomGeneratorTypeModel(models.StructModel):
    def __init__(self, dmm, fe_type):
        members = [
            ('bit_generator', _bit_gen_type),
        ]
        super(
            NumPyRandomGeneratorTypeModel,
            self).__init__(
            dmm,
            fe_type,
            members)


@unbox(NumPyRandomBitGeneratorType)
def unbox_numpy_random_bitgenerator(typ, obj, c):
    # The bit_generator instance has a `.ctypes` attr which is a namedtuple
    # with the following members (types):
    # * state_address (Python int)
    # * state (ctypes.c_void_p)
    # * next_uint64 (ctypes.CFunctionType instance)
    # * next_uint32 (ctypes.CFunctionType instance)
    # * next_double (ctypes.CFunctionType instance)
    # * bit_generator (ctypes.c_void_p)

    struct_ptr = cgutils.create_struct_proxy(typ)(c.context, c.builder)
    struct_ptr.parent = obj
    c.pyapi.incref(obj)  # hold a reference to the underlying Python object

    # Get the .ctypes attr
    ctypes_binding = c.pyapi.object_getattr_string(obj, 'ctypes')

    # Look up the "state_address" member and wire it into the struct
    interface_state_address = c.pyapi.object_getattr_string(
        ctypes_binding, 'state_address')
    setattr(struct_ptr, 'state_address',
            c.unbox(types.uintp, interface_state_address).value)

    # Look up the "state" member and wire it into the struct
    interface_state = c.pyapi.object_getattr_string(ctypes_binding, 'state')
    interface_state_value = c.pyapi.object_getattr_string(
        interface_state, 'value')
    setattr(
        struct_ptr,
        'state',
        c.unbox(
            types.uintp,
            interface_state_value).value)

    # Want to store callable function pointers to these CFunctionTypes, so
    # import ctypes and use it to cast the CFunctionTypes to c_void_p and
    # store the results.
    # First find ctypes.cast, and ctypes.c_void_p
    ctypes_name = c.context.insert_const_string(c.builder.module, 'ctypes')
    ctypes_module = c.pyapi.import_module_noblock(ctypes_name)
    ct_cast = c.pyapi.object_getattr_string(ctypes_module, 'cast')
    ct_voidptr_ty = c.pyapi.object_getattr_string(ctypes_module, 'c_void_p')

    # This wires in the fnptrs referred to by name
    def wire_in_fnptrs(name):
        # Find the CFunctionType function
        interface_next_fn = c.pyapi.object_getattr_string(
            ctypes_binding, name)

        # Want to do ctypes.cast(CFunctionType, ctypes.c_void_p), create an
        # args tuple for that.
        args = c.pyapi.tuple_pack([interface_next_fn, ct_voidptr_ty])

        # Call ctypes.cast()
        interface_next_fn_casted = c.pyapi.call(ct_cast, args)

        # Fetch the .value attr on the resulting ctypes.c_void_p for storage
        # in the function pointer slot.
        interface_next_fn_casted_value = c.pyapi.object_getattr_string(
            interface_next_fn_casted, 'value')

        # Wire up
        setattr(struct_ptr, f'fnptr_{name}',
                c.unbox(types.uintp, interface_next_fn_casted_value).value)

    wire_in_fnptrs('next_double')
    wire_in_fnptrs('next_uint64')
    wire_in_fnptrs('next_uint32')

    # This is the same as the `state` member but it's the bit_generator
    # address; it's probably never needed.
    interface_bit_generator = c.pyapi.object_getattr_string(
        ctypes_binding, 'bit_generator')
    interface_bit_generator_value = c.pyapi.object_getattr_string(
        interface_bit_generator, 'value')
    setattr(
        struct_ptr,
        'bit_generator',
        c.unbox(
            types.uintp,
            interface_bit_generator_value).value)

    return NativeValue(struct_ptr._getvalue())


@unbox(NumPyRandomGeneratorType)
def unbox_numpy_random_generator(typ, obj, c):
    struct_ptr = cgutils.create_struct_proxy(typ)(c.context, c.builder)
    bit_gen_inst = c.pyapi.object_getattr_string(obj, 'bit_generator')
    unboxed = c.unbox(_bit_gen_type, bit_gen_inst).value
    struct_ptr.bit_generator = unboxed
    return NativeValue(struct_ptr._getvalue())


# The Generator instances have a bit_generator attr
make_attribute_wrapper(
    NumPyRandomGeneratorType,
    'bit_generator',
    'bit_generator')


# Generate the overloads for "next_(some type)" functions
def generate_next_binding(overloadable_function, return_type):
    @intrinsic
    def intrin_NumPyRandomBitGeneratorType_next_ty(tyctx, inst):
        sig = return_type(inst)

        def codegen(cgctx, builder, sig, llargs):
            name = overloadable_function.__name__
            struct_ptr = cgutils.create_struct_proxy(inst)(cgctx, builder,
                                                           value=llargs[0])

            # Get the 'state' and 'fnptr_next_(type)' members of the struct
            state = struct_ptr.state
            next_fn_addr = getattr(struct_ptr, f'fnptr_{name}')

            # LLVM IR types needed
            ll_void_ptr_t = cgctx.get_value_type(types.voidptr)
            ll_return_t = cgctx.get_value_type(return_type)
            ll_uintp_t = cgctx.get_value_type(types.uintp)

            # Convert the stored "next" function address to a pointer
            next_fn_fnptr = builder.inttoptr(
                next_fn_addr, ll_void_ptr_t)
            # Add the function to the module
            fnty = ir.FunctionType(ll_return_t, (ll_uintp_t,))
            next_fn = cgutils.get_or_insert_function(
                builder.module, fnty, name)
            # Bit cast the function pointer to the function type
            hack = builder.bitcast(next_fn_fnptr, next_fn.type)
            # call it with the "state" address as the arg
            ret = builder.call(hack, (state,))
            return ret
        return sig, codegen

    @overload(overloadable_function)
    def ol_next_ty(bitgen):
        if isinstance(bitgen, NumPyRandomBitGeneratorType):
            def impl(bitgen):
                return intrin_NumPyRandomBitGeneratorType_next_ty(bitgen)
            return impl


# Some function stubs for "next_(some type)"; these will be overloaded
def next_double(bitgen):
    return bitgen.ctypes.next_double(bitgen.ctypes.state)


def next_uint32(bitgen):
    return bitgen.ctypes.next_uint32(bitgen.ctypes.state)


def next_uint64(bitgen):
    return bitgen.ctypes.next_uint64(bitgen.ctypes.state)


generate_next_binding(next_double, types.double)
generate_next_binding(next_uint32, types.uint32)
generate_next_binding(next_uint64, types.uint64)


# From:
# https://github.com/numpy/numpy/blob/7cfef93c77599bd387ecc6a15d186c5a46024dac/numpy/random/src/distributions/distributions.c#L20
@register_jitable
def next_float(bitgen):
    return (next_uint32(bitgen) >> 9) * (1.0 / 8388608.0)


# Overload the Generator().random() method
@overload_method(NumPyRandomGeneratorType, 'random')
def ol_NumPyRandomGeneratorType_random(inst, size=None, dtype=np.float64):
    if isinstance(size, types.Omitted):
        size = size.value

    if isinstance(dtype, types.Omitted):
        dtype = dtype.value

    if not isinstance(dtype, types.Type):
        dt = np.dtype(dtype)
        nb_dt = from_dtype(dt)
        np_dt = dtype
    else:
        nb_dt = dtype
        np_dt = as_dtype(nb_dt)

    np_dt = np.dtype(np_dt)

    if np_dt == np.float32:
        next_func = next_float
    else:
        next_func = next_double

    if isinstance(size, (types.NoneType,)) or size is None:
        def impl(inst, size=None, dtype=np.float64):
            return nb_dt(next_func(inst.bit_generator))
        return impl
    else:
        def impl(inst, size=None, dtype=np.float64):
            out = np.empty(size, dtype=dtype)
            for i in range(out.size):
                out[i] = next_func(inst.bit_generator)
            return out
        return impl

# ------------------------------------------------------------------------------
# DEMO
# ------------------------------------------------------------------------------


@njit
def foo(x, y):
    print(y.random())
    print(y.random(size=(5,)))
    print(y.random(size=(5,), dtype=np.float32))


# Const seed
seed = 1

print("Numba")
rng_instance = np.random.default_rng(seed=seed)
foo(rng_instance.bit_generator, rng_instance)

print("Python")
rng_instance = np.random.default_rng(seed=seed)
foo.py_func(rng_instance.bit_generator, rng_instance)

Questions:

  1. Is this enough for those interested to start work on implementing numpy.random.Generator more fully? If not, what else would be useful?
  2. Is it acceptable to have to hand generator instances to Numba (i.e. no construction)?
  3. I've not considered how this will work with respect to locking and object lifetimes; for now, is it acceptable to ignore this a bit with a view to working out what else is needed?
  4. What, if anything, can be reused algorithm-wise from the current np.random implementation in Numba?

CC @luk-f-a @brandonwillard @kc611 (and feel free to ping anyone else interested!).

Hope this helps folks get started on implementing this.

Also, I have to thank NumPy devs for creating the ctypes interface, without this it would have been much harder to get it to work.
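For anyone exploring from the Python side, the ctypes interface that the unboxing code above consumes can be exercised directly (assuming NumPy 1.17+):

```python
import numpy as np

# A BitGenerator exposes its native interface via the `.ctypes` attribute
# read by the unboxing code: raw addresses plus callable ctypes
# function pointers.
bg = np.random.PCG64(1)
iface = bg.ctypes

# Calling a "next" function from Python, passing the state pointer,
# advances the underlying RNG exactly as native code would.
u = iface.next_double(iface.state)
assert 0.0 <= u < 1.0

# The same trick reproduces the single-precision recipe from next_float.
f = (iface.next_uint32(iface.state) >> 9) * (1.0 / 8388608.0)
assert 0.0 <= f < 1.0
```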

@brandonwillard
Contributor

brandonwillard commented Mar 9, 2022

@stuartarchibald, this looks great!

  • Is this enough for those interested to start work on implementing numpy.random.Generator more fully? If not, what else would be useful?

As far as I can tell, yes, definitely!

  • Is it acceptable to have to hand generator instances to Numba (i.e. no construction)?

We've only recently added support for "in-graph" construction of RNG states (see aesara-devs/aesara#789), but we currently make use of "shared variables" that hold the underlying RNG objects (i.e. NumPy's RandomStates/Generators). Anyway, I believe we can work around this during transpilation, if necessary.

  • I've not considered how this will work with respect to locking and object lifetimes; for now, is it acceptable to ignore this a bit with a view to working out what else is needed?

Yes.

  • What, if anything, can be reused algorithm-wise from the current np.random implementation in Numba?

I still need to look over this and make sure I understand it, but with a small number of implementations like ol_NumPyRandomGeneratorType_random, it seems like quite a few of the existing sampler implementations could be easily converted to similar overload_method implementations of the same type. For instance, np.random.[pareto, weibull, vonmises] and others should only need the random you've already implemented.
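As a rough pure-Python illustration (not NumPy's actual C implementation in distributions.c, which may differ in sampler choice and edge-case handling), several of these distributions reduce to simple transforms of a uniform draw; in Numba these could become @register_jitable helpers layered on next_double:

```python
import math

def pareto_from_uniform(u, a):
    # Inverse-transform sketch of a standard Pareto (Lomax) draw from a
    # uniform u in [0, 1): equivalent to expm1(E / a) with E = -log(1 - u).
    return math.expm1(-math.log1p(-u) / a)

def weibull_from_uniform(u, a):
    # Standard Weibull via inverse transform: (-log(1 - u)) ** (1 / a).
    return (-math.log1p(-u)) ** (1.0 / a)

# With u = 0.5 and a = 1 the Pareto transform gives 2 - 1 = 1.
assert math.isclose(pareto_from_uniform(0.5, 1.0), 1.0)
```

In a jitted overload, `u` would come from a next_double call on the unboxed BitGenerator rather than being passed in.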

@kc611
Contributor

kc611 commented Mar 10, 2022

Is this enough for those interested to start work on implementing numpy.random.Generator more fully?

@stuartarchibald, this particular implementation is definitely what's needed to get started on the newer numpy.random interface for Numba. I wasn't aware of the ctypes interface while I built up that implementation for Numba's RandomState; I'm now curious whether this interface will work for RandomState as well, by making a suitable BitGenerator for it (for the sake of backwards compatibility). Though complete BitGenerator support works just as well.

What, if anything, can be reused algorithm-wise from the current np.random implementation in Numba?

If we are planning to implement all of the random distributions within Numba itself, I'm pretty sure we can use some of the implementations from numba/cpython/randomimpl.py, converting them into overload methods similar to the .random overload demonstrated in your example.

Another great feature we'd want to add down the line would be support for the size argument.

@stuartarchibald
Contributor

Is this enough for those interested to start work on implementing numpy.random.Generator more fully?

@stuartarchibald, this particular implementation is definitely what's needed to get started on the newer numpy.random interface for Numba. I wasn't aware of the ctypes interface while I built up that implementation for Numba's RandomState; I'm now curious whether this interface will work for RandomState as well, by making a suitable BitGenerator for it (for the sake of backwards compatibility). Though complete BitGenerator support works just as well.

Just to make sure I've understood correctly: the implementation I managed to patch in yesterday (#4499 (comment)) for numpy.random.Generator and numpy.random.bit_generator.BitGenerator is for the new RNG APIs. What you are referring to with RandomState is numpy.random.RandomState, which is a way to access the "legacy" generators (functions such as np.random.beta use this underneath); these are essentially global, something Numba already has in some form. Is the request here a way to expose the current "legacy" generators that are in Numba via a Numba-based np.random.RandomState implementation?

What, if anything, can be reused algorithm-wise from the current np.random implementation in Numba?

If we are planning to implement all of the random distributions within Numba itself, I'm pretty sure we can use some of the implementations from numba/cpython/randomimpl.py, converting them into overload methods similar to the .random overload demonstrated in your example.

Unfortunately, I don't think there's a choice about this: the distributions in NumPy are implemented in C with no bindings exposed (see https://github.com/numpy/numpy/blob/maintenance/1.22.x/numpy/random/src/distributions/distributions.c), so they will need translating. But as noted, there are probably elements of the existing code in Numba that can be shared.

Another great feature we'd want to add down the line would be support for the size argument.

To which function is size an argument? (I'm afraid I'm not familiar with the APIs yet, I literally looked at this for the first time to write the code yesterday!)

@stuartarchibald
Contributor

@brandonwillard

@stuartarchibald, this looks great!

Thanks, am glad this is along the right path.

  • Is this enough for those interested to start work on implementing numpy.random.Generator more fully? If not, what else would be useful?

As far as I can tell, yes, definitely!

Great, thanks for confirming.

  • Is it acceptable to have to hand generator instances to Numba (i.e. no construction)?

We've only recently added support for "in-graph" construction of RNG states (see aesara-devs/aesara#789), but we currently make use of "shared variables" that hold the underlying RNG objects (i.e. NumPy's RandomStates/Generators). Anyway, I believe we can work around this during transpilation, if necessary.

OK, working around this for a while might be a good idea. I suspect creating a BitGenerator, even the np.random.default_rng() default of numpy.random._pcg64.PCG64, will likely be tricky, both in terms of implementing it correctly (we'd need to emulate the 128-bit arithmetic in LLVM IR) and in terms of wiring it up/getting the semantics right.

  • I've not considered how this will work with respect to locking and object lifetimes; for now, is it acceptable to ignore this a bit with a view to working out what else is needed?

Yes.

Thanks for confirming.

  • What, if anything, can be reused algorithm-wise from the current np.random implementation in Numba?

I still need to look over this and make sure I understand it, but with a small number of implementations like ol_NumPyRandomGeneratorType_random, it seems like quite a few of the existing sampler implementations could be easily converted to similar overload_method implementations of the same type. For instance, np.random.[pareto, weibull, vonmises] and others should only need the random you've already implemented.

I think I need to perhaps go and read the documentation/implementation of this further. I think I understand a bit more following @kc611's comments above: if a RandomState object were to be created, then all these functions for creating distributions could be altered to take something Generator-like and wired up as either methods on the Generator or as np.random.<distribution>. I guess my main concern is ensuring that what was in the NumPy 1.16 code freeze for the "legacy" distributions remains correct in Numba, and if there have been differences in implementation for the Generator-based distributions then those also need to be accounted for. Essentially, I'd like to ensure that np.random.<distribution>, np.random.Generator.<distribution> and CPython's random.<distribution> are all numerically correct and reproduce bug-for-bug!
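One way to pin this down is fixed-seed cross-checks against the reference streams; a small sketch of the invariants involved (the exact values are whatever the reference implementations produce, not hard-coded here):

```python
import random
import numpy as np

# The legacy global np.random state is a RandomState (frozen MT19937 stream).
np.random.seed(0)
legacy = np.random.rand()
assert np.random.RandomState(0).rand() == legacy

# CPython's random also uses MT19937 but seeds differently, so its
# stream differs from NumPy's legacy one.
random.seed(0)
assert random.random() != legacy

# New-style Generator draws come from PCG64 by default and differ again.
assert np.random.default_rng(0).random() != legacy
```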

@kc611
Contributor

kc611 commented Mar 10, 2022

In that if a RandomState object were to be created, then all these functions for creating distributions can be altered to take something Generator-like and wired up as either methods on the Generator or as np.random.<distribution>.

That's exactly what I meant, but anyway that can just be treated as an extra convenience for having RandomState support within Numba. I think the NumPy folks even expose RandomState's underlying generator as a BitGenerator (MT19937): https://numpy.org/doc/stable/reference/random/bit_generators/mt19937.html

Essentially, I'd like to ensure that np.random.<distribution>, np.random.Generator.<distribution> and CPython's random.<distribution> are all numerically correct and reproduce bug-for-bug!

Though I haven't looked into the internals of it yet, I'm pretty sure we can emulate the above-linked MT19937 BitGenerator for deriving the np.random.<distribution>-style distributions, while maintaining a global instance for it.

Anyway, for now we can focus our efforts on getting the BitGenerator methods up and running, since ultimately that's what we want to end up supporting in Numba.

To which function is size an argument? (I'm afraid I'm not familiar with the APIs yet, I literally looked at this for the first time to write the code yesterday!)

It's basically an argument to any of the random methods that determines the size of the output array. In fact, I just noticed you already have it in your implementation above: y.random(size=(5,)).

@stuartarchibald
Contributor

@luk-f-a Recent work by @kc611 has been merged: #8031 implements initial support, #8038 adds some standard distributions, #8040 adds more advanced distributions, and there are a few more PRs open to implement much of the rest of the functionality (#8041, #8042). What do you think about closing this issue and opening new ones asking for specific features etc. in its place? Thanks!

@brandonwillard
Contributor

brandonwillard commented Jul 8, 2022

Is there an issue that tracks the dispatch performance overhead problem (e.g. mentioned here)? While the underlying problem might not be specific to this functionality, it might still be good to have an issue for its realization in the Generator context, in case there are potential workarounds/improvements that somehow are Generator-specific, or if there isn't already an issue for the more general problem, etc.

I ask because this is the next big thing I would like to look into.

@stuartarchibald
Contributor

Is there an issue that tracks the dispatch performance overhead problem (e.g. mentioned here)?

I don't think so, please feel free to open one if it's important to your use case.

While the underlying problem might not be specific to this functionality, it might still be good to have an issue for its realization in the Generator context, in case there are potential workarounds/improvements that somehow are Generator-specific, or if there isn't already an issue for the more general problem, etc.

I ask because this is the next big thing I would like to look into.

I suspect the reason the dispatch cost is high is because there's no fast path in the dispatcher for the Generator type. This means the dispatcher has to call back into the interpreter to work out what the type of the argument instance is in the case of it being a Generator. This issue is the same for any type for which Numba doesn't have a fast path.
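A toy model of that cost (purely illustrative; hypothetical names, not Numba's actual dispatcher): types with a registered fast path resolve via a cheap lookup, while everything else takes the expensive interpreter round-trip on every call.

```python
# Hypothetical fast-path table: argument types the dispatcher can resolve
# with a single dict lookup.
FAST_PATH = {int: "int64", float: "float64"}

def resolve_slow(value):
    # Stand-in for calling back into the interpreter's typing machinery;
    # in reality this is far more work than a dict lookup, and it runs
    # on every call for types without a fast path.
    return type(value).__name__

def dispatch(value):
    ty = FAST_PATH.get(type(value))
    if ty is None:  # e.g. a Generator argument today
        ty = resolve_slow(value)
    return ty

assert dispatch(1) == "int64"          # fast path hit
assert dispatch(object()) == "object"  # slow path, per-call overhead
```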

@brandonwillard
Contributor

Sorry, I lost track of this, but is general Generator lowering covered in any of the active PRs, or is that a separate thing?

For example, on 0.57.0dev0 I get the following:

import numba
import numpy as np


rng = np.random.default_rng(23)

@numba.njit
def sample():
    return rng.normal()


sample()
# NumbaNotImplementedError: Failed in nopython mode pipeline (step: native lowering)
# <numba.core.base.OverloadSelector object at 0x7ff03d61f640>, (NumPyRandomGeneratorType,)
# During: lowering "$2load_global.0 = global(rng: Generator(PCG64))" at <ipython-input-6-791ffc0ac1fc> (3)

@kc611
Contributor

kc611 commented Aug 27, 2022

Ah, I think Numba can't lower it directly yet.

import numba
import numpy as np


rng = np.random.default_rng(23)

@numba.njit
def sample(rng):
    return rng.normal()


sample(rng)
# 0.5532605888887387

Also, in the current public release (Numba 0.56.0), support for .random() is the only one available; the rest, like normal(), got pushed to 0.57.0, which is yet to be released.

@kc611
Contributor

kc611 commented Aug 27, 2022

I think the general progress is being tracked at: https://github.com/numba/numba/projects/19

@brandonwillard
Contributor

On a related note, is there an "excluded"-like option for vectorize (i.e. similar to numpy.vectorize's excluded option)?

I'm trying to figure out how best to broadcast the scalar Generator methods and our old numba.vectorize approach doesn't seem like it would work in this case.

@brandonwillard
Contributor

On a related note, is there an "excluded"-like option for vectorize (i.e. similar to numpy.vectorize's excluded option)?

Never mind, I checked; there doesn't appear to be one.

@stuartarchibald
Contributor

RE: #4499 (comment) @kc611 I think this needs a @lower_constant(types.NumPyRandomGeneratorType) implementation for it to work. This would effectively "freeze" the state of rng into the generated code (as it's a global), which has its own issues, i.e. a np.random.default_rng(<no seed>) would just get frozen and potentially cached as such. I'm not sure whether this is desirable, @brandonwillard any thoughts?

@brandonwillard
Contributor

RE: #4499 (comment) @kc611 I think this needs a @lower_constant(types.NumPyRandomGeneratorType) implementation for it to work. This would effectively "freeze" the state of rng into the generated code (as it's a global), which has its own issues, i.e. a np.random.default_rng(<no seed>) would just get frozen and potentially cached as such. I'm not sure whether this is desirable, @brandonwillard any thoughts?

In other words, the internal state of the Generator would be copied and, thus, separated from the Generator reference in rng? If that's the case, then, yeah, that wouldn't be ideal.

@stuartarchibald
Contributor

RE: #4499 (comment) @kc611 I think this needs a @lower_constant(types.NumPyRandomGeneratorType) implementation for it to work. This would effectively "freeze" the state of rng into the generated code (as it's a global), which has its own issues, i.e. a np.random.default_rng(<no seed>) would just get frozen and potentially cached as such. I'm not sure whether this is desirable, @brandonwillard any thoughts?

In other words, the internal state of the Generator would be copied and, thus, separated from the Generator reference in rng? If that's the case, then, yeah, that wouldn't be ideal.

Have been thinking about this a bit more. I think the "state" would have to be "copied" and the working parts of the state are:

  1. addresses of functions in a NumPy library, these are functions like
    double next_double(state){/* impl */}
    
  2. the address of the "state" of the RNG.

As a result, I think that these would survive lowering as a "frozen" global, and calls to the RNG etc. would still correctly advance the state of the RNG. Essentially the "frozenness" relates to the native representation, which is primarily addresses, not the contents of those addresses.

There are a couple of issues though to consider:

  1. What happens if the global RNG is deleted and garbage collected? At least the state address will be invalid for reading (the function pointers/addresses from the NumPy library are probably fine still).
  2. What happens if the function is cached? The addresses would likely be baked in as integers and are almost certainly not going to be the same on replay from a new process.

@brandonwillard
Contributor

brandonwillard commented Sep 9, 2022

Have been thinking about this a bit more. I think the "state" would have to be "copied" and the working parts of the state are:

  1. addresses of functions in a NumPy library, these are functions like
    double next_double(state){/* impl */}
    
  2. the address of the "state" of the RNG.

As a result, I think that these would survive lowering as a "frozen" global, and calls to the RNG etc. would still correctly advance the state of the RNG. Essentially the "frozenness" relates to the native representation, which is primarily addresses, not the contents of those addresses.

If the global Generator instance/RNG object is shallowly copied, that would definitely work.

There are a couple of issues though to consider:

  1. What happens if the global RNG is deleted and garbage collected? At least the state address will be invalid for reading (the function pointers/addresses from the NumPy library are probably fine still).

Could Numba keep a reference to the Python RNG object? I imagine that a user could just add such a reference to their Dispatcher instance and avoid that issue.

  1. What happens if the function is cached? The addresses would likely be baked in as integers and are almost certainly not going to be the same on replay from a new process.

A deep copy of the globals might be necessary for caching.

Regardless, it sounds like a lower_constant implementation will do what we need.

@stuartarchibald
Contributor

Have been thinking about this a bit more. I think the "state" would have to be "copied" and the working parts of the state are:

  1. addresses of functions in a NumPy library, these are functions like
    double next_double(state){/* impl */}
    
  2. the address of the "state" of the RNG.

As a result, I think that these would survive lowering as a "frozen" global, and calls to the RNG etc. would still correctly advance the state of the RNG. Essentially the "frozenness" relates to the native representation, which is primarily addresses, not the contents of those addresses.

If the global Generator instance/RNG object is shallowly copied, that would definitely work.

I don't think it's quite a shallow copy in the traditional sense due to the way Numba represents these objects, but I think the semantics are sufficiently similar, so it's definitely worth a try.

There are a couple of issues though to consider:

  1. What happens if the global RNG is deleted and garbage collected? At least the state address will be invalid for reading (the function pointers/addresses from the NumPy library are probably fine still).

Could Numba keep a reference to the Python RNG object? I imagine that a user could just add such a reference to their Dispatcher instance and avoid that issue.

This could potentially work, IIRC there was a report a while back of a similar issue with large arrays in globals.

  1. What happens if the function is cached? The addresses would likely be baked in as integers and are almost certainly not going to be the same on replay from a new process.

A deep copy of the globals might be necessary for caching.

I think the technique of "finding the function pointers" on deserialisation will be needed, as well as working out how to deep copy the RNG state itself (of which there are a few types IIRC).
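On the deep-copy side, the bit generators already expose their state as a plain dict, which may cover that part; re-resolving the function pointers on deserialisation would remain a separate concern. A sketch:

```python
import numpy as np

rng = np.random.default_rng(123)
rng.random()  # advance the stream a little

# The state is a plain dict, independent of any native addresses, so it
# can be copied or serialised on its own.
saved = rng.bit_generator.state

# Restoring it into a fresh instance resumes the identical stream.
fresh = np.random.Generator(np.random.PCG64())
fresh.bit_generator.state = saved
assert fresh.random() == rng.random()
```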

Regardless, it sounds like a lower_constant implementation will do what we need.

I think so too, @kc611 perhaps you'd like to take a look at this if you have some time? Having written the rest of the related code your knowledge of this is ideal!
