Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Sync numpy random generator with numba random state #3249

Open
agramfort opened this issue Aug 21, 2018 · 4 comments
Open

Sync numpy random generator with numba random state #3249

agramfort opened this issue Aug 21, 2018 · 4 comments

Comments

@agramfort
Copy link

Feature request

I got a code for which I could not have deterministic test output due
to some np.random calls in a numba function. Reading the test_random.py
file I found maybe a way to address this issue using a decorator.
The idea is to get from a RandomState instance or the global np random
the state of numpy generator, set it to numba and then on exit taking
the numba state and set it back to numpy. The code looks like this.

# Author : Alex Gramfort, <alexandre.gramfort@inria.fr>

import numpy as np
from numba import njit, _helperlib


def check_random_state(seed):
    """Turn seed into a np.random.RandomState instance.

    If seed is None, return the RandomState singleton used by np.random.
    If seed is an int, return a new RandomState instance seeded with seed.
    If seed is already a RandomState instance, return it.
    Otherwise raise ValueError.
    """
    if seed is None or seed is np.random:
        return np.random.mtrand._rand
    if isinstance(seed, (int, np.integer)):
        return np.random.RandomState(seed)
    if isinstance(seed, np.random.RandomState):
        return seed
    raise ValueError('%r cannot be used to seed a numpy.random.RandomState'
                     ' instance' % seed)


def _copy_np_state(r, ptr):
    """
    Copy state of Numpy random *r* to Numba state *ptr*.
    """
    ints, index = r.get_state()[1:3]
    _helperlib.rnd_set_state(ptr, (index, [int(x) for x in ints]))
    return ints, index


def _copyback_np_state(r, ptr):
    """
    Copy state of of Numba state *ptr* to Numpy random *r*
    """
    index, ints = _helperlib.rnd_get_state(ptr)
    r.set_state(('MT19937', ints, index, 0, 0.0))


def get_np_state_ptr():
    return _helperlib.rnd_get_np_state_ptr()


class use_numba_random(object):
    """Decorator that gets the current of numpy random, then
    sets it to numba and then puts the new state of numba
    to numpy to decoration the function of not does not
    affect the output."""
    def __init__(self, random_state=None):
        self.random_state = random_state

    def __call__(self, func):
        def new_func(*args, **kwargs):
            r = check_random_state(self.random_state)
            ptr = get_np_state_ptr()
            _copy_np_state(r, ptr)
            out = func(*args, **kwargs)
            _copyback_np_state(r, ptr)
            return out
        return new_func


def test_no_decorator():
    """Test that shows the problem when not using a decorator."""
    np.random.seed(0)
    r1_ok, r2_ok = np.random.rand(), np.random.rand()

    # Define a numba function using np.random
    @njit()
    def generate_random():
        return np.random.rand()

    np.random.seed(0)
    r1 = generate_random()
    r2 = np.random.rand()
    assert r1 != r1_ok
    assert r2 == r1_ok  # Numba does not increment the state of np.random
    assert r2 != r2_ok


def test_global_random():
    """Test using a global numpy random state."""
    np.random.seed(0)
    r = check_random_state(None)
    r1_ok, r2_ok = r.rand(), r.rand()

    np.random.seed(0)
    r = check_random_state(None)

    # Define a numba function decorated to sync it's seed
    # with a random state or current np.random
    @use_numba_random(r)
    @njit()
    def generate_random():
        return np.random.rand()

    r1 = generate_random()
    r2 = r.rand()
    assert r1 == r1_ok
    assert r2 == r2_ok


def test_random_state():
    """Test using a numpy RandomState."""
    r = check_random_state(42)
    r1_ok, r2_ok = r.rand(), r.rand()

    r = check_random_state(42)

    # Define a numba function decorated to sync it's seed
    # with a random state or current np.random
    @use_numba_random(r)
    @njit()
    def generate_random():
        return np.random.rand()

    r1 = generate_random()
    r2 = r.rand()
    assert r1 == r1_ok
    assert r2 == r2_ok


if __name__ == '__main__':
    test_no_decorator()
    test_global_random()
    test_random_state()

what are you thoughts on this? could it inspire passing to jit of njit decorators a random_state parameter to keep np.random and numba states in sync. This would allow to get the same outputs if the njit / jit decorator is used or not.

this code solved my problem but maybe it can help others...

@stuartarchibald
Copy link
Contributor

Thanks for the report. I think I see what you mean, that you are would like a call to a Numba implementation of e.g. np.random.rand() to a) sync from NumPy's RNG internal state when called b) move NumPy's RNG internal state during the call, and the inverse of this would probably be useful too. I think the above is a reasonable way to achieve this, thanks for sharing, I think it will be useful to other users.

Internally, Numba has entirely reimplemented both the CPython and NumPy RNG state mechanisms and this is entirely independent of NumPy's RandomState internals. I think synchronising the state cheaply and safely would be difficult, especially in a multi-threaded environment. It may be possible and desirable though since #3038 now guarantees sequences generated from the same initial seed are identical across NumPy and Numba.

@agramfort
Copy link
Author

agramfort commented Aug 22, 2018 via email

@stuartarchibald
Copy link
Contributor

I think this would constitute a behavioural change for Numba as the state of the NumPy RNG would be observed and influence the Numba np.random RNG functions. Can this be done? In theory, the init part, yes, the sync back, probably. The init part would just require using the NumPy state in the initialisation sequence for the Numba internal RNG state instead of os.urandom. Synchronising this state back to NumPy after a call is possible, a function could be written that does the sync and is called every time Numba leaves one of it's NumPy random function reimplementations.

There's a load of issues with thread safety and common uses of threaded RNG calls (like for Monte Carlo) that make this harder to implement in practice. Numba does something to try and be more practically useful for this (initialize random state per thread into TLS), but it does deviate from what would happen in NumPy/the case demonstrated above.

Ping @seibert any thoughts?

@agramfort
Copy link
Author

thanks @stuartarchibald for the good synthesis of the situation.

Whatever can help people having better interaction between numpy random state and numba is I think good. It would facilitate the port of functions to numba as it would not require to change tests due to the use of numba. Both functions would give the same results, but one much faster...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Usability / Docs
Needs Design
Development

No branches or pull requests

2 participants