Implement random and np.random #981

Merged
merged 72 commits into numba:master from pitrou:random on Feb 18, 2015

Conversation

@pitrou
Member

pitrou commented Feb 4, 2015

No description provided.

+ rshift = sizeof(void *) > 4 ? 16 : 0;
+ seed ^= (Py_uintptr_t) &timemod >> rshift;
+ seed += (Py_uintptr_t) &PyObject_CallMethod >> rshift;
+ Numba_rnd_init(state, seed);

@seibert

seibert Feb 9, 2015

Contributor

Is this how Numpy or Python initializes its RNG seed? Seems pretty strange to assume all kernels will randomize function addresses...

@pitrou

pitrou Feb 9, 2015

Member

No. I also need to feed some entropy from /dev/urandom.

@gmarkall

gmarkall Feb 9, 2015

Contributor

> Seems pretty strange to assume all kernels will randomize function addresses...

Isn't the effect of using these addresses to add some randomness across invocations of Python, though? These pointers will differ across runs if ASLR is turned on - but as I understand it, ASLR won't be changing the addresses of things during a run.

@seibert

seibert Feb 9, 2015

Contributor

Ultimately, the goal is to ensure that a user who does not actively seed the RNG is highly likely to get unique sequences in different interpreter processes, regardless of what time they started or what computer they are running on. (It is important to note that different seeds do not guarantee non-overlapping sequences for all RNGs, though Mersenne Twister seems to be pretty good in this regard.)

Sufficient entropy from /dev/urandom should be enough on its own (though availability might vary across platforms). Lacking that, a timestamp XORed with some value unlikely to be shared by two processes is sufficient.
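[Editor's note: for illustration, a minimal Python sketch of the seeding strategy described above — prefer OS entropy, and fall back to a timestamp mixed with a per-process value. The helper name is hypothetical; this is not the PR's actual C implementation.]

import os
import time

def derive_seed():
    # Hypothetical sketch of the strategy discussed in this thread.
    try:
        # Preferred: real entropy from the OS (/dev/urandom on Unix).
        return int.from_bytes(os.urandom(8), "little")
    except NotImplementedError:
        # Fallback: a timestamp XORed with a value unlikely to be
        # shared by two processes (here, the process id).
        return int(time.time() * 1e6) ^ os.getpid()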

@seibert

Contributor

seibert commented Feb 9, 2015

Given that users are potentially going to combine the RNG with multithreading (now that we can release the GIL), how is that handled?

@pitrou

Member

pitrou commented Feb 9, 2015

It's not handled at all. It shouldn't crash, but results are undefined (which, of course, is not necessarily bad in this case :-)). I don't think adding a lock would be a good idea for performance.
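[Editor's note: a purely illustrative sketch of what "not handled" means for users — not code from this PR. Since the nopython RNG state is shared without a lock, one way to keep results well-defined is to draw random numbers in regular Python code and pass them into the nogil-compiled function.]

import random
from numba import njit

@njit(nogil=True)
def work(r):
    # Deterministic computation; no RNG calls inside the nogil region.
    return r * r

def run_once():
    r = random.random()   # drawn under the GIL, in interpreted Python
    return work(r)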

@pitrou pitrou changed the title from [WIP] Implement random and np.random to Implement random and np.random Feb 9, 2015

docs/source/reference/numpysupported.rst
+module, but does not allow you to create individual RandomState instances.
+The same algorithms are used as for :ref:`the standard
+random module <pysupported-random>` (and therefore the same notes apply),
+but with an independent internal state: seeding or drawning numbers from

@seibert

seibert Feb 16, 2015

Contributor

Typo: should be "drawing". Also, is the Numba RNG state for numpy.random calls in nopython mode independent of the real Numpy RNG state?

@pitrou

pitrou Feb 16, 2015

Member

Yes, it is.
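[Editor's note: a small sketch to make that independence concrete, assuming numpy.random.seed and numpy.random.random are among the supported functions, as the doc excerpts in this thread imply.]

import numpy as np
from numba import njit

@njit
def jit_draw():
    np.random.seed(42)         # seeds Numba's internal state only
    return np.random.random()

np.random.seed(42)             # seeds the real NumPy generator
a = np.random.random()         # drawn from NumPy's own state
b = jit_draw()                 # drawn from Numba's independent state
c = np.random.random()         # continues NumPy's sequence, unaffected by jit_draw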

+Numba supports top-level functions from the :mod:`random` module, but does
+not allow you to create individual Random instances. A Mersenne-Twister
+generator is used, with a dedicated internal state. It is initialized at
+startup with entropy drawn from the operating system.

@seibert

seibert Feb 16, 2015

Contributor

Should also put the same note here about the existence of support for the numpy.random functions, which do not share state with the Python random functions in nopython mode. Someone might read this document page independently of the Numpy support page.

+
+The following functions are supported, but only with scalar output: you can't
+pass a *size* argument.
+
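[Editor's note: concretely, the scalar-only restriction quoted in this excerpt means the following — a minimal illustrative sketch, not from the PR's test suite.]

import numpy as np
from numba import njit

@njit
def scalar_draw():
    return np.random.random()      # scalar draw: supported

@njit
def array_draw():
    return np.random.random(10)    # passing a size argument: not supported

scalar_draw()    # compiles and returns a single float
array_draw()     # expected to fail with a TypingError at compile time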

@seibert

seibert Feb 16, 2015

Contributor

Do we need the same fork() warning for the Numpy RNG as we have for the Python RNG?

@pitrou

pitrou Feb 16, 2015

Member

Yes, basically the same caveats apply.

@seibert

Contributor

seibert commented Feb 16, 2015

We also need to add to the docs (both the Python and Numpy support pages) a warning that the random number generators are not thread-safe.

@gmarkall

Contributor

gmarkall commented Feb 16, 2015

Though the CUDA docs aren't explicit about which Numpy functions are and aren't supported, we generally suggest that things that work in nopython mode will work in CUDA. However, that's not the case here - do we need to mention somewhere that it's not supported in CUDA?

@pitrou

Member

pitrou commented Feb 16, 2015

I don't really know how CUDA works in that respect. Is it true that all other nopython features are supported in CUDA? Some of them need e.g. _helperlib functions.

@gmarkall

Contributor

gmarkall commented Feb 16, 2015

I think those other things that need _helperlib functions also will not work with CUDA. I would guess that as a general rule, anything that can be lowered using only LLVM IR will probably work, and anything else probably won't work.

+* :func:`random.expovariate`
+* :func:`random.gammavariate`
+* :func:`random.gauss`
+* :func:`random.getrandbits`: number of bits must not be greater than 64

@gmarkall

gmarkall Feb 16, 2015

Contributor

Is it possible to give a warning/error to the user when the number of bits is greater than 64? It seems that if the number of bits is greater than 64, the output begins to diverge from CPython's without reporting any error - it could be easy to miss this warning in the docs and not notice what is happening.
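[Editor's note: the documented limit in use — an illustrative sketch with a hypothetical wrapper function. Per gmarkall's observation above, exceeding 64 bits diverges silently rather than raising.]

import random
from numba import njit

@njit
def bits(k):
    return random.getrandbits(k)

x = bits(64)    # within the documented 64-bit limit
y = bits(65)    # beyond the limit: output silently diverges from CPython's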

@pitrou

Member

pitrou commented Feb 16, 2015

> I would guess that as a general rule, anything that can be lowered using only LLVM IR will probably work, and anything else probably won't work.

I think we should do a pass over all nopython-supported features and document which ones are CUDA-supported. Many things seem injected in the CPU context (see the CPUContext constructor).

docs/source/reference/numpysupported.rst
+* :func:`numpy.random.randint`
+* :func:`numpy.random.randn`: only without argument
+* :func:`numpy.random.random`
+* :func:`numpy.random.random_integers`

@gmarkall

gmarkall Feb 16, 2015

Contributor

With

from numba import njit
import numpy as np

@njit
def f1():
    return np.random.random_integers(2)

print(f1())

I'm getting a traceback:

$ python tests.py 
Traceback (most recent call last):
  File "/home/gmarkall/work/numba/numba/typeinfer.py", line 269, in __call__
    attrty = context.resolve_getattr(value=ty, attr=self.attr)
  File "/home/gmarkall/work/numba/numba/typing/context.py", line 75, in resolve_getattr
    attrinfo = self.attributes[value]
KeyError: Module(<module 'numpy.random' from '/home/gmarkall/.conda/envs/p34/lib/python3.4/site-packages/numpy/random/__init__.py'>)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "tests.py", line 8, in <module>
    print(f1())
  File "/home/gmarkall/work/numba/numba/dispatcher.py", line 157, in _compile_for_args
    return self.compile(sig)
  File "/home/gmarkall/work/numba/numba/dispatcher.py", line 291, in compile
    flags=flags, locals=self.locals)
  File "/home/gmarkall/work/numba/numba/compiler.py", line 547, in compile_extra
    return pipeline.compile_extra(func)
  File "/home/gmarkall/work/numba/numba/compiler.py", line 293, in compile_extra
    return self.compile_bytecode(bc, func_attr=self.func_attr)
  File "/home/gmarkall/work/numba/numba/compiler.py", line 301, in compile_bytecode
    return self._compile_bytecode()
  File "/home/gmarkall/work/numba/numba/compiler.py", line 534, in _compile_bytecode
    return pm.run(self.status)
  File "/home/gmarkall/work/numba/numba/compiler.py", line 191, in run
    raise patched_exception
  File "/home/gmarkall/work/numba/numba/compiler.py", line 183, in run
    res = stage()
  File "/home/gmarkall/work/numba/numba/compiler.py", line 389, in stage_nopython_frontend
    self.locals)
  File "/home/gmarkall/work/numba/numba/compiler.py", line 665, in type_inference_stage
    infer.propagate()
  File "/home/gmarkall/work/numba/numba/typeinfer.py", line 390, in propagate
    self.constrains.propagate(self.context, self.typevars)
  File "/home/gmarkall/work/numba/numba/typeinfer.py", line 112, in propagate
    constrain(context, typevars)
  File "/home/gmarkall/work/numba/numba/typeinfer.py", line 273, in __call__
    raise TypingError(msg, loc=self.inst.loc)
numba.typeinfer.TypingError: Failed at nopython (nopython frontend)
Unknown attribute 'random_integers' for Module(<module 'numpy.random' from '/home/gmarkall/.conda/envs/p34/lib/python3.4/site-packages/numpy/random/__init__.py'>) $0.2 $0.3 = getattr(attr=random_integers, value=$0.2)
File "tests.py", line 6
  • should this work?
+
+* :func:`numpy.random.rand`: only without argument
+* :func:`numpy.random.randint`
+* :func:`numpy.random.randn`: only without argument

@gmarkall

gmarkall Feb 16, 2015

Contributor

This seems to work, but I couldn't find a test for it (I could see a test for everything else apart from random_integers).

@gmarkall

Contributor

gmarkall commented Feb 16, 2015

> I think we should do a pass over all nopython-supported features and document which ones are CUDA-supported. Many things seem injected in the CPU context (see the CPUContext constructor).

OK - just opened #997 to record this.

@pitrou

Member

pitrou commented Feb 16, 2015

I added a test for randn(). As for random_integers(), you're right, it isn't supported. (It's a bit silly, really: it's the same as randint() except that the interval in the one-argument form is [1, low] instead of [0, low).)
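[Editor's note: the relationship pitrou describes, shown with plain NumPy for illustration.]

import numpy as np

np.random.randint(3)            # half-open interval [0, 3): one of 0, 1, 2
np.random.random_integers(3)    # closed interval [1, 3]: one of 1, 2, 3
                                # (not supported in nopython mode)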

seibert added a commit that referenced this pull request Feb 18, 2015

Merge pull request #981 from pitrou/random
Implement random and np.random

@seibert seibert merged commit a7b8600 into numba:master Feb 18, 2015

1 check passed

continuous-integration/travis-ci/pr The Travis CI build passed
@rossant

rossant commented Feb 20, 2015

Just wondering: when is the next release planned? I'm considering using random in a Numba function in my upcoming book.

@seibert

Contributor

seibert commented Feb 20, 2015

We are planning the release for mid-March.

@rossant

rossant commented Feb 20, 2015

Great, thanks.
