Merge pull request #14608 from mattip/random-api
API: rearrange the cython files in numpy.random
rgommers committed Oct 17, 2019
2 parents 1185880 + 1531642 commit 9ee262b
Showing 42 changed files with 2,124 additions and 688 deletions.
1 change: 1 addition & 0 deletions MANIFEST.in
@@ -15,6 +15,7 @@ include tox.ini
include .coveragerc
include test_requirements.txt
recursive-include numpy/random *.pyx *.pxd *.pyx.in *.pxd.in
include numpy/random/include/*
include numpy/__init__.pxd
# Add build support that should go in sdist, but not go in bdist/be installed
# Note that sub-directories that don't have __init__ are apparently not
11 changes: 0 additions & 11 deletions doc/source/reference/random/bit_generators/bitgenerators.rst

This file was deleted.

43 changes: 21 additions & 22 deletions doc/source/reference/random/bit_generators/index.rst
@@ -1,5 +1,3 @@
.. _bit_generator:

.. currentmodule:: numpy.random

Bit Generators
@@ -35,14 +33,18 @@ The included BitGenerators are:
.. _`Random123`: https://www.deshawresearch.com/resources_random123.html
.. _`SFC author's page`: http://pracrand.sourceforge.net/RNG_engines.txt

.. autosummary::
:toctree: generated/

BitGenerator

.. toctree::
:maxdepth: 1
:maxdepth: 1

BitGenerator <bitgenerators>
MT19937 <mt19937>
PCG64 <pcg64>
Philox <philox>
SFC64 <sfc64>
MT19937 <mt19937>
PCG64 <pcg64>
Philox <philox>
SFC64 <sfc64>

Seeding and Entropy
-------------------
@@ -53,14 +55,14 @@ seed. All of the provided BitGenerators will take an arbitrary-sized
non-negative integer, or a list of such integers, as a seed. BitGenerators
need to take those inputs and process them into a high-quality internal state
for the BitGenerator. All of the BitGenerators in numpy delegate that task to
`~SeedSequence`, which uses hashing techniques to ensure that even low-quality
`SeedSequence`, which uses hashing techniques to ensure that even low-quality
seeds generate high-quality initial states.

.. code-block:: python
from numpy.random import PCG64
from numpy.random import PCG64
bg = PCG64(12345678903141592653589793)
bg = PCG64(12345678903141592653589793)
.. end_block
@@ -75,14 +77,14 @@ user, which is up to you.

.. code-block:: python
from numpy.random import PCG64, SeedSequence
from numpy.random import PCG64, SeedSequence
# Get the user's seed somehow, maybe through `argparse`.
# If the user did not provide a seed, it should return `None`.
seed = get_user_seed()
ss = SeedSequence(seed)
print('seed = {}'.format(ss.entropy))
bg = PCG64(ss)
# Get the user's seed somehow, maybe through `argparse`.
# If the user did not provide a seed, it should return `None`.
seed = get_user_seed()
ss = SeedSequence(seed)
print('seed = {}'.format(ss.entropy))
bg = PCG64(ss)
.. end_block
@@ -104,9 +106,6 @@ or using ``secrets.randbits(128)`` from the standard library are both
convenient ways.

.. autosummary::
:toctree: generated/
:toctree: generated/

SeedSequence
bit_generator.ISeedSequence
bit_generator.ISpawnableSeedSequence
bit_generator.SeedlessSeedSequence
2 changes: 1 addition & 1 deletion doc/source/reference/random/index.rst
@@ -123,7 +123,7 @@ The `Generator` is the user-facing object that is nearly identical to
rg.random()
One can also instantiate `Generator` directly with a `BitGenerator` instance.
To use the older `~mt19937.MT19937` algorithm, one can instantiate it directly
To use the older `MT19937` algorithm, one can instantiate it directly
and pass it to `Generator`.

.. code-block:: python
13 changes: 7 additions & 6 deletions doc/source/reference/random/new-or-different.rst
@@ -10,19 +10,20 @@ What's New or Different
The Box-Muller method used to produce NumPy's normals is no longer available
in `Generator`. It is not possible to reproduce the exact random
values using ``Generator`` for the normal distribution or any other
distribution that relies on the normal such as the `gamma` or
`standard_t`. If you require bitwise backward compatible
streams, use `RandomState`.
distribution that relies on the normal such as the `Generator.gamma` or
`Generator.standard_t`. If you require bitwise backward compatible
streams, use `RandomState`, i.e., `RandomState.gamma` or
`RandomState.standard_t`.

Quick comparison of legacy `mtrand <legacy>`_ to the new `Generator`

================== ==================== =============
Feature Older Equivalent Notes
------------------ -------------------- -------------
`~.Generator` `~.RandomState` ``Generator`` requires a stream
source, called a `BitGenerator
<bit_generators>` A number of these
are provided. ``RandomState`` uses
source, called a `BitGenerator`
A number of these are provided.
``RandomState`` uses
the Mersenne Twister `~.MT19937` by
default, but can also be instantiated
with any BitGenerator.
18 changes: 9 additions & 9 deletions doc/source/reference/random/parallel.rst
@@ -18,10 +18,10 @@ a `~BitGenerator`. It uses hashing techniques to ensure that low-quality seeds
are turned into high quality initial states (at least, with very high
probability).

For example, `~mt19937.MT19937` has a state consisting of 624
For example, `MT19937` has a state consisting of 624
`uint32` integers. A naive way to take a 32-bit integer seed would be to just set
the last element of the state to the 32-bit seed and leave the rest 0s. This is
a valid state for `~mt19937.MT19937`, but not a good one. The Mersenne Twister
a valid state for `MT19937`, but not a good one. The Mersenne Twister
algorithm `suffers if there are too many 0s`_. Similarly, two adjacent 32-bit
integer seeds (i.e. ``12345`` and ``12346``) would produce very similar
streams.
@@ -91,15 +91,15 @@ territory ([2]_).
.. [2] In this calculation, we can ignore the amount of numbers drawn from each
stream. Each of the PRNGs we provide has some extra protection built in
that avoids overlaps if the `~SeedSequence` pools differ in the
slightest bit. `~pcg64.PCG64` has :math:`2^{127}` separate cycles
slightest bit. `PCG64` has :math:`2^{127}` separate cycles
determined by the seed in addition to the position in the
:math:`2^{128}` long period for each cycle, so one has to both get on or
near the same cycle *and* seed a nearby position in the cycle.
`~philox.Philox` has completely independent cycles determined by the seed.
`~sfc64.SFC64` incorporates a 64-bit counter so every unique seed is at
`Philox` has completely independent cycles determined by the seed.
`SFC64` incorporates a 64-bit counter so every unique seed is at
least :math:`2^{64}` iterations away from any other seed. And
finally, `~mt19937.MT19937` has just an unimaginably huge period. Getting
a collision internal to `~SeedSequence` is the way a failure would be
finally, `MT19937` has just an unimaginably huge period. Getting
a collision internal to `SeedSequence` is the way a failure would be
observed.
.. _`implements an algorithm`: http://www.pcg-random.org/posts/developing-a-seed_seq-alternative.html
@@ -113,10 +113,10 @@ territory ([2]_).
Independent Streams
-------------------

:class:`~philox.Philox` is a counter-based RNG based which generates values by
`Philox` is a counter-based RNG based which generates values by
encrypting an incrementing counter using weak cryptographic primitives. The
seed determines the key that is used for the encryption. Unique keys create
unique, independent streams. :class:`~philox.Philox` lets you bypass the
unique, independent streams. `Philox` lets you bypass the
seeding algorithm to directly set the 128-bit key. Similar, but different, keys
will still create independent streams.

20 changes: 10 additions & 10 deletions doc/source/reference/random/performance.rst
@@ -5,21 +5,21 @@ Performance

Recommendation
**************
The recommended generator for general use is :class:`~pcg64.PCG64`. It is
The recommended generator for general use is `PCG64`. It is
statistically high quality, full-featured, and fast on most platforms, but
somewhat slow when compiled for 32-bit processes.

:class:`~philox.Philox` is fairly slow, but its statistical properties have
`Philox` is fairly slow, but its statistical properties have
very high quality, and it is easy to get assuredly-independent stream by using
unique keys. If that is the style you wish to use for parallel streams, or you
are porting from another system that uses that style, then
:class:`~philox.Philox` is your choice.
`Philox` is your choice.

:class:`~sfc64.SFC64` is statistically high quality and very fast. However, it
`SFC64` is statistically high quality and very fast. However, it
lacks jumpability. If you are not using that capability and want lots of speed,
even on 32-bit processes, this is your choice.

:class:`~mt19937.MT19937` `fails some statistical tests`_ and is not especially
`MT19937` `fails some statistical tests`_ and is not especially
fast compared to modern PRNGs. For these reasons, we mostly do not recommend
using it on its own, only through the legacy `~.RandomState` for
reproducing old results. That said, it has a very long history as a default in
@@ -31,20 +31,20 @@ Timings
*******

The timings below are the time in ns to produce 1 random value from a
specific distribution. The original :class:`~mt19937.MT19937` generator is
specific distribution. The original `MT19937` generator is
much slower since it requires 2 32-bit values to equal the output of the
faster generators.

Integer performance has a similar ordering.

The pattern is similar for other, more complex generators. The normal
performance of the legacy :class:`~.RandomState` generator is much
performance of the legacy `RandomState` generator is much
lower than the other since it uses the Box-Muller transformation rather
than the Ziggurat generator. The performance gap for Exponentials is also
large due to the cost of computing the log function to invert the CDF.
The column labeled MT19973 is used the same 32-bit generator as
:class:`~.RandomState` but produces random values using
:class:`~Generator`.
`RandomState` but produces random values using
`Generator`.

.. csv-table::
:header: ,MT19937,PCG64,Philox,SFC64,RandomState
@@ -61,7 +61,7 @@ The column labeled MT19973 is used the same 32-bit generator as
Poissons,67.6,52.4,69.2,46.4,78.1

The next table presents the performance in percentage relative to values
generated by the legacy generator, `RandomState(MT19937())`. The overall
generated by the legacy generator, ``RandomState(MT19937())``. The overall
performance was computed using a geometric mean.

.. csv-table::
2 changes: 1 addition & 1 deletion doc/source/release/1.17.0-notes.rst
@@ -239,7 +239,7 @@ New extensible `numpy.random` module with selectable random number generators
-----------------------------------------------------------------------------
A new extensible `numpy.random` module along with four selectable random number
generators and improved seeding designed for use in parallel processes has been
added. The currently available :ref:`Bit Generators <bit_generator>` are
added. The currently available `Bit Generators` are
`~mt19937.MT19937`, `~pcg64.PCG64`, `~philox.Philox`, and `~sfc64.SFC64`.
``PCG64`` is the new default while ``MT19937`` is retained for backwards
compatibility. Note that the legacy random module is unchanged and is now
4 changes: 2 additions & 2 deletions doc/source/user/quickstart.rst
@@ -206,8 +206,8 @@ of elements that we want, instead of the step::
`empty_like`,
`arange`,
`linspace`,
`numpy.random.mtrand.RandomState.rand`,
`numpy.random.mtrand.RandomState.randn`,
`numpy.random.RandomState.rand`,
`numpy.random.RandomState.randn`,
`fromfunction`,
`fromfile`

4 changes: 2 additions & 2 deletions numpy/matlib.py
@@ -239,7 +239,7 @@ def rand(*args):
See Also
--------
randn, numpy.random.rand
randn, numpy.random.RandomState.rand
Examples
--------
@@ -285,7 +285,7 @@ def randn(*args):
See Also
--------
rand, random.randn
rand, numpy.random.RandomState.randn
Notes
-----
21 changes: 10 additions & 11 deletions numpy/random/__init__.py
@@ -179,20 +179,19 @@

# add these for module-freeze analysis (like PyInstaller)
from . import _pickle
from . import common
from . import bounded_integers

from . import _common
from . import _bounded_integers

from ._generator import Generator, default_rng
from ._bit_generator import SeedSequence, BitGenerator
from ._mt19937 import MT19937
from ._pcg64 import PCG64
from ._philox import Philox
from ._sfc64 import SFC64
from .mtrand import *
from .generator import Generator, default_rng
from .bit_generator import SeedSequence
from .mt19937 import MT19937
from .pcg64 import PCG64
from .philox import Philox
from .sfc64 import SFC64
from .mtrand import RandomState

__all__ += ['Generator', 'RandomState', 'SeedSequence', 'MT19937',
'Philox', 'PCG64', 'SFC64', 'default_rng']
'Philox', 'PCG64', 'SFC64', 'default_rng', 'BitGenerator']


def __RandomState_ctor():
13 changes: 11 additions & 2 deletions numpy/random/bit_generator.pxd → numpy/random/_bit_generator.pxd
@@ -1,6 +1,15 @@

from .common cimport bitgen_t, uint32_t
cimport numpy as np
from libc.stdint cimport uint32_t, uint64_t

cdef extern from "include/bitgen.h":
struct bitgen:
void *state
uint64_t (*next_uint64)(void *st) nogil
uint32_t (*next_uint32)(void *st) nogil
double (*next_double)(void *st) nogil
uint64_t (*next_raw)(void *st) nogil

ctypedef bitgen bitgen_t

cdef class BitGenerator():
cdef readonly object _seed_seq

@@ -53,9 +53,7 @@ from cpython.pycapsule cimport PyCapsule_New
import numpy as np
cimport numpy as np

from libc.stdint cimport uint32_t
from .common cimport (random_raw, benchmark, prepare_ctypes, prepare_cffi)
from .distributions cimport bitgen_t
from ._common cimport (random_raw, benchmark, prepare_ctypes, prepare_cffi)

__all__ = ['SeedSequence', 'BitGenerator']

@@ -116,7 +114,7 @@ def _coerce_to_uint32_array(x):
Examples
--------
>>> import numpy as np
>>> from numpy.random.bit_generator import _coerce_to_uint32_array
>>> from numpy.random._bit_generator import _coerce_to_uint32_array
>>> _coerce_to_uint32_array(12345)
array([12345], dtype=uint32)
>>> _coerce_to_uint32_array('12345')
@@ -484,13 +482,12 @@ cdef class BitGenerator():
Parameters
----------
seed : {None, int, array_like[ints], ISeedSequence}, optional
seed : {None, int, array_like[ints], SeedSequence}, optional
A seed to initialize the `BitGenerator`. If None, then fresh,
unpredictable entropy will be pulled from the OS. If an ``int`` or
``array_like[ints]`` is passed, then it will be passed to
`SeedSequence` to derive the initial `BitGenerator` state. One may also
pass in an implementor of the `ISeedSequence` interface like
`SeedSequence`.
~`numpy.random.SeedSequence` to derive the initial `BitGenerator` state.
One may also pass in a `SeedSequence` instance.
Attributes
----------
29 changes: 29 additions & 0 deletions numpy/random/_bounded_integers.pxd
@@ -0,0 +1,29 @@
from libc.stdint cimport (uint8_t, uint16_t, uint32_t, uint64_t,
int8_t, int16_t, int32_t, int64_t, intptr_t)
import numpy as np
cimport numpy as np
ctypedef np.npy_bool bool_t

from ._bit_generator cimport bitgen_t

cdef inline uint64_t _gen_mask(uint64_t max_val) nogil:
"""Mask generator for use in bounded random numbers"""
# Smallest bit mask >= max
cdef uint64_t mask = max_val
mask |= mask >> 1
mask |= mask >> 2
mask |= mask >> 4
mask |= mask >> 8
mask |= mask >> 16
mask |= mask >> 32
return mask

cdef object _rand_uint64(object low, object high, object size, bint use_masked, bint closed, bitgen_t *state, object lock)
cdef object _rand_uint32(object low, object high, object size, bint use_masked, bint closed, bitgen_t *state, object lock)
cdef object _rand_uint16(object low, object high, object size, bint use_masked, bint closed, bitgen_t *state, object lock)
cdef object _rand_uint8(object low, object high, object size, bint use_masked, bint closed, bitgen_t *state, object lock)
cdef object _rand_bool(object low, object high, object size, bint use_masked, bint closed, bitgen_t *state, object lock)
cdef object _rand_int64(object low, object high, object size, bint use_masked, bint closed, bitgen_t *state, object lock)
cdef object _rand_int32(object low, object high, object size, bint use_masked, bint closed, bitgen_t *state, object lock)
cdef object _rand_int16(object low, object high, object size, bint use_masked, bint closed, bitgen_t *state, object lock)
cdef object _rand_int8(object low, object high, object size, bint use_masked, bint closed, bitgen_t *state, object lock)
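
Reviewer note: the `_gen_mask` helper declared in `_bounded_integers.pxd` above can be read as the pure-Python sketch below. It is an illustration only, not code from this commit. The names `gen_mask` and `bounded_uint64` are hypothetical, and the mask-and-reject loop is one common way such a mask is used; the diff shown here does not include the call sites, so that part is an assumption.

```python
import random


def gen_mask(max_val: int) -> int:
    """Return the smallest all-ones bit mask that is >= max_val.

    Mirrors the shift-and-or cascade of the Cython `_gen_mask` for
    values that fit in an unsigned 64-bit integer.
    """
    if not 0 <= max_val < 2**64:
        raise ValueError("max_val must fit in an unsigned 64-bit integer")
    mask = max_val
    # Each step copies the highest set bit into twice as many lower bits.
    mask |= mask >> 1
    mask |= mask >> 2
    mask |= mask >> 4
    mask |= mask >> 8
    mask |= mask >> 16
    mask |= mask >> 32
    return mask


def bounded_uint64(max_val: int, rng: random.Random) -> int:
    """Draw uniformly from [0, max_val] by masking and rejecting.

    Hypothetical usage sketch: the standard-library `random.Random`
    stands in for a bit generator here.
    """
    mask = gen_mask(max_val)
    while True:
        value = rng.getrandbits(64) & mask
        if value <= max_val:
            return value


if __name__ == "__main__":
    rng = random.Random(12345)
    print(gen_mask(5), bounded_uint64(5, rng))
```

Because the mask is at most one bit wider than `max_val`, each masked draw is accepted with probability greater than one half, so the rejection loop terminates quickly on average.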