Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Pseudorandom number generator #135

Closed
masuday opened this issue Feb 1, 2020 · 22 comments
Closed

Pseudorandom number generator #135

masuday opened this issue Feb 1, 2020 · 22 comments
Labels
topic: algorithms searching and sorting, merging, ...

Comments

@masuday
Copy link

masuday commented Feb 1, 2020

EDIT: An updated proposal is available below.

There are a few suggestions to have pseudorandom number generators (PRNGs) in the library, e.g., #1 (comment) and #104 (comment). Focusing on the uniform floating-point PRNG [0.0, 1.0), we have already had the intrinsic subroutine, random_number. The first question is whether we should have custom PRNGs of the uniform distribution. Or, we should concentrate on derived PRNGs like non-uniform distributions or integer PRNGs. I think it is still worth developing custom PRNGs because of better flexibility and performance.

With the custom PRNGs, we have many more questions.

  • Should we have various generators (e.g., integer numbers) in addition to a floating-point PRNG [0.0, 1.0)?
  • Should it support different algorithms of PRNG, or can we provide a single algorithm?
    • If true: Should it optionally call the intrinsic subroutine?
    • If true: Should the seed (state) be global, private, or both?
      • If global, Should it be thread-safe?

The decision defines a design of API. A possible API is the same as the intrinsic subroutines. The subroutine name in the below example is just a placeholder.

! just arbitrary precision here
type(random_number_generator) :: rng
real(dp) :: harvest
integer :: seed(2)

! for local state as derived type
seed = [1,2]
call pseudorandom_seed(rng,put=seed)
call pseudorandom_number(rng,harvest)

! for global state as saved seed
call pseudorandom_seed(put=seed)
call pseudorandom_number(harvest)

Please tell me what you think about it.

@peteroupc
Copy link

peteroupc commented Feb 1, 2020

I have a section "Guidelines for New RNG APIs" that gives my thoughts on how new APIs should support RNGs. (It seems that the focus is not on security, so ignore the information on cryptographic RNGs there if that is true.)

Also, any new PRNGs provided by the API should not use global state; see also the recent NumPy RNG policy.

@ivan-pi
Copy link
Member

ivan-pi commented Feb 1, 2020

I would support having both custom uniform and also non-uniform and integer distributions. Importantly, for Fortran the RNG should work also when run in parallel (both coarray or OpenMP, OpenMPI environments). Quoting from a blog post by Jason Blevins:

To fix this problem, each thread needs to have its own independent RNG. There is no simple way to accomplish this with Fortran’s random_number intrinsic. Instead, we need a more sophisticated RNG that can have multiple “instances.” Fortran’s user-defined types provide a nice solution. If we create a type that will store the state of each stream of random numbers, we can then build an array of RNGs–one for each thread or experiment–each started at a different seed.

The C++ standard library also provides several non-uniform distributions and different generators in a high-level object API.
The NAG libary also provides both different generators and also non-uniform distributions.
Rust also provides thread-local RNGs.

@leonfoks
Copy link

leonfoks commented Feb 2, 2020

@peteroupc That link is fantastic, ill check it out!

I have a PRNG that is thread safe, and non global state, in coretran. See #104 and this link

It is thread safe under OpenMP and OpenMPI, I have not tried it with coarrays but I don't see why it wouldn't work.

The base generator can be changed depending on what the user wants. So far i've implemented the xorshift128+ and xorshift1024* but I would like to add xoroshiro variants too since I read they are a little more robust when it comes to the big crush tests. The seed of the generators can be specified. I also have the capability of randomly generating seeds using the splitmix64 algorithm. The PRNG class can generate past just uniform. Most of the functions were written by Alan Miller and were released into public domain.

The default generators in Matlab and Python are Mersenne Twister generators which have a huge period, but they can't be "jumped" for parallel codes, although after reading the link above, maybe they can? My PRNG class has a "jump" type bound procedure. IMHO The best way to use a PRNG in parallel is to instantiate and seed every PRNG on every thread with the same seed, and jump each "rank's" PRNG by an integer amount of cycles, (usually the rank (or thread) number)

I believe the speed is comparable to numpy's generator, but its been a while since I ran them.

I would love to use this as a starting point, and modify/add to what I already have! Feedback would be greatly appreciated!

Caveat I've tested with gfortran, and I think Intel. Would need testing with other compilers.

Here's the example from my docs on how to use the PRNG in coretran.

program PrngTest
use Prng_Class, only: Prng
implicit none

type(Prng) :: rng
integer(i64) :: seed(16)
integer(i32) :: i, id

real(r64) :: a

! Use a constructor to initialize the Prng with a random seed
! Using display = .true. will print the randomly generated seed to the screen
! So you can reproduce any results later if you wish.
rng = Prng(big = .true., display = .true.)

! Or you could have set the seed using this
! seed = [Some numbers, size 16 if big == .true., size 2 if big == .false.]
! and you can use rng = Prng(seed, .true., .true.)

! Draw from a uniform distribution
call rng%rngUniform(a) ! Can take min, max

! Draw an integer between 1 and 100 inclusive
call rng%rngInteger(id, 1, 100)

! Other distributions
call rng%rngNormal(a) ! can take mean, std
call rng%rngExponential(a, lambda=1.d0)
call rng%rngWeibull(a, lambda=1.d0, k=1.d0)
stop
end program

And here is an example in parallel using OpenMPI

program PrngTest_mpi
include 'mpif.h'
use Prng_Class, only: Prng
implicit none

type(Prng) :: rng
integer(i64) :: seed(16)
integer(i32) :: i, id, ierror, size, rank

real(r64) :: a

call MPI_INIT(ierror)
call MPI_COMM_SIZE(MPI_COMM_WORLD, size, ierror)
call MPI_COMM_RANK(MPI_COMM_WORLD, rank, ierror)

! Generate a seed that is the same on all ranks.!
! seed = [16 numbers]
! This could be generated randomly and broadcast, or set by a user.

! Initialize the PRNG on each rank
rng = Prng(seed, big=.true., display=.true.)

! To avoid PRNG aliasing, jump each PRNG class by the rank number
rng%jump(rank)

! Do cool things.

call MPI_FINALIZE(ierror)
end program

@masuday
Copy link
Author

masuday commented Feb 2, 2020

@peteroupc Thank you for your thoughts. The new PRNG should be non-cryptographic, and a possible option includes xoshiro, xoroshiro, and/or Mersenne Twister (MT). For API, IMHO, we are going to the basic uniform PRNG [0,1) because of the compatibility with random_number as the first step, then likely integer PRNGs. The PRNG (0,1) or (0,1] is useful for statistical distributions.

@ivan-pi Thank you for the useful resource. It seems that, in parallel, each thread should have a thread-specific PRNG, and I agree with this design.

@leonfoks Thank you for sharing your code. I agree that xoshiro/xoroshiro can be the default PRNG. A code for Jump Ahead for MT is available. It is useful for us to have PRNGs of probability distribution functions, and we will probably discuss it after we fix the APIs for the basic PRNGs. We can take advantage of your code to implement the algorithms!

As @ivan-pi and @leonfoks suggested, it seems to be better that each thread has a private state explicitly. Steve Lionel also suggested it. The support for thread-safety is inconsistent across compilers:

  • Intel Fortran: thread-safe only for coarrays.
  • GFortran: thread-safe for OpenMP.

By the way, I have one more question: how much flexibility can we have in PRNGs?

  1. Assuming we are going to have several algorithms (e.g., xoshiro and MT), if the kind of internal state (int64) is the same across the algorithms, the implementation can be simple. Supporting both int32 and int64, we need a careful design in APIs. The kind of random bits (outputs) has the same issue. The issue is in "jump", too. Some algorithms may not support it (as in a table by @peteroupc), and the function can be optional.

  2. Should it be flexible enough for the users to be able to use their custom PRNGs? Or, we simply provide a couple of built-in algorithms? If we take the latter, the internal state and the output can be fixed to int64.

@peteroupc
Copy link

peteroupc commented Feb 2, 2020

I should note that generating random floating-point numbers in computers depends on generating random integers — not the other way around. See also this answer. Notably, since there are infinitely many real numbers between two others, it is impossible for any computer to choose from among all of them. Thus, floating-point random generation methods should be derived from integer PRNGs, not the other way around. The API can include a method that generates a number in [0, 1), but that should not be the only random generation method available.

I should also note that many high-quality PRNGs have an internal state of much more than 64 bits (usually either 128 or 256 bits). (In this sense, MT19937's state is huge compared to other modern PRNGs.) Rather, it's more useful to speak of the number of seeds the PRNG admits (which should be at least 2^63 for high-quality PRNGs), and less importantly, of the output size in bits.

Having named PRNGs (as is the case in C++ and NumPy 1.17) is useful, since their implementation is more likely to be stable across time. Allowing the use of custom PRNGs is also useful, but is less important at this point.

@certik
Copy link
Member

certik commented Feb 2, 2020

@masuday thank you for starting the discussion. I think this is a nice fit into stdlib.

Here is my implementation of some of the random number generators (gamma and normal distribution):

https://github.com/certik/hfsolver/blob/b4c50c1979fb7e468b1852b144ba756f5a51788d/src/random.f90

also initializing the seed using system time is a very useful function that we should have in some form in stdlib:

https://github.com/certik/hfsolver/blob/b4c50c1979fb7e468b1852b144ba756f5a51788d/src/utils.f90#L402

@masuday
Copy link
Author

masuday commented Feb 2, 2020

@peteroupc Thank you for your comment! I realized my comment was somewhat ambiguous.

I am aware that random FP numbers are from uniform random-bit generators (URBGs) like xoshiro256++. My previous comment meant that the standard FP PRNG [0,1) is a good starting point to discuss API and some specifications because the reference implementation is already available in Fortran (random_number). We can expand the discussion to uniform integer PRNGs.

I think there is no problem in the size of the internal state, i.e., random_seed(size=). A possible issue could be its "kind" (int32 or int64). If we support the maximum flexibility in URBGs, we should support any "kinds" for the internal state. We can fix it to a single "kind", and we can transfer (type-cast) it to the other "kinds". But IMHO, casting is awkward in Fortran (although it is debatable). Perhaps I am afraid of it too much, and we can just use a fixed kind, int64.

C++ and NumPy is a useful reference for APIs, as @peteroupc mentioned. I tend to like a procedural style (similar to the intrinsic random_number and random_seed as in the original post) so that the users easily guess how to use the new PRNGs.

@certik makes a good point (thank you for sharing your code!). random_number is a generic name to return a scalar or an array. IMHO, the new PRNGs should have the same feature. Also, I would like to a procedure to initialize the seed by timestamp.

@masuday
Copy link
Author

masuday commented Feb 3, 2020

This post is pretty long. I made it because more concrete suggestions can stimulate the discussion.

Here is a draft proposal with some description of the issue. A formal proposal will come in a separate thread. I use some terminology taken from the C++ implementation. In the draft, random numbers mean pseudorandom numbers.

Overview

The scope of PRNGs in stdlib is limited to non-cryptographic random numbers. Here is a hierarchy of PRNGs.

  • Uniform random bit generators (URBGs)
    • Uniform random number generators (uniform PRNGs)
      • Shuffling and random sampling
      • Univariate random number distributions
        • Multivariate random number distributions

A random-number engine is a main body of URBG, and a typical engine is a family of the linear feedback shift register (LFSLs) and its variants (Mersenne Twister and xoshiro256, etc.). Uniform generators convert the random bits to a uniform random number in integer or real type. Shuffling and random-sampling algorithms depend on the integer random numbers. The uniform random numbers are used to generate statistically-distributed random numbers.

In this draft, I focus only on URBGs and uniform PRNGs. stdlib should provide particular engines (URBGs) as built-in. The users may use a custom engine, but it should be compliant with the interface of the built-in engines.

We can have both a (procedural) low-level API and an object-oriented API. This proposal is biased to the low-level API, but one can wrap the subroutines to define the OO API.

Low-level API for URBG engines

Engine subroutine

The engine should be a subroutine with a common interface. My suggestion for the interface is that the engine receives an array of int64 state, and returns random bits as a single int64 variable. The procedure changes the state, and we should not use a function.

interface
   subroutine random_number_engine(state,harvest)
      integer(int64),intent(inout) :: state(:)     ! may need "allocatable"
      integer(int64),intent(out) :: harvest
   end subroutine random_number_engine
end interface

The default engine is undefined. I put a dummy subroutine for the engine.

subroutine random_number_engine_undef(state,harvest)
   integer(int64),intent(inout) :: state(:)    ! may need "allocatable"
   integer(int64),intent(out) :: harvest
   stop "undefined engine"
end subroutine random_number_engine_undef

Whenever we use this dummy engine, we need allocatable for the state.

User-derived type

A derived type holds an engine and its state.

type random_number_generator
   integer :: state_size = 0
   integer(int64),allocatable :: state(:)
   procedure(random_number_engine),nopass,pointer :: engine => random_number_engine_undef
end type random_number_generator

Seeding

We should have a subroutine to seed the engine (i.e., constructor). My suggestion is to have a similar subroutine to the reference, random_seed. It specifies the engine and the seed at the same time.

! The name should be changed.
subroutine random_number_seed(rng, engine, size, put, get, auto)
   type(random_number_generator),intent(inout) :: rng   ! state structure
   character(len=*),intent(in),optional :: engine       ! engine name
   integer,intent(out),optional :: size                 ! same as the reference
   integer(int64),intent(in),optional :: put(:)         ! same as the reference
   integer(int64),intent(out),optional :: get(:)        ! same as the reference
   logical,intent(in),optional :: auto                  ! automatic seeding
   ...
end subroutine random_number_seed

If the engine has a "jump" feature as pointed by @leonfoks, we can have an extra subroutine to do it. We may have an additional member variable to the derived type: logical :: has_jump and a procedure procedure(random_number_jump),nopass,pointer :: jump.

Supporting a user-defined engine, we need one more subroutine.

Example

type(random_number_generator) :: rng
integer :: state_size
integer(int64),allocatable :: seed(:)

! traditional method
call random_number_seed(rng, "xoshiro256**", size=state_size)
allocate(seed(state_size))
seed = 1234567890
call random_number_seed(rng, "xoshiro256**", put=seed)
deallocate(seed)

! automatic method based on timestamp etc
call random_number_seed(rng, "xoshiro256**", auto=.true.)

Summary

Suggestion:

  • A derived type has a state and a procedure pointer to the random-bit engine.
  • The engine is a subroutine with the state for input and the random-bit integer for output.
  • The seed subroutine (constructor) has a similar form as random_seed with automatic seeding, engine specification, and more.
  • A separate subroutine can perform "jumping" for some engines.
  • An additional subroutine may support user-defined engines.

Low-level API for uniform PRNGs

Real numbers

First, I suggest a uniform PRNG for a real number in [0,1) as in random_number. The following is only for real64, but we can have a real32 version. Also, we should support the various output, a scalar, and an array with different ranks.

subroutine random_real_uniform(rng,harvest)
   type(random_number_generator),intent(inout) :: rng
   real(real64),intent(out) :: harvest
   ...
end subroutine random_real_uniform

We can extend it to the general uniform distribution.

subroutine random_real_uniform(rng,harvest,lower,upper)
   type(random_number_generator),intent(inout) :: rng
   real(real64),intent(out) :: harvest
   real(real64),intent(in),optional :: lower
   real(real64),intent(in),optional :: upper
   ...
end subroutine random_real_uniform

@peteroupc suggests implementing the different ranges: [0,1), (0,1], (0,1), and [0,1]. I find (0,1) or (0,1] useful for generating random numbers under statistical distributions with log(x). Assuming we implement all the four options, we have several options for the API.

  1. Use additional arguments like random_real_uniform(rng,harvest,lopen=0.0,uclose=1.0).
  2. Use separate subroutines like random_real_uniform_open_close(rng,harvest,0.0,1.0).

We can choose either of them so that the usage is not too complicated.

Integer numbers

Although we can get random bits directly from the engine, we should have a separate subroutine to obtain the bits because of the consistency of APIs.

subroutine random_raw_stream(rng,harvest)
   type(random_number_generator),intent(inout) :: rng
   integer(int64),intent(out) :: harvest
   ...
end subroutine random_raw_stream

For the uniform integer PRNG, we can use the following subroutine.

subroutine random_integer_uniform(rng,harvest,lower,upper)
   type(random_number_generator),intent(inout) :: rng
   integer(int64),intent(out) :: harvest
   integer(int64),intent(in) :: lower
   integer(int64),intent(in) :: upper
   ...
end subroutine random_integer_uniform

Summary

Suggestion:

  • The uniform PRNGs have a similar interface as random_number.
  • The output can be a scalar or an array with various Fortran "kinds".
  • For real numbers, the default generator supports the range, [0,1). Optionally, we can support various ranges like (0,1].
  • For integer numbers, we can have two subroutines for a raw number and a uniform integer.

To be discussed

  • Restriction on engines (supporting int64 only)
  • Design of the user-derived type (object)
  • Boundary of uniform random numbers
  • API for each subroutine
  • Jumping subroutine
  • Naming rules

IMO, it is useful to support a scalar seed like put_scalar= in the seeding subroutine.


API in other programming languages

I reached the above proposal for APIs because of the comparison with the other programming languages.

Local state

Regarding thread safety, each thread should have a separate state. The suggested API always takes the local state as an argument, and it is simple to remember. This design is typical in non-object-oriented languages.

Julia can have the state as a local variable. Most numerical libraries supporting Fortran give the local state (e.g., MKL, NAG, GSL). In object-oriented languages, a PRNG is an instance of a generator class, and each thread has a local PRNG (e.g., C++ and NumPy).

User-derived type specialized to a particular engine

The "type" (say, object) that I have shown in the proposal is flexible to have any engine if it is compliant with the interface. The engine is assigned in the seed subroutine. Because we can use the common object, the users easily remember this rule.

This API is similar to MKL, although it seems not to use a procedure pointer. In C++, the programmer specifies a particular class of the engine. It is efficient in computations, but we have to prepare a derived-type for each of the engines.

@masuday
Copy link
Author

masuday commented Feb 6, 2020

I have investigated the APIs of PRNGs in major programming languages and libraries. See my Gist for details. Based on the research, I would like to suggest a revised API.

I would suggest 3 subroutines for a random-bit generator (the engine): initializer, re-seeder, and uniform PRNG. The user must call the initializer for the first time (optionally with a seed). A separate subroutine puts/gets the internal state. A new API looks like:

   type(random_number_generator) :: rng
   integer :: state_size
   integer(int64),allocatable :: state(:)
   real(dp) :: harvest

   ! initialize
   call random_number_init(rng, "default", seed=[3245,8730,1211])

   ! uniform real numbers
   call random_real_uniform(rng, harvest)

   ! reseed by date_and_time
   call random_number_seed(rng, auto=.true.)

   ! get state
   call random_number_state(rng,size=state_size)
   allocate(state(state_size))
   call random_number_state(rng,get=state)

   ! put state
   call random_number_state(rng,put=state)

I support the "default" PRNG for a real number ranged between 0 and 1. Looking at the other languages/libraries, the range differs.

  • Inclusive of 0: [0,1) i.e. 0<=r<1; Fortran, C++, and major compiled and scripting languages.
  • Exclusive of 0: (0,1] or (0,0); MATLAB, R, and many numerical/statistical libraries and software.

I believe that [0,1) is a straightforward implementation and that (0,1] or (0,0) is useful for applications such as sampling from statistical distribution functions because zero is problematic in logarithm and division (although the probability of getting zero is negligible). I tend to support (0,1] or (0,0) as the default.

Any thoughts?

@leonfoks
Copy link

leonfoks commented Feb 6, 2020

I think the 3 basic methods you suggest are a great starting point for us to get going on this.

I might have missed this somewhere, but are these functions meant to be type bound procedures of the random_number_generator derived type? They might as well be since the user needs to import the random_number_generator class anyway.

The constructor needs to return the derived type random_number_generator from a function to take advantage of the more modern derived type constructor capabilities. This would let the user instantiate the class with the name of the class. @certik @milancurcic Is there any reason we might want to stay away from the constructor interface on DTs based on the discussion about functions returning anything other than a scalar?

Here are the key points

  • Make call random_number_init(rng, "default", seed=[3245,8730,1211]) an actual constructor for the random_number_generator class
  • Rename random_real_uniform() to rand() since that is the most common across the languages.
  • Make the subroutine random_number_state into a type bound procedure, either as two separate subroutines, %get_state and %set_state, or as %state with optional arguments. Personally I prefer two separate TBPs.
  • We don't need the attribute state_size, we can inquire that from the state once it's set.

From a user perspective, using the module below would look like this

type(random_number_generator) :: rng
real(dp) :: a, b(10), c(10, 10)
integer(i64), allocatable :: state(:)

rng = random_number_generator(seed=[3245, 8730, 1211])

call rng%get_state(state)

call rng%rand(a)
call rng%rand(b)
call rng%rand(c)

The module

module m_random_number_generator

type random_number_generator
    integer(int64), allocatable :: state(:)

contains

    procedure, public :: get_state => get_state_rng

    generic, public :: rand => rand_d1_Prng_, rand_d1D_Prng_, rand_d2D_Prng_
        !! random_number_generator%rand() - Draw from a uniform distribution
        !! Draws uniform random numbers on [0, 1)
    procedure, private :: rand_d1_Prng_ => rand_d1_Prng
    procedure, private :: rand_d1D_Prng_ => rand_d1D_Prng
    procedure, private :: rand_d2D_Prng_ => rand_d2D_Prng

    procedure, public :: set_state => set_state_rng

end type random_number_generator

interface random_number_generator
    !!Overloaded Initializer for a Prng - Pseudo random number generator.

    module procedure :: init_with_seed_rng, init_random_seed_rng

end interface random_number_generator

!====================================================================!
function init_with_seed_rng(seed) result(this)
    type(random_number_generator) :: this
        !! Prng Class
    integer(i64), intent(in) :: seed(:)
        !! Seed of the Prng

    ! Set the explicit seed
end function init_with_seed_rng

function init_random_seed_rng() result(this)
    type(random_number_generator) :: this
        !! Prng Class

    ! Use date and time to obtain a random seed
    ! Set the seed.
end function init_random_seed_rng
!====================================================================!

subroutine get_state_rng(this, state)
    class(random_number_generator), intent(in) :: this
        !! Prng Class
    integer(i64), allocatable, intent(inout) :: state(:)
        !! State of the Prng
end subroutine get_state_rng

subroutine set_state_rng(this, state)
    class(random_number_generator), intent(inout) :: this
        !! Prng Class
    integer(i64), intent(in) :: state(:)
        !! Set this as the state of the Prng
end subroutine set_state_rng

! Use FYPP to process the different functions and ranks.
subroutine rand_d1_Prng(this, res)
    class(random_number_generator), intent(inout) :: this
        !! Prng Class
    real(dp), intent(out) :: res
end subroutine

subroutine rand_d1D_Prng(this, res)
    class(random_number_generator), intent(inout) :: this
        !! Prng Class
    real(dp), intent(out) :: res(:)
end subroutine

subroutine rand_d2D_Prng(this, res)
    class(random_number_generator), intent(inout) :: this
        !! Prng Class
    real(dp), intent(out) :: res(:, :)
end subroutine

end module m_random_number_generator

@masuday
Copy link
Author

masuday commented Feb 6, 2020

@leonfoks, I agree with you. I started from a traditional style, and my suggestion converged to an object-oriented style. We can implement PRNGs in the OO style while we have traditional (non-OO) APIs that I prefer to use as @certik. We can wrap the OO APIs with some subroutines (it is a reason why my suggestion has subroutines).

I prefer short names: rand, set_seed, set_state, get_state, It makes sense to separate set_state and get_state for the OO APIs. For traditional APIs, it can be unified. I also agree not to have state_size.

Currently, we do not have OO APIs in src/, so it should be a model-case for such APIs.

@masuday
Copy link
Author

masuday commented Feb 7, 2020

EDIT: I rewrote the code because my post was wrong. Everything in a class is private (protected is not allowed).

I have some suggestions on @leonfoks's class proposal. IMO, the internal state should not be open. Also, we may have 2 constants for convenience.

integer,parameter :: random_seed_kind = int64   ! would be int32
integer,parameter :: random_state_kind = int64

type random_number_generator
private
    integer(random_state_kind),allocatable :: state(:)

contains
...
end type random_number_generator

Following @certik's comment, I would modify get_state to have non-allocatable state.

subroutine get_state(this,state)
    class(random_number_generator) :: this
    integer(random_state_kind),intent(inout) :: state(:)
end subroutine get_state

@ivan-pi
Copy link
Member

ivan-pi commented Mar 10, 2020

@leonfoks, I agree with you. I started from a traditional style, and my suggestion converged to an object-oriented style. We can implement PRNGs in the OO style while we have traditional (non-OO) APIs that I prefer to use as @certik. We can wrap the OO APIs with some subroutines (it is a reason why my suggestion has subroutines).

Wrapping the OO API in a subroutine would effectively mean the code is only valid from Fortran 90 upward when derived types were introduced. While I suppose most people today are using sufficiently modern compilers and would not be affected (unless enforcing specific compiler flags), should we aim to have subroutine versions for users who might want to import the new RNG routines into their decades old F77 codebases? @certik @milancurcic

@certik
Copy link
Member

certik commented Mar 10, 2020

@ivan-pi I think there is general agreement that we will do a low level API that is just subroutines, no derived types (unless there is no clean way to do that), and then an optional high level API that uses OO.

@dev-zero
Copy link

dev-zero commented May 4, 2020

As for the OO-API: I've ported CP2K's parallel stream PRNG to OO. As PRNGs go it is probably a very specific case and unimportant for most other applications, but already before I ported it there was a possibility to dump & load the state.

@certik
Copy link
Member

certik commented Jun 9, 2020

@masuday do you want to open a PR with initial implementation of this?

I think there is a broad agreement that we want this, and it's just about agreeing on an API that would work for everybody.

@masuday
Copy link
Author

masuday commented Jun 9, 2020

@certik Thank you for chiming. I agree with you, and I am making a PR in a couple of days.

@esterjo
Copy link

esterjo commented Dec 21, 2020

I'd like to call out this RNG library, which I use regularly (currently the C++ version). See also wiki. There is a Fortran wrapper to the version producing 32bit random outputs but I think it's worth implementing the whole thing. It seems to have become quite popular lately. There is also an explicit SIMD version of one of the 32-bit output variants in C by way of AVX intrinsics.

EDIT: but a generator with less than 64bit state isn't recommended for scientific applications

@esterjo
Copy link

esterjo commented Dec 22, 2020

Also, I noticed that the author of xorshiro256++ has a "vectorized" implementation. I put it in quotes because he doesn't explicitly use SIMD instrinsics, but writes it in a way that makes it extremely easy for the compiler to vectorize (on gcc use -O3 -fdisable-tree-cunrolli -march=native). A strategy I greatly prefer to explicit SIMD. Not necessary for now, but it would be good to lay the groundwork for optimizing stdlib functions later by leaving reminders in the comments.

@Beliavsky
Copy link

Random number generators are rarely implemented as pure procedures -- the state of the RNG is saved within the procedure or the enclosing module. I suggest that when RNG are added to stdlib that alongside the usual non-pure procedures that versions as pure subroutines also be added, where the state is passed as an argument, as done here. Such pure RNG can then be used where pure procedures are required, such as in do concurrent blocks or in other pure procedures.

@Beliavsky
Copy link

TensorFlow has both impure and "stateless" RNG, for example tf.random.stateless_uniform

@14NGiestas
Copy link
Member

14NGiestas commented May 6, 2022

I'm closing this issue since Probability Distribution and Statistical Functions -- PRNG Module #271 was merged in 6 Feb 2021 so it's likely to be resolved (it isn't tagged in any release version nor mentioned but it was probably in 0.2.0).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
topic: algorithms searching and sorting, merging, ...
Projects
None yet
Development

No branches or pull requests

10 participants