Skip to content

TheClimateCorporation/prng

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

1 Commit
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

com.climate/prng

Pseudorandom number generators in Clojure.

This project is in pre-release development and subject to change.

Concepts

A pseudorandom number generator (PRNG) is a deterministic algorithm that generates a sequence of numbers that appear to be random.

A PRNG has the following components:

  • State space, S: a finite set of possible states.
  • Output space, U: a finite set of possible outputs (pseudorandom numbers) that can be generated.
  • Transition function, f: a pure function that takes a state as its argument and returns a (usually different) state.
  • Output function, g: a pure function that maps a state to an element in the output space. Different output functions may be used to generate different numeric types (e.g. longs or doubles) from the same underlying sequence of states.

The state of a PRNG evolves according to the recurrence relationship (LaTeX math markup):

s_{i} = f(s_{i - 1})
u_{i} = g(s{i})

where

f: S \rightarrow S

is the state transition function and

g: S \rightarrow U

is the output function.

The initial state of a PRNG is typically constructed by provided a "seed" (a source of bits) to a factory function. When initialized with the same seed, a PRNG is guaranteed to subsequently generate exactly the same sequence of numbers.

Most widely used PRNG libraries represent PRNGs as stateful objects. A typical example is the java.util.Random class in Java, which encapsulates a hidden state. Calling a method to generate the next pseudorandom number in the sequence has the side effect of mutating this hidden state:

(def r (java.util.Random. 1234567))
(.nextDouble r)
;=> 0.24283348968807417
; nextDouble is impure! the second call returns a different result!
(.nextDouble r)
;=> 0.7520183689538884

Such statefulness makes it difficult to reason about concurrency or reproducibility. Also, as shown in this example, stateful PRNG libraries typically conflate the transition function and the output function in a single method.

The prng library takes a different approach to PRNGs that is more functional and Clojure-like. Instead of implicitly mutating a stateful object, the state of the PRNG is directly evolved by calling a transition function that returns the next state, and a pseudorandom number is directly generated by calling an output function on a state. States are values that can be passed to functions that produce other values. The transition function and the output function are separated concerns. This functional design is intended to make it easy to reason about concurrency and reproducibility, especially when using PRNGs in large distributed systems in which coordination is costly.

Usage

This is a quick demo that you can follow in a Clojure REPL to get acquainted with the prng library.

The namespace com.climate.prng.core defines a PRNGState protocol whose methods can be invoked on all implementing types.

(require '[com.climate.prng.core :as prng])

Specific PRNG implementations are found in namespaces under the prefix com.climate.prng.generators. Currently, two implementations are available:

Try using the Mersenne Twister:

(require '[com.climate.prng.generators.mersenne-twister :as mt])

To instantiate a Mersenne Twister state, call a factory function with a long[] seed argument.

(def seed (long-array [0 1 2 3]))
(def initial-state (mt/seed-state seed)))

The seed-state factory function returns a state value that satisfies the PRNGState protocol.

(satisfies? prng/PRNGState initial-state)
;=> true

For a given seed argument, the same return value ("same" in the sense of equals()) is returned:

(= initial-state (mt/seed-state seed))
;=> true

The PRNGState protocol methods are pure functions that take a state value argument and return other values, without mutating the input state value.

To sample a long from this PRNG, simply call the ->long protocol method on the state. ->long is a pure function; for a given state, it always returns the same value, and it does not mutate its state argument.

(prng/->long initial-state)
;=> -3722652993627521228
; Multiple calls return the same value.
(prng/->long initial-state)
;=> -3722652993627521228

Similarly, the ->double protocol method returns a double:

(prng/->double initial-state)
;=> 0.4049867569919572

The next-state protocol method takes a state argument and returns a new state that is one step ahead in the state sequence of the PRNG. The input state is not mutated.

(def new-state (prng/next-state initial-state))
(= new-state initial-state)
;=> false
; Get a long from this new state
(prng/->long new-state)
;=> -909579606142348662
; Get a long from the old state (compare to previous calls)
(prng/->long initial-state)
;=> -3722652993627521228

The jump-state protocol method takes two arguments, a state and a jump size, and jumps ahead by the provided jump size. jump-state returns the same state value as iterative composition of next-state, but it is much faster.

(def jump-size 1000000)
(def state-after-jump (prng/jump-state initial-state jump-size))
(= state-after-jump ((apply comp (repeat jump-size prng/next-state)) initial-state))
;=> true

Tests

lein test runs unit tests.

lein test :benchmark runs tests with the :benchmark test selector, which are a few performance benchmarks. They may take a while to complete.

Known issues

Concurrency

The Java classes MersenneTwisterState and ThreefishState have public methods that are untested for thread safety and are probably unsafe. Although the PRNGState protocol is designed for concurrency, it possible to bypass the protocol and directly invoke the unsafe methods of the implementation classes. To ensure thread safety, these implementation classes ought to be re-implemented with Clojure-style persistent data structures.

Speed

For the MersenneTwisterState implementation class, the jump-state protocol method uses a naive algorithm to compute jump-ahead from the remainder polynomial. Maromoto and colleauges demonstrated two alternative algorithms to compute jump-ahead more quickly. The jump-state method ought to be reimplemented with one of these faster algorithms.

The implementation of the jump-state method for ThreefishState uses kludgy BigInteger and hexadecimal string manipulations to perform binary arithmetic. Performance can almost certainly be improved by thoughtful bit twiddling.

Pseudorandomness

The Mersenne Twister is a well-known, high-quality PRNG that has been thoroughly tested for pseudorandomness. The implementation in this library is a bit-exact match to the reference implementation in C.

The counter-based Threefish PRNG has not been as well tested, and its pseudorandomness ought to be teststed with a suite of statistical tests.

License

Copyright © 2014 The Climate Corporation

Distributed under the Apache License, Version 2.0

About

pseudorandom number generators in Clojure

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published