This package explores a possible extension of rand-related functionality (from the Random module); the code was initially taken from JuliaLang/julia#24912.
Note that type piracy is committed!
While hopefully useful, this package is still experimental, and
hence unstable. User feedback, and design or implementation contributions are welcome.
This does essentially four things:
- define distribution objects, to give first-class status to features provided by Random; for example, rand(Normal(), 3) is equivalent to randn(3); other available distributions: Exponential, CloseOpen (for generation of floats in a closed-open range) and friends, Uniform (which can wrap an implicit uniform distribution);
- define make methods, which can combine distributions for objects made of multiple scalars, like Pair, Tuple, or Complex, or describe how to generate more complex objects, like containers;
- extend the rand([rng], [S], dims) API to allow the generation of containers other than arrays (like Set, Dict, SparseArray, String, BitArray);
- define a Rand iterator, which lazily produces random values.
Point 1) defines a Distribution type which is incompatible with the "Distributions.jl" package. Input on how to unify the two approaches is welcome.
Point 2) is really the core of this package. make provides a vocabulary to define the generation of "scalars" which require more than one argument to be described, e.g. pairs from 1:3 to Int (rand(make(Pair, 1:3, Int))), or regular containers (e.g. make(Array, 2, 3)). The point of calling make rather than putting all the arguments in rand directly is simplicity and composability: the make call always occurs as the second argument to rand (or first if the RNG is omitted). For example, rand(make(Array, 2, 3), 3) creates an array of matrices.
Of course, make is not necessary, in that the same can be achieved with an ad hoc struct, which in some cases is clearer (e.g. Normal(m, s) rather than something like make(Float64, Val(:Normal), m, s)).
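The composability argument can be seen in a short sketch. It only reuses calls that appear elsewhere in this README and assumes RandomExtensions is installed; the variable names are illustrative.

```julia
using RandomExtensions

# A "scalar" that needs two arguments to be described:
# a Pair whose first member comes from 1:3 and whose second is a random Int.
p = rand(make(Pair, 1:3, Int))

# make occupies a single argument slot, so the usual dims can still follow it:
# here, a vector of three 2×3 matrices.
ms = rand(make(Array, 2, 3), 3)

# make calls can be nested: a 3-tuple of pairs.
t = rand(make(NTuple{3}, make(Pair, 1:9, Bool)))
```

Because everything describing the value lives inside make, the remaining arguments of rand keep their usual meaning.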
As an experimental feature, the following alternative API is available:
- rand(T => x) is equivalent to rand(make(T, x))
- rand(T => (x, y, ...)) is equivalent to rand(make(T, x, y, ...))
This is for convenience only (it may be more readable), but may be less efficient because the type of a pair containing a type does not record that exact type (e.g. Pair => Int has type Pair{UnionAll,DataType}), so rand can't infer the type of the generated value.
Thanks to inlining, the inferred types can however be sufficiently tight in some cases (e.g. rand(Complex => Int, 3) is of type Vector{Complex{Int64}} instead of Vector{Any}).
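The loss of type information is easy to check with Base alone, without loading this package. A small sketch:

```julia
# A pair whose members are themselves types stores them as plain values,
# so its type only records what kind of type object each member is.
pa = Pair => Int
pb = Complex => Float64

# Both pairs have exactly the same type, although they describe
# different things to generate.
typeof(pa)                # Pair{UnionAll, DataType}
typeof(pa) == typeof(pb)  # true
```

Since dispatch and inference work on typeof(pa), the compiler cannot tell these two requests apart unless the call gets inlined and constant-propagated.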
Point 3) allows something like rand(1:30, Set, 10) to produce a Set of length 10 with values from 1:30. The idea is that rand([rng], [S], Cont, etc...) should always be equivalent to rand([rng], make(Cont, [S], etc...)). This design goes somewhat against the trend in Base to create containers using their constructors, which, by the way, may be achieved via the Rand iterator from point 4). Still, I like the terse approach here, as it simply generalizes the current array-creating rand API to other containers. See the issue linked above for a discussion on these topics.
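As a sketch of the equivalence stated above, and of the constructor-based alternative, assuming RandomExtensions is installed (the make(Set, 1:30, 10) call is derived from the stated rule, not copied from an example in this README):

```julia
using RandomExtensions

# Container syntax: the container type follows the optional source of values.
s1 = rand(1:30, Set, 10)

# The same request spelled with make, following the rule
# rand([S], Cont, etc...) == rand(make(Cont, [S], etc...)).
s2 = rand(make(Set, 1:30, 10))

# Constructor-based alternative via the Rand iterator: duplicates are merged
# by Set, so this one may end up with fewer than 10 elements.
s3 = Set(Iterators.take(Rand(RandomDevice(), 1:30), 10))
```

The first two forms keep drawing until the set holds 10 distinct values, whereas the constructor form draws exactly 10 values.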
For convenience, the following names from Random are re-exported in this package: rand!, AbstractRNG, MersenneTwister, RandomDevice (rand is in Base). Functions like randn! or randstring are considered to be obsoleted by this package, so they are not re-exported. It is still necessary to import Random separately in order to use functions which don't extend the rand API, namely randsubseq, shuffle, randperm, randcycle, and their mutating variants.
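In practice, a session might start like this. This is a minimal sketch assuming RandomExtensions is installed; the seed and variable names are arbitrary.

```julia
using RandomExtensions           # rand!, AbstractRNG, MersenneTwister, RandomDevice come along
using Random: shuffle, randperm  # not re-exported, so imported explicitly

rng = MersenneTwister(42)

# rand! is available without mentioning Random.
v = rand!(rng, zeros(3))

# Functions outside the rand API come from Random itself.
p = randperm(rng, 5)
q = shuffle(rng, collect(1:5))
```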
There is not much documentation for now: rand's docstring is updated, and here are some examples:
julia> rand(CloseOpen(Float64)) # equivalent to rand(Float64)
0.7678877639669386
julia> rand(CloseClose(1.0f0, 10)) # generation in [1.0f0, 10.0f0]
6.62467f0
julia> rand(OpenOpen(2.0^52, 2.0^52+1)) == 2.0^52 # exactness not guaranteed for "unreasonable" values!
true
julia> rand(Normal(0.0, 10.0)) # explicit μ and σ parameters
-8.473790458128912
julia> rand(Uniform(1:3)) # equivalent to rand(1:3)
2
julia> rand(make(Pair, 1:10, Normal())) # random Pair, where both members have distinct distributions
5 => 0.674375
julia> rand(make(Pair{Number,Any}, 1:10, Normal())) # specify the Pair type
Pair{Number, Any}(1, -0.131617)
julia> rand(Pair{Float64,Int}) # equivalent to rand(make(Pair, Float64, Int))
0.321676 => -4583276276690463733
julia> rand(make(Tuple, 1:10, UInt8, OpenClose()))
(9, 0x6b, 0.34900083923775505)
julia> rand(Tuple{Float64,Int}) # equivalent to rand(make(Tuple, Float64, Int))
(0.9830769470405203, -6048436354564488035)
julia> rand(make(NTuple{3}, 1:10)) # produces a 3-tuple with values from 1:10
(5, 9, 6)
julia> rand(make(NTuple{N,UInt8} where N, 1:3, 5))
(0x02, 0x03, 0x02, 0x03, 0x02)
julia> rand(make(NTuple{3}, make(Pair, 1:9, Bool))) # make calls can be nested
(2 => false, 8 => true, 7 => false)
julia> rand(make(Complex, Normal())) # each coordinate is drawn from the normal distribution
1.5112317924121632 + 0.723463453534426im
julia> rand(make(Complex, Normal(), 1:10)) # distinct distributions
1.096731587266045 + 8.0im
julia> rand(Normal(ComplexF64)) # equivalent to randn(ComplexF64)
0.9322376894079347 + 0.2812214248483498im
julia> rand(Set, 3)
Set{Float64} with 3 elements:
0.0675168818514279
0.31058418699493895
0.15029104540378424
julia> rand!(ans, Exponential())
Set{Float64} with 3 elements:
1.082312697650858
1.2984094155972015
0.016146678329819485
julia> rand(1:9, Set, 3) # if you try `rand(1:3, Set, 9)`, it will take a while ;-)
Set{Int64} with 3 elements:
4
7
1
julia> rand(Dict{String,Int8}, 2)
Dict{String, Int8} with 2 entries:
"vxybIbae" => 42
"bO2fTwuq" => -13
julia> rand(make(Pair, 1:9, Normal()), Dict, 3)
Dict{Int64, Float64} with 3 entries:
9 => 0.916406
3 => -2.44958
8 => -0.703348
julia> using SparseArrays
julia> rand(SparseVector, 0.3, 9) # equivalent to sprand(9, 0.3)
9-element SparseVector{Float64, Int64} with 3 stored entries:
[1] = 0.173858
[6] = 0.568631
[8] = 0.297207
julia> rand(Normal(), SparseMatrixCSC, 0.3, 2, 3) # equivalent to sprandn(2, 3, 0.3)
2×3 SparseMatrixCSC{Float64, Int64} with 2 stored entries:
⋅ -1.5617 ⋅
0.572305 ⋅ ⋅
# like Array, sparse arrays are special-cased: `SparseVector` or `SparseMatrixCSC`
# can be omitted in the `rand` call (not in the `make` call):
julia> rand(make(SparseVector, 1:9, 0.3, 2), 0.1, 4, 3) # possible, but ugly output when non-empty :-/
4×3 SparseMatrixCSC{SparseVector{Int64,Int64},Int64} with 0 stored entries
julia> rand(String, 4) # equivalent to randstring(4)
"5o75"
julia> rand("123", String, 4) # like above, String creation with the "container" syntax ...
"2131"
julia> rand(make(String, 3, "123")) # ... which is as always equivalent to a call to make
"211"
julia> rand(String, Set, 3) # String considered as a scalar
Set{String} with 3 elements:
"jDbjXu9b"
"0Lo75VKo"
"webpNhfY"
julia> rand(BitArray, 3) # equivalent to, but unfortunately more verbose than, bitrand(3)
3-element BitVector:
1
1
0
julia> rand(Bernoulli(0.2), BitVector, 10) # using the Bernoulli distribution
10-element BitVector:
0
1
0
1
0
0
0
0
0
1
julia> rand(1:3, NTuple{3}) # NTuple{3} considered as a container, equivalent to rand(make(NTuple{3}, 1:3))
(3, 3, 1)
julia> rand(1:3, Tuple{Int,UInt8, BigFloat}) # works also with more general tuple types ...
(3, 0x02, 2.0)
julia> rand(1:3, NamedTuple{(:a, :b)}) # ... and with named tuples
(a = 3, b = 2)
julia> RandomExtensions.random_staticarrays() # poor man's conditional modules!
# ugly warning
julia> rand(make(MVector{2,AbstractString}, String), SMatrix{3, 2})
3×2 SArray{Tuple{3,2},MArray{Tuple{2},AbstractString,1,2},2,6} with indices SOneTo(3)×SOneTo(2):
["SzPKXHFk", "1eFXaUiM"] ["RJnHwhb7", "jqfLcY8a"]
["FMTKcBY8", "eoYtNntD"] ["FzdD530L", "ux6sWGMU"]
["fFJuUtJQ", "H2mAQrIV"] ["pt0OYFJw", "O0fCfjjR"]
julia> Set(Iterators.take(Rand(RandomDevice(), 1:10), 3)) # RNG defaults to Random.default_rng()
Set{Int64} with 2 elements: # note that the set can end up with fewer than 3 elements if `Rand` generates duplicates
5
9
julia> collect(Iterators.take(Uniform(1:10), 3)) # distributions can be iterated over, using Random.default_rng() implicitly
3-element Vector{Int64}:
9
6
8
julia> rand(Complex => Int) # equivalent to rand(make(Complex, Int)) (experimental)
4610038282330316390 + 4899086469899572461im
julia> rand(Pair => (String, Int8)) # equivalent to rand(make(Pair, String, Int8)) (experimental)
"ODNXIePK" => 4
In some cases, the Rand iterator can provide efficiency gains compared to repeated calls to rand, as it uses the same mechanism as array generation. For example, given a = zeros(1000) and s = BitSet(1:1000), a .+ Rand(s).() is three times faster than a .+ rand.(Ref(s)).
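The two expressions compared above can be written out as follows. This is a sketch assuming RandomExtensions is installed; it only shows that both forms compute the same kind of result, and leaves timing to a benchmarking tool of your choice.

```julia
using RandomExtensions

a = zeros(1000)
s = BitSet(1:1000)

# Rand(s) can be called with no argument to draw one value; inside a fused
# broadcast, the dotted zero-argument call is evaluated once per element of a.
x = a .+ Rand(s).()

# Plain rand: Ref(s) protects s from being broadcast over, and each element
# triggers an independent call to rand(s).
y = a .+ rand.(Ref(s))
```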
Note: as seen in the examples above, String can be considered as a scalar or as a container (in the rand API). In a call like rand(String), both APIs coincide, but in rand(String, 3), should we construct a String of length 3 (container API), or an array of strings of default length 8? Currently, the package chooses the first interpretation, partly because it was the first implemented, and also because it may actually be the most useful one (and offers the tersest API to compete with randstring).
But as this package is still unstable, this choice may be revisited in the future.
Note that it's easy to get the result of the second interpretation via either rand(make(String), 3), rand(String, (3,)) or rand(String, Vector, 3).
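The two interpretations side by side, as a sketch that assumes RandomExtensions is installed and reflects the current behavior described above, which may change:

```julia
using RandomExtensions

# Container interpretation: one String of length 3.
c = rand(String, 3)

# Scalar interpretation: a vector of 3 strings of default length,
# obtained with any of the three spellings listed above.
v1 = rand(make(String), 3)
v2 = rand(String, (3,))
v3 = rand(String, Vector, 3)
```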
How to extend: the make function is meant to be extensible, and there are some helper functions which make it easy, but this is still experimental. By default, make(T, args...) will create a Make{maketype(T, args...)} object, say m, which contains args... as fields. For type-stable code, the rand machinery likes to know the exact type of the object which will be generated by rand(m), and maketype(T, args...) is supposed to return that type. For example, maketype(Pair, 1:3, UInt) == Pair{Int,UInt}.
Then just define rand for m as documented in the Random module, e.g.
rand(rng::AbstractRNG, sp::SamplerTrivial{<:Make{P}}) where {P<:Pair} = P(rand(rng, sp[][1]), rand(rng, sp[][2])).
For convenience, maketype(T, ...) defaults to T, which means that for simple cases, only the rand function has to be defined. But in cases like Pair above, if maketype is not defined, the generated type will be assumed to be Pair, which is not a concrete type (and hence suboptimal).
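For readers unfamiliar with the hook mentioned above, here is the mechanism from the Random standard library in isolation. It does not use this package: Die is a hypothetical type, in the spirit of the example in the Julia manual, and the eltype definition plays a role loosely analogous to maketype by announcing the generated type.

```julia
using Random

# Hypothetical example type: a die with a given number of sides.
struct Die
    nsides::Int
end

# Announce the type of generated values, so that arrays get a concrete eltype.
Base.eltype(::Type{Die}) = Int

# The hook documented in Random: sp[] recovers the wrapped Die, and the
# provided rng is forwarded to the inner call.
Random.rand(rng::AbstractRNG, sp::Random.SamplerTrivial{Die}) =
    rand(rng, 1:sp[].nsides)

roll  = rand(Die(6))                       # a single value in 1:6
rolls = rand(MersenneTwister(0), Die(6), 3) # array generation comes for free
```

Defining the one method is enough for scalar and array generation, with or without an explicit RNG.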
This package started out of frustration with the limitations of the Random module. Besides generating simple scalars and arrays, very little is supported out of the box. For example, generating a random Dict is too complex. Moreover, there are too many functions for my taste: rand, randn, randexp, sprand (with its exotic rfn parameter), sprandn, sprandexp, randstring, bitrand, and mutating counterparts (but I believe randn will never go away, as it's so terse). I hope that this package can serve as a starting point towards improving Random.