Optimise Rng::gen on arrays? #269

Closed
dhardy opened this issue Feb 23, 2018 · 11 comments
Labels: C-optimisation, P-low (Priority: Low), P-postpone (Waiting on something else)

Comments

@dhardy
Member

dhardy commented Feb 23, 2018

I noticed that Rng::gen and Rng::fill don't do the same thing on arrays; they produce different values, and their performance differs, especially when the element size is small:

  • rng.gen() essentially does [rng.gen(), rng.gen(), ..., rng.gen()]
  • rng.fill(&mut buf) casts buf to &mut [u8] then uses RngCore::fill_bytes — more efficient, and only implemented for integer arrays/slices

We likely want to keep the current behaviour of rng.gen() when the element type is f32, f64 or a complex type, but for integer types, can we optimise it to use fill_bytes?
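
For illustration, a minimal sketch of the two call patterns (assuming a rand 0.8-era API; thread_rng and the array sizes are just for the example):

use rand::{thread_rng, Rng};

fn main() {
    let mut rng = thread_rng();

    // Element-wise path: equivalent to [rng.gen(), rng.gen(), ...]
    let a: [u32; 8] = rng.gen();

    // Byte-filling path: views the buffer as &mut [u8] and uses fill_bytes
    let mut b = [0u32; 8];
    rng.fill(&mut b[..]);

    // Both are uniformly distributed, but from the same RNG state the two
    // paths generally yield different values.
    println!("{:?}\n{:?}", a, b);
}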

@clarfonthey

Perhaps fill and try_fill should only work on byte slices and other arrays can just use distributions?

@pitdicker
Contributor

> Perhaps fill and try_fill should only work on byte slices and other arrays can just use distributions?

That is how they work right now 😄.

This issue asks whether we can/should optimise the rng.gen() implementation for integer arrays to also use fill, as if the array were a slice.

@dhardy
Member Author

dhardy commented Mar 8, 2018

Actually fill also works on integer slices/arrays; it's fill_bytes which is restricted to byte slices.

The point is that you can write let x: [u32; 32] = rng.gen(); in current rand, but it would be much faster to write the following (which produces a different result, though one that is statistically equivalent):

let mut x = [0u32; 32];
rng.fill(&mut x);

@dhardy
Member Author

dhardy commented Jun 2, 2018

Well, the current gen() (aka Standard) implementation for arrays is actually very fast:

test gen_1k_fill                     ... bench:         377 ns/iter (+/- 16) = 2716 MB/s
test gen_1k_gen_array                ... bench:         261 ns/iter (+/- 5) = 3923 MB/s
test gen_1k_gen_iter                 ... bench:         633 ns/iter (+/- 20) = 1617 MB/s
test gen_1k_iter_repeat              ... bench:         383 ns/iter (+/- 11) = 2673 MB/s
test gen_1k_sample_iter              ... bench:         390 ns/iter (+/- 8) = 2625 MB/s

... for u64 elements, at least.

I found a way to make gen() use fill on arrays:

// Specialized blanket implementation:
impl<T> Distribution<T> for Standard where T: AsByteSliceMut + Default {
    #[inline]
    fn sample<R: Rng + ?Sized>(&self, rng: &mut R) -> T {
        let mut result: T = Default::default();
        rng.fill(&mut result);
        result
    }
}

The problems are (1) that it conflicts with other implementations (even using #![feature(specialization)] on nightly, all versions I've tried appear to conflict), and (2) that for u64 at least this version actually hurts performance.

If we can sort out the specialisation (when Rust supports it on stable, or alternatively negative trait bounds), and we restrict to small types (u8, i8, probably also u16 and i16), then this may work:

// within the array_impl macro:
        // Specialized implementation for byte arrays:
        impl Distribution<[u8; $n]> for Standard where [u8; $n]: AsByteSliceMut + Default {
            #[inline]
            fn sample<R: Rng + ?Sized>(&self, rng: &mut R) -> [u8; $n] {
                let mut result: [u8; $n] = Default::default();
                rng.fill(&mut result);
                result
            }
        }

For now, I believe the best option is to do nothing. (We could implement the above instead of the generic Distribution<[T; $n]> support, but that restricts users: any array type not specifically implemented would cease to be supported, e.g. [[u64; 32]; 4] currently works.)

@dhardy dhardy added the P-postpone (Waiting on something else) label Jun 2, 2018
@dhardy
Member Author

dhardy commented Jun 3, 2019

I propose we close this with documentation only (provided in #812): although using specialisation as above might be possible in the future, value-stability is important, and this specialisation would yield different values for many types (with the actual behaviour depending on the RNG).

It appears that specialization will not be available in the near future (see the specialization tracking issue). Would we want to make a value-breaking change when it becomes available?

Also, any such implementation would have to make a fixed choice about which types to use the fill "optimisation" for, while the fastest method may depend on RNG and platform.

@vks
Collaborator

vks commented Jun 3, 2019

This might be a better option:

diff --git a/src/distributions/other.rs b/src/distributions/other.rs
index 2295f79..883b332 100644
--- a/src/distributions/other.rs
+++ b/src/distributions/other.rs
@@ -137,10 +137,13 @@ macro_rules! array_impl {
     {$n:expr, $t:ident, $($ts:ident,)*} => {
         array_impl!{($n - 1), $($ts,)*}
 
-        impl<T> Distribution<[T; $n]> for Standard where Standard: Distribution<T> {
+        impl<T> Distribution<[T; $n]> for Standard
+        where Standard: Distribution<T>, [T]: crate::AsByteSliceMut, T: Default {
             #[inline]
-            fn sample<R: Rng + ?Sized>(&self, _rng: &mut R) -> [T; $n] {
-                [_rng.gen::<$t>(), $(_rng.gen::<$ts>()),*]
+            fn sample<R: Rng + ?Sized>(&self, rng: &mut R) -> [T; $n] {
+                let mut a = Default::default();
+                rng.fill(&mut a);
+                a
             }
         }
     };

The Distribution<T> requirement is not needed, but it avoids implementing the trait for potentially unwanted types. This does not work for [[u64; 32]; 4], but arguably that is not very important and could be fixed by implementing AsByteSliceMut for nested arrays.
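
As a rough sketch of what a nested-array impl might look like (assuming the rand 0.7-era trait shape of as_byte_slice_mut plus to_le; a local stand-in trait is defined here because the real impl would have to live inside rand):

use core::slice;

// Local stand-in with the same shape as rand's AsByteSliceMut.
pub trait AsByteSliceMut {
    fn as_byte_slice_mut(&mut self) -> &mut [u8];
    fn to_le(&mut self);
}

// Hypothetical impl for the nested-array example above: view the whole
// contiguous [[u64; 32]; 4] block as a single byte buffer.
impl AsByteSliceMut for [[u64; 32]; 4] {
    fn as_byte_slice_mut(&mut self) -> &mut [u8] {
        let len = 4 * 32 * core::mem::size_of::<u64>();
        // Safety: the nested array is one contiguous, initialised allocation.
        unsafe { slice::from_raw_parts_mut(self.as_mut_ptr() as *mut u8, len) }
    }

    fn to_le(&mut self) {
        for row in self.iter_mut() {
            for x in row.iter_mut() {
                *x = x.to_le();
            }
        }
    }
}

A real version would presumably be generated by a macro (or const generics) over element types and shapes rather than written per type.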

Of course, this is still a regression for types that don't implement AsByteSliceMut.

@dhardy
Member Author

dhardy commented Jun 3, 2019

This doesn't work for many types, e.g. [f32; 3]. Is byte-buffer gen performance more important than being able to generate arrays of other types?

@vks
Collaborator

vks commented Jun 3, 2019

Fair enough. However, it would be nice if fill could support floats. I think it is fine to close this issue, following your argument in #269 (comment).

@dhardy
Member Author

dhardy commented Jun 3, 2019

Perhaps we got the design of AsByteSliceMut wrong, and instead we should have:

pub trait Fill {
    fn try_fill<R: Rng + ?Sized>(&mut self, rng: &mut R) -> Result<(), Error>;
}

With that, we could implement support for float slices; we could also support gen for any type implementing Fill + Default (so long as we avoided overlapping implementations), which would allow this optimisation and make arr = rng.gen() and rng.fill(&mut arr) value-equivalent on arrays.
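
A rough, self-contained sketch of that direction (the Fill trait is as proposed above; gen_via_fill is a hypothetical stand-in for a Distribution impl on Standard, which would need the overlap problem solved first):

use rand::{thread_rng, Error, Rng};

pub trait Fill {
    fn try_fill<R: Rng + ?Sized>(&mut self, rng: &mut R) -> Result<(), Error>;
}

impl Fill for [f32] {
    fn try_fill<R: Rng + ?Sized>(&mut self, rng: &mut R) -> Result<(), Error> {
        for x in self.iter_mut() {
            *x = rng.gen(); // each element sampled as gen() would sample it
        }
        Ok(())
    }
}

impl<const N: usize> Fill for [f32; N] {
    fn try_fill<R: Rng + ?Sized>(&mut self, rng: &mut R) -> Result<(), Error> {
        let s: &mut [f32] = self;
        s.try_fill(rng)
    }
}

// Hypothetical stand-in for `impl Distribution<T> for Standard where T: Fill + Default`.
fn gen_via_fill<T: Fill + Default, R: Rng + ?Sized>(rng: &mut R) -> T {
    let mut value = T::default();
    value.try_fill(rng).expect("RNG failure");
    value
}

fn main() {
    let mut rng = thread_rng();
    let v: [f32; 3] = gen_via_fill(&mut rng);
    println!("{:?}", v);
}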

Possibly this was not done because fill_bytes and try_fill_bytes are not guaranteed to be semantically equivalent, so a correct implementation would require both (at least for everything using byte filling internally). Now that we have a simpler Error type, it is hard to see try_fill_bytes(buf).unwrap() behaving differently from fill_bytes(buf), although technically it is still possible.
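
For context, a sketch of one way an RngCore implementation can keep the two byte-filling methods in lock-step (the wrapper type here is purely illustrative):

use rand::{Error, RngCore};

// Forward the infallible method to the fallible one, so fill_bytes(buf)
// and try_fill_bytes(buf).unwrap() cannot silently diverge.
struct ForwardingRng<R: RngCore>(R);

impl<R: RngCore> RngCore for ForwardingRng<R> {
    fn next_u32(&mut self) -> u32 {
        self.0.next_u32()
    }
    fn next_u64(&mut self) -> u64 {
        self.0.next_u64()
    }
    fn fill_bytes(&mut self, dest: &mut [u8]) {
        self.try_fill_bytes(dest).expect("RNG failure")
    }
    fn try_fill_bytes(&mut self, dest: &mut [u8]) -> Result<(), Error> {
        self.0.try_fill_bytes(dest)
    }
}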

@vks
Collaborator

vks commented Jun 3, 2019

I agree that Fill seems like the better abstraction. Maybe it's better to do that for 0.8 though.

I would not worry too much about fill_bytes vs. try_fill_bytes; we have similar problems with next_u32 and next_u64.

@vks
Collaborator

vks commented Sep 15, 2021

Fill is now implemented and documented, including support for floats.
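
For completeness, usage with a rand release from around the time of this comment (0.8.x), assuming the documented Fill impls cover integer and float slices as stated above:

use rand::{thread_rng, Rng};

fn main() {
    let mut rng = thread_rng();

    // Integer slices, as before:
    let mut ints = [0u64; 4];
    rng.fill(&mut ints[..]);

    // Float slices, via the Fill support mentioned above:
    let mut floats = [0.0f32; 4];
    rng.fill(&mut floats[..]);

    println!("{:?}\n{:?}", ints, floats);
}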
