random number generators not identical across accelerators #390

psychocoderHPC opened this issue Aug 30, 2017 · 7 comments

@psychocoderHPC

Currently, the method used to generate random numbers for an accelerator is hard-coded within alpaka.
To provide different generators per accelerator depending on the user's needs, we should think about an interface change.

Why we need different generators:

  • each generator provides a different quality of random numbers
  • the size of the stored random number state differs and influences memory usage and performance

For example, PIConGPU offered several methods until my pull request switched it to the native alpaka generator, which removed the user's ability to control the quality of the RNG.
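
For illustration, a minimal sketch of the current state inside a kernel; the createDefault signature (accelerator, seed, subsequence) is assumed from the alpaka 0.x rand API and the numeric arguments are placeholders:

// The only entry point today: the concrete generator is whatever the backend
// picks internally and cannot be selected by the user.
auto gen = alpaka::rand::generator::createDefault(
    acc,    // accelerator object available inside the kernel
    12345u, // seed
    6789u   // subsequence
);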

@BenjaminW3

BenjaminW3 commented Aug 31, 2017

In alpaka the generators are already separated from the distributions.
So in theory it would be possible to use different generators. However, there is some work to do:

    1. The CUDA backend only provides the Xor generator. XorMin, MRG32k3a and MRG32k3aMin could easily be added to RandCuRand.hpp. Please create corresponding pull requests.
    2. The CPU backends only provide a Mersenne Twister generator (not even as a standalone generator, but only directly within alpaka::rand::generator::createDefault). This generator should be made its own class; a sketch of what that could look like follows at the end of this comment. There are some other generators provided by the C++ standard library which could be added to RandStl.hpp.
    3. I have no idea how to implement Xor, XorMin, ... generators similar to the CUDA ones for the CPU backends.
    4. Due to those differences between the available generators for CUDA and CPU, there is no way to use the same generator on all backends. alpaka::rand::generator::createDefault simply uses an unspecified generator.
    5. There are no unit tests for the random functions.

Points 1, 2 and 5 can be solved, but point 4 depends on point 3, which may be very hard.

Edit: point 5 has been solved.
Edit: parts of point 2 have been solved.
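
As mentioned in point 2, here is a minimal sketch of what factoring the CPU Mersenne Twister out of createDefault into a standalone class for RandStl.hpp could look like; the class name, constructor arguments and call interface are assumptions, not existing alpaka API:

#include <cstdint>
#include <random>

// Hypothetical standalone CPU generator: the std::mt19937 currently hidden inside
// alpaka::rand::generator::createDefault, factored out into its own class.
class MersenneTwisterStl
{
public:
    MersenneTwisterStl(std::uint32_t const seed, std::uint32_t const subsequence)
        : m_engine(seed)
    {
        // Crude subsequence handling for the sketch; a real implementation would
        // need a proper sub-stream / skip-ahead scheme.
        m_engine.discard(static_cast<unsigned long long>(subsequence) * 1000000ull);
    }

    auto operator()() -> std::uint32_t
    {
        return static_cast<std::uint32_t>(m_engine());
    }

private:
    std::mt19937 m_engine;
};

// Other engines from <random>, e.g. std::minstd_rand or std::ranlux24, could be
// wrapped the same way and added to RandStl.hpp.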

@psychocoderHPC

Thanks for the summary. Point 3 is one reason why I opened this issue. I think it is either not possible, or too hard to maintain, to have all generators on all platforms.
I will also use this issue to think about solutions for how we can handle that each platform may ship different algorithms while nevertheless giving the user the opportunity to write code without #if blocks for the differences between the platforms.

One idea is to create something like a factory where the user sets properties like quality, performance and memory usage and gets back the type of the best-fitting generator. If a platform has implemented only one algorithm, then the same generator is always returned.
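
A minimal compile-time sketch of that factory idea; all names here (RngProperty, BestFitGenerator, the placeholder generator types) are made up for illustration and are not existing alpaka API:

// Placeholder generator types standing in for the real backend implementations.
template<typename TAcc> struct DefaultGenerator {};
template<typename TAcc> struct XorMin {};

enum class RngProperty
{
    Default,
    HighQuality, // e.g. an MRG32k3a-like generator
    LowMemory    // e.g. an XorMin-like generator with a small per-thread state
};

// Primary template: any property the backend cannot serve maps to its default
// generator, so a backend with only one algorithm always returns the same type.
template<typename TAcc, RngProperty TProp>
struct BestFitGenerator
{
    using type = DefaultGenerator<TAcc>;
};

// A backend that does provide a low-memory generator adds a specialization.
template<typename TAcc>
struct BestFitGenerator<TAcc, RngProperty::LowMemory>
{
    using type = XorMin<TAcc>;
};

// Convenience alias: user code never needs an #if on the backend.
template<typename TAcc, RngProperty TProp>
using BestFitGen = typename BestFitGenerator<TAcc, TProp>::type;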

@BenjaminW3

Such a factory might be the only viable option. It might be hard to find the correct properties to describe the generators.
I will work on point 5 and write some unit tests for the existing generators/distributions because I am already adding some stream and event unit tests at the moment.

@BenjaminW3 changed the title from "different random number generators for a accelerator" to "random number generators not identical across accelerators" on Nov 3, 2017
@ax3l

ax3l commented Nov 28, 2017

Admittedly, a typical PIConGPU 0.4.0-dev simulation on a Tesla P100 currently uses (wastes) 18-25% of its main memory (3 out of 12/16 GByte) just for the RNG state. Can we do anything to allow backend-specific RNGs like the one we had before ComputationalRadiationPhysics/picongpu#2226 again (~50% memory footprint)? It would be totally fine if that RNG were only usable on a specific backend (e.g. via a less-specific wrapper/factory as above) and another implementation (and API) were used on other backends.

@ax3l

ax3l commented Nov 28, 2017

Cross-linking ComputationalRadiationPhysics/picongpu#2410, where @psychocoderHPC is, in parallel, working the XorMin generator (6×int32 state per thread; 50% of the current RNG's memory footprint) back into PIConGPU as a workaround.

@ax3l

ax3l commented Aug 6, 2018

Proposed implementation:

enum class Generator
{
    Default,
    MersenneTwister
    // , ...
};

// ...
auto genMersenneTwister = alpaka::rand::generator::create<
    alpaka::rand::Generator::MersenneTwister
>(
    acc,
    12345u,
    6789u
);
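
A possible follow-up usage, assuming the (already generator-agnostic) distribution interface stays as it is; the createUniformReal call is an assumption about the current alpaka::rand::distribution API:

// Inside a kernel: the distribution is created independently of the chosen
// generator, so the same user code works for any Generator enum value.
auto dist = alpaka::rand::distribution::createUniformReal<float>(acc);
float const r = dist(genMersenneTwister);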

@j-stephan

We discussed this in today's meeting. @sliwowitz is currently working on a separate RNG library on top of alpaka that will address this issue. This is therefore WONTFIX and will be closed once the new RNG library is public.
