rocrand_set_offset unexpected behaviour #234

Closed
sbalint98 opened this issue Dec 11, 2021 · 5 comments · Fixed by #239

@sbalint98
Contributor

Describe the bug
rocrand_set_offset seems to change the seed of the generator. With the same seed, the numbers produced by a generator with an offset are completely different from those produced without one.

To Reproduce
Steps to reproduce the behavior:

  1. Install rocrand-dev4.5.2 and hip-dev4.5.2 from the package repositories

  2. Compile the attached reproducer (reprod.txt) with the following line:
    /opt/rocm-4.5.2/hip/bin/hipcc -I/opt/rocm-4.5.2/rocrand/include/ -L/opt/rocm-4.5.2/rocrand/lib/ -lrocrand test.cpp

  3. See that the output shows no offset; instead, the two lists of random numbers seem to be completely independent:

0.0225561:0.750519
0.129137:0.628226
0.805372:0.446292
0.974561:0.55909
0.109374:0.951059
0.471769:0.231839
0.920535:0.930325
0.731697:0.620792
0.33033:0.228607
0.2921:0.888079

Expected behavior
Based on the description here, I would expect that the second list of random numbers generated is the same as the original one, but with a certain offset. For example in this case I would expect something like:

0.0225561:0.00121
0.129137:0.0225561
0.805372:0.129137
0.974561:0.805372
0.109374:0.974561
0.471769:0.109374
0.920535:0.471769
0.731697:0.920535
0.33033:0.731697
0.2921:0.33033

Log-files
out.txt

Environment
environment.txt

Additional context
This issue was encountered while working on the rocrand backend for oneMKL.

@neon60
Collaborator

neon60 commented Dec 13, 2021

@Maetveis Please check the issue and try to reproduce.

@Maetveis
Contributor

Maetveis commented Dec 13, 2021

The issue here is that the random number generation does not work the way you think it does.
Let's call our random number generator $g$. Since $g$ has internal state, let's track the number of prior calls to $g$ with a superscript. There can be multiple generators, so let's mark them with a subscript.
With this notation, for example, $g_3^2$ is the third generator with two prior applications.

What you are expecting is that the random numbers are generated like this:
$g_1^0, g_1^1, g_1^2, \dots$
I.e. the same generator (with the same seed) is used with sequential offsets.
But they are generated like this:
$g_1^0, g_2^0, g_3^0, \dots$
I.e. each output is generated by an independent generator. The next random numbers would be
$g_1^1, g_2^1, g_3^1, \dots$

With offset = n, each generator is set to the same state as if it had been used n times. So, sticking with the same notation:
$g_1^n, g_2^n, g_3^n, \dots$

Your reproducer would pass with this code:

  ROCRAND_CALL(rocrand_create_generator(&gen, ROCRAND_RNG_PSEUDO_MRG32K3A));
  ROCRAND_CALL(rocrand_create_generator(&gen2, ROCRAND_RNG_PSEUDO_MRG32K3A));
  ROCRAND_CALL(rocrand_set_seed(gen, 1));
  ROCRAND_CALL(rocrand_set_seed(gen2, 1));
  ROCRAND_CALL(rocrand_set_offset(gen2, 1));

  // Two applications of gen, overwrite deviceA with second result
  ROCRAND_CALL(rocrand_generate(gen, deviceA, NUM));
  ROCRAND_CALL(rocrand_generate(gen, deviceA, NUM));
  
  ROCRAND_CALL(rocrand_generate(gen2, deviceB, NUM));
  // deviceA and deviceB now have the same values
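
The per-element behaviour described above can be modelled on the host without rocRAND. The following is only a sketch under stated assumptions: `toy_output`, `PerElementModel` and `offset_matches_second_call` are hypothetical names, and the mixing function is a simple bijective stand-in rather than MRG32k3a. It is meant to show why one call with offset 1 matches the second of two calls without an offset.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for "generator `stream` after `k` prior applications".
// This is a bijective 64-bit mix of (stream, k), NOT rocRAND's MRG32k3a.
inline std::uint64_t toy_output(std::uint64_t seed, std::uint32_t stream, std::uint32_t k)
{
    std::uint64_t z = ((static_cast<std::uint64_t>(stream) << 32) | k)
                      ^ (seed * 0x9E3779B97F4A7C15ULL);
    z = (z ^ (z >> 30)) * 0xBF58476D1CE4E5B9ULL;
    z = (z ^ (z >> 27)) * 0x94D049BB133111EBULL;
    return z ^ (z >> 31);
}

// Toy model of the described behaviour: element i of every call comes from
// its own generator, and each call advances all generators by one step.
struct PerElementModel
{
    std::uint64_t seed;
    std::uint32_t applications; // prior applications; the offset is the initial value

    std::vector<std::uint64_t> generate(std::size_t n)
    {
        std::vector<std::uint64_t> out(n);
        for (std::size_t i = 0; i < n; ++i)
            out[i] = toy_output(seed, static_cast<std::uint32_t>(i + 1), applications);
        ++applications;
        return out;
    }
};

// Mirrors the snippet above: two calls on `gen` versus one call on `gen2`
// that was created with offset 1.
inline bool offset_matches_second_call(std::size_t n)
{
    PerElementModel gen{1, 0};
    PerElementModel gen2{1, 1};
    gen.generate(n);                        // first application, result discarded
    const auto device_a = gen.generate(n);  // second application
    const auto device_b = gen2.generate(n); // single application with offset 1
    return device_a == device_b;
}
```

In this model `offset_matches_second_call(NUM)` holds for any `NUM`, because both sides read every generator after exactly one prior application.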

@sbalint98
Contributor Author

sbalint98 commented Dec 13, 2021

Thank you for your quick response and your thorough explanation!

I think the snippet that you have provided does not cover the use case that we are interested in. Our primary goal is to reproduce the behaviour of curand, which generates two lists with an offset.

For example, given two lists original_ran_list and ofsetted_ran_list, generated by generators with the same seed but with an offset of n applied to the generator of ofsetted_ran_list, the following property should hold: original_ran_list[i+n] == ofsetted_ran_list[i] (for the cuRAND behaviour, see this example:
curand_offset.txt)
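
The property above can be sketched with a standard C++ engine. This is an illustration only: `std::mt19937` is a stand-in and is not an engine used by cuRAND or rocRAND, and `draw_with_offset` and `offset_property_holds` are hypothetical helper names. `discard` advances the engine as if that many values had already been drawn, which is the semantics expected from the offset.

```cpp
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

// Draw n values from one sequential stream after skipping `offset` outputs.
// std::mt19937 is only a stand-in engine for illustration.
inline std::vector<std::uint32_t> draw_with_offset(std::uint32_t seed,
                                                   unsigned long long offset,
                                                   std::size_t n)
{
    std::mt19937 engine(seed);
    engine.discard(offset); // advance the state as if `offset` values had been drawn
    std::vector<std::uint32_t> out(n);
    for (auto& value : out)
        value = static_cast<std::uint32_t>(engine());
    return out;
}

// Checks original_ran_list[i + offset] == ofsetted_ran_list[i] for every i < count.
inline bool offset_property_holds(std::uint32_t seed, std::size_t offset, std::size_t count)
{
    const auto original = draw_with_offset(seed, 0, count + offset);
    const auto shifted  = draw_with_offset(seed, offset, count);
    for (std::size_t i = 0; i < count; ++i)
        if (original[i + offset] != shifted[i])
            return false;
    return true;
}
```

With a single sequential stream, `offset_property_holds(1, n, NUM)` is true for any offset n, which is the behaviour we would like to get from rocRAND.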

If I understand your explanation correctly, one way to achieve the same semantics with rocRAND is to call rocrand_generate NUM+1 times and then discard the first n elements. I believe this approach would be rather cumbersome to implement and very inefficient.

Could you suggest a better solution to achieve the semantics described above?

@Maetveis
Contributor

This is basically a continuity issue: the random numbers are not generated in a consistent order.
For example, if one were to generate 100 elements using two calls to the host API generating 50 elements each, and compare that with the same generator and the same seed generating 20 and then 80 elements, the results differ after the first 20 elements.

My explanation only described what rocRAND currently does; it won't have the same effect as setting the offset in cuRAND. The rocRAND behaviour is incorrect, and I don't think there's a reasonable workaround (as in, one using the host API) until it's fixed.
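
The continuity problem can be reproduced in a small host-side model. This is a simplified sketch, not rocRAND code: `toy_value`, `generate_in_chunks` and `first_mismatch` are hypothetical names, and the model assumes one generator per output element, each advanced by one step per call.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for "generator `stream` after `k` prior applications".
// A bijective mix of (stream, k), so distinct pairs always give distinct values.
inline std::uint64_t toy_value(std::uint32_t stream, std::uint32_t k)
{
    std::uint64_t z = ((static_cast<std::uint64_t>(stream) << 32) | k)
                      ^ 0x9E3779B97F4A7C15ULL;
    z = (z ^ (z >> 30)) * 0xBF58476D1CE4E5B9ULL;
    z = (z ^ (z >> 27)) * 0x94D049BB133111EBULL;
    return z ^ (z >> 31);
}

// Simulates consecutive host API calls, one per chunk size. In each call,
// element i comes from generator i + 1, which has been applied once per prior call.
inline std::vector<std::uint64_t> generate_in_chunks(const std::vector<std::size_t>& chunks)
{
    std::vector<std::uint64_t> out;
    std::uint32_t prior_calls = 0;
    for (const std::size_t n : chunks)
    {
        for (std::size_t i = 0; i < n; ++i)
            out.push_back(toy_value(static_cast<std::uint32_t>(i + 1), prior_calls));
        ++prior_calls;
    }
    return out;
}

// Index of the first differing element, or the common length if there is none.
inline std::size_t first_mismatch(const std::vector<std::uint64_t>& a,
                                  const std::vector<std::uint64_t>& b)
{
    const std::size_t n = a.size() < b.size() ? a.size() : b.size();
    for (std::size_t i = 0; i < n; ++i)
        if (a[i] != b[i])
            return i;
    return n;
}
```

In this model, `first_mismatch(generate_in_chunks({50, 50}), generate_in_chunks({20, 80}))` is 20: the first 20 elements agree and everything after them depends on how the 100 elements were split across calls.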

@Maetveis
Contributor

Maetveis commented Jan 5, 2022

You can now try out the branch mrg32k3a_offset (#236); it should fix the mrg32k3a generator (used in your reproducer). Fixes for the other generators are in separate branches.

nolmoonen pushed a commit that referenced this issue Nov 14, 2022
Remove the Python2 test case from CI

Closes #234

See merge request amd/libraries/rocRAND!218