New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changing std::rand() to a better inbuilt PRNG generator. #3204
Conversation
@Anushi1998 Why did you close this? |
@hkaiser I have squashed all commits now.I will add more to it asap |
@Anushi1998 could you please not add more to this PR? It's already very big and adding to it makes it even more difficult to properly review things. Please create a new PR for additional changes. |
.gitignore
Outdated
@@ -12,3 +12,5 @@ | |||
/.dockerignore | |||
.ycm_extra_conf.py | |||
apex | |||
config/ | |||
runtime/parcelset/ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why have these directories been added to the .gitignore file?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This file has been accidentally committed.I will soon remove it
hpx/util/jenkins_hash.hpp
Outdated
@@ -98,8 +99,10 @@ namespace hpx { namespace util | |||
/// constructors and destructor | |||
jenkins_hash() : seed_(0) {} | |||
|
|||
std::mt19937 gen{(unsigned int)std::random_device{}()}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please don't use direct initializations of member variables for now. We have not decided to allow this in HPX yet. Also, I don't think you want to invoke std::random_device
for each instance of the jenkins_hash
type. Creating and using std::random_device
is a potentially very expensive operation.
@@ -791,9 +791,10 @@ void big_boot_barrier::wait_hosted( | |||
, unassigned | |||
, suggested_prefix); | |||
|
|||
std::srand(static_cast<unsigned>(util::high_resolution_clock::now())); | |||
std::mt19937 gen(std::random_device{}()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wouldn't the result returned from std::random_device
here be sufficient as a random value for the first parcel-id?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, will change it :)
@@ -109,7 +109,7 @@ void run_benchmark(std::size_t vector_size, int test_count, IteratorTag) | |||
|
|||
std::cout << "* Running Benchmark..." << std::endl; | |||
|
|||
int rand_base = std::rand(); | |||
int rand_base = std::random_device{}(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One call to std::random_device
should be sufficient. As said, its use is potentially very expensive.
@@ -185,7 +185,7 @@ int hpx_main(int argc, char **argv) | |||
HPX_TEST(equality); | |||
|
|||
// Do moduler operation for avoiding overflow in ramdom_fill. (#2954) | |||
int rand_base = std::rand() % 10000; | |||
int rand_base = std::random_device{}() % 10000; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't call std::random_device
more than once per application. Also, please use a distribution instead of modulo arithmetic for the generated numbers.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I then have to initialize seed
globally using std::random_device
.
@hkaiser Sure, I will create more PR than to add on these.But please note that there will be many PR since there are 210 files to be edited. |
Thanks!
No worries, there is a sufficient number of electrons available to accommodate for a couple more PRs... ;-) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks a lot!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In many places, std::random_device{}()
is used for constructing std::mt19937
.
std::random_device{}()
generates different number sequences. So, we can't reproduce the problem in test if std::mt19937
is constructed with std::random_device{}()
.
So, std::mt19937
must be constructed with explicit seed like 'seed' of program option.
And this PR cannot fix #2954 because UB still can be occured:
https://github.com/Anushi1998/hpx/blob/3a1e3d8be803740487487f1a35a29d96740855dc/tests/regressions/parallel/stable_merge_2964.cpp#L31
@@ -34,7 +34,7 @@ const int random_fill_range = (std::min)(100000, RAND_MAX); | |||
struct random_fill | |||
{ | |||
random_fill() | |||
: gen(std::rand()), | |||
: gen(std::random_device{}()), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is a problem that gen
generates different numbers even if same 'seed' is used in program option.
The gen
must be initialized with the 'seed' of program option, not with std::random_device{}()
.
And there are many same problems in other codes.
struct random_fill | ||
{ | ||
random_fill(std::size_t random_range) | ||
: gen(std::rand()), | ||
: gen(std::random_device{}()), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is a problem that gen
generates different numbers even if same 'seed' is used in program option.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, this file still uses std::srand() below... Thanks for noticing!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@hkaiser There is no use to srand
because there is no call to rand
but I will remove it :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@@ -25,10 +25,13 @@ | |||
#include <vector> | |||
|
|||
/////////////////////////////////////////////////////////////////////////////// | |||
|
|||
std::mt19937 _rand(std::random_device{}()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is a problem that gen
generates different numbers even if same 'seed' is used in program option.
@@ -25,10 +25,13 @@ | |||
#include <vector> | |||
|
|||
/////////////////////////////////////////////////////////////////////////////// | |||
|
|||
std::mt19937 _rand(std::random_device{}()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is a problem that gen
generates different numbers even if same 'seed' is used in program option.
@@ -28,10 +28,13 @@ | |||
#include "utils.hpp" | |||
|
|||
/////////////////////////////////////////////////////////////////////////////// | |||
|
|||
std::mt19937 _rand(std::random_device{}()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is a problem that gen
generates different numbers even if same 'seed' is used in program option.
@@ -27,10 +27,15 @@ | |||
#include "utils.hpp" | |||
|
|||
/////////////////////////////////////////////////////////////////////////////// | |||
unsigned int seed = (unsigned int)std::random_device{}(); | |||
std::mt19937 _rand(seed); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
_rand
must be initialized with 'seed' of program option, not with the output of std::random_device{}()
.
@@ -31,7 +31,7 @@ | |||
struct random_fill | |||
{ | |||
random_fill(std::size_t random_range) | |||
: gen(std::rand()), | |||
: gen(std::random_device{}()), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is a problem that gen
generates different numbers even if same 'seed' is used in program option.
@@ -31,7 +31,7 @@ | |||
struct random_fill | |||
{ | |||
random_fill(std::size_t random_range) | |||
: gen(std::rand()), | |||
: gen(std::random_device{}()), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is a problem that gen
generates different numbers even if same 'seed' is used in program option.
@@ -31,7 +31,7 @@ | |||
struct random_fill | |||
{ | |||
random_fill(std::size_t random_range) | |||
: gen(std::rand()), | |||
: gen(std::random_device{}()), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is a problem that gen
generates different numbers even if same 'seed' is used in program option.
@@ -29,7 +29,7 @@ | |||
struct random_fill | |||
{ | |||
random_fill(std::size_t random_range) | |||
: gen(std::rand()), | |||
: gen(std::random_device{}()), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is a problem that gen
generates different numbers even if same 'seed' is used in program option.
@taeguk Please can you elaborate more on
Meanwhile, I have tried to look at some better seeding option
As far as I think the problem of UB is due to integer overflow.Can't we avoid it by restricting the random values to a narrower range?Please correct me if I am wrong. |
@Anushi1998: the only thing left is to fix the issues reported by inspect: https://9684-4455628-gh.circle-artifacts.com/0/tmp/circle-artifacts.T3OlciG/hpx_inspect_report.html |
@Anushi1998 The unit or benchmark tests are performed with random numbers. And we must can control the random numbers, that is, random number generator must be deterministic. So, |
Yes you're right. The problem is due to integer overflow. And we can avoid it by the way you suggested. |
I think
But we need to discuss it with @hkaiser, he suggested to use |
We can also consider generating an exception whenever someone tries to call a large |
@taeguk |
I think |
@Anushi1998 Sorry, I may misused the word 'deterministic' and 'non-deterministic'. What I just wanted to say is that we must be able to initialize |
Please let me know if I missed something you expected.
I have not done that because I am waiting for @hkaiser review on this. |
@Anushi1998 we discussed this on irc, but for the records: I think it's fine to use |
I don't know what @taeguk expected, for myself, this looks good to go. |
Can you please have a look @taeguk now.Is it what you expected? |
Thanks, @taeguk for pointing that out.I hope it is now up to your expectation :) |
if (vm.count("seed")) | ||
seed = vm["seed"].as<unsigned int>(); | ||
|
||
std::cout << "using seed: " << seed << std::endl; | ||
std::srand(seed); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If std::srand
is removed seed
is completely unused, no? Either std::srand
should stay or seed
(or a generator initialized with seed
) should be passed one way or another down to the tests (which directly use std::rand
now). Ideally it would be the latter as the goal is to get rid of std::rand
.
What am I missing?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@msimberg I have not completed this file(see this still uses std::rand
).I have only covered these 30 files.As soon as this PR merges I will file another PR for next 30 files(including this).
Also, this file commit has happened accidentally because I have started my work for another 30 files but seeing some issues regarding std::random_device
I stopped and I have done hard reset to come back to the previous commit but my mistake this change still happened here.But I will soon fix this file in another PR as soon as this merges(I assure within a day or two).If this is not acceptable then I will fix it right away :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Anushi1998 Okay, no problem in that case! Just make sure this change doesn't get included in this PR. Thanks for clarifying.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure @msimberg ,thanks :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@msimberg I have reverted the changes.Thanks for figuring out :)
if (vm.count("seed")) | ||
seed = vm["seed"].as<unsigned int>(); | ||
|
||
std::cout << "using seed: " << seed << std::endl; | ||
std::srand(seed); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Anushi1998 Thanks for updating the file. Would you mind doing this one as well?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Before doing that, please would you clarify me what purpose srand
serves if there is no call to std::rand
.Have a look at this discussion too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, good point. But if the test doesn't use any random numbers the seed is also not needed. As far as I know we don't generally add the seed option if the test doesn't require random numbers, so I would remove it completely, but I haven't been around for too long.
@hkaiser Did you mean to use rand
somewhere in the test? If not, do you prefer to keep the seed option for consistency even if it's not used, or could it be removed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It has no purpose, please remove it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, I will do it :)
Added newline to end of the file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! Thank you :)
@@ -135,12 +139,9 @@ double run_is_heap_until_benchmark_par_unseq(int test_count, | |||
/////////////////////////////////////////////////////////////////////////////// | |||
int hpx_main(boost::program_options::variables_map& vm) | |||
{ | |||
unsigned int seed = (unsigned int)std::time(nullptr); | |||
if (vm.count("seed")) | |||
seed = vm["seed"].as<unsigned int>(); | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't _rand
be seeded here again if seed
is given on the command line? Same way you've updated benchmark_partition_copy.cpp
? The same goes for benchmark_inplace_merge.cpp
and benchmark_is_heap.cpp
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have updated the changes.Thanks for figuring out :)
Thanks @Anushi1998, looks good! |
Fixes #2954
Since
std::rand
is now deprecated in C++11, I have tried changing it to a better inbuilt random generator likestd::mt19937
andstd::random_device
.It requires changing 210 files and right now I have fixed only some 30 files to get some reviews and/or suggestions :)