Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

testGithub2245 in testPickers.cpp occasionally fails #2839

Closed
ptosco opened this issue Dec 10, 2019 · 0 comments
Closed

testGithub2245 in testPickers.cpp occasionally fails #2839

ptosco opened this issue Dec 10, 2019 · 0 comments
Assignees
Labels
Milestone

Comments

@ptosco
Copy link
Contributor

ptosco commented Dec 10, 2019

if you run the test 1000 times, e.g. by doing

( FOR /L %i IN (1,1,1000) DO ( ctest -I 159,159 -V ) ) > debug.log 2>&1

you will likely incur in a couple of exceptions.
Exceptions always occur in testGithub2245() on either line 45 or 52:

void testGithub2245() {
BOOST_LOG(rdErrorLog) << "-------------------------------------" << std::endl;
BOOST_LOG(rdErrorLog) << "Testing github issue 2245: MinMax Diversity picker "
"seeding shows deterministic / non-random behaviour."
<< std::endl;
{
RDPickers::MaxMinPicker pkr;
int poolSz = 1000;
auto picks1 = pkr.lazyPick(dist_on_line, poolSz, 10, RDKit::INT_VECT(), -1);
auto picks2 = pkr.lazyPick(dist_on_line, poolSz, 10, RDKit::INT_VECT(), -1);
TEST_ASSERT(picks1 != picks2);
}
{ // make sure the default is also random
RDPickers::MaxMinPicker pkr;
int poolSz = 1000;
auto picks1 = pkr.lazyPick(dist_on_line, poolSz, 10);
auto picks2 = pkr.lazyPick(dist_on_line, poolSz, 10);
TEST_ASSERT(picks1 != picks2);
}
{ // and we're still reproducible when we want to be
RDPickers::MaxMinPicker pkr;
int poolSz = 1000;
auto picks1 =
pkr.lazyPick(dist_on_line, poolSz, 10, RDKit::INT_VECT(), 0xf00d);
auto picks2 =
pkr.lazyPick(dist_on_line, poolSz, 10, RDKit::INT_VECT(), 0xf00d);
TEST_ASSERT(picks1 == picks2);
}
BOOST_LOG(rdErrorLog) << "Done" << std::endl;
}

Initially I thought it was due to poor quality random seeding (see http://www.pcg-random.org/posts/cpps-random_device.html), but replacing

    if (seed >= 0) {
      generator.seed(static_cast<rng_type::result_type>(seed));
    } else {
      generator.seed(std::random_device()());
    }

with

    if (seed >= 0) {
      generator.seed(static_cast<rng_type::result_type>(seed));
    } else {
      generator.seed(randutils::auto_seed_128{}.base());
    }

as recommended here: http://www.pcg-random.org/posts/simple-portable-cpp-seed-entropy.html didn't remove the occasional failures over 1000 runs.
So I realized that the occasional failures are not due to poor seeding but rather to the small sample, which makes getting two identical picks once in a while not so uncommon.
So I believe the best solution to the problem is to make the test more robust against occasional identical picks.
PR will follow soon.

@ptosco ptosco self-assigned this Dec 10, 2019
ptosco added a commit to ptosco/rdkit that referenced this issue Dec 10, 2019
@ptosco ptosco mentioned this issue Dec 10, 2019
@greglandrum greglandrum added this to the 2019_09_3 milestone Dec 11, 2019
greglandrum pushed a commit that referenced this issue Jan 9, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants