testGithub2245 in testPickers.cpp occasionally fails #2839

ptosco · 2019-12-10T18:19:13Z

If you run the test 1000 times, e.g. with

    ( FOR /L %i IN (1,1,1000) DO ( ctest -I 159,159 -V ) ) > debug.log 2>&1

you will likely see a couple of exceptions.
The exceptions always occur in testGithub2245(), on either line 45 or line 52, i.e. at one of the two TEST_ASSERT(picks1 != picks2) checks:

rdkit/Code/SimDivPickers/testPickers.cpp

Lines 35 to 64 in ff5266f

    
    void testGithub2245() {
      BOOST_LOG(rdErrorLog) << "-------------------------------------" << std::endl;
      BOOST_LOG(rdErrorLog) << "Testing github issue 2245: MinMax Diversity picker "
                               "seeding shows deterministic / non-random behaviour."
                            << std::endl;
      {
        RDPickers::MaxMinPicker pkr;
        int poolSz = 1000;
        auto picks1 = pkr.lazyPick(dist_on_line, poolSz, 10, RDKit::INT_VECT(), -1);
        auto picks2 = pkr.lazyPick(dist_on_line, poolSz, 10, RDKit::INT_VECT(), -1);
        TEST_ASSERT(picks1 != picks2);
      }
      {  // make sure the default is also random
        RDPickers::MaxMinPicker pkr;
        int poolSz = 1000;
        auto picks1 = pkr.lazyPick(dist_on_line, poolSz, 10);
        auto picks2 = pkr.lazyPick(dist_on_line, poolSz, 10);
        TEST_ASSERT(picks1 != picks2);
      }
      {  // and we're still reproducible when we want to be
        RDPickers::MaxMinPicker pkr;
        int poolSz = 1000;
        auto picks1 =
            pkr.lazyPick(dist_on_line, poolSz, 10, RDKit::INT_VECT(), 0xf00d);
        auto picks2 =
            pkr.lazyPick(dist_on_line, poolSz, 10, RDKit::INT_VECT(), 0xf00d);
        TEST_ASSERT(picks1 == picks2);
      }
      BOOST_LOG(rdErrorLog) << "Done" << std::endl;
    }

Initially I thought this was due to poor-quality random seeding (see http://www.pcg-random.org/posts/cpps-random_device.html), but replacing

    if (seed >= 0) {
      generator.seed(static_cast<rng_type::result_type>(seed));
    } else {
      generator.seed(std::random_device()());
    }

with

    if (seed >= 0) {
      generator.seed(static_cast<rng_type::result_type>(seed));
    } else {
      generator.seed(randutils::auto_seed_128{}.base());
    }

as recommended in http://www.pcg-random.org/posts/simple-portable-cpp-seed-entropy.html, didn't remove the occasional failures over 1000 runs.
So I realized that the occasional failures are not due to poor seeding but rather to the small sample size, which makes two identical picks turn up every once in a while.
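
As a rough sanity check on that explanation, here is a back-of-the-envelope estimate. It rests on an assumption that is not verified in this issue: that an unseeded pick only randomizes its first element, uniformly over the pool, so two pick lists are identical exactly when their first picks coincide. The function names are made up for illustration.

```cpp
#include <cassert>
#include <cmath>

// Probability that two independent uniform draws from a pool of poolSz
// items coincide.
// ASSUMPTION: only the first pick is random, so two pick lists are
// identical exactly when their first picks match.
double collisionProbability(int poolSz) {
  assert(poolSz > 0);
  return 1.0 / poolSz;
}

// Probability that at least one of `checks` independent
// TEST_ASSERT(picks1 != picks2) style checks fails in a single test run.
double runFailureProbability(int poolSz, int checks) {
  return 1.0 - std::pow(1.0 - collisionProbability(poolSz), checks);
}

// Expected number of failing runs when the test is repeated `runs` times.
double expectedFailures(int poolSz, int checks, int runs) {
  return runs * runFailureProbability(poolSz, checks);
}
```

Under that assumption, expectedFailures(1000, 2, 1000) comes out at about 2 (pool of 1000, two inequality checks per run, 1000 runs), which is in line with the "couple of exceptions" seen in the loop above.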
So I believe the best solution to the problem is to make the test more robust against occasional identical picks.
PR will follow soon.
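
One possible way to make such a check tolerant of occasional identical picks is sketched below. This is only an illustration of the idea and not necessarily what the PR does: the check is repeated a few times and fails only if every pair of picks is identical. `picksLookRandom` and `toyPick` are hypothetical names; `toyPick` is a stand-in that randomizes only its first element and is not the RDKit picker.

```cpp
#include <cassert>
#include <functional>
#include <random>
#include <vector>

using Picks = std::vector<int>;

// Returns true as soon as one of `attempts` pairs of freshly drawn picks
// differs. If a single pair collides with probability p, all pairs collide
// with probability p^attempts, so spurious failures become far rarer.
bool picksLookRandom(const std::function<Picks()> &drawPicks,
                     unsigned attempts) {
  for (unsigned i = 0; i < attempts; ++i) {
    if (drawPicks() != drawPicks()) {
      return true;
    }
  }
  return false;
}

// Hypothetical stand-in for an unseeded picker: the first element is
// random, the remaining ones follow deterministically from it.
Picks toyPick(std::mt19937 &rng, int poolSz, int pickSz) {
  assert(poolSz > 0 && pickSz > 0);
  std::uniform_int_distribution<int> firstPick(0, poolSz - 1);
  Picks res(pickSz);
  res[0] = firstPick(rng);
  for (int i = 1; i < pickSz; ++i) {
    res[i] = (res[i - 1] + poolSz / pickSz) % poolSz;
  }
  return res;
}
```

In a test this could be used along the lines of `TEST_ASSERT(picksLookRandom(draw, 5))`, where `draw` wraps the unseeded call. A genuinely deterministic picker still fails the check, since every attempt then yields identical picks.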


ptosco self-assigned this Dec 10, 2019

ptosco added a commit to ptosco/rdkit that referenced this issue Dec 10, 2019

- fixes rdkit#2839

fb42ab3

ptosco mentioned this issue Dec 10, 2019

Fixes #2839 #2840

Merged

greglandrum closed this as completed in d714f51 Dec 11, 2019

greglandrum added the bug label Dec 11, 2019

greglandrum added this to the 2019_09_3 milestone Dec 11, 2019

greglandrum pushed a commit that referenced this issue Jan 9, 2020

- fixes #2839

a9e53ad
