MinHasher32 test is flakey #500

johnynek · 2015-11-19T07:14:02Z

[info] MinHasher32
[info] - should measure 0.5 similarity in 1024 bytes with < 0.1 error
[info] - should measure 0.8 similarity in 1024 bytes with < 0.05 error *** FAILED ***
[info]   0.05833333333333324 was not less than 0.05 (MinHasherTest.scala:30)
[info] - should measure 1.0 similarity in 1024 bytes with < 0.01 error

This failure is quite common. We need to either weaken the bound or figure out whether there is really a bug (I doubt it).


Gabriella439 · 2016-02-22T23:38:43Z

Another instance of this occurring:

https://travis-ci.org/twitter/algebird/jobs/111090636

[info] - should measure 0.8 similarity in 1024 bytes with < 0.05 error *** FAILED ***
[info]   0.07916666666666672 was not less than 0.05 (MinHasherTest.scala:30)

johnynek · 2016-02-23T01:21:04Z

we have that approach that @sid-kap added. @Gabriel439 want to take a stab at porting this test to that framework?

Gabriella439 · 2016-02-23T17:45:14Z

Sure, I will give it a stab

Gabriella439 · 2016-02-26T00:01:51Z

So I think this is a case of the error bounds being incorrect. The test that occasionally fails is this one:

    "measure 0.8 similarity in 1024 bytes with < 0.05 error" in {
      test(new MinHasher32(0.8, 1024), 0.8, 0.05)
    }

... and if you follow the code for that specific MinHasher32 constructor, it initializes numHashes to 247. The expected error for the MinHash algorithm is 1 / sqrt(numHashes), which in this case evaluates to approximately 0.064. This explains why the test occasionally fails: it requires an error less than 0.05, which is below the expected error. We should probably bump the test's error bound up to at least 0.1.
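As a quick sanity check of that arithmetic, here is a minimal Scala sketch. It assumes the 1 / sqrt(numHashes) rule of thumb quoted above and the numHashes value of 247 reported for this constructor call; `expectedError` is a hypothetical helper written for illustration, not part of Algebird.

```scala
// Hypothetical helper (not an Algebird API): rule-of-thumb MinHash
// error for a given number of hash functions, 1 / sqrt(numHashes).
def expectedError(numHashes: Int): Double =
  1.0 / math.sqrt(numHashes.toDouble)

// Value reported in this thread for new MinHasher32(0.8, 1024).
val numHashes = 247

// Roughly 0.064 for 247 hashes.
val err = expectedError(numHashes)

// The failing test demands an error below 0.05, which is tighter than
// the expected error, while a bound of 0.1 leaves some headroom.
val tooTightBound = 0.05
val proposedBound = 0.1
println(f"expected error = $err%.4f")
println(s"exceeds $tooTightBound: ${err > tooTightBound}")
println(s"within $proposedBound: ${err < proposedBound}")
```

Under those assumptions the expected error sits above 0.05 and below 0.1, which matches the observed failures (0.058 and 0.079).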

johnynek · 2016-02-26T02:16:59Z

sounds fine to me.

johnynek added the testing label Nov 19, 2015

johnynek closed this as completed in 7bcbc68 Feb 26, 2016
