Conversation
|
While this makes sense, I wonder if it's even worth asserting that the number of false positives is less than some number. Each test asserting that there are no false negatives may be good enough. I assume there is still a chance of failure, so it may not really be suitable for a unit test that runs every time. Curious what Parquet or other projects do. Also wondering what others think. |
|
I think besides asserting no false negatives, we probably still need to have a negative test for each of the data types? |
|
Yeah, it doesn't seem great to have a chance of random failures in build tests, but I could go either way on it. Also curious: with your buffer, what is the estimated chance of failure (is it astronomically small)? Any thoughts from @kbendick @rdblue @RussellSpitzer? |
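For a rough sense of the failure chance asked about above: if each negative lookup independently hits a false positive with probability fpp, the chance that at least one of n lookups fails is 1 - (1 - fpp)^n. The sketch below just evaluates that formula. The class name, method name, and the fpp value of 0.01 are illustrative assumptions, not taken from the PR or from any Iceberg or Parquet API.

```java
// Sketch only: FailureChance and its method are hypothetical names, not Iceberg or Parquet APIs.
class FailureChance {
  // Probability that at least one of `negativeChecks` independent lookups
  // returns a false positive, given a per-lookup false positive probability `fpp`.
  static double atLeastOneFalsePositive(double fpp, int negativeChecks) {
    return 1.0 - Math.pow(1.0 - fpp, negativeChecks);
  }

  public static void main(String[] args) {
    double fpp = 0.01; // assumed value, for illustration only
    for (int checks : new int[] {1, 10, 100}) {
      System.out.printf("%d checks -> %.4f%n", checks, atLeastOneFalsePositive(fpp, checks));
    }
  }
}
```

Under these assumptions the chance grows quickly with the number of negative assertions, so it is probably not astronomically small unless the buffer is generous.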
|
Rather than adding code for false positives, wouldn't it be better to make the test use a consistent random seed that doesn't have a false positive? You'd need to replace |
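A minimal sketch of the fixed-seed idea suggested above, assuming the test data comes from java.util.Random. The class name, method name, and seed value are hypothetical. In practice the seed would be one that was checked once to produce no false positive for the test's lookups.

```java
import java.util.Random;

// Sketch only: SeededValues and SEED are hypothetical, not names from the PR.
class SeededValues {
  // A fixed seed makes every run generate the same values, so a seed that was
  // verified once to avoid false positives keeps avoiding them.
  static final long SEED = 12345L;

  static long[] generate(long seed, int count) {
    Random random = new Random(seed);
    long[] values = new long[count];
    for (int i = 0; i < count; i++) {
      values[i] = random.nextLong();
    }
    return values;
  }

  public static void main(String[] args) {
    long[] first = generate(SEED, 5);
    long[] second = generate(SEED, 5);
    // Same seed, same sequence: the test input is deterministic across runs.
    System.out.println(java.util.Arrays.equals(first, second));
  }
}
```

Because the inputs no longer change between runs, the negative assertion either always passes or always fails, which removes the flakiness without adding tolerance logic for false positives.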
|
@rdblue Thanks for the suggestion! Done. |
|
Thanks, @huaxingao! |
|
Thank you very much! @rdblue @szehon-ho |
I saw the bloom filter test assertion
Assert.assertFalse("Should not read: ...", shouldRead) fail with a false positive. This PR makes the bloom filter test less flaky by taking the fpp (false positive probability) into consideration.