
complexity_test.cc spuriously fails #272

Closed
EricWF opened this issue Aug 9, 2016 · 1 comment · Fixed by #1757

Comments


EricWF commented Aug 9, 2016

The benchmark registered by `BENCHMARK(BM_Complexity_O1)->Range(1, 1<<18)->Complexity();` sometimes reports its complexity as lgN as opposed to the expected `([0-9]+)` pattern. This happens when one of the later repetitions happens to run faster due to CPU load.

For example:

BM_Complexity_O1/1                 2106 ns       3125 ns      10000
BM_Complexity_O1/8                 2132 ns       1563 ns      10000
BM_Complexity_O1/64                2050 ns       3125 ns      10000
BM_Complexity_O1/512               2032 ns       1563 ns      10000
BM_Complexity_O1/4k                2041 ns       1563 ns      10000
BM_Complexity_O1/32k               2264 ns      15625 ns       1000
BM_Complexity_O1/256k              2128 ns       1563 ns      10000
BM_Complexity_O1_BigO            163.31 lgN     389.19 lgN 
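The inference behind `Complexity()` can be pictured as a least-squares fit over a handful of candidate complexity classes, keeping the class with the smallest normalized RMS error. The sketch below is hypothetical Python, not the library's actual implementation (which lives in `src/complexity.cc` and uses a larger candidate set); it shows how a single load-induced outlier can flip an otherwise flat O(1) series over to lgN:

```python
import math

# Candidate complexity classes (trimmed set, for illustration only).
CANDIDATES = {
    "(1)": lambda n: 1.0,
    "lgN": lambda n: math.log2(n),
    "N": float,
}

def fit_big_o(ns, times):
    """Fit time ~ coef * f(n) for each candidate f; keep the lowest
    normalized root-mean-square error."""
    best = None
    for label, f in CANDIDATES.items():
        fs = [f(n) for n in ns]
        coef = sum(t * v for t, v in zip(times, fs)) / sum(v * v for v in fs)
        rms = math.sqrt(sum((t - coef * v) ** 2
                            for t, v in zip(times, fs)) / len(ns))
        rms /= sum(times) / len(times)  # normalize by the mean time
        if best is None or rms < best[1]:
            best = (label, rms)
    return best[0]

# Perfectly flat timings are classified as O(1)...
print(fit_big_o([1, 8, 64, 512, 4096], [2000.0] * 5))
# ...but one slow repetition (a CPU-load spike) flips the verdict to lgN.
print(fit_big_o([1, 8, 64, 512, 4096],
                [2000.0, 2000.0, 2000.0, 2000.0, 6000.0]))
```

With clean data the constant fit has zero residual and wins outright; the spike inflates the constant fit's residual enough that the logarithmic curve, which bends upward toward the outlier, scores better.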

I'm not quite sure how to fix this test, or even whether this is a bug in the complexity implementation. I would like some guidance on how to tackle it.

This is making our AppVeyor build fail frequently, so I would like to fix it. For now I'm going to check in a temporary fix which simply accepts 'lg(N)' as valid test output.
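The temporary workaround amounts to loosening the expected-output pattern. As an illustrative sketch (hypothetical Python; the real test uses the repo's own output-checking helpers, and the exact label spellings here are assumptions), a relaxed matcher might accept either label on the Big-O report line:

```python
import re

# Hypothetical relaxed matcher: accepts either the expected "(1)"-style
# numeric label or the spurious "lgN" label, for both the real-time and
# cpu-time coefficients on the Big-O line.
BIGO_LINE = re.compile(
    r"BM_Complexity_O1_BigO\s+"
    r"[0-9.]+ (?:\([0-9]+\)|lgN)\s+"   # real-time coefficient + label
    r"[0-9.]+ (?:\([0-9]+\)|lgN)"      # cpu-time coefficient + label
)

# Matches the spurious output from the report above:
print(bool(BIGO_LINE.search(
    "BM_Complexity_O1_BigO            163.31 lgN     389.19 lgN")))
```

This makes the test tolerant of the misclassification rather than preventing it, which is why it is only a stopgap.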

EricWF added a commit that referenced this issue Aug 9, 2016

dmah42 commented Apr 27, 2021

I think this is what we've been seeing recently on the AppVeyor Windows release tests. It does seem, though, that the GitHub workflows are more stable.

LebedevRI added a commit to LebedevRI/benchmark that referenced this issue Feb 1, 2024
As can be seen in e.g. https://github.com/google/benchmark/actions/runs/7711328637/job/21016492361
We may get `65: BM_Complexity_O1_BigO                           0.00 N^2        0.00 N^2  `
LebedevRI mentioned this issue Feb 1, 2024
LebedevRI added a commit that referenced this issue Feb 2, 2024
* `complexity_test`: deflake, same as #272

As can be seen in e.g. https://github.com/google/benchmark/actions/runs/7711328637/job/21016492361
We may get `65: BM_Complexity_O1_BigO                           0.00 N^2        0.00 N^2  `

* `user_counters_tabular_test`: deflake

We were still getting zero times there. Perhaps this is better?
LebedevRI added a commit to LebedevRI/benchmark that referenced this issue Feb 14, 2024
This test is fundamentally flaky because it tries to read tea leaves,
and it inherently misbehaves in CI environments,
since there are unmitigated sources of noise.

Fixes google#272
dmah42 pushed a commit that referenced this issue Feb 19, 2024
* Rewrite complexity_test to use (hardcoded) manual time

This test is fundamentally flaky because it tries to read tea leaves,
and it inherently misbehaves in CI environments,
since there are unmitigated sources of noise.

That being said, the computed Big-O also depends on `--benchmark_min_time=`.

Fixes #272

* Correctly compute Big-O for manual timings. Fixes #1758.

* complexity_test: do more stuff in empty loop

* Make all empty loops be a bit longer empty

It looks like some of these tests still fail on Windows;
I guess the clock precision is too coarse.
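The merged rewrite sidesteps CI noise by reporting hardcoded manual times (Google Benchmark supports this via `State::SetIterationTime()` together with `UseManualTime()`). A minimal Python sketch of the idea, with made-up numbers, shows why the reported time becomes deterministic regardless of wall-clock jitter:

```python
import random
import time

# Sketch of the idea behind the merged fix: instead of measuring wall
# time (noisy in CI), each iteration reports a hardcoded "manual" time,
# analogous to state.SetIterationTime(...) with ->UseManualTime().
def run_manual(n, iterations):
    total = 0.0
    for _ in range(iterations):
        time.sleep(random.uniform(0, 1e-4))  # noisy real work: ignored
        total += n * 1e-6                    # hardcoded per-iteration time
    return total / iterations                # always ~n microseconds

print(run_manual(64, 100))  # ~6.4e-05 no matter how long the loop took
```

Because the fitted Big-O now sees exact, hand-chosen timings for every problem size, the classification can no longer be flipped by a CPU-load spike.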