benchmark::DoNotOptimize not behaving as expected at -O1, -O2 and -O3 #242

NAThompson · 2016-06-24T19:17:45Z

For the following code:

#include <cmath>
#include <benchmark/benchmark.h>

static void BM_Pow(benchmark::State& state)
{
    double y;
    while (state.KeepRunning())
    {
        benchmark::DoNotOptimize(y = std::pow(1.2, 3.8));
    }
}

BENCHMARK(BM_Pow);
BENCHMARK_MAIN();

Compiled on Ubuntu 16.04, with clang version

$ clang -v
clang version 3.8.0-2ubuntu3 (tags/RELEASE_380/final)
Target: x86_64-pc-linux-gnu

and CPPFLAGS:= -std=c++14 -O3, no call to std::pow is made; to wit, when the assembly is spat out with clang++ $(CPPFLAGS) -S -masm=intel test.cpp, and the output examined with

$ cat run_benchmarks.s | grep 'call' | awk '{print $2}' | xargs c++filt | sort | uniq
__assert_fail
benchmark::Initialize(int*, char**)
benchmark::internal::Benchmark::Benchmark(char const*)
benchmark::internal::RegisterBenchmarkInternal(benchmark::internal::Benchmark*)
benchmark::RunSpecifiedBenchmarks()
benchmark::State::KeepRunning()
benchmark::State::PauseTiming()
benchmark::State::ResumeTiming()
operator delete(void*)
operator new(unsigned long)
_Unwind_Resume

I do not see any call to pow. When compiling with -O0, however, I do see a call to pow.

Even at -O0, I do not see the call to pow within the while loop, but rather it is called only once.

This is a workaround:

#include <cmath>
#include <ostream>
#include <benchmark/benchmark.h>

static void BM_Pow(benchmark::State& state)
{
    double y;
    while (state.KeepRunning())
    {
        benchmark::DoNotOptimize(y = std::pow(1.2, 3.8));
    }
    std::ostream cnull(0);
    cnull << y;
}

BENCHMARK(BM_Pow);
BENCHMARK_MAIN();

Then the call to pow is observed in the loop. However, this seems a heroic response to a user error rather than a necessity.

Is there a better way?


EricWF · 2016-06-24T19:20:44Z

DoNotOptimize(...) can only help prevent the optimization of the result, and not the intermediate expressions.

EricWF · 2016-06-24T19:26:46Z

I'll look into this more over the weekend.

NAThompson · 2016-06-24T19:44:44Z

So if I understand you correctly: if std::pow is constexpr, it could be evaluated at compile time, and hence my benchmark::DoNotOptimize(y = std::pow(1.2, 3.8)); would just compile down to mov xmm0, 1.9993495762998474. But if I didn't put in benchmark::DoNotOptimize, it could throw away the move operation as well?

EricWF · 2016-07-01T21:57:38Z

Essentially yes, that is correct, except that the optimizations can happen even without constexpr.

DoNotOptimize(<expr>) works by forcing the result of <expr> to be stored to memory, which in turn forces the compiler to actually evaluate <expr>. It does not prevent the compiler from optimizing the evaluation of <expr> but it does prevent the expression from being discarded completely.
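To make that concrete, here is a minimal sketch of the idea for GCC and Clang. The names keep_alive and pow_constant_inputs are hypothetical, and this is not the library's actual implementation. It only illustrates how an empty inline-asm statement can force a value to exist in memory without stopping the compiler from simplifying the expression that produced it.

```cpp
#include <cmath>

// Hypothetical stand-in for benchmark::DoNotOptimize (GCC/Clang only).
// The "m" constraint asks the compiler to materialize `value` in memory,
// and the "memory" clobber tells it the asm may read or write memory.
template <class T>
inline void keep_alive(T const& value) {
    asm volatile("" : : "m"(value) : "memory");
}

// The result must exist in memory at every call to keep_alive, so the
// store cannot be discarded. The inputs are constants, though, so the
// compiler may still fold std::pow(1.2, 3.8) into a precomputed value.
inline double pow_constant_inputs(int iterations) {
    double y = 0.0;
    for (int i = 0; i < iterations; ++i) {
        y = std::pow(1.2, 3.8);
        keep_alive(y);
    }
    return y;
}
```

Compiling something like this with -O3 -S and looking for a call to pow is a quick way to check what the optimizer did on a given toolchain.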

As you noted in your example, the compiler optimized <expr> so that it only had to be evaluated once and could therefore reuse the result on each loop iteration. Unfortunately, you just have to be aware of these gotchas when writing benchmarks.

In my experience it's important to give the benchmark different inputs on every iteration to prevent this kind of optimization from taking place.
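A rough sketch of what that could look like, written without the benchmark library so it stands alone. The helper names (make_opaque, make_bases, pow_varying_inputs) and the input values are made up for illustration, and the inline asm assumes GCC or Clang.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper (GCC/Clang only): the "+m" constraint marks `value`
// as both read and written by the asm, so the compiler has to assume it
// may have changed and cannot treat it as a known constant afterwards.
template <class T>
inline void make_opaque(T& value) {
    asm volatile("" : "+m"(value) : : "memory");
}

// Builds `count` bases starting at 1.2 in steps of 0.001.
// The values are arbitrary and only serve the illustration.
inline std::vector<double> make_bases(std::size_t count) {
    std::vector<double> bases(count);
    for (std::size_t i = 0; i < count; ++i) {
        bases[i] = 1.2 + 0.001 * static_cast<double>(i);
    }
    return bases;
}

// Calls std::pow with a different base on each iteration and returns the
// last result. Each input and each result passes through make_opaque.
inline double pow_varying_inputs(const std::vector<double>& bases,
                                 double exponent) {
    double y = 0.0;
    for (double base : bases) {
        make_opaque(base);             // input is unknown to the optimizer
        y = std::pow(base, exponent);
        make_opaque(y);                // result must be written to memory
    }
    return y;
}
```

In a real benchmark the loop body would sit inside while (state.KeepRunning()), with the inputs prepared before the loop so that building them is not part of the timing.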

EricWF · 2016-07-11T21:40:40Z

I checked in a slightly improved version of DoNotOptimize(...) and additional docs that try to clarify how to use it, including a description of the problem you're running into. I'm going to close this because I don't think I can do much better than that.

Thanks for the report. It's greatly appreciated.


EricWF closed this as completed Jul 11, 2016

springmeyer mentioned this issue Sep 11, 2017

Benchmark: Phase 2 mapbox/hpp-skel#48

Merged

3 tasks

brawner mentioned this issue Oct 20, 2020

Initial benchmark tests for rclcpp::init/shutdown create/destroy node ros2/rclcpp#1411

Merged
