code with celero::DoNotOptimizeAway can be partly optimized away #13
Can you demonstrate this bug on a POD or STL type such that it can be duplicated? DoNotOptimizeAway takes a reference. In this case, it should take a reference to the result of the "u+v" operation and should not at all change the results or the way "u+v" is computed.
I think STL types are too complicated for the compiler to see through, but with a POD type the compiler does a great job. I created a small example:
#include <celero/Celero.h>
#include <eigen3/Eigen/Eigen>
CELERO_MAIN;
Eigen::Vector3f u, v;
struct Vec {
float x, y, z;
};
Vec a, b;
Vec add(const Vec& a, const Vec& b) {
Vec c;
c.x = a.x + b.x;
c.y = a.y + b.y;
c.z = a.z + b.z;
return c;
}
BASELINE(DemoSimple, Baseline, 0, 7100000)
{
asm("# test eigen begin");
celero::DoNotOptimizeAway(Eigen::Vector3f(u + v));
asm("# test eigen end");
asm("# test POD begin");
celero::DoNotOptimizeAway(add(a, b));
asm("# test POD end");
}
The assembly I got from GCC 4.7 is:
# 22 "/home/xu/projects/Celero/examples/bug_report.cpp" 1
# test eigen begin
# 0 "" 2
#NO_APP
movss u(%rip), %xmm0
addss v(%rip), %xmm0
movss %xmm0, (%rsp)
call getpid
cmpl $1, %eax
je .L68
.L65:
#APP
# 24 "/home/xu/projects/Celero/examples/bug_report.cpp" 1
# test eigen end
# 0 "" 2
# 26 "/home/xu/projects/Celero/examples/bug_report.cpp" 1
# test POD begin
# 0 "" 2
#NO_APP
movss a(%rip), %xmm0
addss b(%rip), %xmm0
movss %xmm0, 16(%rsp)
call getpid
cmpl $1, %eax
je .L69
.L66:
#APP
# 28 "/home/xu/projects/Celero/examples/bug_report.cpp" 1
# test POD end
With the bugfix, the result is as follows, so you can see the difference:
# 22 "/home/xu/projects/Celero/examples/bug_report.cpp" 1
# test eigen begin
# 0 "" 2
#NO_APP
movss u(%rip), %xmm0
addss v(%rip), %xmm0
movss %xmm0, (%rsp)
movss u+4(%rip), %xmm0
addss v+4(%rip), %xmm0
movss %xmm0, 4(%rsp)
movss u+8(%rip), %xmm0
addss v+8(%rip), %xmm0
movss %xmm0, 8(%rsp)
call getpid
cmpl $1, %eax
je .L65
.L68:
#APP
# 24 "/home/xu/projects/Celero/examples/bug_report.cpp" 1
# test eigen end
# 0 "" 2
# 26 "/home/xu/projects/Celero/examples/bug_report.cpp" 1
# test POD begin
# 0 "" 2
#NO_APP
movss a+4(%rip), %xmm1
movss a+8(%rip), %xmm0
addss b+4(%rip), %xmm1
movss a(%rip), %xmm2
addss b+8(%rip), %xmm0
addss b(%rip), %xmm2
movss %xmm1, 20(%rsp)
movss %xmm0, 24(%rsp)
movss %xmm2, 16(%rsp)
call getpid
cmpl $1, %eax
je .L71
.L67:
#APP
# 28 "/home/xu/projects/Celero/examples/bug_report.cpp" 1
# test POD end
Acknowledged. I see there is a problem here. I am checking in a fix. The fix for Visual Studio is not as nice as for gcc & clang, but I believe it addresses this issue. Thanks for the bug report!
Should fix "DoNotOptimizeAway" on GCC and Clang. Seems to have fixed it on Visual Studio 2013 as well, but this is harder to verify.
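For readers wondering what a GCC/Clang fix of this kind can look like, here is a minimal sketch of one common technique: an empty `asm volatile` statement that takes the object's address as input and declares a "memory" clobber. This is an assumption for illustration only, not necessarily the change that was checked into Celero; the names `keepWholeObject`, `Vec3f`, `addAndKeep`, and `usageExample` are hypothetical.

```cpp
#include <cassert>

// Hypothetical helper (GCC/Clang only, uses extended inline asm).
// NOT Celero's checked-in fix, just a sketch of the general technique.
template <class T>
inline void keepWholeObject(const T& value)
{
    // The address of value is an input to the asm statement, and the
    // "memory" clobber tells the compiler the asm may read any memory.
    // The compiler therefore has to finish writing every byte of value
    // before this point, not only the first one.
    asm volatile("" : : "r"(&value) : "memory");
}

struct Vec3f
{
    float x, y, z;
};

inline Vec3f addAndKeep(const Vec3f& a, const Vec3f& b)
{
    Vec3f c{a.x + b.x, a.y + b.y, a.z + b.z};
    keepWholeObject(c); // all three components must be computed and stored
    return c;
}

// Usage: the barrier emits no instructions and leaves the value unchanged.
inline void usageExample()
{
    const Vec3f r = addAndKeep(Vec3f{1.0f, 2.0f, 3.0f}, Vec3f{4.0f, 5.0f, 6.0f});
    assert(r.x == 5.0f && r.y == 7.0f && r.z == 9.0f);
}
```

The empty asm body adds no work to the measured loop, which is why this approach tends to be nicer on GCC and Clang than anything available on Visual Studio, where this asm syntax is not supported.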
First, thanks for the great project.
celero::DoNotOptimizeAway only tricks the compiler by calling putchar on the first char of the data. But the compiler (at least GCC 4.7) is smart enough to keep the calculation of that first char and optimize the rest away.
Example:
Vector3 u, v;
celero::DoNotOptimizeAway(u + v);
The compiler will only calculate u[0] + v[0] and ignore u[1] + v[1] and u[2] + v[2].
This can be checked in the generated asm code.
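To make the mechanism concrete, here is a minimal sketch of the pattern described above, reconstructed from this description and from the `call getpid` / `cmpl $1, %eax` sequence in the assembly listings. It is a hypothetical stand-in, not Celero's actual source; `firstByteOnly`, `Vec`, `add`, and `usageExample` are illustrative names, and `getpid` assumes a POSIX system.

```cpp
#include <cassert>
#include <cstdio>
#include <unistd.h> // getpid (POSIX only)

// Hypothetical stand-in for the pre-fix helper; NOT Celero's actual source.
template <class T>
void firstByteOnly(T&& datum)
{
    // The compiler cannot prove that getpid() != 1, so it has to keep this
    // branch. The branch only reads the first byte of datum, though, so only
    // the computation feeding that byte is required to survive optimization.
    if (getpid() == 1)
    {
        const void* p = &datum;
        std::putchar(*static_cast<const char*>(p));
    }
}

struct Vec
{
    float x, y, z;
};

Vec add(const Vec& a, const Vec& b)
{
    return Vec{a.x + b.x, a.y + b.y, a.z + b.z};
}

// Usage: the helper never modifies its argument; the problem is only that
// an optimizer may skip computing y and z when the result is otherwise unused.
inline void usageExample()
{
    Vec sum = add(Vec{1.0f, 2.0f, 3.0f}, Vec{4.0f, 5.0f, 6.0f});
    firstByteOnly(sum);
    assert(sum.x == 5.0f);
}
```

With the result passed directly, as in `firstByteOnly(add(a, b))`, nothing else observes `y` and `z`, which is consistent with the single `addss` seen in the GCC 4.7 output.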
I have a dirty fix: