Benchmarks should e.g. write output #5

Closed
AngusL opened this issue Feb 13, 2017 · 1 comment

AngusL commented Feb 13, 2017

As written, the compiled benchmarks don't measure the time taken for an n-body simulation. For example, the Fortran version, compiled with GNU Fortran (GCC) 4.8.5 20150623 (Red Hat 4.8.5-11), produces

0000000000400610 <main>:
  400610:       48 83 ec 08             sub    $0x8,%rsp
  400614:       e8 d7 ff ff ff          callq  4005f0 <_gfortran_set_args@plt>
  400619:       be 70 0b 40 00          mov    $0x400b70,%esi
  40061e:       bf 07 00 00 00          mov    $0x7,%edi
  400623:       e8 98 ff ff ff          callq  4005c0 <_gfortran_set_options@plt>
  400628:       66 0f 57 c0             xorpd  %xmm0,%xmm0
  40062c:       f2 0f 10 0d 64 05 00    movsd  0x564(%rip),%xmm1        # 400b98 <options.38.2155+0x28>
  400633:       00
  400634:       f2 0f 10 15 64 05 00    movsd  0x564(%rip),%xmm2        # 400ba0 <options.38.2155+0x30>
  40063b:       00
  40063c:       0f 1f 40 00             nopl   0x0(%rax)
  400640:       f2 0f 58 c1             addsd  %xmm1,%xmm0
  400644:       f2 0f 58 c1             addsd  %xmm1,%xmm0
  400648:       66 0f 2e d0             ucomisd %xmm0,%xmm2
  40064c:       73 f2                   jae    400640 <main+0x30>
  40064e:       31 c0                   xor    %eax,%eax
  400650:       48 83 c4 08             add    $0x8,%rsp
  400654:       c3                      retq
  400655:       0f 1f 00                nopl   (%rax)

The loop from 400640 to 40064c corresponds only to the time = time + half_time_step lines, i.e. the entire n-body simulation is optimised away. For these benchmarks to be at all useful, you must ensure that the compiler cannot prove that virtually all the work is unnecessary. The easiest way to do this is to write out the results at the end, and to time only the computation if that's what's of interest to you.
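To illustrate the suggestion, here is a minimal sketch in C rather than the benchmark's actual code. The `State` type, `integrate`, and `run_benchmark` are hypothetical names, and a one-dimensional harmonic oscillator stands in for the n-body system. The timer wraps only the integration, and the final state is printed afterwards, so the result is observable and the loop cannot be discarded as dead code.

```c
#include <assert.h>
#include <stdio.h>
#include <time.h>

/* Hypothetical stand-in for the benchmark state: one particle in a
   harmonic potential, so the acceleration is simply a = -x. */
typedef struct {
    double x;
    double v;
} State;

/* Advance the state with leapfrog (kick-drift-kick) steps. */
State integrate(State s, double time_step, long n_steps) {
    assert(n_steps >= 0);
    double half_time_step = 0.5 * time_step;
    for (long i = 0; i < n_steps; i++) {
        s.v += half_time_step * (-s.x); /* first half kick */
        s.x += time_step * s.v;         /* drift */
        s.v += half_time_step * (-s.x); /* second half kick */
    }
    return s;
}

/* Time only the integration, then write the final state to `out`.
   Because the result is written out, the compiler has to keep the work.
   Returns the CPU time spent in the integration, in seconds. */
double run_benchmark(long n_steps, FILE *out) {
    State s = {1.0, 0.0};
    clock_t start = clock();
    s = integrate(s, 0.001, n_steps);
    clock_t stop = clock();
    fprintf(out, "final state: x = %.15g, v = %.15g\n", s.x, s.v);
    return (double)(stop - start) / CLOCKS_PER_SEC;
}
```

A driver would call `run_benchmark` from `main` and report the returned time separately from the printed state. Taking `n_steps` from the command line rather than hard-coding it is a further precaution, since it removes any chance of the compiler evaluating the whole loop at compile time.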

@marblestation
Owner

That's a very good point, and it's related to #6. Thanks! It is fixed now.
