
Branch Prediction

The most-voted question on Stack Overflow so far is about branch prediction.

The code in this repository tests how the code generated by different compilers performs on that question's benchmark.

It also includes some modified versions of the originally posted sum loop, inspired by answers to that question.

I have tried GCC 4.8.2 and ICC 14.0.1.

Here are the outputs:

GCC 4.8.2 with -O2:

Sorted:
2.43
sum = 314931600000
Unsorted:
14.39
sum = 314635000000
Sorted swapped:
0
sum = 314226200000
Unsorted swapped:
0
sum = 315452200000

ICC 14.0.1 with -O2:

Sorted:
0.3
sum = 314931600000
Unsorted:
0.31
sum = 314635000000
Sorted swapped:
0.3
sum = 314226200000
Unsorted swapped:
0.31
sum = 315452200000

I understand ICC's results now after reading Mysticial's answer. ICC interchanges the inner and outer loops for me, so I do not have to rewrite the code myself.
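To make the interchange concrete, here is a minimal sketch of the two loop orders. It is not taken from main.cpp: the function names, the `reps` parameter, and the `>= 128` threshold (borrowed from the Stack Overflow question) are assumptions made for illustration. Both functions compute the same sum; only the nesting order differs.

```cpp
#include <cstddef>
#include <vector>

// Original order (as in the posted benchmark): the whole array is scanned
// `reps` times, so the branch sees a new element on every inner iteration.
long long sum_original_order(const std::vector<int>& data, unsigned reps)
{
    long long sum = 0;
    for (unsigned i = 0; i < reps; ++i)
    {
        for (std::size_t c = 0; c < data.size(); ++c)
        {
            if (data[c] >= 128)  // threshold is an assumption from the SO question
                sum += data[c];
        }
    }
    return sum;
}

// Interchanged order: each element is handled `reps` times in a row, so the
// branch outcome is constant for the whole inner loop and trivially predictable.
long long sum_interchanged_order(const std::vector<int>& data, unsigned reps)
{
    long long sum = 0;
    for (std::size_t c = 0; c < data.size(); ++c)
    {
        for (unsigned i = 0; i < reps; ++i)
        {
            if (data[c] >= 128)
                sum += data[c];
        }
    }
    return sum;
}
```

In the interchanged form it no longer matters whether the data is sorted, which would be consistent with ICC reporting nearly identical times for all four cases.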

But GCC's 0s are surprising.

I modified the code further to change

for (unsigned i = 0; i < 100000; ++i)
{
  sum += data[c];
}

to

sum += data[c] * 100000;

and ICC gives me 0s just like GCC.

So ICC intelligently optimizes the code by interchanging the loops when it is poorly written. GCC does not do this.

However, GCC recognizes that the benchmark loop is written even more poorly once it has been rewritten as an inner loop, and simply removes the loop.

Smart compilers cannot completely fix poorly written loops in code, at least for now.
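The sketch below illustrates why a compiler is free to remove such an inner loop: adding the same value a fixed number of times is equivalent to a single multiplication. The function names, the `reps` parameter, and the `>= 128` threshold are hypothetical and not taken from main.cpp; this shows the equivalence, not what GCC actually emits.

```cpp
#include <cstddef>
#include <vector>

// Swapped form: the inner loop adds the same value `reps` times.
long long sum_with_inner_loop(const std::vector<int>& data, unsigned reps)
{
    long long sum = 0;
    for (std::size_t c = 0; c < data.size(); ++c)
    {
        if (data[c] >= 128)  // threshold is an assumption from the SO question
        {
            for (unsigned i = 0; i < reps; ++i)
            {
                sum += data[c];
            }
        }
    }
    return sum;
}

// Folded form: the inner loop is replaced by one multiplication, which leaves
// a single cheap pass over the array.
long long sum_folded(const std::vector<int>& data, unsigned reps)
{
    long long sum = 0;
    for (std::size_t c = 0; c < data.size(); ++c)
    {
        if (data[c] >= 128)
            sum += static_cast<long long>(data[c]) * reps;
    }
    return sum;
}
```

With the inner loop gone, the remaining work is one pass over the array, which may be too short for the timer to register and would explain the reported 0s.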