A third branching test program, comparing compiler optimization levels (-O0 vs -O3) and OpenMP usage
Results from my 9950X3D, compiled using clang with -O0 / -O3 -march=x86-64-v4 -fopenmp | Results from my 2x E5-2680 v4, compiled using clang with -O0 / -O3 -march=x86-64-v3 -fopenmp |
--------------------------------------------------------------------------------|----------------------------------------------------------------------------------|----------------------------------------------------------------------------------|----------------------------------------------------------------------------------|
O0: | O3: | O0: | O3: |
seed = 13 | seed = 13 | seed = 13 | seed = 13 |
100 iterations | 100 iterations | 100 iterations | 100 iterations |
| | | |
branching 0 sum : 619102100, time: 0000.002s (no omp, if branching) | branching 0 sum : 619102100, time: 0000.000s (no omp, if branching) | branching 0 sum : 619102100, time: 0000.015s (no omp, if branching) | branching 0 sum : 619102100, time: 0000.001s (no omp, if branching) |
branching 1 sum : 952662100, time: 0000.002s (no omp, ternary branching) | branching 1 sum : 952662100, time: 0000.000s (no omp, ternary branching) | branching 1 sum : 952662100, time: 0000.013s (no omp, ternary branching) | branching 1 sum : 952662100, time: 0000.001s (no omp, ternary branching) |
branching 2 sum : 427665400, time: 0000.016s (omp, if branching) | branching 2 sum : 427665400, time: 0000.011s (omp, if branching) | branching 2 sum : 427665400, time: 0000.016s (omp, if branching) | branching 2 sum : 427665400, time: 0000.011s (omp, if branching) |
branching 3 sum : 795020100, time: 0000.001s (omp, ternary branching) | branching 3 sum : 795020100, time: 0000.000s (omp, ternary branching) | branching 3 sum : 795020100, time: 0000.002s (omp, ternary branching) | branching 3 sum : 795020100, time: 0000.000s (omp, ternary branching) |
branchless 1 sum : 769295800, time: 0000.003s (no omp, no branching) | branchless 1 sum : 769295800, time: 0000.001s (no omp, no branching) | branchless 1 sum : 769295800, time: 0000.014s (no omp, no branching) | branchless 1 sum : 769295800, time: 0000.001s (no omp, no branching) |
branchless 2 sum : 517891900, time: 0000.001s (omp, no branching) | branchless 2 sum : 517891900, time: 0000.000s (omp, no branching) | branchless 2 sum : 517891900, time: 0000.001s (omp, no branching) | branchless 2 sum : 517891900, time: 0000.000s (omp, no branching) |
| | | |
total time taken : 0000.012s | total time taken : 0000.003s | total time taken : 0000.061s | total time taken : 0000.014s |
| | | |
--------------------------------------------------------------------------------|----------------------------------------------------------------------------------|----------------------------------------------------------------------------------|----------------------------------------------------------------------------------|
O0: | O3: | O0: | O3: |
seed = 15 | seed = 15 | seed = 15 | seed = 15 |
1 000 iterations | 1 000 iterations | 1 000 iterations | 1 000 iterations |
| | | |
branching 0 sum : 6464965000, time: 0000.017s (no omp, if branching) | branching 0 sum : 6464965000, time: 0000.001s (no omp, if branching) | branching 0 sum : 6464965000, time: 0000.136s (no omp, if branching) | branching 0 sum : 6464965000, time: 0000.005s (no omp, if branching) |
branching 1 sum : 4866297000, time: 0000.022s (no omp, ternary branching) | branching 1 sum : 4866297000, time: 0000.001s (no omp, ternary branching) | branching 1 sum : 4866297000, time: 0000.213s (no omp, ternary branching) | branching 1 sum : 4866297000, time: 0000.006s (no omp, ternary branching) |
branching 2 sum : 5219985000, time: 0000.036s (omp, if branching) | branching 2 sum : 5219985000, time: 0000.002s (omp, if branching) | branching 2 sum : 5219985000, time: 0000.055s (omp, if branching) | branching 2 sum : 5219985000, time: 0000.009s (omp, if branching) |
branching 3 sum : 3713981000, time: 0000.008s (omp, ternary branching) | branching 3 sum : 3713981000, time: 0000.000s (omp, ternary branching) | branching 3 sum : 3713981000, time: 0000.013s (omp, ternary branching) | branching 3 sum : 3713981000, time: 0000.001s (omp, ternary branching) |
branchless 1 sum : 9128063000, time: 0000.029s (no omp, no branching) | branchless 1 sum : 9128063000, time: 0000.002s (no omp, no branching) | branchless 1 sum : 9128063000, time: 0000.130s (no omp, no branching) | branchless 1 sum : 9128063000, time: 0000.011s (no omp, no branching) |
branchless 2 sum : 3471713000, time: 0000.002s (omp, no branching) | branchless 2 sum : 3471713000, time: 0000.001s (omp, no branching) | branchless 2 sum : 3471713000, time: 0000.006s (omp, no branching) | branchless 2 sum : 3471713000, time: 0000.000s (omp, no branching) |
| | | |
total time taken : 0000.114s | total time taken : 0000.007s | total time taken : 0000.553s | total time taken : 0000.032s |
| | | |
--------------------------------------------------------------------------------|----------------------------------------------------------------------------------|----------------------------------------------------------------------------------|----------------------------------------------------------------------------------|
O0: | O3: | O0: | O3: |
seed = 23 | seed = 23 | seed = 23 | seed = 23 |
10 000 iterations | 10 000 iterations | 10 000 iterations | 10 000 iterations |
| | | |
branching 0 sum : 76470830000, time: 0000.225s (no omp, if branching) | branching 0 sum : 76470830000, time: 0000.017s (no omp, if branching) | branching 0 sum : 76470830000, time: 0001.017s (no omp, if branching) | branching 0 sum : 76470830000, time: 0000.055s (no omp, if branching) |
branching 1 sum : 69793490000, time: 0000.230s (no omp, ternary branching) | branching 1 sum : 69793490000, time: 0000.017s (no omp, ternary branching) | branching 1 sum : 69793490000, time: 0001.391s (no omp, ternary branching) | branching 1 sum : 69793490000, time: 0000.054s (no omp, ternary branching) |
branching 2 sum : 92710890000, time: 0000.141s (omp, if branching) | branching 2 sum : 92710890000, time: 0000.008s (omp, if branching) | branching 2 sum : 92710890000, time: 0000.129s (omp, if branching) | branching 2 sum : 92710890000, time: 0000.010s (omp, if branching) |
branching 3 sum : 81672990000, time: 0000.033s (omp, ternary branching) | branching 3 sum : 81672990000, time: 0000.002s (omp, ternary branching) | branching 3 sum : 81672990000, time: 0000.063s (omp, ternary branching) | branching 3 sum : 81672990000, time: 0000.004s (omp, ternary branching) |
branchless 1 sum : 46755180000, time: 0000.272s (no omp, no branching) | branchless 1 sum : 46755180000, time: 0000.024s (no omp, no branching) | branchless 1 sum : 46755180000, time: 0001.091s (no omp, no branching) | branchless 1 sum : 46755180000, time: 0000.107s (no omp, no branching) |
branchless 2 sum : 61535460000, time: 0000.023s (omp, no branching) | branchless 2 sum : 61535460000, time: 0000.001s (omp, no branching) | branchless 2 sum : 61535460000, time: 0000.053s (omp, no branching) | branchless 2 sum : 61535460000, time: 0000.002s (omp, no branching) |
| | | |
total time taken : 0000.924s | total time taken : 0000.069s | total time taken : 0003.744s | total time taken : 0000.232s |
| | | |
--------------------------------------------------------------------------------|----------------------------------------------------------------------------------|----------------------------------------------------------------------------------|----------------------------------------------------------------------------------|
O0: | O3: | O0: | O3: |
seed = 35 | seed = 35 | seed = 35 | seed = 35 |
100 000 iterations | 100 000 iterations | 100 000 iterations | 100 000 iterations |
| | | |
branching 0 sum : 915637000000, time: 0001.653s (no omp, if branching) | branching 0 sum : 915637000000, time: 0000.181s (no omp, if branching) | branching 0 sum : 915637000000, time: 0008.320s (no omp, if branching) | branching 0 sum : 915637000000, time: 0000.541s (no omp, if branching) |
branching 1 sum : 993437000000, time: 0001.945s (no omp, ternary branching) | branching 1 sum : 993437000000, time: 0000.176s (no omp, ternary branching) | branching 1 sum : 993437000000, time: 0012.792s (no omp, ternary branching) | branching 1 sum : 993437000000, time: 0000.540s (no omp, ternary branching) |
branching 2 sum : 470436000000, time: 0000.200s (omp, if branching) | branching 2 sum : 470436000000, time: 0000.166s (omp, if branching) | branching 2 sum : 470436000000, time: 0000.679s (omp, if branching) | branching 2 sum : 470436000000, time: 0000.223s (omp, if branching) |
branching 3 sum : 407236800000, time: 0000.338s (omp, ternary branching) | branching 3 sum : 407236800000, time: 0000.020s (omp, ternary branching) | branching 3 sum : 407236800000, time: 0000.786s (omp, ternary branching) | branching 3 sum : 407236800000, time: 0000.035s (omp, ternary branching) |
branchless 1 sum : 351323400000, time: 0002.162s (no omp, no branching) | branchless 1 sum : 351323400000, time: 0000.225s (no omp, no branching) | branchless 1 sum : 351323400000, time: 0010.321s (no omp, no branching) | branchless 1 sum : 351323400000, time: 0000.632s (no omp, no branching) |
branchless 2 sum : 519788400000, time: 0000.174s (omp, no branching) | branchless 2 sum : 519788400000, time: 0000.012s (omp, no branching) | branchless 2 sum : 519788400000, time: 0000.358s (omp, no branching) | branchless 2 sum : 519788400000, time: 0000.016s (omp, no branching) |
| | | |
total time taken : 0006.472s | total time taken : 0000.780s | total time taken : 0033.256s | total time taken : 0001.987s |
| | | |
--------------------------------------------------------------------------------|----------------------------------------------------------------------------------|----------------------------------------------------------------------------------|----------------------------------------------------------------------------------|
O0: | O3: | O0: | O3: |
seed = 38 | seed = 38 | seed = 38 | seed = 38 |
1 000 000 iterations | 1 000 000 iterations | 1 000 000 iterations | 1 000 000 iterations |
| | | |
branching 0 sum : 9492960000000, time: 0016.507s (no omp, if branching) | branching 0 sum : 9492960000000, time: 0001.760s (no omp, if branching) | branching 0 sum : 9492960000000, time: 0083.222s (no omp, if branching) | branching 0 sum : 9492960000000, time: 0005.361s (no omp, if branching) |
branching 1 sum : 8490179000000, time: 0019.275s (no omp, ternary branching) | branching 1 sum : 8490179000000, time: 0001.766s (no omp, ternary branching) | branching 1 sum : 8490179000000, time: 0127.412s (no omp, ternary branching) | branching 1 sum : 8490179000000, time: 0005.275s (no omp, ternary branching) |
branching 2 sum : 6194245000000, time: 0001.631s (omp, if branching) | branching 2 sum : 6194245000000, time: 0000.159s (omp, if branching) | branching 2 sum : 6194245000000, time: 0004.650s (omp, if branching) | branching 2 sum : 6194245000000, time: 0000.281s (omp, if branching) |
branching 3 sum : 8511006000000, time: 0001.896s (omp, ternary branching) | branching 3 sum : 8511006000000, time: 0000.160s (omp, ternary branching) | branching 3 sum : 8511006000000, time: 0003.659s (omp, ternary branching) | branching 3 sum : 8511006000000, time: 0000.242s (omp, ternary branching) |
branchless 1 sum : 5625151000000, time: 0020.942s (no omp, no branching) | branchless 1 sum : 5625151000000, time: 0001.813s (no omp, no branching) | branchless 1 sum : 5625151000000, time: 0099.294s (no omp, no branching) | branchless 1 sum : 5625151000000, time: 0005.696s (no omp, no branching) |
branchless 2 sum : 7767285000000, time: 0001.743s (omp, no branching) | branchless 2 sum : 7767285000000, time: 0000.076s (omp, no branching) | branchless 2 sum : 7767285000000, time: 0003.235s (omp, no branching) | branchless 2 sum : 7767285000000, time: 0000.121s (omp, no branching) |
| | | |
total time taken : 0061.994s | total time taken : 0005.734s | total time taken : 0321.472s | total time taken : 0016.976s |
| | | |
--------------------------------------------------------------------------------|----------------------------------------------------------------------------------|----------------------------------------------------------------------------------|----------------------------------------------------------------------------------|
O0: | O3: | O0: | O3: |
seed = 40 | seed = 40 | seed = 40 | seed = 40 |
10 000 000 iterations | 10 000 000 iterations | 10 000 000 iterations | 10 000 000 iterations |
| | | |
branching 0 sum : 97131160000000, time: 0164.926s (no omp, if branching) | branching 0 sum : 97131160000000, time: 0017.743s (no omp, if branching) | branching 0 sum : 97131160000000, time: 0835.084s (no omp, if branching) | branching 0 sum : 97131160000000, time: 0052.663s (no omp, if branching) |
branching 1 sum : 36903210000000, time: 0197.652s (no omp, ternary branching) | branching 1 sum : 36903210000000, time: 0017.609s (no omp, ternary branching) | branching 1 sum : 36903210000000, time: 2675.550s (no omp, ternary branching) | branching 1 sum : 36903210000000, time: 0052.400s (no omp, ternary branching) |
branching 2 sum : 72930240000000, time: 0018.749s (omp, if branching) | branching 2 sum : 72930240000000, time: 0001.931s (omp, if branching) | branching 2 sum : 72930240000000, time: 0034.469s (omp, if branching) | branching 2 sum : 72930240000000, time: 0002.176s (omp, if branching) |
branching 3 sum : 41889540000000, time: 0033.025s (omp, ternary branching) | branching 3 sum : 41889540000000, time: 0001.516s (omp, ternary branching) | branching 3 sum : 41889540000000, time: 0062.685s (omp, ternary branching) | branching 3 sum : 41889540000000, time: 0002.092s (omp, ternary branching) |
branchless 1 sum : 72120620000000, time: 0209.128s (no omp, no branching) | branchless 1 sum : 72120620000000, time: 0017.849s (no omp, no branching) | branchless 1 sum : 72120620000000, time: 0954.937s (no omp, no branching) | branchless 1 sum : 72120620000000, time: 0055.094s (no omp, no branching) |
branchless 2 sum : 57091300000000, time: 0017.565s (omp, no branching) | branchless 2 sum : 57091300000000, time: 0000.607s (omp, no branching) | branchless 2 sum : 57091300000000, time: 0031.417s (omp, no branching) | branchless 2 sum : 57091300000000, time: 0000.996s (omp, no branching) |
| | | |
total time taken : 0641.045s | total time taken : 0057.255s | total time taken : 4594.142s | total time taken : 0165.42s |
| | | |
--------------------------------------------------------------------------------|----------------------------------------------------------------------------------|----------------------------------------------------------------------------------|----------------------------------------------------------------------------------|
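The source of the test program is not shown here, so the following is only a minimal sketch of what the six labelled variants (if, ternary, and branchless, each with and without omp) might look like. Everything in it is an assumption made for illustration: the workload (summing values at or above a threshold), the names THRESHOLD, fill_values, sum_* and check_variants, and the random number generator. It is not the code that produced the sums and timings above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical workload: sum every value that is >= THRESHOLD. */
#define THRESHOLD 128

/* Fill v with pseudo-random values in 0..255 (simple LCG, illustration only). */
void fill_values(int32_t *v, size_t n, uint32_t seed) {
    uint32_t s = seed;
    for (size_t i = 0; i < n; ++i) {
        s = s * 1664525u + 1013904223u;
        v[i] = (int32_t)(s >> 24);
    }
}

/* "no omp, if branching" */
int64_t sum_if(const int32_t *v, size_t n) {
    int64_t sum = 0;
    for (size_t i = 0; i < n; ++i) {
        if (v[i] >= THRESHOLD) sum += v[i];
    }
    return sum;
}

/* "no omp, ternary branching" */
int64_t sum_ternary(const int32_t *v, size_t n) {
    int64_t sum = 0;
    for (size_t i = 0; i < n; ++i) {
        sum += (v[i] >= THRESHOLD) ? v[i] : 0;
    }
    return sum;
}

/* "no omp, no branching": the comparison yields 0 or 1, negating it gives
   a mask of all zeros or all ones, and the AND keeps or drops the value. */
int64_t sum_branchless(const int32_t *v, size_t n) {
    int64_t sum = 0;
    for (size_t i = 0; i < n; ++i) {
        int32_t mask = -(int32_t)(v[i] >= THRESHOLD);
        sum += v[i] & mask;
    }
    return sum;
}

/* "omp, if branching": each thread keeps a private partial sum that the
   reduction clause combines when the loop ends. */
int64_t sum_if_omp(const int32_t *v, size_t n) {
    int64_t sum = 0;
    #pragma omp parallel for reduction(+:sum)
    for (ptrdiff_t i = 0; i < (ptrdiff_t)n; ++i) {
        if (v[i] >= THRESHOLD) sum += v[i];
    }
    return sum;
}

/* "omp, ternary branching" */
int64_t sum_ternary_omp(const int32_t *v, size_t n) {
    int64_t sum = 0;
    #pragma omp parallel for reduction(+:sum)
    for (ptrdiff_t i = 0; i < (ptrdiff_t)n; ++i) {
        sum += (v[i] >= THRESHOLD) ? v[i] : 0;
    }
    return sum;
}

/* "omp, no branching" */
int64_t sum_branchless_omp(const int32_t *v, size_t n) {
    int64_t sum = 0;
    #pragma omp parallel for reduction(+:sum)
    for (ptrdiff_t i = 0; i < (ptrdiff_t)n; ++i) {
        int32_t mask = -(int32_t)(v[i] >= THRESHOLD);
        sum += v[i] & mask;
    }
    return sum;
}

/* Sanity check: on the same input, all six variants must agree. */
void check_variants(const int32_t *v, size_t n) {
    int64_t expected = sum_if(v, n);
    assert(sum_ternary(v, n) == expected);
    assert(sum_branchless(v, n) == expected);
    assert(sum_if_omp(v, n) == expected);
    assert(sum_ternary_omp(v, n) == expected);
    assert(sum_branchless_omp(v, n) == expected);
}
```

Built with -fopenmp, the three *_omp functions split the loop across threads. Without that flag the pragmas are ignored and they run serially, which makes it easy to compare a build with and without OpenMP. In this sketch every variant sees the same input, so the sums can be cross-checked. In the results above each variant reports a different sum, so the real program presumably feeds each one different data.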