i7 2600K with DDR3 1600 MHz
wyldckat edited this page Jan 30, 2016 · 2 revisions
This page records the performance achieved with an Intel i7-2600K at stock clocks, fitted with 4 DDR3 modules of 4 GB each running at 1600 MHz. The machine used is part of blueCAPE's IT pool.
- Notes
- HyperThreading was turned on in the BIOS settings.
- mpirun is used merely as a launcher to start several copies at once. The avxtest* binaries do not run cooperatively.
- Keep in mind that these results are not statistically robust, since each configuration was run only once.
These tests were executed on CentOS 6.5 x86_64, using a custom build of GCC 4.8.4. The binaries were built with native optimizations:
g++ -O3 -march=native avxtest.cpp -o avxtest
g++ -O3 -march=native avxtest64.cpp -o avxtest64
- 32-bit:
./avxtest.sh
- x86:
- Time taken (ms): 26080.0
- AVX:
- Time taken (ms): 3280.0
- 64-bit:
./avxtest64.sh
- x86_64:
- Time taken (ms): 25750.0
- AVX:
- Time taken (ms): 7230.0
- 32-bit:
mpirun -n 2 ./avxtest.sh
- x86:
- Time taken (ms): 26710.0
- Time taken (ms): 26780.0
- AVX:
- Time taken (ms): 3390.0
- Time taken (ms): 3400.0
- 64-bit:
mpirun -n 2 ./avxtest64.sh
- x86_64:
- Time taken (ms): 26450.0
- Time taken (ms): 26520.0
- AVX:
- Time taken (ms): 7350.0
- Time taken (ms): 7400.0
- 32-bit:
mpirun -n 4 ./avxtest.sh
- x86:
- Time taken (ms): 27850.0
- Time taken (ms): 27900.0
- Time taken (ms): 28000.0
- Time taken (ms): 28060.0
- AVX:
- Time taken (ms): 3540.0
- Time taken (ms): 3550.0
- Time taken (ms): 3520.0
- Time taken (ms): 3550.0
- 64-bit:
mpirun -n 4 ./avxtest64.sh
- x86_64:
- Time taken (ms): 27420.0
- Time taken (ms): 27500.0
- Time taken (ms): 27530.0
- Time taken (ms): 27660.0
- AVX:
- Time taken (ms): 7760.0
- Time taken (ms): 7690.0
- Time taken (ms): 7820.0
- Time taken (ms): 7780.0
- 32-bit:
mpirun -n 8 ./avxtest.sh
- x86:
- Time taken (ms): 40810.0
- Time taken (ms): 40790.0
- Time taken (ms): 40720.0
- Time taken (ms): 40870.0
- Time taken (ms): 40830.0
- Time taken (ms): 40690.0
- Time taken (ms): 40740.0
- Time taken (ms): 40670.0
- AVX:
- Time taken (ms): 7210.0
- Time taken (ms): 7250.0
- Time taken (ms): 7220.0
- Time taken (ms): 7200.0
- Time taken (ms): 7010.0
- Time taken (ms): 7260.0
- Time taken (ms): 7110.0
- Time taken (ms): 7150.0
- 64-bit:
mpirun -n 8 ./avxtest64.sh
- x86_64:
- Time taken (ms): 40050.0
- Time taken (ms): 39910.0
- Time taken (ms): 39970.0
- Time taken (ms): 39920.0
- Time taken (ms): 39900.0
- Time taken (ms): 39940.0
- Time taken (ms): 39940.0
- Time taken (ms): 39920.0
- AVX:
- Time taken (ms): 18580.0
- Time taken (ms): 18440.0
- Time taken (ms): 18360.0
- Time taken (ms): 18660.0
- Time taken (ms): 18150.0
- Time taken (ms): 18650.0
- Time taken (ms): 18700.0
- Time taken (ms): 18550.0
Architecture/Mode | 1 process | 2 processes (std-dev) | 4 processes (std-dev) | 8 processes (std-dev) |
---|---|---|---|---|
x86 (ms) | 26080.0 | 26745.0 (35.0) | 27952.5 (82.2724133595) | 40765.0 (66.3324958071) |
x86_64 (ms) | 25750.0 | 26485.0 (35.0) | 27527.5 (86.4219300872) | 39943.75 (44.9826355386) |
AVX float (ms) | 3280.0 | 3395.0 (5.0) | 3540.0 (12.2474487139) | 7176.25 (78.0924932372) |
AVX double (ms) | 7230.0 | 7375.0 (25.0) | 7762.5 (47.1036092035) | 18511.25 (173.812938241) |
- | - | - | - | - |
Core frequency (MHz) (cpufreq-aperf) | 3775 | 3673 | 3469 | 3469 |
Frequency downscale ratio (c1/cx) | 1 | 1.0277702151 | 1.0882098587 | 1.0882098587 |
x86 time ratio (tx/t1) | 1 | 1.02549846626 | 1.07179831288 | 1.56307515337 |
x86_64 time ratio (tx/t1) | 1 | 1.02854368932 | 1.06902912621 | 1.55121359223 |
AVX float time ratio (tx/t1) | 1 | 1.03506097561 | 1.07926829268 | 2.18788109756 |
AVX double time ratio (tx/t1) | 1 | 1.02005532503 | 1.07365145228 | 2.56033886584 |
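
The averages and standard deviations in the table above can be reproduced from the raw timings listed earlier on this page. The sketch below uses Python, which is a choice made for this example only; the page does not say which tool produced the figures. The helper name `summarize` is hypothetical, and the use of the population standard deviation (dividing by N rather than N-1) is an assumption inferred from the fact that it matches the tabulated values.

```python
from statistics import mean, pstdev

# Raw "Time taken (ms)" values copied from the "mpirun -n 4" runs on this page.
timings_4 = {
    "x86": [27850.0, 27900.0, 28000.0, 28060.0],
    "x86_64": [27420.0, 27500.0, 27530.0, 27660.0],
    "AVX float": [3540.0, 3550.0, 3520.0, 3550.0],
    "AVX double": [7760.0, 7690.0, 7820.0, 7780.0],
}


def summarize(samples):
    """Return (mean, population standard deviation) for one list of timings."""
    return mean(samples), pstdev(samples)


for mode, samples in timings_4.items():
    avg, std = summarize(samples)
    print(f"{mode}: {avg} ({std:.4f})")
```

For the x86 row this gives a mean of 27952.5 ms and a standard deviation of about 82.27 ms, in line with the table. With only 2 to 8 samples per configuration, these deviations describe the spread between simultaneous processes of a single run, not run-to-run variability.
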
- The slowdown ratios of the x86/x86_64 runs stay close to the frequency downscale ratio, except when relying on HyperThreading (the 8-process runs).
- Note: the frequencies were checked with cpufreq-aperf, for comparison with the ones listed at cpu-world.com.
- With HyperThreading turned on, this machine shows that the feature improves throughput considerably, but only for the plain x86 FPU code. With AVX it brought no benefit: going from 4 to 8 processes, throughput was about 1% lower for AVX float and about 16% lower for AVX double, which suggests that some tuning on the compiler (or CPU?) side is needed for better scheduling.
- On 32-bit x86 FPU, it meant roughly 37% more processing power: 2*27952.5/40765 - 1.0 = 0.3714
- On 64-bit x86 FPU, it meant almost 38% more processing power: 2*27527.5/39943.75 - 1.0 = 0.3783
- As an additional reference, cpubenchmark.net gives this CPU an index of 8525, but keep in mind that this index accounts for HyperThreading. Assuming a ratio of 1.3, the estimated index without HyperThreading is 8525/1.3, namely 6558.
- An interesting result is that, for the non-AVX runs, the 64-bit (x86_64) timings are only slightly faster than the 32-bit (x86) ones, by roughly 1 to 2%.
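
The ratio and HyperThreading figures discussed in the list above follow from simple arithmetic on the summary table. The sketch below, in Python (a choice made for this example, not part of the original tests), recomputes the frequency downscale ratio, the measured slowdown, and the throughput change when going from 4 to 8 processes. The function names are hypothetical, and the AVX gains it prints are derived here from the table rather than quoted from the page.

```python
# Mean timings (ms) from the summary table, for 1, 4 and 8 processes.
mean_ms = {
    "x86":        {1: 26080.0, 4: 27952.5, 8: 40765.0},
    "x86_64":     {1: 25750.0, 4: 27527.5, 8: 39943.75},
    "AVX float":  {1: 3280.0,  4: 3540.0,  8: 7176.25},
    "AVX double": {1: 7230.0,  4: 7762.5,  8: 18511.25},
}

# Core frequencies (MHz) reported by cpufreq-aperf, per process count.
freq_mhz = {1: 3775, 2: 3673, 4: 3469, 8: 3469}


def downscale_ratio(n):
    """Frequency with 1 process divided by the frequency with n processes."""
    return freq_mhz[1] / freq_mhz[n]


def slowdown(mode, n):
    """Time with n processes divided by the time with 1 process."""
    return mean_ms[mode][n] / mean_ms[mode][1]


def ht_gain(mode):
    """Relative throughput change from 4 to 8 processes.

    8 processes complete twice the work of 4, so the throughput ratio is
    2 * t4 / t8; subtracting 1 gives the gain (negative means a loss).
    """
    t = mean_ms[mode]
    return 2.0 * t[4] / t[8] - 1.0


for mode in mean_ms:
    print(
        f"{mode}: slowdown at 4 = {slowdown(mode, 4):.4f} "
        f"(frequency ratio {downscale_ratio(4):.4f}), "
        f"HT gain = {ht_gain(mode):+.4f}"
    )
```

A positive `ht_gain` means the 8 processes together finished more work per unit of time than 4 processes did; for the x86 and x86_64 rows this reproduces the 0.3714 and 0.3783 values given in the list.
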
The information provided on this wiki is meant only as a quick reference of results and has not gone through rigorous peer review or statistical analysis. The source code is open to the public and comes with no warranty on whether it works properly or not.
Feel free to run your own tests to get your own results. If you use these results, please cite this wiki page and link to it.