AMD Ryzen 7 1700 with DDR4 at 2133 MHz
This page records the performance achieved with the AMD Ryzen 7 1700, using 2 DDR4 modules totalling 16 GB, each clocked at 2133 MHz, although they can go up to 2666 MHz. The motherboard was an ASUS PRIME X370-PRO, the stock cooler was used, and the default frequencies were left as automatically defined. The machine used was built and lent by P. Beirão.
- Notes
- HyperThreading (SMT, in AMD's terminology) was turned on in the BIOS settings.
- mpirun is used merely as a launcher to start several copies at once. The avxtest* binaries do not run cooperatively.
- Keep in mind that these results are not statistically robust, since they come from a single run.
These tests were executed on Windows 10 x64, using the MSys2 stack in blueCFD-Core 2016-2, namely GCC 5.3.0. The binaries were built with the native options:
g++ -O3 -march=native avxtest.cpp -o avxtest
g++ -O3 -march=native avxtest64.cpp -o avxtest64
- 32-bit:
./avxtest
- x86:
- Time taken (ms): 36629.0
- AVX:
- Time taken (ms): 4019.0
- 64-bit:
./avxtest64
- x86_64:
- Time taken (ms): 36659.0
- AVX:
- Time taken (ms): 7463.0
- 32-bit:
mpirun -n 2 ./avxtest
- x86:
- Time taken (ms): 36811.0
- Time taken (ms): 36844.0
- AVX:
- Time taken (ms): 4100.0
- Time taken (ms): 4111.0
- 64-bit:
mpirun -n 2 ./avxtest64
- x86_64:
- Time taken (ms): 36264.0
- Time taken (ms): 36406.0
- AVX:
- Time taken (ms): 7467.0
- Time taken (ms): 7464.0
- 32-bit:
mpirun -n 4 ./avxtest
- x86:
- Time taken (ms): 36713.0
- Time taken (ms): 36789.0
- Time taken (ms): 36791.0
- Time taken (ms): 36859.0
- AVX:
- Time taken (ms): 4105.0
- Time taken (ms): 4061.9997559
- Time taken (ms): 4086.0
- Time taken (ms): 4086.0
- 64-bit:
mpirun -n 4 ./avxtest64
- x86_64:
- Time taken (ms): 36542.0
- Time taken (ms): 36693.0
- Time taken (ms): 36734.0
- Time taken (ms): 36735.0
- AVX:
- Time taken (ms): 7488.0
- Time taken (ms): 7527.0
- Time taken (ms): 7558.0
- Time taken (ms): 7587.0
- 32-bit:
mpirun -n 6 ./avxtest
- x86:
- Time taken (ms): 37988.0
- Time taken (ms): 38032.0
- Time taken (ms): 38113.0
- Time taken (ms): 38121.0
- Time taken (ms): 38168.0
- Time taken (ms): 38237.0
- AVX:
- Time taken (ms): 4499.0
- Time taken (ms): 4558.0
- Time taken (ms): 4532.0
- Time taken (ms): 4438.0
- Time taken (ms): 4567.0
- Time taken (ms): 4542.0
- 64-bit:
mpirun -n 6 ./avxtest64
- x86_64:
- Time taken (ms): 37485.0
- Time taken (ms): 37552.0
- Time taken (ms): 37640.0
- Time taken (ms): 37756.0
- Time taken (ms): 37927.0
- Time taken (ms): 38024.0
- AVX:
- Time taken (ms): 8470.0
- Time taken (ms): 8558.0
- Time taken (ms): 8445.0
- Time taken (ms): 8760.0
- Time taken (ms): 8818.0
- Time taken (ms): 8874.0
- 32-bit:
mpirun -n 8 ./avxtest
- x86:
- Time taken (ms): 39665.0
- Time taken (ms): 40070.0
- Time taken (ms): 40292.0
- Time taken (ms): 40428.0
- Time taken (ms): 40552.0
- Time taken (ms): 40607.0
- Time taken (ms): 40631.0
- Time taken (ms): 40703.0
- AVX:
- Time taken (ms): 4507.0
- Time taken (ms): 4749.0
- Time taken (ms): 4737.0
- Time taken (ms): 4577.0
- Time taken (ms): 4930.0
- Time taken (ms): 4724.0
- Time taken (ms): 4794.0
- Time taken (ms): 4834.0
- 64-bit:
mpirun -n 8 ./avxtest64
- x86_64:
- Time taken (ms): 39574.0
- Time taken (ms): 39633.0
- Time taken (ms): 39683.0
- Time taken (ms): 39877.0
- Time taken (ms): 39897.0
- Time taken (ms): 39934.0
- Time taken (ms): 40008.0
- Time taken (ms): 40101.0
- AVX:
- Time taken (ms): 8911.0
- Time taken (ms): 9091.0
- Time taken (ms): 9133.0
- Time taken (ms): 9198.0
- Time taken (ms): 9640.0
- Time taken (ms): 9177.0
- Time taken (ms): 9492.0
- Time taken (ms): 9523.0
- 32-bit:
mpirun -n 10 ./avxtest
- x86:
- Time taken (ms): 41676.0
- Time taken (ms): 41704.0
- Time taken (ms): 41711.0
- Time taken (ms): 41717.0
- Time taken (ms): 41833.0
- Time taken (ms): 42103.0
- Time taken (ms): 42120.0
- Time taken (ms): 42256.0
- Time taken (ms): 42274.0
- Time taken (ms): 42344.0
- AVX:
- Time taken (ms): 5389.0
- Time taken (ms): 5423.0
- Time taken (ms): 5587.0
- Time taken (ms): 5563.0
- Time taken (ms): 5457.0
- Time taken (ms): 5193.0
- Time taken (ms): 5394.0
- Time taken (ms): 5480.0
- Time taken (ms): 5427.0
- Time taken (ms): 5420.0
- 64-bit:
mpirun -n 10 ./avxtest64
- x86_64:
- Time taken (ms): 40678.0
- Time taken (ms): 40713.0
- Time taken (ms): 40881.0
- Time taken (ms): 40939.0
- Time taken (ms): 40974.0
- Time taken (ms): 40994.0
- Time taken (ms): 41004.0
- Time taken (ms): 41048.0
- Time taken (ms): 41051.0
- Time taken (ms): 41137.0
- AVX:
- Time taken (ms): 10572.0
- Time taken (ms): 10854.0
- Time taken (ms): 10628.0
- Time taken (ms): 11087.0
- Time taken (ms): 10925.0
- Time taken (ms): 11037.0
- Time taken (ms): 11116.0
- Time taken (ms): 11072.0
- Time taken (ms): 11211.0
- Time taken (ms): 11345.0
- 32-bit:
mpirun -n 12 ./avxtest
- x86:
- Time taken (ms): 42752.0
- Time taken (ms): 42882.0
- Time taken (ms): 42932.0
- Time taken (ms): 43024.0
- Time taken (ms): 43160.0
- Time taken (ms): 43237.0
- Time taken (ms): 43290.0
- Time taken (ms): 43450.0
- Time taken (ms): 43463.0
- Time taken (ms): 43448.0
- Time taken (ms): 43526.0
- Time taken (ms): 43717.0
- AVX:
- Time taken (ms): 5834.0
- Time taken (ms): 6125.0
- Time taken (ms): 6181.0
- Time taken (ms): 5875.0
- Time taken (ms): 6115.0
- Time taken (ms): 5737.0
- Time taken (ms): 6008.0
- Time taken (ms): 6020.0
- Time taken (ms): 6317.0
- Time taken (ms): 6115.0
- Time taken (ms): 6095.0
- Time taken (ms): 6132.0
- 64-bit:
mpirun -n 12 ./avxtest64
- x86_64:
- Time taken (ms): 41652.0
- Time taken (ms): 41688.0
- Time taken (ms): 41670.0
- Time taken (ms): 41811.0
- Time taken (ms): 41882.0
- Time taken (ms): 41928.0
- Time taken (ms): 41995.0
- Time taken (ms): 42176.0
- Time taken (ms): 42188.0
- Time taken (ms): 42526.0
- Time taken (ms): 42543.0
- Time taken (ms): 42600.0
- AVX:
- Time taken (ms): 11412.0
- Time taken (ms): 11451.0
- Time taken (ms): 11750.0
- Time taken (ms): 11411.0
- Time taken (ms): 11961.0
- Time taken (ms): 12141.0
- Time taken (ms): 12082.0
- Time taken (ms): 12287.0
- Time taken (ms): 12092.0
- Time taken (ms): 12195.0
- Time taken (ms): 12267.0
- Time taken (ms): 12359.0
- 32-bit:
mpirun -n 14 ./avxtest
- x86:
- Time taken (ms): 44381.0
- Time taken (ms): 44728.0
- Time taken (ms): 44755.0
- Time taken (ms): 44839.0
- Time taken (ms): 44884.0
- Time taken (ms): 44907.0
- Time taken (ms): 44924.0
- Time taken (ms): 44959.0
- Time taken (ms): 44994.0
- Time taken (ms): 45010.0
- Time taken (ms): 45075.0
- Time taken (ms): 45068.0
- Time taken (ms): 45204.0
- Time taken (ms): 45330.0
- AVX:
- Time taken (ms): 5884.0
- Time taken (ms): 6938.0
- Time taken (ms): 6463.0
- Time taken (ms): 6638.0
- Time taken (ms): 6491.0
- Time taken (ms): 6840.0
- Time taken (ms): 6632.0
- Time taken (ms): 6747.0
- Time taken (ms): 6467.0
- Time taken (ms): 6887.0
- Time taken (ms): 6946.0
- Time taken (ms): 6831.0
- Time taken (ms): 6977.0
- Time taken (ms): 6860.0
- 64-bit:
mpirun -n 14 ./avxtest64
- x86_64:
- Time taken (ms): 43192.0
- Time taken (ms): 43216.0
- Time taken (ms): 43272.0
- Time taken (ms): 43321.0
- Time taken (ms): 43360.0
- Time taken (ms): 43384.0
- Time taken (ms): 43411.0
- Time taken (ms): 43463.0
- Time taken (ms): 43450.0
- Time taken (ms): 43464.0
- Time taken (ms): 43531.0
- Time taken (ms): 43513.0
- Time taken (ms): 43563.0
- Time taken (ms): 43592.0
- AVX:
- Time taken (ms): 13847.0
- Time taken (ms): 14200.0
- Time taken (ms): 13957.0
- Time taken (ms): 14116.0
- Time taken (ms): 14208.0
- Time taken (ms): 14346.0
- Time taken (ms): 14210.0
- Time taken (ms): 14202.0
- Time taken (ms): 14213.0
- Time taken (ms): 14292.0
- Time taken (ms): 14149.0
- Time taken (ms): 14247.0
- Time taken (ms): 14208.0
- Time taken (ms): 14607.0
- 32-bit:
mpirun -n 16 ./avxtest
- x86:
- Time taken (ms): 46557.0
- Time taken (ms): 46593.0
- Time taken (ms): 46609.0
- Time taken (ms): 46605.0
- Time taken (ms): 46634.0
- Time taken (ms): 46631.0
- Time taken (ms): 46669.0
- Time taken (ms): 46697.0
- Time taken (ms): 46709.0
- Time taken (ms): 46740.0
- Time taken (ms): 46748.0
- Time taken (ms): 46884.0
- Time taken (ms): 47185.0
- Time taken (ms): 47227.0
- Time taken (ms): 47238.0
- Time taken (ms): 47268.0
- AVX:
- Time taken (ms): 7665.0
- Time taken (ms): 7672.0
- Time taken (ms): 7626.0
- Time taken (ms): 7657.0
- Time taken (ms): 7672.0
- Time taken (ms): 7589.0
- Time taken (ms): 7741.0
- Time taken (ms): 7646.0
- Time taken (ms): 7599.0
- Time taken (ms): 7722.0
- Time taken (ms): 7656.0
- Time taken (ms): 7571.0
- Time taken (ms): 7403.0
- Time taken (ms): 7425.0
- Time taken (ms): 7470.0
- Time taken (ms): 7479.0
- 64-bit:
mpirun -n 16 ./avxtest64
- x86_64:
- Time taken (ms): 44668.0
- Time taken (ms): 44667.0
- Time taken (ms): 44670.0
- Time taken (ms): 44673.0
- Time taken (ms): 44701.0
- Time taken (ms): 44732.0
- Time taken (ms): 44730.0
- Time taken (ms): 44765.0
- Time taken (ms): 44822.0
- Time taken (ms): 44820.0
- Time taken (ms): 44869.0
- Time taken (ms): 44998.0
- Time taken (ms): 45188.0
- Time taken (ms): 45345.0
- Time taken (ms): 45366.0
- Time taken (ms): 45415.0
- AVX:
- Time taken (ms): 16267.0
- Time taken (ms): 16252.0
- Time taken (ms): 16279.0
- Time taken (ms): 16253.0
- Time taken (ms): 16253.0
- Time taken (ms): 16248.0
- Time taken (ms): 16275.0
- Time taken (ms): 16249.0
- Time taken (ms): 16221.0
- Time taken (ms): 16216.0
- Time taken (ms): 16349.0
- Time taken (ms): 16160.0
- Time taken (ms): 16141.0
- Time taken (ms): 15942.0
- Time taken (ms): 16056.0
- Time taken (ms): 16049.0
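The per-process timings listed above can be collected programmatically before summarizing them. The following is a minimal Python sketch, assuming the raw output contains lines shaped like the `Time taken (ms): ...` entries shown on this page; the function name `parse_times` and the exact log format are assumptions, not part of the avxtest source.

```python
import re

# Assumed line shape, taken from the entries listed on this page.
TIME_RE = re.compile(r"Time taken \(ms\):\s*([0-9]+(?:\.[0-9]+)?)")

def parse_times(log_text):
    """Return every 'Time taken (ms)' value found in the text, as floats."""
    return [float(value) for value in TIME_RE.findall(log_text)]

# Sample copied from the AVX part of the "mpirun -n 4 ./avxtest" run.
sample_log = """\
Time taken (ms): 4105.0
Time taken (ms): 4061.9997559
Time taken (ms): 4086.0
Time taken (ms): 4086.0
"""

times = parse_times(sample_log)
print(len(times), min(times), max(times))
```

Each process prints its own timing, so one run with `mpirun -n 4` yields four values per mode.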
Architecture/Mode | 1 core | 2 cores (std-dev) | 4 cores (std-dev) | 6 cores (std-dev) | 8 cores (std-dev) | 10 cores (std-dev) | 12 cores (std-dev) | 14 cores (std-dev) | 16 cores (std-dev) |
---|---|---|---|---|---|---|---|---|---|
x86 (ms) | 36629.0 | 36827.5 (16.5) | 36788.0 (51.6623654124) | 38109.8333333 (82.2504643688) | 40368.5 (328.777660433) | 41973.8 (257.134906226) | 43240.0833333 (283.307094373) | 44932.7142857 (218.790646369) | 46812.125 (252.61157807) |
x86_64 (ms) | 36659.0 | 36335.0 (71.0) | 36676.0 (79.1991161567) | 37730.6666667 (193.961909204) | 39838.375 (176.079483118) | 40941.9 (139.49225785) | 42054.9166667 (334.646195838) | 43409.4285714 (120.832064267) | 44901.8125 (263.846361248) |
AVX float (ms) | 4019.0 | 4105.5 (5.5) | 4084.74993898 (15.254188823) | 4522.66666667 (43.5953616289) | 4731.5 (126.727068932) | 5433.3 (102.353358518) | 6046.16666667 (155.191190328) | 6685.78571429 (282.724292395) | 7599.5625 (100.146747794) |
AVX double (ms) | 7463.0 | 7465.5 (1.5) | 7540.0 (36.7627528893) | 8654.16666667 (169.943536375) | 9270.625 (235.508459243) | 10984.7 (231.795621184) | 11950.6666667 (339.670755618) | 14200.1428571 (167.194009267) | 16200.625 (102.303393761) |
- | - | - | - | - | - | - | - | - | - |
Core frequency (MHz) (Windows 10 Task Manager) | 3100 | 3100 | 3100 | 3100 | 3150 | 3150 | 3150 | 3150 | 3150 |
downscale ratio (c1/cx) | 1 | 1 | 1 | 1 | 0.98 | 0.98 | 0.98 | 0.98 | 0.98 |
x86 | 1 | 1.00541920336 | 1.00434082285 | 1.04042789411 | 1.10209123918 | 1.14591716946 | 1.1804876828 | 1.22669781555 | 1.2780071801 |
x86_64 | 1 | 0.991161788374 | 1.00046373333 | 1.02923338516 | 1.08672836138 | 1.11683079189 | 1.14719214017 | 1.18414109963 | 1.22485099157 |
AVX float | 1 | 1.02152276686 | 1.01635977581 | 1.12532139006 | 1.1772829062 | 1.35190345857 | 1.50439578668 | 1.66354459176 | 1.89090880816 |
AVX double | 1 | 1.00033498593 | 1.01031756666 | 1.15960962973 | 1.24221157711 | 1.4718879807 | 1.60132207781 | 1.90273922781 | 2.17079257671 |
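The table entries can be reproduced from the raw timings. The sketch below is in Python and assumes, based on the numbers matching, that the std-dev column is the population standard deviation (dividing by n rather than n - 1) and that the ratio rows are the mean time divided by the single-process time. The helper name `summarize` is illustrative.

```python
import math

def summarize(times_ms):
    """Return (mean, population standard deviation) of a list of timings."""
    n = len(times_ms)
    mean = sum(times_ms) / n
    variance = sum((t - mean) ** 2 for t in times_ms) / n  # divide by n, not n - 1
    return mean, math.sqrt(variance)

# Timings copied from the x86 part of the "mpirun -n 4 ./avxtest" run.
x86_4 = [36713.0, 36789.0, 36791.0, 36859.0]
single_core_ms = 36629.0  # x86, single process

mean_ms, std_ms = summarize(x86_4)
ratio = mean_ms / single_core_ms

print(f"{mean_ms:.1f} ({std_ms:.2f}), ratio {ratio:.4f}")
# prints: 36788.0 (51.66), ratio 1.0043
```

These values agree with the "4 cores" column of the x86 rows in the table above.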
Before drawing specific inferences, there are a few details to keep in mind:
- The Ryzen 7 1700 is designed to work at 3.0 GHz as a base frequency and should be able to reach a maximum of 3.7 GHz.
- Therefore, a slowdown of up to the 3.7/3.0 GHz ratio, i.e. about 1.233, could be expected as the clock drops from the maximum to the base frequency.
- However, Windows 10 Task Manager roughly indicated the frequencies listed in the table above.
- It's possible that it could not reach higher frequencies either due to the stock cooling or because logical core affinity was not defined when running the applications.
- This CPU only has 2 memory channels, even though this isn't very relevant to the application being used, since the matrices are relatively small.
- 32-bit runs are either at a similar speed or slower than 64-bit runs, at least when not using AVX.
- It's not clear why there was some slowdown when using more cores. It's possible that the Task Manager did not accurately report the CPU frequency, if the cores are sped up internally without this being reported to the OS, given that all types of calculations slowed down as more cores were used. It could also have been due to not using affinity locking for each application.
- There is a very interesting result regarding the normal (non-AVX) calculations when using the pseudo-hyperthreads: using all 16 logical cores gave a slowdown of around 25% per core, which compares to:
- the A10 7850K with DDR3 at 2133 MHz, which had a slowdown of only 5% when using all 4 logical cores out of the 2 core modules;
- the i7 2600K with DDR3 1600 MHz, which had a 55% slowdown when using all 8 logical cores out of 4 real cores.
- This suggests that the Ryzen 7's HT-like mechanism is more efficient than Intel's, and apparently it is as efficient as the A10's architecture... at least when not using AVX.
- When looking at the AVX performance and how it scales compared to other CPUs, it does seem to downscale with CPU frequency when using the real cores, but the performance when using all 16 logical cores seems a bit strange, although it did fare better than the old i7 2600K with DDR3 1600 MHz.
- When comparing the non-AVX performance and how it fared against the other tests on this wiki, it's easiest to compare to the E5 2680v2 with DDR3 1600 MHz, given the similar frequencies and high core count:
- Comparing using only 8 cores, the Ryzen 7 1700 is a bit slower than the E5 2680v2, although using all logical cores on the AMD gave a ratio of ~2.8 seconds/core at non-AVX 64-bit versus 3.1 on the Intel. This isn't exactly a fair comparison, though, since HyperThreading was not turned on for the tested E5 2680v2.
- Doing the same comparison for AVX, however, the E5 2680v2 wins at every core count, although with 1 core the performance is similar for 64-bit AVX.
- Overall, what can be said in favour of the Ryzen 7 1700 is that it is somewhat comparable to the E5 2680v2 in CPU performance, while selling for a fraction of the cost: around 300 USD versus 1700 USD.
- The non-AVX performance and its scale down with CPU speed are fairly interesting, given that the same does not happen on many of Intel's HyperThreading implementations, such as the i7 950 with DDR3 1333 MHz.
- The traditional FPU is not shared between cores, but the AVX is shared between each core pair. This is one of the reasons why it is similar to HyperThreading and yet it performs better when used in regular FPU operations.
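The arithmetic behind the notes above can be reproduced with a short Python sketch. It uses only figures already quoted on this page (the 3.0 and 3.7 GHz clocks, and the 1-process and 16-process means from the results table). The variable names are illustrative, and "seconds per core" is assumed here to mean the mean wall time divided by the number of processes, which matches the ~2.8 figure quoted.

```python
# Clock figures quoted in the notes above (GHz).
base_ghz, boost_ghz = 3.0, 3.7
expected_ratio = boost_ghz / base_ghz  # slowdown if the clock fell from maximum to base

# Mean times in ms from the results table: 1 process vs. 16 processes (non-AVX).
single_ms = {"x86": 36629.0, "x86_64": 36659.0}
all16_ms = {"x86": 46812.125, "x86_64": 44901.8125}

# Per-process slowdown when all 16 logical cores are busy.
slowdown = {arch: all16_ms[arch] / single_ms[arch] for arch in single_ms}

# Assumed definition of "seconds per core": mean wall time / process count.
seconds_per_core = all16_ms["x86_64"] / 16 / 1000.0

print(f"expected frequency ratio: {expected_ratio:.3f}")
for arch, ratio in slowdown.items():
    print(f"{arch}: {ratio:.3f} ({(ratio - 1) * 100:.1f}% slower)")
print(f"x86_64 at 16 processes: {seconds_per_core:.1f} s/core")
```

With these inputs the non-AVX slowdowns come out at roughly 28% (x86) and 22% (x86_64), which brackets the "around 25%" figure and sits close to the 3.7/3.0 frequency ratio.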
The information provided on this wiki is meant only as a quick reference of results and has not gone through rigorous peer review or statistical analysis. The source code is open to the public and has no warranty on whether it works properly or not.
Feel free to run your own tests to get your own results. If you use these results, cite this wiki page with the respective link to it.