I wanted to get maximal single-core performance for not too much money, and used the Passmark Single Thread Performance list for my investigation. I found several reasonably priced CPUs above 4,000 Thread Mark, but the 7600X with 4,221 had the best price/performance.
Computer | Rank | CPU Name | Core [MHz] | Turbo [MHz] | Cores | CPU Mark [Million Operations/s] | Thread Mark [Million Operations/s] | Note |
---|---|---|---|---|---|---|---|---|
Pi400 | 3513 | ARM Cortex-A72 | 1800 | N/A | 4 | 769 | 542 | |
laptop | 407 | Intel Core i7-11850H | 2500 | 4800 | 8 | 20,833 | 3,152 | |
workstation | 290 | Intel Core i7-11700K | 3600 | 5000 | 8 | 24,670 | 3,422 | of older son |
new PC | 18 | AMD Ryzen 5 7600X | 4700 | 5300 | 6 | 28,969 | 4,221 | |
new PC | | | | | | 19,506 | 3,690 | seen @4.7GHz |
The AMD Ryzen 5 7600X PC came with Win11 preinstalled on a 240GB SATA SSD.
I decided to keep that, to be able to compare performance of Win11 with Ubuntu 22.04 Server on identical hardware and Bios settings.
I first installed Ubuntu 22.04 server on an external "SanDisk Extreme Pro USB 3.1 Solid Flash Drive".
Later I installed Ubuntu 22.04 desktop on an additionally bought Crucial P3 Plus 500GB M.2 NVMe SSD.
Finally I bought 16GB DDR5-5600 RAM to replace the PC's 8GB DDR5-4800 RAM.
I switched to Linux at work 16 years ago, and at home even earlier. Therefore I have no experience in how to make Win11 secure, and decided to block internet access for the new PC in the cable router. So my Win11 will never get internet access again (it likely had internet access at the company where I bought it, during installation of Win11).
In case Ubuntu on the new PC needs internet access (e.g. to install new stuff), I enable internet access in the cable router, install, and then disable internet access again.
For installing new software on Win11, I download the install package on my Pi400 into the `~/win11` directory. Then I use `scp` in Win11 Terminal to copy it onto Win11, and install it there.
The new PC has LAN access, and I created some screenshots on the Win11 side with the installed "snipping tool". I copy the files to my Pi400 with `scp` in Win11 Terminal.
Windows Defender currently does not allow ping/ssh/scp into Win11 from the Pi400, but as long as the other direction is fully functional that is fine.
I searched for "office PC 7600X" on German ebay, and found this as the cheapest one with a 7600X CPU (486.89€ or 533.66$): https://www.ebay.de/itm/195733279470
Minimal parts: AMD Ryzen 5 7600X CPU, ASRock A620M-HDV/M.2 mainboard, 8GB DDR5-4800 RAM, 240GB SSD, integrated graphics, ...
Minimal because I did not know what might be important, and high single core integer performance for factoring very big numbers might not need more (also for my https://github.com/Hermann-SW/RSA_numbers_factored).
I never bought a CPU only 9 months after launch date before ... current price of CPU is 50% of total PC price:
https://www.amd.com/en/products/cpu/amd-ryzen-5-7600x
The additional Crucial P3 Plus 500GB M.2 NVMe SSD did cost 28.56€ or 31.20$. It supports Gen4x4 which is available on ASRock mainboard. Maximal read speed is "up to" 5000MB/s, measured speed of 1460MB/s is nice as well:
hermann@7600x:~$ sudo hdparm -tT /dev/nvme0n1
/dev/nvme0n1:
Timing cached reads: 66182 MB in 2.00 seconds = 33165.89 MB/sec
Timing buffered disk reads: 4384 MB in 3.00 seconds = 1460.92 MB/sec
hermann@7600x:~$
Finally the Crucial RAM 16GB DDR5 5600MHz did cost 48.99€ or 53.70$.
So complete price (PC + NVME SSD + DDR5-5600 RAM) was 565€ or 619$.
Up to this point all investigations on Ubuntu and Win11 were done without PBO and with the 8GB DDR5-4800 RAM.
I then enabled PBO and the 5200MHz profile for the DDR5-5600 RAM in Bios.
After that the PC did not boot anymore, despite many retries.
So I reset the Bios by shorting the mainboard's "Clear CMOS Jumper".
After that the Bios was back on default values (most of them "Auto").
The PC then booted straight into Ubuntu, where I ran the Passmark performance test.
I could not believe what I saw on screen, so I took several smartphone photos.
6 cores were reported at 5452MHz, with a `CPU Single Threaded` value of 4159 Million Operations/s.
Next I disabled SMT (Simultaneous Multithreading) and SVM (Secure Virtual Machine) in Bios; both had been enabled by the Bios reset. I want to run at most one compute-intensive task on each of the 6 cores, and I do not do virtualization.
After reboot into Win11, running CPU-Z stress test showed CPU temperature maximum of 76°C only.
This screenshot shows a Passmark run with `CPU Single Threaded` of 4222 Million Operations/s.
That is rank 18(!) of Passmark's Single Threaded Performance list of 3100+ CPU models, for only 619$.
https://www.cpubenchmark.net/singleThread.html
The other numbers are not as they should be, and I have currently no explanation why Bios default settings give rank 18 at all ...
I made sure that the single core performance is real by determining `sqrt(-1) (mod p)` for 10,000-/36,401-/100,355-digit primes, see section #c-with-libgmpxx below for details. The 7600X really needs only 75.7%(!) of the i7-11850H runtime for the 100,355-digit prime, which matches the single threaded list numbers for both CPUs well: 3152 / 4222 = 74.7%.
#digits of prime | Ubuntu 22.04, 7600X w/ boost [s] | RHEL 8.7, i7-11850H [s] | 7600X / i7-11850H |
---|---|---|---|
10000 | 2.2 | 2.8 | 78.6% |
36401 | 45.8 | 59.6 | 76.8% |
100355 | 452.3 | 597.1 | 75.7% |
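The ratio column can be recomputed from the two runtime columns. Here is a minimal Python sketch, with the runtimes copied from the table above; the names `runtimes` and `ratio_percent` are only chosen for illustration:

```python
# Runtimes in seconds copied from the table: (7600X w/ boost, i7-11850H).
runtimes = {
    10000: (2.2, 2.8),
    36401: (45.8, 59.6),
    100355: (452.3, 597.1),
}

def ratio_percent(a, b):
    """Return a as a percentage of b."""
    return 100.0 * a / b

for digits, (t_7600x, t_i7) in runtimes.items():
    print(f"{digits:>6} digits: {ratio_percent(t_7600x, t_i7):.1f}%")

# Thread Mark values (i7-11850H: 3152, 7600X: 4222). A higher mark should
# mean a shorter runtime, so the inverse ratio is the one to compare against.
print(f"Thread Mark ratio: {ratio_percent(3152, 4222):.1f}%")
```

Comparing the last line with the runtime ratios shows how closely the benchmark list predicts this particular workload.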
What I like even more than just the numbers is that `perf stat` reports `5,306 GHz` on average for the 7.5 minutes of runtime!
hermann@7600x:~/RSA_numbers_factored/c++$ perf stat ./sqrtm1 2 > out 2> err
hermann@7600x:~/RSA_numbers_factored/c++$ grep GHz err
2.399.752.357.779 cycles # 5,306 GHz (83,33%)
hermann@7600x:~/RSA_numbers_factored/c++$ head -3 err
smallest quadratic non-residue prime: 7
452.293s
done
hermann@7600x:~/RSA_numbers_factored/c++$
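The reported frequency can be cross-checked by dividing the cycle count by the runtime. The sketch below assumes that the dots in the `perf stat` output are locale-specific thousands separators, and it uses the runtime printed by `sqrtm1` rather than perf's own task-clock, so a small deviation is to be expected. The helper name `parse_grouped_int` is made up for this example:

```python
def parse_grouped_int(text):
    """Parse an integer printed with '.' as thousands separator."""
    return int(text.replace(".", ""))

cycles = parse_grouped_int("2.399.752.357.779")  # from `grep GHz err`
runtime_s = 452.293                               # from `head -3 err`

# cycles per second, scaled to GHz
ghz = cycles / runtime_s / 1e9
print(f"{ghz:.3f} GHz")
```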
Last, but not least, I did run sqrtm1 on 36401-digit prime 6× in parallel:
for((i=1; i<=6; ++i)); do perf stat ./sqrtm1 1 > out$i 2> err$i & done; sleep 50
`perf stat` reported 5.019GHz on all 6 cores! And runtimes increased by at most 7.8% (49.379s/45.8s=107.8%) compared to single core execution.
hermann@7600x:~/RSA_numbers_factored/c++$ grep GHz err?
err1: 246.553.792.473 cycles # 5,019 GHz (83,32%)
err2: 246.044.857.451 cycles # 5,019 GHz (83,33%)
err3: 246.383.433.696 cycles # 5,019 GHz (83,33%)
err4: 247.880.628.160 cycles # 5,020 GHz (83,33%)
err5: 245.969.957.044 cycles # 5,019 GHz (83,33%)
err6: 246.262.037.340 cycles # 5,019 GHz (83,33%)
hermann@7600x:~/RSA_numbers_factored/c++$ grep "^[0-9][0-9.]*s" err?
err1:49.119s
err2:49.0201s
err3:49.0884s
err4:49.379s
err5:49.0044s
err6:49.0637s
hermann@7600x:~/RSA_numbers_factored/c++$
49s runtimes for computing "2 to the power of (p-1)/4 modulo p" for big prime number p (36401 decimal digits).
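The computation itself is short. As a hedged illustration (this is not the code from `sqrtm1.cc`, which uses libgmpxx), the idea can be sketched in Python: for a prime p with p = 1 (mod 4) and a quadratic non-residue q modulo p, q to the power of (p-1)/4 is a square root of -1 modulo p. The function names below are made up for this sketch, and the input is assumed to be prime:

```python
def smallest_qnr(p):
    """Smallest quadratic non-residue modulo the odd prime p.

    Uses Euler's criterion: q is a non-residue iff q^((p-1)/2) == -1 (mod p).
    The smallest non-residue is always a prime number.
    """
    q = 2
    while pow(q, (p - 1) // 2, p) != p - 1:
        q += 1
    return q

def sqrtm1(p):
    """Return r with r*r == -1 (mod p), for a prime p with p % 4 == 1."""
    if p % 4 != 1:
        raise ValueError("sqrt(-1) (mod p) needs a prime p = 1 (mod 4)")
    return pow(smallest_qnr(p), (p - 1) // 4, p)

for p in (13, 17):
    r = sqrtm1(p)
    print(p, smallest_qnr(p), r, (r * r) % p == p - 1)
```

Python's built-in three-argument `pow` does the modular exponentiation. For primes with tens of thousands of digits this sketch would run much longer than the GMP based program.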
Factoring RSA-100 in 1:32:59h was already impressive, but factoring RSA-110 in less than 14h is amazing (older computer took more than 26h). My msieve fork makes sure that computation is identical on all platforms (by a fixed random seed, and rewriting CPU cache size dependent code):
https://github.com/Hermann-SW/msieve#readme
hermann@7600x:~/msieve/test-110$ perf stat make 2>err
...
52c52
< elapsed time 13:43:43
---
> elapsed time 26:41:28
hermann@7600x:~/msieve/test-110$ grep GHz err
258.615.607.146.484 cycles # 5,233 GHz (83,33%)
hermann@7600x:~/msieve/test-110$
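The two elapsed times and the `perf stat` line can be related with a few lines of Python. This is only a sketch using the numbers shown above; `to_seconds` is a hypothetical helper name:

```python
def to_seconds(hms):
    """Convert an 'H:MM:SS' string to seconds."""
    h, m, s = (int(x) for x in hms.split(":"))
    return 3600 * h + 60 * m + s

new_pc = to_seconds("13:43:43")  # 7600X
old_pc = to_seconds("26:41:28")  # older computer
print(f"speedup: {old_pc / new_pc:.2f}x")

# Cross-check: cycles divided by the average frequency reported by perf
# should be close to the elapsed time of the new PC.
cycles = 258_615_607_146_484
avg_hz = 5.233e9
print(f"{cycles / avg_hz:.0f}s from cycles vs. {new_pc}s elapsed")
```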
The "only slight increase" (7.8%) in CPU runtimes when running 6 jobs in parallel, reported for `sqrtm1` above, does not hold for factoring RSA numbers with msieve. Runtime overhead gradually increases with the number of processes. Factoring RSA-79 six times in parallel adds 20% to runtime, factoring RSA-100 six times in parallel adds 53.2% CPU runtime. Find details here:
https://github.com/Hermann-SW/msieve#readme
These numbers were determined with PassMark GUI application for Win11, and 64bit Linux command line app for Ubuntu. Same Bios settings (SMT and SVM disabled, no PBO), same PC. I have no idea yet why some numbers are so different between both OSes, for "the same" PassMark software.
Test | Ubuntu 22.04, pt_linux_x64 -r 1 | | Win11, (GUI) CPU MARK | Unit |
---|---|---|---|---|
CPU_INTEGER_MATH | 45436.9 | | 45188 | Million Operations/s |
CPU_FLOATINGPOINT_MATH | 40944.2 | < | 42643 | Million Operations/s |
CPU_PRIME | 167.7 | << | 227 | Million Primes/s |
CPU_SORTING | 27179.5 | < | 28144 | Thousand Strings/s |
CPU_ENCRYPTION | 10110.3 | > | 9267 | MB/s |
CPU_COMPRESSION | 199937.7 | | 198128 | KB/s |
CPU_SINGLETHREAD | 3690.2 | | 3703 | Million Operations/s |
CPU_PHYSICS | 2080.4 | >> | 1619 | Frames/s |
CPU_MATRIX_MULT_SSE | 18412.6 | << | 20983 | Million Matrices/s |
Bench on left side shows near optimal Multi Thread Ratio of 6.
Stress CPU on right after 30 minutes shows maximal CPU temperature less than 70°C.
I had huge problems with PARI/GP on Ubuntu 22.04 server on 7600X CPU (not on Win11 on 7600X):
https://pari.math.u-bordeaux.fr/archives/pari-users-2306/msg00095.html
All problems are gone after installing Ubuntu 22.04 Desktop on a fast SSD (history: problem PARI/GP section).
I compiled PARI/GP from source "Stable version: pari-2.15.3.tar.gz (4.9 MB), Mar 14 2023" for Ubuntu and RHEL. I installed "Stable 64-bit version: Pari64-2-15-3.exe (97.9 MB), Mar 14 2023" on Win11.
I used https://github.com/Hermann-SW/RSA_numbers_factored/blob/main/pari/sqrtm1.gp for computing `sqrt(-1) (mod p)` for 10,000-/36,401-/100,355-digit primes.
I had this in `gprc` for all OSes:
...
parisizemax = 2G
parisize = 2G
...
#digits of prime | (Win11), (7600X) [s] | Ubuntu 22.04, 7600X [s] | RHEL 8.7, i7-11850H [s] | 7600X / i7-11850H |
---|---|---|---|---|
10000 | (3.0) | 2.7 | 3.0 | 90% |
36401 | (70.5) | 65.9 | 74.3 | 88.7% |
100355 | (660.8) | 611.4 | 708.9 | 86.2% |
With these measurements I can rule out Win11 for the 7600X, since it is about 8% slower than Ubuntu 22.04 desktop on the 100,355-digit prime (660.8s vs. 611.4s), with the same CPU and Bios settings.
I ran 6 `gp` processes on the 36,401-digit prime in parallel, with nothing else running during that time:
for((i=1;i<=6;++i)); do perf stat gp < sqrtm1.gp > out$i 2> err$i & done
7600X runs at 4.684GHz on all six cores, with runtimes [ms] 65974 / 66008.5 / 66052 (min/avg/max), only slightly longer than 65915ms for single process running:
hermann@7600x:~/RSA_numbers_factored/pari$ grep GHz err?
err1: 309.919.346.018 cycles # 4,684 GHz (83,34%)
err2: 310.044.500.390 cycles # 4,684 GHz (83,34%)
err3: 310.427.368.510 cycles # 4,684 GHz (83,33%)
err4: 310.091.040.427 cycles # 4,684 GHz (83,34%)
err5: 310.295.693.879 cycles # 4,684 GHz (83,33%)
err6: 310.013.102.550 cycles # 4,684 GHz (83,33%)
hermann@7600x:~/RSA_numbers_factored/pari$ grep result out?
out1: *** last result computed in 1min, 5,974 ms.
out2: *** last result computed in 1min, 5,989 ms.
out3: *** last result computed in 1min, 6,033 ms.
out4: *** last result computed in 1min, 6,013 ms.
out5: *** last result computed in 1min, 6,052 ms.
out6: *** last result computed in 1min, 5,990 ms.
hermann@7600x:~/RSA_numbers_factored/pari$
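The min/avg/max values quoted above can be reproduced from the `grep result out?` lines. The following Python sketch assumes that the comma in gp's timing output is a thousands separator ("5,974 ms" meaning 5974 ms) and it only handles minutes and milliseconds, not hours. The function name `gp_time_ms` is made up for this example:

```python
import re
from statistics import mean

lines = """\
out1: *** last result computed in 1min, 5,974 ms.
out2: *** last result computed in 1min, 5,989 ms.
out3: *** last result computed in 1min, 6,033 ms.
out4: *** last result computed in 1min, 6,013 ms.
out5: *** last result computed in 1min, 6,052 ms.
out6: *** last result computed in 1min, 5,990 ms.
""".splitlines()

def gp_time_ms(line):
    """Convert a gp timing such as '1min, 5,974 ms' to milliseconds."""
    m = re.search(r"(?:(\d+)min, )?([\d,]+) ms", line)
    minutes = int(m.group(1) or 0)
    millis = int(m.group(2).replace(",", ""))
    return 60_000 * minutes + millis

times = [gp_time_ms(line) for line in lines]
print(min(times), mean(times), max(times))
```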
The number of instructions is slightly different for the 6 `gp` processes:
hermann@7600x:~/RSA_numbers_factored/pari$ grep instructions err?
err1: 933.294.654.610 instructions # 3,01 insn per cycle
err2: 933.309.870.993 instructions # 3,01 insn per cycle
err3: 933.319.974.107 instructions # 3,01 insn per cycle
err4: 933.300.655.312 instructions # 3,01 insn per cycle
err5: 933.300.069.071 instructions # 3,01 insn per cycle
err6: 933.308.641.012 instructions # 3,01 insn per cycle
hermann@7600x:~/RSA_numbers_factored/pari$
`perf stat` output for a single 100355-digit prime run on 7600X Ubuntu (currently without PBO = Precision Boost Overdrive, with DDR5-4800 RAM):
611.605,71 msec task-clock # 1,000 CPUs utilized
8.980 context-switches # 14,683 /sec
0 cpu-migrations # 0,000 /sec
244.980 page-faults # 400,552 /sec
2.867.129.733.053 cycles # 4,688 GHz (83,33%)
187.295.262 stalled-cycles-frontend # 0,01% frontend cycles idle (83,33%)
1.994.163.508 stalled-cycles-backend # 0,07% backend cycles idle (83,33%)
8.797.163.689.795 instructions # 3,07 insn per cycle
# 0,00 stalled cycles per insn (83,33%)
473.206.111.956 branches # 773,711 M/sec (83,34%)
1.187.115.474 branch-misses # 0,25% of all branches (83,33%)
611,688749737 seconds time elapsed
611,451802000 seconds user
0,147991000 seconds sys
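The derived values in the comment column follow directly from the raw counters. As a small sketch (assuming again that the dots are locale-specific thousands separators; `parse_grouped_int` is a made-up helper name), instructions per cycle and the branch miss rate can be recomputed like this:

```python
def parse_grouped_int(text):
    """Parse an integer printed with '.' as thousands separator."""
    return int(text.replace(".", ""))

cycles = parse_grouped_int("2.867.129.733.053")
instructions = parse_grouped_int("8.797.163.689.795")
branches = parse_grouped_int("473.206.111.956")
branch_misses = parse_grouped_int("1.187.115.474")

ipc = instructions / cycles                    # "insn per cycle"
miss_rate = 100.0 * branch_misses / branches   # "% of all branches"

print(f"{ipc:.2f} insn per cycle")
print(f"{miss_rate:.2f}% of all branches missed")
```

An IPC of about 3 together with almost no stalled cycles suggests that this workload is compute bound rather than memory bound, which fits the small effect of running six copies in parallel.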
I used https://github.com/Hermann-SW/RSA_numbers_factored/blob/main/c%2B%2B/sqrtm1.cc for computing `sqrt(-1) (mod p)` for 10,000-/36,401-/100,355-digit primes here as well. 7600X runtimes [s] are 10%+ better than i7-11850H. These runs are currently without PBO (Precision Boost Overdrive), and match the Passmark ratio of the Thread Mark values from above, 3152/3690=85.4%:
#digits of prime | Ubuntu 22.04, 7600X [s] | RHEL 8.7, i7-11850H [s] | 7600X / i7-11850H |
---|---|---|---|
10000 | 2.5 | 2.8 | 89.3% |
36401 | 52.5 | 59.6 | 88.1% |
100355 | 505.4 | 597.1 | 84.6% |