Ryzen/ZEN DYNAMIC_ARCH support is broken #1146
Comments
AMD Ryzen support is currently very sketchy; please see the notes in PR #1133 if you haven't already. Instead of setting OPENBLAS_CORETYPE=Zen, I suspect building with TARGET=ZEN would have worked better. (I'm not sure why the DYNAMIC_ARCH build would fall back to BARCELONA unless you disallowed AVX. Did you get a fallback warning from the code?)
No fallback warnings. As explained in the original post, making a static target works (but is useless for me).
Also, AVX is obviously working, because OPENBLAS_CORETYPE=Haswell works fine.
https://github.com/xianyi/OpenBLAS/blob/develop/driver/others/dynamic.c#L395 This check should have been on "exfamily" instead. I'll patch this up and prepare a pull request.
Right, thanks. I obviously missed that typo when I committed his PR, and I did not (and still do not) have the hardware to notice its effect, sorry.
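For readers unfamiliar with why the "exfamily" field matters here: CPUID leaf 1 reports a 4-bit base family and a separate 8-bit extended family, and the family 23 shown in the /proc/cpuinfo output of the original post is the sum of base family 0xF and extended family 8. The following is a minimal sketch of that decoding, not the code in dynamic.c. The struct, the function names and the `looks_like_zen` predicate are hypothetical, and the example signature 0x00800F11 is assembled from the family, model and stepping values in the report.

```c
#include <stdint.h>

/* Fields of the CPUID leaf 1 EAX signature. */
struct cpu_signature {
    unsigned stepping;  /* bits 3:0 */
    unsigned model;     /* base model, bits 7:4 */
    unsigned family;    /* base family, bits 11:8 */
    unsigned exmodel;   /* extended model, bits 19:16 */
    unsigned exfamily;  /* extended family, bits 27:20 */
};

/* Split a raw EAX value into its fields. */
static struct cpu_signature decode_signature(uint32_t eax) {
    struct cpu_signature s;
    s.stepping = eax & 0xF;
    s.model    = (eax >> 4) & 0xF;
    s.family   = (eax >> 8) & 0xF;
    s.exmodel  = (eax >> 16) & 0xF;
    s.exfamily = (eax >> 20) & 0xFF;
    return s;
}

/* Family as tools such as /proc/cpuinfo display it: the extended
 * family is added only when the base family is 0xF. */
static unsigned display_family(const struct cpu_signature *s) {
    return s->family == 0xF ? s->family + s->exfamily : s->family;
}

/* Hypothetical predicate: the base family saturates at 0xF, so the
 * distinguishing value for family 23 lives in exfamily. Testing the
 * base family against 8 could never match such a CPU. */
static int looks_like_zen(const struct cpu_signature *s) {
    return s->family == 0xF && s->exfamily == 8;
}
```

With the signature 0x00800F11, `decode_signature` yields base family 0xF and extended family 8, so `display_family` returns 23, matching the `cpu family : 23` line in the report.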
Setting OPENBLAS_CORETYPE=Zen still makes this default to Prescott, so there are more fixes to be done. (Unsetting it makes autodetection work correctly.)
That appears to be in force_coretype (also in driver/others/dynamic.c): the for loop that compares corenames only runs to 22...
Updated the pull request with a fix for that too.
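The hard-coded loop bound described above is a common way for a newly added entry to become unreachable by name. As a rough illustration only, assuming a shortened, hypothetical name table (the real list in dynamic.c is longer and ordered differently), a lookup can derive its bound from the table itself so that appending an entry such as "Zen" needs no second change. The names `core_names`, `names_equal` and `find_core` are invented for this sketch.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical, shortened name table; index 0 is a placeholder. */
static const char *const core_names[] = {
    "Unknown", "Prescott", "Barcelona", "Haswell", "Zen",
};

/* Element count computed from the array, not written by hand. */
#define NUM_CORE_NAMES (sizeof(core_names) / sizeof(core_names[0]))

/* Case-insensitive comparison of two whole strings. */
static int names_equal(const char *a, const char *b) {
    while (*a && *b) {
        if (tolower((unsigned char)*a) != tolower((unsigned char)*b))
            return 0;
        a++;
        b++;
    }
    return *a == *b; /* equal only if both strings ended together */
}

/* Return the table index for a requested core name, or -1 if the
 * name is not in the table. The placeholder at index 0 is skipped. */
static int find_core(const char *requested) {
    for (size_t i = 1; i < NUM_CORE_NAMES; i++) {
        if (names_equal(requested, core_names[i]))
            return (int)i;
    }
    return -1;
}
```

In this sketch `find_core("Zen")` reaches the last entry because the bound follows the array size. With a fixed bound smaller than the table, the same call would fall through to the not-found path, which is consistent with the fallback behaviour reported for OPENBLAS_CORETYPE=Zen.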
Building from source at commit 20a413e on Ubuntu 16.04 (gcc 5.4.1) with a 4.10.1 kernel.
Zen is not autodetected; the machine is detected as Barcelona instead. Manually specifying OPENBLAS_CORETYPE=Zen results in Prescott kernels being used.
Removing DYNAMIC_ARCH=1 results in a ZEN kernel, so static detection works (but see the next issue: performance is worse than with the HASWELL kernel...).
processor : 15
vendor_id : AuthenticAMD
cpu family : 23
model : 1
model name : AMD Ryzen 7 1700 Eight-Core Processor
stepping : 1
microcode : 0x8001105
cpu MHz : 1550.000
cache size : 512 KB
physical id : 0
siblings : 16
core id : 7
cpu cores : 8
apicid : 15
initial apicid : 15
fpu : yes
fpu_exception : yes
cpuid level : 13
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc extd_apicid aperfmperf pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw skinit wdt tce topoext perfctr_core perfctr_nb bpext perfctr_l2 mwaitx hw_pstate vmmcall fsgsbase bmi1 avx2 smep bmi2 rdseed adx smap clflushopt sha_ni xsaveopt xsavec xgetbv1 xsaves clzero irperf arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold avic overflow_recov succor smca
bugs : fxsave_leak sysret_ss_attrs null_seg
bogomips : 7163.90
TLB size : 2560 4K pages
clflush size : 64
cache_alignment : 64
address sizes : 48 bits physical, 48 bits virtual
power management: ts ttp tm hwpstate eff_freq_ro [13] [14]