Benchmark OpenBLAS, Intel MKL vs ATLAS #18
Comments
Hi Yoshi, this is very interesting information. I have been working on getting necpp to work with Eigen (eigen.tuxfamily.org); however, it has been difficult because Eigen aligns the rows and columns of matrices to 4-byte address boundaries. I will keep trying, as it will make an interesting comparison as well. Kind regards, Tim Molteno
You ran ATLAS on a dual-core machine and the others on a quad-core, so if you adjust the ATLAS numbers accordingly they are about the same. Also, what is the cost of OpenBLAS? $0. MKL? ++$$$$. I'll stick with OpenBLAS. I'd like to see you benchmark PLASMA, BLIS, and libflame; I think these will be faster than OpenBLAS, as they have been updated with current kernels. Throw in an OpenCL comparison if you can, and also try an OpenMPI build. OpenBLAS doesn't have a lot of kernels to tune, so when it ran its generic x86_64 configure it probably didn't determine your cache size correctly and is crashing when malloc returns a null pointer. Probably.
http://gcdart.blogspot.jp/2013/06/fast-matrix-multiply-and-ml.html is a good reference for this discussion.
Just FYI: MKL is now FREE, free as in free beer, or a free couch on the side of the road used by a guy who looks like Homer Simpson, or free as in the US's ideology on speech. https://software.intel.com/en-us/articles/free_mkl
*Disclaimer: The words above are my own and do not reflect the opinions or ideals of Intel. This is not endorsed by any entity.
@ytakeyasu Can you share the full compile args you used to link OpenBLAS and MKL? Thanks
Hi,
• Intel(R) product: Intel(R) MKL 11.1
Then I got link options as follows:
-Wl,--start-group
Regards,
Yoshi Takeyasu
I hope this is still active. Did you install the libraries yourself, from source, or did you use the stock ATLAS and OpenBLAS from a repository? ATLAS really has to be tuned to your system; the tuning can give speedups of at least 2-3x.
Hi,
This is not a problem report, but I'd like to share my benchmark of LAPACK/BLAS libraries. Because of my huge simulation model, I have been replacing my CPU and math library. My conclusion is that Intel MKL is the best, and OpenBLAS is worth trying.
The simulation model (a smooth-walled 3-section conical horn antenna) consists of surface patches defined by SP & SC cards. Total run-time is measured with gettimeofday() instead of sysconf(). Note that OpenBLAS speeds up by more than the ratio of CPU cores (Duo vs. Quad). As shown in the flat profile below, 90% of the calculation time is spent in zgemm_kernel_n, which is parallelized across the cores.
Flat profile:
matrix_algebra.cpp is modified for OpenBLAS:
To handle the transposed matrix, zgetrs.c of OpenBLAS is also modified:
This is a dirty solution; I would appreciate it if someone could suggest a better one.
OpenBLAS is superb, but I experienced a memory segmentation fault with over 60 GB of memory usage on an 8-core CPU. Although I confirmed that the segfault was NOT caused by NEC2++, fixing the problem in OpenBLAS was beyond my capability, so I migrated to Intel MKL.
matrix_algebra.cpp is modified for Intel MKL:
Link options are:
The Intel Math Kernel Library Link Line Advisor suggests these options. I used a slightly older version of the resources.
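For readers without access to the advisor, a typical link line it produces for a configuration like this one (icc, Intel(R) 64, static linking, OpenMP threading, LP64 interface) looks roughly like the following. This is a sketch, not the author's exact command; the object file names are placeholders, and the library paths depend on your MKL installation:

```shell
# Typical MKL Link Line Advisor output (icc / Intel64 / static / OpenMP / LP64).
# MKLROOT must point at your MKL installation; exact paths vary by version.
icc nec2++.o matrix_algebra.o -o nec2++ \
    -Wl,--start-group \
    ${MKLROOT}/lib/intel64/libmkl_intel_lp64.a \
    ${MKLROOT}/lib/intel64/libmkl_intel_thread.a \
    ${MKLROOT}/lib/intel64/libmkl_core.a \
    -Wl,--end-group \
    -liomp5 -lpthread -lm
```

The --start-group/--end-group pair is needed because the three static MKL libraries have circular dependencies, so the linker must rescan them as a group.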
NEC2++ : ver.1.5.1
OpenBLAS : ver.2.5
Intel MKL : ver.11.1
gcc : ver.4.7.2
icc : ver.13.0.1
I hope this helps with your serious number-crunching.
Best regards.
Yoshi Takeyasu