Huge performance difference between Intel OpenMP and GCC OpenMP #230
Comments
Hi @yinghai, OpenMP is used for parallelization. The more cores you have, the higher performance you get. The rough estimate for the scaling is linear: 2x more cores, 2x faster MKL-DNN. There are other factors too, of course. So based on the data you sent it seems you have a ~12-core system, for which using OpenMP is essential to get good performance.
@emfomenk Thanks. But my question is not about using or not using OpenMP. It's about the performance difference between using Intel's OpenMP library (libiomp) and using GCC's OpenMP. In both cases, I can see it from the `ldd` output:
Beg your pardon, I read the original message inaccurately...
@emfomenk Cool! With this env var (`MKL_THREADING_LAYER=GNU`) it gives performance similar to the iomp results. Thanks for the tip. However, …
Typically Intel MKL-DNN is built with Intel MKL-ML (the small subset of Intel MKL). In that case, if you want to use GNU OpenMP you should link against `libmklml_gnu` instead of `libmklml_intel`.

In your case you built Intel MKL-DNN with the full Intel MKL (by linking with `libmkl_rt`). If either the `MKL_THREADING_LAYER` environment variable or a `mkl_set_threading_layer()` call is used, `libmkl_rt` loads the requested OpenMP runtime; otherwise it defaults to Intel OpenMP.
You have 4 choices:

1. Keep linking `libmkl_rt` and set `MKL_THREADING_LAYER=GNU` in the environment.
2. Keep linking `libmkl_rt` and call `mkl_set_threading_layer(MKL_THREADING_GNU)` before the first Intel MKL call.
3. Link against `libmkl_gnu_thread` (with `libmkl_intel_lp64` and `libmkl_core`) instead of `libmkl_rt`.
4. Link against `libmklml_gnu` instead of `libmklml_intel`.

In the latter 2 cases no extra settings are required.
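A hedged sketch of such options, assuming a standard Intel MKL installation with `$MKLROOT` set and GCC as the compiler; the object name `app.o` and the link lines are illustrative (they follow the usual Intel MKL Link Line Advisor pattern for GCC):

```shell
# (1) Keep linking libmkl_rt; pick GNU OpenMP with an environment variable.
MKL_THREADING_LAYER=GNU ./simple-net-cpp

# (2) Same, but programmatically (call before the first MKL routine):
#     mkl_set_threading_layer(MKL_THREADING_GNU);

# (3) Link the GNU threading layer explicitly instead of libmkl_rt:
gcc app.o -L"$MKLROOT/lib/intel64" -Wl,--no-as-needed \
    -lmkl_intel_lp64 -lmkl_gnu_thread -lmkl_core \
    -lgomp -lpthread -lm -ldl -o app

# (4) With the MKL-ML small package, link the GNU flavor of the library:
gcc -fopenmp app.o -lmklml_gnu -o app
```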
We do have those libraries in our MKL installation. Let me try that! Thanks!
@emfomenk Thanks. I tried option 3 by changing the CMake file and it worked! https://github.com/yinghai/mkl-dnn/blob/mkl/cmake/MKL.cmake#L141-L158
Good to hear that it works!
@emfomenk @yinghai I beg your pardon, but I don't understand this. According to yinghai's first post of ldd output, the slower one is only using GNU OpenMP, and the faster one seems to mix GOMP and IOMP? Am I missing something?
Hi @ftian1,

The missing part is that the example also links against `libmkl_rt`, and `libmkl_rt` by default loads Intel OpenMP. That was the cause of the issue, and that is why I recommended using `MKL_THREADING_LAYER=GNU`.
The example was compiled with GNU OpenMP, but Intel MKL-DNN was not instructed to use GNU OpenMP, so it used Intel OpenMP. When you mix two OpenMP runtimes in one application you might get huge performance penalties and even incorrect results, so this is prohibited. In general there is no big difference between GNU OpenMP and Intel OpenMP (as long as you use only one of them); both are very well optimized.
thanks, Evarist. "ldd libmklml_intel.so" shows it depends on libiomp5.so. |
Both `libmklml_intel.so` and `libmklml_gnu.so` carry an explicit dependency on their OpenMP runtime (`libiomp5.so` and `libgomp.so`, respectively), which is why `ldd` shows it.

However, `libmkl_rt.so` has no such explicit dependency: it loads the threading layer at run time.
Sorry, I may not have described my doubt clearly. I am still confused by the slow performance of simple_net_cpp with GCC OpenMP, although you have explained that the libmkl_rt.so dispatcher invokes iomp or gomp according to an environment variable or a function call, and that this app mixed iomp and gomp. My understanding is that `ldd` shows all dynamic library dependencies of an executable file. From the ldd log shared by yinghai, the simple_net_cpp built with GCC OpenMP doesn't show a dependency on libiomp, so how could it end up invoking the iomp implementation through libmkl_rt.so? The only explanation I can think of is that libmkl_rt.so statically links the iomp library, but you have said it doesn't. I am confused on this point. I hope I presented my question clearly :(
`ldd` shows only the explicit dependencies that you pass to the linker at DSO or application link time:

```
$ gcc -shared foo.o -olibfoo.so -lbar
$ ldd libfoo.so
        libbar.so => ...
```

However, you can also have an implicit dependency on other libraries by using `dlopen()`, and that's exactly what `libmkl_rt.so` does. One of the reasons for that is that the desired threading layer is not known until run time.
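A minimal way to see the difference, assuming gcc on a glibc-based Linux system (file names here are illustrative):

```shell
# A library loaded via dlopen() never shows up in ldd output.
cat > dlopen_demo.c <<'EOF'
#include <dlfcn.h>
#include <stdio.h>

int main(void) {
    /* Load libm at run time -- the link editor never saw it. */
    void *h = dlopen("libm.so.6", RTLD_NOW);
    printf("dlopen %s\n", h ? "succeeded" : "failed");
    if (h) dlclose(h);
    return 0;
}
EOF
gcc dlopen_demo.c -o dlopen_demo -ldl
ldd ./dlopen_demo | grep libm || echo "ldd does not list libm"
./dlopen_demo
```

The same effect is why `ldd simple_net_cpp` stays silent about the OpenMP runtime that `libmkl_rt.so` will load.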
@emfomenk I see. I had forgotten about dlopen() after a long time of not using it... :(
I can't find any references to omp:
^^ BTW this is the latest 2020 version of Intel MKL that I just downloaded today.
Intel MKL manages the OpenMP runtime in its dynamic libraries using `dlopen()`, so it does not show up as an explicit dependency of the binary.
How can I tell it which one to load, or know which one it will load? It's not reasonable to disallow a previously loaded OpenMP, especially LLVM OpenMP, since this is largely out of our control via clang or transitive dependencies. It's actually quite offensive that MKL tries to dictate to me stuff that should be under my discretion.
You choose the OpenMP threading layer to link based on the compiler you use: the Intel layer for the Intel compilers, the GNU layer for GCC. If you use TBB, you link to the TBB threading library. If you link to `libmkl_rt`, the layer is selected at run time.

The fact that there are multiple OpenMP threading layers is an unfortunate consequence of the fact that OpenMP libraries from different compilers are not compatible. Also note that Clang OpenMP is not supported, as far as I know. Maybe @aaraujom can correct me here.
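On the "how can I tell it which one to load" question: when the application links the single dynamic library `libmkl_rt`, the layer can be forced explicitly. A sketch, with `./app` standing in for an MKL-linked binary:

```shell
# Forcing the threading layer of libmkl_rt at run time.
MKL_THREADING_LAYER=GNU ./app          # GNU OpenMP (libgomp)
MKL_THREADING_LAYER=INTEL ./app        # Intel OpenMP (libiomp5), the default
MKL_THREADING_LAYER=TBB ./app          # Intel TBB
MKL_THREADING_LAYER=SEQUENTIAL ./app   # no threading runtime at all
```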
As far as I know, Clang is not supported by Intel MKL. I recommend checking the Intel MKL Link Line Advisor for supported linking configurations.
Hi folks,
I am trying mkl-dnn. We have the Intel MKL library installed, but for some reason we don't want to use Intel's OpenMP (to prevent a clash with an upstream project which could potentially use GCC OpenMP). So I hacked cmake/OpenMP.cmake to remove the lines below https://github.com/intel/mkl-dnn/blob/40eb4d898e4caf7f23fdd25b348915134e878080/cmake/OpenMP.cmake#L51. After that I ran examples/simple-net-cpp and noticed a huge difference in performance. Here is the comparison: …

My question is whether this is expected, and how a change of OpenMP implementation can have such an impact. Thanks.