question about which version to use #43
Hi @wangyuyue To repro the MLPerf results, you may refer to this doc. It will choose the right ideep commit for you:

```bash
cd pytorch
git fetch origin pull/25235/head:mlperf
git checkout mlperf
git submodule sync && git submodule update --init --recursive
```
Hi, thanks for the response. I notice that I should set MKLROOT. Should I install MKL in advance and set MKLROOT to /opt/intel/mkl (the default install location is /opt/intel)?
Yes, you should install MKL. We do not use the Intel C++ compiler (or have we used ICX, the Intel next-gen compiler? I'm not sure). You could set libgomp.so instead of libiomp5.so, but that will have a performance impact, because all the configurations target Intel OpenMP. And Intel OpenMP is an open-source project under the LLVM umbrella.
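For reference, a minimal sketch of the MKL environment setup, assuming the default /opt/intel install prefix (adjust the path to match your installer):

```bash
# mklvars.sh ships with classic MKL and sets MKLROOT plus the library paths
source /opt/intel/mkl/bin/mklvars.sh intel64
echo "$MKLROOT"   # should print /opt/intel/mkl

# or set it by hand:
export MKLROOT=/opt/intel/mkl
export LD_LIBRARY_PATH="$MKLROOT/lib/intel64:$LD_LIBRARY_PATH"
```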
Oh, my bad, Euler must be compiled with ICC, or you get no performance at all. 🥳
I'm confused... So what does this project use? Euler, MKL, or MKL-DNN? I notice that when generating the int8 model and building PyTorch with Deep-learning-math-kernel,
Now I have installed icc and icpc. First I run
Then I run
DNNL/MKL is the major acceleration library, with production quality. Euler was a Winograd kernel library. For competitive performance, as long as the code was published for scrutiny, we could do whatever reasonable optimizations we wanted. So it is more complicated than an ordinary environment. 😊
@wangyuyue For the Euler build error, can you paste the contents of the generated build commands after cmake, as below?
Can you help check your ICC version?
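For example, a quick way to check (assuming icc and icpc are already on your PATH):

```bash
icc --version    # prints the compiler version string
icpc --version
```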
I downloaded the installer from Intel System Studio, and I think it's the latest version. Can you also point me to the install page if I need to install an older version? Thanks.
Here are some environment variables I set:
Hope this is helpful for debugging.
Can you run a verbose make?
BTW, for ICC, you normally run the setup procedure like this (before the build):
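A sketch of what that setup usually looks like; the compilervars.sh path is an assumption, since it varies between Intel System Studio and Parallel Studio installs:

```bash
# Load the ICC environment for 64-bit builds (path is install-dependent)
source /opt/intel/bin/compilervars.sh intel64

# A verbose make prints every compile command, which helps spot wrong flags
make VERBOSE=1
```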
Can you check your CPU info like this:
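For instance, one common way (not necessarily the exact command meant here) is to grep the CPU feature flags:

```bash
# List any AVX512 feature flags the CPU advertises; empty output means no AVX512.
# Look for avx512f (foundation) and avx512_vnni (needed for MLPerf int8).
grep -o 'avx512[a-z_]*' /proc/cpuinfo | sort -u
```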
Euler expects at least a Skylake server with the AVX512 instruction set. MLPerf requires Cascade Lake (CLX) or Cooper Lake (CPX), which have AVX512_VNNI instruction support. The error you met is most likely because your build machine does not have AVX512 (-xHost tells the compiler to generate instructions according to your host build environment, but Euler has intrinsics using AVX512). If you can, please use CLX to build. Or you can cross-compile it using the command below, but you will not be able to run the result without hardware with AVX512/VNNI.
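The exact cross-compile command was not captured here; as an illustration, the usual ICC approach is to replace -xHost with an explicit target switch (the cmake invocation below is an assumption about Euler's build):

```bash
# -xCORE-AVX512 makes ICC emit Skylake-server (AVX512) code regardless of the
# build host; the resulting binary will not run on a machine without AVX512.
cmake .. -DCMAKE_CXX_FLAGS="-xCORE-AVX512"
make -j"$(nproc)"
```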
My server doesn't have AVX512 support. So is there an alternative? I mean, skip the Euler-related parts and still complete the MLPerf experiment. I hope I don't need to make too many changes. Thanks.
You can disable Euler and use the MKLDNN engine instead. During the build of PyTorch/Caffe2, disable Euler like this:
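The exact switch was not captured in this thread; below is a sketch following PyTorch's usual USE_* build toggles, where USE_EULER is an assumed name exposed by the MLPerf fork's build scripts:

```bash
# USE_MKLDNN=1 is a standard PyTorch build toggle;
# USE_EULER is assumed to exist in the MLPerf fork.
cd pytorch
USE_EULER=0 USE_MKLDNN=1 python setup.py install
```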
BTW, which document are you referring to? @wuhuikx may have more insights on that part.
AVX512 is needed for the MLPerf numbers. Otherwise you can only run the workload. 😊
Hi, I don't need to get exactly the same performance as the original experiment. It is enough to run the whole process.
@CaoZhongZ Yes. The MLPerf submission is based on INT8 low precision. In 2019 even MKLDNN did not support INT8 on AVX2. So the minimal HW requirement to run INT8 with either MKLDNN or Euler is Skylake with AVX512. @wangyuyue
Int8 optimizations for systems without Intel AVX512 are available in oneDNN v1.2 and later.
Thanks for the response. So I need to clone the latest oneDNN as a third_party of PyTorch? Do I need any other changes? @vpirogov
Unfortunately this will not work without quite an effort in backporting. @wangyuyue
Hi, I'm trying to reproduce MLPerf/inference_results_v0.5, and I'm having some trouble with the ideep API. For example:

```cpp
auto in_format = dataOrder_ == "NCHW" ? ideep::format::nchw : ideep::format::nhwc;
```

I think ideep::format has changed to format_tag, and reinit has changed to init, right? But I don't know all the correspondences. Should I use an older version of ideep as the third_party of PyTorch and rebuild PyTorch from source? I would really appreciate it if you could help me.
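Rather than porting the code to the new ideep API, one option (a sketch based on the checkout steps earlier in this thread) is to let the mlperf branch pin the matching older ideep commit via submodules:

```bash
cd pytorch
git fetch origin pull/25235/head:mlperf
git checkout mlperf
# Sync third_party/ideep (and the other submodules) to the commits pinned by
# this branch, which should still expose ideep::format and reinit
git submodule sync
git submodule update --init --recursive
```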