question about which version to use #43

Closed · wangyuyue opened this issue Aug 10, 2020 · 25 comments

Comments

@wangyuyue

Hi, I'm trying to reproduce MLPerf/inference_results_v0.5, and I'm running into trouble with the ideep API.
For example, in auto in_format = dataOrder_ == "NCHW" ? ideep::format::nchw : ideep::format::nhwc;, I think ideep::format has since changed to format_tag, and reinit has changed to init, right?
But I don't know all of the correspondences. Should I use an older version of ideep as a third_party submodule of PyTorch and rebuild PyTorch from source? I'd really appreciate any help.

@pinzhenx

pinzhenx commented Aug 10, 2020

Hi @wangyuyue

To reproduce the MLPerf results, you may refer to this doc. The steps below will check out the branch that chooses the right ideep commit for you:

cd pytorch
git fetch origin pull/25235/head:mlperf
git checkout mlperf
git submodule sync && git submodule update --init --recursive
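
If you want to sanity-check which ideep commit the submodule landed on (a quick check, assuming the standard PyTorch third_party layout; the expected hash depends on the PR):

$ cd third_party/ideep && git log --oneline -1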

@wangyuyue
Author

Hi, thanks for the response. I notice that I should set MKLROOT. Should I install MKL in advance and set MKLROOT to /opt/intel/mkl (the default install location is /opt/intel)?
Also, is the Intel C++ compiler a necessity, or can I use gcc for all of the compilation?
For these two lines, what should I set if I use libgomp.so instead of libiomp5.so?
export LD_PRELOAD=/opt/intel/compilers_and_libraries_2018.1.163/linux/compiler/lib/intel64/libiomp5.so
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/opt/intel/compilers_and_libraries_2018.1.163/linux/compiler/lib/intel64

@CaoZhongZ
Contributor

Yes, you should install MKL. We do not use the Intel C++ compiler; or have we used ICX (Intel's next-gen compiler)? I'm not sure. You could set libgomp.so instead of libiomp5.so, but that will have a performance impact, because all of the configurations target Intel OpenMP. (Intel OpenMP is an open-source project under the LLVM umbrella.)
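
If you do go with GNU OpenMP, a minimal substitution for those two lines might look like this (the libgomp path is an assumption; adjust it to your distro and gcc version):

# Use GNU OpenMP instead of Intel OpenMP (expect some performance loss)
export LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libgomp.so.1
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/lib/x86_64-linux-gnu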

@wangyuyue
Author

wangyuyue commented Aug 11, 2020

export BOOST_ROOT=<boost_install_folder>/include
cd /path/to/mlperf_submit/pytorch
cd third_party/ideep/euler/
mkdir build; cd build
cmake3 .. -DCMAKE_C_COMPILER=icc -DCMAKE_CXX_COMPILER=icpc -DWITH_VNNI=2
make
I think the Intel C++ compiler is first used here. I changed icc to gcc, icpc to g++, and cmake3 to cmake, and set BOOST_ROOT=/opt/boost_1_70_0/.
Then I get the error message below:
/opt/mlperf_submit/pytorch/third_party/ideep/euler/src/eld_conv.cpp:270:68: error: ‘new’ of type ‘euler::elx_int8_conv_wino_t<euler::ConvTypes<unsigned char, float, signed char, float>, euler::ConvImplTypes<unsigned char, signed char, float, float, float>, 6, 3, 16, 512>’ with extended alignment 64 [-Werror=aligned-new=]
  xc = new elx_int8_conv_wino_t<U, T, 6, 3, 16, ISA_AVX512>(*this);
Thanks for your kind help!

@CaoZhongZ
Contributor

Oh, my bad. Euler must be compiled with ICC, or you'll get essentially no performance at all. 🥳

@wangyuyue
Author

I'm confused... So what does this project use: Euler, MKL, or MKL-DNN? I notice that when generating the int8 model and building PyTorch with Deep-learning-math-kernel, the doc says export MKLROOT=/home/huiwu1/src/mkl2019_3, but for the roofline test and "Build Pytorch Backend" it says export MKLROOT=/path_to_mkl_dnn/mkl2019_3. So what does MKLROOT refer to here, MKL or MKL-DNN? I think they are different things.

@wangyuyue
Author

wangyuyue commented Aug 12, 2020

Now I have installed icc and icpc.

First I ran cmake .. -DCMAKE_C_COMPILER=icc -DCMAKE_CXX_COMPILER=icpc -DWITH_VNNI=2

-- Euler version: v0.0.1+HEAD.c18604e
-- MT_RUNTIME: omp
-- No preference for use of exported gflags CMake configuration set, and no hints for include/library directories provided. Defaulting to preferring an installed/exported gflags CMake configuration if available.
-- Found installed version of gflags: /usr/lib/x86_64-linux-gnu/cmake/gflags
-- Detected gflags version: 2.2.1
-- Found gflags include dirs: /usr/include
-- Found gflags libraries: gflags_shared
-- Found gflags namespace: google
-- Configuring done
-- Generating done
-- Build files have been written to: /opt/mlperf_submit/pytorch/third_party/ideep/euler/build

Then I ran make and hit errors. Below is part of the error output.

/opt/mlperf_submit/pytorch/third_party/ideep/euler/src/elx_conv_wino_trans_input.cpp(101): error: class "_mm<16>" has no member "stream_ps"
          _mm<V>::stream_ps(&md6(atinput6, _hA, _wA, _I3, _I2, _T, 0),
                  ^
          detected during instantiation of "void euler::elx_conv_wino_trans_input_t<TinputType, InputType, I, A, K, V>::__execute_post(TinputType *, euler::elx_conv_wino_trans_input_t<TinputType, InputType, I, A, K, V>::op_type *, int, int, int, int) [with TinputType=float, InputType=float, I=512, A=4, K=3, V=16]" at line 53

/opt/mlperf_submit/pytorch/third_party/ideep/euler/src/elx_conv_wino_trans_input.cpp(102): error: expression must be a pointer to a complete object type
                         *((__m<V> *)&md3(at, _hA, _wA, 0)));
                          ^
          detected during instantiation of "void euler::elx_conv_wino_trans_input_t<TinputType, InputType, I, A, K, V>::__execute_post(TinputType *, euler::elx_conv_wino_trans_input_t<TinputType, InputType, I, A, K, V>::op_type *, int, int, int, int) [with TinputType=float, InputType=float, I=512, A=4, K=3, V=16]" at line 53

/opt/mlperf_submit/pytorch/third_party/ideep/euler/src/elx_conv_wino_trans_input.cpp(107): error: class "_mm<16>" has no member "store_ps"
          _mm<V>::store_ps(&md6(atinput6, _hA, _wA, _I3, _I2, _T, 0),
                  ^
          detected during instantiation of "void euler::elx_conv_wino_trans_input_t<TinputType, InputType, I, A, K, V>::__execute_post(TinputType *, euler::elx_conv_wino_trans_input_t<TinputType, InputType, I, A, K, V>::op_type *, int, int, int, int) [with TinputType=float, InputType=float, I=512, A=4, K=3, V=16]" at line 53

/opt/mlperf_submit/pytorch/third_party/ideep/euler/src/elx_conv_wino_trans_input.cpp(108): error: expression must be a pointer to a complete object type
                        *((__m<V> *)&md3(at, _hA, _wA, 0)));
                         ^
          detected during instantiation of "void euler::elx_conv_wino_trans_input_t<TinputType, InputType, I, A, K, V>::__execute_post(TinputType *, euler::elx_conv_wino_trans_input_t<TinputType, InputType, I, A, K, V>::op_type *, int, int, int, int) [with TinputType=float, InputType=float, I=512, A=4, K=3, V=16]" at line 53

/opt/mlperf_submit/pytorch/third_party/ideep/euler/src/elx_conv_wino_trans_input.cpp(116): error: class "_mm<16>" has no member "cvt_f32_b16"
          auto fp16v = _mm<V>::cvt_f32_b16(*(__i<V> *)&md3(at, _hA, _wA, 0));
                               ^
          detected during instantiation of "void euler::elx_conv_wino_trans_input_t<TinputType, InputType, I, A, K, V>::__execute_post(TinputType *, euler::elx_conv_wino_trans_input_t<TinputType, InputType, I, A, K, V>::op_type *, int, int, int, int) [with TinputType=float, InputType=float, I=512, A=4, K=3, V=16]" at line 53

/opt/mlperf_submit/pytorch/third_party/ideep/euler/src/elx_conv_wino_trans_input.cpp(116): error: expression must be a pointer to a complete object type
          auto fp16v = _mm<V>::cvt_f32_b16(*(__i<V> *)&md3(at, _hA, _wA, 0));
                                            ^
          detected during instantiation of "void euler::elx_conv_wino_trans_input_t<TinputType, InputType, I, A, K, V>::__execute_post(TinputType *, euler::elx_conv_wino_trans_input_t<TinputType, InputType, I, A, K, V>::op_type *, int, int, int, int) [with TinputType=float, InputType=float, I=512, A=4, K=3, V=16]" at line 53

@CaoZhongZ
Contributor

@uyongw

@CaoZhongZ
Contributor

> I'm confused... So what does this project use: Euler, MKL, or MKL-DNN? I notice that when generating the int8 model and building PyTorch with Deep-learning-math-kernel, the doc says export MKLROOT=/home/huiwu1/src/mkl2019_3, but for the roofline test and "Build Pytorch Backend" it says export MKLROOT=/path_to_mkl_dnn/mkl2019_3. So what does MKLROOT refer to here, MKL or MKL-DNN? I think they are different things.

DNNL/MKL is the major acceleration library, with production quality. Euler was a Winograd kernel library. For competitive performance, as long as the code was published for scrutiny, we could do whatever reasonable optimizations we wanted. So it is more complicated than an ordinary environment. 😊

@uyongw

uyongw commented Aug 12, 2020

@wangyuyue For the Euler build error, can you paste the contents of the generated build commands after cmake, as below?

$ head -n10 build/compile_commands.json

@wangyuyue
Author

wangyuyue commented Aug 12, 2020

> @wangyuyue For the Euler build error, can you paste the contents of the generated build commands after cmake, as below?
>
> $ head -n10 build/compile_commands.json
Here is the content:

[
{
  "directory": "/opt/mlperf_submit/pytorch/third_party/ideep/euler/build",
  "command": "/opt/intel/sw_dev_tools/bin/icpc  -DEULER_VERSION=v0.0.1+HEAD.c18604e -DMT_RUNTIME=MT_RUNTIME_OMP -DWITH_VNNI -Del_EXPORTS -I/opt/mlperf_submit/pytorch/third_party/ideep/euler/. -I/opt/mlperf_submit/pytorch/third_party/ideep/euler/include -I/opt/mlperf_submit/pytorch/third_party/ideep/euler/src -I/opt/mlperf_submit/pytorch/third_party/ideep/euler/tests -I/opt/mlperf_submit/pytorch/third_party/ideep/euler/src/common  -fPIC -fvisibility=hidden   -Wall -Werror -Wextra -fopenmp -Wno-sign-compare -Wno-uninitialized -Wno-unused-variable -Wno-unused-parameter -std=c++11 -O2 -DNDEBUG -xHost -qopt-zmm-usage=high -no-inline-max-size -no-inline-max-total-size -wd15335 -o CMakeFiles/el.dir/src/common/el_log.cpp.o -c /opt/mlperf_submit/pytorch/third_party/ideep/euler/src/common/el_log.cpp",
  "file": "/opt/mlperf_submit/pytorch/third_party/ideep/euler/src/common/el_log.cpp"
},
{
  "directory": "/opt/mlperf_submit/pytorch/third_party/ideep/euler/build",
  "command": "/opt/intel/sw_dev_tools/bin/icpc  -DEULER_VERSION=v0.0.1+HEAD.c18604e -DMT_RUNTIME=MT_RUNTIME_OMP -DWITH_VNNI -Del_EXPORTS -I/opt/mlperf_submit/pytorch/third_party/ideep/euler/. -I/opt/mlperf_submit/pytorch/third_party/ideep/euler/include -I/opt/mlperf_submit/pytorch/third_party/ideep/euler/src -I/opt/mlperf_submit/pytorch/third_party/ideep/euler/tests -I/opt/mlperf_submit/pytorch/third_party/ideep/euler/src/common  -fPIC -fvisibility=hidden   -Wall -Werror -Wextra -fopenmp -Wno-sign-compare -Wno-uninitialized -Wno-unused-variable -Wno-unused-parameter -std=c++11 -O2 -DNDEBUG -xHost -qopt-zmm-usage=high -no-inline-max-size -no-inline-max-total-size -wd15335 -o CMakeFiles/el.dir/src/eld_conv.cpp.o -c /opt/mlperf_submit/pytorch/third_party/ideep/euler/src/eld_conv.cpp",
  "file": "/opt/mlperf_submit/pytorch/third_party/ideep/euler/src/eld_conv.cpp"

@uyongw

uyongw commented Aug 12, 2020

Can you check your ICC version?
$ /opt/intel/sw_dev_tools/bin/icpc --version

@wangyuyue
Author

wangyuyue commented Aug 12, 2020

$ /opt/intel/sw_dev_tools/bin/icpc --version

icpc (ICC) 19.1.2.254 20200623
Copyright (C) 1985-2020 Intel Corporation.  All rights reserved.

I downloaded the installer from Intel System Studio, and I think it's the latest version. If I need to install an older version, can you point me to the download page? Thanks.

@wangyuyue
Author

wangyuyue commented Aug 12, 2020

Here are some environment variables I set:

export MKLROOT=/opt/intel/mkl

export USE_OPENMP=ON

export CAFFE2_USE_MKLDNN=ON

export MKLDNN_USE_CBLAS=ON

export LD_PRELOAD=/usr/lib/gcc/x86_64-linux-gnu/7/libgomp.so

export OMP_NUM_THREADS=8 KMP_AFFINITY=proclist=[0-7],granularity=thread,explicit

export PYTHONPATH=/opt/mlperf_submit/pytorch/build

export BOOST_ROOT=/opt/boost_1_70_0

export PATH=$PATH:/opt/intel/sw_dev_tools/bin

Hope this is helpful for debugging.
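
Side note: KMP_AFFINITY only takes effect under Intel OpenMP; since libgomp is being preloaded here, the GNU equivalent would be something like this (a sketch that pins the eight threads to cores 0-7):

# GNU OpenMP affinity settings, replacing KMP_AFFINITY
export OMP_NUM_THREADS=8
export GOMP_CPU_AFFINITY=0-7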

@uyongw

uyongw commented Aug 12, 2020

Can you run a verbose make?

$ cd build
$ make VERBOSE=1

BTW, for ICC, normally you run the setup procedure like this (before the build):
$ source /opt/intel/sw_dev_tools/bin/compilervars.sh -arch intel64 -platform linux

@wangyuyue
Author

$ source /opt/intel/sw_dev_tools/bin/compilervars.sh -arch intel64 -platform linux
$ cmake .. -DCMAKE_C_COMPILER=icc -DCMAKE_CXX_COMPILER=icpc -DWITH_VNNI=2

-- Euler version: v0.0.1+HEAD.c18604e
-- MT_RUNTIME: omp
-- No preference for use of exported gflags CMake configuration set, and no hints for include/library directories provided. Defaulting to preferring an installed/exported gflags CMake configuration if available.
-- Found installed version of gflags: /usr/lib/x86_64-linux-gnu/cmake/gflags
-- Detected gflags version: 2.2.1
-- Found gflags include dirs: /usr/include
-- Found gflags libraries: gflags_shared
-- Found gflags namespace: google
-- Configuring done
You have changed variables that require your cache to be deleted.
Configure will be re-run and you may have to reset some variables.
The following variables have changed:
CMAKE_CXX_COMPILER= icpc
CMAKE_C_COMPILER= icc

-- The CXX compiler identification is Intel 19.1.2.20200623
-- The C compiler identification is Intel 19.1.2.20200623
-- Check for working CXX compiler: /opt/intel/sw_dev_tools/compilers_and_libraries_2020.2.254/linux/bin/intel64/icpc
-- Check for working CXX compiler: /opt/intel/sw_dev_tools/compilers_and_libraries_2020.2.254/linux/bin/intel64/icpc -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Check for working C compiler: /opt/intel/sw_dev_tools/compilers_and_libraries_2020.2.254/linux/bin/intel64/icc
-- Check for working C compiler: /opt/intel/sw_dev_tools/compilers_and_libraries_2020.2.254/linux/bin/intel64/icc -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Detecting C compile features
-- Detecting C compile features - done
-- Found Git: /usr/bin/git (found version "2.17.1")
-- Euler version: v0.0.1+HEAD.c18604e
-- MT_RUNTIME: omp
-- No preference for use of exported gflags CMake configuration set, and no hints for include/library directories provided. Defaulting to preferring an installed/exported gflags CMake configuration if available.
-- Found installed version of gflags: /usr/lib/x86_64-linux-gnu/cmake/gflags
-- Detected gflags version: 2.2.1
-- Found Gflags: /usr/include
-- Found gflags include dirs: /usr/include
-- Found gflags libraries: gflags_shared
-- Found gflags namespace: google
-- Configuring done
-- Generating done
-- Build files have been written to: /opt/mlperf_submit/pytorch/third_party/ideep/euler/build

make VERBOSE=1

/usr/bin/cmake -H/opt/mlperf_submit/pytorch/third_party/ideep/euler -B/opt/mlperf_submit/pytorch/third_party/ideep/euler/build --check-build-system CMakeFiles/Makefile.cmake 0
/usr/bin/cmake -E cmake_progress_start /opt/mlperf_submit/pytorch/third_party/ideep/euler/build/CMakeFiles /opt/mlperf_submit/pytorch/third_party/ideep/euler/build/CMakeFiles/progress.marks
make -f CMakeFiles/Makefile2 all
make[1]: Entering directory '/opt/mlperf_submit/pytorch/third_party/ideep/euler/build'
make -f CMakeFiles/el.dir/build.make CMakeFiles/el.dir/depend
make[2]: Entering directory '/opt/mlperf_submit/pytorch/third_party/ideep/euler/build'
cd /opt/mlperf_submit/pytorch/third_party/ideep/euler/build && /usr/bin/cmake -E cmake_depends "Unix Makefiles" /opt/mlperf_submit/pytorch/third_party/ideep/euler /opt/mlperf_submit/pytorch/third_party/ideep/euler /opt/mlperf_submit/pytorch/third_party/ideep/euler/build /opt/mlperf_submit/pytorch/third_party/ideep/euler/build /opt/mlperf_submit/pytorch/third_party/ideep/euler/build/CMakeFiles/el.dir/DependInfo.cmake --color=
Dependee "/opt/mlperf_submit/pytorch/third_party/ideep/euler/build/CMakeFiles/el.dir/DependInfo.cmake" is newer than depender "/opt/mlperf_submit/pytorch/third_party/ideep/euler/build/CMakeFiles/el.dir/depend.internal".
Dependee "/opt/mlperf_submit/pytorch/third_party/ideep/euler/build/CMakeFiles/CMakeDirectoryInformation.cmake" is newer than depender "/opt/mlperf_submit/pytorch/third_party/ideep/euler/build/CMakeFiles/el.dir/depend.internal".
Scanning dependencies of target el
make[2]: Leaving directory '/opt/mlperf_submit/pytorch/third_party/ideep/euler/build'
make -f CMakeFiles/el.dir/build.make CMakeFiles/el.dir/build
make[2]: Entering directory '/opt/mlperf_submit/pytorch/third_party/ideep/euler/build'
[  0%] Building CXX object CMakeFiles/el.dir/src/common/el_log.cpp.o
/opt/intel/sw_dev_tools/compilers_and_libraries_2020.2.254/linux/bin/intel64/icpc  -DEULER_VERSION=v0.0.1+HEAD.c18604e -DMT_RUNTIME=MT_RUNTIME_OMP -Del_EXPORTS -I/opt/mlperf_submit/pytorch/third_party/ideep/euler/. -I/opt/mlperf_submit/pytorch/third_party/ideep/euler/include -I/opt/mlperf_submit/pytorch/third_party/ideep/euler/src -I/opt/mlperf_submit/pytorch/third_party/ideep/euler/tests -I/opt/mlperf_submit/pytorch/third_party/ideep/euler/src/common  -fPIC -fvisibility=hidden   -Wall -Werror -Wextra -fopenmp -Wno-sign-compare -Wno-uninitialized -Wno-unused-variable -Wno-unused-parameter -std=c++11 -O2 -DNDEBUG -xHost -qopt-zmm-usage=high -no-inline-max-size -no-inline-max-total-size -wd15335 -o CMakeFiles/el.dir/src/common/el_log.cpp.o -c /opt/mlperf_submit/pytorch/third_party/ideep/euler/src/common/el_log.cpp
[  1%] Building CXX object CMakeFiles/el.dir/src/eld_conv.cpp.o
/opt/intel/sw_dev_tools/compilers_and_libraries_2020.2.254/linux/bin/intel64/icpc  -DEULER_VERSION=v0.0.1+HEAD.c18604e -DMT_RUNTIME=MT_RUNTIME_OMP -Del_EXPORTS -I/opt/mlperf_submit/pytorch/third_party/ideep/euler/. -I/opt/mlperf_submit/pytorch/third_party/ideep/euler/include -I/opt/mlperf_submit/pytorch/third_party/ideep/euler/src -I/opt/mlperf_submit/pytorch/third_party/ideep/euler/tests -I/opt/mlperf_submit/pytorch/third_party/ideep/euler/src/common  -fPIC -fvisibility=hidden   -Wall -Werror -Wextra -fopenmp -Wno-sign-compare -Wno-uninitialized -Wno-unused-variable -Wno-unused-parameter -std=c++11 -O2 -DNDEBUG -xHost -qopt-zmm-usage=high -no-inline-max-size -no-inline-max-total-size -wd15335 -o CMakeFiles/el.dir/src/eld_conv.cpp.o -c /opt/mlperf_submit/pytorch/third_party/ideep/euler/src/eld_conv.cpp
[  1%] Building CXX object CMakeFiles/el.dir/src/elx_conv.cpp.o
/opt/intel/sw_dev_tools/compilers_and_libraries_2020.2.254/linux/bin/intel64/icpc  -DEULER_VERSION=v0.0.1+HEAD.c18604e -DMT_RUNTIME=MT_RUNTIME_OMP -Del_EXPORTS -I/opt/mlperf_submit/pytorch/third_party/ideep/euler/. -I/opt/mlperf_submit/pytorch/third_party/ideep/euler/include -I/opt/mlperf_submit/pytorch/third_party/ideep/euler/src -I/opt/mlperf_submit/pytorch/third_party/ideep/euler/tests -I/opt/mlperf_submit/pytorch/third_party/ideep/euler/src/common  -fPIC -fvisibility=hidden   -Wall -Werror -Wextra -fopenmp -Wno-sign-compare -Wno-uninitialized -Wno-unused-variable -Wno-unused-parameter -std=c++11 -O2 -DNDEBUG -xHost -qopt-zmm-usage=high -no-inline-max-size -no-inline-max-total-size -wd15335 -o CMakeFiles/el.dir/src/elx_conv.cpp.o -c /opt/mlperf_submit/pytorch/third_party/ideep/euler/src/elx_conv.cpp
[  2%] Building CXX object CMakeFiles/el.dir/src/elx_conv_wino_trans_input.cpp.o
/opt/intel/sw_dev_tools/compilers_and_libraries_2020.2.254/linux/bin/intel64/icpc  -DEULER_VERSION=v0.0.1+HEAD.c18604e -DMT_RUNTIME=MT_RUNTIME_OMP -Del_EXPORTS -I/opt/mlperf_submit/pytorch/third_party/ideep/euler/. -I/opt/mlperf_submit/pytorch/third_party/ideep/euler/include -I/opt/mlperf_submit/pytorch/third_party/ideep/euler/src -I/opt/mlperf_submit/pytorch/third_party/ideep/euler/tests -I/opt/mlperf_submit/pytorch/third_party/ideep/euler/src/common  -fPIC -fvisibility=hidden   -Wall -Werror -Wextra -fopenmp -Wno-sign-compare -Wno-uninitialized -Wno-unused-variable -Wno-unused-parameter -std=c++11 -O2 -DNDEBUG -xHost -qopt-zmm-usage=high -no-inline-max-size -no-inline-max-total-size -wd15335 -o CMakeFiles/el.dir/src/elx_conv_wino_trans_input.cpp.o -c /opt/mlperf_submit/pytorch/third_party/ideep/euler/src/elx_conv_wino_trans_input.cpp
------------------------------------------------------------------------------------------------
I omitted the error messages, which are similar to what I pasted above https://github.com/intel/ideep/issues/43#issuecomment-672547982.
------------------------------------------------------------------------------------------------
/opt/mlperf_submit/pytorch/third_party/ideep/euler/src/elx_conv_wino_trans_input.cpp(280): error: incomplete type is not allowed
    const __i<V> vindex = _mm<V>::set_epi32(SET_VINDEX_16(xc->ih * xc->iw));
                 ^
          detected during instantiation of "void euler::elx_conv_wino_trans_input_t<TinputType, InputType, I, A, K, V>::__execute_nchw(TinputType *, InputType *, int, int, int) [with TinputType=short, InputType=float, I=512, A=4, K=3, V=16]" at line 359

compilation aborted for /opt/mlperf_submit/pytorch/third_party/ideep/euler/src/elx_conv_wino_trans_input.cpp (code 4)
CMakeFiles/el.dir/build.make:134: recipe for target 'CMakeFiles/el.dir/src/elx_conv_wino_trans_input.cpp.o' failed
make[2]: *** [CMakeFiles/el.dir/src/elx_conv_wino_trans_input.cpp.o] Error 4
make[2]: Leaving directory '/opt/mlperf_submit/pytorch/third_party/ideep/euler/build'
CMakeFiles/Makefile2:67: recipe for target 'CMakeFiles/el.dir/all' failed
make[1]: *** [CMakeFiles/el.dir/all] Error 2
make[1]: Leaving directory '/opt/mlperf_submit/pytorch/third_party/ideep/euler/build'
Makefile:83: recipe for target 'all' failed
make: *** [all] Error 2

@uyongw

uyongw commented Aug 12, 2020

Can you check your CPU info like this:

$ lscpu
or
$ cat /proc/cpuinfo | grep flags | head -n1

Euler expects at least a Skylake server with the AVX512 instruction set. For MLPerf it requires Cascade Lake (CLX) or Cooper Lake (CPX), which have AVX512_VNNI instruction support. The error you hit is most likely because your build box does not have AVX512 (-xHost tells the compiler to generate instructions according to your host build environment, but Euler has intrinsics that use AVX512).
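
A quick way to check for it (this just greps the kernel-reported CPU flags; avx512f is the foundation set, and avx512_vnni is what the int8 path needs):

$ grep -o 'avx512[a-z_]*' /proc/cpuinfo | sort -u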

If you can, please use CLX to build. Otherwise you can cross-compile using the commands below, but you will not be able to run the result without hardware that has AVX512/VNNI.

$ mkdir -p build && cd build
$ cmake .. -DCMAKE_C_COMPILER=icc -DCMAKE_CXX_COMPILER=icpc -DCMAKE_CXX_FLAGS=-xCore-AVX512 -DWITH_VNNI=ON -DWITH_TEST=OFF
$ make -j

@wangyuyue
Author

wangyuyue commented Aug 13, 2020

My server doesn't have AVX512 support, so do I have an alternative? I mean, can I skip the Euler-related parts and still complete the MLPerf experiment? I hope I don't need to make too many changes. Thanks.

@uyongw

uyongw commented Aug 13, 2020

You can disable Euler and use the MKLDNN engine instead.

During the build of PyTorch/Caffe2, disable EULER like this:

export CAFFE2_USE_EULER=OFF
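
A minimal rebuild sketch under that setting (assuming the fork's setup.py honors the flag as the doc describes; the other variables are carried over from earlier in this thread):

cd /opt/mlperf_submit/pytorch
export CAFFE2_USE_EULER=OFF
export CAFFE2_USE_MKLDNN=ON
python setup.py install 2>&1 | tee build.log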

BTW, which document are you referring to? @wuhuikx may have more insights on that part.

@CaoZhongZ
Contributor

> My server doesn't have AVX512 support, so do I have an alternative? I mean, can I skip the Euler-related parts and still complete the MLPerf experiment? I hope I don't need to make too many changes. Thanks.

AVX512 is needed for MLPerf numbers. Otherwise you can only run the workload. 😊

@wangyuyue
Author

wangyuyue commented Aug 13, 2020

> My server doesn't have AVX512 support, so do I have an alternative? I mean, can I skip the Euler-related parts and still complete the MLPerf experiment? I hope I don't need to make too many changes. Thanks.
>
> AVX512 is needed for MLPerf numbers. Otherwise you can only run the workload. 😊

Hi, I don't need to get exactly the same performance as the original experiment; it is enough to run the whole process.
@uyongw I'm reproducing it using this doc

@uyongw

uyongw commented Aug 13, 2020

@CaoZhongZ Yes. The MLPerf submission is based on INT8 low precision. In 2019 even MKLDNN did not support INT8 on AVX2, so the minimal HW requirement is Skylake with AVX512 to run INT8 with either MKLDNN or Euler. @wangyuyue

@vpirogov
Member

int8 optimizations for systems without Intel AVX512 are available in oneDNN v1.2 and later.
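
If you do try a newer stack, a quick way to see which oneDNN/MKL-DNN version a PyTorch build bundles (the exact output format varies by PyTorch version):

$ python -c "import torch; print(torch.__config__.show())" | grep -i dnn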

@wangyuyue
Author

Thanks for the response. So I need to clone the latest oneDNN as a third_party submodule of PyTorch? Do I need any other changes? @vpirogov
Here is the result of cat /proc/cpuinfo | grep flags | head -n1 on my host:

flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx lahf_lm pti ssbd ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpid xsaveopt dtherm ida arat pln pts md_clear flush_l1d

@uyongw

uyongw commented Aug 15, 2020

Unfortunately this will not work without quite an effort in back-porting. @wangyuyue
