question about which version to use #43

Closed · wangyuyue opened this issue Aug 10, 2020 · 25 comments

Comments

@wangyuyue

Hi, I'm trying to reproduce MLPerf/inference_results_v0.5, and I'm running into trouble with the ideep API.
For example, in auto in_format = dataOrder_ == "NCHW" ? ideep::format::nchw : ideep::format::nhwc;, I think ideep::format has since changed to format_tag, and reinit has changed to init, right?
But I don't know all of the correspondences. Should I use an older version of ideep as a third_party submodule of PyTorch and rebuild PyTorch from source? I'd really appreciate any help.

@pinzhenx

pinzhenx commented Aug 10, 2020

Hi @wangyuyue

To reproduce the MLPerf results, you may refer to this doc. The steps below will check out the branch that chooses the right ideep commit for you:

cd pytorch
git fetch origin pull/25235/head:mlperf
git checkout mlperf
git submodule sync && git submodule update --init --recursive
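
If you want to sanity-check which ideep commit the submodule landed on (a quick check, assuming the standard PyTorch third_party layout; the expected hash depends on the PR):

$ cd third_party/ideep && git log --oneline -1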

@wangyuyue
Author

Hi, thanks for the response. I notice that I should set MKLROOT. Should I install MKL in advance and set MKLROOT to /opt/intel/mkl (the default install location is /opt/intel)?
Also, is the Intel C++ compiler a necessity, or can I use gcc for all of the compilation?
For these two lines, what should I set if I use libgomp.so instead of libiomp5.so?
export LD_PRELOAD=/opt/intel/compilers_and_libraries_2018.1.163/linux/compiler/lib/intel64/libiomp5.so
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/opt/intel/compilers_and_libraries_2018.1.163/linux/compiler/lib/intel64

@CaoZhongZ
Contributor

Yes, you should install MKL. We do not use the Intel C++ compiler; or have we used ICX (Intel's next-gen compiler)? I'm not sure. You could set libgomp.so instead of libiomp5.so, but that will have a performance impact, because all of the configurations target Intel OpenMP. (Intel OpenMP is an open-source project under the LLVM umbrella.)
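
If you do go with GNU OpenMP, a minimal substitution for those two lines might look like this (the libgomp path is an assumption; adjust it to your distro and gcc version):

# Use GNU OpenMP instead of Intel OpenMP (expect some performance loss)
export LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libgomp.so.1
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/lib/x86_64-linux-gnu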

@wangyuyue
Author

wangyuyue commented Aug 11, 2020

export BOOST_ROOT=<boost_install_folder>/include
cd /path/to/mlperf_submit/pytorch
cd third_party/ideep/euler/
mkdir build; cd build
cmake3 .. -DCMAKE_C_COMPILER=icc -DCMAKE_CXX_COMPILER=icpc -DWITH_VNNI=2
make
I think the Intel C++ compiler is first used here. I changed icc to gcc, icpc to g++, and cmake3 to cmake, and set BOOST_ROOT=/opt/boost_1_70_0/.
Then I get the error message below:
/opt/mlperf_submit/pytorch/third_party/ideep/euler/src/eld_conv.cpp:270:68: error: ‘new’ of type ‘euler::elx_int8_conv_wino_t<euler::ConvTypes<unsigned char, float, signed char, float>, euler::ConvImplTypes<unsigned char, signed char, float, float, float>, 6, 3, 16, 512>’ with extended alignment 64 [-Werror=aligned-new=]
  xc = new elx_int8_conv_wino_t<U, T, 6, 3, 16, ISA_AVX512>(*this);
Thanks for your kind help!

@CaoZhongZ
Contributor

Oh, my bad. Euler must be compiled with ICC, or you'll get essentially no performance at all. 🥳

@wangyuyue
Author

I'm confused... So what does this project use: Euler, MKL, or MKL-DNN? I notice that when generating the int8 model and building PyTorch with Deep-learning-math-kernel, the doc says export MKLROOT=/home/huiwu1/src/mkl2019_3, but for the roofline test and "Build Pytorch Backend" it says export MKLROOT=/path_to_mkl_dnn/mkl2019_3. So what does MKLROOT refer to here, MKL or MKL-DNN? I think they are different things.

@wangyuyue
Author

wangyuyue commented Aug 12, 2020

Now I have installed icc and icpc.

First I ran cmake .. -DCMAKE_C_COMPILER=icc -DCMAKE_CXX_COMPILER=icpc -DWITH_VNNI=2

-- Euler version: v0.0.1+HEAD.c18604e
-- MT_RUNTIME: omp
-- No preference for use of exported gflags CMake configuration set, and no hints for include/library directories provided. Defaulting to preferring an installed/exported gflags CMake configuration if available.
-- Found installed version of gflags: /usr/lib/x86_64-linux-gnu/cmake/gflags
-- Detected gflags version: 2.2.1
-- Found gflags include dirs: /usr/include
-- Found gflags libraries: gflags_shared
-- Found gflags namespace: google
-- Configuring done
-- Generating done
-- Build files have been written to: /opt/mlperf_submit/pytorch/third_party/ideep/euler/build

Then I ran make and hit errors. Below is part of the error output.

/opt/mlperf_submit/pytorch/third_party/ideep/euler/src/elx_conv_wino_trans_input.cpp(101): error: class "_mm<16>" has no member "stream_ps"
          _mm<V>::stream_ps(&md6(atinput6, _hA, _wA, _I3, _I2, _T, 0),
                  ^
          detected during instantiation of "void euler::elx_conv_wino_trans_input_t<TinputType, InputType, I, A, K, V>::__execute_post(TinputType *, euler::elx_conv_wino_trans_input_t<TinputType, InputType, I, A, K, V>::op_type *, int, int, int, int) [with TinputType=float, InputType=float, I=512, A=4, K=3, V=16]" at line 53

/opt/mlperf_submit/pytorch/third_party/ideep/euler/src/elx_conv_wino_trans_input.cpp(102): error: expression must be a pointer to a complete object type
                         *((__m<V> *)&md3(at, _hA, _wA, 0)));
                          ^
          detected during instantiation of "void euler::elx_conv_wino_trans_input_t<TinputType, InputType, I, A, K, V>::__execute_post(TinputType *, euler::elx_conv_wino_trans_input_t<TinputType, InputType, I, A, K, V>::op_type *, int, int, int, int) [with TinputType=float, InputType=float, I=512, A=4, K=3, V=16]" at line 53

/opt/mlperf_submit/pytorch/third_party/ideep/euler/src/elx_conv_wino_trans_input.cpp(107): error: class "_mm<16>" has no member "store_ps"
          _mm<V>::store_ps(&md6(atinput6, _hA, _wA, _I3, _I2, _T, 0),
                  ^
          detected during instantiation of "void euler::elx_conv_wino_trans_input_t<TinputType, InputType, I, A, K, V>::__execute_post(TinputType *, euler::elx_conv_wino_trans_input_t<TinputType, InputType, I, A, K, V>::op_type *, int, int, int, int) [with TinputType=float, InputType=float, I=512, A=4, K=3, V=16]" at line 53

/opt/mlperf_submit/pytorch/third_party/ideep/euler/src/elx_conv_wino_trans_input.cpp(108): error: expression must be a pointer to a complete object type
                        *((__m<V> *)&md3(at, _hA, _wA, 0)));
                         ^
          detected during instantiation of "void euler::elx_conv_wino_trans_input_t<TinputType, InputType, I, A, K, V>::__execute_post(TinputType *, euler::elx_conv_wino_trans_input_t<TinputType, InputType, I, A, K, V>::op_type *, int, int, int, int) [with TinputType=float, InputType=float, I=512, A=4, K=3, V=16]" at line 53

/opt/mlperf_submit/pytorch/third_party/ideep/euler/src/elx_conv_wino_trans_input.cpp(116): error: class "_mm<16>" has no member "cvt_f32_b16"
          auto fp16v = _mm<V>::cvt_f32_b16(*(__i<V> *)&md3(at, _hA, _wA, 0));
                               ^
          detected during instantiation of "void euler::elx_conv_wino_trans_input_t<TinputType, InputType, I, A, K, V>::__execute_post(TinputType *, euler::elx_conv_wino_trans_input_t<TinputType, InputType, I, A, K, V>::op_type *, int, int, int, int) [with TinputType=float, InputType=float, I=512, A=4, K=3, V=16]" at line 53

/opt/mlperf_submit/pytorch/third_party/ideep/euler/src/elx_conv_wino_trans_input.cpp(116): error: expression must be a pointer to a complete object type
          auto fp16v = _mm<V>::cvt_f32_b16(*(__i<V> *)&md3(at, _hA, _wA, 0));
                                            ^
          detected during instantiation of "void euler::elx_conv_wino_trans_input_t<TinputType, InputType, I, A, K, V>::__execute_post(TinputType *, euler::elx_conv_wino_trans_input_t<TinputType, InputType, I, A, K, V>::op_type *, int, int, int, int) [with TinputType=float, InputType=float, I=512, A=4, K=3, V=16]" at line 53

@CaoZhongZ
Contributor

@uyongw

@CaoZhongZ
Contributor

> I'm confused... So what does this project use: Euler, MKL, or MKL-DNN? I notice that when generating the int8 model and building PyTorch with Deep-learning-math-kernel, the doc says export MKLROOT=/home/huiwu1/src/mkl2019_3, but for the roofline test and "Build Pytorch Backend" it says export MKLROOT=/path_to_mkl_dnn/mkl2019_3. So what does MKLROOT refer to here, MKL or MKL-DNN? I think they are different things.

DNNL/MKL is the major acceleration library, with production quality. Euler was a Winograd kernel library. For competitive performance, as long as the code was published for scrutiny, we could do whatever reasonable optimizations we wanted. So it is more complicated than an ordinary environment. 😊

@uyongw

uyongw commented Aug 12, 2020

@wangyuyue For the Euler build error, can you paste the contents of the generated build commands after cmake, as below?

$ head -n10 build/compile_commands.json

@wangyuyue
Author

wangyuyue commented Aug 12, 2020

> @wangyuyue For the Euler build error, can you paste the contents of the generated build commands after cmake, as below?
>
> $ head -n10 build/compile_commands.json
Here is the content:

[
{
  "directory": "/opt/mlperf_submit/pytorch/third_party/ideep/euler/build",
  "command": "/opt/intel/sw_dev_tools/bin/icpc  -DEULER_VERSION=v0.0.1+HEAD.c18604e -DMT_RUNTIME=MT_RUNTIME_OMP -DWITH_VNNI -Del_EXPORTS -I/opt/mlperf_submit/pytorch/third_party/ideep/euler/. -I/opt/mlperf_submit/pytorch/third_party/ideep/euler/include -I/opt/mlperf_submit/pytorch/third_party/ideep/euler/src -I/opt/mlperf_submit/pytorch/third_party/ideep/euler/tests -I/opt/mlperf_submit/pytorch/third_party/ideep/euler/src/common  -fPIC -fvisibility=hidden   -Wall -Werror -Wextra -fopenmp -Wno-sign-compare -Wno-uninitialized -Wno-unused-variable -Wno-unused-parameter -std=c++11 -O2 -DNDEBUG -xHost -qopt-zmm-usage=high -no-inline-max-size -no-inline-max-total-size -wd15335 -o CMakeFiles/el.dir/src/common/el_log.cpp.o -c /opt/mlperf_submit/pytorch/third_party/ideep/euler/src/common/el_log.cpp",
  "file": "/opt/mlperf_submit/pytorch/third_party/ideep/euler/src/common/el_log.cpp"
},
{
  "directory": "/opt/mlperf_submit/pytorch/third_party/ideep/euler/build",
  "command": "/opt/intel/sw_dev_tools/bin/icpc  -DEULER_VERSION=v0.0.1+HEAD.c18604e -DMT_RUNTIME=MT_RUNTIME_OMP -DWITH_VNNI -Del_EXPORTS -I/opt/mlperf_submit/pytorch/third_party/ideep/euler/. -I/opt/mlperf_submit/pytorch/third_party/ideep/euler/include -I/opt/mlperf_submit/pytorch/third_party/ideep/euler/src -I/opt/mlperf_submit/pytorch/third_party/ideep/euler/tests -I/opt/mlperf_submit/pytorch/third_party/ideep/euler/src/common  -fPIC -fvisibility=hidden   -Wall -Werror -Wextra -fopenmp -Wno-sign-compare -Wno-uninitialized -Wno-unused-variable -Wno-unused-parameter -std=c++11 -O2 -DNDEBUG -xHost -qopt-zmm-usage=high -no-inline-max-size -no-inline-max-total-size -wd15335 -o CMakeFiles/el.dir/src/eld_conv.cpp.o -c /opt/mlperf_submit/pytorch/third_party/ideep/euler/src/eld_conv.cpp",
  "file": "/opt/mlperf_submit/pytorch/third_party/ideep/euler/src/eld_conv.cpp"

@uyongw

uyongw commented Aug 12, 2020

Can you check your ICC version?
$ /opt/intel/sw_dev_tools/bin/icpc --version

@wangyuyue
Author

wangyuyue commented Aug 12, 2020

$ /opt/intel/sw_dev_tools/bin/icpc --version

icpc (ICC) 19.1.2.254 20200623
Copyright (C) 1985-2020 Intel Corporation.  All rights reserved.

I downloaded the installer from Intel System Studio, and I think it's the latest version. If I need to install an older version, can you point me to the download page? Thanks.

@wangyuyue
Author

wangyuyue commented Aug 12, 2020

Here are some environment variables I set:

export MKLROOT=/opt/intel/mkl

export USE_OPENMP=ON

export CAFFE2_USE_MKLDNN=ON

export MKLDNN_USE_CBLAS=ON

export LD_PRELOAD=/usr/lib/gcc/x86_64-linux-gnu/7/libgomp.so

export OMP_NUM_THREADS=8 KMP_AFFINITY=proclist=[0-7],granularity=thread,explicit

export PYTHONPATH=/opt/mlperf_submit/pytorch/build

export BOOST_ROOT=/opt/boost_1_70_0

export PATH=$PATH:/opt/intel/sw_dev_tools/bin

Hope this is helpful for debugging.
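
Side note: KMP_AFFINITY only takes effect under Intel OpenMP; since libgomp is being preloaded here, the GNU equivalent would be something like this (a sketch that pins the eight threads to cores 0-7):

# GNU OpenMP affinity settings, replacing KMP_AFFINITY
export OMP_NUM_THREADS=8
export GOMP_CPU_AFFINITY=0-7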

@uyongw

uyongw commented Aug 12, 2020

Can you run a verbose make?

$ cd build
$ make VERBOSE=1

BTW, for ICC, normally you run the setup procedure like this (before the build):
$ source /opt/intel/sw_dev_tools/bin/compilervars.sh -arch intel64 -platform linux

@wangyuyue
Author

$ source /opt/intel/sw_dev_tools/bin/compilervars.sh -arch intel64 -platform linux
$ cmake .. -DCMAKE_C_COMPILER=icc -DCMAKE_CXX_COMPILER=icpc -DWITH_VNNI=2

-- Euler version: v0.0.1+HEAD.c18604e
-- MT_RUNTIME: omp
-- No preference for use of exported gflags CMake configuration set, and no hints for include/library directories provided. Defaulting to preferring an installed/exported gflags CMake configuration if available.
-- Found installed version of gflags: /usr/lib/x86_64-linux-gnu/cmake/gflags
-- Detected gflags version: 2.2.1
-- Found gflags include dirs: /usr/include
-- Found gflags libraries: gflags_shared
-- Found gflags namespace: google
-- Configuring done
You have changed variables that require your cache to be deleted.
Configure will be re-run and you may have to reset some variables.
The following variables have changed:
CMAKE_CXX_COMPILER= icpc
CMAKE_C_COMPILER= icc

-- The CXX compiler identification is Intel 19.1.2.20200623
-- The C compiler identification is Intel 19.1.2.20200623
-- Check for working CXX compiler: /opt/intel/sw_dev_tools/compilers_and_libraries_2020.2.254/linux/bin/intel64/icpc
-- Check for working CXX compiler: /opt/intel/sw_dev_tools/compilers_and_libraries_2020.2.254/linux/bin/intel64/icpc -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Check for working C compiler: /opt/intel/sw_dev_tools/compilers_and_libraries_2020.2.254/linux/bin/intel64/icc
-- Check for working C compiler: /opt/intel/sw_dev_tools/compilers_and_libraries_2020.2.254/linux/bin/intel64/icc -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Detecting C compile features
-- Detecting C compile features - done
-- Found Git: /usr/bin/git (found version "2.17.1")
-- Euler version: v0.0.1+HEAD.c18604e
-- MT_RUNTIME: omp
-- No preference for use of exported gflags CMake configuration set, and no hints for include/library directories provided. Defaulting to preferring an installed/exported gflags CMake configuration if available.
-- Found installed version of gflags: /usr/lib/x86_64-linux-gnu/cmake/gflags
-- Detected gflags version: 2.2.1
-- Found Gflags: /usr/include
-- Found gflags include dirs: /usr/include
-- Found gflags libraries: gflags_shared
-- Found gflags namespace: google
-- Configuring done
-- Generating done
-- Build files have been written to: /opt/mlperf_submit/pytorch/third_party/ideep/euler/build

make VERBOSE=1

/usr/bin/cmake -H/opt/mlperf_submit/pytorch/third_party/ideep/euler -B/opt/mlperf_submit/pytorch/third_party/ideep/euler/build --check-build-system CMakeFiles/Makefile.cmake 0
/usr/bin/cmake -E cmake_progress_start /opt/mlperf_submit/pytorch/third_party/ideep/euler/build/CMakeFiles /opt/mlperf_submit/pytorch/third_party/ideep/euler/build/CMakeFiles/progress.marks
make -f CMakeFiles/Makefile2 all
make[1]: Entering directory '/opt/mlperf_submit/pytorch/third_party/ideep/euler/build'
make -f CMakeFiles/el.dir/build.make CMakeFiles/el.dir/depend
make[2]: Entering directory '/opt/mlperf_submit/pytorch/third_party/ideep/euler/build'
cd /opt/mlperf_submit/pytorch/third_party/ideep/euler/build && /usr/bin/cmake -E cmake_depends "Unix Makefiles" /opt/mlperf_submit/pytorch/third_party/ideep/euler /opt/mlperf_submit/pytorch/third_party/ideep/euler /opt/mlperf_submit/pytorch/third_party/ideep/euler/build /opt/mlperf_submit/pytorch/third_party/ideep/euler/build /opt/mlperf_submit/pytorch/third_party/ideep/euler/build/CMakeFiles/el.dir/DependInfo.cmake --color=
Dependee "/opt/mlperf_submit/pytorch/third_party/ideep/euler/build/CMakeFiles/el.dir/DependInfo.cmake" is newer than depender "/opt/mlperf_submit/pytorch/third_party/ideep/euler/build/CMakeFiles/el.dir/depend.internal".
Dependee "/opt/mlperf_submit/pytorch/third_party/ideep/euler/build/CMakeFiles/CMakeDirectoryInformation.cmake" is newer than depender "/opt/mlperf_submit/pytorch/third_party/ideep/euler/build/CMakeFiles/el.dir/depend.internal".
Scanning dependencies of target el
make[2]: Leaving directory '/opt/mlperf_submit/pytorch/third_party/ideep/euler/build'
make -f CMakeFiles/el.dir/build.make CMakeFiles/el.dir/build
make[2]: Entering directory '/opt/mlperf_submit/pytorch/third_party/ideep/euler/build'
[  0%] Building CXX object CMakeFiles/el.dir/src/common/el_log.cpp.o
/opt/intel/sw_dev_tools/compilers_and_libraries_2020.2.254/linux/bin/intel64/icpc  -DEULER_VERSION=v0.0.1+HEAD.c18604e -DMT_RUNTIME=MT_RUNTIME_OMP -Del_EXPORTS -I/opt/mlperf_submit/pytorch/third_party/ideep/euler/. -I/opt/mlperf_submit/pytorch/third_party/ideep/euler/include -I/opt/mlperf_submit/pytorch/third_party/ideep/euler/src -I/opt/mlperf_submit/pytorch/third_party/ideep/euler/tests -I/opt/mlperf_submit/pytorch/third_party/ideep/euler/src/common  -fPIC -fvisibility=hidden   -Wall -Werror -Wextra -fopenmp -Wno-sign-compare -Wno-uninitialized -Wno-unused-variable -Wno-unused-parameter -std=c++11 -O2 -DNDEBUG -xHost -qopt-zmm-usage=high -no-inline-max-size -no-inline-max-total-size -wd15335 -o CMakeFiles/el.dir/src/common/el_log.cpp.o -c /opt/mlperf_submit/pytorch/third_party/ideep/euler/src/common/el_log.cpp
[  1%] Building CXX object CMakeFiles/el.dir/src/eld_conv.cpp.o
/opt/intel/sw_dev_tools/compilers_and_libraries_2020.2.254/linux/bin/intel64/icpc  -DEULER_VERSION=v0.0.1+HEAD.c18604e -DMT_RUNTIME=MT_RUNTIME_OMP -Del_EXPORTS -I/opt/mlperf_submit/pytorch/third_party/ideep/euler/. -I/opt/mlperf_submit/pytorch/third_party/ideep/euler/include -I/opt/mlperf_submit/pytorch/third_party/ideep/euler/src -I/opt/mlperf_submit/pytorch/third_party/ideep/euler/tests -I/opt/mlperf_submit/pytorch/third_party/ideep/euler/src/common  -fPIC -fvisibility=hidden   -Wall -Werror -Wextra -fopenmp -Wno-sign-compare -Wno-uninitialized -Wno-unused-variable -Wno-unused-parameter -std=c++11 -O2 -DNDEBUG -xHost -qopt-zmm-usage=high -no-inline-max-size -no-inline-max-total-size -wd15335 -o CMakeFiles/el.dir/src/eld_conv.cpp.o -c /opt/mlperf_submit/pytorch/third_party/ideep/euler/src/eld_conv.cpp
[  1%] Building CXX object CMakeFiles/el.dir/src/elx_conv.cpp.o
/opt/intel/sw_dev_tools/compilers_and_libraries_2020.2.254/linux/bin/intel64/icpc  -DEULER_VERSION=v0.0.1+HEAD.c18604e -DMT_RUNTIME=MT_RUNTIME_OMP -Del_EXPORTS -I/opt/mlperf_submit/pytorch/third_party/ideep/euler/. -I/opt/mlperf_submit/pytorch/third_party/ideep/euler/include -I/opt/mlperf_submit/pytorch/third_party/ideep/euler/src -I/opt/mlperf_submit/pytorch/third_party/ideep/euler/tests -I/opt/mlperf_submit/pytorch/third_party/ideep/euler/src/common  -fPIC -fvisibility=hidden   -Wall -Werror -Wextra -fopenmp -Wno-sign-compare -Wno-uninitialized -Wno-unused-variable -Wno-unused-parameter -std=c++11 -O2 -DNDEBUG -xHost -qopt-zmm-usage=high -no-inline-max-size -no-inline-max-total-size -wd15335 -o CMakeFiles/el.dir/src/elx_conv.cpp.o -c /opt/mlperf_submit/pytorch/third_party/ideep/euler/src/elx_conv.cpp
[  2%] Building CXX object CMakeFiles/el.dir/src/elx_conv_wino_trans_input.cpp.o
/opt/intel/sw_dev_tools/compilers_and_libraries_2020.2.254/linux/bin/intel64/icpc  -DEULER_VERSION=v0.0.1+HEAD.c18604e -DMT_RUNTIME=MT_RUNTIME_OMP -Del_EXPORTS -I/opt/mlperf_submit/pytorch/third_party/ideep/euler/. -I/opt/mlperf_submit/pytorch/third_party/ideep/euler/include -I/opt/mlperf_submit/pytorch/third_party/ideep/euler/src -I/opt/mlperf_submit/pytorch/third_party/ideep/euler/tests -I/opt/mlperf_submit/pytorch/third_party/ideep/euler/src/common  -fPIC -fvisibility=hidden   -Wall -Werror -Wextra -fopenmp -Wno-sign-compare -Wno-uninitialized -Wno-unused-variable -Wno-unused-parameter -std=c++11 -O2 -DNDEBUG -xHost -qopt-zmm-usage=high -no-inline-max-size -no-inline-max-total-size -wd15335 -o CMakeFiles/el.dir/src/elx_conv_wino_trans_input.cpp.o -c /opt/mlperf_submit/pytorch/third_party/ideep/euler/src/elx_conv_wino_trans_input.cpp
------------------------------------------------------------------------------------------------
I omitted the error messages, which are similar to what I pasted above https://github.com/intel/ideep/issues/43#issuecomment-672547982.
------------------------------------------------------------------------------------------------
/opt/mlperf_submit/pytorch/third_party/ideep/euler/src/elx_conv_wino_trans_input.cpp(280): error: incomplete type is not allowed
    const __i<V> vindex = _mm<V>::set_epi32(SET_VINDEX_16(xc->ih * xc->iw));
                 ^
          detected during instantiation of "void euler::elx_conv_wino_trans_input_t<TinputType, InputType, I, A, K, V>::__execute_nchw(TinputType *, InputType *, int, int, int) [with TinputType=short, InputType=float, I=512, A=4, K=3, V=16]" at line 359

compilation aborted for /opt/mlperf_submit/pytorch/third_party/ideep/euler/src/elx_conv_wino_trans_input.cpp (code 4)
CMakeFiles/el.dir/build.make:134: recipe for target 'CMakeFiles/el.dir/src/elx_conv_wino_trans_input.cpp.o' failed
make[2]: *** [CMakeFiles/el.dir/src/elx_conv_wino_trans_input.cpp.o] Error 4
make[2]: Leaving directory '/opt/mlperf_submit/pytorch/third_party/ideep/euler/build'
CMakeFiles/Makefile2:67: recipe for target 'CMakeFiles/el.dir/all' failed
make[1]: *** [CMakeFiles/el.dir/all] Error 2
make[1]: Leaving directory '/opt/mlperf_submit/pytorch/third_party/ideep/euler/build'
Makefile:83: recipe for target 'all' failed
make: *** [all] Error 2

@uyongw

uyongw commented Aug 12, 2020

Can you check your CPU info like this:

$ lscpu
or
$ cat /proc/cpuinfo | grep flags | head -n1

Euler expects at least a Skylake server with the AVX512 instruction set. For MLPerf it requires Cascade Lake (CLX) or Cooper Lake (CPX), which have AVX512_VNNI instruction support. The error you hit is most likely because your build box does not have AVX512 (-xHost tells the compiler to generate instructions according to your host build environment, but Euler has intrinsics that use AVX512).
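
A quick way to check for it (this just greps the kernel-reported CPU flags; avx512f is the foundation set, and avx512_vnni is what the int8 path needs):

$ grep -o 'avx512[a-z_]*' /proc/cpuinfo | sort -u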

If you can, please use CLX to build. Otherwise you can cross-compile using the commands below, but you will not be able to run the result without hardware that has AVX512/VNNI.

$ mkdir -p build && cd build
$ cmake .. -DCMAKE_C_COMPILER=icc -DCMAKE_CXX_COMPILER=icpc -DCMAKE_CXX_FLAGS=-xCore-AVX512 -DWITH_VNNI=ON -DWITH_TEST=OFF
$ make -j

@wangyuyue
Author

wangyuyue commented Aug 13, 2020

My server doesn't have AVX512 support, so do I have an alternative? I mean, can I skip the Euler-related parts and still complete the MLPerf experiment? I hope I don't need to make too many changes. Thanks.

@uyongw

uyongw commented Aug 13, 2020

You can disable Euler and use the MKLDNN engine instead.

During the build of PyTorch/Caffe2, disable EULER like this:

export CAFFE2_USE_EULER=OFF
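
A minimal rebuild sketch under that setting (assuming the fork's setup.py honors the flag as the doc describes; the other variables are carried over from earlier in this thread):

cd /opt/mlperf_submit/pytorch
export CAFFE2_USE_EULER=OFF
export CAFFE2_USE_MKLDNN=ON
python setup.py install 2>&1 | tee build.log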

BTW, which document are you referring to? @wuhuikx may have more insights on that part.

@CaoZhongZ
Contributor

> My server doesn't have AVX512 support, so do I have an alternative? I mean, can I skip the Euler-related parts and still complete the MLPerf experiment? I hope I don't need to make too many changes. Thanks.

AVX512 is needed for MLPerf numbers. Otherwise you can only run the workload. 😊

@wangyuyue
Author

wangyuyue commented Aug 13, 2020

> My server doesn't have AVX512 support, so do I have an alternative? I mean, can I skip the Euler-related parts and still complete the MLPerf experiment? I hope I don't need to make too many changes. Thanks.
>
> AVX512 is needed for MLPerf numbers. Otherwise you can only run the workload. 😊

Hi, I don't need to get exactly the same performance as the original experiment; it is enough to run the whole process.
@uyongw I'm reproducing it using this doc

@uyongw

uyongw commented Aug 13, 2020

@CaoZhongZ Yes. The MLPerf submission is based on INT8 low precision. In 2019 even MKLDNN did not support INT8 on AVX2, so the minimal HW requirement is Skylake with AVX512 to run INT8 with either MKLDNN or Euler. @wangyuyue

@vpirogov
Member

int8 optimizations for systems without Intel AVX512 are available in oneDNN v1.2 and later.
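
If you do try a newer stack, a quick way to see which oneDNN/MKL-DNN version a PyTorch build bundles (the exact output format varies by PyTorch version):

$ python -c "import torch; print(torch.__config__.show())" | grep -i dnn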

@wangyuyue
Author

Thanks for the response. So I need to clone the latest oneDNN as a third_party submodule of PyTorch? Do I need any other changes? @vpirogov
Here is the result of cat /proc/cpuinfo | grep flags | head -n1 on my host:

flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx lahf_lm pti ssbd ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpid xsaveopt dtherm ida arat pln pts md_clear flush_l1d

@uyongw

uyongw commented Aug 15, 2020

Unfortunately this will not work without quite an effort in back-porting. @wangyuyue
