dealii failure with oneapi@2022.2.0 #186

Closed
balay opened this issue Oct 13, 2022 · 11 comments

Comments

@balay
Member

balay commented Oct 13, 2022

https://gitlab.com/xsdk-project/spack-xsdk/-/jobs/3165942119

The build just hangs [I see clang processes running forever]

Notice: Duration: 738 minutes 30 seconds

I tried:

./bin/spack install -j12 xsdk@0.8.0%oneapi@2022.2.0 ^intel-oneapi-mpi ^intel-oneapi-mkl ^intel-tbb%gcc@9.4.0 ^netlib-scalapack%gcc@9.4.0 ^dealii cxxflags=-O0

and get the following (but see no build errors in the output):

ninja: build stopped: subcommand failed.

So I'm not sure why this is failing.

spack-build-out.txt

@bangerth

Unrelated, but the warning

overriding '-march=skylake' with '-march=skylake'

is a bit funny...

@bangerth

The error I see is this one:

icpc: error #10106: Fatal error in /opt/intel/oneapi/compiler/2022.2.0/linux/bin/intel64/../../bin/intel64/mcpcom, terminated by kill signal
compilation aborted for /home/xsdk/spack.x/spack-stage/spack-stage-dealii-9.4.0-wdtlnh3ssfy672gtwd4uondfgt7glanx/spack-src/source/numerics/matrix_creator_inst2.cc (code 1)

I don't know for sure, but this may be a compiler (invoked many times in parallel by ninja) running out of memory.

Is the error reproducible?
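
One way to test the out-of-memory hypothesis is to look in the kernel log for traces of the OOM killer right after a failed build. The sketch below is only an illustration: `oom_lines` is a hypothetical helper name, and the patterns are an assumption about how typical Linux kernels word these messages, so they may need adjusting for a given system.

```shell
# Hypothetical helper: filter kernel-log text on stdin for lines that
# usually accompany an OOM kill. The patterns are assumptions based on
# common Linux kernel wording, not an exhaustive list.
oom_lines() {
    grep -i -E 'out of memory|oom-killer|killed process'
}

# Typical use on the build machine (reading the kernel log may require root):
#   dmesg | oom_lines
```

If this prints a line naming the compiler process (here mcpcom) at about the time of the failure, memory exhaustion is the likely cause.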

@balay
Member Author

balay commented Oct 14, 2022

Ah thanks! Somehow I missed that message. I switched from -j12 to -j6 - and the build progresses. But I get errors with intel-tbb.

It might be related to the fact that I used gcc for intel-tbb [as its build fails with intel-oneapi compilers - and it's picking up intel-tbb@2020.3 for the gcc build - instead of the latest 2021.5.0].

./bin/spack install -j6 xsdk@0.8.0%oneapi@2022.2.0 ^intel-oneapi-mpi ^intel-oneapi-mkl ^intel-tbb%gcc@9.4.0 ^netlib-scalapack%gcc@9.4.0 ^dealii cxxflags=-O0

spack-build-out.txt

For now I'll switch dealii build to use ^dealii%gcc@9.4.0

@balay
Member Author

balay commented Oct 14, 2022

Ah - intel-tbb defaults to 2021.5.0 in spack. And I see 2021.7.0 is now available. Ok, that does build with oneapi@2022.2.0.

Will see if dealii build goes through with it.

@balay
Member Author

balay commented Oct 14, 2022

Ah - intel-tbb defaults to 2021.5.0 in spack. And I see 2021.7.0 is now available.

Ok, that does build with oneapi@2022.2.0, and the dealii build goes through with it:

./bin/spack install -j6 xsdk@0.8.0%oneapi@2022.2.0 ^intel-oneapi-mpi ^intel-oneapi-mkl ^intel-tbb@2021.7.0 ^netlib-scalapack%gcc@9.4.0 ^dealii cxxflags=-O0

@balay
Member Author

balay commented Oct 18, 2022

> The build just hangs [I see clang processes running forever]

This hang persists. Perhaps it's a compiler bug. I'm not sure. I'll switch this CI to build dealii with gcc.

@marcfehling

We frequently ran into the same issue in our CI with the classic Intel compiler icpc, and suspected that the worker simply ran out of memory. Reducing the number of jobs helped, see dealii/dealii#15306.
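
As a rough illustration of "reducing the number of jobs", the job count can be capped by available memory as well as by core count. This is only a sketch under stated assumptions: `safe_jobs` is a hypothetical helper, and the per-job memory budget (4 GiB in the usage line) is a guess that would need tuning for heavy translation units such as the deal.II instantiation files.

```shell
# Hypothetical helper: choose a parallel job count limited by memory.
# Usage: safe_jobs <available_mem_kib> <cores> <per_job_gib>
# per_job_gib is an assumed budget per compiler process, not a measured value.
safe_jobs() {
    mem_kib=$1
    cores=$2
    per_job_gib=$3
    # Number of jobs that fit into memory at the assumed per-job budget.
    by_mem=$(( mem_kib / (per_job_gib * 1024 * 1024) ))
    if [ "$by_mem" -lt 1 ]; then
        by_mem=1    # always allow at least one job
    fi
    # Use the smaller of the memory-based and core-based limits.
    if [ "$by_mem" -lt "$cores" ]; then
        echo "$by_mem"
    else
        echo "$cores"
    fi
}

# Example on Linux, reading MemAvailable (in KiB) from /proc/meminfo:
#   mem_kib=$(awk '/^MemAvailable:/ {print $2}' /proc/meminfo)
#   echo "suggested -j value: $(safe_jobs "$mem_kib" "$(nproc)" 4)"
```

The resulting number can then be passed as the `-j` value of the build command.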

Since the classic compilers will be retired this year, we switched to the new icpx compiler that should substitute the classic one. We did not experience the 'hanging' issue with it, see dealii/dealii#15308.

@balos1 closed this as completed Aug 3, 2023

@balay
Member Author

balay commented Aug 31, 2023

Rechecking this build for 1.0.0 - I still see this hang. I understand this issue might not be fixable - but adding in this info here.

[For 0.8.0 - I had to disable dealii for this build. I might have to do the same for 1.0.0.]

@marcfehling

> I still see this hang.

Do you still see it with the new icpx compiler?

@balay
Member Author

balay commented Aug 31, 2023

I don't yet have an xsdk build set up with the icx compilers.

I fear there will be many breakages with it. Well, the basic Intel MPI wrappers don't work well with it [at least with PETSc].

It's something I need to check on [with the latest oneapi compilers].

@balay
Member Author

balay commented Aug 31, 2023

Actually, this hanging dealii build is with the icx compilers. [spack hides some of this info - so I'm not sure exactly how to verify.] Looking at the PETSc build here, I see:

Executing: /opt/intel/oneapi/mpi/2021.7.0/bin/mpiicpc -show
stdout: /home/xsdk/spack/lib/spack/env/oneapi/icpx -I"/opt/intel/oneapi/mpi/2021.7.0/include" -L"/opt/intel/oneapi/mpi/2021.7.0/lib/release" -L"/opt/intel/oneapi/mpi/2021.7.0/lib" -Xlinker --enable-new-dtags -Xlinker -rpath -Xlinker "/opt/intel/oneapi/mpi/2021.7.0/lib/release" -Xlinker -rpath -Xlinker "/opt/intel/oneapi/mpi/2021.7.0/lib" -lmpicxx -lmpifort -lmpi -ldl -lrt -lpthread
Executing: /opt/intel/oneapi/mpi/2021.7.0/bin/mpiicpc --version
stdout:
Intel(R) oneAPI DPC++/C++ Compiler 2022.2.0 (2022.2.0.20220730)
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: /opt/intel/oneapi/compiler/2022.2.0/linux/bin-llvm
Configuration file: /opt/intel/oneapi/compiler/2022.2.0/linux/bin/icpx.cfg
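
The same check can be scripted: the first word of the `-show` output above is the compiler that the MPI wrapper actually invokes. A minimal sketch follows, where `backend_compiler` is a hypothetical helper name; within a spack build the path points at spack's own compiler wrapper, so only the final path component is meaningful.

```shell
# Hypothetical helper: read the text of `<mpi-wrapper> -show` on stdin and
# print the file name of the compiler it invokes (first word, last path part).
backend_compiler() {
    awk 'NR == 1 { n = split($1, parts, "/"); print parts[n] }'
}

# Example, using the wrapper path from the log above:
#   /opt/intel/oneapi/mpi/2021.7.0/bin/mpiicpc -show | backend_compiler
```

For the log shown here, this would report icpx rather than the classic icpc.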

This version of OneAPI is a bit old. I have a newer version installed on a different machine. I have yet to set up a working spack build with this newer OneAPI.

@balay closed this as completed Nov 9, 2023