not compiling with CUDA 11.3 #3591

Closed
cbontoiu opened this issue Apr 17, 2021 · 13 comments
Labels
backend: cuda (CUDA backend), bug (a bug in the project's code)

Comments

@cbontoiu
Contributor

cbontoiu commented Apr 17, 2021

Hello,

picongpu@develop does not compile with gcc@9.3.0. Compiling the LWFA example fails with this error:

/home/quasar/src/spack/opt/spack/linux-ubuntu20.04-skylake/gcc-9.3.0/picongpu-develop-bg7garcln6uuhro2352lmgcgefsn3htr/thirdParty/cupla/alpaka/include/alpaka/event/EventGenericThreads.hpp:280:19: error: ‘__T30’ was not declared in this scope
  280 |                 auto vQueues(dev.getAllQueues());

There are some incompatibilities with the newest gcc 9 compilers, and the error message is also mentioned here: https://gcc.gnu.org/gcc-9/changes.html

P.S. In fact I discovered later that the same error appears when compiling picongpu (master version) with gcc@7.5.0

log_out.txt

Regards,
Cristian

@sbastrakov added the "backend: cuda" label Apr 19, 2021
@sbastrakov
Member

sbastrakov commented Apr 19, 2021

Hello @cbontoiu, thanks for your report. We've just received a similar report for alpaka (which PIConGPU uses internally) with CUDA 11.3. So CUDA 11.3 may be the issue (according to the log, you are also using it), not gcc 9.3. Note that alpaka officially supports only up to CUDA 11.2 so far. Could you try an earlier CUDA version?

@psychocoderHPC
Member

@cbontoiu If you have installed the CUDA 11.3 driver on your system, you can simply compile PIConGPU with the CUDA 11.2 toolkit and run it on a system with the CUDA 11.3 driver.
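
A minimal sketch of how this could look, assuming a CUDA 11.2 toolkit is installed side by side with the newer driver. The prefix `/usr/local/cuda-11.2`, the variable `CUDA_112_PREFIX`, and the helper `toolkit_release` are hypothetical and not taken from this thread. Newer drivers run binaries built with older toolkits, so the build only needs to find the 11.2 `nvcc` first on `PATH`:

```shell
# Hypothetical prefix of a side-by-side CUDA 11.2 toolkit; adjust to your system.
CUDA_112_PREFIX="${CUDA_112_PREFIX:-/usr/local/cuda-11.2}"

# Extract the release ("11.2") from the banner printed by `nvcc --version`,
# e.g. "Cuda compilation tools, release 11.2, V11.2.67".
toolkit_release() {
    sed -n 's/.*release \([0-9][0-9]*\.[0-9][0-9]*\).*/\1/p'
}

if [ -x "$CUDA_112_PREFIX/bin/nvcc" ]; then
    # Put the older toolkit first so the build picks up its nvcc.
    export PATH="$CUDA_112_PREFIX/bin:$PATH"
    echo "building with CUDA toolkit $(nvcc --version | toolkit_release)"
else
    echo "no nvcc found under $CUDA_112_PREFIX" >&2
fi
```

`nvidia-smi` reports the installed driver, while the toolkit used for the build is whatever `nvcc` resolves to, so the two versions can legitimately differ.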

@cbontoiu
Contributor Author

@psychocoderHPC This option is interesting, but I don't know how to apply it. I thought compilation uses the CUDA version that was present when PIConGPU was installed (in my case with Spack, but using CUDA and CUDA-aware openMPI from the system).

In the meantime I did a fresh install of PIConGPU, dev version, using CUDA 11.2 and openMPI 4.1.1 (from the system), but the latter gives some errors at compilation, shown in the attached text file. You may want to investigate this incompatibility as well.

log_out.txt

Thank you.

@sbastrakov
Member

Hello @cbontoiu. The issue is that your build uses two MPI libraries, which causes a conflict. One is your system openMPI; the other is the MPI from anaconda, which ADIOS pulls in there. I think it is reasonable to approach this gradually: first disable ADIOS and try without it.

@sbastrakov
Member

After that works, you could either rebuild ADIOS or tell it to use your system openMPI, and thus avoid the conflict.
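
One way to spot such a clash before building is to list every copy of the MPI wrappers visible on `PATH`. This is only a sketch: the helper name `find_all_on_path` is made up for illustration, and it checks `PATH` alone, not libraries found through other mechanisms such as CMake search paths.

```shell
# List every executable named $1 found along PATH, in lookup order.
find_all_on_path() {
    local name="$1" dir
    local IFS=':'
    for dir in $PATH; do
        [ -x "$dir/$name" ] && printf '%s\n' "$dir/$name"
    done
    return 0
}

# More than one hit per tool suggests two MPI installations are competing,
# for example a system openMPI and one inside a conda environment.
for tool in mpicc mpiexec; do
    echo "== $tool"
    find_all_on_path "$tool"
done
```

If two different prefixes show up, removing one of them from the environment (as with the conda lines in a bashrc) leaves a single MPI for the build to find.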

@cbontoiu
Contributor Author

@sbastrakov Thank you. Indeed, there was a clash between conda and spack, and I managed to compile by disabling the lines written by anaconda in my bashrc file. I don't know another way. I also managed to run the LWFA model, and I am surprised how slow it was. This model used to complete in 1 min 10 s on this machine before the openPMD plugin was included. Now, out of the box, the run with the 1.cfg file took 3 min 25 s.

@cbontoiu
Contributor Author

PIConGPU: 0.6.0-dev
Build-Type: Release

Third party:
  OS:         Linux-5.8.0-50-generic
  arch:       x86_64
  CXX:        GNU (7.5.0)
  CMake:      3.20.1
  CUDA:       11.2.67
  mallocMC:   2.6.0
  Boost:      1.70.0
  MPI:
    standard: 3.1
    flavor:   OpenMPI (4.1.0)
  PNGwriter:  0.7.0
  libSplash:  1.7.0 (Format 4.0)
  ADIOS:      NOTFOUND
  openPMD:    0.13.3
PIConGPUVerbose PHYSICS(1) | Sliding Window is OFF
PIConGPUVerbose PHYSICS(1) | used Random Number Generator: RNGProvider3XorMin seed: 42
PIConGPUVerbose PHYSICS(1) | Courant c*dt <= 1.00229 ? 1
PIConGPUVerbose PHYSICS(1) | Resolving plasma oscillations?
Estimates are based on DensityRatio to BASE_DENSITY of each species
(see: density.param, speciesDefinition.param).
It and does not cover other forms of initialization
PIConGPUVerbose PHYSICS(1) | species e: omega_p * dt <= 0.1 ? 0.0247974
PIConGPUVerbose PHYSICS(1) | y-cells per wavelength: 18.0587
PIConGPUVerbose PHYSICS(1) | macro particles per device: 4718592
PIConGPUVerbose PHYSICS(1) | typical macro particle weighting: 6955.06
PIConGPUVerbose PHYSICS(1) | UNIT_SPEED 2.99792e+08
PIConGPUVerbose PHYSICS(1) | UNIT_TIME 1.39e-16
PIConGPUVerbose PHYSICS(1) | UNIT_LENGTH 4.16712e-08
PIConGPUVerbose PHYSICS(1) | UNIT_MASS 6.33563e-27
PIConGPUVerbose PHYSICS(1) | UNIT_CHARGE 1.11432e-15
PIConGPUVerbose PHYSICS(1) | UNIT_EFIELD 1.22627e+13
PIConGPUVerbose PHYSICS(1) | UNIT_BFIELD 40903.8
PIConGPUVerbose PHYSICS(1) | UNIT_ENERGY 5.69418e-10
initialization time: 10sec 833msec = 10 sec
0 % = 0 | time elapsed: 9sec 150msec | avg time per step: 0msec
4 % = 102 | time elapsed: 18sec 46msec | avg time per step: 15msec
9 % = 204 | time elapsed: 26sec 977msec | avg time per step: 15msec
14 % = 306 | time elapsed: 35sec 980msec | avg time per step: 15msec
19 % = 408 | time elapsed: 45sec 6msec | avg time per step: 16msec
24 % = 510 | time elapsed: 54sec 121msec | avg time per step: 16msec
29 % = 612 | time elapsed: 1min 3sec 237msec | avg time per step: 17msec
34 % = 714 | time elapsed: 1min 12sec 362msec | avg time per step: 17msec
39 % = 816 | time elapsed: 1min 21sec 528msec | avg time per step: 17msec
44 % = 918 | time elapsed: 1min 30sec 715msec | avg time per step: 18msec
49 % = 1020 | time elapsed: 1min 40sec 102msec | avg time per step: 19msec
54 % = 1122 | time elapsed: 1min 49sec 440msec | avg time per step: 19msec
59 % = 1224 | time elapsed: 1min 58sec 815msec | avg time per step: 19msec
64 % = 1326 | time elapsed: 2min 8sec 243msec | avg time per step: 20msec
69 % = 1428 | time elapsed: 2min 17sec 718msec | avg time per step: 20msec
74 % = 1530 | time elapsed: 2min 27sec 140msec | avg time per step: 21msec
79 % = 1632 | time elapsed: 2min 36sec 562msec | avg time per step: 21msec
84 % = 1734 | time elapsed: 2min 45sec 877msec | avg time per step: 20msec
89 % = 1836 | time elapsed: 2min 55sec 71msec | avg time per step: 19msec
94 % = 1938 | time elapsed: 3min 4sec 289msec | avg time per step: 19msec
99 % = 2040 | time elapsed: 3min 13sec 423msec | avg time per step: 20msec
calculation simulation time: 3min 13sec 582msec = 193 sec
full simulation time: 3min 25sec 8msec = 205 sec
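
A rough way to read the progress lines above: between step 102 and step 2040 the elapsed time grows by about 175.4 s, which is roughly 90 ms of wall-clock time per step, while the reported average per step stays between 15 and 21 ms. What exactly goes into the reported average is not stated in this thread, so the gap is only a hint that much of the run time is spent outside the stepping itself. The helper `wall_ms_per_step` below is a hypothetical name, written just to redo this arithmetic from the log text:

```shell
# Print wall-clock milliseconds per step between the first and the last
# progress line read from stdin (lines containing "time elapsed:").
wall_ms_per_step() {
    awk -F'|' '
        # Convert a duration such as "1min 3sec 237msec" to milliseconds.
        function to_ms(s,    n, parts, i, ms) {
            n = split(s, parts, " ")
            ms = 0
            for (i = 1; i <= n; i++) {
                if      (parts[i] ~ /^[0-9]+min$/)  ms += 60000 * (parts[i] + 0)
                else if (parts[i] ~ /^[0-9]+msec$/) ms += parts[i] + 0
                else if (parts[i] ~ /^[0-9]+sec$/)  ms += 1000 * (parts[i] + 0)
            }
            return ms
        }
        /time elapsed:/ {
            split($1, head, "=")            # "4 % = 102 " -> step number after "="
            step = head[2] + 0
            sub(/.*time elapsed:/, "", $2)  # keep only the duration text
            t = to_ms($2)
            if (!seen) { step0 = step; t0 = t; seen = 1 }
            step1 = step; t1 = t
        }
        END {
            if (seen && step1 > step0)
                printf "%.1f\n", (t1 - t0) / (step1 - step0)
        }
    '
}

wall_ms_per_step <<'EOF'
4 % = 102 | time elapsed: 18sec 46msec | avg time per step: 15msec
99 % = 2040 | time elapsed: 3min 13sec 423msec | avg time per step: 20msec
EOF
# prints 90.5
```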

@cbontoiu
Contributor Author

source $HOME/src/spack/share/spack/setup-env.sh && spack load picongpu && spack load openpmd-api && export PIC_BACKEND="cuda:75" && export OMPI_MCA_io=^ompio
cd /home/quasar/PIC_INPUT/PICONGPU/TESTS/myLWFA
rm -r .build/ && pic-build &> log_out.txt
tbg -s bash -c etc/picongpu/1.cfg -t etc/picongpu/bash/mpiexec.tpl /media/quasar/RawDataDisk/PICONGPU/TESTS/myLaserWakefield

@sbastrakov
Member

sbastrakov commented Apr 19, 2021

If you want to see how the output affects run time, you could disable it in your .cfg file or change the output period. Look for the TBG_openPMD variable there.
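
For orientation, such a variable is a plain shell string of plugin options. The fragment below is an illustrative sketch, not the contents of the shipped 1.cfg: the values are made up, and the actual file may list more options. `--openPMD.period` and `--checkpoint.period` set how often (in steps) regular output and checkpoints are written, so raising them reduces how often the simulation pauses for I/O:

```shell
# Illustrative output section of a TBG .cfg file (values are assumptions).
# A larger period means fewer dumps and therefore less time spent on I/O.
TBG_openPMD="--openPMD.period 1000 \
             --openPMD.file simData \
             --checkpoint.period 1000"

echo "$TBG_openPMD"
```

Comparing run times with different periods, or with the output left out entirely, shows how much of the total is due to I/O.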

@cbontoiu
Contributor Author

cbontoiu commented Apr 19, 2021

Indeed, 1 min and 25 secs can be saved by running without creating checkpoints.

@cbontoiu
Contributor Author

Do you keep a changelog since version 0.5.0, so that we could get an idea of where development is heading and maybe suggest features?

@sbastrakov
Member

Yes, here. Feature suggestions and external contributions are welcome. Ideally, please create a new issue per suggestion.

@psychocoderHPC changed the title from "not compiling with gcc@9.3.0" to "not compiling with CUDA 11.3" Apr 21, 2021
@psychocoderHPC
Member

We are working on a fix for CUDA 11.3: alpaka-group/alpaka#1295
This will take some time. Once CUDA 11.3 support is available in alpaka and cupla, we will support CUDA 11.3 in PIConGPU too.

@psychocoderHPC added the "bug" label Apr 21, 2021