A problem about MPI #51
Comments
Hello, is it possible for you to attach your output and error logs?
Of course. Please find the output file in the attachment.
Hi,
I don't think the error is related, but you have 16 processes with 8 threads each, for a total of 128 threads. This is far too many compared to the 16 patches available in the benchmark you chose. Use 1 process with 16 threads, or 2 processes with 8 threads, instead.
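For reference, the two suggested launch configurations could look like the sketch below. This assumes an OpenMP-enabled build launched with `mpirun` and that the OpenMP runtime honors `OMP_NUM_THREADS`; the binary and namelist paths are the ones used elsewhere in this thread.

```shell
# 1 MPI process with 16 OpenMP threads:
export OMP_NUM_THREADS=16
mpirun -np 1 ./smilei benchmarks/tst1d_0_em_propagation.py

# or 2 MPI processes with 8 OpenMP threads each:
export OMP_NUM_THREADS=8
mpirun -np 2 ./smilei benchmarks/tst1d_0_em_propagation.py
```

Either way, processes × threads should not greatly exceed the 16 patches of this benchmark.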
@jderouillat Sure. The supercomputer I am using consists of 44 Dawning CB85-G AMD blade machines. Each node (machine) has four eight-core 64-bit CPUs, forming an 8-way symmetric multiprocessor. Each node has 8 GB of internal memory. The CPUs are Opteron 6136 (2.4 GHz) CPUs, each of which has 16 GB of internal memory. The supercomputer runs the Red Hat Linux 5.0 operating system. @mccoys I ran SMILEI with a single MPI process and 16 threads, and with 2 processes and 8 threads each, for 16 patches, but I still get the same error. Then I ran the code in debug mode. Below is the output file.
Could you execute the debug version within gdb (with only 1 MPI process and 1 OpenMP thread; it will be easier to read the generated information, and OpenMP doesn't matter in your case since the simulation has not entered the OpenMP parallel section when it crashes):
$ mpirun -np 1 gdb --args .../smilei .../tst1d_00_em_propagation.py
(gdb) run
...
(gdb) backtrace
...
and send us the stack trace of the crash?
Closing as no more input from user |
Dear developers,
Thank you for your advanced code. Before installing SMILEI, I installed gcc-4.6.4, openmpi-1.10.2, hdf5-1.8.16, and python-2.7 as dependencies. Then I installed SMILEI successfully. I ran the namelist (tst1d_0_em_propagation.py) in the benchmarks directory as a test. I got an error message from MPI when the computer was trying to initialize the diagnostic fields: An error occurred in MPI_Comm_create_keyval reported by process [703004673,12] on communicator MPI_COMM_WORLD; MPI_ERR_ARG: invalid argument of some other kind. The submission script is as follows:
#!/bin/bash
#PBS -N smilei
#PBS -l nodes=1:ppn=16
#PBS -l walltime=550:50:00
#PBS -j oe
#PBS -q high
cd $PBS_O_WORKDIR
mpiexec -n 16 ./smilei benchmarks/tst1d_0_em_propagation.py
exit 0
It seems that SMILEI is designed for DSM (distributed shared memory) supercomputers, but I use an SMP supercomputer. Is that the reason for the problem? How do I fix it? Thank you.
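Following the maintainers' advice above (far fewer MPI processes than 16 for this 16-patch benchmark), a corrected submission script might look like the sketch below. The queue name, walltime, and node request are taken from the original script; using `OMP_NUM_THREADS` to drive the thread count is an assumption about how OpenMP is configured on this cluster.

```shell
#!/bin/bash
#PBS -N smilei
#PBS -l nodes=1:ppn=16
#PBS -l walltime=550:50:00
#PBS -j oe
#PBS -q high
cd $PBS_O_WORKDIR
# One MPI process; let OpenMP use the 16 cores of the node.
export OMP_NUM_THREADS=16
mpiexec -n 1 ./smilei benchmarks/tst1d_0_em_propagation.py
exit 0
```

With this layout there is 1 process with 16 threads, matching one of the two combinations the maintainers suggested.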