gap_fit MPI Segmentation fault #636
@Sideboard has added some code that eliminates the need for the two-step process!
Which step fails? The sparsification or the fit?
The fit. Sparsification worked fine.
Yes, I know about the change to one step, but haven't figured out how to use it yet.
The mistake must be in the command line - at least it should print it back. Using the
As far as I know it's as simple as submitting a single MPI job.
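For what it's worth, here is a minimal sketch of what I mean by a single MPI job, assuming a ScaLAPACK-enabled build; the module names, resource numbers, and every gap_fit argument below are placeholders rather than recommended settings:

```bash
#!/bin/bash
#SBATCH --job-name=gap_fit
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=64

# Load the same modules that were loaded when gap_fit was compiled
# (placeholder names -- use whatever your cluster provides).
module load gcc openmpi openblas scalapack

# With the one-step code, a single srun call does both the sparsification
# and the fit. All gap_fit arguments here are illustrative placeholders.
srun gap_fit at_file=train.xyz \
    gap={soap cutoff=5.0 l_max=6 n_max=12 atom_sigma=0.5 n_sparse=2000 delta=0.1 \
         covariance_type=dot_product zeta=4 sparse_method=cur_points} \
    default_sigma={0.005 0.2 0.0 0.0} \
    gp_file=gap.xml
```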
It does print the command line back. I'll look further.
Now I tried doing MPI all-in-one, and got the same segmentation fault. Here is a sample from the *.err file, with the command line feedback. I am using the exact command line I used last year with only the datafile change, trying to run on 2 nodes. Thanks for any advice.
Here is the *.out file.
The standard output does not report the parsing of the command line - I still suspect a problem there, otherwise it would stop at a later stage.
Ok, now what I did is go back to a sort of square one, using Deringer's old command line in the 2017 paper, and putting it in a
Here is my run command:
Does the crash generate a core file? If not, is it because your shell is limiting it, and you can turn that limit off? We may be able to figure out where it's crashing at least.
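Concretely, something along these lines; the executable path and core file name are placeholders:

```bash
# Allow core dumps in the shell / job script, before the mpirun or srun line.
ulimit -c unlimited

# After the crash, look for core* files in the working directory and
# pull a backtrace out of one of them.
gdb /path/to/gap_fit core.12345
# then at the (gdb) prompt:
#   bt
```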
The error file makes it look as if it is crashing in the Scalapack initialisation. Could it be an incompatibility between the MPI and the Scalapack?
I don't know how to turn the limit off; I guess I can look it up.
I don't see any other output files than the ones I posted above. Thanks!
That
I used the linux_x86_64_gfortran_openmpi+openmp architecture and
Also in the make config instructions, I did add netcdf-c support.
Did you enable scalapack in "make config"? How does it know where to get your scalapack libraries? Did you add them to the math libraries when you ran "make config"?
Can you upload your
Can you post the output of ldd?
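Something like this is what I'm after; the path is a placeholder (the executable should sit under build/<arch>/ in a standard QUIP build):

```bash
# With the same modules loaded as when QUIP/GAP was compiled:
module list

# Check which shared libraries the gap_fit executable resolves, and
# whether any of them are reported as "not found":
ldd /path/to/QUIP/build/linux_x86_64_gfortran_openmpi+openmp/gap_fit
ldd /path/to/QUIP/build/linux_x86_64_gfortran_openmpi+openmp/gap_fit | grep "not found"
```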
Did you enable scalapack in "make config"? YES, I'm sure I did that. ldd output: (OH OH, I see some things not found, but I did put lopenblas and netcdf in the make config questions, which is what my admin told me.)
There should be no way those things could be missing when it actually runs. Did you have all the same modules loaded when you ran ldd? The scalapack must be linked statically, which is unfortunate, since it means you can't tell which one it's using from the ldd output. Unfortunately, there's an infinite number of ways to set up mpi, scalapack, and lapack, and they need to be consistent. If you
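As a rough consistency check (the module names below are placeholders for whatever your site provides):

```bash
# In the environment where the job actually runs, load exactly the modules
# that were loaded at compile time, then re-check the dynamic linkage.
module purge
module load gcc openmpi openblas scalapack netcdf-c   # placeholder module names
module list

# Anything "not found" here will also be missing at run time.
ldd /path/to/gap_fit | grep -E -i "mpi|blas|lapack|netcdf|not found"
```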
OK, sorry, here is the
OK, thanks very much. We installed again with OneAPI and MKL instead. It turned out that the scalapack modules we have do not work on AMD nodes. Making a potential works now.
Dear Developers,
Please tell me what the usual problem is with this. I got the same type of error using both mpirun and srun when trying to start the gap_fit training. Last year I used 4 nodes with 64 ntasks per node, and it was working. I used the two-step process as before: sparsification first, then the MPI run. The input file for the MPI run is attached; a sketch of the submission is below.
Thanks.
This is my MPI program:
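Purely as a sketch of the submission described above (the module names are placeholders, and I am assuming the attached input file is passed to gap_fit via its config_file option):

```bash
#!/bin/bash
#SBATCH --nodes=4
#SBATCH --ntasks-per-node=64

module load gcc openmpi openblas scalapack netcdf-c   # placeholder names

# Second step of the two-step process: the MPI fit, run after the
# sparsification has already been done in a separate serial gap_fit run.
srun gap_fit config_file=gap_fit_input.txt
```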