Centos7.6 MPICH ping_pang.c restart segmentation fault #48

Open
shuqianwang opened this issue Jan 12, 2022 · 1 comment

shuqianwang commented Jan 12, 2022

[root@6248r-node121 test-ckpt-restart]# mpirun -np 2 mana_restart
[28951] mtcp_restart.c:799 main:
[Rank: 0] Choosing ckpt image: ./ckpt_rank_0/ckpt_a.out_3c3936238b6c9197-41000-1ceb312627b83.dmtcp
[28952] mtcp_restart.c:799 main:
[Rank: 1] Choosing ckpt image: ./ckpt_rank_1/ckpt_a.out_3c3936238b6c9197-40000-1ceb311d18559.dmtcp
[28951] mtcp_restart.c:1458 unmap_memory_areas_and_restore_vdso:
***Error: vdso/vvar order was different during ckpt.
[28952] mtcp_restart.c:1458 unmap_memory_areas_and_restore_vdso:
***Error: vdso/vvar order was different during ckpt.
/home/mana/bin/mana_restart: line 125: 28952 Segmentation fault (core dumped) $dir/dmtcp_restart --mpi --join-coordinator --coord-host $submissionHost --coord-port $submissionPort $options
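
The "vdso/vvar order" message suggests that the relative placement of the [vvar] and [vdso] mappings at restart differs from what was recorded at checkpoint time. As a rough diagnostic (a sketch, not part of MANA), the helper below prints the order in which those two mappings appear in a maps file. The function name `vdso_vvar_order` is hypothetical; with no argument it reads /proc/self/maps, which here means the maps of the awk process it spawns. Comparing the output on the checkpoint node and the restart node may show whether the two kernels lay these regions out differently.

```shell
#!/bin/sh
# Print the [vvar] and [vdso] mapping names in the order they appear
# in a maps file. Usage: vdso_vvar_order [maps-file]
# Assumption: the file uses the /proc/<pid>/maps layout, where the
# mapping name is the last whitespace-separated field of each line.
vdso_vvar_order() {
    maps_file="${1:-/proc/self/maps}"
    awk '$NF == "[vdso]" || $NF == "[vvar]" {
             out = out (out ? " " : "") $NF   # keep encounter order
         }
         END { print out }' "$maps_file"
}
```

Running `vdso_vvar_order` with no argument on each node gives one line per node that can be compared directly.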

When I run "make -j mana" following the text instructions for installing MANA on CentOS, I get the following error:
make[3]: `../../lib/dmtcp/libmpidummy.so' is up to date.
make[3]: Leaving directory `/root/mana/contrib/mpi-proxy-split'
make ../../bin/lh_proxy
make[3]: Entering directory `/root/mana/contrib/mpi-proxy-split'
make -C lower-half install
make[4]: Entering directory `/root/mana/contrib/mpi-proxy-split/lower-half'
if mpicc -v 2>&1 | grep -q 'MPICH version'; then
rm -f tmp.sh;
mpicc -show -static -Wl,-Ttext-segment -Wl,0xE000000 -Wl,--wrap -Wl,__munmap -Wl,--wrap -Wl,shmat -Wl,--wrap -Wl,shmget -o lh_proxy -Wl,-start-group
lh_proxy.o libproxy.a gethostbyname-static/gethostbyname_static.o -L$HOME/mpich-static/usr/lib64 -lmpi -llzma -lz -lm -lxml2 -lrt -lpthread -lc -Wl,-end-group |
sed -e 's^-lunwind ^ ^'> tmp.sh;
sh tmp.sh;
rm -f tmp.sh;
else
mpicc -static -Wl,-Ttext-segment -Wl,0xE000000 -Wl,--wrap -Wl,__munmap -Wl,--wrap -Wl,shmat -Wl,--wrap -Wl,shmget -o lh_proxy -Wl,-start-group
lh_proxy.o libproxy.a gethostbyname-static/gethostbyname_static.o -L$HOME/mpich-static/usr/lib64 -lmpi -llzma -lz -lm -lxml2 -lrt -lpthread -lc -Wl,-end-group;
fi
/opt/rh/devtoolset-8/root/usr/libexec/gcc/x86_64-redhat-linux/8/ld: cannot find -ludev
collect2: error: ld returned 1 exit status
cp -f lh_proxy gethostbyname-static/gethostbyname_static.o ../../../bin/
cp: cannot stat ‘lh_proxy’: No such file or directory
make[4]: *** [install] Error 1
make[4]: Leaving directory `/root/mana/contrib/mpi-proxy-split/lower-half'
make[3]: *** [../../bin/lh_proxy] Error 2
make[3]: Leaving directory `/root/mana/contrib/mpi-proxy-split'
make[2]: *** [install] Error 2
make[2]: Leaving directory `/root/mana/contrib/mpi-proxy-split'
make[1]: *** [mana_part2] Error 2
make[1]: Leaving directory `/root/mana'
make: *** [mana] Error 2

I found that only the shared udev library is installed; there is no static libudev.a. So I downloaded the systemd source RPM and recompiled it with static libraries enabled, but that also failed with the following error:

[Attached image: 6997AE85-A7EE-43D3-B220-8FF52997B88B]

I manually changed the configure script to remove the restriction and compiled with libudev.a.
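
Because lh_proxy is linked with -static, the linker needs a libudev.a archive rather than the shared libudev.so. A quick way to confirm which kinds of a library are present in a set of directories is sketched below. The function name `report_lib_kinds` and the directory list in the example call are assumptions for illustration; adjust them to wherever libraries live on the system in question.

```shell
#!/bin/sh
# Report whether a static (.a) and/or shared (.so*) copy of lib<name>
# exists directly inside any of the given directories.
# Usage: report_lib_kinds <name> <dir>...
report_lib_kinds() {
    name="$1"; shift
    static=no
    shared=no
    for dir in "$@"; do
        [ -d "$dir" ] || continue          # skip directories that do not exist
        if [ -e "$dir/lib${name}.a" ]; then
            static=yes
        fi
        # Shared objects usually carry a version suffix, e.g. libudev.so.1
        for f in "$dir/lib${name}".so*; do
            if [ -e "$f" ]; then
                shared=yes
            fi
        done
    done
    echo "lib${name}: static=${static} shared=${shared}"
}

# Example call; the directory list is an assumption about a typical layout.
report_lib_kinds udev /usr/lib64 /lib64 /usr/lib /lib
```

A result of `static=no shared=yes` would match the situation described above, where only the shared library is installed.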

@JainTwinkle
Collaborator

@shuqianwang Several fixes related to CentOS have been merged in MANA. Could you please update your MANA repository and try again?
You might still need to link lh_proxy with libudev.a on your system.
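
One possible way to let the static link find a locally built libudev.a without editing the Makefile is GCC's LIBRARY_PATH environment variable, which a native GCC consults when resolving -l options. The sketch below is an assumption about a workable setup, not a step confirmed in this thread: the function name `use_static_udev_dir` and the example path are hypothetical, and a static libudev.a may pull in further dependencies that would also need static archives.

```shell
#!/bin/sh
# Prepend a directory to LIBRARY_PATH, but only if it actually
# contains libudev.a. Returns non-zero and leaves LIBRARY_PATH
# unchanged otherwise.
use_static_udev_dir() {
    dir="$1"
    if [ ! -f "$dir/libudev.a" ]; then
        echo "no libudev.a in $dir" >&2
        return 1
    fi
    # Keep any existing entries after the new directory.
    LIBRARY_PATH="$dir${LIBRARY_PATH:+:$LIBRARY_PATH}"
    export LIBRARY_PATH
}

# Example (hypothetical path, replace with the real build output directory):
#   use_static_udev_dir "$HOME/systemd-static/lib" && make -j mana
```

Directories given with -L on the link line are still searched before those in LIBRARY_PATH, so this only helps when no earlier directory supplies a conflicting libudev.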
