Closed
Description
Thank you for taking the time to submit an issue!
Background information
What version of Open MPI are you using? (e.g., v1.10.3, v2.1.0, git branch name and hash, etc.)
- using openmpi-4.0.1
Describe how Open MPI was installed (e.g., from a source/distribution tarball, from a git clone, from an operating system distribution package, etc.)
./configure --prefix=/usr/local/openmpi4 --enable-debug
make all install
export LD_LIBRARY_PATH=/usr/local/openmpi4/lib:$LD_LIBRARY_PATH
export PATH=/usr/local/openmpi4/bin:$PATH
Please describe the system on which you are running
- Operating system/version: Ubuntu Xenial in a Docker container (kernel 3.10.0-693.el7.x86_64)
- Computer hardware: x86
- Network type: normal (communication over ssh)
- Kubernetes: 1.15
- Docker: 17.03.2-ce
Details of the problem
Please describe, in detail, the problem that you are having, including the behavior you expect to see, the actual behavior that you are seeing, steps to reproduce the problem, etc. It is most helpful if you can attach a small program that a developer can use to reproduce your problem.
Note: If you include verbatim output (or a code block), please use a GitHub Markdown code block like below:
shell$ mpirun --allow-run-as-root -mca plm_rsh_args '-p 2222' -bind-to socket -map-by socket -mca pml ob1 -np 2 -H 10.42.142.5,10.42.8.8 python3 tensorflow_synthetic_benchmark.py --no-cuda
// The error occurs on two nodes: pery-20190601142406-worker-0 and pery-20190601142406-worker-1
*** Error in `/usr/local/openmpi4/bin/orted': double free or corruption (out): 0x0000000000712dc0 ***
======= Backtrace: =========
/lib/x86_64-linux-gnu/libc.so.6(+0x777e5)[0x7ffff754c7e5]
/lib/x86_64-linux-gnu/libc.so.6(+0x8037a)[0x7ffff755537a]
/lib/x86_64-linux-gnu/libc.so.6(cfree+0x4c)[0x7ffff755953c]
/usr/local/openmpi4/lib/libopen-rte.so.40(+0xaba1b)[0x7ffff7b67a1b]
/usr/local/openmpi4/lib/libopen-rte.so.40(+0xab4bb)[0x7ffff7b674bb]
/usr/local/openmpi4/lib/libopen-rte.so.40(orte_regx_base_extract_node_names+0x404)[0x7ffff7b67e42]
/usr/local/openmpi4/lib/libopen-rte.so.40(orte_regx_base_nidmap_parse+0xdf)[0x7ffff7b61cdc]
/usr/local/openmpi4/lib/libopen-rte.so.40(orte_ess_base_orted_setup+0x1cc2)[0x7ffff7b2f9a0]
/usr/local/openmpi4/lib/openmpi/mca_ess_env.so(+0xeb6)[0x7ffff5b3deb6]
/usr/local/openmpi4/lib/libopen-rte.so.40(orte_init+0x350)[0x7ffff7ad6286]
/usr/local/openmpi4/lib/libopen-rte.so.40(orte_daemon+0x5e5)[0x7ffff7b04aff]
/usr/local/openmpi4/bin/orted[0x40090b]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf0)[0x7ffff74f5830]
/usr/local/openmpi4/bin/orted[0x4007a9]
======= Memory map: ========
00400000-00401000 r-xp 00000000 08:11 538048528 /usr/local/openmpi4/bin/orted
00600000-00601000 r--p 00000000 08:11 538048528 /usr/local/openmpi4/bin/orted
00601000-00602000 rw-p 00001000 08:11 538048528 /usr/local/openmpi4/bin/orted
00602000-00734000 rw-p 00000000 00:00 0 [heap]
7fffe0000000-7fffe0021000 rw-p 00000000 00:00 0
7fffe0021000-7fffe4000000 ---p 00000000 00:00 0
7fffe4000000-7fffe4021000 rw-p 00000000 00:00 0
7fffe4021000-7fffe8000000 ---p 00000000 00:00 0
7fffeb7d6000-7fffeb7ec000 r-xp 00000000 08:11 4027015748 /lib/x86_64-linux-gnu/libgcc_s.so.1
7fffeb7ec000-7fffeb9eb000 ---p 00016000 08:11 4027015748 /lib/x86_64-linux-gnu/libgcc_s.so.1
7fffeb9eb000-7fffeb9ec000 rw-p 00015000 08:11 4027015748 /lib/x86_64-linux-gnu/libgcc_s.so.1
7fffeb9ec000-7fffeb9f0000 r-xp 00000000 08:11 135111172 /usr/local/openmpi4/lib/openmpi/mca_regx_fwd.so
7fffeb9f0000-7fffebbef000 ---p 00004000 08:11 135111172 /usr/local/openmpi4/lib/openmpi/mca_regx_fwd.so
7fffebbef000-7fffebbf0000 r--p 00003000 08:11 135111172 /usr/local/openmpi4/lib/openmpi/mca_regx_fwd.so
7fffebbf0000-7fffebbf1000 rw-p 00004000 08:11 135111172 /usr/local/openmpi4/lib/openmpi/mca_regx_fwd.so
7fffebbf1000-7fffebbf7000 r-xp 00000000 08:11 135111186 /usr/local/openmpi4/lib/openmpi/mca_rmaps_seq.so
7fffebbf7000-7fffebdf6000 ---p 00006000 08:11 135111186 /usr/local/openmpi4/lib/openmpi/mca_rmaps_seq.so
7fffebdf6000-7fffebdf7000 r--p 00005000 08:11 135111186 /usr/local/openmpi4/lib/openmpi/mca_rmaps_seq.so
7fffebdf7000-7fffebdf8000 rw-p 00006000 08:11 135111186 /usr/local/openmpi4/lib/openmpi/mca_rmaps_seq.so
7fffebdf8000-7fffebdff000 r-xp 00000000 08:11 135111184 /usr/local/openmpi4/lib/openmpi/mca_rmaps_round_robin.so
7fffebdff000-7fffebffe000 ---p 00007000 08:11 135111184 /usr/local/openmpi4/lib/openmpi/mca_rmaps_round_robin.so
7fffebffe000-7fffebfff000 r--p 00006000 08:11 135111184 /usr/local/openmpi4/lib/openmpi/mca_rmaps_round_robin.so
7fffebfff000-7fffec000000 rw-p 00007000 08:11 135111184 /usr/local/openmpi4/lib/openmpi/mca_rmaps_round_robin.so
7fffec000000-7fffec021000 rw-p 00000000 00:00 0
7fffec021000-7ffff0000000 ---p 00000000 00:00 0
7ffff00e9000-7ffff00f1000 r-xp 00000000 08:11 135111182 /usr/local/openmpi4/lib/openmpi/mca_rmaps_resilient.so
7ffff00f1000-7ffff02f0000 ---p 00008000 08:11 135111182 /usr/local/openmpi4/lib/openmpi/mca_rmaps_resilient.so
7ffff02f0000-7ffff02f1000 r--p 00007000 08:11 135111182 /usr/local/openmpi4/lib/openmpi/mca_rmaps_resilient.so
7ffff02f1000-7ffff02f2000 rw-p 00008000 08:11 135111182 /usr/local/openmpi4/lib/openmpi/mca_rmaps_resilient.so
7ffff02f2000-7ffff02fc000 r-xp 00000000 08:11 135111180 /usr/local/openmpi4/lib/openmpi/mca_rmaps_rank_file.so
7ffff02fc000-7ffff04fb000 ---p 0000a000 08:11 135111180 /usr/local/openmpi4/lib/openmpi/mca_rmaps_rank_file.so
7ffff04fb000-7ffff04fc000 r--p 00009000 08:11 135111180 /usr/local/openmpi4/lib/openmpi/mca_rmaps_rank_file.so
7ffff04fc000-7ffff04fd000 rw-p 0000a000 08:11 135111180 /usr/local/openmpi4/lib/openmpi/mca_rmaps_rank_file.so
7ffff04fd000-7ffff0502000 r-xp 00000000 08:11 135111178 /usr/local/openmpi4/lib/openmpi/mca_rmaps_ppr.so
7ffff0502000-7ffff0701000 ---p 00005000 08:11 135111178 /usr/local/openmpi4/lib/openmpi/mca_rmaps_ppr.so
7ffff0701000-7ffff0702000 r--p 00004000 08:11 135111178 /usr/local/openmpi4/lib/openmpi/mca_rmaps_ppr.so
7ffff0702000-7ffff0703000 rw-p 00005000 08:11 135111178 /usr/local/openmpi4/lib/openmpi/mca_rmaps_ppr.so
7ffff0703000-7ffff0708000 r-xp 00000000 08:11 135111176 /usr/local/openmpi4/lib/openmpi/mca_rmaps_mindist.so
7ffff0708000-7ffff0907000 ---p 00005000 08:11 135111176 /usr/local/openmpi4/lib/openmpi/mca_rmaps_mindist.so
7ffff0907000-7ffff0908000 r--p 00004000 08:11 135111176 /usr/local/openmpi4/lib/openmpi/mca_rmaps_mindist.so
7ffff0908000-7ffff0909000 rw-p 00005000 08:11 135111176 /usr/local/openmpi4/lib/openmpi/mca_rmaps_mindist.so
7ffff0909000-7ffff090d000 r-xp 00000000 08:11 135111188 /usr/local/openmpi4/lib/openmpi/mca_rml_oob.so
7ffff090d000-7ffff0b0c000 ---p 00004000 08:11 135111188 /usr/local/openmpi4/lib/openmpi/mca_rml_oob.so
7ffff0b0c000-7ffff0b0d000 r--p 00003000 08:11 135111188 /usr/local/openmpi4/lib/openmpi/mca_rml_oob.so
7ffff0b0d000-7ffff0b0e000 rw-p 00004000 08:11 135111188 /usr/local/openmpi4/lib/openmpi/mca_rml_oob.so
7ffff0b0e000-7ffff0b0f000 ---p 00000000 00:00 0
7ffff0b0f000-7ffff130f000 rw-p 00000000 00:00 0 [stack:14165]
7ffff130f000-7ffff1328000 r-xp 00000000 08:11 134822968 /usr/local/openmpi4/lib/openmpi/mca_oob_tcp.so
7ffff1328000-7ffff1528000 ---p 00019000 08:11 134822968 /usr/local/openmpi4/lib/openmpi/mca_oob_tcp.so
7ffff1528000-7ffff1529000 r--p 00019000 08:11 134822968 /usr/local/openmpi4/lib/openmpi/mca_oob_tcp.so
7ffff1529000-7ffff152a000 rw-p 0001a000 08:11 134822968 /usr/local/openmpi4/lib/openmpi/mca_oob_tcp.so
7ffff152a000-7ffff152e000 r-xp 00000000 08:11 135111196 /usr/local/openmpi4/lib/openmpi/mca_routed_radix.so
7ffff152e000-7ffff172d000 ---p 00004000 08:11 135111196 /usr/local/openmpi4/lib/openmpi/mca_routed_radix.so
7ffff172d000-7ffff172e000 r--p 00003000 08:11 135111196 /usr/local/openmpi4/lib/openmpi/mca_routed_radix.so
7ffff172e000-7ffff172f000 rw-p 00004000 08:11 135111196 /usr/local/openmpi4/lib/openmpi/mca_routed_radix.so
7ffff172f000-7ffff1733000 r-xp 00000000 08:11 135111194 /usr/local/openmpi4/lib/openmpi/mca_routed_direct.so
7ffff1733000-7ffff1932000 ---p 00004000 08:11 135111194 /usr/local/openmpi4/lib/openmpi/mca_routed_direct.so
7ffff1932000-7ffff1933000 r--p 00003000 08:11 135111194 /usr/local/openmpi4/lib/openmpi/mca_routed_direct.so
7ffff1933000-7ffff1934000 rw-p 00004000 08:11 135111194 /usr/local/openmpi4/lib/openmpi/mca_routed_direct.so
7ffff1934000-7ffff1937000 r-xp 00000000 08:11 135111192 /usr/local/openmpi4/lib/openmpi/mca_routed_debruijn.so
7ffff1937000-7ffff1b37000 ---p 00003000 08:11 135111192 /usr/local/openmpi4/lib/openmpi/mca_routed_debruijn.so
7ffff1b37000-7ffff1b38000 r--p 00003000 08:11 135111192 /usr/local/openmpi4/lib/openmpi/mca_routed_debruijn.so
7ffff1b38000-7ffff1b39000 rw-p 00004000 08:11 135111192 /usr/local/openmpi4/lib/openmpi/mca_routed_debruijn.so
7ffff1b39000-7ffff1b3d000 r-xp 00000000 08:11 135111190 /usr/local/openmpi4/lib/openmpi/mca_routed_binomial.so
7ffff1b3d000-7ffff1d3c000 ---p 00004000 08:11 135111190 /usr/local/openmpi4/lib/openmpi/mca_routed_binomial.so
7ffff1d3c000-7ffff1d3d000 r--p 00003000 08:11 135111190 /usr/local/openmpi4/lib/openmpi/mca_routed_binomial.so
7ffff1d3d000-7ffff1d3e000 rw-p 00004000 08:11 135111190 /usr/local/openmpi4/lib/openmpi/mca_routed_binomial.so
7ffff1d3e000-7ffff1d3f000 ---p 00000000 00:00 0
7ffff1d3f000-7ffff253f000 rw-p 00000000 00:00 0 [stack:14164]
7ffff253f000-7ffff2544000 r-xp 00000000 08:11 403521570 /usr/local/openmpi4/lib/pmix/mca_psensor_heartbeat.so
7ffff2544000-7ffff2743000 ---p 00005000 08:11 403521570 /usr/local/openmpi4/lib/pmix/mca_psensor_heartbeat.so
7ffff2743000-7ffff2744000 r--p 00004000 08:11 403521570 /usr/local/openmpi4/lib/pmix/mca_psensor_heartbeat.so
7ffff2744000-7ffff2745000 rw-p 00005000 08:11 403521570 /usr/local/openmpi4/lib/pmix/mca_psensor_heartbeat.so
7ffff2745000-7ffff274a000 r-xp 00000000 08:11 403521568 /usr/local/openmpi4/lib/pmix/mca_psensor_file.so
7ffff274a000-7ffff2949000 ---p 00005000 08:11 403521568 /usr/local/openmpi4/lib/pmix/mca_psensor_file.so
7ffff2949000-7ffff294a000 r--p 00004000 08:11 403521568 /usr/local/openmpi4/lib/pmix/mca_psensor_file.so
7ffff294a000-7ffff294b000 rw-p 00005000 08:11 403521568 /usr/local/openmpi4/lib/pmix/mca_psensor_file.so
7ffff294b000-7ffff294c000 ---p 00000000 00:00 0
7ffff294c000-7ffff314c000 rw-p 00000000 00:00 0 [stack:14163]
7ffff314c000-7ffff3153000 r-xp 00000000 08:11 403521562 /usr/local/openmpi4/lib/pmix/mca_preg_native.so
7ffff3153000-7ffff3353000 ---p 00007000 08:11 403521562 /usr/local/openmpi4/lib/pmix/mca_preg_native.so
7ffff3353000-7ffff3354000 r--p 00007000 08:11 403521562 /usr/local/openmpi4/lib/pmix/mca_preg_native.so
7ffff3354000-7ffff3355000 rw-p 00008000 08:11 403521562 /usr/local/openmpi4/lib/pmix/mca_preg_native.so
7ffff3355000-7ffff3357000 r-xp 00000000 08:11 403521572 /usr/local/openmpi4/lib/pmix/mca_pshmem_mmap.so
7ffff3357000-7ffff3556000 ---p 00002000 08:11 403521572 /usr/local/openmpi4/lib/pmix/mca_pshmem_mmap.so
7ffff3556000-7ffff3557000 r--p 00001000 08:11 403521572 /usr/local/openmpi4/lib/pmix/mca_pshmem_mmap.so
7ffff3557000-7ffff3558000 rw-p 00002000 08:11 403521572 /usr/local/openmpi4/lib/pmix/mca_pshmem_mmap.so
7ffff3558000-7ffff3566000 r-xp 00000000 08:11 403521560 /usr/local/openmpi4/lib/pmix/mca_gds_hash.so
7ffff3566000-7ffff3765000 ---p 0000e000 08:11 403521560 /usr/local/openmpi4/lib/pmix/mca_gds_hash.so
7ffff3765000-7ffff3766000 r--p 0000d000 08:11 403521560 /usr/local/openmpi4/lib/pmix/mca_gds_hash.so
7ffff3766000-7ffff3767000 rw-p 0000e000 08:11 403521560 /usr/local/openmpi4/lib/pmix/mca_gds_hash.so
7ffff3767000-7ffff3779000 r-xp 00000000 08:11 403521558 /usr/local/openmpi4/lib/pmix/mca_gds_ds12.so
7ffff3779000-7ffff3979000 ---p 00012000 08:11 403521558 /usr/local/openmpi4/lib/pmix/mca_gds_ds12.so
7ffff3979000-7ffff397a000 r--p 00012000 08:11 403521558 /usr/local/openmpi4/lib/pmix/mca_gds_ds12.so
7ffff397a000-7ffff397b000 rw-p 00013000 08:11 403521558 /usr/local/openmpi4/lib/pmix/mca_gds_ds12.so
7ffff397b000-7ffff397c000 r-xp 00000000 08:11 403521566 /usr/local/openmpi4/lib/pmix/mca_psec_none.so
7ffff397c000-7ffff3b7b000 ---p 00001000 08:11 403521566 /usr/local/openmpi4/lib/pmix/mca_psec_none.so
7ffff3b7b000-7ffff3b7c000 r--p 00000000 08:11 403521566 /usr/local/openmpi4/lib/pmix/mca_psec_none.so
7ffff3b7c000-7ffff3b7d000 rw-p 00001000 08:11 403521566 /usr/local/openmpi4/lib/pmix/mca_psec_none.so
7ffff3b7d000-7ffff3b8a000 r-xp 00000000 08:11 403521576 /usr/local/openmpi4/lib/pmix/mca_ptl_usock.so
7ffff3b8a000-7ffff3d89000 ---p 0000d000 08:11 403521576 /usr/local/openmpi4/lib/pmix/mca_ptl_usock.so
7ffff3d89000-7ffff3d8a000 r--p 0000c000 08:11 403521576 /usr/local/openmpi4/lib/pmix/mca_ptl_usock.so
7ffff3d8a000-7ffff3d8b000 rw-p 0000d000 08:11 403521576 /usr/local/openmpi4/lib/pmix/mca_ptl_usock.so
7ffff3d8b000-7ffff3d9d000 r-xp 00000000 08:11 403521574 /usr/local/openmpi4/lib/pmix/mca_ptl_tcp.so
7ffff3d9d000-7ffff3f9c000 ---p 00012000 08:11 403521574 /usr/local/openmpi4/lib/pmix/mca_ptl_tcp.so
7ffff3f9c000-7ffff3f9d000 r--p 00011000 08:11 403521574 /usr/local/openmpi4/lib/pmix/mca_ptl_tcp.so
7ffff3f9d000-7ffff3f9e000 rw-p 00012000 08:11 403521574 /usr/local/openmpi4/lib/pmix/mca_ptl_tcp.so
7ffff3f9e000-7ffff3fa5000 r-xp 00000000 08:11 403521556 /usr/local/openmpi4/lib/pmix/mca_bfrops_v21.so
7ffff3fa5000-7ffff41a4000 ---p 00007000 08:11 403521556 /usr/local/openmpi4/lib/pmix/mca_bfrops_v21.so
7ffff41a4000-7ffff41a5000 r--p 00006000 08:11 403521556 /usr/local/openmpi4/lib/pmix/mca_bfrops_v21.so
7ffff41a5000-7ffff41a6000 rw-p 00007000 08:11 403521556 /usr/local/openmpi4/lib/pmix/mca_bfrops_v21.so
7ffff41a6000-7ffff41c5000 r-xp 00000000 08:11 403521554 /usr/local/openmpi4/lib/pmix/mca_bfrops_v20.so
7ffff41c5000-7ffff43c4000 ---p 0001f000 08:11 403521554 /usr/local/openmpi4/lib/pmix/mca_bfrops_v20.so
7ffff43c4000-7ffff43c5000 r--p 0001e000 08:11 403521554 /usr/local/openmpi4/lib/pmix/mca_bfrops_v20.so
7ffff43c5000-7ffff43c6000 rw-p 0001f000 08:11 403521554 /usr/local/openmpi4/lib/pmix/mca_bfrops_v20.so
7ffff43c6000-7ffff43da000 r-xp 00000000 08:11 403521552 /usr/local/openmpi4/lib/pmix/mca_bfrops_v12.so
7ffff43da000-7ffff45da000 ---p 00014000 08:11 403521552 /usr/local/openmpi4/lib/pmix/mca_bfrops_v12.so
7ffff45da000-7ffff45db000 r--p 00014000 08:11 403521552 /usr/local/openmpi4/lib/pmix/mca_bfrops_v12.so
7ffff45db000-7ffff45dc000 rw-p 00015000 08:11 403521552 /usr/local/openmpi4/lib/pmix/mca_bfrops_v12.so
7ffff45dc000-7ffff46f8000 r-xp 00000000 08:11 134820542 /usr/local/openmpi4/lib/openmpi/mca_pmix_pmix2x.so
7ffff46f8000-7ffff48f7000 ---p 0011c000 08:11 134820542 /usr/local/openmpi4/lib/openmpi/mca_pmix_pmix2x.so
7ffff48f7000-7ffff48f8000 r--p 0011b000 08:11 134820542 /usr/local/openmpi4/lib/openmpi/mca_pmix_pmix2x.so
7ffff48f8000-7ffff48fd000 rw-p 0011c000 08:11 134820542 /usr/local/openmpi4/lib/openmpi/mca_pmix_pmix2x.so
7ffff48fd000-7ffff48ff000 rw-p 00000000 00:00 0
7ffff4902000-7ffff4904000 r-xp 00000000 08:11 403521564 /usr/local/openmpi4/lib/pmix/mca_psec_native.so
7ffff4904000-7ffff4b03000 ---p 00002000 08:11 403521564 /usr/local/openmpi4/lib/pmix/mca_psec_native.so
7ffff4b03000-7ffff4b04000 r--p 00001000 08:11 403521564 /usr/local/openmpi4/lib/pmix/mca_psec_native.so
7ffff4b04000-7ffff4b05000 rw-p 00002000 08:11 403521564 /usr/local/openmpi4/lib/pmix/mca_psec_native.so
7ffff4b05000-7ffff4b11000 r-xp 00000000 08:11 134822972 /usr/local/openmpi4/lib/openmpi/mca_plm_rsh.so
7ffff4b11000-7ffff4d10000 ---p 0000c000 08:11 134822972 /usr/local/openmpi4/lib/openmpi/mca_plm_rsh.so
7ffff4d10000-7ffff4d11000 r--p 0000b000 08:11 134822972 /usr/local/openmpi4/lib/openmpi/mca_plm_rsh.so
7ffff4d11000-7ffff4d12000 rw-p 0000c000 08:11 134822972 /usr/local/openmpi4/lib/openmpi/mca_plm_rsh.so
7ffff4d14000-7ffff4d18000 r-xp 00000000 08:11 134822966 /usr/local/openmpi4/lib/openmpi/mca_odls_default.so
7ffff4d18000-7ffff4f17000 ---p 00004000 08:11 134822966 /usr/local/openmpi4/lib/openmpi/mca_odls_default.so
7ffff4f17000-7ffff4f18000 r--p 00003000 08:11 134822966 /usr/local/openmpi4/lib/openmpi/mca_odls_default.so
7ffff4f18000-7ffff4f19000 rw-p 00004000 08:11 134822966 /usr/local/openmpi4/lib/openmpi/mca_odls_default.so
7ffff4f19000-7ffff4f1f000 r-xp 00000000 08:11 135111217 /usr/local/openmpi4/lib/openmpi/mca_state_orted.so
7ffff4f1f000-7ffff511e000 ---p 00006000 08:11 135111217 /usr/local/openmpi4/lib/openmpi/mca_state_orted.so
7ffff511e000-7ffff511f000 r--p 00005000 08:11 135111217 /usr/local/openmpi4/lib/openmpi/mca_state_orted.so
7ffff511f000-7ffff5120000 rw-p 00006000 08:11 135111217 /usr/local/openmpi4/lib/openmpi/mca_state_orted.so
7ffff5323000-7ffff532a000 r-xp 00000000 08:11 134822935 /usr/local/openmpi4/lib/openmpi/mca_errmgr_default_orted.so
7ffff532a000-7ffff5529000 ---p 00007000 08:11 134822935 /usr/local/openmpi4/lib/openmpi/mca_errmgr_default_orted.so
7ffff5529000-7ffff552a000 r--p 00006000 08:11 134822935 /usr/local/openmpi4/lib/openmpi/mca_errmgr_default_orted.so
7ffff552a000-7ffff552b000 rw-p 00007000 08:11 134822935 /usr/local/openmpi4/lib/openmpi/mca_errmgr_default_orted.so
7ffff552b000-7ffff552e000 r-xp 00000000 08:11 135111198 /usr/local/openmpi4/lib/openmpi/mca_rtc_hwloc.so
7ffff552e000-7ffff572d000 ---p 00003000 08:11 135111198 /usr/local/openmpi4/lib/openmpi/mca_rtc_hwloc.so
7ffff572d000-7ffff572e000 r--p 00002000 08:11 135111198 /usr/local/openmpi4/lib/openmpi/mca_rtc_hwloc.so
7ffff572e000-7ffff572f000 rw-p 00003000 08:11 135111198 /usr/local/openmpi4/lib/openmpi/mca_rtc_hwloc.so
7ffff572f000-7ffff5737000 r-xp 00000000 08:11 134822956 /usr/local/openmpi4/lib/openmpi/mca_grpcomm_direct.so
7ffff5737000-7ffff5936000 ---p 00008000 08:11 134822956 /usr/local/openmpi4/lib/openmpi/mca_grpcomm_direct.so
7ffff5936000-7ffff5937000 r--p 00007000 08:11 134822956 /usr/local/openmpi4/lib/openmpi/mca_grpcomm_direct.so
7ffff5937000-7ffff5938000 rw-p 00008000 08:11 134822956 /usr/local/openmpi4/lib/openmpi/mca_grpcomm_direct.so
7ffff5938000-7ffff593c000 r-xp 00000000 08:11 134822912 /usr/local/openmpi4/lib/openmpi/mca_pstat_linux.so
7ffff593c000-7ffff5b3b000 ---p 00004000 08:11 134822912 /usr/local/openmpi4/lib/openmpi/mca_pstat_linux.so
7ffff5b3b000-7ffff5b3c000 r--p 00003000 08:11 134822912 /usr/local/openmpi4/lib/openmpi/mca_pstat_linux.so
7ffff5b3c000-7ffff5b3d000 rw-p 00004000 08:11 134822912 /usr/local/openmpi4/lib/openmpi/mca_pstat_linux.so
7ffff5b3d000-7ffff5b3f000 r-xp 00000000 08:11 134822941 /usr/local/openmpi4/lib/openmpi/mca_ess_env.so
7ffff5b3f000-7ffff5d3e000 ---p 00002000 08:11 134822941 /usr/local/openmpi4/lib/openmpi/mca_ess_env.so
7ffff5d3e000-7ffff5d3f000 r--p 00001000 08:11 134822941 /usr/local/openmpi4/lib/openmpi/mca_ess_env.so
7ffff5d3f000-7ffff5d40000 rw-p 00002000 08:11 134822941 /usr/local/openmpi4/lib/openmpi/mca_ess_env.so
7ffff5d40000-7ffff5d42000 r-xp 00000000 08:11 135111207 /usr/local/openmpi4/lib/openmpi/mca_schizo_slurm.so
7ffff5d42000-7ffff5f41000 ---p 00002000 08:11 135111207 /usr/local/openmpi4/lib/openmpi/mca_schizo_slurm.so
7ffff5f41000-7ffff5f42000 r--p 00001000 08:11 135111207 /usr/local/openmpi4/lib/openmpi/mca_schizo_slurm.so
7ffff5f42000-7ffff5f43000 rw-p 00002000 08:11 135111207 /usr/local/openmpi4/lib/openmpi/mca_schizo_slurm.so
7ffff5f43000-7ffff5f44000 r-xp 00000000 08:11 135111205 /usr/local/openmpi4/lib/openmpi/mca_schizo_orte.so
7ffff5f44000-7ffff6144000 ---p 00001000 08:11 135111205 /usr/local/openmpi4/lib/openmpi/mca_schizo_orte.so
7ffff6144000-7ffff6145000 r--p 00001000 08:11 135111205 /usr/local/openmpi4/lib/openmpi/mca_schizo_orte.so
7ffff6145000-7ffff6146000 rw-p 00002000 08:11 135111205 /usr/local/openmpi4/lib/openmpi/mca_schizo_orte.so
7ffff6146000-7ffff614e000 r-xp 00000000 08:11 135111202 /usr/local/openmpi4/lib/openmpi/mca_schizo_ompi.so
7ffff614e000-7ffff634e000 ---p 00008000 08:11 135111202 /usr/local/openmpi4/lib/openmpi/mca_schizo_ompi.so
7ffff634e000-7ffff634f000 r--p 00008000 08:11 135111202 /usr/local/openmpi4/lib/openmpi/mca_schizo_ompi.so
7ffff634f000-7ffff6351000 rw-p 00009000 08:11 135111202 /usr/local/openmpi4/lib/openmpi/mca_schizo_ompi.so
7ffff6351000-7ffff6352000 r-xp 00000000 08:11 135111200 /usr/local/openmpi4/lib/openmpi/mca_schizo_flux.so
7ffff6352000-7ffff6552000 ---p 00001000 08:11 135111200 /usr/local/openmpi4/lib/openmpi/mca_schizo_flux.so
7ffff6552000-7ffff6553000 r--p 00001000 08:11 135111200 /usr/local/openmpi4/lib/openmpi/mca_schizo_flux.so
7ffff6553000-7ffff6554000 rw-p 00002000 08:11 135111200 /usr/local/openmpi4/lib/openmpi/mca_schizo_flux.so
7ffff6554000-7ffff6556000 r-xp 00000000 08:11 134822916 /usr/local/openmpi4/lib/openmpi/mca_reachable_weighted.so
7ffff6556000-7ffff6755000 ---p 00002000 08:11 134822916 /usr/local/openmpi4/lib/openmpi/mca_reachable_weighted.so
7ffff6755000-7ffff6756000 r--p 00001000 08:11 134822916 /usr/local/openmpi4/lib/openmpi/mca_reachable_weighted.so
7ffff6756000-7ffff6757000 rw-p 00002000 08:11 134822916 /usr/local/openmpi4/lib/openmpi/mca_reachable_weighted.so
7ffff6757000-7ffff675b000 r-xp 00000000 08:11 134822919 /usr/local/openmpi4/lib/openmpi/mca_shmem_mmap.so
7ffff675b000-7ffff695a000 ---p 00004000 08:11 134822919 /usr/local/openmpi4/lib/openmpi/mca_shmem_mmap.so
7ffff695a000-7ffff695b000 r--p 00003000 08:11 134822919 /usr/local/openmpi4/lib/openmpi/mca_shmem_mmap.so
7ffff695b000-7ffff695c000 rw-p 00004000 08:11 134822919 /usr/local/openmpi4/lib/openmpi/mca_shmem_mmap.so
7ffff695c000-7ffff695e000 r-xp 00000000 08:11 805806653 /lib/x86_64-linux-gnu/libutil-2.23.so
7ffff695e000-7ffff6b5d000 ---p 00002000 08:11 805806653 /lib/x86_64-linux-gnu/libutil-2.23.so
7ffff6b5d000-7ffff6b5e000 r--p 00001000 08:11 805806653 /lib/x86_64-linux-gnu/libutil-2.23.so
7ffff6b5e000-7ffff6b5f000 rw-p 00002000 08:11 805806653 /lib/x86_64-linux-gnu/libutil-2.23.so
7ffff6b5f000-7ffff6b66000 r-xp 00000000 08:11 805806634 /lib/x86_64-linux-gnu/librt-2.23.so
7ffff6b66000-7ffff6d65000 ---p 00007000 08:11 805806634 /lib/x86_64-linux-gnu/librt-2.23.so
7ffff6d65000-7ffff6d66000 r--p 00006000 08:11 805806634 /lib/x86_64-linux-gnu/librt-2.23.so
7ffff6d66000-7ffff6d67000 rw-p 00007000 08:11 805806634 /lib/x86_64-linux-gnu/librt-2.23.so
7ffff6d67000-7ffff6d6a000 r-xp 00000000 08:11 805806636 /lib/x86_64-linux-gnu/libdl-2.23.so
7ffff6d6a000-7ffff6f69000 ---p 00003000 08:11 805806636 /lib/x86_64-linux-gnu/libdl-2.23.so
7ffff6f69000-7ffff6f6a000 r--p 00002000 08:11 805806636 /lib/x86_64-linux-gnu/libdl-2.23.so
7ffff6f6a000-7ffff6f6b000 rw-p 00003000 08:11 805806636 /lib/x86_64-linux-gnu/libdl-2.23.so
7ffff6f6b000-7ffff6f84000 r-xp 00000000 08:11 4027015826 /lib/x86_64-linux-gnu/libz.so.1.2.8
7ffff6f84000-7ffff7183000 ---p 00019000 08:11 4027015826 /lib/x86_64-linux-gnu/libz.so.1.2.8
7ffff7183000-7ffff7184000 r--p 00018000 08:11 4027015826 /lib/x86_64-linux-gnu/libz.so.1.2.8
7ffff7184000-7ffff7185000 rw-p 00019000 08:11 4027015826 /lib/x86_64-linux-gnu/libz.so.1.2.8
7ffff7185000-7ffff72c3000 r-xp 00000000 08:11 1206287 /usr/local/openmpi4/lib/libopen-pal.so.40.10.4
7ffff72c3000-7ffff74c3000 ---p 0013e000 08:11 1206287 /usr/local/openmpi4/lib/libopen-pal.so.40.10.4
7ffff74c3000-7ffff74c6000 r--p 0013e000 08:11 1206287 /usr/local/openmpi4/lib/libopen-pal.so.40.10.4
7ffff74c6000-7ffff74cd000 rw-p 00141000 08:11 1206287 /usr/local/openmpi4/lib/libopen-pal.so.40.10.4
7ffff74cd000-7ffff74d5000 rw-p 00000000 00:00 0
7ffff74d5000-7ffff7695000 r-xp 00000000 08:11 805806651 /lib/x86_64-linux-gnu/libc-2.23.so
7ffff7695000-7ffff7895000 ---p 001c0000 08:11 805806651 /lib/x86_64-linux-gnu/libc-2.23.so
7ffff7895000-7ffff7899000 r--p 001c0000 08:11 805806651 /lib/x86_64-linux-gnu/libc-2.23.so
7ffff7899000-7ffff789b000 rw-p 001c4000 08:11 805806651 /lib/x86_64-linux-gnu/libc-2.23.so
7ffff789b000-7ffff789f000 rw-p 00000000 00:00 0
7ffff789f000-7ffff78b7000 r-xp 00000000 08:11 805806638 /lib/x86_64-linux-gnu/libpthread-2.23.so
7ffff78b7000-7ffff7ab6000 ---p 00018000 08:11 805806638 /lib/x86_64-linux-gnu/libpthread-2.23.so
7ffff7ab6000-7ffff7ab7000 r--p 00017000 08:11 805806638 /lib/x86_64-linux-gnu/libpthread-2.23.so
7ffff7ab7000-7ffff7ab8000 rw-p 00018000 08:11 805806638 /lib/x86_64-linux-gnu/libpthread-2.23.so
7ffff7ab8000-7ffff7abc000 rw-p 00000000 00:00 0
7ffff7abc000-7ffff7bce000 r-xp 00000000 08:11 1206295 /usr/local/openmpi4/lib/libopen-rte.so.40.10.4
7ffff7bce000-7ffff7dce000 ---p 00112000 08:11 1206295 /usr/local/openmpi4/lib/libopen-rte.so.40.10.4
7ffff7dce000-7ffff7dcf000 r--p 00112000 08:11 1206295 /usr/local/openmpi4/lib/libopen-rte.so.40.10.4
7ffff7dcf000-7ffff7dd4000 rw-p 00113000 08:11 1206295 /usr/local/openmpi4/lib/libopen-rte.so.40.10.4
7ffff7dd4000-7ffff7dd7000 rw-p 00000000 00:00 0
7ffff7dd7000-7ffff7dfd000 r-xp 00000000 08:11 805806637 /lib/x86_64-linux-gnu/ld-2.23.so
7ffff7f5e000-7ffff7ff4000 rw-p 00000000 00:00 0
7ffff7ff8000-7ffff7ffa000 rw-p 00000000 00:00 0
7ffff7ffa000-7ffff7ffc000 r-xp 00000000 00:00 0 [vdso]
7ffff7ffc000-7ffff7ffd000 r--p 00025000 08:11 805806637 /lib/x86_64-linux-gnu/ld-2.23.so
7ffff7ffd000-7ffff7ffe000 rw-p 00026000 08:11 805806637 /lib/x86_64-linux-gnu/ld-2.23.so
7ffff7ffe000-7ffff7fff000 rw-p 00000000 00:00 0
7ffffffdb000-7ffffffff000 rw-p 00000000 00:00 0 [stack]
ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0 [vsyscall]
- Debugging orted
$ gdb /usr/local/openmpi4/bin/orted
(gdb) set args -mca ess "env" -mca ess_base_jobid "3834183680" -mca ess_base_vpid 1 -mca ess_base_num_procs "2" -mca orte_node_regex "pery-[14:4254712551]-worker-1,[2:10].42.141.4@0(2)" -mca orte_hnp_uri "3834183680.0;tcp://10.42.138.4:42355" -mca plm "rsh" -mca pmix "^s1,s2,cray,isolated"
- Problem found
in /notebooks/openmpi-4.0.1/orte/mca/regx/base/regx_base_default_fns.c
static int regex_parse_node_range(char *base, char *range, int num_digits, char *suffix, char ***names) {
    // .........
    /* Look for the beginning of the first number */
    for (found = false, i = 0; i < len; ++i) {
        if (isdigit((int) range[i])) {
            if (!found) {
                start = atoi(range + i); // atoi() overflows here for large values, which can cause memory corruption
                found = true;
                break;
            }
        }
    }
    //...........
}
- Solution
    /* Look for the beginning of the first number */
    for (found = false, i = 0; i < len; ++i) {
        if (isdigit((int) range[i])) {
            if (!found) {
                start = strtol(range + i, NULL, 10);
                // start = atoi(range + i); // old code
                found = true;
                break;
            }
        }
    }