Conversation

@hjelmn hjelmn commented May 27, 2015

Quieting Coverity issues in opal.

@mellanox-github

Refer to this link for build results (access rights to CI server needed):
http://bgate.mellanox.com/jenkins/job/gh-ompi-master-pr/558/

Build Log
last 50 lines

[...truncated 22056 lines...]
 * Run your application with MPI_THREAD_SINGLE.
 * Set the MCA parameter btl_base_verbose to 100 (or mtl_base_verbose,
   if using MTL-based communications) to see exactly which
   communication plugins were considered and/or discarded.
--------------------------------------------------------------------------
*** An error occurred in MPI_Init
*** on a NULL communicator
*** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
***    and potentially your MPI job)
*** An error occurred in MPI_Init
*** on a NULL communicator
*** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
***    and potentially your MPI job)
*** An error occurred in MPI_Init
*** on a NULL communicator
*** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
***    and potentially your MPI job)
*** An error occurred in MPI_Init
*** on a NULL communicator
*** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
***    and potentially your MPI job)
*** An error occurred in MPI_Init
*** on a NULL communicator
*** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
***    and potentially your MPI job)
*** An error occurred in MPI_Init
*** on a NULL communicator
*** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
***    and potentially your MPI job)
*** An error occurred in MPI_Init
*** on a NULL communicator
*** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
***    and potentially your MPI job)
*** An error occurred in MPI_Init
*** on a NULL communicator
*** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
***    and potentially your MPI job)
[jenkins01:24595] 7 more processes have sent help message help-mpi-btl-openib.txt / invalid pp qp specification
[jenkins01:24595] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages
[jenkins01:24595] 7 more processes have sent help message help-mca-bml-r2.txt / unreachable proc
[jenkins01:24595] 7 more processes have sent help message help-mpi-runtime.txt / mpi_init:startup:pml-add-procs-fail
Build step 'Execute shell' marked build as failure
[htmlpublisher] Archiving HTML reports...
[htmlpublisher] Archiving at BUILD level /var/lib/jenkins/jobs/gh-ompi-master-pr/workspace/cov_build to /var/lib/jenkins/jobs/gh-ompi-master-pr/builds/558/htmlreports/Coverity_Report
Setting commit status on GitHub for https://github.com/open-mpi/ompi/commit/a12c47800acc18d29f6c27cffd095a68739b83cf
[BFA] Scanning build for known causes...
[BFA] No failure causes found
[BFA] Done. 0s
Setting status of 73d45bb30a1b737d73b3fe913a8290a50b0bf761 to FAILURE with url http://bgate.mellanox.com:8888/jenkins/job/gh-ompi-master-pr/558/ and message: Build finished.

Test FAILed.

@mellanox-github

Refer to this link for build results (access rights to CI server needed):
http://bgate.mellanox.com/jenkins/job/gh-ompi-master-pr/559/

Build Log
last 50 lines

[...truncated 22126 lines...]
 6 0x000000000001cb20 mca_btl_openib_endpoint_post_rr_nolock()  /scrap/jenkins/jenkins/jobs/gh-ompi-master-pr/workspace/opal/mca/btl/openib/btl_openib_endpoint.h:424
 7 0x000000000001f740 mca_btl_openib_endpoint_post_recvs()  /scrap/jenkins/jenkins/jobs/gh-ompi-master-pr/workspace/opal/mca/btl/openib/btl_openib_endpoint.c:441
 8 0x000000000003cb1b udcm_rc_qps_to_rts()  /scrap/jenkins/jenkins/jobs/gh-ompi-master-pr/workspace/opal/mca/btl/openib/connect/btl_openib_connect_udcm.c:1309
 9 0x00000000000398ec udcm_endpoint_init_self()  /scrap/jenkins/jenkins/jobs/gh-ompi-master-pr/workspace/opal/mca/btl/openib/connect/btl_openib_connect_udcm.c:590
10 0x0000000000039b08 udcm_endpoint_init()  /scrap/jenkins/jenkins/jobs/gh-ompi-master-pr/workspace/opal/mca/btl/openib/connect/btl_openib_connect_udcm.c:624
11 0x000000000000cb9a mca_btl_openib_add_procs()  /scrap/jenkins/jenkins/jobs/gh-ompi-master-pr/workspace/opal/mca/btl/openib/btl_openib.c:1033
12 0x0000000000002043 mca_bml_r2_add_procs()  /scrap/jenkins/jenkins/jobs/gh-ompi-master-pr/workspace/ompi/mca/bml/r2/bml_r2.c:219
13 0x0000000000005e1d mca_pml_ob1_add_procs()  /scrap/jenkins/jenkins/jobs/gh-ompi-master-pr/workspace/ompi/mca/pml/ob1/pml_ob1.c:333
14 0x0000000000053881 ompi_mpi_init()  /scrap/jenkins/jenkins/jobs/gh-ompi-master-pr/workspace/ompi/runtime/ompi_mpi_init.c:757
15 0x00000000000930ec PMPI_Init()  /scrap/jenkins/jenkins/jobs/gh-ompi-master-pr/workspace/ompi/mpi/c/profile/pinit.c:84
16 0x0000000000400826 main()  /scrap/jenkins/jenkins/jobs/gh-ompi-master-pr/workspace/ompi_install1/examples/hello_c.c:18
17 0x0000003d6901ed1d __libc_start_main()  ??:0
18 0x0000000000400739 _start()  ??:0
===================
15 0x00000000000930ec PMPI_Init()  /scrap/jenkins/jenkins/jobs/gh-ompi-master-pr/workspace/ompi/mpi/c/profile/pinit.c:84
16 0x0000000000400826 main()  /scrap/jenkins/jenkins/jobs/gh-ompi-master-pr/workspace/ompi_install1/examples/hello_c.c:18
17 0x0000003d6901ed1d __libc_start_main()  ??:0
18 0x0000000000400739 _start()  ??:0
===================
==== backtrace ====
 2 0x00000000000597fc mxm_handle_error()  /var/tmp/OFED_topdir/BUILD/mxm-3.3.3052/src/mxm/util/debug/debug.c:641
 3 0x000000000005996c mxm_error_signal_handler()  /var/tmp/OFED_topdir/BUILD/mxm-3.3.3052/src/mxm/util/debug/debug.c:616
 4 0x0000003d690329a0 killpg()  ??:0
 5 0x000000000001c777 post_recvs()  /scrap/jenkins/jenkins/jobs/gh-ompi-master-pr/workspace/opal/mca/btl/openib/btl_openib_endpoint.h:380
 6 0x000000000001cb20 mca_btl_openib_endpoint_post_rr_nolock()  /scrap/jenkins/jenkins/jobs/gh-ompi-master-pr/workspace/opal/mca/btl/openib/btl_openib_endpoint.h:424
 7 0x000000000001f740 mca_btl_openib_endpoint_post_recvs()  /scrap/jenkins/jenkins/jobs/gh-ompi-master-pr/workspace/opal/mca/btl/openib/btl_openib_endpoint.c:441
 8 0x000000000003cb1b udcm_rc_qps_to_rts()  /scrap/jenkins/jenkins/jobs/gh-ompi-master-pr/workspace/opal/mca/btl/openib/connect/btl_openib_connect_udcm.c:1309
 9 0x00000000000398ec udcm_endpoint_init_self()  /scrap/jenkins/jenkins/jobs/gh-ompi-master-pr/workspace/opal/mca/btl/openib/connect/btl_openib_connect_udcm.c:590
10 0x0000000000039b08 udcm_endpoint_init()  /scrap/jenkins/jenkins/jobs/gh-ompi-master-pr/workspace/opal/mca/btl/openib/connect/btl_openib_connect_udcm.c:624
11 0x000000000000cb9a mca_btl_openib_add_procs()  /scrap/jenkins/jenkins/jobs/gh-ompi-master-pr/workspace/opal/mca/btl/openib/btl_openib.c:1033
12 0x0000000000002043 mca_bml_r2_add_procs()  /scrap/jenkins/jenkins/jobs/gh-ompi-master-pr/workspace/ompi/mca/bml/r2/bml_r2.c:219
13 0x0000000000005e1d mca_pml_ob1_add_procs()  /scrap/jenkins/jenkins/jobs/gh-ompi-master-pr/workspace/ompi/mca/pml/ob1/pml_ob1.c:333
14 0x0000000000053881 ompi_mpi_init()  /scrap/jenkins/jenkins/jobs/gh-ompi-master-pr/workspace/ompi/runtime/ompi_mpi_init.c:757
15 0x00000000000930ec PMPI_Init()  /scrap/jenkins/jenkins/jobs/gh-ompi-master-pr/workspace/ompi/mpi/c/profile/pinit.c:84
16 0x0000000000400826 main()  /scrap/jenkins/jenkins/jobs/gh-ompi-master-pr/workspace/ompi_install1/examples/hello_c.c:18
17 0x0000003d6901ed1d __libc_start_main()  ??:0
18 0x0000000000400739 _start()  ??:0
===================
--------------------------------------------------------------------------
mpirun noticed that process rank 5 with PID 0 on node jenkins01 exited on signal 11 (Segmentation fault).
--------------------------------------------------------------------------
Build step 'Execute shell' marked build as failure
[htmlpublisher] Archiving HTML reports...
[htmlpublisher] Archiving at BUILD level /var/lib/jenkins/jobs/gh-ompi-master-pr/workspace/cov_build to /var/lib/jenkins/jobs/gh-ompi-master-pr/builds/559/htmlreports/Coverity_Report
Setting commit status on GitHub for https://github.com/open-mpi/ompi/commit/915db19cdf186355ae02d6325f7aa8c3ffb96ce6
[BFA] Scanning build for known causes...
[BFA] No failure causes found
[BFA] Done. 0s
Setting status of a50780d93f9c8d4b64178f1a447afdab15f86bf5 to FAILURE with url http://bgate.mellanox.com:8888/jenkins/job/gh-ompi-master-pr/559/ and message: Build finished.

Test FAILed.

hjelmn commented May 27, 2015

D'oh. I messed up the dead code fix. Now fixed.

@mellanox-github

Refer to this link for build results (access rights to CI server needed):
http://bgate.mellanox.com/jenkins/job/gh-ompi-master-pr/560/

Build Log
last 50 lines

[...truncated 22076 lines...]
15 0x0000000000004b36 mca_coll_hcoll_comm_query()  /scrap/jenkins/jenkins/jobs/gh-ompi-master-pr/workspace/ompi/mca/coll/hcoll/coll_hcoll_module.c:294
16 0x00000000000e4916 query_2_0_0()  /scrap/jenkins/jenkins/jobs/gh-ompi-master-pr/workspace/ompi/mca/coll/base/coll_base_comm_select.c:382
17 0x00000000000e48d5 query()  /scrap/jenkins/jenkins/jobs/gh-ompi-master-pr/workspace/ompi/mca/coll/base/coll_base_comm_select.c:365
18 0x00000000000e47df check_one_component()  /scrap/jenkins/jenkins/jobs/gh-ompi-master-pr/workspace/ompi/mca/coll/base/coll_base_comm_select.c:327
19 0x00000000000e4619 check_components()  /scrap/jenkins/jenkins/jobs/gh-ompi-master-pr/workspace/ompi/mca/coll/base/coll_base_comm_select.c:291
20 0x00000000000dcd70 mca_coll_base_comm_select()  /scrap/jenkins/jenkins/jobs/gh-ompi-master-pr/workspace/ompi/mca/coll/base/coll_base_comm_select.c:123
21 0x0000000000053a56 ompi_mpi_init()  /scrap/jenkins/jenkins/jobs/gh-ompi-master-pr/workspace/ompi/runtime/ompi_mpi_init.c:855
22 0x00000000000930ec PMPI_Init()  /scrap/jenkins/jenkins/jobs/gh-ompi-master-pr/workspace/ompi/mpi/c/profile/pinit.c:84
23 0x0000000000400826 main()  /scrap/jenkins/jenkins/jobs/gh-ompi-master-pr/workspace/ompi_install1/examples/hello_c.c:18
24 0x0000003d6901ed1d __libc_start_main()  ??:0
25 0x0000000000400739 _start()  ??:0
===================
==== backtrace ====
 2 0x00000000000597fc mxm_handle_error()  /var/tmp/OFED_topdir/BUILD/mxm-3.3.3052/src/mxm/util/debug/debug.c:641
 3 0x000000000005996c mxm_error_signal_handler()  /var/tmp/OFED_topdir/BUILD/mxm-3.3.3052/src/mxm/util/debug/debug.c:616
 4 0x0000003d690329a0 killpg()  ??:0
 5 0x000000000001a768 handle_wc()  /scrap/jenkins/jenkins/jobs/gh-ompi-master-pr/workspace/opal/mca/btl/openib/btl_openib_component.c:3524
 6 0x000000000001ad1c poll_device()  /scrap/jenkins/jenkins/jobs/gh-ompi-master-pr/workspace/opal/mca/btl/openib/btl_openib_component.c:3645
 7 0x000000000001b1c4 progress_one_device()  /scrap/jenkins/jenkins/jobs/gh-ompi-master-pr/workspace/opal/mca/btl/openib/btl_openib_component.c:3755
 8 0x000000000001b25c btl_openib_component_progress()  /scrap/jenkins/jenkins/jobs/gh-ompi-master-pr/workspace/opal/mca/btl/openib/btl_openib_component.c:3778
 9 0x0000000000034eff opal_progress()  /scrap/jenkins/jenkins/jobs/gh-ompi-master-pr/workspace/opal/runtime/opal_progress.c:189
10 0x0000000000005eb1 progress()  /scrap/jenkins/jenkins/jobs/gh-ompi-master-pr/workspace/ompi/mca/coll/hcoll/coll_hcoll_rte.c:89
11 0x000000000006ec5c wait_completion()  hcoll_collectives.c:0
12 0x0000000000025d61 comm_allreduce_hcolrte()  ??:0
13 0x00000000000332f4 hmca_coll_ml_comm_query()  ??:0
14 0x000000000006ee29 hcoll_create_context()  ??:0
15 0x0000000000004b36 mca_coll_hcoll_comm_query()  /scrap/jenkins/jenkins/jobs/gh-ompi-master-pr/workspace/ompi/mca/coll/hcoll/coll_hcoll_module.c:294
16 0x00000000000e4916 query_2_0_0()  /scrap/jenkins/jenkins/jobs/gh-ompi-master-pr/workspace/ompi/mca/coll/base/coll_base_comm_select.c:382
17 0x00000000000e48d5 query()  /scrap/jenkins/jenkins/jobs/gh-ompi-master-pr/workspace/ompi/mca/coll/base/coll_base_comm_select.c:365
18 0x00000000000e47df check_one_component()  /scrap/jenkins/jenkins/jobs/gh-ompi-master-pr/workspace/ompi/mca/coll/base/coll_base_comm_select.c:327
19 0x00000000000e4619 check_components()  /scrap/jenkins/jenkins/jobs/gh-ompi-master-pr/workspace/ompi/mca/coll/base/coll_base_comm_select.c:291
20 0x00000000000dcd70 mca_coll_base_comm_select()  /scrap/jenkins/jenkins/jobs/gh-ompi-master-pr/workspace/ompi/mca/coll/base/coll_base_comm_select.c:123
21 0x0000000000053a56 ompi_mpi_init()  /scrap/jenkins/jenkins/jobs/gh-ompi-master-pr/workspace/ompi/runtime/ompi_mpi_init.c:855
22 0x00000000000930ec PMPI_Init()  /scrap/jenkins/jenkins/jobs/gh-ompi-master-pr/workspace/ompi/mpi/c/profile/pinit.c:84
23 0x0000000000400826 main()  /scrap/jenkins/jenkins/jobs/gh-ompi-master-pr/workspace/ompi_install1/examples/hello_c.c:18
24 0x0000003d6901ed1d __libc_start_main()  ??:0
25 0x0000000000400739 _start()  ??:0
===================
--------------------------------------------------------------------------
mpirun noticed that process rank 7 with PID 0 on node jenkins01 exited on signal 13 (Broken pipe).
--------------------------------------------------------------------------
Build step 'Execute shell' marked build as failure
[htmlpublisher] Archiving HTML reports...
[htmlpublisher] Archiving at BUILD level /var/lib/jenkins/jobs/gh-ompi-master-pr/workspace/cov_build to /var/lib/jenkins/jobs/gh-ompi-master-pr/builds/560/htmlreports/Coverity_Report
Setting commit status on GitHub for https://github.com/open-mpi/ompi/commit/3c39051b8c1da00ad14c64a53794f442f2a0511a
[BFA] Scanning build for known causes...
[BFA] No failure causes found
[BFA] Done. 0s
Setting status of 964debba814e7085fa841bf57196d6857fbf4fe1 to FAILURE with url http://bgate.mellanox.com:8888/jenkins/job/gh-ompi-master-pr/560/ and message: Build finished.

Test FAILed.

@mellanox-github

Refer to this link for build results (access rights to CI server needed):
http://bgate.mellanox.com/jenkins/job/gh-ompi-master-pr/562/

Build Log
last 50 lines

[...truncated 6232 lines...]
   - add LIBDIR to the `LD_RUN_PATH' environment variable
     during linking
   - use the `-Wl,-rpath -Wl,LIBDIR' linker flag
   - have your system administrator add LIBDIR to `/etc/ld.so.conf'

See any operating system documentation about shared libraries for
more information, such as the ld(1) and ld.so(8) manual pages.
----------------------------------------------------------------------
make[3]: Leaving directory `/scrap/jenkins/jenkins/jobs/gh-ompi-master-pr/workspace/opal/mca/btl/vader'
make[2]: Leaving directory `/scrap/jenkins/jenkins/jobs/gh-ompi-master-pr/workspace/opal/mca/btl/vader'
Making install in mca/btl/openib
make[2]: Entering directory `/scrap/jenkins/jenkins/jobs/gh-ompi-master-pr/workspace/opal/mca/btl/openib'
  CC       btl_openib_component.lo
  CC       btl_openib.lo
  CC       btl_openib_endpoint.lo
  CC       btl_openib_frag.lo
  CC       btl_openib_proc.lo
  LEX      btl_openib_lex.c
  CC       btl_openib_mca.lo
  CC       btl_openib_ini.lo
  CC       btl_openib_async.lo
  CC       btl_openib_xrc.lo
  CC       btl_openib_fd.lo
  CC       btl_openib_ip.lo
  CC       btl_openib_put.lo
  CC       btl_openib_get.lo
  CC       btl_openib_atomic.lo
  CC       connect/btl_openib_connect_base.lo
  CC       connect/btl_openib_connect_rdmacm.lo
  CC       connect/btl_openib_connect_empty.lo
  CC       connect/btl_openib_connect_udcm.lo
  CC       connect/btl_openib_connect_sl.lo
  CC       btl_openib_lex.lo
btl_openib_mca.c: In function 'reg_string':
btl_openib_mca.c:122: error: lvalue required as left operand of assignment
make[2]: *** [btl_openib_mca.lo] Error 1
make[2]: *** Waiting for unfinished jobs....
make[2]: Leaving directory `/scrap/jenkins/jenkins/jobs/gh-ompi-master-pr/workspace/opal/mca/btl/openib'
make[1]: *** [install-recursive] Error 1
make[1]: Leaving directory `/scrap/jenkins/jenkins/jobs/gh-ompi-master-pr/workspace/opal'
make: *** [install-recursive] Error 1
Build step 'Execute shell' marked build as failure
[htmlpublisher] Archiving HTML reports...
[htmlpublisher] Archiving at BUILD level /var/lib/jenkins/jobs/gh-ompi-master-pr/workspace/cov_build to /var/lib/jenkins/jobs/gh-ompi-master-pr/builds/562/htmlreports/Coverity_Report
Setting commit status on GitHub for https://github.com/open-mpi/ompi/commit/8fc9e5f57a664c802dd0d6243bfeaa250dc99c38
[BFA] Scanning build for known causes...
[BFA] No failure causes found
[BFA] Done. 0s
Setting status of 74a8b5fce3a9cdfa0b2622cc63a32e35c08c994d to FAILURE with url http://bgate.mellanox.com:8888/jenkins/job/gh-ompi-master-pr/562/ and message: Build finished.

Test FAILed.

@mellanox-github

Refer to this link for build results (access rights to CI server needed):
http://bgate.mellanox.com/jenkins/job/gh-ompi-master-pr/563/

Build Log
last 50 lines

[...truncated 22160 lines...]
15 0x0000000000004b36 mca_coll_hcoll_comm_query()  /scrap/jenkins/jenkins/jobs/gh-ompi-master-pr/workspace/ompi/mca/coll/hcoll/coll_hcoll_module.c:294
16 0x00000000000e4916 query_2_0_0()  /scrap/jenkins/jenkins/jobs/gh-ompi-master-pr/workspace/ompi/mca/coll/base/coll_base_comm_select.c:382
17 0x00000000000e48d5 query()  /scrap/jenkins/jenkins/jobs/gh-ompi-master-pr/workspace/ompi/mca/coll/base/coll_base_comm_select.c:365
18 0x00000000000e47df check_one_component()  /scrap/jenkins/jenkins/jobs/gh-ompi-master-pr/workspace/ompi/mca/coll/base/coll_base_comm_select.c:327
19 0x00000000000e4619 check_components()  /scrap/jenkins/jenkins/jobs/gh-ompi-master-pr/workspace/ompi/mca/coll/base/coll_base_comm_select.c:291
20 0x00000000000dcd70 mca_coll_base_comm_select()  /scrap/jenkins/jenkins/jobs/gh-ompi-master-pr/workspace/ompi/mca/coll/base/coll_base_comm_select.c:123
21 0x0000000000053a56 ompi_mpi_init()  /scrap/jenkins/jenkins/jobs/gh-ompi-master-pr/workspace/ompi/runtime/ompi_mpi_init.c:855
22 0x00000000000930ec PMPI_Init()  /scrap/jenkins/jenkins/jobs/gh-ompi-master-pr/workspace/ompi/mpi/c/profile/pinit.c:84
23 0x0000000000400826 main()  /scrap/jenkins/jenkins/jobs/gh-ompi-master-pr/workspace/ompi_install1/examples/hello_c.c:18
24 0x0000003d6901ed1d __libc_start_main()  ??:0
25 0x0000000000400739 _start()  ??:0
===================
==== backtrace ====
 2 0x00000000000597fc mxm_handle_error()  /var/tmp/OFED_topdir/BUILD/mxm-3.3.3052/src/mxm/util/debug/debug.c:641
 3 0x000000000005996c mxm_error_signal_handler()  /var/tmp/OFED_topdir/BUILD/mxm-3.3.3052/src/mxm/util/debug/debug.c:616
 4 0x0000003d690329a0 killpg()  ??:0
 5 0x000000000001a70d handle_wc()  /scrap/jenkins/jenkins/jobs/gh-ompi-master-pr/workspace/opal/mca/btl/openib/btl_openib_component.c:3524
 6 0x000000000001acc1 poll_device()  /scrap/jenkins/jenkins/jobs/gh-ompi-master-pr/workspace/opal/mca/btl/openib/btl_openib_component.c:3645
 7 0x000000000001b169 progress_one_device()  /scrap/jenkins/jenkins/jobs/gh-ompi-master-pr/workspace/opal/mca/btl/openib/btl_openib_component.c:3755
 8 0x000000000001b201 btl_openib_component_progress()  /scrap/jenkins/jenkins/jobs/gh-ompi-master-pr/workspace/opal/mca/btl/openib/btl_openib_component.c:3778
 9 0x0000000000034eff opal_progress()  /scrap/jenkins/jenkins/jobs/gh-ompi-master-pr/workspace/opal/runtime/opal_progress.c:189
10 0x0000000000005eb1 progress()  /scrap/jenkins/jenkins/jobs/gh-ompi-master-pr/workspace/ompi/mca/coll/hcoll/coll_hcoll_rte.c:89
11 0x000000000006ec5c wait_completion()  hcoll_collectives.c:0
12 0x0000000000025d61 comm_allreduce_hcolrte()  ??:0
13 0x00000000000332f4 hmca_coll_ml_comm_query()  ??:0
14 0x000000000006ee29 hcoll_create_context()  ??:0
15 0x0000000000004b36 mca_coll_hcoll_comm_query()  /scrap/jenkins/jenkins/jobs/gh-ompi-master-pr/workspace/ompi/mca/coll/hcoll/coll_hcoll_module.c:294
16 0x00000000000e4916 query_2_0_0()  /scrap/jenkins/jenkins/jobs/gh-ompi-master-pr/workspace/ompi/mca/coll/base/coll_base_comm_select.c:382
17 0x00000000000e48d5 query()  /scrap/jenkins/jenkins/jobs/gh-ompi-master-pr/workspace/ompi/mca/coll/base/coll_base_comm_select.c:365
18 0x00000000000e47df check_one_component()  /scrap/jenkins/jenkins/jobs/gh-ompi-master-pr/workspace/ompi/mca/coll/base/coll_base_comm_select.c:327
19 0x00000000000e4619 check_components()  /scrap/jenkins/jenkins/jobs/gh-ompi-master-pr/workspace/ompi/mca/coll/base/coll_base_comm_select.c:291
20 0x00000000000dcd70 mca_coll_base_comm_select()  /scrap/jenkins/jenkins/jobs/gh-ompi-master-pr/workspace/ompi/mca/coll/base/coll_base_comm_select.c:123
21 0x0000000000053a56 ompi_mpi_init()  /scrap/jenkins/jenkins/jobs/gh-ompi-master-pr/workspace/ompi/runtime/ompi_mpi_init.c:855
22 0x00000000000930ec PMPI_Init()  /scrap/jenkins/jenkins/jobs/gh-ompi-master-pr/workspace/ompi/mpi/c/profile/pinit.c:84
23 0x0000000000400826 main()  /scrap/jenkins/jenkins/jobs/gh-ompi-master-pr/workspace/ompi_install1/examples/hello_c.c:18
24 0x0000003d6901ed1d __libc_start_main()  ??:0
25 0x0000000000400739 _start()  ??:0
===================
--------------------------------------------------------------------------
mpirun noticed that process rank 4 with PID 0 on node jenkins01 exited on signal 11 (Segmentation fault).
--------------------------------------------------------------------------
Build step 'Execute shell' marked build as failure
[htmlpublisher] Archiving HTML reports...
[htmlpublisher] Archiving at BUILD level /var/lib/jenkins/jobs/gh-ompi-master-pr/workspace/cov_build to /var/lib/jenkins/jobs/gh-ompi-master-pr/builds/563/htmlreports/Coverity_Report
Setting commit status on GitHub for https://github.com/open-mpi/ompi/commit/3cb6ced4f898dcabb6d86768068941c45437b2b2
[BFA] Scanning build for known causes...
[BFA] No failure causes found
[BFA] Done. 0s
Setting status of 8999c66a415cab8c80f541a3b07871b367eb49bb to FAILURE with url http://bgate.mellanox.com:8888/jenkins/job/gh-ompi-master-pr/563/ and message: Build finished.

Test FAILed.

hjelmn commented May 27, 2015

Fixed for real now.

@mellanox-github

Refer to this link for build results (access rights to CI server needed):
http://bgate.mellanox.com/jenkins/job/gh-ompi-master-pr/564/
Test PASSed.

@hjelmn hjelmn force-pushed the opal_coverity branch 2 times, most recently from 2e59b0a to dcb4767 on May 27, 2015 21:51
@mellanox-github

Refer to this link for build results (access rights to CI server needed):
http://bgate.mellanox.com/jenkins/job/gh-ompi-master-pr/566/
Test PASSed.

@mellanox-github

Refer to this link for build results (access rights to CI server needed):
http://bgate.mellanox.com/jenkins/job/gh-ompi-master-pr/567/
Test PASSed.

hjelmn added 5 commits May 28, 2015 08:38
CID 1292738 Dereference after null check (FORWARD_NULL)

It is an error if NULL is passed for val in add_to_env_str. Removed
the NULL-check @ keyval_parse.c:253 and added a NULL check and an
error return.
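
A minimal sketch of the shape of that fix; the signature of add_to_env_str below is an assumption, and only the up-front check with an error return reflects the described change:

    #include "opal/constants.h"

    /* Hypothetical signature; the real function is in keyval_parse.c.
     * A NULL val is now reported as an error up front instead of being
     * checked after it has already been dereferenced. */
    static int add_to_env_str(const char *var, const char *val)
    {
        if (NULL == var || NULL == val) {
            return OPAL_ERR_BAD_PARAM;
        }
        /* ... append "var=val" to the environment ... */
        return OPAL_SUCCESS;
    }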

CID 1292737 Logically dead code (DEADCODE)

Coverity is correct: the error-handling code at the end of parse_line_new is
never reached, which means we fail to report parsing errors when parsing -x
and -mca lines in keyval files. I moved the error handling into the loop and
removed the checks @ keyval_parse.c:314.

I also named the parse state enum type and updated parse_line_new to
use this type.
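
Roughly what naming the state enum looks like; the enumerator names here are invented for illustration:

    /* A named type for the parser state (previously an anonymous enum),
     * so parse_line_new can declare its state variable with a real type. */
    typedef enum {
        KV_PARSE_START,
        KV_PARSE_KEY,
        KV_PARSE_VALUE,
        KV_PARSE_ERROR
    } parse_state_t;

Reporting the error from inside the per-token loop, rather than after it, is what makes the failure path reachable.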

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
CID 1292483 Uninitialized pointer read (UNINIT)

Initialize the method and credential members of the opal_sec_cred_t to
avoid possible invalid read when calling cleanup_cred.

CID 1292484 Double free (USE_AFTER_FREE)

Set method and credential members to NULL after freeing in
cleanup_cred.
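
The gist of both fixes, with an assumed layout for opal_sec_cred_t (the field types are guesses); the UNINIT part is simply starting both members at NULL when the structure is created:

    #include <stdlib.h>

    /* Assumed layout, for illustration only. */
    typedef struct {
        char *method;
        char *credential;
    } opal_sec_cred_t;

    static void cleanup_cred(opal_sec_cred_t *cred)
    {
        if (NULL == cred) {
            return;
        }
        free(cred->method);
        cred->method = NULL;        /* a second cleanup call is now harmless */
        free(cred->credential);
        cred->credential = NULL;
    }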

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
CID 1269933 Uninitialized scalar variable (UNINIT)

This CID isn't really an error, but for both valgrind and Coverity
cleanliness it is best not to write uninitialized data. Added an
initializer for async_command in btl_openib_component_close.

CID 1269930 Uninitialized scalar variable (UNINIT)

Same as above. Best not to write uninitialized data. Added an
initializer for async_command.
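
A sketch of the pattern being described, with an invented command structure: zero the whole thing before setting the fields that matter, so no uninitialized bytes reach write():

    #include <string.h>
    #include <unistd.h>

    /* Invented stand-in for the openib async command structure. */
    typedef struct {
        int   command;
        void *context;
    } async_command_t;

    static void send_async_command(int fd, int command)
    {
        async_command_t cmd;
        memset(&cmd, 0, sizeof(cmd));   /* padding and unused fields are defined */
        cmd.command = command;
        (void) write(fd, &cmd, sizeof(cmd));
    }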

CID 1269701 Logically dead code (DEADCODE)

Coverity is correct. The smallest_pp_qp will always be 0. Changed the
initial value so that the smallest_pp_qp is set as intended. If no
per-per queue pair exists then use the last shared queue pair. This
queue pair should have the smallest message size. This will reduce
buffer waste.

CID 1269713 Logically dead code (DEADCODE)

False positive but easy to silence. The two checks are meaningless if
HAVE_XRC is 0 so protect them with #if HAVE_XRC.
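
The guard itself is just conditional compilation; the variable names and the check below are placeholders, and only the #if pattern reflects the fix:

    static int validate_qp_setup(int num_xrc_qps, int num_pp_qps)
    {
    #if HAVE_XRC
        /* Only meaningful when XRC support is compiled in; guarding the
         * checks removes the dead code when HAVE_XRC is 0. */
        if (num_xrc_qps > 0 && num_pp_qps > 0) {
            return -1;
        }
    #else
        (void) num_xrc_qps;
        (void) num_pp_qps;
    #endif
        return 0;
    }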

CID 1269726 Division or modulo by zero (DIVIDE_BY_ZERO)

Indeed an issue. If we get an invalid value for rd_win then this will
cause a divide-by-zero exception. Added a check to ensure rd_win is >
0. Also updated the help message to reflect this requirement.
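
The added guard amounts to rejecting non-positive values before rd_win can appear as a divisor; a sketch using OPAL's standard error constants:

    #include "opal/constants.h"

    static int check_rd_win(int rd_win)
    {
        if (rd_win <= 0) {
            return OPAL_ERR_BAD_PARAM;   /* prevents a later divide-by-zero */
        }
        return OPAL_SUCCESS;
    }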

CID 1269672 Ignoring number of bytes read (CHECKED_RETURN)

This error was somewhat intentional. Linux parameter files are
probably not empty but it is safer to check the return code of read to
make sure we got something. If 0 bytes are read this code could SEGV
when running strtoull.
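
The shape of that check, assuming the value is read from a small text file into a buffer before strtoull is called:

    #include <stdlib.h>
    #include <string.h>
    #include <unistd.h>
    #include <fcntl.h>

    /* Read an unsigned integer from a (Linux module) parameter file.
     * Returns 0 and leaves *value untouched on any error. */
    static int read_param_file(const char *path, unsigned long long *value)
    {
        char buf[64];
        int fd = open(path, O_RDONLY);
        if (fd < 0) {
            return 0;
        }
        ssize_t n = read(fd, buf, sizeof(buf) - 1);
        close(fd);
        if (n <= 0) {                 /* empty file or read error: do not parse */
            return 0;
        }
        buf[n] = '\0';                /* strtoull needs a terminated string */
        *value = strtoull(buf, NULL, 10);
        return 1;
    }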

CID 1269836 Unintentional integer overflow (OVERFLOW_BEFORE_WIDEN)

Add a range check to read_module_param to ensure we do not
overflow. In the future it might be worthwhile to report an error
because these parameters should never cause overflow in this
calculation.
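
One way to express such a range check, with invented names; the point is to validate the 32-bit value before the widened multiplication:

    #include <stdint.h>

    /* Illustrative only: bound a parameter before it feeds a 64-bit size
     * calculation, so value * scale cannot overflow even after widening. */
    static int scale_param(uint32_t value, uint32_t scale, uint64_t *out)
    {
        if (0 == scale) {
            *out = 0;
            return 0;
        }
        if ((uint64_t) value > UINT64_MAX / (uint64_t) scale) {
            return -1;   /* out of range for this calculation */
        }
        *out = (uint64_t) value * (uint64_t) scale;
        return 0;
    }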

CID 1269692 Calling risky function (DC.WEAK_CRYPTO)

??? This call was added in 2006 but I see no calls to the rest of the
rand48 family of functions. Anyway, we SHOULD NEVER be calling seed48,
srand, etc. because it messes with user code. Removed the call to
seed48.

CID 1269823 Dereference null return value (NULL_RETURNS)

This is likely a false positive. The endpoint lock is being held so no
other thread should be able to remove fragments from the list. Also,
mca_btl_openib_endpoint_post_send should not be removing items from
the list. If a NULL fragment is ever returned it will likely be a
coding error on the part of an Open MPI developer. Added an assert()
to catch this and quiet the coverity error.
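
A sketch of the quieting assert, using OPAL's list API with a stand-in list variable:

    #include <assert.h>
    #include "opal/class/opal_list.h"

    static opal_list_item_t *pop_pending_frag(opal_list_t *pending_frags)
    {
        opal_list_item_t *item = opal_list_remove_first(pending_frags);
        /* The endpoint lock is held, so no other thread can drain the list;
         * NULL here would be a coding error, not a runtime condition. */
        assert(NULL != item);
        return item;
    }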

CID 1269671 Unchecked return value (CHECKED_RETURN)

Added a check for the return code of mca_btl_openib_endpoint_post_send
to quiet the coverity error. It is unlikely this error path will be
traversed.

CID 1270229 Missing break in switch (MISSING_BREAK)

Add a comment to indicate that the fall-through is intentional.
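
For example (event values and the work done per case are invented), the comment sits on the case that deliberately continues into the next one:

    static int handle_event(int event)
    {
        int work = 0;

        switch (event) {
        case 1:                       /* setup */
            work += 1;
            /* fall through: setup also performs the start work below */
        case 2:                       /* start */
            work += 2;
            break;
        default:
            break;
        }
        return work;
    }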

CID 1269735 Dereference after null check (FORWARD_NULL)

There should always be an endpoint when handling a work
completion. The endpoint is either stored on the fragment or can be
looked up using the immediate data. Move the immediate data code up
and add an assert for a NULL endpoint.

CID 1269740 Dereference after null check (FORWARD_NULL)
CID 1269741 Explicit null dereferenced (FORWARD_NULL)

Similar to CID 1269735 fix.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
CID 1269988 Use after free (USE_AFTER_FREE)
CID 1269987 Use after free (USE_AFTER_FREE)

Both are false positives as convert is always overwritten by the call
to opal_dss_unpack_string(). Set convert to NULL to prevent this issue
from re-appearing.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
CID 1269931 Uninitialized scalar variable (UNINIT)

Initialize the complete async message. This was not a bug, but the fix
contributes to valgrind cleanliness (uninitialized write).

CID 1269915 Unintended sign extension (SIGN_EXTENSION)

Should never happen. Quieting this by explicitly casting to uint64_t.

CID 1269824 Dereference null return value (NULL_RETURNS)

It is impossible for opal_list_remove_first to return NULL if
opal_list_is_empty returns false. I refactored the code in question to
not use opal_list_is_empty but loop until NULL is returned by
opal_list_remove_first. That will quiet the issue.
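
The refactored pattern, sketched against OPAL's list API:

    #include "opal/class/opal_list.h"

    static void drain_list(opal_list_t *list)
    {
        opal_list_item_t *item;

        /* Looping on the return value of opal_list_remove_first makes the
         * non-NULL guarantee explicit, unlike an opal_list_is_empty check. */
        while (NULL != (item = opal_list_remove_first(list))) {
            OBJ_RELEASE(item);   /* or otherwise dispose of the item */
        }
    }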

CID 1269913 Dereference before null check (REVERSE_INULL)

The storage parameter should never be NULL. The check was intended to
test whether *storage was NULL, not storage.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
hjelmn added 4 commits May 28, 2015 08:38
CID 1269864 Resource leak (RESOURCE_LEAK)
CID 1269865 Resource leak (RESOURCE_LEAK)

Slightly refactored the code to remove extra goto statements and
ensure the if_include_list and if_exclude_list are actually released
on success.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
CID 996175 Dereference before null check (REVERSE_NULL)

If lims is NULL then we ran out of memory. Return an error and remove
the NULL check at cleanup.
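
The shape of that change; the allocation size and element type below are placeholders:

    #include <stdlib.h>
    #include "opal/constants.h"

    static int alloc_lims(size_t count, int **lims_out)
    {
        int *lims = (int *) calloc(count, sizeof(int));
        if (NULL == lims) {
            return OPAL_ERR_OUT_OF_RESOURCE;   /* fail fast on out-of-memory */
        }
        *lims_out = lims;   /* cleanup paths can now free() unconditionally */
        return OPAL_SUCCESS;
    }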

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
CID 1196720 Resource leak (RESOURCE_LEAK)
CID 1196721 Resource leak (RESOURCE_LEAK)

The code in question does leak loc_token and loc_value. Cleaned up the
code a bit and plugged the leak.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
CID 1269674 Ignoring number of bytes read (CHECKED_RETURN)

Check that we read enough bytes to get a complete async command.

CID 1269793 Missing break in switch (MISSING_BREAK)

Added comment to indicate fall through was intentional.

CID 1269702 Constant variable guards dead code (DEADCODE)

Remove an unused argument to opal_show_help. This will quiet the
coverity issue.

CID 1269675 Ignoring number of bytes read (CHECKED_RETURN)

Check that at least sizeof(int) bytes are read. If this is not the
case then it is an error.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
@mellanox-github

Refer to this link for build results (access rights to CI server needed):
http://bgate.mellanox.com/jenkins/job/gh-ompi-master-pr/568/
Test PASSed.

hjelmn added a commit that referenced this pull request May 28, 2015
@hjelmn hjelmn merged commit 9c170a8 into open-mpi:master May 28, 2015
jsquyres added a commit to jsquyres/ompi that referenced this pull request Nov 10, 2015
Now that we have an "isolated" PLM component, we cannot just let rsh …
markalle pushed a commit to markalle/ompi that referenced this pull request Sep 12, 2020
only generate one binding for MPI_IN_PLACE etc