Conversation

@mike-dubman
Member

No description provided.

@mellanox-github

Refer to this link for build results (access rights to CI server needed):
http://bgate.mellanox.com/jenkins/job/gh-ompi-master-pr/254/

Build Log
last 50 lines

[...truncated 14961 lines...]
*** on a NULL communicator
*** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
***    and potentially your MPI job)
*** An error occurred in MPI_Init
*** on a NULL communicator
*** An error occurred in MPI_Init
*** on a NULL communicator
*** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
***    and potentially your MPI job)
*** An error occurred in MPI_Init
*** on a NULL communicator
*** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
***    and potentially your MPI job)
*** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
***    and potentially your MPI job)
--------------------------------------------------------------------------
It looks like MPI_INIT failed for some reason; your parallel process is
likely to abort.  There are many reasons that a parallel process can
fail during MPI_INIT; some of which are due to configuration or environment
problems.  This failure appears to be an internal failure; here's some
additional information (which may only be relevant to an Open MPI
developer):

  PML add procs failed
  --> Returned "Error" (-1) instead of "Success" (0)
--------------------------------------------------------------------------
*** An error occurred in MPI_Init
*** on a NULL communicator
*** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
***    and potentially your MPI job)
*** An error occurred in MPI_Init
*** on a NULL communicator
*** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
***    and potentially your MPI job)
[1424607789.097980] [jenkins01:25318:0]  proto_conn.c:846  MXM  ERROR already connected to � (uuid 0x7ffff1c28b00)
[1424607789.098016] [jenkins01:25316:0]  proto_conn.c:846  MXM  ERROR already connected to � (uuid 0x7ffff1c28b00)
[1424607789.097979] [jenkins01:25320:0]  proto_conn.c:846  MXM  ERROR already connected to � (uuid 0x7ffff1c28b00)
[1424607789.098000] [jenkins01:25314:0]  proto_conn.c:846  MXM  ERROR already connected to � (uuid 0x7ffff1c28b00)
[1424607789.097979] [jenkins01:25308:0]  proto_conn.c:846  MXM  ERROR already connected to � (uuid 0x7ffff1c28b00)
[1424607789.098022] [jenkins01:25307:0]  proto_conn.c:846  MXM  ERROR already connected to � (uuid 0x7ffff1c28b00)
[1424607789.098010] [jenkins01:25309:0]  proto_conn.c:846  MXM  ERROR already connected to � (uuid 0x7ffff1c28b00)
[1424607789.097982] [jenkins01:25311:0]  proto_conn.c:846  MXM  ERROR already connected to � (uuid 0x7ffff1c28b00)
[jenkins01:25305] 7 more processes have sent help message help-mpi-runtime.txt / mpi_init:startup:internal-failure
[jenkins01:25305] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages
Build step 'Execute shell' marked build as failure
[BFA] Scanning build for known causes...

[BFA] Done. 0s
Setting status of 00d416ba9de29a266365d2bfc7a549d1b9229438 to FAILURE with url http://bgate.mellanox.com:8888/jenkins/job/gh-ompi-master-pr/254/ and message: Merged build finished.

Test FAILed.

@mellanox-github

Refer to this link for build results (access rights to CI server needed):
http://bgate.mellanox.com/jenkins/job/gh-ompi-master-pr/255/

Build Log
last 50 lines

[...truncated 19295 lines...]
*** on a NULL communicator
*** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
***    and potentially your MPI job)
*** An error occurred in MPI_Init
*** on a NULL communicator
*** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
***    and potentially your MPI job)
*** An error occurred in MPI_Init
*** on a NULL communicator
*** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
***    and potentially your MPI job)
*** An error occurred in MPI_Init
*** on a NULL communicator
*** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
***    and potentially your MPI job)
*** An error occurred in MPI_Init
*** on a NULL communicator
*** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
***    and potentially your MPI job)
*** An error occurred in MPI_Init
*** on a NULL communicator
*** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
***    and potentially your MPI job)
--------------------------------------------------------------------------
It looks like MPI_INIT failed for some reason; your parallel process is
likely to abort.  There are many reasons that a parallel process can
fail during MPI_INIT; some of which are due to configuration or environment
problems.  This failure appears to be an internal failure; here's some
additional information (which may only be relevant to an Open MPI
developer):

  PML add procs failed
  --> Returned "Error" (-1) instead of "Success" (0)
--------------------------------------------------------------------------
[1424610935.170027] [jenkins01:16419:0]  proto_conn.c:846  MXM  ERROR already connected to � (uuid 0x7ffff1c28b00)
[1424610935.170026] [jenkins01:16422:0]  proto_conn.c:846  MXM  ERROR already connected to � (uuid 0x7ffff1c28b00)
[1424610935.170021] [jenkins01:16425:0]  proto_conn.c:846  MXM  ERROR already connected to � (uuid 0x7ffff1c28b00)
[1424610935.170020] [jenkins01:16430:0]  proto_conn.c:846  MXM  ERROR already connected to � (uuid 0x7ffff1c28b00)
[1424610935.170030] [jenkins01:16420:0]  proto_conn.c:846  MXM  ERROR already connected to � (uuid 0x7ffff1cf6b00)
[1424610935.169958] [jenkins01:16432:0]  proto_conn.c:846  MXM  ERROR already connected to � (uuid 0x7ffff1c28b00)
[1424610935.169961] [jenkins01:16423:0]  proto_conn.c:846  MXM  ERROR already connected to � (uuid 0x7ffff1c28b00)
[1424610935.170014] [jenkins01:16428:0]  proto_conn.c:846  MXM  ERROR already connected to � (uuid 0x7ffff1c28b00)
[jenkins01:16417] 7 more processes have sent help message help-mpi-runtime.txt / mpi_init:startup:internal-failure
[jenkins01:16417] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages
Build step 'Execute shell' marked build as failure
[BFA] Scanning build for known causes...

[BFA] Done. 0s
Setting status of 00d416ba9de29a266365d2bfc7a549d1b9229438 to FAILURE with url http://bgate.mellanox.com:8888/jenkins/job/gh-ompi-master-pr/255/ and message: Merged build finished.

Test FAILed.

@mike-dubman
Member Author

bot:retest

@mellanox-github

Refer to this link for build results (access rights to CI server needed):
http://bgate.mellanox.com/jenkins/job/gh-ompi-master-pr/258/

Build Log
last 50 lines

[...truncated 11 lines...]
 > /usr/bin/git config --local credential.helper store --file=/tmp/git4848829375087258053.credentials # timeout=60
 > /usr/bin/git -c core.askpass=true fetch --tags --progress https://github.com/open-mpi/ompi +refs/heads/*:refs/remotes/origin/* # timeout=60
 > /usr/bin/git config --local --remove-section credential # timeout=60
 > /usr/bin/git config remote.origin.url https://github.com/open-mpi/ompi # timeout=60
 > /usr/bin/git config --add remote.origin.fetch +refs/heads/*:refs/remotes/origin/* # timeout=60
 > /usr/bin/git config remote.origin.url https://github.com/open-mpi/ompi # timeout=60
Pruning obsolete local branches
Fetching upstream changes from https://github.com/open-mpi/ompi
using .gitcredentials to set credentials
 > /usr/bin/git config --local credential.helper store --file=/tmp/git5240092005651093958.credentials # timeout=60
 > /usr/bin/git -c core.askpass=true fetch --tags --progress https://github.com/open-mpi/ompi +refs/pull/*:refs/remotes/origin/pr/* --prune # timeout=60
 > /usr/bin/git config --local --remove-section credential # timeout=60
 > /usr/bin/git rev-parse refs/remotes/origin/pr/411/merge^{commit} # timeout=60
 > /usr/bin/git rev-parse refs/remotes/origin/origin/pr/411/merge^{commit} # timeout=60
Merging Revision 04b04c528c368fe8bb996e0f08c88da0b3a58b97 (refs/remotes/origin/pr/411/merge) to /master, UserMergeOptions{mergeRemote='', mergeTarget='master', mergeStrategy='default', fastForwardMode='--ff'}
 > /usr/bin/git rev-parse /master^{commit} # timeout=60
FATAL: Command "/usr/bin/git rev-parse /master^{commit}" returned status code 128:
stdout: /master^{commit}

stderr: fatal: ambiguous argument '/master^{commit}': unknown revision or path not in the working tree.
Use '--' to separate paths from revisions, like this:
'git <command> [<revision>...] -- [<file>...]'

hudson.plugins.git.GitException: Command "/usr/bin/git rev-parse /master^{commit}" returned status code 128:
stdout: /master^{commit}

stderr: fatal: ambiguous argument '/master^{commit}': unknown revision or path not in the working tree.
Use '--' to separate paths from revisions, like this:
'git <command> [<revision>...] -- [<file>...]'

    at org.jenkinsci.plugins.gitclient.CliGitAPIImpl.launchCommandIn(CliGitAPIImpl.java:1591)
    at org.jenkinsci.plugins.gitclient.CliGitAPIImpl.launchCommandIn(CliGitAPIImpl.java:1567)
    at org.jenkinsci.plugins.gitclient.CliGitAPIImpl.launchCommandIn(CliGitAPIImpl.java:1563)
    at org.jenkinsci.plugins.gitclient.CliGitAPIImpl.launchCommand(CliGitAPIImpl.java:1249)
    at org.jenkinsci.plugins.gitclient.CliGitAPIImpl.launchCommand(CliGitAPIImpl.java:1261)
    at org.jenkinsci.plugins.gitclient.CliGitAPIImpl.revParse(CliGitAPIImpl.java:622)
    at hudson.plugins.git.GitAPI.revParse(GitAPI.java:316)
    at hudson.plugins.git.extensions.impl.PreBuildMerge.decorateRevisionToBuild(PreBuildMerge.java:64)
    at hudson.plugins.git.GitSCM.determineRevisionToBuild(GitSCM.java:925)
    at hudson.plugins.git.GitSCM.checkout(GitSCM.java:1017)
    at hudson.scm.SCM.checkout(SCM.java:484)
    at hudson.model.AbstractProject.checkout(AbstractProject.java:1270)
    at hudson.model.AbstractBuild$AbstractBuildExecution.defaultCheckout(AbstractBuild.java:609)
    at jenkins.scm.SCMCheckoutStrategy.checkout(SCMCheckoutStrategy.java:86)
    at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:531)
    at hudson.model.Run.execute(Run.java:1718)
    at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:43)
    at hudson.model.ResourceController.execute(ResourceController.java:89)
    at hudson.model.Executor.run(Executor.java:240)

Test FAILed.
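
The failing command in build 258 is `git rev-parse /master^{commit}`, and the merge options in the log show `mergeRemote=''` with `mergeTarget='master'`. One plausible reading, not confirmed anywhere in this thread, is that the ref was built by joining the remote name and the target branch with a slash, so an empty remote name produced `/master`. A minimal Python sketch of that assumption (`merge_ref` is a hypothetical helper, not the Jenkins plugin's code):

```python
def merge_ref(merge_remote: str, merge_target: str) -> str:
    """Hypothetical helper: join remote and branch as the log output suggests."""
    return f"{merge_remote}/{merge_target}"


# Values taken from the UserMergeOptions line in the build log.
broken = merge_ref("", "master")

# Same join with a remote name set; "origin" is the remote used by the
# fetch refspecs in the log, chosen here only for illustration.
expected = merge_ref("origin", "master")

print(broken, expected)
```

If that reading is right, the failure is a job-configuration problem rather than anything in the pull request itself.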

@mike-dubman
Member Author

bot:retest

@mellanox-github

Refer to this link for build results (access rights to CI server needed):
http://bgate.mellanox.com/jenkins/job/gh-ompi-master-pr/259/

Build Log
last 50 lines

[...truncated 31046 lines...]
Thread 3, message rate 222116.860076 messages/sec
Thread 1, message rate 187606.321098 messages/sec
Thread 4, message rate 264107.897882 messages/sec
+ btl_openib=yes
+ btl_tcp=yes
+ btl_sm=yes
+ btl_vader=yes
++ echo /var/lib/jenkins/jobs/gh-ompi-master-pr/workspace/ompi_install1
+ for OMPI_HOME in '$(echo $ompi_home_list)'
+ echo 'check if mca_base_env_list parameter is supported in /var/lib/jenkins/jobs/gh-ompi-master-pr/workspace/ompi_install1'
check if mca_base_env_list parameter is supported in /var/lib/jenkins/jobs/gh-ompi-master-pr/workspace/ompi_install1
++ /var/lib/jenkins/jobs/gh-ompi-master-pr/workspace/ompi_install1/bin/ompi_info --param mca base --level 9
++ grep mca_base_env_list
++ wc -l
+ val=2
+ '[' 2 -gt 0 ']'
+ echo 'test mca_base_env_list option in /var/lib/jenkins/jobs/gh-ompi-master-pr/workspace/ompi_install1'
test mca_base_env_list option in /var/lib/jenkins/jobs/gh-ompi-master-pr/workspace/ompi_install1
+ export XXX_C=3 XXX_D=4 XXX_E=5
+ XXX_C=3
+ XXX_D=4
+ XXX_E=5
++ /var/lib/jenkins/jobs/gh-ompi-master-pr/workspace/ompi_install1/bin/mpirun -np 2 -mca mca_base_env_list 'XXX_A=1;XXX_B=2;XXX_C;XXX_D;XXX_E' env
++ grep '^XXX_'
++ wc -l
+ val=10
+ '[' 10 -ne 10 ']'
TAP Reports Processing: START
Looking for TAP results report in workspace using pattern: **/*.tap
Saving reports...
Processing '/var/lib/jenkins/jobs/gh-ompi-master-pr/builds/259/tap-master-files/cov_stat.tap'
Parsing TAP test result [/var/lib/jenkins/jobs/gh-ompi-master-pr/builds/259/tap-master-files/cov_stat.tap].
not ok - coverity detected 769 failures in all_259 # SKIP http://bgate.mellanox.com:8888/jenkins/job/gh-ompi-master-pr//ws/cov_build/all_259/c/output/errors/index.html
not ok - coverity detected 20 failures in oshmem_259 #  http://bgate.mellanox.com:8888/jenkins/job/gh-ompi-master-pr//ws/cov_build/oshmem_259/c/output/errors/index.html
ok - coverity found no issues for yalla_259
not ok - coverity detected 2 failures in mxm_259 #  http://bgate.mellanox.com:8888/jenkins/job/gh-ompi-master-pr//ws/cov_build/mxm_259/c/output/errors/index.html
ok - coverity found no issues for fca_259
ok - coverity found no issues for hcoll_259

There are failed test cases and the job is configured to mark the build as failure. Marking build as FAILURE
TAP Reports Processing: FINISH
Build step 'Publish TAP Results' changed build result to FAILURE
coverity_for_all    http://bgate.mellanox.com:8888/jenkins/job/gh-ompi-master-pr//ws/cov_build/all_259/c/output/errors/index.html
coverity_for_oshmem http://bgate.mellanox.com:8888/jenkins/job/gh-ompi-master-pr//ws/cov_build/oshmem_259/c/output/errors/index.html
coverity_for_mxm    http://bgate.mellanox.com:8888/jenkins/job/gh-ompi-master-pr//ws/cov_build/mxm_259/c/output/errors/index.html
[BFA] Scanning build for known causes...

[BFA] Done. 0s
Setting status of 00d416ba9de29a266365d2bfc7a549d1b9229438 to FAILURE with url http://bgate.mellanox.com:8888/jenkins/job/gh-ompi-master-pr/259/ and message: Merged build finished.

Test FAILed.
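
Build 259 ran the tests to completion and was failed by the 'Publish TAP Results' step. Below is a rough sketch of how the six TAP lines in the log tally, assuming the usual TAP convention that a `# SKIP` directive marks a test as skipped rather than failed. The Jenkins plugin's exact rules are not shown in this thread, `classify` is a hypothetical helper, and the lines are copied from the log with the report URLs dropped.

```python
import re

# TAP result lines from the build 259 log, with the trailing URLs removed.
TAP_LINES = [
    "not ok - coverity detected 769 failures in all_259 # SKIP",
    "not ok - coverity detected 20 failures in oshmem_259 #",
    "ok - coverity found no issues for yalla_259",
    "not ok - coverity detected 2 failures in mxm_259 #",
    "ok - coverity found no issues for fca_259",
    "ok - coverity found no issues for hcoll_259",
]


def classify(line: str) -> str:
    """Return 'skipped', 'passed' or 'failed' for one TAP result line."""
    description, _, directive = line.partition("#")
    # Assumption: a SKIP directive takes precedence over ok / not ok.
    if re.match(r"\s*skip", directive, re.IGNORECASE):
        return "skipped"
    return "failed" if description.startswith("not ok") else "passed"


tally = {"passed": 0, "failed": 0, "skipped": 0}
for line in TAP_LINES:
    tally[classify(line)] += 1

print(tally)
```

Under that reading, the two `not ok` entries without a SKIP directive (oshmem_259 and mxm_259) are what cause the build to be marked as FAILURE.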

@mike-dubman
Member Author

bot:retest

@mellanox-github

Refer to this link for build results (access rights to CI server needed):
http://bgate.mellanox.com/jenkins/job/gh-ompi-master-pr/261/
Test PASSed.

mike-dubman added a commit that referenced this pull request Feb 23, 2015
@mike-dubman mike-dubman merged commit e02121a into open-mpi:master Feb 23, 2015
@mike-dubman mike-dubman deleted the topic/fix_cov branch February 23, 2015 10:01
jsquyres pushed a commit to jsquyres/ompi that referenced this pull request Nov 10, 2015
…t-minor-fixes

orte/test/system: fix compiler warnings