@jsquyres
Member

Thanks to Lev Givon for the suggestion.

See #515 for details.

Thanks to @lebedov for the suggestion.

@jsquyres added this to the Open MPI 1.9.0 milestone on Apr 14, 2015
@mellanox-github

Refer to this link for build results (access rights to CI server needed):
http://bgate.mellanox.com/jenkins/job/gh-ompi-master-pr/432/
Test PASSed.

@jsquyres
Member Author

@lebedov Could you give this PR a whirl and see if it works for you? If so, I'll merge it into master (i.e., for the upcoming Open MPI v1.9).

@lebedov

lebedov commented Apr 19, 2015

@jsquyres: when I tried to build the PR on Ubuntu 14.04, the build failed with the following error:

In file included from pml_ob1.h:38:0,
                 from pml_ob1.c:46:
pml_ob1_hdr.h: In function 'ob1_hdr_ntoh':
pml_ob1_hdr.h:204:12: error: 'mca_pml_ob1_rget_hdr_t' has no member named 'hdr_seg_cnt'
         (h).hdr_seg_cnt = ntohl((h).hdr_seg_cnt);       \
            ^
pml_ob1_hdr.h:440:13: note: in expansion of macro 'MCA_PML_OB1_RGET_HDR_NTOH'
             MCA_PML_OB1_RGET_HDR_NTOH(hdr->hdr_rget);
             ^
pml_ob1_hdr.h:204:36: error: 'mca_pml_ob1_rget_hdr_t' has no member named 'hdr_seg_cnt'
         (h).hdr_seg_cnt = ntohl((h).hdr_seg_cnt);       \
                                    ^
pml_ob1_hdr.h:440:13: note: in expansion of macro 'MCA_PML_OB1_RGET_HDR_NTOH'
             MCA_PML_OB1_RGET_HDR_NTOH(hdr->hdr_rget);
             ^
pml_ob1_hdr.h:357:12: error: 'mca_pml_ob1_rdma_hdr_t' has no member named 'hdr_seg_cnt'
         (h).hdr_seg_cnt = ntohl((h).hdr_seg_cnt);          \
            ^
pml_ob1_hdr.h:449:13: note: in expansion of macro 'MCA_PML_OB1_RDMA_HDR_NTOH'
             MCA_PML_OB1_RDMA_HDR_NTOH(hdr->hdr_rdma);
             ^
pml_ob1_hdr.h:357:36: error: 'mca_pml_ob1_rdma_hdr_t' has no member named 'hdr_seg_cnt'
         (h).hdr_seg_cnt = ntohl((h).hdr_seg_cnt);          \
                                    ^
pml_ob1_hdr.h:449:13: note: in expansion of macro 'MCA_PML_OB1_RDMA_HDR_NTOH'
             MCA_PML_OB1_RDMA_HDR_NTOH(hdr->hdr_rdma);
             ^
pml_ob1_hdr.h: In function 'ob1_hdr_hton_intr':
pml_ob1_hdr.h:211:12: error: 'mca_pml_ob1_rget_hdr_t' has no member named 'hdr_seg_cnt'
         (h).hdr_seg_cnt = htonl((h).hdr_seg_cnt);       \
            ^
pml_ob1_hdr.h:486:13: note: in expansion of macro 'MCA_PML_OB1_RGET_HDR_HTON'
             MCA_PML_OB1_RGET_HDR_HTON(hdr->hdr_rget);
             ^
pml_ob1_hdr.h:211:36: error: 'mca_pml_ob1_rget_hdr_t' has no member named 'hdr_seg_cnt'
         (h).hdr_seg_cnt = htonl((h).hdr_seg_cnt);       \
                                    ^
pml_ob1_hdr.h:486:13: note: in expansion of macro 'MCA_PML_OB1_RGET_HDR_HTON'
             MCA_PML_OB1_RGET_HDR_HTON(hdr->hdr_rget);
             ^
pml_ob1_hdr.h:366:12: error: 'mca_pml_ob1_rdma_hdr_t' has no member named 'hdr_seg_cnt'
         (h).hdr_seg_cnt = htonl((h).hdr_seg_cnt);          \
            ^
pml_ob1_hdr.h:495:13: note: in expansion of macro 'MCA_PML_OB1_RDMA_HDR_HTON'
             MCA_PML_OB1_RDMA_HDR_HTON(hdr->hdr_rdma);
             ^
pml_ob1_hdr.h:366:36: error: 'mca_pml_ob1_rdma_hdr_t' has no member named 'hdr_seg_cnt'
         (h).hdr_seg_cnt = htonl((h).hdr_seg_cnt);          \
                                    ^
pml_ob1_hdr.h:495:13: note: in expansion of macro 'MCA_PML_OB1_RDMA_HDR_HTON'
             MCA_PML_OB1_RDMA_HDR_HTON(hdr->hdr_rdma);
             ^
pml_ob1.c: In function 'mca_pml_ob1_send_fin':
pml_ob1.c:665:18: error: 'hdr' undeclared (first use in this function)
     ob1_hdr_hton(hdr, MCA_PML_OB1_HDR_TYPE_FIN, proc);
                  ^
pml_ob1_hdr.h:465:43: note: in definition of macro 'ob1_hdr_hton'
     ob1_hdr_hton_intr((mca_pml_ob1_hdr_t*)h, t, p)
                                           ^
pml_ob1.c:665:18: note: each undeclared identifier is reported only once for each function it appears in
     ob1_hdr_hton(hdr, MCA_PML_OB1_HDR_TYPE_FIN, proc);
                  ^
pml_ob1_hdr.h:465:43: note: in definition of macro 'ob1_hdr_hton'
     ob1_hdr_hton_intr((mca_pml_ob1_hdr_t*)h, t, p)
                                           ^
make[2]: *** [pml_ob1.lo] Error 1
make[2]: Leaving directory `/home/lev/ompi/ompi/mca/pml/ob1'
make[1]: *** [all-recursive] Error 1
make[1]: Leaving directory `/home/lev/ompi/ompi'
make: *** [all-recursive] Error 1

I used the following config parameters:

LDFLAGS=-L/opt/ompi-1.8.4/lib ./configure --prefix=$OPENMPI_DIR \
    --libdir=$OPENMPI_DIR/lib --includedir=$OPENMPI_DIR/include \
    --sysconfdir=$OPENMPI_DIR/etc  --with-cuda=/usr/local/cuda  --with-threads=posix \
    --with-slurm --with-pmi  --disable-silent-rules --with-hwloc=/usr \
    --disable-mca-dso --with-devel-headers --with-sge \
    --enable-heterogeneous --disable-vt

@rhc54
Contributor

rhc54 commented Apr 19, 2015

Please remove the --enable-heterogeneous flag; it isn’t working at the moment.
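
For context, the macros named in the error log above (`MCA_PML_OB1_RGET_HDR_NTOH`, `MCA_PML_OB1_RDMA_HDR_HTON`, and so on) call `ntohl`/`htonl`, which convert 32-bit fields between network and host byte order. That conversion path is presumably only compiled in for heterogeneous builds, which would explain why dropping the flag avoids the errors. The following is a minimal Python sketch of the same round trip using only the standard library; the field value is made up for illustration, and this is not Open MPI code.

```python
import socket
import struct

# Hypothetical 32-bit header field; the value is illustrative only.
seg_cnt = 3

# "!I" packs an unsigned 32-bit integer in network (big-endian) byte order,
# which is the on-the-wire layout htonl is meant to produce.
wire_bytes = struct.pack("!I", seg_cnt)

# Unpacking with the same format restores the host-side integer,
# which is the job ntohl does on the receiving side.
(restored,) = struct.unpack("!I", wire_bytes)

# socket.htonl/ntohl are the direct counterparts of the C functions;
# applying one after the other returns the original value on any host.
round_trip = socket.ntohl(socket.htonl(seg_cnt))

print(wire_bytes.hex(), restored, round_trip)
```

On a homogeneous cluster every node already agrees on byte order, so skipping the conversion loses nothing.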


@lebedov

lebedov commented Apr 19, 2015

The new ompi_info output looks mostly good, although there are some additional parameters whose values contain colon-separated sets of paths that do not appear to be quoted:

mca:mca:base:param:mca_param_files:value:/home/lev/.openmpi/mca-params.conf:/opt/ompi-1.8.4/etc/openmpi-mca-params.conf
mca:mca:base:param:mca_base_param_files:value:/home/lev/.openmpi/mca-params.conf:/opt/ompi-1.8.4/etc/openmpi-mca-params.conf
mca:mca:base:param:mca_base_component_path:value:/opt/ompi-1.8.4/lib/openmpi:/home/lev/.openmpi/components
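
To illustrate why the unquoted lines above are a problem for scripts, here is a minimal Python sketch of a consumer for this colon-delimited output. It assumes that a quoted value is wrapped in double quotes and is the last field on the line; the thread does not show the exact quoting syntax, so treat that as an assumption. The function name `split_parsable_line` is hypothetical.

```python
def split_parsable_line(line):
    """Split one colon-delimited line into (keys, value).

    Assumption: a value that contains colons is wrapped in double quotes
    and is the final field, so everything between the first ':"' and the
    closing '"' is taken as the value.
    """
    line = line.rstrip("\n")
    quote_at = line.find(':"')
    if quote_at != -1 and line.endswith('"'):
        keys = line[:quote_at].split(":")
        value = line[quote_at + 2:-1]
    else:
        # No quoting: the last colon-separated field is taken as the value,
        # which is wrong whenever the value itself contains colons.
        *keys, value = line.split(":")
    return keys, value


unquoted = ("mca:mca:base:param:mca_base_component_path:value:"
            "/opt/ompi-1.8.4/lib/openmpi:/home/lev/.openmpi/components")
quoted = ('mca:mca:base:param:mca_base_component_path:value:'
          '"/opt/ompi-1.8.4/lib/openmpi:/home/lev/.openmpi/components"')

print(split_parsable_line(unquoted))  # path list is split across keys and value
print(split_parsable_line(quoted))    # path list survives as one value
```

With the unquoted form, the first path ends up among the keys and only the second path is returned as the value, so a script cannot tell where the key ends and the value begins.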

@jsquyres force-pushed the topic/ompi-info-parsable-value-quoting branch from 3e54bba to 4b8583f on April 20, 2015, 15:07
@jsquyres
Member Author

@lebedov Hmmm -- how did that happen... checking... Ah; found the place I missed. It should work now (I rebased and force-pushed this branch, so you'll likely need to re-pull).

@mellanox-github

Refer to this link for build results (access rights to CI server needed):
http://bgate.mellanox.com/jenkins/job/gh-ompi-master-pr/460/

Build Log
last 50 lines

[...truncated 32887 lines...]
[jenkins01:28816] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages
+ val=54
+ '[' 54 -ne 54 ']'
+ echo '-mca mca_base_env_list "XXX_A=1;XXX_B=2;XXX_C;XXX_D;XXX_E"'
+ echo 'mca_base_env_list=XXX_A=7;XXX_B=8'
++ /var/lib/jenkins/jobs/gh-ompi-master-pr/workspace/ompi_install1/bin/mpirun -np 2 -tune /var/lib/jenkins/jobs/gh-ompi-master-pr/workspace/test_tune.conf -am /var/lib/jenkins/jobs/gh-ompi-master-pr/workspace/test_amca.conf /var/lib/jenkins/jobs/gh-ompi-master-pr/workspace/env_mpi
++ sed -n -e 's/^XXX_.*=//p'
++ sed -e ':a;N;$!ba;s/\n/+/g'
++ bc
malloc debug: Request for 1 zeroed elements of size 0 (mca_base_var.c, 170)
[jenkins01:28845] 1 more process has sent help message help-mpi-btl-openib.txt / default subnet prefix
[jenkins01:28845] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages
+ val=30
+ '[' 30 -ne 30 ']'
+ echo '-mca mca_base_env_list "XXX_A=1;XXX_B=2;XXX_C;XXX_D;XXX_E"'
+ echo 'mca_base_env_list=XXX_A=7;XXX_B=8'
++ sed -n -e 's/^XXX_.*=//p'
++ /var/lib/jenkins/jobs/gh-ompi-master-pr/workspace/ompi_install1/bin/mpirun -np 2 -tune /var/lib/jenkins/jobs/gh-ompi-master-pr/workspace/test_tune.conf -am /var/lib/jenkins/jobs/gh-ompi-master-pr/workspace/test_amca.conf -mca mca_base_env_list 'XXX_A=9;XXX_B=10' /var/lib/jenkins/jobs/gh-ompi-master-pr/workspace/env_mpi
++ sed -e ':a;N;$!ba;s/\n/+/g'
++ bc
malloc debug: Request for 1 zeroed elements of size 0 (mca_base_var.c, 170)
[jenkins01:28868] 1 more process has sent help message help-mpi-btl-openib.txt / default subnet prefix
[jenkins01:28868] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages
+ val=58
+ '[' 58 -ne 62 ']'
+ exit 1
Build step 'Execute shell' marked build as failure
TAP Reports Processing: START
Looking for TAP results report in workspace using pattern: **/*.tap
Saving reports...
Processing '/var/lib/jenkins/jobs/gh-ompi-master-pr/builds/460/tap-master-files/cov_stat.tap'
Parsing TAP test result [/var/lib/jenkins/jobs/gh-ompi-master-pr/builds/460/tap-master-files/cov_stat.tap].
not ok - coverity detected 905 failures in all_460 # SKIP http://bgate.mellanox.com:8888/jenkins/job/gh-ompi-master-pr//ws/cov_build/all_460/output/errors/index.html
not ok - coverity detected 5 failures in oshmem_460 # TODO http://bgate.mellanox.com:8888/jenkins/job/gh-ompi-master-pr//ws/cov_build/oshmem_460/output/errors/index.html
ok - coverity found no issues for yalla_460
ok - coverity found no issues for mxm_460
not ok - coverity detected 2 failures in fca_460 # TODO http://bgate.mellanox.com:8888/jenkins/job/gh-ompi-master-pr//ws/cov_build/fca_460/output/errors/index.html
ok - coverity found no issues for hcoll_460

TAP Reports Processing: FINISH
coverity_for_all    http://bgate.mellanox.com:8888/jenkins/job/gh-ompi-master-pr//ws/cov_build/all_460/output/errors/index.html
coverity_for_oshmem http://bgate.mellanox.com:8888/jenkins/job/gh-ompi-master-pr//ws/cov_build/oshmem_460/output/errors/index.html
coverity_for_fca    http://bgate.mellanox.com:8888/jenkins/job/gh-ompi-master-pr//ws/cov_build/fca_460/output/errors/index.html
[copy-to-slave] The build is taking place on the master node, no copy back to the master will take place.
Setting commit status on GitHub for https://github.com/open-mpi/ompi/commit/98c38633c1f1ba2194e78e8dc543b7a6bfb0f003
[BFA] Scanning build for known causes...
[BFA] No failure causes found
[BFA] Done. 0s
Setting status of 4b8583f07d2deb1819592efba82d1ecb94fd8a66 to FAILURE with url http://bgate.mellanox.com:8888/jenkins/job/gh-ompi-master-pr/460/ and message: Merged build finished.

Test FAILed.
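
The failing step in the log above extracts the values of the `XXX_*` variables from the test program's output, adds them up, and compares the sum with an expected number (here 58 against an expected 62). The following Python sketch mirrors that `sed`/`bc` pipeline as read from the trace. The sample input is hypothetical, the function name `sum_xxx_values` is made up, and skipping empty values is a simplification that the shell pipeline may not share.

```python
import re


def sum_xxx_values(env_output):
    """Sum the integer values of XXX_* variables found in env_output.

    Mirrors the traced pipeline: keep lines that start with XXX_, strip
    everything up to the last '=', then add the remaining numbers.
    """
    total = 0
    for line in env_output.splitlines():
        # Greedy '.*' matches up to the last '=', like sed 's/^XXX_.*=//p'.
        match = re.match(r"^XXX_.*=(.*)$", line)
        if match and match.group(1).strip():
            total += int(match.group(1))
    return total


# Hypothetical output from two ranks that each print their environment.
sample = "XXX_A=1\nXXX_B=2\nPATH=/usr/bin\nXXX_A=1\nXXX_B=2\n"
print(sum_xxx_values(sample))
```

A mismatch between the computed and expected sums means some rank saw different variable values than the test intended, which is what the `exit 1` in the trace reports.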

@jsquyres force-pushed the topic/ompi-info-parsable-value-quoting branch from 4b8583f to 1e6a558 on April 20, 2015, 15:56
@mellanox-github

Refer to this link for build results (access rights to CI server needed):
http://bgate.mellanox.com/jenkins/job/gh-ompi-master-pr/462/
Test PASSed.

@lebedov

lebedov commented Apr 20, 2015

@jsquyres Looks good now.

@jsquyres
Member Author

@lebedov Thanks for the sanity check!

jsquyres added a commit that referenced this pull request Apr 20, 2015
…quoting

ompi_info: quote parsable values if they contain colons
@jsquyres merged commit 58c9a45 into open-mpi:master on Apr 20, 2015
@jsquyres deleted the topic/ompi-info-parsable-value-quoting branch on April 20, 2015, 17:21
jsquyres pushed a commit to jsquyres/ompi that referenced this pull request Nov 10, 2015
…link-fix

v2.x: ofi mtl: also link in mtl_ofi_LIBS in the static case
markalle pushed a commit to markalle/ompi that referenced this pull request Sep 12, 2020
in-place conversion macro writes into INPUT argument

NOTE - this merge was done instead of Pull Request 527 because
github.ibm.com was down, and I need to build rtm0