Skip to content

Conversation

@hjelmn
Copy link
Member

@hjelmn hjelmn commented Apr 8, 2015

There are two commits in this RFC and I am willing to break it down into two pull requests if requested.

The first commit is a rewrite of the component find/repository code. The rewrite removes the unused component dependency code and changes the way the component repository is handled.

The second commit is based off an RFC from last year. I originally proposed adding a destructor function to handle tearing down the opal class system. This unfortunately could not be supported by PGI. I have updated the idea to use -Wl,-fini to designate the destructor when the compiler has no support. This has been tested with all the major compilers.

@hjelmn
Copy link
Member Author

hjelmn commented Apr 8, 2015

Part of the reason for the component repository update is in preparation for using the framework structure to validate components.

@hjelmn hjelmn changed the title Repository update RFC: Repository update Apr 8, 2015
@hjelmn
Copy link
Member Author

hjelmn commented Apr 8, 2015

@bosilca, @jsquyres, @rhc54, and @hppritcha: please comment.

The opal destructor function could be used to clean up the tsd and fix the java problem that is currently being tracked in the mailing list.

@mellanox-github
Copy link

Refer to this link for build results (access rights to CI server needed):
http://bgate.mellanox.com/jenkins/job/gh-ompi-master-pr/423/
Test PASSed.

@hppritcha
Copy link
Member

I agree that this approach could be used for resolving the use of destructors in the code's use of pthread_key_create. I'll try out using this branch.

@bosilca
Copy link
Member

bosilca commented Apr 9, 2015

I prefer to have the list of components outside the framework itself, but I can live with the current approach. Other than this minor issue, everything looks OK (mostly as a code removal)

opal/Makefile.am Outdated
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you should treat this separately from whether the compiler supports attribute destructor.

I.e. you should be testing whether the linker supports the -fini option in configure. And it should only be added to LDFLAGS here if the compiler does not support attribute destructor and the linker supports -fini.

If the compiler does not support attribute destructor and the linker does not support -fini, then we'll just have to leak memory / refuse to re-init / opal_show_help("your system sucks; we can't re-init") / whatever.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, yes. Good point. There may be compiler/linker combinations that will not support this. I have no problem leaking memory on those systems. It is only the constructor/destructor tables.

@hjelmn hjelmn force-pushed the repository_update branch from d65e8ca to a916fac Compare April 14, 2015 20:59
@mellanox-github
Copy link

Refer to this link for build results (access rights to CI server needed):
http://bgate.mellanox.com/jenkins/job/gh-ompi-master-pr/434/

Build Log
last 50 lines

[...truncated 20968 lines...]
../../config/test-driver: line 107:  7135 Segmentation fault      (core dumped) "$@" > $log_file 2>&1
FAIL: dlopen_test
============================================================================
Testsuite summary for Open MPI gitclone
============================================================================
# TOTAL: 3
# PASS:  2
# SKIP:  0
# XFAIL: 0
# FAIL:  1
# XPASS: 0
# ERROR: 0
============================================================================
See ompi/debuggers/test-suite.log
Please report to http://www.open-mpi.org/community/help/
============================================================================
make[4]: *** [test-suite.log] Error 1
make[4]: Leaving directory `/scrap/jenkins/jenkins/jobs/gh-ompi-master-pr/workspace/ompi/debuggers'
make[3]: *** [check-TESTS] Error 2
make[3]: Leaving directory `/scrap/jenkins/jenkins/jobs/gh-ompi-master-pr/workspace/ompi/debuggers'
make[2]: *** [check-am] Error 2
make[2]: Leaving directory `/scrap/jenkins/jenkins/jobs/gh-ompi-master-pr/workspace/ompi/debuggers'
make[1]: *** [check-recursive] Error 1
make[1]: Leaving directory `/scrap/jenkins/jenkins/jobs/gh-ompi-master-pr/workspace/ompi'
make: *** [check-recursive] Error 1
+ exit 12
Build step 'Execute shell' marked build as failure
TAP Reports Processing: START
Looking for TAP results report in workspace using pattern: **/*.tap
Saving reports...
Processing '/var/lib/jenkins/jobs/gh-ompi-master-pr/builds/434/tap-master-files/cov_stat.tap'
Parsing TAP test result [/var/lib/jenkins/jobs/gh-ompi-master-pr/builds/434/tap-master-files/cov_stat.tap].
not ok - coverity detected 912 failures in all_434 # SKIP http://bgate.mellanox.com:8888/jenkins/job/gh-ompi-master-pr//ws/cov_build/all_434/output/errors/index.html
not ok - coverity detected 5 failures in oshmem_434 # TODO http://bgate.mellanox.com:8888/jenkins/job/gh-ompi-master-pr//ws/cov_build/oshmem_434/output/errors/index.html
ok - coverity found no issues for yalla_434
ok - coverity found no issues for mxm_434
not ok - coverity detected 2 failures in fca_434 # TODO http://bgate.mellanox.com:8888/jenkins/job/gh-ompi-master-pr//ws/cov_build/fca_434/output/errors/index.html
ok - coverity found no issues for hcoll_434

TAP Reports Processing: FINISH
coverity_for_all    http://bgate.mellanox.com:8888/jenkins/job/gh-ompi-master-pr//ws/cov_build/all_434/output/errors/index.html
coverity_for_oshmem http://bgate.mellanox.com:8888/jenkins/job/gh-ompi-master-pr//ws/cov_build/oshmem_434/output/errors/index.html
coverity_for_fca    http://bgate.mellanox.com:8888/jenkins/job/gh-ompi-master-pr//ws/cov_build/fca_434/output/errors/index.html
[copy-to-slave] The build is taking place on the master node, no copy back to the master will take place.
Setting commit status on GitHub for https://api.github.com/repos/open-mpi/ompi/commit/8b3e4526b1a96fed621f78158205ce8a15762fa2
[BFA] Scanning build for known causes...

[BFA] Done. 0s
Setting status of a916fac7ed3bf0e47639130a1e39fcf982ece743 to FAILURE with url http://bgate.mellanox.com:8888/jenkins/job/gh-ompi-master-pr/434/ and message: Merged build finished.

Test FAILed.

This commit is a rework of the component repository. The changes
included in this commit are:

 - Remove the component dependency code based off .ompi_info
   files. This code is legacy code dating back 10 years that and is no
   longer used.

 - Move the plugin scanning code to the component repository. New
   calls have been added to add new scanning paths, query available
   components, and dlopen/load components.

 - Pass the framework down to mca_base_component_find/filter. Eventually
   the framework structure will be used to further validate components
   before they are used.

 - Add support to the MCA framework system to disable scanning for
   dlopened components on open (support already existed in
   register). This is really only relevant to installdirs as it has no
   register function and no DSO components.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
@hjelmn hjelmn force-pushed the repository_update branch from a916fac to c954f45 Compare April 14, 2015 21:55
@mellanox-github
Copy link

Refer to this link for build results (access rights to CI server needed):
http://bgate.mellanox.com/jenkins/job/gh-ompi-master-pr/436/
Test PASSed.

@hjelmn
Copy link
Member Author

hjelmn commented Apr 14, 2015

@hppritcha I moved the destructor function into PR #528.

hjelmn added a commit that referenced this pull request Apr 15, 2015
@hjelmn hjelmn merged commit e794658 into open-mpi:master Apr 15, 2015
jsquyres added a commit to jsquyres/ompi that referenced this pull request Nov 10, 2015
…rtran-libs

fortran: link the opal-pal.la library directly
@hjelmn hjelmn deleted the repository_update branch May 23, 2016 17:44
markalle pushed a commit to markalle/ompi that referenced this pull request Sep 12, 2020
STG 214162 - naive fix for non-blocking io calls
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants