Jeff Squyres edited this page Jan 27, 2015 · 93 revisions

January 2015 OMPI Developer's Meeting

This is a standalone meeting; it is not being held in conjunction with an MPI Forum meeting.

Logistics

Doodle for choosing the date: https://doodle.com/zzaupgxge9y6medu

  • Date: 9am Tuesday, January 27 through 3pm Thursday, January 29, 2015
  • Location: Cisco Richardson facility (outside Dallas), building 4:

Cisco Building 4
2200 East President George Bush Highway
Richardson, Texas 75082-3550

Google maps link: https://goo.gl/maps/SNrbu

Attendees

Local attendees:

  • (*) Jeff Squyres - Cisco
  • (*) Howard Pritchard - Los Alamos
  • (*) Ralph Castain - Intel
  • (*) George Bosilca - U. Tennessee, Knoxville
  • (*) Dave Goodell - Cisco
  • (*) Edgar Gabriel - U. Houston
  • (*) Vish Venkatesan (not Tuesday) - Intel
  • (*) Geoff Paulsen - IBM
  • (*) Joshua Ladd - Mellanox Technologies
  • (*) Rayaz Jagani - IBM
  • (*) Dave Solt - IBM
  • (*) Perry Schmidt - IBM
  • (*) Naoyuki Shida - Fujitsu
  • (*) Shinji Sumimoto - Fujitsu
  • (*) Stan Graves - IBM
  • (*) Mark Allen - IBM
  • ...please add your name if you plan to attend...

(*) = Registered (by Jeff)

Remote attendees

  • Nathan Hjelm - Los Alamos
  • Ryan Grant - Sandia (planning to attend for the MTL and 1.9 branch discussions)

Topics

  • Jeff/Howard: Branch for v1.9
  • Jeff/Howard: Jenkins integration with Github:
    • how do we do multiple Jenkins servers? (e.g., running at different organizations)
  • Ralph: Review: v1.8 series / RM experience with Github and Jenkins and the release process
    • Ralph's feedback: lots more PRs than we used to have CMRs
    • Ralph's feedback: people seem to be relying on Jenkins for correctness, when Jenkins is really just a smoke test
  • Nathan: Performance of freelists and other common OPAL classes with OPAL_ENABLE_MULTI_THREADS==1 (as discussed in [GitHub]). Part of this is done already -- LIFO is a bit faster now (with threads), etc.
  • Ralph: Scalable startup, including:
    • Async modex, static endpoint support
    • Re-define the role of PML/BTL add_procs: need to move to a lazier setup of peers
    • Memory footprint reduction
  • Ralph: RTE-MPI sharing of BTLs
  • Ralph: Data passing down to OPAL
    • Revising process naming scheme
    • MPI_Info
  • Howard: Progress on async progress
  • Jeff: Progress on thread-multiple support
  • Ralph/Nathan: MTL overhead reduction
  • Intel/LANL: MTL selection issue (PSM vs. OFI)
  • Nathan: Enhance MTL interface to include one-sided and atomics
  • Ralph: Error response planning (e.g., BTL error propagation up from OPAL into ORTE and OMPI, particularly in the presence of async progress).
  • Ralph: Collective switching points & MPI tuning params: what is required to change them? Mellanox raised this in an earlier discussion that we never finished.
  • Vish: Memkind integration: see http://www.open-mpi.org/community/lists/devel/2014/11/16320.php
  • Jeff: MPI extensions: MPIX_ prefix, or OMPI_ prefix?
  • Ralph: PMIx and ORCM updates
  • Nathan: --disable-smp-locks: remove this option?
  • Fujitsu: future plans for Open MPI development
  • extracting libnbc core from the collective component into a standalone directory such that it can be used from OMPIO and other locations
  • Jeff: libtool 2.4.4 bug / libltdl may no longer be embeddable. Should we embed manually, or should we just tell people to have libltdl-devel installed?
  • Howard/George: fate of coll ML
    • see http://www.open-mpi.org/community/lists/devel/2015/01/16820.php
    • who owns it?
    • should we try to fix it or disable by default?

Since this will be a full meeting in itself, we'll have a good amount of time for discussion, design, and hacking!

Resolved

  • Jeff: libtool 2.4.4 bug / libltdl may no longer be embeddable. Should we embed manually, or should we just tell people to have libltdl-devel installed?

    • Resolved: let's stop embedding; we'll always link against external libltdl.
    • However: this means people need to have the libltdl headers installed (e.g., libltdl-devel RPM). We don't care about telling developers to do this, but we are a little worried about telling users to do this (because it raises the bar for building Open MPI -- the assumption is that libltdl-devel is almost certainly not installed on most user machines).
    • The question becomes: what is configure's default behavior when it can't find ltdl.h?
      1. Abort
      2. Just fall back to --disable-dlopen behavior (i.e., slurp in plugins)
    • Let's bring up the "default behavior" issue as an RFC / beer discussion.
  • Branch for v1.9

    • We need to make a feature list for v1.9.0 and decide when it makes sense to branch
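
For the RFC on configure's default behavior when ltdl.h is missing (the libltdl item above), the two candidate defaults can be written out as a small decision function. This is only an illustrative sketch in Python, not Open MPI's actual configure logic: the names MissingHeaderPolicy and resolve_plugin_mode are hypothetical, and "static" stands in for the --disable-dlopen behavior of slurping plugins into the library.

```python
from enum import Enum


class MissingHeaderPolicy(Enum):
    """Hypothetical names for the two defaults under discussion."""
    ABORT = "abort"        # option 1: stop configure with an error
    FALLBACK = "fallback"  # option 2: act as if --disable-dlopen was given


def resolve_plugin_mode(have_ltdl_header, policy, dlopen_disabled=False):
    """Return "dlopen" or "static" (plugins slurped into the library)."""
    if dlopen_disabled:
        # The user explicitly asked for --disable-dlopen; ltdl.h is irrelevant.
        return "static"
    if have_ltdl_header:
        # Headers are present, so plugins can be loaded at run time.
        return "dlopen"
    if policy is MissingHeaderPolicy.ABORT:
        raise RuntimeError(
            "ltdl.h not found; install the libltdl headers "
            "or configure with --disable-dlopen"
        )
    # FALLBACK: quietly build without run-time plugin loading.
    return "static"
```

Under ABORT, a user without the headers gets a hard error and has to act; under FALLBACK, the build succeeds but without run-time plugin loading, which a user may not notice. That trade-off is the point to settle in the RFC.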
