Meeting 2018 09

Matias Cabral edited this page Nov 9, 2018 · 70 revisions

Open MPI fall/winter 2018 developer meeting

(yes, the "2018 09" title on this wiki page is a bit of a lie -- cope)

  • 9am, Tuesday, Oct 16, 2018 - noonish Thursday, Oct 18, 2018.
  • Cisco buildings 2 and 3 (right next to each other), San Jose, CA, USA.
    • Tuesday: Cisco building 2 (Google Maps link)
      • NOTE 1: The Tuesday meeting is immediately after the weekly Webex. You're welcome to show up after 7:30am US Pacific to be in the Cisco conference room for the weekly Webex.
      • NOTE 2: There is no Lobby Ambassador in Cisco building 2. You need to iMessage or SMS text Jeff Squyres, and I'll come escort you to the meeting room.
    • Wednesday, Thursday: Cisco building 3 (Google Maps link)
      • There is a Lobby Ambassador in Cisco building 3; they will alert me when you arrive (but iMessaging or SMS texting me wouldn't hurt, either).

Attendees

Please put your name on this list if you plan to attend any/all of the meeting so that Jeff can register you for a Cisco guest badge+wifi.

  1. Jeff Squyres, Cisco
  2. George Bosilca, UTK
  3. Arm Thananon Patinyasakdikul, UTK
  4. Andrew Friedley, Intel
  5. Geoff Paulsen, IBM
  6. Howard Pritchard, LANL
  7. Edgar Gabriel, UH
  8. Shinji Sumimoto, Fujitsu
  9. Thomas Naughton, ORNL
  10. Matias Cabral, Intel (16th only)
  11. Neil Spruit, Intel (16th only)
  12. Brian Barrett, Amazon (16th only)
  13. Akshay Venkatesh, NVIDIA
  14. Xin Zhao, Mellanox
  15. Artem Polyakov, Mellanox

Agenda items

Didn't get to

  • Nathan/Brian: Vader bug cleanups
    • Want to strengthen the recent vader fixes to be fully bulletproof

Done

Meeting minutes

  • OFI (Libfabric):
    • OFI Presentation
    • Scalable endpoints support in MTL and BTL
    • Registering specialized communication functions based on provider capabilities
      • m4 generated C code to avoid code duplication?
    • Discussion: OFI components to set their priority based on the provider found.
    • OFI Common module creation.
  • PR 5241: Add MCA param for multithread opal_progress() (George, Arm)
  • Multithreading stuff (George, Arm)
  • r2 / BTLs are initialized even when they are not used (Jeff)
  • 4.0.x status / roadmap
  • TCP bric-a-brac:
  • Should we limit the number of C compilers that can be used to compile the OMPI core (e.g., to limit the amount of assembly/atomic stuff we need to support)?
    • E.g., PGI doesn't give us the guarantees we need
    • Probably need to add some extra wrapper glue: e.g., compile OMPI core with C compiler X and use C compiler Y in mpicc.
      • Are there any implications for Fortran? Probably not, but Jeff worries that there may be some assumption(s) about LDFLAGS/LIBS (and/or other things?) such that: "if it works for the C compiler, it works for the Fortran compiler".
  • PMIx as "first class" citizen?
    • Shall we remove the OPAL pmix framework and directly call PMIx functions?
      • Require all non-PMIx environments to provide a plugin that implements PMIx functions using their non-PMIx library
      • In other words, invert the current approach that abstracted all PMIx-related interfaces
  • ORTE support model
  • Should we publish Open MPI release tarballs to https://github.com/open-mpi/ompi/releases?
  • Remove orte-dvm and redirect users to PRRTE? (Ralph)
  • Public ompi-tests repository for easier sharing of test suites among collaborators (Edgar)
  • Ralph+Jeff: discuss PMIx compatibility issues and how to communicate them
  • Discuss memory utilization/scalability (ThomasN)
  • Mellanox/Xin: Performance optimization on OMPI/OSC/UCX multithreading
  • Fujitsu's status
  • 5.0.x roadmap
  • PMIx Roadmap
    • Review of v3 and v4 features
    • Outline changes in OMPI required to support them
    • Outline changes for minimizing footprint (modex pointers instead of copies)
    • Decide which features OMPI wants to use
  • Mail service (mailman, etc.) discussion - here are the lists we could consolidate down to:
  • Debugger transition from MPIR to PMIx
    • How to orchestrate it?
  • ABI-changing commit on master (after v4.0.x branch) which will affect future v5.0.x branch: https://github.com/open-mpi/ompi/commit/11ab621555876e3f116d65f954a6fe184ff9d522.
    • Do we want to keep it? It's a minor update / could easily be deferred.
  • Need vendors to reply to their issues on the github issue tracker
  • openib: persistent error reported by multiple users