
Meeting 2017 07

KAWASHIMA Takahiro edited this page Jul 18, 2017 · 121 revisions

July 2017 Open MPI Developer's Meeting


9am US Central time July 11 - noon US Central time July 13, 2017


Cisco, Chicago (pretty much directly next to O'Hare airport), 9501 Technology Blvd, West Office Center, Rosemont, Illinois 60018

We are in the "Midway" conference room, which is outside Cisco reception.

Meaning: you don't have to check in with reception / get a badge.

Just take the first hallway off to your left and Midway is clearly marked immediately on the left.


There are no registration fees to attend this meeting.

Please add your name to the wiki list below if you are coming to the meeting:

  1. Ralph Castain (Intel)
  2. Jeff Squyres (Cisco)
  3. Brice Goglin (Inria)
  4. Brian Barrett (AWS) [only Tuesday and Wednesday]
  5. Mohan Gandhi (AWS)
  6. Shinji Sumimoto (Fujitsu)
  7. Takahiro Kawashima (Fujitsu)
  8. Nathan Hjelm (LANL)
  9. Howard Prichard (LANL)
  10. George Bosilca (UTK) [at least partially] (I hope we get the good half)
  11. Edgar Gabriel (UH)
  12. Artem Polyakov (Mellanox)
  13. Matthew Dosanjh (SNL)
  14. Geoff Paulsen (IBM)
  15. Geoffroy Vallee (ORNL)
  16. If you sign up after this point, be sure to let Jeff Squyres know so that he can get you a guest badge and wifi access!

Attending Remotely:

  1. Josh Hursey - IBM (Available from 8:30am-5pm Central) (Added a :phone: icon next to the items I'd like to call in for, if possible)
  2. David Bernholdt - ORNL (around other commitments)

Topics to discuss


  • THURSDAY/SO GEORGE CAN BE HERE: Shall we link components against their native main library - e.g., ORTE components to libopen-rte?

    • See
    • Required reading before the discussion:
    • Remember that there is a workaround, --disable-dlopen (i.e., cases 4 and 16 in the tables on that wiki). But that doesn't help if the OS/distro installs a "case 2" Open MPI by default.
      • Per the email thread in Oct 2010, the problem that we fixed in was that we were inconsistent about linking in libPROJECT to components (i.e., some did and some did not). We resolved the situation by making them all not link against libPROJECT. In that thread, Brian cited that some platform (he unfortunately did not cite which one) did not support components linking against libPROJECT. :frown:
      • So let's run a small/non-MPI test and see if we can find a platform where this is not supported these days (that is on our current list of supported platforms).
      • If we can't find a platform where this is a problem, then let's link all the DSOs against libPROJECT again and solve the problem for mpi4py and friends.
  • :phone: [George & Nathan] IMB Unidir_Get with Vader issue -

      • @hjelmn to look at this in the immediate future
      • This is a blocker for v3.0.0
      • May also necessitate a release in v2.0.x and v2.1.x -- need to investigate further
  • :phone: Should we forward all OMPI_ env vars from mpirun environments to started process environments?

    • If so, should we also do so for ORTE_ and OPAL_ env vars?
    • Or should we only forward OMPI_MCA_ env vars?
      • NOTE: current master forwards all OMPI_ env vars
    • Should we make a non-OMPI_MCA_ prefix that we also forward, but something less than all of OMPI_? (E.g., OMPI_FORWARD_, or something better)
    • What about non-OMPI MCA params (e.g., PMIX_MCA)?
      • Just env vars, or do we add a registration function for cmd line support (e.g., -pmca foo x)?
      • Yes, we want to forward non-OMPI_MCA env vars.
      • Ralph:
        • Will make a PR that will enable components to register what env vars they want forwarded. At max, we will support a single * for a wildcard (not full regexps) -- e.g. PSM2_* -- for forwarding all names that match.
        • Will probably be something like: a component that wants to register for this will write its patterns to a text file somewhere (e.g., write PSM2_* to that file) that ORTE/PMIX/whatever will see later and use to do the forwarding. This makes it possible for orterun to forward whatever env vars it needs to, without having to open all their corresponding components (e.g., orterun doesn't know anything about PSM2 components, but can still forward PSM2 env vars).
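The single-wildcard forwarding rule described above can be sketched as follows. This is only an illustration under stated assumptions, not the planned ORTE/PMIx implementation: the function names (`parse_patterns`, `matches`, `select_forwarded`) and the one-pattern-per-line file format with `#` comments are hypothetical; only the "at most one `*`, no full regexps" rule and the `PSM2_*` example come from the notes.

```python
from typing import Dict, Iterable, List


def parse_patterns(text: str) -> List[str]:
    """Read one pattern per line; skip blank lines and '#' comments.

    The file format is an assumption made for this sketch.
    """
    patterns = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        if line.count("*") > 1:
            raise ValueError(f"at most one '*' is allowed: {line!r}")
        patterns.append(line)
    return patterns


def matches(pattern: str, name: str) -> bool:
    """Match a name against a pattern containing at most one '*' wildcard."""
    if "*" not in pattern:
        return name == pattern
    prefix, suffix = pattern.split("*", 1)
    # The length check keeps prefix and suffix from overlapping in the name.
    return (
        len(name) >= len(prefix) + len(suffix)
        and name.startswith(prefix)
        and name.endswith(suffix)
    )


def select_forwarded(env: Dict[str, str], patterns: Iterable[str]) -> Dict[str, str]:
    """Return the subset of env whose names match any registered pattern."""
    patterns = list(patterns)
    return {k: v for k, v in env.items() if any(matches(p, k) for p in patterns)}
```

A launcher could call `select_forwarded(dict(os.environ), patterns)` after reading every registration file, which matches the stated goal: it never has to open the components that registered the patterns.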
  • MPI_File backing file location

  • :phone: Release branch status:

    • v1.10
    • v2.0.x
    • v2.x (i.e., v2.1.x)
    • v3.0.x
      • Talked through all of these -- basically the normal content of a Tuesday webex.
  • Release processes / Brian

      • Coming soon: make nightly and release tarballs exactly the same
      • AUTHORS: we should automate these updates. Brian will work on this.
        • Should we keep the orgs in there? It's somewhat of a pain. And it's also a bit of a relic -- from before we did the "signed off by" stuff.
        • Should we remove it from git and just auto-generate the file during make dist? Yes, this seems like a good idea.
      • NEWS: this is a problem. Want to change this to only top-level / broad-strokes of features. Do not include individual bug fixes -- there will be a line in there saying "Here's the URL where all the Github fixed issues and PRs can be found for this release".
        • The intent is that NEWS will be fairly short -- we have short-window releases, so there won't be huge giant lists of new features.
        • Big change: RM's will not assemble NEWS. If a dev wants an item in NEWS, they need to PR it.
      • Commit messages: we need to get better about "Reported by helpful user" in commit messages. If we're not going to cite people in NEWS any more, then we want to make sure to cite them in commit messages.
  • :phone: Can we get a NEWS decoration to commit messages on branches so that we know what to put in NEWS?

      • This is now moot, per above.
  • :phone: Revisit this old discussion: should we continue cherry-picking from master to release branches?

    • The Git Way is usually to merge from master to release branches
      • (Artem) A few comments: my impression is that the Git way is the reverse. It assumes the following types of branches:
        • develop (persistent, where all new features go),
        • master (persistent, where all the releases are, each marked with a tag)
        • feature (temporary, branched from develop, merged back: for the temporary work on a new feature)
        • hotfix (temporary, branched from master, merged back: to fix post-release bugs). (!) Once the hotfix is merged to master, master is merged back to develop, not vice versa, to keep develop consistent with master.
        • release (temporary, branched from develop, merged to master: to harden before the next release)
      • Currently: a) our master = develop; b) we don't have a master equivalent; c) we keep release branches, which forces us to cherry-pick, and we sometimes have problems with lost commits.
      • This is not to say that we should follow this. One disadvantage I already see: it is not easy to support old releases, because a release branch is eliminated once it has stabilized. Just something to keep in mind.
    • This puts more emphasis on master to be more stable. But maybe with all of our new CI, master is more stable these days...?
    • There are pros, cons, and differences: e.g., things wouldn't go on master unless we intend to merge them to release branches.
      • Main proposal from Brian:
        • Shorten time between branch and release, merge from master->release branch during that time (instead of cherry pick), and then cherry pick after release.
        • There is some discussion still needed about exactly when we want to stop merging and start cherry picking: what about new features that come to master that aren't destined for that release?
        • Brian will be posting a proposal about this
  • :phone: CI:

    • Release process updates
      • Where should Open MPI downloads be:
        • OMPI web site (probably not)
        • S3
        • Github
      • Leave the plans in place for all downloads going to S3 (not Github)
  • :phone: We now have options for merging PRs:

    • Continue the way we do now (merge at current head)
    • Rebase and merge (i.e., much more of linear history)
    • Rebase and squash
      • On master: ...
        • Brian thinks rebase and merge is good
        • Howard thinks merge @HEAD is good (i.e., what we do today)
      • On release branches: continue merging @HEAD
  • (Artem) UCX/OSC component status update (ready for PR)

      • Seems like a no-brainer: a vendor wants to commit a component that supports their hardware. Go for it.
      • This will bring up the network selection discussions again, though. We'll need to figure those out.
  • Howard's bug scrub / issue roundup

      • move some issues out of 2.0.4 milestone
      • update README to reflect not supporting PGI/OS-X and not supporting aarch64
  • What to do about the pathscale compiler support?

      • Jeff to file a PR that pathscale is no longer supported after OMPI v3.0.x.
  • UCX packaging in OMPI sources (Mellanox)

    • Want this in OMPI v4.0
    • Configuration prerequisites
      • When we turn it on (check available fabrics, tcp should be available soon, then UCX can be always on)
    • How new versions are updated
    • Placement inside the sources: needs to be available for both MPI and SHMEM layers.
    • INITIAL: got some push-back about adding more embedded packages. Will revisit tomorrow.
    • Motivation:
      • We see issues on the mailing list related to bad user experience with OMPI on Mellanox fabrics for both performance and stability.
      • Definitely need UCX for OSHMEM
    • Goal: improve the out-of-box (OOB) experience on IB stacks:
      • by auto detecting UCX when available (as is done with SLURM autodetection).
      • by using the internal version when it is not available (for IB networks).
      • Adding UCX-based components that auto-configure/build/install when UCX libraries are present: no problem. Go for it. The rest of OMPI is like this.
      • Request: for at least the time being, for MPI, please de-activate the UCX components if only the TCP transport is available.
      • Embedding UCX source code directly in Open MPI: the community has many concerns about this, and is generally not in favor of it.
      • This led to a very lengthy discussion about selection logic of which transport/network stack to use at run-time. The short version is that there is a half-baked proposal on the table for a new OPAL transport MCA framework that can marshal this information together in all processes (including mpirun and orted and friends). There will be follow-up webexes to flesh out this idea (first webex scheduled for July 31: Mellanox, AWS, Cisco, Intel, LANL, IBM).
  • Is the next version of OMPI 3.1 or 4.0?

      • Since the recent DDT changes moved from master to 3.0.x, we may well be ABI compatible between master and v3.0.x.
      • Someone needs to do an audit
      • Geoff Paulsen volunteers Josh Hursey to do this audit.
      • If all is well, we may well be able to call the next version 3.1.
  • :phone: Signal forwarding

    • Came up on the user list again, this time wanting a way to signal only child procs that call MPI_Init (and not any intermediate procs such as shell scripts)
    • Ralph added an MCA param to either hit only direct children, or all descendants of those children - but not exactly what the user requested
      • This does not sound like it's in the high-priority list.
  • :phone: What is the plan for OMPI-NEXT and beyond regarding embedding of:

    • hwloc v2 and v1
      • Easy way to disable hwloc internals such as NVML from OMPI's configure?
      • How to deal with hwloc 2.0 ABI break (2 components?)
    • libevent v2.1 and v2.0
    • pmix 2.0, 3.0, and 1.x
    • One suggestion: should we make the external components higher priority than the embedded components? This might naturally start deprecating / phasing out the embedded versions.
      • Yes, this seems like a good idea.
      • Also seems like a good idea to put some kind of "deprecated" notice in configure output if the internal versions are used.
      • We reserve the right to keep the internal versions as long as we want. :smile:
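The "external before embedded" suggestion above can be sketched as a simple priority selection. This is an assumption-laden illustration, not Open MPI's actual configure or MCA logic: the `Candidate` type, the priority values, and the wording of the notice are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Candidate:
    package: str     # e.g. "hwloc", "libevent", "pmix"
    source: str      # "external" or "internal"
    priority: int    # higher wins; the values used here are illustrative only
    available: bool  # whether configure found a usable copy


def select(candidates: List[Candidate]) -> Candidate:
    """Pick the highest-priority candidate that is actually available."""
    usable = [c for c in candidates if c.available]
    if not usable:
        raise LookupError("no usable candidate")
    return max(usable, key=lambda c: c.priority)


def deprecation_notice(chosen: Candidate) -> Optional[str]:
    """Return a notice for configure output when the internal copy is used."""
    if chosen.source == "internal":
        return (
            f"NOTE: using the internal copy of {chosen.package}; "
            "internal copies are deprecated"
        )
    return None
```

Giving the external candidate the higher priority means the internal copy is chosen only when no external one is found, which is the situation in which the notice would be printed.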
  • (Artem) List of features for OMPI-NEXT?

      • Planned timeline: Oct 2017. Assume branch Sep 1.
      • Features that people are planning:
        • UCX OSC
        • UCX auto-configure/build
        • Would like to do PMIx 2.1. But 2.0 is fine.
        • Would be very nice to have better-tuned collective selection updates.
        • AWS to do some TCP BTL updates
        • IBM: protocol report functionality probably won't be ready in this timeframe
        • MPI Forum-based communicator asserts. ob1 registers and optimizes for no-any-source and allow-overtake. Other PMLs could do more.
        • Possibly (could only be 4.0, though, so not targeted at 3.1): have a configure option to actually remove all the dead MPI functions (C++ bindings, ...etc.).
        • Would be good to shoot for having some of the new discovery/--mca transport stuff. Probably won't get all of it, but some of it may be possible in OMPI-NEXT.
        • ...?
  • :phone: How to better track PRs across multiple release branches?

    • E.g., ensure it has already been merged to master
    • E.g., ensure that we merge at vX only when it has been merged at all desired versions < vX
    • One possibility: should we always make an issue, and put a tag on it for each version that a given PR is merged against?
    • Can this be automated via bot somehow?
      • We talked through 3 bots that would be helpful
      • Brian will upload pictures of what we talked through
      • Brian points out:
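As an illustration of the two ordering rules listed at the top of this item (this is not something recorded at the meeting), a bot could run a check like the one below before allowing a merge. It is a minimal sketch: the `blockers` function, its arguments, and the explicit `RELEASE_ORDER` list are hypothetical. The branch names are the ones from the release branch status item, ordered by hand because `v2.x` (i.e., v2.1.x) does not sort naturally.

```python
from typing import List, Sequence, Set

# Release branches, oldest first. Maintained by hand in this sketch.
RELEASE_ORDER = ["v1.10", "v2.0.x", "v2.x", "v3.0.x"]


def blockers(
    target: str,
    desired: Set[str],
    merged: Set[str],
    order: Sequence[str] = RELEASE_ORDER,
) -> List[str]:
    """List the branches that must receive the change before `target` may.

    desired: release branches the change is intended for.
    merged:  branches (including "master") that already have the change.
    An empty result means the merge to `target` may go ahead.
    """
    missing = []
    # Rule 1: the change must already be on master.
    if "master" not in merged:
        missing.append("master")
    # Rule 2: every desired branch older than the target must have it first.
    idx = order.index(target)
    missing.extend(b for b in order[:idx] if b in desired and b not in merged)
    return missing
```

For example, a change wanted on v2.0.x and v3.0.x that has only reached master is blocked on v3.0.x until the v2.0.x merge lands, but may be merged to v2.0.x right away.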
  • Move the entire Open MPI web site behind a CDN?

    • If so, we can remove the mirrors program
      • Yes, let's do this. But let's wait until the tarballs move to S3.
  • Investigate shared location for OMPI organization secrets/keys/passwords (e.g., LastPass? 1Password? ...?)

      • Yes, Jeff will do this within the next few weeks. First 3 people on it will be Brian, Jeff, Ralph.
  • :phone: Proposal for OMPI signed-off-by policy:

    1. Old commits that are not signed-off
    2. If you cherry pick someone else's commit, you need to sign off
      • Old commits that are not signed off and are cherry-picked -- these will need to be signed (not for legal reasons, but for "you need to pass CI" reasons)
      • Not worrying about signing other people's commits that you cherry pick

    • Recommendations for new git, github, and issue/PR conventions.
    • ...need to collect relevant bullets from above.
  • :phone: Threading model

      • This is now mostly moot. Artem found one more place; Ralph and Artem discussed and seemed to have a way forward.
  • :phone: Rankfile mapper: Ralph can no longer maintain it. Who will become the maintainer? (IBM volunteered)

      • They are meeting next week to discuss.
  • Strict C99 stuff (e.g., pointer to constant)

    • Per Paul Hargrove's discovery; adapted in PR
    • Note: there's non-C99 elsewhere in OMPI (i.e., if you enable "strict C99", OPAL fails to compile in at least a few places)
    • Do we really care about strict C99?
      • No, we do not care about strict C99.
  • Automate reduction of symbol name pollution?

  • SPI: Any updates / action items?

    • (This is an open question)
    • Nope, nothing to report.
  • :phone: CI:

    • What can we do about the fragility of the Jenkins infrastructure?
      • It seems like one or more of the CI's is broken every week due to lost connections or changed protocols, thereby blocking all commits.
    • Other random CI updates
      • Things seem to be getting better.
  • Can we delete BTL SM?

    • Probably? Want to get George's approval first.
    • Can we remove BTL SM in 3.0.0?
      • It would have been nice to alias "sm" to "vader".
      • In principle, RM's are ok to delete for 3.0.
      • Too late to bring in any meaningful aliasing support in 3.0.0. Maybe add it in OMPI-NEXT...?
      • Yes, it's ok to remove the sm BTL.
      • We still have the SMCUDA BTL. It would take a bit of work to make vader do everything that SMCUDA does.
      • Delete the content of sm BTL and put in an opal_show_help to say "hey, you should be using vader" (call ompi_abort()). Jeff will do this.
  • :phone: THURSDAY/GEORGE Fujitsu Status

    • PR #3700 Hang-up detection feature
    • PR #3701 Non-PML persistent requests
      • Will have slides uploaded.
  • THURSDAY/GEORGE Old issue about BTL progress functions:

      • This is done! Issue closed.
  • :phone: THURSDAY/GEORGE Remove CR from master before we branch for v4.0.x

      • See Geoff Paulsen's notes.
  • :phone: THURSDAY/GEORGE PMIx working group meetings

    • Network
    • Tiered Storage
    • OpenMP/MPI coordination
    • Language bindings as apps begin using PMIx? (Josh Hursey volunteers to do Fortran!)
      • Resolved offline with Geoffroy, George. Will continue offline.
  • :phone: Multithreaded Onesided - It's buggy, just fix bugs or refactor?

  • :phone: THURSDAY/GEORGE Plans for v4.0.x (recall: new datatype stuff on master is backwards incompatible with v3.0.x --

    • Remove MPI symbols removed in MPI-3.0
      • Can we do this in a way that defaults to being a compiler error (showing the exact file / line number of the removed symbol)?
      • Any value in providing a non-default way to turn this into a warning, to allow customers to make progress without changing their code? Many of these changes are straightforward; is this even worth the effort?
      • IBM may want to do this in the future. Probably not any time soon. Gets a little complicated with removing the C++ bindings. If not done in 3.0.0, can't do this until 4.0 at the earliest.
  • Any other Binary incompatible changes we want to do for v4.0.x (ASAP)?

      • Some datatype thingy. See Geoff Paulsen's notes.
  • What to do with OPAL? We have other projects that use OPAL. But they all have slightly different copies of OPAL.

    • PMIx, for example, name shifts its entire OPAL.
    • But what about those who don't name shift? You can't have 2 OPALs in a single process. :frown:
      • ...long complicated discussion...
      • Would be a good idea over time to add an OPAL context argument to many/most/all OPAL function calls for all the current OPAL global variables. Nathan and George may work on this over time.
      • Becomes very, very challenging to separate OPAL as an independent project for all kinds of practical release engineering issues.
      • Ralph and George will work together to see if there's any "helper" scripts or support that we can put in OPAL/upstream that will help downstream OPAL consumers (e.g., to help with OPAL symbol renaming, etc.)
      • More detail in Geoff Paulsen notes.

Presentation Material
