
WeeklyTelcon_20220118

Geoffrey Paulsen edited this page Jan 19, 2022 · 1 revision

Open MPI Weekly Telecon

  • Dialup Info: (Do not post to public mailing list or public wiki)

Attendees (on Web-ex)

  • Geoffrey Paulsen (IBM)
  • Jeff Squyres (Cisco)
  • Akshay Venkatesh (NVIDIA)
  • Austen Lauria (IBM)
  • Brendan Cunningham (Cornelis Networks)
  • Brian Barrett (AWS)
  • Christoph Niethammer (HLRS)
  • Harumi Kuno (HPE)
  • Hessam Mirsadeghi (UCX/nVidia)
  • Howard Pritchard (LANL)
  • Joseph Schuchart
  • Josh Hursey (IBM)
  • Matthew Dosanjh (Sandia)
  • Michael Heinz (Cornelis Networks)
  • Todd Kordenbrock (Sandia)
  • Tomislav Janjusic (nVidia)

not there today (I keep this for easy cut-n-paste for future notes)

  • Artem Polyakov (nVidia)
  • Aurelien Bouteiller (UTK)
  • Brandon Yates (Intel)
  • Charles Shereda (LLNL)
  • David Bernhold (ORNL)
  • Edgar Gabriel (UoH)
  • Erik Zeiske
  • Geoffroy Vallee (ARM)
  • George Bosilca (UTK)
  • Joshua Ladd (nVidia)
  • Marisa Roman (Cornelis Networks)
  • Mark Allen (IBM)
  • Matias Cabral (Intel)
  • Nathan Hjelm (Google)
  • Noah Evans (Sandia)
  • Raghu Raja (AWS)
  • Ralph Castain (Intel)
  • Sam Gutierrez (LLNL)
  • Scott Breyer (Sandia?)
  • Shintaro Iwasaki
  • Thomas Naughton (ORNL)
  • William Zhang (AWS)
  • Xin Zhao (nVidia)

NEW Discussion

  • MPI_Reduce_local - Which PR do we want?
  • Do we still support cross-compilation?
  • Update just a few lines of ROMIO without a full update for the release branches?
    • Regarding https://github.com/open-mpi/ompi/pull/9855
    • Note: ROMIO v3.2.1 was the latest of the v3.2.x series, so future ROMIO updates would include this fix.
    • Because this fix is a one-liner taken from the very next ROMIO version, it is low enough risk for release branches. Update the commit with the ROMIO hash.

4.0.x

  • Schedule: No schedule for v4.0.8 yet
    • Bug fixes on a case-by-case basis
  • Winding down v4.0.x; it will stop after v5.0.x
  • Really only want small changes reported by users.
  • Otherwise, point users to v4.1.x release.
  • Howard and Geoff will meet Feb 28th

v4.1.x

  • Schedule: Shooting for v4.1.3 end of March/Q1.
  • No other update.

v5.0.x

  • Need a full ROMIO update [Geoff to file issue]
    • Open an issue to track this.
  • What's the status of MPI Sessions?
    • Howard sent out email to devel-core.
    • MPI_Group - question in MPI v4.0 for sessions is incorrect.
      • Howard wants to remove this incorrect argument checking in code, and fix the test.
      • Might still be a contradiction in the MPI Standard. Should we check or not?
        • Howard will follow up with Sessions working group.
      • Feb 1st - might be an aspirational date for getting the review in.
      • There is a basic test suite. Also a modified version of OSU (might need updates).
        • Some tests in ompi-test-public suite.
    • Nathan pulled attributes out
    • v5.0 release managers should look at it closely. Two main things in it:
      1. Reorganization of MPI_Init/Finalize so that they can be called multiple times
      • Attributes are used for this
      2. Extended CID thing. Read this section and this link for cluster 19 paper.
      • Allows us to use the PMIx process set stuff more efficiently.
        • Request a unique 64-bit PMIx CID as part of a PMIx group join.
          • Expensive to get this from PMIx for large jobs.
      • UCX isn't able to do Sessions (because it is not using the extended CID thing now)
        • For these PMLs just need to look at Init/Finalize work.
      • If you get a Comm outside of MPI_Comm_create_from_group, goes back to
    • Nice to have in v5.0, but since we won't have all other
    • Howard is fixing one conflict on PR.
    • Deadline of Feb 1st Review - Howard Sent email to devel-core.
  • Thinking about an RC before and after Sessions.
    • Well as far as tracking, we have nightly tarballs, and it'll be clear in git
  • Docs rework
    • We made a lot of progress on revamping the docs with reStructuredText.
    • Might actually be able to get this done by v5.0.x
    • Don't go review yet, but there has been lots of good progress.

Master

  • No new Gnus

MTT

  • A fix is pending to work around the IBM XL MTT build failure (compiler abort)