WeeklyTelcon_20151208

Geoff Paulsen edited this page Dec 15, 2015 · 3 revisions

Open MPI Weekly Telcon Minutes 12/8/2015


  • Dialup Info: (Do not post to public mailing list or public wiki)

Attendees

  • Todd Kordenbrock
  • Edgar Gabriel
  • Geoffrey Paulsen
  • Geoffrey Vallee
  • Ralph Castain
  • David Solt

Agenda

Review 1.10

  • Milestones: https://github.com/open-mpi/ompi-release/milestones/v1.10.2
    • A number of PRs are open against master, and some of them may also need to be fixed on 1.10.x.
    • Detecting failures: none of the MTT runs we do nightly currently include MPICH.
    • We should add MPICH to the nightly runs.
  • 1.10.2 Release Candidate anticipated the week of 12/14 - 12/18
    • Integrate PR #811?

Review 2.0.x

  • Wiki: https://github.com/open-mpi/ompi/wiki/Releasev20
  • Blocker issues: https://github.com/open-mpi/ompi/issues?utf8=%E2%9C%93&q=is%3Aopen+milestone%3Av2.0.0+label%3Ablocker
    • 1064 - Ralph / Jeff: is this doable by December?
    • Dynamic add procs is currently broken when the value is set to 0 (not related to PMIx)
      • There are a bunch of paths that may fix this; we don't know yet.
    • External linkage issue
      • Distros won't accept Open MPI without external linkage to ?
        • This is the first time a part of Open MPI has had an external linkage to libevent.
      • In the past, the application was using libevent (a different version than the one used internally).
        • Solved by renaming symbols in internal version.
      • The problem now is that another part of Open MPI (PMIx) also links against libevent.
      • If we remove the embedded version of libevent, that would be okay.
      • The problem arises when we build against the internal version of libevent; then PMIx ???
      • One solution is to no longer package libevent with Open MPI.
      • The reason we package it is so that people don't have to download libevent along with Open MPI.
        • libevent-pthread is now available on most distros.
      • We DO require libevent to be thread-enabled, and check for this during configuration.
        • No performance implications.
    • Debugger support is broken on master and on 2.x
      • Ralph needs PMIx version 1.2 to do it.
      • He thinks he can come up with a solution in the January time frame.
      • If we start to use PMIx for the debugger, then a proc could tell the resource manager that it wants to be debugged.
  • Milestones: https://github.com/open-mpi/ompi-release/milestones/v2.0.0
    • One of us will go through ALL issues for 2.0.0 and ask whether each can be moved out to a future release or should be given blocker status.
    • RFC on embedded PMIx version handling
    • RFC process wiki page?

MTT status:

  • We should add MPICH tests to the nightly tests to improve failure detection.
    • Still a lot of failures on the NVIDIA test cluster, which is still in transition.
    • A couple of errors on Jeff's cluster in the Comm Dup stuff; these are KNOWN and not worth worrying about yet.
    • Geoffrey Vallee has a cluster that is almost ready to add. Right now it's just in trial.
      • Geoffrey found
      • Next year, new clusters will be added in the lab.
    • Ralph has a new cluster that he's working to get added to Jenkins.

Status Updates:

  • LANL - Not Present
  • Houston
    • Edgar ran a number of performance numbers.
    • Also ran into Issue #1191, which has been present since mid-November.
  • HLRS - Not Present
  • IBM -
    • working on: supporting multiple versions of LSF at runtime (rather than build time)
    • setting up internal git mirror.
    • running internal code scans on master branch.
    • Dave's ompi_comm_info work
    • would like to use the new RTC process for init / finalize of vendor components (licensing)

Status Update Rotation

  1. Cisco, ORNL, UTK, NVIDIA
  2. Mellanox, Sandia, Intel
  3. LANL, Houston, HLRS, IBM
