WeeklyTelcon_20160614
Geoff Paulsen edited this page Jun 14, 2016
- Dialup Info: (Do not post to public mailing list or public wiki)
- Geoff Paulsen
- Jeff Squyres
- Arm Patinyasakdikul
- Edgar Gabriel
- Howard
- Joshua Ladd
- Nathan Hjelm
- Ralph
- Ryan Grant
- Sylvain Jeaugey
- Todd Kordenbrock
- Milestones: https://github.com/open-mpi/ompi-release/milestones/v1.10.3
- Appears to be ready to go, but there is the PSM signal issue, which we'll discuss in a new item.
- A Comm Spawn issue also came up.
- On disconnect, the child tries to send a signal to the parent and gets an unreachable error.
- Wiki: https://github.com/open-mpi/ompi/wiki/Releasev20
- Blocker Issues: https://github.com/open-mpi/ompi/issues?utf8=%E2%9C%93&q=is%3Aopen+milestone%3Av2.0.0+label%3Ablocker
- Milestones: https://github.com/open-mpi/ompi-release/milestones/v2.0.0
-
Timing of v1.10.3 vs v2.0.0 releases
- coordination of NEWS bullets
-
PSM/PSM2 signal hijacking: fix for v1.10.x and v2.0.0
- Jeff filed a PR. Note to Mathias, Ralph, etc.: make sure the wording is good (it mentions the vendor).
- The PR looks in the environment for either the PSM or the PSM2 variable. If the env var is NOT set, it sets it to disabled.
- The default is to not produce PSM backtrace files unless the user asks for them via the env var.
- In the JNI onload, they dlopen libmpi, so do it there for this as well.
- Open MPI has always had its own backtrace handler, and we never understood where the signal handler was failing.
- For debugging, the PSM and PSM2 libraries check an environment variable (via getenv) and register handlers for various signals.
- PSM2 correctly chains the signal handlers and puts the old handlers back when it is done.
- The only reason protection is needed here is HFI (the PSM2 library): a typo was discovered there such that, at finalize, it was resetting the signal handler to a random point in memory.
- Fixes are being pushed back upstream, aiming for the latest Fedora 25 (small window), to eventually get picked up by RHEL 7.3?
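The "set it to disabled only if the user has not set it" behavior described for the PR can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: the variable names `EXAMPLE_PSM_NO_BACKTRACE` and `EXAMPLE_PSM2_NO_BACKTRACE`, the value `"1"` meaning "disabled", and both helper functions are hypothetical placeholders, since these notes do not name the real variables.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdlib.h>

/* Hypothetical placeholder names; the notes do not name the real variables. */
#define EXAMPLE_PSM_VAR  "EXAMPLE_PSM_NO_BACKTRACE"
#define EXAMPLE_PSM2_VAR "EXAMPLE_PSM2_NO_BACKTRACE"

/*
 * If `name` is not already in the environment, set it to `value`.
 * Returns 1 if the default was applied, 0 if the existing user setting
 * was kept, and -1 if setenv() failed.
 */
static int default_env_if_unset(const char *name, const char *value)
{
    if (getenv(name) != NULL) {
        return 0;   /* the user set it explicitly; leave it alone */
    }
    return (setenv(name, value, 1) == 0) ? 1 : -1;
}

/*
 * Apply the "disabled unless the user asks" default to both variables.
 * Assumption: "1" is the value that disables the backtrace handling.
 */
static void disable_backtrace_by_default(void)
{
    default_env_if_unset(EXAMPLE_PSM_VAR, "1");
    default_env_if_unset(EXAMPLE_PSM2_VAR, "1");
}
```

For this to have any effect, a call like `disable_backtrace_by_default()` would have to run before the library reads its environment, which is presumably why the notes mention doing it around the point where libmpi is loaded.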
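The idea of chaining signal handlers and putting the old ones back, as discussed above for PSM2, can be illustrated with plain POSIX `sigaction`. This is only a sketch of the general technique under simple assumptions (one signal, a previous handler installed without `SA_SIGINFO`), not the PSM2 implementation; every function and variable name below is hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stddef.h>

static struct sigaction saved_action;      /* handler in place before ours */
static volatile sig_atomic_t debug_hits;   /* signals seen by the debug handler */
static volatile sig_atomic_t app_hits;     /* signals seen by the example app handler */

/* Stand-in for a handler the application (e.g. an MPI runtime) installed first. */
static void app_handler(int signo)
{
    (void)signo;
    app_hits++;
}

/* Library-side debug handler: do our work, then chain to the previous handler. */
static void debug_handler(int signo)
{
    debug_hits++;
    if (!(saved_action.sa_flags & SA_SIGINFO) &&
        saved_action.sa_handler != SIG_DFL &&
        saved_action.sa_handler != SIG_IGN) {
        saved_action.sa_handler(signo);
    }
}

/* Install `fn` for `signo`; if `old` is non-NULL, the previous action is stored there. */
static int install_handler(int signo, void (*fn)(int), struct sigaction *old)
{
    struct sigaction sa;
    sa.sa_handler = fn;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    return sigaction(signo, &sa, old);
}

/* Library init: install the debug handler and remember what was there before. */
static int debug_handler_install(int signo)
{
    return install_handler(signo, debug_handler, &saved_action);
}

/* Library finalize: put back exactly the action that was saved at install time. */
static int debug_handler_restore(int signo)
{
    return sigaction(signo, &saved_action, NULL);
}
```

The point of the sketch is the finalize step: restoring from the saved `struct sigaction`, rather than from any other value, is what keeps the application's original handler intact after the library is done.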
-
next developer’s meeting
-
begin planning for 3.0 branch
-
MTT development
-
do we need an MTT telecon for a while (biweekly?)
-
non-member access to ompi-tests
-
MTT community database
Review Master MTT testing (https://mtt.open-mpi.org/)
- Cisco, ORNL, UTK, NVIDIA
- Mellanox, Sandia, Intel
- LANL, Houston, IBM