WeeklyTelcon_20160426
Geoff Paulsen edited this page Apr 26, 2016 · 8 revisions
- Dialup Info: (Do not post to public mailing list or public wiki)
- Jeff Squyres
- Todd Kordenbrock
- Sylvain Jeaugey
- Ralph
- Nysal
- Nathan Hjelm
- Joshua Ladd
- Howard
- Geoff Paulsen
- Milestones: https://github.com/open-mpi/ompi-release/milestones/v1.10.3
- PR 1097 for 1.10 may be moot.
- PSM2 issue, short version: the PSM2 API uses a fixed UUID, so all jobs across the cluster use the same UUID (bad).
- Jeff will check the 1.10.3 library versions. Ralph already updated them for 1.10.3, but Jeff will verify.
- Wiki: https://github.com/open-mpi/ompi/wiki/Releasev20
- Blocker Issues: https://github.com/open-mpi/ompi/issues?utf8=%E2%9C%93&q=is%3Aopen+milestone%3Av2.0.0+label%3Ablocker
- Milestones: https://github.com/open-mpi/ompi-release/milestones/v2.0.0
Review Master MTT testing (https://mtt.open-mpi.org/)
- Widespread mpool / rcache failures on usNIC last night.
- Ralph is seeing a bunch of attribute failures on 1.10.
- Jeff is passing BTL parameters that limit him to a shared-memory component, but the job is going across nodes. The attribute tests therefore report failures, because some of the processes can't communicate.
- 1.10 is hanging if it doesn't get enough slots.
- Cisco, ORNL, UTK, NVIDIA
- Mellanox, Sandia, Intel
- LANL, Houston, IBM