
Meeting

Slides for today

Notes from meeting

Copied the notes from the Forum feedback and started going through them:

  • Pavan: MPI_CONCURRENT -> MPI_THREAD_CONCURRENT
    • Much discussion about this. See slides.
  • Jeff: Is the info object created for you in MPI_Session_get_info?
    • We follow the convention from MPI_COMM_GET_INFO, meaning: yes, a new info object is returned to the caller (see the first sketch after this list).
  • Pavan: Has concerns about bootstrapping a communicator without a parent to help set it up. How do we build up connection information and determine a context id without a pre-existing communication channel?
    • We convinced ourselves that this is not a problem -- e.g., get help from the underlying runtime.
  • Martin: Do we still need the tag given that sessions are isolated?
    • Wesley: Yes, it's still needed because sessions are local and don't know how to match across processes (see the second sketch after this list).
  • Charles: Clarify at each step which resources are assigned to the application (context IDs, NIC queues, etc.)
    • Some discussion about how sessions can be allocated different types of hardware (e.g., the 97th session in a process could fall back to TCP). Add AtoI comment in slide 11.
    • THIS IS AS FAR AS WE GOT
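For the MPI_Session_get_info question above, a minimal sketch of the borrowed MPI_COMM_GET_INFO convention: the call hands back a new info object that the caller owns and must eventually free. The MPI_Session type and the MPI_Session_get_info signature follow the 2016 draft proposal and are assumptions, not a ratified API.

```c
#include <mpi.h>

/* Draft-API sketch (MPI_Session and MPI_Session_get_info are the
 * proposal's names, not MPI 3.1): like MPI_COMM_GET_INFO, the call is
 * assumed to allocate a NEW info object for the caller, who must
 * eventually free it. */
void dump_session_info(MPI_Session session)
{
    MPI_Info info;
    MPI_Session_get_info(session, &info);   /* allocates 'info' for us */

    int nkeys;
    MPI_Info_get_nkeys(info, &nkeys);
    for (int i = 0; i < nkeys; ++i) {
        char key[MPI_MAX_INFO_KEY];
        MPI_Info_get_nthkey(info, i, key);
        /* ... fetch values with MPI_Info_get() as needed ... */
    }

    MPI_Info_free(&info);                   /* caller's responsibility */
}
```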
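For Martin's tag question, a second sketch: with no parent communicator to supply a context, concurrent group-based creations over the same group can only be matched across processes by the (group, tag) pair. The MPI_Session_init_comm signature below is a guess at the draft API, shown only to illustrate the matching problem.

```c
#include <mpi.h>

/* Hypothetical draft-API sketch: two threads in one process each build
 * a communicator over the SAME group. Sessions are purely local, so the
 * library cannot tell which creation on this process pairs with which
 * creation on a peer -- the user-supplied tag is the only cross-process
 * matching key. The MPI_Session_init_comm signature is assumed. */
void thread_a(MPI_Session session, MPI_Group group, MPI_Comm *comm)
{
    MPI_Session_init_comm(session, group, /* tag */ 0, comm);
}

void thread_b(MPI_Session session, MPI_Group group, MPI_Comm *comm)
{
    /* A distinct tag; without it, this creation could mis-match
     * against thread_a's concurrent creation on a peer process. */
    MPI_Session_init_comm(session, group, /* tag */ 1, comm);
}
```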

  • Martin: Can we eliminate the session object and use the group as the unit of isolation?
    • Wesley/Pavan: This would mean adding all the extra stuff (thread safety, error handlers, info objects, etc.) to groups, which would be gross when merging groups together, along with all the definition changes.
  • Martin: Can we remove the session argument from set-based functions?
    • Howard: You might need thread level or info to call MPI_Session_get_names.
  • Pavan: MPI_Session_create must be thread safe.
  • Wesley: Can you call MPI_Create_win_from_group and MPI_Create_file_from_group in an MPI 3.1 world?
    • Jeff: Yes, MPI_Init has an implicit session.
    • Wesley: Then these functions (and probably many others?) could potentially be presented completely separately from the sessions proposal (see the window-creation sketch at the end of these notes).
  • Hubert: What is the error case when doing things in the left column of the slides? Does it have to abort (there's no assigned error handler or scope)?
  • Pavan: If only part of the application is MPI (uses MPI for 10 hours, does something else for 10 hours), how do we force MPI to clean up its resources so we can have them back later?
    • Jeff/Room: Could require the user to keep some object around to keep MPI from cleaning up. This would mean adding the requirement that MPI will clean itself up when everything is freed.
  • Martin: Why can't we extract a session from MPI_Init?
    • Jeff: It's gross and could require a bunch of existing semantics to change. For example, what if you finalize the implicit session? Can you still call MPI_Finalize?
  • Dan: Can we have a function to translate a group from one session to another? (these notes could be wrong here, Wesley didn't follow all of this discussion)
  • Pavan: Calling finalize hooks from threads other than their own could cause problems.
  • Pavan: MPI_Session_finalize as presented is kind of collective. We don't want that. Who would it be collective with?
    • Tony: If we say that cancelling a send is illegal with sessions, does that make MPI_Session_finalize non-collective?
    • Hubert: What about MPI_Request_free?
    • Everyone: Crap...
    • Martin: What if we say that all communication taking place in the session must be done?
    • Aurelien: What about sends where the data is buffered but not transferred?
  • Jeff: How do you abort "all connected processes" when you may not have connected to all processes in mpi://WORLD?
    • Wesley: This would make the new error handler definitions very gross (the definition of MPI_ERRORS_ARE_FATAL leverages "all connected processes" to mean everyone in MPI_COMM_WORLD plus connected dynamic processes).
  • Martin/Pavan: If you can't create the global address table at init time, that could make the common case of address tracking expensive, because you may need per-communicator arrays to track all addressing info.
    • Pavan: You may be able to recover this by allocating the big array (sized to potentially hold all procs) at MPI_Session_create time.
    • Jeff: This already isn't a problem for OMPI because it uses a dynamically growing array of pointers to proc structs (a generic version is sketched at the end of these notes).
  • Martin: In MPI 3.1, does MPI_Init still need to be collective?
  • Pavan: MPI_IO can't be the same on all communicators. In fact, many of the built-in attribute keys may not want to be the same on all communicators.
    • All: Should the special attributes be allowed to differ per communicator? Probably, especially for MPI_TAG_UB and MPI_IO (see the attribute sketch at the end of these notes).
  • Aurelien: Instead of using a parent_comm for MPI_Exec, why not use a group and tag like other communicator creation functions?
  • Wesley: The new runtime sets from MPI_Exec will not be visible everywhere (can only see the sets you're in). Any one involved process will see at most two out of three.
    • You can construct the missing one with group subtraction (see the group-difference sketch at the end of these notes).
  • All: Is there a good use case for needing all three exec sets anyway? We can derive the set we are in (parent vs. children). We can't get the other one (because we're not in it).
    • The only one we need is the new big set that includes all processes in parent and children.
  • Pavan: How do we know when processes are done so it's safe to spawn again?
  • Pavan: MPI doesn't need a "replace" operation because an application can call MPI_Session_finalize and then execvp (see the finalize-and-exec sketch at the end of these notes).
    • Anh: execvp doesn't exist on Windows.
  • Jeff: Add thread safety to MPI_Session_init_comm.
    • Wesley: What about error handler and info?
  • Pavan: Multithreading may be a problem where the tag isn't enough, because threads can execute in any order.
    • Pavan: However, one MPI call can't block the entire stack so maybe it's ok.
  • Martin: The wording around set_name on MPI_Session_init_comm needs to be cleaned up.
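On Wesley's MPI 3.1 question above, a sketch of the window-creation idea: since MPI_Init establishes an implicit session behind MPI_COMM_WORLD, group-based window creation could layer on top of it with no explicit session object. MPI_Create_win_from_group is the proposal's name; its signature here (modeled on MPI_Win_create plus a group and tag) is a guess.

```c
#include <mpi.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);                 /* implicit session */

    /* Derive a group from MPI_COMM_WORLD; no session handle needed. */
    MPI_Group world_group;
    MPI_Comm_group(MPI_COMM_WORLD, &world_group);

    /* Proposed call; the signature is hypothetical. */
    static char buf[4096];
    MPI_Win win;
    MPI_Create_win_from_group(buf, sizeof(buf), /* disp_unit */ 1,
                              MPI_INFO_NULL, world_group, /* tag */ 0,
                              &win);

    MPI_Win_free(&win);
    MPI_Group_free(&world_group);
    MPI_Finalize();
    return 0;
}
```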
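On the address-table discussion, a generic sketch (not Open MPI's actual code) of the pattern the notes attribute to OMPI: a dynamically growing array of proc-struct pointers, so no world-sized table is needed up front.

```c
#include <stdlib.h>

/* Generic sketch, not Open MPI's implementation: procs are discovered
 * lazily and appended to a geometrically growing pointer table, so the
 * full table never has to be sized at init time. Error handling is
 * elided. */
typedef struct proc {
    int rank;
    /* addressing info, endpoints, ... */
} proc_t;

typedef struct {
    proc_t **procs;
    size_t   size, capacity;
} proc_table_t;

static void proc_table_add(proc_table_t *t, proc_t *p)
{
    if (t->size == t->capacity) {       /* double when full */
        size_t ncap = t->capacity ? 2 * t->capacity : 16;
        t->procs = realloc(t->procs, ncap * sizeof(*t->procs));
        t->capacity = ncap;
    }
    t->procs[t->size++] = p;
}
```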
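On the built-in attribute discussion, note that attributes are already queried per communicator in MPI 3.1, so allowing the values to differ would not change the calling pattern; a sketch of reading MPI_TAG_UB:

```c
#include <mpi.h>
#include <stdio.h>

/* MPI 3.1 already queries attributes per communicator; the open
 * question is only whether values such as MPI_TAG_UB and MPI_IO may
 * legally differ between communicators. */
void print_tag_ub(MPI_Comm comm)
{
    int *tag_ub, flag;
    MPI_Comm_get_attr(comm, MPI_TAG_UB, &tag_ub, &flag);
    if (flag)
        printf("MPI_TAG_UB on this communicator: %d\n", *tag_ub);
}
```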
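On deriving the MPI_Exec set a process is not in, the subtraction is already expressible with MPI 3.1 group arithmetic. In this sketch, all_group (parent plus children) and mine_group (the set this process belongs to) are hypothetical handles obtained from the proposed MPI_Exec machinery:

```c
#include <mpi.h>

/* The set we are NOT in = (parent + children) minus the set we are in.
 * MPI_Group_difference is standard MPI; the two input handles are
 * assumed to come from the proposed MPI_Exec interface. */
void derive_other_set(MPI_Group all_group, MPI_Group mine_group,
                      MPI_Group *other_group)
{
    MPI_Group_difference(all_group, mine_group, other_group);
}
```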
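On the finalize-and-exec point, a POSIX-only sketch (MPI_Session_finalize is the draft name; as Anh notes, execvp has no Windows equivalent):

```c
#include <mpi.h>
#include <unistd.h>

/* POSIX-only sketch of "replace" without dedicated MPI support: tear
 * the session down, then replace the process image. MPI_Session and
 * MPI_Session_finalize follow the draft proposal's names. */
void replace_self(MPI_Session *session, char *const argv[])
{
    MPI_Session_finalize(session);  /* release all MPI resources */
    execvp(argv[0], argv);          /* returns only on error */
}
```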