nemesis ext_procs optimization #29

mpichbot · 2016-10-14T14:52:01Z

Originally by goodell on 2008-08-01 08:42:34 -0500

In [de6e5ee] I committed a rough cut of dynamic processes for nemesis
newtcp. In mpid_nem_inline.h I commented out an optimization that
uses MPID_nem_mem_region.ext_procs because it prevents the proper
operation of dynamic processes. Unfortunately, removing it adds
~100ns to our zero-byte message latencies. So there is a FIXME in
the code that reads like this:

 /* FIXME the ext_procs bit is an optimization for the all-local-procs case.
    This has been commented out for now because it breaks dynamic processes.
    Some other solution should be implemented eventually, possibly using a
    flag that is set whenever a port is opened. [goodell@ 2008-06-18] */

In general, this won't affect real users who run any inter-node jobs,
since they were already polling every time anyway. However, it does
hurt those wonderful microbenchmarks. A hack fix is to leave this in
but also check to see if a port has been opened. A possibly better
fix is to only poll the network every X iterations of "poll
everything", where X is some tunable parameter.
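A minimal sketch of the "hack fix" mentioned above, assuming the decision to poll can be reduced to two counters. The struct and function names here are hypothetical stand-ins, not the actual nemesis data structures; `num_ext_procs` only plays the role that `MPID_nem_mem_region.ext_procs` plays in the real code.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the relevant nemesis state. */
typedef struct {
    int num_ext_procs;   /* processes that are not on this node */
    int num_open_ports;  /* ports opened and not yet closed */
} net_poll_state;

/* Called from wherever a port is opened or closed (assumed hooks). */
static void port_opened(net_poll_state *s) { s->num_open_ports++; }

static void port_closed(net_poll_state *s)
{
    if (s->num_open_ports > 0)
        s->num_open_ports--;
}

/* Keep the all-local-procs fast path, but fall back to polling the
 * network whenever a port is open, since a remote process could then
 * connect to us at any time. */
static bool should_poll_network(const net_poll_state *s)
{
    return s->num_ext_procs > 0 || s->num_open_ports > 0;
}
```

The progress loop would call `should_poll_network()` before touching the network module, so the all-local case with no open ports skips the poll entirely.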

This req is a reminder for this FIXME.

-Dave



mpichbot · 2016-10-14T14:52:03Z

Originally by thakur on 2008-09-15 14:34:01 -0500

Darius will check if this is already fixed.

mpichbot · 2016-10-14T14:52:03Z

Originally by buntinas on 2008-09-16 12:37:32 -0500

This is still an issue in 1.0 and 1.1, but since it's a performance issue, I think we shouldn't hold up 1.0.8 for this.

When we add support for multiple netmods, we'll have a list of "active" netmods, and only call poll on those netmods. Doing that will resolve this issue, so we should just leave this until then.

-d

mpichbot · 2016-10-14T14:52:05Z

Originally by balaji on 2009-03-03 23:21:41 -0600

OSU reported nearly a 0.5us increase in latency here. The additional latency could have other causes too, but this note is to make sure we compare performance against 1.0.8.

mpichbot · 2016-10-14T14:52:05Z

Originally by buntinas on 2009-05-07 14:39:55 -0500

Fixed in [888cb39].

The original plan was to poll the network only when there is an external process (not on this node), or while a port is open (by MPI_Open_port). The problem is that through some communicator creation magic, a process may end up belonging to a communicator with spawned or connected processes, even though it has never called spawn or connect. So keeping track of whether there are external processes ends up being pretty hairy.

Instead, we took a different approach, and reduced the polling frequency for the tcp module. If a process hasn't had any network activity (nothing from the listener socket and no connect requests), then the poll period is very large (1<<22 for now). As soon as some activity is detected, we reduce the polling period to something smaller (currently 128). Note that in this method, because we don't know whether another process might try to connect to us, we still need to poll once in a while, even if we haven't initiated a network connection ourselves.
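The adaptive polling period described above could be sketched roughly as follows. This is an illustration under stated assumptions, not the code from [888cb39]: the type and function names are hypothetical, and only the two period values (1<<22 and 128) come from the comment.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Period values quoted in the comment above; the names are hypothetical. */
#define IDLE_POLL_PERIOD   (1u << 22)
#define ACTIVE_POLL_PERIOD 128u

typedef struct {
    unsigned period;     /* progress-loop iterations between network polls */
    unsigned countdown;  /* iterations left until the next network poll */
} poll_throttle;

static void poll_throttle_init(poll_throttle *t)
{
    assert(t != NULL);
    t->period = IDLE_POLL_PERIOD;   /* no network activity seen yet */
    t->countdown = t->period;
}

/* Call once per progress-loop iteration; returns true when the
 * network should be polled on this iteration. */
static bool poll_throttle_due(poll_throttle *t)
{
    assert(t != NULL);
    if (--t->countdown > 0)
        return false;
    t->countdown = t->period;
    return true;
}

/* Call when the listener socket or a connect request shows activity.
 * Shortens the period and pulls the next poll forward if needed. */
static void poll_throttle_saw_activity(poll_throttle *t)
{
    assert(t != NULL);
    t->period = ACTIVE_POLL_PERIOD;
    if (t->countdown > t->period)
        t->countdown = t->period;
}
```

In this sketch the period only ever shrinks; whether and when to fall back to the long period after the network goes quiet is left open, since the comment does not say.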

-d

mpichbot · 2016-10-14T14:52:06Z

Originally by jayesh on 2009-05-07 14:44:44 -0500

Since the changes are in the tcp network module, I need to port the changes to wintcp.

Regards,
Jayesh

mpichbot · 2016-10-14T14:52:07Z

Originally by jayesh on 2009-05-11 14:38:19 -0500

Ported the changes to the Windows netmod in [56c991e].

-jayesh

mpichbot self-assigned this Oct 14, 2016

mpichbot added this to the mpich2-1.1rc1 milestone Oct 14, 2016

mpichbot modified the milestones: mpich2-1.1b1, mpich2-1.1rc1 Oct 14, 2016

mpichbot closed this as completed Oct 14, 2016

mpichbot reopened this Oct 14, 2016

mpichbot closed this as completed Oct 14, 2016
