Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

implement scalable reduction for wireup protocol #781

Merged
merged 9 commits into from Aug 24, 2016

Conversation

Projects
None yet
4 participants
@garlick
Copy link
Member

garlick commented Aug 23, 2016

This PR should speed up the "hello" protocol executed during wireup a little bit.
The old protocol was for every node to send a small message directly to rank 0.
Once all the messages were received, wireup was complete.

The new protocol uses TBON topology-informed reduction to reduce a count. Once the count reaches the target size wireup is complete.

I guess this is also another simple example of how the reduction handle works.

See long comment in ddc83ef

@garlick garlick added the review label Aug 23, 2016

@coveralls

This comment has been minimized.

Copy link

coveralls commented Aug 23, 2016

Coverage Status

Coverage decreased (-0.02%) to 75.1% when pulling 68f289e on garlick:hello_reduce into 752bcbf on flux-framework:master.

@grondo

This comment has been minimized.

Copy link
Contributor

grondo commented Aug 23, 2016

Nice work!

@garlick garlick force-pushed the garlick:hello_reduce branch from 68f289e to 7dd9267 Aug 23, 2016

@codecov-io

This comment has been minimized.

Copy link

codecov-io commented Aug 23, 2016

Current coverage is 74.75% (diff: 78.43%)

Merging #781 into master will decrease coverage by 0.03%

@@             master       #781   diff @@
==========================================
  Files           145        145          
  Lines         25037      25038     +1   
  Methods           0          0          
  Messages          0          0          
  Branches          0          0          
==========================================
- Hits          18723      18716     -7   
- Misses         6314       6322     +8   
  Partials          0          0          
Diff Coverage File Path
••••• 50% src/broker/broker.c
••••••• 79% src/broker/hello.c
•••••••••• 100% src/common/libflux/reduce.c
•••••••••• 100% src/modules/live/live.c

Powered by Codecov. Last update 569565e...6c23dff

@coveralls

This comment has been minimized.

Copy link

coveralls commented Aug 24, 2016

Coverage Status

Coverage decreased (-0.05%) to 75.071% when pulling 7dd9267 on garlick:hello_reduce into 752bcbf on flux-framework:master.

@grondo

This comment has been minimized.

Copy link
Contributor

grondo commented Aug 24, 2016

Is this one ready to merge, or were you still investigating the hello reduction functionality?

@garlick

This comment has been minimized.

Copy link
Member Author

garlick commented Aug 24, 2016

Still working on it - let's hold off a bit.

@garlick garlick force-pushed the garlick:hello_reduce branch 2 times, most recently from 5c3416f to e6d39e5 Aug 24, 2016

@garlick

This comment has been minimized.

Copy link
Member Author

garlick commented Aug 24, 2016

I had noted a problem with this PR where the timeout values were not being calculated properly in the reduction handle. The topology information was used to scale the timeout based on the level in the TBON and the topology was read using the wrong attribute interface for broker-resident services.

I decided the scaling of timeout values was counter-intuitive for the user and ought to be the user's responsibility to set up if it is needed, and so removed it from the flux_reduce class and docs.

I also rebased on current master, and now I think it's ready for merge consideration.

@coveralls

This comment has been minimized.

Copy link

coveralls commented Aug 24, 2016

Coverage Status

Coverage decreased (-0.02%) to 75.084% when pulling e6d39e5 on garlick:hello_reduce into 569565e on flux-framework:master.

@coveralls

This comment has been minimized.

Copy link

coveralls commented Aug 24, 2016

Coverage Status

Coverage increased (+0.05%) to 75.15% when pulling e6d39e5 on garlick:hello_reduce into 569565e on flux-framework:master.

@garlick

This comment has been minimized.

Copy link
Member Author

garlick commented Aug 24, 2016

Sorry, just pushed one last change to allow the HWM to be set via attribute. That allows reduction to be turned off completely for performance comparisons, e.g.

$ srun [options] flux start -o,-Shello.hwm=0,-Shello.timeout=0,-Slog-stderr-level=6 /bin/true
@coveralls

This comment has been minimized.

Copy link

coveralls commented Aug 24, 2016

Coverage Status

Coverage decreased (-0.02%) to 75.081% when pulling 5a31c05 on garlick:hello_reduce into 569565e on flux-framework:master.

broker/hello: use reduction for wireup scalability
Problem: "hello" wireup protocol doesn't scale.

Instead of having each rank send a message directly to rank 0
to show that it has wired up, implement a reduction handle so
messages are collected at each node of the TBON and a sum is
passed to rank 0.

The computed topology is used to establish a HWM value at each node
of the TBON.  The "tbon.descendants" broker attribute was previously
set up in anticipation of this need.  If all goes well, wireup
completes in lock step, with each level of the TBON reporting upstream
once its downstream nodes have reported in.  The computed HWM
value may be overridden by setting the hello.hwm broker attribute
on the broker command line.  setting it to zero disables the HWM
based reduction.

In addition, a timeout is set on the reduction handle so that after
a scaled fraction of the timeout based on the TBON level, the HWM
value is ignored and any messages are sent upstream without delay.
The default timeout (10s) can be manipulated by setting the
hello.timeout broker attribute on the broker command line.
Setting it to zero disables the timeout.

Note that the broker -T,--timeout option was removed and had a
different meaning; it would cause the wireup to abort once the
timeout was exceeded.  If thus functionality is required at some
point it should be re-added as a broker attribute.

Progress at rank 0 is logged at LOG_INFO, at each reduction sink
operation.

Before wireup starts, PMI exchanges endpoint information through
the enclosing instance.  Although there is a PMI_Barrier() between
the PMI write and read phases, one was needed after the read phase
so that PMI and wireup times reported on rank 0 properly reflect
instance-wide PMI and wireup times; otherwise wireup might start
at different times on different ranks.

@garlick garlick force-pushed the garlick:hello_reduce branch from 5a31c05 to 6c23dff Aug 24, 2016

garlick added some commits Aug 23, 2016

libflux/reduce: drop internal timer scaling
Problem: timeout scaling is unnecessarily complicated and internally
use flux_attr_get() to obtain TBON geometry, which doesn't currently
work in broker-resident services.

Drop the internal scaling.  The purpose of the scaling was to maximize
reduction benefit for a particular monitoring use case, where batches
of items to be reduced were generated simultaneously across the instance.
In that case, the timeout can be calculated externally and passed to
flux_reduce_create() or flux_reduce_opt_set().  In other use cases
such as when the timeout is used as a backup for slow HWM convergence,
scaling is inappropriate.
modules/live: note lack of reduction scaling
The protocol for building the topology in the 'live' module
uses TIMEDFLUSH reduction.  Annotate the fact that the timeouts
are no longer scaled by TBON height, which will affect the performance
of this protocol.  It was not updated because 'live' is not
currently loaded by default, and this protocol may be reworked
in the near term.
@coveralls

This comment has been minimized.

Copy link

coveralls commented Aug 24, 2016

Coverage Status

Coverage decreased (-0.03%) to 75.074% when pulling 6c23dff on garlick:hello_reduce into 569565e on flux-framework:master.

@grondo

This comment has been minimized.

Copy link
Contributor

grondo commented Aug 24, 2016

ok, this all looks fine to me of course. Merging.

@grondo grondo merged commit 181d3ca into flux-framework:master Aug 24, 2016

4 checks passed

codecov/patch 78.43% of diff hit (target 74.78%)
Details
codecov/project Absolute coverage decreased by -0.03% but relative coverage increased by +3.65% compared to 569565e
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
coverage/coveralls Coverage decreased (-0.03%) to 75.074%
Details

@grondo grondo removed the review label Aug 24, 2016

@garlick garlick referenced this pull request Oct 26, 2016

Closed

0.5.0 release notes #879

@garlick garlick deleted the garlick:hello_reduce branch Oct 26, 2016

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.