Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

broker: support binomial tree topology #4730

Merged
merged 4 commits into from Oct 27, 2022

Conversation

garlick
Copy link
Member

@garlick garlick commented Oct 27, 2022

I think @trws suggested we try a binomial tree overlay topology. This adds an option to play with:

$ ./flux start -s16 -o,-Stbon.topo=binomial flux overlay status
0 system76-pc: full
├─ 1 system76-pc: full
├─ 2 system76-pc: full
│  └─ 3 system76-pc: full
├─ 4 system76-pc: full
│  ├─ 5 system76-pc: full
│  └─ 6 system76-pc: full
│     └─ 7 system76-pc: full
└─ 8 system76-pc: full
   ├─ 9 system76-pc: full
   ├─ 10 system76-pc: full
   │  └─ 11 system76-pc: full
   └─ 12 system76-pc: full
      ├─ 13 system76-pc: full
      └─ 14 system76-pc: full
         └─ 15 system76-pc: full

Copy link
Contributor

@grondo grondo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wow, cool! It was pretty nice to see how simple this was to implement.

I didn't see any issues, except that I wondered if this should be documented in flux-broker-attributes(7). Since the binomial option is experimental at this stage, it isn't crucial IMO.

@garlick
Copy link
Member Author

garlick commented Oct 27, 2022

Thanks, I went ahead and added an update to flux-broker-attributes(7) as suggested, and corrected a typo in the unit test.

BTW, it was your suggestion of using a URI for the topology that made this so simple. That was a good one!

@codecov
Copy link

codecov bot commented Oct 27, 2022

Codecov Report

Merging #4730 (6b9e84e) into master (1e9fcff) will increase coverage by 3.30%.
The diff coverage is 90.47%.

❗ Current head 6b9e84e differs from pull request most recent head 7b4c8d2. Consider uploading reports for the commit 7b4c8d2 to get more accurate results

@@            Coverage Diff             @@
##           master    #4730      +/-   ##
==========================================
+ Coverage   80.05%   83.35%   +3.30%     
==========================================
  Files         413      413              
  Lines       69309    69756     +447     
==========================================
+ Hits        55483    58147    +2664     
+ Misses      13826    11609    -2217     
Impacted Files Coverage Δ
src/broker/topology.c 92.51% <90.47%> (-0.16%) ⬇️
src/bindings/python/flux/job/JobID.py 85.71% <0.00%> (-14.29%) ⬇️
src/bindings/python/flux/memoized_property.py 88.88% <0.00%> (-11.12%) ⬇️
src/bindings/python/flux/message.py 76.04% <0.00%> (-8.85%) ⬇️
src/bindings/python/flux/resource/ResourceSet.py 88.69% <0.00%> (-7.54%) ⬇️
src/bindings/python/flux/job/wait.py 89.13% <0.00%> (-6.22%) ⬇️
src/bindings/python/flux/job/info.py 87.78% <0.00%> (-5.72%) ⬇️
src/bindings/python/flux/resource/list.py 88.57% <0.00%> (-5.37%) ⬇️
src/bindings/python/flux/job/Jobspec.py 83.98% <0.00%> (-5.22%) ⬇️
src/cmd/flux-admin.py 86.95% <0.00%> (-3.96%) ⬇️
... and 217 more

Problem: the binomial tree has some interesting properties for HPC
overlay networks, but we only support k-ary and custom topologies.

This topology has been explored extensively for use in MPI collectives,
such as in "Optimization of Collective Communication Operations in MPICH",
Rajeev Thakur et al, IJHPCA, 2005.

A binomial tree of order k has the following general properties:
- it has 2^k nodes
- the root has fanout k
- each level has a maximum fanout of one less than the next level up
- its height is k

See also page 4 of:
https://www3.cs.stonybrook.edu/~rezaul/Fall-2017/CSE548/CSE548-lecture-9.pdf

How these properties might affect Flux TBON performance is an open question,
but now we have an option to experiment with.
Problem: no unit tests exist for binomial topology.

Add unit tests.
Problem: no system tests exist for tbon.topo=binomial

Add one to t0001-basic.t.
Problem: tbon.topo=binomial is not documented.

Add binomial to the tbon.topo attribute description.
@mergify mergify bot merged commit fe6f5aa into flux-framework:master Oct 27, 2022
@garlick garlick deleted the binomial_tree branch November 3, 2022 18:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants