Provide user-facing aggregators #16963

Open
ronawho opened this issue Jan 19, 2021 · 1 comment

@ronawho
Contributor

ronawho commented Jan 19, 2021

For applications that send many small messages, aggregation is critical for achieving high performance on commodity networks, and it improves performance on HPC networks as well. Copy aggregators are used extensively in Arkouda to improve small-message performance. As a rough guideline when sending highly concurrent small (8-byte) messages:

  • Ethernet and non-HPC networks: these generally have very low small-message rates. Aggregation can provide a ~5000x speedup
  • InfiniBand: Chapel maps poorly to InfiniBand in this regard (Improve gasnet-ibv performance #14438 -- "Improve Fine-grained Comm Performance"). So while InfiniBand hardware has decent small-message rates, Chapel over InfiniBand does not. Aggregation can provide a ~1000x speedup
  • Cray Aries: Chapel maps well to Aries, which has high small-message rates, but there is still a 2-3x speedup from using aggregation

Arkouda only has copy aggregators (assignment between trivially copyable types). They must be created on a per-task basis, and you have to specify whether the LHS (destination) or RHS (source) is remote. We would like to add aggregation to the standard library so it is available to all users, but as part of that effort we would like to improve the ergonomics and support arbitrary aggregators, not just copy aggregators.

Example Usage:

A copy of the aggregators was added to the test directory in #16726 for users who want to experiment. From https://github.com/chapel-lang/chapel/tree/master/test/studies/bale/aggregation, copy AggregationPrimitives.chpl and CopyAggregation.chpl, and see ig.chpl (indexgather) for an example of how to use them. For a SrcAggregator (Src / RHS is remote), aggregation would look something like:

use BlockDist, Random, CopyAggregation;

const numTasks = numLocales * here.maxTaskPar;
config const N = 1000000; // number of updates per task
config const M = 10000;   // number of entries in the table per task

const numUpdates = N * numTasks;
const tableSize = M * numTasks;

proc main() {
  const D =  newBlockDom(0..#tableSize);
  var A: [D] int = D;

  const UpdatesDom = newBlockDom(0..#numUpdates);
  var Rindex: [UpdatesDom] int;

  fillRandom(Rindex, 208);
  Rindex = mod(Rindex, tableSize);
  var tmp: [UpdatesDom] int;

  // Unaggregated
  forall (t, r) in zip (tmp, Rindex) do
    t = A[r];

  // Aggregated
  forall (t, r) in zip (tmp, Rindex) with (var agg = new SrcAggregator(int)) do
    agg.copy(t, A[r]);
}
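
For the opposite direction, where the destination (LHS) is remote, a minimal index-scatter sketch could look like the following. It assumes that CopyAggregation.chpl also provides a DstAggregator with the same copy(dst, src) call shape as SrcAggregator, and that an aggregator flushes its buffers when it goes out of scope at the end of the forall. The names n, perm, and vals are made up for illustration, and newBlockDom matches the Chapel version used in the example above.

```chapel
use BlockDist, CopyAggregation;

config const n = 1000;       // number of elements to scatter (illustrative)

proc main() {
  const D = newBlockDom(0..#n);
  var A: [D] int;            // destination table, written remotely
  var vals: [D] int = D;     // values to write: vals[i] == i
  var perm: [D] int;         // target indices

  // Use a reversal so every target index is written exactly once
  forall i in D do
    perm[i] = n - 1 - i;

  // Dst / LHS is remote: buffer the writes per task and send them in bulk
  forall (p, v) in zip(perm, vals) with (var agg = new DstAggregator(int)) do
    agg.copy(A[p], v);

  // Assumption: buffered writes were flushed when each task's aggregator
  // was destroyed, so A can be checked here
  forall i in D do
    assert(A[n - 1 - i] == i);

  writeln("scatter complete");
}
```

The rule of thumb is that the local side of the assignment is the one the loop iterates over (vals here, tmp in the gather example), and the aggregator type names the side that is remote.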

More info:

For a high-level overview of existing aggregation efforts in Arkouda see:

And for a more detailed history of the code in Arkouda see:

For providing arbitrary aggregators I expect to draw upon previous work from CAL (Chapel Aggregation Library):

A primary difference between this work and that effort is that these aggregators are created on a per-task basis, so there's no contention between competing tasks, which is important for performance.

@e-kayrakli
Contributor

#17657 proposes adding a Communication module. User-facing aggregation should fit well with that interface when we add it. Under that issue, I proposed a direction that is slightly different from what's outlined in the OP, where instead of

forall (t, r) in zip (tmp, Rindex) with (var agg = new SrcAggregator(int)) do
    agg.copy(t, A[r]);

we'd have

forall (t, r) in zip (tmp, Rindex) with (var agg = new SrcAggregator(int)) do
    copy(t, A[r], aggregator=agg);

I don't feel too strongly about this, but I think it'd be interesting, and it will probably feel more unified with the copy function proposed in that issue.
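
To make the second form concrete, here is a rough sketch of how a free function could forward to the existing aggregator method. This is purely hypothetical: the copy proc below is not the function proposed in #17657, its signature is an assumption, and it only shows that the call shape can be layered on top of SrcAggregator as it exists in CopyAggregation.chpl today.

```chapel
use BlockDist, CopyAggregation;

// Hypothetical wrapper (not an existing Chapel API): forwards to the
// aggregator's own copy method so call sites can pass it as a named argument
inline proc copy(ref dst, const ref src, ref aggregator) {
  aggregator.copy(dst, src);
}

config const n = 1000;       // problem size (illustrative)

proc main() {
  const D = newBlockDom(0..#n);
  var A: [D] int = D;        // A[i] == i
  var idx: [D] int;          // indices to gather from
  var tmp: [D] int;          // local destination of the gather

  forall i in D do
    idx[i] = n - 1 - i;

  // Same gather as in the OP, written with the function-style call
  forall (t, r) in zip(tmp, idx) with (var agg = new SrcAggregator(int)) do
    copy(t, A[r], aggregator=agg);

  // Assumption: each task's aggregator flushed when the forall finished
  forall i in D do
    assert(tmp[i] == n - 1 - i);
}
```

A real version would presumably also need an unaggregated overload without the aggregator argument, which is where the unification with a general copy function would come from.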

ronawho added a commit that referenced this issue Mar 11, 2022
Migrate copy aggregators to a package module

[reviewed by @bradcray and @e-kayrakli]

We've long wanted to make aggregators user-facing (#16963), but haven't
made much progress on that. Longer term we want better names, support
for third-party operations, arbitrary user-defined operations, and
probably some other things, but in the short term this just exposes the
implementation we have now and adds a short module doc with examples.

Closes Cray/chapel-private#3178