Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Build a library of iterator tools #12657

Open
ben-albrecht opened this issue Mar 21, 2019 · 41 comments
Open

Build a library of iterator tools #12657

ben-albrecht opened this issue Mar 21, 2019 · 41 comments

Comments

@ben-albrecht
Copy link
Member

ben-albrecht commented Mar 21, 2019

It would be useful to provide a toolkit of common serial and parallel iterators to users through a library. Python’s itertools library would make a good reference for target functionality as well as early performance comparisons. However, this task is not limited to iterators available in itertools.

Assuming we're porting python itertools, we would need to determine the following for each iterator:

  • Is this functionality already available in the base language?
  • Is this iterator parallelizable?
    • Is it embarrassingly parallel or more complex?
  • Can this iterator operate on distributed data?

There are a number of example recipes provided in the itertools documentation that demonstrate combining multiple iterators together to create powerful constructs. Being able to reproduce many (or all) of these constructs would be a good design goal.

@ben-albrecht
Copy link
Member Author

This is up for grabs as a GSoC 2019 project. Some suggestions to GSoC applicants:

  • Determining the subset of iterators to implement and which ones are parellizable/distributable is left as an exercise for the applicants, and can be included as part of the application project plan / timeline.
  • Implementing one of the iterator functions, with tests, documentation, and
    performance testing in advance would be helpful in several ways:
    • It will give you an idea of how long this process takes and should give
      you a more accurate estimate for your timeline.
    • It will also show that are ramped up on interacting with Chapel's
      testing system (start_test) and chpldoc.
  • I would expect that handling some parallel and/or distributed iterators would be part of the GSoC timeline.
    • This may limit how many iterators get implemented in the summer, and that's fine. It's OK to leave some (or many) iterators out of the timeline or push them into stretch goals section.

@ben-albrecht
Copy link
Member Author

Though not explicitly listed, this may be considered part of #6329

@akshansh2000
Copy link
Member

akshansh2000 commented Mar 22, 2019

I want to take this up as a GSoC 2019 project. I was just wondering if beginning with implementing the repeat and product itertools of python would be a good way to start. As far as I know, these tools are not available in Chapel as of now. Please let me know if I missed something, and whether to go ahead with this.

For example, the basic implementation of product for integer ranges could be given as:

iter product(x: range, y: range) {
    for i in x do for j in y do yield (i, j);
}

writeln(product(1..5, 3..5));

// (1, 3) (1, 4) (1, 5) (2, 3) (2, 4) (2, 5) (3, 3) (3, 4) (3, 5) (4, 3) (4, 4) (4, 5) (5, 3) (5, 4) (5, 5)

So what I need to do is write the tests and documentation, work on determining whether it can be made parallelizable or not, and make this process fast by reducing communication overheads, right?

Please let me know if I'm missing something.
Thank You!

@ben-albrecht
Copy link
Member Author

Hi @akshansh2000, repeat and product seem like good starts. I expect repeat to be relatively simple, but product will be a little more challenging.

Regarding your product implementation, there are a few design goals to consider:

  • We'd want to support variadic arguments
  • We'd want our arguments to be generic so that we don't have to write an overload for every iterable type
  • We'd want to support a repeat argument similar to itertools.product
  • Note that we won't be able to support heterogenous types for our arguments, due to being unable to iterate over or index into a tuple of heterogenous types in chapel.

Beyond that, documentation, evaluating performance, and determining if the implementation should/can be parallelized would be your next steps (in any order you prefer).

@akshansh2000

This comment has been minimized.

@akshansh2000
Copy link
Member

akshansh2000 commented Mar 23, 2019

The basic implementation for the repeat tool is given:

iter repeat (arg, times = 0) {
    if times == 0 then
        for count in 1.. do yield arg;
    else
        for count in 1..#times do yield arg;
}

// For integer/real/complex etc..

writeln(repeat(10, 15));

// For arrays/strings

writeln(repeat([1, 4, 6, 7], 4));
writeln(repeat("Testing", 5));

// For dictionaries

var dictDomain: domain(string);
var dict: [dictDomain] int;

dict["one"] = 1;
dict["two"] = 2;
dict["three"] = 3;

writeln(repeat((dictDomain, dict), 3));

// For infinite repetition

write(repeat("infinite"));

Output:

10 10 10 10 10 10 10 10 10 10 10 10 10 10 10
1 4 6 7 1 4 6 7 1 4 6 7 1 4 6 7
Testing Testing Testing Testing Testing
({two, three, one}, 2 3 1) ({two, three, one}, 2 3 1) ({two, three, one}, 2 3 1)
infinite infinite infinite infinite .....

Please let me know what all I need to change in this. This sure could be parallelized (only in the case of a bounded range), and I'll work on that next. The documentation and performace evaluation is further in the queue.

@LouisJenkinsCS
Copy link
Member

@akshansh2000 You may want to make this into its own Pull Request. (also side note: If you add write the code block as ```chpl you get syntax highlighting)

iter repeat (arg, times = 0) {
    if times == 0 then
        for count in 1.. do yield arg;
    else
        for count in 1..#times do yield arg;
}

You also will want a parallel iterator. I'll do this one for you so you get the gist of how to do so in Chapel...

iter repeat (arg, times = 0, param tag : iterKind) where tag == iterKind.standalone {
    if times == 0 then
        forall count in 1.. do yield arg;
    else
        forall count in 1..#times do yield arg;
}

You'll also definitely want a leader-follower version so you can do something like this...

forall (ix, a) in zip(repeat(fixedIdx), arr) {
   a = ix;
}

In the above example, you zip over an infinite stream repeat(fixedIdx) and some finite container.

@akshansh2000
Copy link
Member

akshansh2000 commented Mar 23, 2019

Thanks for the clarification on the parallel iterator, @LouisJenkinsCS! I'll work on the leader-follower part and share my progress.

(Thanks for the syntax highlighting part as well hahaha)

Also, as for the PR, I'm supposed to make a Mason package and make a PR to the mason-registry repository, right?

@LouisJenkinsCS
Copy link
Member

Hm... I'm not sure, honestly. Previously students just submit a PR to the repository and it gets stuck somewhere (likely labeled as some kind of prototype/test), but I suppose it makes sense if @ben-albrecht would want you to implement it as a Mason package.

@akshansh2000
Copy link
Member

Alright, I'll wait for @ben-albrecht sir to reply, and work on my code until then.

Thank you!

@ben-albrecht
Copy link
Member Author

ben-albrecht commented Mar 24, 2019

Mason package vs package module is still an open question at this point. Though, I'm leaning towards a package module right now so that we can utilize the nightly performance testing. Once mason packages have this functionality, we could migrate this to a mason package, like we plan to do for other existing package modules (#10713).

@akshansh2000
Copy link
Member

The repeat iterator is almost complete, I believe.
Please take a look at my implementation and let me know if I need to change something.

Also, I'm a bit confused here. How do I implement infinite repetition for a parallel loop?
In the case of a forall loop, it needs to know to the total number of tasks to divide the repeat task into different tasks, while in the case of a coforall loop, the program terminates at runtime due to excessive memory usage. Please help me out with this, I'm a bit lost.

Here's the code; I've commented the non working code below.

// Serial Iterator

iter repeat (arg, times = 0) {
    if times == 0 then
        for count in 1.. do yield arg;
    else
        for count in 1..#times do yield arg;
}

// Standalone Parallel Iterator

iter repeat (param tag : iterKind, arg, times = 0) where tag == iterKind.standalone {
    if times == 0 then
        coforall count in 1.. do yield arg; // parallel infinite loop creates problem
    else
        coforall count in 1..#times do yield arg;
}

// Procedure to compute chunks which are to be iterated through

proc computeChunk(r : range, myChunk, numChunks) where r.stridable == false {
    const numElems = r.length;
    const elemsPerChunk = numElems / numChunks;
    const mylow = r.low + (elemsPerChunk * myChunk);

    if (myChunk != numChunks - 1) then
        return mylow..#elemsPerChunk;
    else
        return mylow..r.high;
}

const numTasks = here.maxTaskPar; // max number of tasks supported by locale

// Leader

iter repeat (param tag : iterKind, arg, times = 0) where tag == iterKind.leader {
    if times == 0 then
        coforall task_id in 0.. do yield(0..1,); // parallel infinite loop creates problem
    else
        coforall task_id in 0..#numTasks {
            const working_iters = computeChunk(0..#times, task_id, numTasks);
            yield(working_iters,);
        }
}

// Follower

iter repeat (param tag : iterKind, arg, times = 0, followThis)
        where tag == iterKind.follower && followThis.size == 1 {
    const working_iters = followThis(1);

    for idx in working_iters do yield arg;
}

for element in repeat(1, 10) do write(element, ' '); // works - 1 1 1 1 1 1 1 1 1 1
writeln();

forall element in repeat('abc', 4) do write(element, ' '); // works - abc abc abc abc
writeln();

forall element in repeat(123) do write(element, ' '); // doesn't work, parallel infinite loop
writeln();

var arr: [1..5] int;

forall (idx, element) in zip(repeat(3, 5), arr) do element = idx; // works
writeln(arr); // - 3 3 3 3 3

forall (idx, element) in zip(repeat(2), arr) do element = idx; // doesn't work, parallel infinite loop
writeln(arr);

Thank you!

@ben-albrecht
Copy link
Member Author

Also, I'm a bit confused here. How do I implement infinite repetition for a parallel loop?

This is not supported today in Chapel. As a result, we will not be able to pursue parallel implementations of any infinite iterators for this project.

@akshansh2000
Copy link
Member

Alright, so should I put the rest of the code into a module and create a PR?

@LouisJenkinsCS
Copy link
Member

Oh yeah, I forgot that you can't break from a forall loop. I'm wondering if its wise to return a 'killswitch' which the leader and follower will check to see if it should continue.

@LouisJenkinsCS
Copy link
Member

Note that if I make the array the 'leader', it works fine. It has to do with the leader not knowing when to stop, which is definitely a problem. TIO

@akshansh2000
Copy link
Member

Note that if I make the array the 'leader', it works fine. It has to do with the leader not knowing when to stop, which is definitely a problem. TIO

Noted. So for now should I exclude the infinite-parallel iteration from the code and make a module out of the rest of my code, and throw an error when trying to use infinite iteration with a parallel loop?

Also, should I open an issue regarding the above?

@LouisJenkinsCS
Copy link
Member

You could halt when you have an infinite parallel loop for now, sure. The lack of a break statement in forall is a well-known issue, so no issue needed (unless Ben thinks differently)

@ben-albrecht
Copy link
Member Author

ben-albrecht commented Mar 25, 2019

You could halt when you have an infinite parallel loop for now, sure.

That sounds good to me, or just leave the parallel infinite iterator implementations out for now.

The lack of a break statement in forall is a well-known issue, so no issue needed (unless Ben thinks differently)

Despite being well-known, I think it's useful to have an issue to reference / search for. I've created a simple one here: #12700

@akshansh2000
Copy link
Member

akshansh2000 commented Apr 12, 2019

@ben-albrecht, @LouisJenkinsCS, I completed the serial version of the cycle itertool, which returns separate elements from an iterable, eg:

writeln(cycle('ABC'));
writeln(cycle(1..4, 2));
>>  A B C A B C A B C ...
>>  1 2 3 4 1 2 3 4

I was working on the parallel version, but I am not sure if it makes sense to parallelize. I don't think that a shuffling of order would be desirable.

Is there some case that I'm missing where it might be used?

@ben-albrecht
Copy link
Member Author

I don't think a standalone parallel iterator makes sense for cycle, but a leader/follower parallel iterator might be useful in cases where zipped order matters, e.g.

forall (dayOfMonth, dayOfWeek) in zip(1..30, cycle('MWTRFSU')) {
  April[dayOfMonth] = dayOfWeek;
}

@akshansh2000
Copy link
Member

@ben-albrecht, @LouisJenkinsCS, I am having some problem implementing the parallel version of the cycle tool.

The serial version is working fine. However, the parallel version throws a zippered iterations have non-equal lengths error when the length of the iterable is more than 1. Here is the code for the leader-follower iterator:

use RangeChunk;

iter cycle(param tag: iterKind, arg, times = 0) throws
    where tag == iterKind.leader {

  var numTasks = if dataParTasksPerLocale > 0 then dataParTasksPerLocale else here.maxTaskPar;

  if numTasks > times then numTasks = times;

  if __primitive("method call resolves", arg, "these") {  // to check if `arg` is iterable
    coforall tid in 0..#numTasks {
      const working_iters = chunk(0..#times, numTasks, tid);
      yield(working_iters,);
    }
  } else
    throw new owned IllegalArgumentError(
        "non-iterable type argument passed");

}

iter cycle(param tag: iterKind, arg, times = 0, followThis)
    where tag == iterKind.follower && followThis.size == 1 {

  const working_iters = followThis(1);

  for working_iters do
    for element in arg do
      yield element;

}

On running the following code,

forall (el, id) in zip(cycle('ABCD', 2), 1..#8) do writeln(el, ' ', id);

the error for unequal lengths as described above is thrown. However, on running

forall (el, id) in zip(cycle(['ABCD'], 8), 1..#8) do writeln(el, ' ', id);

the program works fine.

What could I be missing? Thanks.

@LouisJenkinsCS
Copy link
Member

@akshansh2000

['ABCD'] creates an array literal (I.E if you do writeln(arg.type : string) you'll see this fact), while 'ABCD' creates a string. When you do ['ABCD', 'ABCD', 'ABCD'] you get the same error; the issue has to do with the fact that you are yielding more than requested.

cycle('ABCD', 2) should print out 'AB', while ['ABCD'] should print out 'ABCD', 'ABCD'. To do this, you'll want to ensure that you short-circuit the computations. I.E insert a break inside of that follower sequential for loop when you hit the upper limit early.

@akshansh2000
Copy link
Member

akshansh2000 commented Apr 21, 2019

@LouisJenkinsCS

Actually, cycle('ABCD', 2) should print A B C D A B C D, and that's why in the first example I set the times argument to 2, and let id iterate from 1..8, due to the string length being 4.

One cycle iteration means going through the whole iterable exactly once. Sorry for the missing explanation.

Similarly, in the second example, as the iterable length is 1, repeating it 8 times would require id to range from 1..8

@LouisJenkinsCS
Copy link
Member

TIO

use RangeChunk;

iter cycle(arg, times = 0, param tag: iterKind) throws
    where tag == iterKind.leader {
  writeln(arg.type : string);
  var numTasks = if dataParTasksPerLocale > 0 then dataParTasksPerLocale else here.maxTaskPar;

  if numTasks > times then numTasks = times;

  if __primitive("method call resolves", arg, "these") {  // to check if `arg` is iterable
    coforall tid in 0..#numTasks {
      const working_iters = chunk(0..#times, numTasks, tid);
      yield(working_iters,);
    }
  } else
    throw new owned IllegalArgumentError(
        "non-iterable type argument passed");

}

iter cycle(arg, times = 0, followThis, param tag: iterKind)
    where tag == iterKind.follower && followThis.size == 1 {
  
  const working_iters = followThis(1);
  writeln(working_iters);
  var upperBound = working_iters.size;
  var _times = 0;
  label outerLoop
  while true do
    for element in arg {
      _times += 1;
      yield element;
      if _times == upperBound then break outerLoop;
    }

}

iter cycle(arg, times = 0) {}

forall (el, id) in zip(cycle('ABCD', 2), 1..#8) do writeln(el, ' ', id);

If you want it to go around in a cycle, it'd have to be a bit more complicated than what we have right now. If this is what you want you should obtain the size of the arg (assert that it is resolvable just like how you did for these), and then have some additional logic to skip the first N iterations over arg. RIght now it prints 'A' 'A' because both begin at the same letter. Then again, you could make this more efficient by only creating tasks if the number of times exceeds arg.size.

@LouisJenkinsCS
Copy link
Member

I think I misunderstood the intention here. A string by itself is iterable and yields characters; if you want the string itself to be iterated over, the array literal is the best approach.

@akshansh2000
Copy link
Member

@LouisJenkinsCS, does that mean storing the string as an array of characters?

@LouisJenkinsCS
Copy link
Member

TIO That should work

@akshansh2000
Copy link
Member

That is a great approach, @LouisJenkinsCS.

I'll try to optimize it and see if we can get it done using even fewer resources. I'll get back here if I have any doubts. Thank you!

@ben-albrecht
Copy link
Member Author

TIO That should work

Nice example @LouisJenkinsCS.

I have a small suggestion - Reflection.canResolveMethod would be preferred over the __primitive call, which is not a public interface (and is therefore subject to change in the future without proper deprecation).

e.g.

use Reflection only;

...

if Reflection.canResolveMethod(arg, "these") {
  ...
} else {
   throw new owned IllegalArgumentError(...);
}

ben-albrecht added a commit that referenced this issue Apr 22, 2019
Added "Itertools" module with repeat itertool

Part of #12657 

This PR adds a draft `Itertools.chpl` file to the test directory, which contains the `Itertools` module with complete documentation and testing

- Contains serial and parallel (standalone and leader-follower) iterators for the `repeat` itertool

[Contributed by @akshansh2000]
[Reviewed by @ben-albrecht]
@akshansh2000
Copy link
Member

@LouisJenkinsCS @ben-albrecht

TIO That should work

This code doesn't work if the times argument is more than 2. TIO1, TIO2

I was trying to debug it.
I took another example of a simple iterator that yields 1 eight times. The serial part of it (TIO3) is given as

iter abc() {
  for 1..8 do
    for j in 1..1 do
      yield j;
}

I then switched to a leader-follower one, but that didn't work quite well (TIO4). I set the default value of numTasks to 4 so that the range gets divided evenly.

  • The index idx starts from 2 instead of 1
  • A halt is produced: halt reached - invoking orderToIndex on an integer 8 that is larger than the range's number of indices 8

This leader-follower one should yield 1 eight times, but it fails and gives the error as mentioned in TIO4. I can't understand why this happened. Also, to check if the range was getting divided evenly and the number of yielded 1s were eight, I wrote the following code (without an upper bound in the follower of the forall loop): TIO5.

  • The number of yielded 1s are 8, and so I can't understand what caused a problem in TIO4
  • The range being sent and the one being received is the same, so no problems there
  • idx still starts from 2 for some reason

And I guess the initial error which I got a few days back (zippered iterations have non-equal lengths), TIO6, was related to this as well?

Am I doing something wrong here?

@bradcray
Copy link
Member

Most of our standard leader-follower iterators yield a tuple of indices using 0-based indexing. This is simply a convention so that if you're zippering something that's 0..7 with something that's 1..8 with something that's 1..16 by 2, they all have a common basis for what global iteration you're on. This means that an 8-element range's follower is expecting to receive indices in the 0..7 range. I expect that things are going wrong because your leader is yielding indices in the 1..8 range (but am surprised that the range iterator doesn't complain more about being given an iteration that's OOB... that seems like a bug). Specifically, if I make your leader return working_iters-1, it seems to work as expected: TIO

@akshansh2000
Copy link
Member

akshansh2000 commented Apr 29, 2019

@bradcray, yes, I should've taken care of that. I understand now.
Thank you for the explanation!

One more thing, though. Why does an unbounded range starting from 1 not cause an error? In this TIO, the yielded 1s are eight (which shouldn't be as per my understanding), and the indexing also starts from 2. Why does this happen?

@bradcray
Copy link
Member

It would've been much easier to take care of if the range follower had complained at you that you were requesting unreasonable indices... It'd be good for one of us to look into why that was.

@bradcray
Copy link
Member

bradcray commented May 1, 2019

It'd be good for one of us to look into why that was.

I just did this and only now realized that it was complaining at you, athough the error message could've been clearer:

error: halt reached - invoking orderToIndex on an integer 8 that is larger than the range's number of indices 8

I missed it the first time because I forgot that TIO separates stdout from stderr. Running it from my console made it much clearer...

@akshansh2000
Copy link
Member

@bradcray, yes, it is warning me in the case of a BoundedRange, as in TIO4, however, I still do not get any error in the case of an UnboundedRange, as in TIO5 (not even on my console).

@bradcray
Copy link
Member

bradcray commented May 2, 2019

I think that's as expected. Unbounded ranges are treated a bit specially in that they are permitted to be zippered with anything without running into a length check validation failure. So since TIO5 wasn't 0-basing its indices, it yielded its second through #8'th element (so 2..9).

[edit: There's a long-term intention to give users the ability to write their own unbounded iterators that can similarly conform to the leader's size, but we haven't ever completed the design and implementation of that feature].

@akshansh2000
Copy link
Member

I'd love to help with it in any way possible!

@bradcray
Copy link
Member

bradcray commented May 2, 2019

@akshansh2000: I wish I knew how to advise you to do so. It's a pretty big design challenge and likely to end up with a very different world than the one we have today. The general ideas we've had (that I'm aware of) are based on increasing the amount of interaction between a leader iterator and the follower iterators such that it's up to a follower to say "mismatched zipper iteration" or just let the follower consume as much of the leader as it wants. The thinking is that this interaction would also help fix a longstanding bug in which expressions like forall (i,j) in zip(1..3, 1..4) don't currently result in size mismatch errors. (I took a stab at this latter bug in #11685 and there may be some other helpful notes there as well... or not). Again: big design challenge.

@akshansh2000
Copy link
Member

@bradcray, I guess I'd have to dig deep into the core concepts of Chapel for that, on it!

ben-albrecht added a commit that referenced this issue Jun 27, 2019
Added the cycle itertool to Itertools.chpl

Part of #12657 

The `cycle` itertool returns elements from an iterable over and over again, a specified number of times.
Refer to [python itertools](https://docs.python.org/3/library/itertools.html#itertools.cycle) for more information.

This implementation includes a serial, leader, and follower iterator for `cycle()`. Note that the unbounded variant of this iterator can not be zippered in a `for/coforall` loop due to #13239.

- [x] Add the functioning code
- [x] Fix `forall` loops error in zippered contexts
- [x] Complete the documentation
- [x] Add tests


[Contributed by @akshansh2000]
[Reviewed by @ben-albrecht]
@ben-albrecht
Copy link
Member Author

For anyone curious, the current progress of the itertools library can be found here: https://github.com/chapel-lang/chapel/tree/master/test/library/packages/Itertools

Much work remains to be done.

ben-albrecht added a commit that referenced this issue Mar 3, 2020
Add accumulate tool to Itertools

Part of #12657 

Add the [accumulate tool](https://docs.python.org/3/library/itertools.html#itertools.accumulate) to the [Itertools module](https://github.com/chapel-lang/chapel/blob/master/test/library/packages/Itertools/Itertools.chpl).

- [x] Add the tool
- [x] Write relevant documentation
- [x] Write tests

This would allow such operations in arrays:
```chpl
use Itertools;

writeln(accumulate([1, 2, 3, 4, 5], operations.add));
// 1 3 6 10 15

writeln(accumulate([1, 2, 3, 4, 5], operations.multiply));
// 1 2 6 24 120
```

Currently, the supported operations are:
* add
* subtract
* multiply
* divide
* bitwise AND
* bitwise OR
* bitwise XOR

[Contributed by @akshansh2000]
[Reviewed by @ben-albrecht]
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants