
What are the intended semantics of <== and <<== #27

Closed
lizmat opened this issue May 15, 2019 · 11 comments
Labels: 6.e (Related to the next 6.e language release), language (Changes to the Raku Programming Language)

lizmat (Collaborator) commented May 15, 2019

See rakudo/rakudo#2899 for the start of this discussion.

AlexDaniel added the language label May 15, 2019
Kaiepi commented May 17, 2019

With rakudo/rakudo#2903, I think <== and ==> are brought in line with the spec.

The question is how <<== and ==>> are supposed to work. Should code like this be allowed to run?

[4,5,6] ==>> [1,2,3] ==>> my @foo;

Or should only one appending feed operator be allowed at a time?

my @foo;
@foo <<== [1,2,3];
@foo <<== [4,5,6];

If more than one should be allowed, should they be allowed in combination with their respective assigning operators, like this?

my @even <== grep { $_ %% 2 } <== 1..^100;
@even <<== grep { $_ %% 2 } <== 100...*;

Kaiepi commented May 18, 2019

Also, from the parallelization pullreq:

There's a problem with this... this benches slower than the current implementation of feed operators, even when there's blocking I/O going on at the same time. I think more discussion is needed about whether or not this should be implemented.

Feed operators were benching much faster in the first pullreq I made. Should we ignore the spec about parallelizing feed operators?

lizmat (Collaborator, Author) commented May 19, 2019

FWIW, I don't think feeds need to create containers, so we can have that performance benefit. It's only the storing into the endpoint that should create containers, if the receiving end wants them (e.g. Array vs List).
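For illustration, the Array-vs-List container distinction being referred to, in plain Raku (background only, not part of the proposed feed implementation):

my @a = 1, 2, 3;      # Array: each element gets a fresh Scalar container
@a[0] = 42;           # so assignment to an element works
say @a;               # [42 2 3]

my $l = (1, 2, 3);    # List: bare values, no per-element containers
# $l[0] = 42;         # dies: List elements are not containers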

Kaiepi commented May 19, 2019

Disregard what I said about ignoring the spec; I figured out how to get parallelized feed operators to run 5x faster than the current implementation.

Kaiepi commented May 25, 2019

Before I can continue with my pullreq, there's something that needs to be resolved. Modules in the ecosystem are using feed operators with things that aren't iterable. Here's an example from CUID:

sub timestamp {
    (now.round(0.01) * 100)
    ==> to-base36()
    ==> adjust-by8()
    ==> padding-by8()
}

Should this behaviour be preserved?

lizmat (Collaborator, Author) commented May 25, 2019

Does that currently return an array or a scalar?

Kaiepi commented May 25, 2019

A scalar

lizmat (Collaborator, Author) commented May 26, 2019

Then I think an nqp::p6store will take care of that eventuality.

jnthn (Contributor) commented May 29, 2019

Before I can continue with my pullreq, there's something that needs to be resolved. Modules in the ecosystem are using feed operators with things that aren't iterable.

My feeling is that any function you feed a value into had better be happy with getting its input as a final extra Iterable argument (presumably a Seq with an underlying iterator that is pulling from a Channel). Or, once we support it, such an argument at the insertion point.
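For a rough feel of that shape with today's API (the stage name my-stage is made up for illustration), Channel.list gives a lazy sequence whose iterator pulls values as the producer sends them:

my $ch = Channel.new;
start { $ch.send($_) for 1..5; $ch.close }

# a hypothetical stage receiving its input as a final Iterable argument
sub my-stage($multiplier, +@input) {
    @input.map(* * $multiplier)
}

say my-stage(2, $ch.list);   # (2 4 6 8 10)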

If there are things in the ecosystem that don't play well with that (and I don't believe the example given here will), we may need to preserve the existing semantics for 6.d and below, and introduce the new ones for 6.e.PREVIEW and onwards.

The feed operators really haven't gotten that much attention to date. The implementation before the recent work was very much a case of "first draft", and certainly didn't explore the parallel aspects alluded to in the language design docs. I'd be surprised if we can make them behave usefully going forward without breaking some of the (less thought out, and probably accidental) past behaviors.

jnthn (Contributor) commented May 30, 2019

Also, some notes on the parallelism model with feed operators: it's quite different from the hyper/race approach.

In the hyper/race case, we take the data, divide it up into batches, and work on it. Where possible, for the sake of locality, we try to push a single batch through many operations, e.g. if you do @source.race.map(&foo).grep(&bar).map(&baz) then we'd send a batch, do the maps/grep in the worker, and send back the resulting values. In this model, the parallelism comes from dividing the input data. The back-pressure here is provided by the final consumer of the pipeline.
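To make the batching model concrete, a runnable sketch in today's Raku (the batch and degree values are arbitrary choices for illustration, not anything the feed work prescribes):

my @source = 1..1_000;
my @result = @source.race(batch => 64, degree => 4)
                    .map(* ** 2)
                    .grep(* %% 3)
                    .map(* + 1);
say @result.elems;   # each worker pushes its whole batch through the map/grep/map chain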

By contrast, the feed model is about a set of steps that execute in parallel. The parallelism is in the stages of the pipeline being run in parallel, not from dividing the data items. It can be seen as a simple case of a Staged Event-Driven Architecture. Since a given stage is single-threaded, it may be stateful - whereas if you try to do stateful things in a map block in a hyper/race it's going to be a disaster. The backpressure model here would ideally be that once a queue becomes full, you cannot put anything more into it. One possible solution here would be to make Channel take an optional bound. Then a send into a Channel that is considered full would block, so you can't put more in, meaning a fast stage can't overwhelm a slow one.
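A minimal sketch of that bound, built from today's Channel and Semaphore (the name BoundedChannel and the default bound of 64 are made up for illustration; the actual suggestion above is to build this into Channel itself):

class BoundedChannel {
    has Int       $.bound = 64;   # illustrative default
    has Channel   $!channel .= new;
    has Semaphore $!slots;

    submethod TWEAK() {
        $!slots = Semaphore.new($!bound);
    }

    method send($value) {
        $!slots.acquire;          # blocks once $!bound items are in flight
        $!channel.send($value);
    }

    method receive() {
        my $value = $!channel.receive;
        $!slots.release;          # a consumed item frees a slot for the producer
        $value
    }

    method close() { $!channel.close }
}

A fast stage sending into a full BoundedChannel then blocks until the slow downstream stage catches up, which is exactly the backpressure described.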

One slightly more general problem is that Channel today doesn't really fit our overall concurrency model very well: it blocks a real OS thread when we try and receive from it, whereas in reality we like non-blocking awaiting of things where possible. I mention that here mostly because I think the stages in a pipeline should be spawned on the thread pool scheduler, but it's quite clear that they won't be the best behaved schedulees with Channel as it exists today. Probably we should solve that at the level of Channel, though, so I'd just use Channel between the stages today. It means we get error and completion conveyance, which are easy to get wrong, so I'd rather not have more implementations of those. :-)
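For reference, the non-blocking way to consume a Channel that exists today goes through its Supply coercion, e.g.:

my $ch = Channel.new;
start {
    $ch.send($_) for 1..3;
    $ch.close;
}
react {
    # whenever taps $ch.Supply, so no OS thread parks in .receive
    whenever $ch -> $value {
        say "got $value";
    }
}

This also carries completion (channel closed) and errors for free, which is the conveyance referred to above.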

Some problems will be better parallelized with hyper/race, some with feed pipelines, but there's also the issue that some things aren't even worth bothering with. I fear the ==> operator is especially vulnerable to that: while I don't think too many folks will write .hyper because it looks prettier, they probably will write ==> for that reason. If we magically speed up their programs with parallelism that's great, but there's a decent chance it won't be worth it, and will in fact slow things down. That's a tricky problem, and it's also one we'll have to solve for the hyper/race model too. For now, I'd say just do the parallel implementation, and we'll investigate such heuristics and automatic decision making later. I don't think usage of ==> is widespread enough yet for us to really upset anything.

Kaiepi commented Jul 20, 2019

The parallelization part of this is done; all that's left is support for <<==, ==>>, and *. I have a question regarding how <<== and ==>> should work, though:

my @foo = (1, 2, 3);
(4, 5, 6) ==>> @foo ==>> my @bar;
say @bar; # OUTPUT: (1, 2, 3, 4, 5, 6)

What should the value of @foo be after running this? (1, 2, 3, 4, 5, 6) or (4, 5, 6)? I think (4, 5, 6) DWIMs better, but I'm not entirely sure.

vrurg added the 6.e label Nov 27, 2019
vrurg added this to In Development in v6.e Release Nov 27, 2019
lizmat closed this as completed May 26, 2020
v6.e Release automation moved this from In Development to Done May 26, 2020