Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Revive pull request #1348: "Issue 9882 - Implement a "tee" style InputRange... #1965

Merged
merged 28 commits into from Jul 16, 2014

Conversation

MetaLang
Copy link
Member

...so that a function can be called during a chain of InputRanges."

@brad-anderson
Copy link
Member

#1348

template tee(alias func, Flag!"pipeOnFront" pipeOnFront = No.pipeOnFront)
if (is(typeof(unaryFun!func)))
{
auto tee(Range)(Range inputRange) if (isInputRange!(Range))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nitpick: isInputRange!Range

@brad-anderson
Copy link
Member

I'd change the commit message to describe what is changed rather than discuss GitHub processes. Also the Pull Request description should say what this is for.

{
private Range _input;

this(Range r)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This constructor is basically dead code.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

@JakobOvrum
Copy link
Member

The result should propagate length.

Perhaps it should also propagate bidirectionality and random access (and slicing?), with pipeOnFront taken into account on back/popBack, of course. It sounds like it would be a lot easier if it could use sub-typing.

As for the idea itself, as I've stated before, I personally don't want this in Phobos because it encourages mixing functional-style and imperative-style code. tee, as presented, is essentially imperative code disguised as functional code. In particular, I think how tee is intended for extra side-effects, masked as functional code, hurts readability.

edit:

It looks like the commit should have retained the original author as an act of courtesy.

@MetaLang
Copy link
Member Author

"It looks like the commit should have retained the original author as an act of courtesy."

Whoops, that was completely unintended. Is there a way to fix that?

@JakobOvrum
Copy link
Member

The easiest way would be to simply checkout the relevant branch on the author's fork, then rebasing that onto a new branch of your own, branched off master. Optionally this could use interactive-rebase's squash command to merge the original three commits into one.

@MetaLang
Copy link
Member Author

@eco Edited with the original commit message.

@JakobOvrum Is it possible to make a new commit with the updated author? I originally checked out the author's branch, cherry-picked it into a new branch, then made a few changes and rebased, which squashed the original commit with them as the author... I think.

@JakobOvrum
Copy link
Member

You'll want to remake the commit that introduces the relevant changes. The easiest way to do that is with rebase. In this case, it's also not terribly tedious to use cherry-pick, as there are only three commits. You can also use tools such as git commit --amend --author to do the same thing "manually". Note that you don't have to make a new pull request; reuse your branch then force push. If you're unfamiliar with the aforementioned tools, I'd be happy to post step-by-step instructions :)

@MetaLang
Copy link
Member Author

git commit --amend --author sounds a lot simpler. I'll just do that with a commit that removes the outer template.

…o that a function can be called during a chain of InputRanges."
@JakobOvrum
Copy link
Member

As I said, git-rebase really is the easiest way... but it does predicate familiarity with that command.

@MetaLang
Copy link
Member Author

I am not comfortable enough with git that I'm confident I wouldn't mess anything up.

@MetaLang
Copy link
Member Author

Removed outer template and the redundant constructor. As for tee's usefulness, it is intended to aid in debugging range code. It is somewhat dirty inserting imperative code into a mostly-functional range chain, but we can also use debug in pure functions for debugging. This has an advantage over std.algorithm.map in that the user can choose whether to call the lambda function when front is accessed, or only on popFront.

@JakobOvrum
Copy link
Member

The proposed documentation does not state that this is intended for debug code. With debug statements in pure functions, it is also obvious at the call site that the code is debug code, which is not the case for tee.

…CS chains, and that the default behaviour is to call func on popFront
@MetaLang
Copy link
Member Author

I have modified the documentation to make it clear that tee is useful for debugging.

…vailable statically, make it available statically in the result as well. Change pipeOnFront back to pipeOnPop because _input may have back and popBack.
@monarchdodra
Copy link
Collaborator

The proposed documentation does not state that this is intended for debug code.

Implying tee can't be used for things "other" than debug? If this is really the case, then tee should be a no-op in release, which id not the case at all.

assert(equal(newValues2, values));

int count = 0;
auto newValues3 = filter!(a => a < 10)(values)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should UFCS from the start:

    auto newValues3 = values
                      .filter!(a => a < 10)()
                      .tee!(a => count++)()
                      .map!(a => a + 1)()
                      .filter!(a => a < 10)();

@monarchdodra
Copy link
Collaborator

Perhaps it should also propagate bidirectionality and random access (and slicing?)

Propagating random access with pipeOnPop could be problematic, since you could iterate the entire range without ever popping it. You could break the interface. You can always forward length though: That's always useful, even without RA, for things like a call to reserve in array or Appender.

I you can propagate RA safely iff pipOnPop == false.

@monarchdodra
Copy link
Collaborator

One of the issues I am seeing is this pattern of

int sentinel;
auto range = tee!(++sentinel)( ... );
assert(range.equal([...]);
assert(sentinel = ...);

I see this as a problem, because sentinel will only be modified if the range is actually "walked". Yet, all I see is a assertive equality test. This is problematic, since code inside an assert should have no side effect. I know it's a unittest, but it is also a documented unittest, so we don't want to promote wrong code.

I'd recommend you use a piece of code that will un-conditionally walk you range, and then test the sentinel. I kind of wish we had a walkRange primitive that simply does that, but in the mean time, a call to array should work just as well (and you can then test the array result for equality). reduce!"1" could work as an alternative to walkRange.

That, or just:

bool b = r.equal([...]); //Side effect
assert (b); //Test result

@JakobOvrum
Copy link
Member

Implying tee can't be used for things "other" than debug? If this is really the case, then tee should be a no-op in release, which id not the case at all.

I already stated that I don't think it's appropriate for other purposes (or for debug purposes for that matter).

@monarchdodra
Copy link
Collaborator

I already stated that I don't think it's appropriate for other purposes (or for debug purposes for that matter).

Ah... I failed to correctly read the totality of that post. Apologies. I think your "In particular, I think how tee is intended for extra side-effects, masked as functional code, hurts readability." is relevant to the comment I made regarding "I see this as a problem, because sentinel will only be modified if the range is actually "walked"."

@JakobOvrum
Copy link
Member

Propagating random access with pipeOnPop could be problematic, since you could iterate the entire range without ever popping it. You could break the interface.

I don't see it as an issue because you can also call front (and back) freely without ever invoking the call, and I see that as equivalent of indexing. Perhaps the flag could be called pipeOnMutate to emphasise the behaviour. Regardless though, at least the behaviour of bidirectionality should be completely intuitive, so I don't see any reason not to propagate it.

@monarchdodra
Copy link
Collaborator

I don't see it as an issue because you can also call front (and back) freely without ever invoking the call, and I see that as equivalent of indexing.

Right, but you can't iterate with front/back alone. You have to iterate on it using popFront()/popBack(). With indexing, you can iterate the entire range, but without ever popping anything.

Perhaps the flag could be called pipeOnMutate to emphasise the behaviour.

I'd confused with that specific name, since I'd understand it as mutating the elements themselves.

Regardless though, at least the behaviour of bidirectionality should be completely intuitive, so I don't see any reason not to propagate it.

Agreed. That's already done though.

@Poita
Copy link
Contributor

Poita commented Feb 25, 2014

Agreed with @JakobOvrum. I'm worried about the side-effect nature of the way your are supposed to use this, and I'm incredibly worried about what that means when save is being forwarded. With save there, you have no idea how many times popFront or popBack will be called, so using it for things like counting is just asking for bugs.

I do think the tee problem needs to be solved, but I don't think this is the way to solve it (with or without save).

@MetaLang
Copy link
Member Author

I'd like it if someone could comment on my use of DDOC. I'm not familiar with it at all, and I don't know if I'm using it correctly. Also above Andrei suggest the following:

A thought on the issues with tee: it should maintain a flag thisItemHasBeenTeed which is initialized to false in the constructor. Then, all of empty, front, and popFront called will examine the flag and tee it if false, then set it to true.

My response was:

I've already changed tee to do this when pipeOnPop is false, so fun will only be called on front once per item. Is it really necessary to duplicate this check in popFront and empty? At the very least, I don't think it's a good idea to make empty have side-effects. It doesn't make much sense for popFront, either, as the value of front is discarded with each call to popFront anyway.

Can somebody comment on this?

@mihails-strasuns
Copy link

Can somebody comment on this?

I actually think what Andrei proposes is fundamentally wrong approach. We do have a problem of unspecified standard range consumption and this proposal is an attempt to hide it via hack.

There was recently a similar discussion in one of other pull requests I can't spot right now :( I don't know how to better proceed with it - there seems to be no agreement between @WalterBright and @andralex on this topic and keeping it as permissive as it is creates collateral damage in Phobos.

@MetaLang
Copy link
Member Author

Perhaps we should leave it for now, then, and I can make further pull requests in the future once everyone can agree on an answer.

@quickfur
Copy link
Member

I'm not sure I understand the context of @andralex 's proposal. What advantage does checking the flag 3 times in empty, front, and popFront, as opposed to only checking it in front? I don't see how it adds anything of value, only needless overhead from what I can tell.

@MetaLang
Copy link
Member Author

I think his desire was to have the input "tee'd" whether the user accesses front first, empty, or popFront. I did not really understand at the time either, but it seems this is in the context of trying to hash out what the conventions for range semantics should be (there was a thread on the newsgroup about it some time ago).

@mihails-strasuns
Copy link

@quickfur problem is that currently one does not know what range methods will be called each iteration cycle and how many times each. It is unspecified which forces all kinds of weird hacks when you try to implement something like "do this exactly once for each range element".

@MetaLang meh I am afraid if we won't do nothing it will just get lost again. I don't know what to do :(

@quickfur
Copy link
Member

Ah, I see. So basically if you chain tee to another range adaptor that skips over N elements then starts processing the rest, the sink won't receive the first N elements.
So looks like the issue here is whether the sink is supposed to receive every element in the input, no matter what, or only those that are subsequently processed downstream, etc., and whether it should be called exactly once per element, or once per downstream access, etc., and what order it will be (which is not obvious if you're dealing with a bidirectional or random access range, for example).
Based on the name, I'm thinking the idea of tee is to pipe every element of the input into the sink exactly once, in that order, since every other semantics just seems so counterintuitive it would make the resulting code extremely difficult to reason about. That being the case, I'd argue for exposing only an input range interface in the output -- so that all downstream accesses are confined to linear access.
Otherwise, I don't see how you can define sane semantics for it. For example, a forward range can be saved arbitrarily often downstream, and re-iterated over an unknown number of times. A bidirectional range can get elements on either end popped off in arbitrary order. A random access range can randomly skip around and fail to access all elements. All of these run counter to the idea of tee.
If we want to support anything more than an input range in the result, I'd argue we have a different range, maybe call it trace, that doesn't care how many times any given element is accessed, it just sends it to the sink every time front is accessed -- the idea being, of course, that you're using it to trace how downstream filters are accessing the original range, which is useful for debugging, etc.. Perhaps the source of our woes come from conflating tee with trace, when they are logically two different, though similar, functions.

@quickfur
Copy link
Member

Bah, looks like I'm talking nonsense again: tee already always returns an input range. The issue is whether the sink should receive the elements when empty is called, or front, or popFront. I'm inclined to say it should only do this in the ctor (if the input is non-empty) and in popFront. Then the sink is guaranteed to receive each element exactly once. There's no need to wrap around empty or front, and no need for boolean flags.

@MetaLang
Copy link
Member Author

It will do so in either front or popFront, configurable by the user with pipeOnPop.

@quickfur
Copy link
Member

Ah, i see. That makes sense. I'm inclined to say this is good to merge. I don't agree with Andrei that we should check the flag in all 3 places, especially in empty - what if the downstream range stops processing after checking empty? There is no guarantee it's actually going to continue just because empty returns false -- it may have other conditions to check. Then that element shouldn't be sent to the sink.

@MetaLang
Copy link
Member Author

I'm inclined to say this is good to merge.

Then what are we waiting for? @Dicebot @monarchdodra

@mihails-strasuns
Copy link

Ok, lets do it and see how badly it breaks :)

@mihails-strasuns
Copy link

Auto-merge toggled on

@JakobOvrum
Copy link
Member

Blergh. I bet later we'll consider tee a mistake.

@quickfur
Copy link
Member

I don't think the breakage will be very bad, if any at all, since it returns a non-forward input range, and there are only a limited number of things you can do with an input range.

mihails-strasuns pushed a commit that referenced this pull request Jul 16, 2014
Revive pull request #1348: "Issue 9882 - Implement a "tee" style InputRange...
@mihails-strasuns mihails-strasuns merged commit 6d5ab30 into dlang:master Jul 16, 2014
@mihails-strasuns
Copy link

Blergh. I bet later we'll consider tee a mistake.

I agree it is a terrible name but I want this functionality in Phobos and tired of endless bikeshedding.

@MetaLang
Copy link
Member Author

I didn't get a chance to squash my commits before the merge. Does a squash after merge still work?

@quickfur
Copy link
Member

It's too late now. While in theory you can modify history in master after the fact, doing so will break pretty much everyone's repo the next time they pull from master, which is a very bad idea and will make everyone hate you.

But in the end unsquashed commits don't matter, since git tracks the branching structure of the commits, so git log --graph for example will show a nice structure of exactly which commits came from which pulls. And there are other ways of filtering history so that irrelevant stuff is filtered out. No big deal.

@MetaLang
Copy link
Member Author

Righto. Thanks.

@MetaLang MetaLang deleted the std-range-tee-fixup branch July 16, 2014 18:56
@JakobOvrum
Copy link
Member

I agree it is a terrible name but I want this functionality in Phobos and tired of endless bikeshedding.

The functionality is the problem.

.tee!(a => writefln("pre-map: %d", a))
.map!(a => a + 1)
.tee!(a => writefln("post-map: %d", a))
.filter!(a => a < 10);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This unittest is producing stdout output. This is especially bad since the unittest is un-documented, so completely useless.

@MetaLang : Can it be re-written in a way that does not write to stdout, or does it conditionally? Could you write to an output range instead, maybe? By printing to an output range, you can also assert the correct output of your range.

In any case, the printing is disruptive to unit-testing:
https://auto-tester.puremagic.com/pull.ghtml?projectid=1&runid=1127154

@MetaLang
Copy link
Member Author

@monarchdodra Will do. I'll make another pull request after work.

@MetaLang
Copy link
Member Author

MetaLang commented Sep 2, 2014

Pull request: #2480

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet