Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

introduce "fold" to std.algorithm #2033

Closed
wants to merge 1 commit into from
Closed

introduce "fold" to std.algorithm #2033

wants to merge 1 commit into from

Conversation

monarchdodra
Copy link
Collaborator

Try 2.

This time, I think the pull is complete, so fire away and destroy anything you like. Should I apply for "formal review"? I'll post about this on the boards after the initial salvo here ;)


fold is basically still the same thing as reduce, but "fixes" some interface issues that were not fixable without breakage. Namelly, palcing the range before the seed, for UFCS friendliness. Not a huge problem, but a common enough occurrence to cause frustration, and justify fixing.

Furthermore, it also improves usability by making the seeds passed by parameter pack, instead of forcing the use of a tuple.

Finally, it allows using only 1 seed, in which case, the same seed is replicated and is used for all the functions.

Oh yeah, also, I made it so that when no seed is given, it is an Error to use an empty range. This is the only case of deviation, but I think having nothrow justifies it.

"iterables" are not supported anymore.

These changes should make for an overall nicer D experience, and justify the update via a new name. Old code should not be impacted.


Implementation wise, I think fold is very neat, and contains lots of improvements over reduce: Shorter code, better initialization scheme, a better inference mechanism for guessing the seed type...

But it doesn't actually contain anything I couldn't port back to reduce afterwards. Most of it anyways I think.

else static if (Args.length == 1)
{
//Only 1 seed provided. Repeat it as necessary.
RepeatType!(Unqual!(Args[0]), funs.length) result = seeds[0];
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it make sense to check here, whether the single seed works with all functions?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It always makes sense to add sanity checks. As a matter of fact, you could do it lower in the code, which would end up checking that each seed is valid for their corresponding function.

@bearophile
Copy link

Some critique/comments in Bugzilla.

@andralex
Copy link
Member

Interest in backporting all of this good work to reduce without adding a new name is 100x the interest in adding fold.

@bearophile
Copy link

@andralex: the whole point of this issue is to have the iterable as first argument and seed as second, for usage in UFCS chains. How do you plan to do that without changing the name of the function Andrei?

@JakobOvrum
Copy link
Member

@andralex: the whole point of this issue is to have the iterable as first argument and seed as second, for usage in UFCS chains. How do you plan to do that without changing the name of the function Andrei?

It is possible to support both argument orders in the same function for the vast majority of argument types, but not all (some would be ambiguous). However, it makes the code even more complex and the documentation even worse.

@andralex
Copy link
Member

As @JakobOvrum said there's been work on that.

@monarchdodra
Copy link
Collaborator Author

@bearophile had this to say about the design in the bug report. Here are his comments, and my replies:

https://d.puremagic.com/issues/show_bug.cgi?id=8755

(In reply to comment #14)

Furthermore, it also improves usability by making the seeds passed by parameter pack, instead of forcing the use of a tuple.<

OK. (Despite in a modern language tuples should be built-in, so using them
should be natural, common, and syntactically very cheap. In
Python/Haskell/Scala code you don't see functions that refrain from accepting a
tuple).

The current most spread phobos style is to accept arguments via vararg, and to
return tuples when you need to return several args, so I tried to stick to
that.

I'm not sure it is possible to accept either a Tuple (that "auto expands"
later), and varargs, without potentially creating some ambiguity.

Finally, it allows using only 1 seed, in which case, the same seed is replicated and is used for all the functions.

This is from the unittests:

// Compute sum and sum of squares in one pass.
// This can be used to compute get the average and standard deviation.
// A single seed (0.0) is passed, but it is optional
// if the range is not empty.
r = a.fold!("a + b", "a + b * b")(0.0);
assert(approxEqual(r[0], 35));  // sum
assert(approxEqual(r[1], 233)); // sum of squares

This is ambiguous, it seems that "a + b" has a seed while "a + b * b" doesn't
have a seed. So in my opinion if you give N function then you need to give 0
seeds, or one N-tuple, or N seeds. So I don't like this.

You think? It made sense to me. I'll have to ask for more input on this. That
said, if we turn it down, then it would be possible to make Tuple and
args... co-exist as input argument style. EG:

r = a.fold!("a + b", "a + b * b")(0.0, 0.0); //OK!
r = a.fold!("a + b", "a + b * b")(tuple(0.0, 0.0)); //OK! too!

The second one is more verbose, but I see its justified if your seed is already
in the form of a tuple.

So I think you sold me on it.

Oh yeah, also, I made it so that when no seed is given, it is an Error to use an empty range. This is the only case of deviation, but I think having nothrow justifies it.

I am not sure this is a good idea. Throwing when you give no seed is probably
acceptable. But I am not sure.

It's a deviation, but I think it's justified. It makes the code nothrow, and
quite frankly, accessign an empty range is an Error, so end of story.

The only argument I'd accept in its favor is stability with reduce, but if we
could redesign, I'd never accept throwing an exception in such a case.

"iterables" are not supported anymore.

If by "iterables" you mean that fold doesn't accept opApply-based iterables
then I am against this change, I have plenty of code that opApply-based and I
sometimes use reduce on them.

OK. I can work them back in.

@monarchdodra
Copy link
Collaborator Author

Interest in backporting all of this good work to reduce without adding a new name is 100x the interest in adding fold .

I agree that porting things back into reduce is a good idea, and it is planned too. But that wasn't the point of the pull:

As @JakobOvrum said there's been work on that.

Yes, it was I. #861

Long story short, it creates a silent compile-time ambiguity that can only be observable at runtime. And that is absolutely in-acceptable.

[4, 5, 6].reduce!"a ~ b"([1, 2, 3]);

What would this do? Would it reduce the range [4, 5, 6] into the seed [1, 2, 3], or would it reduce the range [1, 2, 3] into the seed [4, 5, 6]?

The problem is that the ambiguity is un-resolveable, and there is no-way to be explicit about which you want: You just have to trust reduce will do the right thing, and will never change its default behavior.

For what it's worth, the above code exists in phobos in the form of strings, so, it's not theoretical.


The cleanest, simplest, most non-ambiguous and most non-breaking way to fix the "reduce argument order" issue, is to re-introduce it with a different name. Everything else is "sugar"/"upkeep". Most of it can and will be backported to reduce anyways.

So, @andralex : It boils down to only 1 question:
"Do we want to re-introduce reduce as fold, and fix the issue, or do we close the issue forever as won't fix".

There's no moving forward if we don't statute on this point.

@JakobOvrum
Copy link
Member

What would this do? Would it reduce the range [4, 5, 6] into the seed [1, 2, 3], or would it reduce the range [1, 2, 3] into the seed [4, 5, 6]?

Surely [4, 5, 6] would be the seed, and the result is [4, 5, 6, 1, 2, 3] like it is currently. Any other result would not be backwards-compatible, thus kinda defeating the point of reusing reduce.

The cleanest, simplest, most non-ambiguous and most non-breaking way to fix the "reduce argument order" issue, is to re-introduce it with a different name.

But I do agree with this.

Regardless of whether reduce is reused or not, I also want to schedule the seed-first reduce for deprecation so we don't have duplicated functionality...

@monarchdodra
Copy link
Collaborator Author

Any other result would not be backwards-compatible, thus kinda defeating the point of reusing reduce .

Exactly my point, but...

I also want to schedule the seed-first reduce for deprecation so we don't have duplicated functionality...

If we want to deprecate it, there has to be something to migrate to. Which is fold.

That said, I think I could make it work so reduce just becomes roughly alias reduce = nReverseArgs!fold (or vice versa). So it's not so much "duplicate functionality" as it is "name duality".

@JakobOvrum
Copy link
Member

If we want to deprecate it, there has to be something to migrate to. Which is fold.

Not if reduce is amended to accept either order. We can deprecate just one path.

@monarchdodra
Copy link
Collaborator Author

Not if reduce is amended to accept either order. We can deprecate just one path.

I get the idea, but what if I have:

[4, 5, 6].reduce!"a ~ b"([1, 2, 3]);

Which one is my seed? What is the deprecation message that will appear, and how will I fix it? How can I know which version I'm calling, and how can I tell I want to call the other one?
"[4, 5, 6].reduce!"a ~ b"([1, 2, 3]);" is wrong, please use "[1, 2, 3].reduce!"a ~ b"([4, 5, 6]);" instead

Won't work...

@JakobOvrum
Copy link
Member

Which one is my seed?

[4, 5, 6], as I said.

What is the deprecation message that will appear, and how will I fix it? How can I know which version I'm calling, and how can I tell I want to call the other one?

That's a good point, there would be no way to do this until the completion of the deprecation cycle (at which point the argument order can be switched to fix it). I guess my vote is with fold and entirely deprecated reduce, then.

@w0rp
Copy link
Contributor

w0rp commented Mar 23, 2014

I also vote for introducing fold and deprecating reduce. If you change reduce in just about any way, a lot of code will break. If you deprecate it slowly, it will be fine.

@JakobOvrum
Copy link
Member

If you change reduce in just about any way, a lot of code will break.

No, it can be changed in non-breaking ways.

@monarchdodra
Copy link
Collaborator Author

Thank you for the feedback. There is 1 last thing I want to discuss: I want to make the "no-seed" version of fold assert if the range is empty. Currently, reduce on enforces this.

I think it is correct to assert, and make the code nothrow to boot, but it also mean a deviation from reduce. How do you guys feel about this change?

Once this is answered, I can go back to hacking at this.

@JakobOvrum
Copy link
Member

I think it is correct to assert, and make the code nothrow to boot, but it also mean a deviation from reduce. How do you guys feel about this change?

What would the function return in release mode?

@monarchdodra
Copy link
Collaborator Author

What would the function return in release mode?

I'm unsure. Whatever value is produced from calling the front of the empty range I guess. The call (I think) is already in Error state, which should not have happened to begin with. It is well within the programmer's reach to not have let that happen to begin with.

@monarchdodra
Copy link
Collaborator Author

BRB!

@ntrel
Copy link
Contributor

ntrel commented Jan 30, 2016

@monarchdodra Discussion thread on the NG seems in favour of this now (this PR linked in discussion):
http://forum.dlang.org/thread/n8fkn1$2u7a$1@digitalmars.com?page=1

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants