min/max with keyfunction + for a single range #4019

wilzbach · 2016-02-25T10:25:38Z

Hey all,

this PR allows the min/max function to be used with a single range and with a custom key getter.
This topic has been mentioned quite often (e.g. Issue 4705), however imho the solution is trivial.

Why is this cool? Because with this we can do a couple of useful queries - most notably argMin/argMax, e.g.

assert([5,2,4].enumerate.min!"a[1]".index == 1);

(see the tests for more examples)

A couple of comments

I am still new to D - so please let me know your comments about the code ;-)
I didn't know how I should change the function ddox header for min/max as

burner · 2016-02-25T13:52:59Z

std/algorithm/comparison.d

+    auto maxValue = keyGetterFun(r.front);
+    auto maxEl = r.front;
+    r.popFront();
+    foreach(el; r){


spaces between keywords (foreach,if,while) and the opening parenthesis

foreach (

JackStouffer · 2016-02-25T14:07:15Z

No need to specify a keyGetter when you can pass a less predicate like you can with max and min. This would allow things like this

rangeOfTuples.min!((a, b) => a[0] < b[1]);

which is currently not possible with your solution.

wilzbach · 2016-02-25T14:18:09Z

@burner: updated with changing regarding all your remarks! Thanks a lot!
(I am really sorry that I am making such newbie mistakes)

@JackStouffer isn't a keyGetter the usual usecase? What if the user uses then min!((a, b) => a[0] > b[0])

wilzbach · 2016-02-25T14:22:20Z

documentation is missing

Added, but doesn't this conflict with the existing documentation for min / max (see new generated doc)
As this is only adds functionality for ranges, maybe it makes sense to move it std.range.primitives to avoid this problem?

burner · 2016-02-25T14:35:37Z

have a look at the output of CyberShadows autotester it builds the docs

wilzbach · 2016-02-25T14:36:57Z

@burner I just did - it produced two times max and min http://dtest.thecybershadow.net/artifact/website-f1fa75b46edfc3ec5685bf0764e09831ddfd0846-e5c0e65f21faa1e14726befbcae0370b/web/library-prerelease/std/algorithm/comparison.html#max

JackStouffer · 2016-02-25T18:54:47Z

isn't a keyGetter the usual usecase?

95% of the time, yes it is. But there's no reason we can't accommodate the other 5%

What if the user uses then min!((a, b) => a[0] > b[0])

Then the user either made a mistake which will be caught shortly, or for some strange reason doesn't want to use max. Either way it's not that common

Another added benefit of going with a predicate is that you can make this DRY by having both functions use another private function as the implementation, as most of these two functions are the same code.

wilzbach · 2016-02-25T19:50:54Z

@JackStouffer I added selectOne which hopefully was what you were asking for ;-)

With that min/max are pretty easy to write (see below), however I would suggest to let them in there for convenience and better naming.

r.selectOne!("a < b"); //min
r.selectOne!("a > b"); //max

ntrel · 2016-02-29T13:56:48Z

std/algorithm/comparison.d

+    // keygetter
+    assert([1,5,2].enumerate.max!"a[1]".index == 1);
+    assert([1,5,2].enumerate.max!"a[1]".value == 5);
+    assert([1,5,2].enumerate.max!"a[1]".array == [1,5]);


IIUC, "a[1]" is the same as "a.value". Using the latter would be clearer and more consistent. I didn't understand this code until I looked up the docs for enumerate. Ditto below various times.

2nd, the .array line is a bit confusing. Clearer might be to say the result of max == tuple(1, 5).
Also, the source array starts with 1,5 - perhaps change the 1 to something unique, just to make the example clearer.

wilzbach · 2016-02-29T14:11:10Z

@ntrel addressed and updated. thanks :)

GassaFM · 2016-03-01T10:02:08Z

I'd rather see a new name for a function that finds the minimum / maximum in a range.
I've explained my reasoning in this post. The example from there:

import std.algorithm, std.range, std.stdio, std.typecons;
void main () {
    writeln (min (3, 1, 2));  // 1
    writeln (min ([3, 1, 2]));  // 1
    writeln (min (tuple ([5, 6], [1, 2], [3, 4]).expand));  // [1, 2]
    writeln (min (tuple ([3, 1, 2]).expand));  // 1, huh?
}

In the example above, it is clear what happens. But with a heavily templated code, I'd rather see a compiler error when it has to find minimum of one argument after tuple expansion than have it silently handled this way.

Other than that, I like your "less participation to useless debates + more real contribution = win" attitude :) .

wilzbach · 2016-03-01T12:36:37Z

@GassaFM you have a very fair point there, but I only partially agree.
On on side a) we have the violation of the principle of least astonishment for newcomers (min with one arguments as a list should work) and on the other side b) silent errors by tuple expansion with one-item as a list.
As you mentioned a lot of languages (Java, C#, Python, more at Rosetta) allow min to be used with one argument and as still being a newcomer to D I feel that many people like me will just expect [1,2,3].min to work.

My personal opinion is that the violation of the principle of least astonishment weighs stronger than this silent handling of tuple expansion, because the majority (newcomers) will have to google/search for it whereas this tuple example seems seldom and still understandable.

What is your final opinion about this?, e.g. @JackStouffer, @andralex and @default0

Naming

That being said I would propose to add selectOne as a general solution to this with possibly convenience aliases. How about I split this PR up in selectOne (to which I hope we can agree) and one with a possibly longer discussion for its convenience naming?

If renaming the max/min methods is the consensus, other ideas for names would be

Min/Max
MinOne, MaxOne
smin, smax (s for single)
rmin, rmax (r for range)
rangeMin, rangeMax
oneMin, oneMax
minimum, maximum (my favorite)
minElement,maxElement` (from @andralex)

What names do you like?

andralex · 2016-03-01T13:51:42Z

std/algorithm/comparison.d

+
+Returns: The lowest value according to $(D pred) of the passed-in values.
+*/
+auto selectOne(alias pred, alias keyGetter = (a) => a, Range)(Range r)


This name is not suggestive of "extremum".

We don't use the key getter style of programming most anywhere else in Phobos (though it does have appeal). For projections we use map. Here, custom less predicate would be used.

Changed to extremum and removed the key getter.

The idea of the keygetter or map is not only syntax sugar, the idea is also to save mapFun calls as if I write extremum!"a.score() > b.score()" score would be executed twice as often.

I do admit that map is usually an inexpensive function or just an access, but here we save in performance and have better usability.

input.extremum!((a, b) => computeDistanceTo(a) < computeDistanceTo(b))"
vs.
input.minElement!computeDistanceTo

andralex · 2016-03-01T13:55:01Z

Overall I think this is rather foreign compared to prevalent Phobos style. I think extremum,minElement, and maxElement with the appropriate predicates are the way to go.

wilzbach · 2016-03-01T16:29:47Z

@andralex I really like your choice extremum and I changed the PR regarding all your comments and to min/maxElement.

Moreover I also moved it from algorithm/comparison.d to algorithm/searching.d.

Btw and other idea to be friendlier to newbies - how about adding a default method for min and max that accepts a range as single argument and triggers a helpful compile-time error (e.g. "please use maxElement for ranges")?

wilzbach · 2016-03-03T18:22:37Z

I think extremeElement needs to be mutable even when Range is const(T)[]. Not sure what the typical solution is, maybe std.traits.Unqual.

Nice - done and added a couple of tests with immutable, const and string ;-)

ntrel · 2016-03-05T18:17:12Z

LGTM :-)

dnadlinger · 2016-03-05T21:06:26Z

The issue I see with extremum is that there are two extrema for each predicate, the minimum and the maximum.

wilzbach · 2016-03-05T21:08:11Z

The issue I see with extremum is that there are two extrema for each predicate, the minimum and the maximum.

@JackStouffer suggested it might be useful to some people. We can also make it private for the time being?

dnadlinger · 2016-03-05T21:15:14Z

I'm talking about the name. extremum is misleading/unclear, since you don't know whether it will return the minimum or the maximum under the given relation. Both are "extremum"s.

wilzbach · 2016-03-05T21:19:22Z

extremum is misleading/unclear, since you don't know whether it will return the minimum or the maximum under the given relation. Both are "extremum"s.

Yes that is precisely the definition of extremum, so I do see your point.
There is still selectOne, but big boss didn't seem to like it. Do you have a better idea?

wilzbach · 2016-03-05T21:31:05Z

Do you have a better idea?

In SQL one calls it selectTop, but how about best, bestOne or selectBest?

ntrel · 2016-03-06T11:59:57Z

std/algorithm/searching.d

+
+Returns: The extreme value according to $(D less) of the passed-in values.
+*/
+auto extremum(alias less, Range)(Range r)


The predicate probably should be named more generically, e.g. selector.

good idea - done :)

ntrel · 2016-03-06T12:01:57Z

Looks like we can close this issue when this is merged:
https://issues.dlang.org/show_bug.cgi?id=13940

JakobOvrum · 2016-03-06T14:29:29Z

std/algorithm/comparison.d

@@ -1382,6 +1385,8 @@ Iterates the passed arguments and returns the minimum value.
 Params: args = The values to select the minimum from. At least two arguments
    must be passed, and they must be comparable with `<`.
 Returns: The minimum of the passed-in values.
+See_Also:
+    $(XREF algorithm,searching,minElement)


s/XREF/XREF_PACK/, or use the variadic REF: $(REF minElement, algorithm, searching)

Ah I already did this for maxElement, but just forgot it. Thanks & updated!
Btw is there any cheatsheet or nice overview of all these ddox magics`?

I've tried to migrate all our cross-referencing over to the new variadic MREF and REF macros, but the PRs are stagnant. Without these new macros our situation is a mess. I'd be up for documenting our Phobos macros once we pull through with that.

JakobOvrum · 2016-03-06T14:34:53Z

std/algorithm/searching.d

@@ -33,6 +33,9 @@ $(T2 commonPrefix,
        $(D commonPrefix("parakeet", "parachute")) returns $(D "para").)
 $(T2 endsWith,
        $(D endsWith("rocks", "ks")) returns $(D true).)
+$(T2 extremum,
+        Selects the extreme element of a range.
+        $(D [[0, 4], [1, 2]].extremum!"a[1] > b[1]") returns $(D [0, 4]).)


Is this different from [[0, 4], [1, 2]].maxPos!((a, b) => a[1] < b[1]).front?

I guess this was already pointed out.

edit:

Or maybe not. I don't see any mention of our existing minPos and recently also maxPos.

See: https://issues.dlang.org/show_bug.cgi?id=13940

wilzbach · 2016-03-10T16:48:15Z

ping @andralex - do you still prefer extremum or one of the other variants suggested above (e.g. selectBest).
What else is blocking this PR? :)

wilzbach · 2016-03-16T10:36:45Z

Ping pong @andralex - or maybe other reviewers have an opinion on this too @JakobOvrum @klickverbot @jmdavis @quickfur ?

ntrel · 2016-03-16T12:21:16Z

std/algorithm/searching.d

+        $(D minElement([3, 4, 1, 2])) returns $(D 1).)
+$(T2 maxElement,
+        Selects the maximal element of a range.
+        $(D minElement([3, 4, 1, 2])) returns $(D 4).)


typo: should be maxElement.

good spot - thanks :)

ntrel · 2016-03-16T12:25:03Z

I think @JakobOvrum is right, extremum!selector is the same as minPos!selector.front. Does that make extremum redundant? It seems a bit niche as it is, given min/maxElement. If we do add it, it would be nice to have a compelling example.

Also, wouldn't min/maxElement be more efficient if it only made one call to the map function per element? Or can it be optimised away?

wilzbach · 2016-03-23T23:22:14Z

extremum!selector is the same as minPos!selector.front. Does that make extremum redundant?

I am not a friend of extremum either - the only argument would be the performance point as it can be written to evaluate the mapping per element only once and the slicing (or saving of the forward reference) in minPos might create a slight overhead.

Also, wouldn't min/maxElement be more efficient if it only made one call to the map function per element? Or can it be optimised away?

AFAIK it can't be optimized away - my initial implementation was an extremum function that did this caching of the mapping function, which was criticized due to its inflexibility which would fall away if we agree that we only provide {min,max}Element because of a) performance or b) convenience. Would be happy to change it back ;-)

wilzbach · 2016-03-25T08:59:24Z

Could someone add the @andralex label? Thanks ;-)

wilzbach · 2016-03-29T18:37:24Z

I forgot to add {min,max}Element to the algorithm's package booktable - ping pong ping @andralex whether extremum should be exposed to the user.

wilzbach · 2016-04-08T08:36:53Z

Ping pong ping pong

ntrel · 2016-04-08T10:04:03Z

It seems Andrei has already approved the names: #4019 (comment)

So AIUI, one of the other reviewers with merge rights could merge this. Personally I would make extremum private though.

wilzbach · 2016-04-20T00:09:56Z

Personally I would make extremum private though.

Also, wouldn't min/maxElement be more efficient if it only made one call to the map function per element? Or can it be optimised away?

Resubmitted a variant #4221 that only calls the mapping function once as we now can modify the internal extremum :)

wilzbach force-pushed the min_max branch from 394cc06 to 8f12289 Compare February 25, 2016 10:26

burner reviewed Feb 25, 2016
View reviewed changes

wilzbach force-pushed the min_max branch 2 times, most recently from f9537bf to 941aede Compare February 25, 2016 14:14

wilzbach force-pushed the min_max branch from 941aede to b0d9c92 Compare February 25, 2016 14:27

wilzbach force-pushed the min_max branch from b0d9c92 to e200b37 Compare February 25, 2016 19:42

ntrel reviewed Feb 29, 2016
View reviewed changes

wilzbach force-pushed the min_max branch from e200b37 to 4e98ee9 Compare February 29, 2016 14:10

wilzbach force-pushed the min_max branch from 4e98ee9 to ae7ca80 Compare February 29, 2016 14:12

andralex reviewed Mar 1, 2016
View reviewed changes

wilzbach force-pushed the min_max branch 2 times, most recently from 6ebe16c to 972e2cc Compare March 1, 2016 16:14

wilzbach force-pushed the min_max branch from 972e2cc to 350f2b6 Compare March 1, 2016 16:32

ntrel reviewed Mar 6, 2016
View reviewed changes

wilzbach force-pushed the min_max branch from 7687bf0 to 02dbf88 Compare March 6, 2016 13:28

JakobOvrum reviewed Mar 6, 2016
View reviewed changes

wilzbach force-pushed the min_max branch from 02dbf88 to 70ef032 Compare March 6, 2016 14:33

JakobOvrum reviewed Mar 6, 2016
View reviewed changes

ntrel reviewed Mar 16, 2016
View reviewed changes

wilzbach force-pushed the min_max branch from 70ef032 to c59e085 Compare March 16, 2016 19:02

extremum and {min,max}Element for a single range

dee39ba

wilzbach force-pushed the min_max branch from c59e085 to dee39ba Compare March 29, 2016 18:35

DmitryOlshansky added the @andralex label Apr 5, 2016

wilzbach mentioned this pull request Apr 20, 2016

std.algorithm: {min,max}Element for a single range #4221

Merged

wilzbach closed this Apr 20, 2016

wilzbach deleted the min_max branch June 4, 2016 16:11

Uh oh!

min/max with keyfunction + for a single range #4019

min/max with keyfunction + for a single range #4019

Uh oh!

Conversation

wilzbach commented Feb 25, 2016

Uh oh!

Choose a reason for hiding this comment

Uh oh!

JackStouffer commented Feb 25, 2016

Uh oh!

wilzbach commented Feb 25, 2016

Uh oh!

wilzbach commented Feb 25, 2016

Uh oh!

burner commented Feb 25, 2016

Uh oh!

wilzbach commented Feb 25, 2016

Uh oh!

JackStouffer commented Feb 25, 2016

Uh oh!

wilzbach commented Feb 25, 2016

Uh oh!

Choose a reason for hiding this comment

Uh oh!

wilzbach commented Feb 29, 2016

Uh oh!

GassaFM commented Mar 1, 2016

Uh oh!

wilzbach commented Mar 1, 2016

Naming

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

andralex commented Mar 1, 2016

Uh oh!

wilzbach commented Mar 1, 2016

Uh oh!

wilzbach commented Mar 3, 2016

Uh oh!

ntrel commented Mar 5, 2016

Uh oh!

dnadlinger commented Mar 5, 2016

Uh oh!

wilzbach commented Mar 5, 2016

Uh oh!

dnadlinger commented Mar 5, 2016

Uh oh!

wilzbach commented Mar 5, 2016

Uh oh!

wilzbach commented Mar 5, 2016

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

ntrel commented Mar 6, 2016

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

wilzbach commented Mar 10, 2016

Uh oh!

wilzbach commented Mar 16, 2016

Uh oh!

Choose a reason for hiding this comment

Uh oh!