min/max reductions are confusing #4235

JeffBezanson · 2013-09-09T19:00:01Z

The generic max(itr) calls max on pairs of values, while max(Array) calls > on pairs of values. This is a problem because max(A::Array,B::Array) is elementwise, while all other definitions of max(A,B) return either A or B depending which is "larger".

At the very least, all 1-argument max definitions should be consistent on whether they do reduce(max, ...) or call >.

Then we are just left with the elementwise behavior, which may be a fight for another day.

The text was updated successfully, but these errors were encountered:

JeffBezanson · 2013-09-10T07:57:40Z

Worth noting that we solve this problem in other cases by giving the reducing function a different name: + => sum, * => prod. It would be nice to do this for min and max too, but it's hard to think of names, other than maybe max => maximum, or maybe maximal.

mschauer · 2013-09-10T11:26:44Z

An alternative view would be that rather max and (a > b ? a : b) should agree and the piecewise maximum could receive a new name, e.g. pmax as in R (because .max would be very exotic.)

JeffBezanson · 2013-09-10T20:43:43Z

I would be ok with a solution like that too.

…(max, A) for #4235

JeffBezanson · 2013-10-04T17:45:49Z

I think we should get this into 0.2 since it's a potentially breaking API change. In particular we must get rid of the awful max(A, (), dim) methods.
cc @ViralBShah @StefanKarpinski

timholy · 2013-10-04T20:02:04Z

Agreed that this is a noteworthy wart.

lindahua · 2013-10-07T12:35:50Z

I agree that the choice of using max for both the element-wise operation that takes the larger values and the reduction to get the maximum was unfortunate. It causes all this mess and also forces us to use the ugly syntax like max(A, (), dim).

I am in favor of using maximum and minimum for the reduction purpose, and am more than happy to fix any issues caused by this breakage in the packages maintained by me. That being said, we should be cautious though, as this would be a major API change, given how widely these functions may have been used.

blakejohnson · 2013-10-07T13:50:46Z

We tried to fix this before by renaming the multi-arg versions of max and min to maxof and minof in #2419. I like the name, but as you can see it required changing many uses of max and min.

lindahua · 2013-10-09T12:32:25Z

I suspect that renaming the reduction versions from max and min to maximum and minimum might be relatively easier than renaming max(a, b) to maxof(a, b), as the latter seems to be used much more frequently.

blakejohnson · 2013-10-09T13:03:46Z

@lindahua I think you are right.

johnmyleswhite · 2013-10-09T13:35:33Z

+1 for Dahua's names.

JeffBezanson · 2013-10-10T05:07:12Z

I'm fine with that renaming. It solves the () problem, but then there is the elementwise behavior. Would maximum(A) do reduce(max, A), or pick the largest element according to >? If max(a,b) were not elementwise, those two would be the same.

StefanKarpinski · 2013-10-10T21:33:01Z

Here's an idea. The min and max functions without a dim keyword argument do reduction over each argument and then return the min/max across arguments – they're basically maximally greedy. If you give a dim keyword argument with a single input array, then reduce those dimensions of it. If you give multiple array arguments with a dim keyword, then each argument is dimension reduced and then parallel min/max is done across the reduced results. Examples:

min(1,2) => 1
min([1, 2, 3], 10, [0]) => 0
min([1, 2, 3], [4, 5, 6]) => 1
min([1 2 3; 3 2 1], dim=1) => [1 2 1]
min([1 2 3; 3 2 1], dim=2) => [1 1]'
min([1 2 3; 3 2 1], [2 0 2] dim=1) => [1 0 1]
min([1 2 3; 3 2 1], [3 2 1; 1 2 3] dim=()) => [1 2 1; 1 2 1]

I think this is equivalent to having dim default to 1:n where n is the maximum number of dimensions of its arguments, with the convention that reducing along dimensions that something doesn't actually have isn't an error, but rather just works. I think this smoothly dovetails the two meanings of min and max without introducing new function names and it generally Just Works™.

JeffBezanson · 2013-10-10T21:58:18Z

Not a bad idea, but I'm worried about all that reducing. Would max("ab","bc") still give "bc", or 'c'?

StefanKarpinski · 2013-10-11T04:44:50Z

Ah, the dual nature of strings as both scalers and collections rears its homely head. I would vote to stick to doing this stuff for abstract arrays and leaving it at that. How often do you really want to reduce over the characters in a string?

JeffBezanson · 2013-10-11T05:29:02Z

This seems to be what Max does in mathematica (flattens all of its arguments). It strikes me as a bit odd, particularly in the case of arrays of arrays. But I think I like it more than the elementwise behavior (which can be done with clamp).

Honestly, at the end of the day I think all of these behaviors are non-obvious, and the one true max function is max(a,b) = a > b ? a : b. I also accept maximum(a) = reduce(max,a) (which also allows for maximum(A,dims)). Anything else is just hard to reason about. This is a pretty painful change but I think it's the only really sane approach.

StefanKarpinski · 2013-10-11T05:38:11Z

I don't really see why the version I proposed is so hard to reason about but I can see the value in having different names. How do you do an element-wise max in your scheme? Is that max(A,B) where the arguments are matrices?

JeffBezanson · 2013-10-11T05:40:53Z

Absolutely not; as I said the one true max function is max(a,b) = a > b ? a : b. Elementwise max is done with clamp or map.

JeffBezanson · 2013-10-11T05:43:49Z

Your version is not so bad; it is basically the same as mine except that max also flattens out any AbstractArrays in its arguments.

StefanKarpinski · 2013-10-11T06:24:04Z

I really don't think that asking people to remember max, maximum, clamp and map is great. I don't think i can keep them straight right now. What does clamp do here? Certainly it isn't easier to use than my thing, which mostly just does what you want.

JeffBezanson · 2013-10-11T06:43:45Z

I don't consider it confusing to give different names to different operations. If max is simple and well understood, and map is too, then combining them is easy. And there are several different operations being discussed here; it's going to be harder to cram them into one function.

True that elementwise max/min is a very useful operation, but your version doesn't provide it either.
clamp(x, lo, hi) limits x to between lo and hi. A typical use is clamp(image, 0, 255).

JeffBezanson · 2013-10-11T07:00:59Z

Sorry, I forgot you put the elementwise behavior into the dim keyword definition.

JeffBezanson · 2013-10-11T07:36:50Z

One obstacle is that reducing multiple dimensions still gives an array:

julia> sum(rand(2,2), 1:2)
1x1 Array{Float64,2}:
 1.18867

StefanKarpinski · 2013-10-11T07:47:43Z

Right. That's definitely is a bit of an issue. We could drop reduced dimensions if some of the arguments don't have them.

johnmyleswhite · 2013-10-11T12:41:36Z

FWIW, R does element-wise operations using pmax:

> min(c(1, 2, 3), c(3, 2, 1))
[1] 1
> pmin(c(1, 2, 3), c(3, 2, 1))
[1] 1 2 1

lindahua · 2013-10-11T13:03:15Z

There do exist some ambiguities of the semantics when we talk about element-wise max.

What is the desired behavior of max(a, b) when a and b are array of arrays. Currently, map(max, a, b) and max(a, b) give different results

julia> a = {[1, 3]};
julia> b = {[4, 2]};

julia> max(a, b)
1-element Array{Any,1}:
 [4,2]

julia> map(max, a, b)
1-element Array{Any,1}:
 [4,3]

This kind of inconsistent behavior is partly ascribed to how we define max currently:

max(x, y) = (x > y ? x : y)   # -- (1)
max{T1<:Real,T2<:Real}(x::AbstractArray{T1}, y<:AbstractArray{T2})  # -- (2)

When we call max(a, b), it selects the version (1), and since Julia allows comparing arrays as a whole using lexicographical order, it considers b as the greater one, and returns it.

When we call map(max, a, b), it applies max to each pair of elements. Since max(a[1], b[1]) is [4, 3], it returns {[4, 3]}.

I think that's why Jeff proposes to just keep max as simple as a > b ? a : b and remove the element-wise mapping functionality.

However, I still think it is worth keeping element-wise max as it is, because of several reasons:

Element-wise max is very useful in practice, but map(max, a, b) is just simply too slow at current point. A very simple test on my machine gets the following, over two arrays of length 10^7, max(a, b) takes 0.085 second, while map(max, a, b) takes 1.21 second (warming was done before measuring). The latter is 14x slower than the former -- this is enough to justify a dedicated method with efficient implementation.
clamp is not a proper function to replace max, especially when both a and b are arrays. Also max doesn't do the extra work to compare with the lower bound even when b is a scalar.
If we just keep the max(a, b) = (a > b ? a: b) definition and remove the element-wise functionality. Then people will see max([1, 5], [3, 4]) as [3, 4] instead of what they expect, that is [3, 5]. This vastly contrasts with what all other technical computing platform would do, and it is a really big surprise to people.
The element-wise definition did not actually cause a lot issues in practice -- that's why we notice this just a month ago when Jeff thought about it. If we change this, it would be a major breakage.

I really think that we should keep the element-wise version. However, we have to clean up the interface a little bit. For example, we should remove the T<:Real restriction from the element-wise version, such that the element-wise behavior also applies to Array{Any}.

In the mean time, the reduction functionality should be separated, and we may use maximum and minimum for reduction. This will reduce ambiguity in some cases and we can allow maximum(a, dim).

lindahua · 2013-10-11T13:29:23Z

The more I think about this, the more I feel the same with Jeff. Element-wise max/min does cause a lot of troubles when the elements themselves are arrays.

There are possibly two ways out of this mess:

Restrict element-wise operations to arrays of real numbers, as

max(a, b) = (a > b ? a : b);
max(a::AbstractArray, b::AbstractArray) = error("  ... some useful error message ... ")
max{T1<:Real,T2<:Real}(a::AbstractArray{T1}, b::AbstractArray{T2}) = ... element wise max ...
max{T1<:Real,T2<:Real}(a::AbstractArray{T}, b::T2) = ...
max{T1<:Real,T2<:Real}(a::T1, b::AbstractArray{T2}) = ...

Introduce a different name for element-wise max (say pmax as mentioned by John, which is used by R). Since map(max, a, b) remains too slow, a dedicated element-wise max function is needed. This introduces an additional name, and probably major breakage & surprise, but it is a way that is 100% clean -- zero ambiguity.

lindahua · 2013-10-11T13:46:23Z

A third way to reduce the ambiguity is to stop allowing arrays to be comparable. I don't really see why we should make arrays (as a whole) to be comparable. I never see people compare arrays in this way as [1, 4] < [2, 3]. Lexicographical comparison only makes sense for strings (and possible tuples), which fortunately are not defined as arrays in Julia.

If we disable comparison between arrays, then there won't be ambiguities when we use a single function to express both meaning, since max(a, b) will always have one behavior that makes sense -- when either a or b is an array, it does element-wise operation, otherwise, it picks the larger one. And it is simply not allowed to work on array of arrays, as the elements in such cases are not comparable.

johnmyleswhite · 2013-10-11T13:48:37Z

"Lexicographical comparison only makes sense for strings"

This isn't quite true: there's a substantial literature on lexicographic preferences in psychology and economics.

lindahua · 2013-10-11T14:04:31Z

I should have clarified my point. Whereas lexicographical comparison between arrays do has its value in specific context, it is not a general operation as the comparison between real numbers. Therefore, it would be beneficial to encourage users to explicitly express their intent when lexicographical comparison is what they intend to do. This will not only make the code clearer and less ambiguous, but also avoid having other users caught in a big surprise when they see max([1, 4], [2, 3]) produces [2, 3] instead of [2, 4].

Using specific functions for lexicographical comparison in the place of < for arrays also addresses the issue being discussed, that is, the ambiguities between the two meanings of max. Since lexicographical comparison between arrays is only used with in certain context, this change will not be a huge breakage which would otherwise caused by changed semantics of max and min.

lindahua · 2013-10-11T15:32:11Z

Here is a detailed proposal that comes with code skeleton:

Semantics:

max(a, b) yields the greater one, when a and b are comparable (and neither of them is an array). When a and b are floating-point values, special cases (e.g. NaN) will be taken care of using specialized methods.
max(a, b) performs element-wise operation when either argument is an array. But elements can not be arrays in general.
maximum(a) and maximum(a, dims) perform reduction.
An additional order argument may be provided for the use case where one want to do element-wise max between collections of arrays (e.g. using lexicographical order).

With this semantics, major of codes that currently use max will still work normally expected (except that codes that do reduction should use maximum instead of max), while ambiguities that might arise when using arrays of arrays would be eliminated.

Here is the codes to achieve this goal:

# the emax function is an internal function that is supposed to be used
# within element-wise and reduction functions. It is not exposed.
emax{T}(a::T, b::T) = (a < b ? b : a)
emax{T<:Real}(a::T, b::T) = (a < b ? b : a)
emax(a::Float32, b::Float32) = ... # special implementation that takes care of NaN
emax(a::Float64, b::Float64) = ... # special implementation that takes care of NaN
emax{T1<:Real, T2<:Real}(a::T1, b::T2) = emax(promote(a, b)...)
emax(a::AbstractArray, b::AbstractArray) = error(" ... useful error message ... ")
emax(a::AbstractArray, b) = error(" ... ")
emax(a, b::AbstractArray) = error(" ... ")

# to support element-wise max when special ordering is supplied
emax(a, b, ord::Ordering) = lt(ord, a, b) ?  b : a

## scalar max
# general case
max(a, b) = emax(a, b)
max(a, b, ord::Ordering) = emax(a, b, ord)

# element-wise for arrays
max(a::AbstractArray, b::AbstractArray) = ... # map(emax, a, b), implemented in an efficient way
max(a::AbstractArray, b) = ... # map(x -> emax(x, b), a) implemented in an efficient way
max(a, b::AbstractArray) = ... # map(x -> emax(a, x), b) implemented in an efficient way

# the following maps (x, y) --> emax(x, y, ord) in an element-wise manner in an efficient way
max(a::AbstractArray, b::AbstractArray, ord::Ordering) = ...
max(a::AbstractArray, b, ord::Ordering) = ...
max(a, b::AbstractArray, ord::Ordering) = ...

# reduction
function maximum(itr)  # reduction also use emax internally
    s = start(itr)
    m = next(itr, s)
    while !done(itr, s)
        (v, s) = next(itr, s)
        m = emax(m, v)
    end
    return m
end

maximum(itr, ord::Ordering) = ... # use ord for reduction
maximum(a::AbstractArray, dims::Dims) = ... # reduction along dimensions
maximum(a::AbstractArray, dims::Dims, ord::Ordering) = ... # reduction along dimensions using ord

# deprecate max(a) for single argument case in favor of maximum(a)
# deprecate max(a, (), dims) in favor of maximum(a, dims)

I think this properly address all the issues. The introduction of emax enables the use of max for scalar max and element-wise max without causing ambiguities (while cases that require special care, e.g. floating point max, are also taken care of).

Also, this would not introduce major breakage. The use of max(a) and max(a, (), dims) is deprecated gracefully.

StefanKarpinski · 2013-10-11T15:52:10Z

I haven't caught up on this whole conversation, but I would note that while order=Lexicographic could be the underlying implementation, a better way to express this would be to write lt=lexless and then have a special case in the ord function that checks if lt === lexless and uses the Lexicographic ordering type. This is how it works currently for lt === isless and by === identity:

function ord(lt::Function, by::Function, rev::Bool, order::Ordering=Forward)
    o = (lt===isless) & (by===identity) ? order  :
        (lt===isless) & (by!==identity) ? By(by) :
        (lt!==isless) & (by===identity) ? Lt(lt) :
                                          Lt((x,y)->lt(by(x),by(y)))
    rev ? ReverseOrdering(o) : o
end

This isn't the most elegant thing in the world, but it's really just a stop-gap until comparison functions can be specialized on and inlined. It keeps the same interface that you'd want if that were the case while allowing specialization. I was actually considering removing the order keyword altogether – it exposes the user to far too much internal machinery, but decided to keep it because otherwise deprecation is a total nightmare.

JeffBezanson · 2013-10-11T17:41:06Z

@lindahua 's latest proposal looks pretty good. After some thought, I agree that making a < b lexicographic for arrays by default is more trouble than it's worth. It's also very different from what most people are used to.

It's a good idea to make maximum work like the sorting functions (to the extent that they are both about order).

One small change I'd propose is to implement emax as

emax(a,b) = max(a,b)
emax(a::AbstractArray, b::AbstractArray) = error()
emax(a, b::AbstractArray) = error()
emax(a::AbstractArray, b) = error()

so that max can still be extended by defining methods for max.

lindahua · 2013-10-11T18:32:07Z

@JeffBezanson: Yes, your modification makes sense.

StefanKarpinski · 2013-10-11T19:17:30Z

This plan seems good to me although I'm a little concerned that the fancy NaN logic and other emax specializations don't compose with general orderings, but that's a bit of an implementation detail. The overall scheme seems sound.

Aside: I figured out a reasonable way to deprecate the order keyword argument: ad6903e.

this is step 1 for #4235

JeffBezanson · 2013-10-11T19:35:00Z

Shame; in this particular case order=Lexicographic is a nice interface.

StefanKarpinski · 2013-10-11T19:50:14Z

Well, I could always undeprecate it.

JeffBezanson · 2013-10-11T20:14:17Z

Next we have to decide how lexicographic ordering works. Maybe:

lexcmp(a, b) = cmp(a,b)
lexcmp(a::AbstractArray, b::AbstractArray) = # existing implementation of cmp for arrays

lexless(a,b) = lexcmp(a,b)<0

StefanKarpinski · 2013-10-11T20:32:17Z

This may be taking it too far, but lexicographic ordering should maybe be parameterized by another ordering that's used for comparing elements.

johnmyleswhite · 2013-10-11T21:57:05Z

Dahua's proposal with Jeff's modifications seems like a huge step forward.

I think Stefan's idea of having lexicographic ordering depend on another ordering seems sound to me.

We might also want to allow order = Generalized(A) for generalized inequalities on numeric arrays.

StefanKarpinski · 2013-10-11T23:12:37Z

I don't understand the generalized thing. Can you elaborate?

johnmyleswhite · 2013-10-11T23:16:08Z

Here's a brief introduction, where it explains how you can generalize component-wise inequality: http://www.nt.tuwien.ac.at/uploads/media/lecture7.pdf

Might be too obscure. Just one of the other useful orderings I know of for vectors after moving beyond lexicographic orderings.

lindahua · 2013-10-11T23:43:14Z

I think the easiest plan may be implement a subtype of Ordering called Lexicographic, and allow it to be supplied as an optional argument to the keyword param. We should proceed with the change of max and min functions as the plan, as this is blocking the 0.2 release.

The Generalized order mentioned by @johnmyleswhite is mainly used in convex optimization context and has to be defined with respect to a vector space. These things can be implemented in a package.

lindahua · 2013-10-11T23:44:13Z

I will make updates to NumericExtensions, Stats, Distributions, and other downstream packages as soon as these changes are merged.

make minimum and maximum use scalarmin and scalarmax, which are min and max but exclude arrays remove isless for AbstractArray add LexicographicOrdering and use it in sortrows and sortcols currently LexicographicOrdering is implemented with cmp(), which does the right thing for arrays, but we might not want to keep it that way

ViralBShah · 2013-10-13T13:38:09Z

Just caught up with this thread. It would be great to do a blog post summarizing the design when all the changes are in.

ihnorton · 2015-10-02T03:29:04Z

Hi @rosangelaoliveira4, welcome to Julia. Please refer to the manual, and ask questions on StackOverflow or the julia-users mailing list. Thanks.

jiahao · 2015-10-04T01:52:50Z

This is not the right place to ask basic usage questions.

JeffBezanson added a commit that referenced this issue Sep 10, 2013

make sure min(A) and max(A) reductions are all consistent with reduce…

496bb82

…(max, A) for #4235

JeffBezanson mentioned this issue Oct 4, 2013

Boolean region-reductions are broken #4435

Closed

JeffBezanson added a commit that referenced this issue Oct 11, 2013

rename methods of min and max that do reductions to minimum and maximum

5f9bbbe

this is step 1 for #4235

JeffBezanson closed this as completed in b33614e Oct 14, 2013

ivarne mentioned this issue Oct 21, 2013

Errors with readtable JuliaData/DataFrames.jl#382

Closed

StefanKarpinski mentioned this issue Oct 26, 2013

max() vs. maximum() #4652

Closed

lucasb-eyer mentioned this issue Jan 26, 2014

Documentation: "Vectorized Operators and Functions" #5547

Closed

ivarne mentioned this issue May 3, 2014

maximum gives surprising / wrong results #6724

Closed

kmsquire mentioned this issue Jun 22, 2014

RFC: Added find_extrema, ind_extrema #7327

Closed

nalimilan mentioned this issue Dec 23, 2014

problematic max and/or splatting #9439

Closed

dphildebrandt mentioned this issue Jun 26, 2015

Merging max and maximum #11872

Closed

rfourquet mentioned this issue Sep 24, 2020

[FR] max/maximum min/minimum #37733

Closed

min/max reductions are confusing #4235

min/max reductions are confusing #4235

Comments

JeffBezanson commented Sep 9, 2013

JeffBezanson commented Sep 10, 2013

mschauer commented Sep 10, 2013

JeffBezanson commented Sep 10, 2013

JeffBezanson commented Oct 4, 2013

timholy commented Oct 4, 2013

lindahua commented Oct 7, 2013

blakejohnson commented Oct 7, 2013

lindahua commented Oct 9, 2013

blakejohnson commented Oct 9, 2013

johnmyleswhite commented Oct 9, 2013

JeffBezanson commented Oct 10, 2013

StefanKarpinski commented Oct 10, 2013

JeffBezanson commented Oct 10, 2013

StefanKarpinski commented Oct 11, 2013

JeffBezanson commented Oct 11, 2013

StefanKarpinski commented Oct 11, 2013

JeffBezanson commented Oct 11, 2013

JeffBezanson commented Oct 11, 2013

StefanKarpinski commented Oct 11, 2013

JeffBezanson commented Oct 11, 2013

JeffBezanson commented Oct 11, 2013

JeffBezanson commented Oct 11, 2013

StefanKarpinski commented Oct 11, 2013

johnmyleswhite commented Oct 11, 2013

lindahua commented Oct 11, 2013

lindahua commented Oct 11, 2013

lindahua commented Oct 11, 2013

johnmyleswhite commented Oct 11, 2013

lindahua commented Oct 11, 2013

lindahua commented Oct 11, 2013

StefanKarpinski commented Oct 11, 2013

JeffBezanson commented Oct 11, 2013

lindahua commented Oct 11, 2013

StefanKarpinski commented Oct 11, 2013

JeffBezanson commented Oct 11, 2013

StefanKarpinski commented Oct 11, 2013

JeffBezanson commented Oct 11, 2013

StefanKarpinski commented Oct 11, 2013

johnmyleswhite commented Oct 11, 2013

StefanKarpinski commented Oct 11, 2013

johnmyleswhite commented Oct 11, 2013

lindahua commented Oct 11, 2013

lindahua commented Oct 11, 2013

ViralBShah commented Oct 13, 2013

ihnorton commented Oct 2, 2015

jiahao commented Oct 4, 2015