Allow to specify the result type of a sum #5311

lindahua · 2014-01-05T13:27:50Z

Currently, the sum function use the element type (except for the boolean case) of the input array as result type. This is sometimes not a desirable choice:

function mysum(R, x)
    s = zero(R)
    for v in x; s += v; end
    return s
end

julia> x = uint8(rand(1:10, 50));

julia> sum(x)
0x0000000000000122

# to prevent overflow
julia> mysum(Int, x)
290

julia> a = rand(Float32, 10^6);

julia> sum(float64(a)) - sum(a)
-0.018809685949236155

# to get higher accuracy
julia> sum(float64(a)) - mysum(0.0, a)
1.1641532182693481e-10

My proposal is to add a method like sum(R, x) where R is the user-specifed type. This won't break any existing codes, but provide the user additional choices.

The text was updated successfully, but these errors were encountered:

lindahua · 2014-01-05T13:29:26Z

I can make a PR to implement this if people agree that this is useful.

nalimilan · 2014-01-05T14:02:26Z

Interesting. And how about (in addition to adding your proposed function) defaulting to Int (resp. Float,Uint) for all shorter signed integer (resp. unsigned integer, float) types? My reasoning is that you may want to store a large vector of values in e.g. a Uint8 to save space, because you know the values are small, but typically if you sum them they won't hold in a Uint8. If you really want to save memory, use the detailed function. Just a thought.

lindahua · 2014-01-05T14:14:33Z

I agree with using a more sensible choice in determining the default type.

ivarne · 2014-01-05T14:38:23Z

This seems like a very natural extension of the current functionality. I have some problem to understand a few gotchas.

Currently int8(100) + int8(100) === int64(200) because (int8, int8) gets promoted to Int64/Int32 depending on the system. This feels wrong to me, but I know someone raised it as an issue before and it was not changed. Because of this, the stock sum function gives does not overflow in your first example and the only difference is that Unsigned Ints prints in hex. Your mysum is also sometimes wrong and type unstable because of promotion rules and I can not see how to fix that without adding typecasts after every computation.
This would make the current behavior for reduce with an initial vector v0 even more confusing, because that is only a alias for reduce(op, v0, itr) = op(v0, reduce(op, itr)), and the type of v0 does not affect the rounding and overflow in the computation. Maybe we should deprecate reduce with initial value?

lindahua · 2014-01-05T15:31:57Z

The correct implementation of mysum should be

function mysum(R, x)
    s = zero(R)
    for v in x
         s = convert(R, s + v)
    end
    return s
end

When s + v is of the type R, the convert operation is no-op, and will not cause performance problems. This, I believe, is the case where this function is intended to help.

lindahua · 2014-01-05T15:34:23Z

I agree that the current implementation of reduce is not perfect -- but that should be a different issue.

ivarne · 2014-01-05T15:46:13Z

Yes it is kind of a different issue, but sum is mentioned as a special case of reduce in the documentation, so they are definitely related. Your new mysum would could also easily be used for general reduction also if op was not hard-coded as +, and s was initialized to the first element in x.

lindahua · 2014-01-05T15:59:53Z

Currently, sum is implemented separately (like other reduction functions maximum, minimum, all, any). Their implementation is not based on reduce, which is simply too slow (without the compiler's support on inlining input functions).

Also, sum over floating point arrays are using pairwise summation, maximum & minimum using special mechanism to deal with NaN at the beginning, and all & any using shortcut. These special mechanisms can not be easily subsumed under a reduce function. From a practical standpoint, reduce should only be used when you want to apply a special operation that none of the builtin reduction functions support.

ivarne · 2014-01-05T16:24:22Z

My point was that these functions are closely related, and if one want to change the interface, all the interfaces should be considered together.

I have some understanding of the implementation details, and did NOT intend to suggest that the other reduction functions should be implemented using our reduce implementation. Standard library functions should use every trick possible to get the correct result in the most efficient way.

lindahua · 2014-01-05T17:34:27Z

I agree that the implementation of reduce(op, v0, itr) needs improvement. I can revise this reduce method to ensure its behavior is consistent with sum.

carlobaldassi · 2014-01-05T18:17:14Z

On the subject of making reduce and sum consistent, I'll take the chance to point out the following, which I recently found out: having a Vector{Vector{Float64}} with elements all of the same size, e.g. Ws=Vector{Float64}[[1.0,2.0,3.0], [3.0,2.0,0.5]], the following used to work in julia 0.2:

W = sum(Ws)

but it now fails because zero{Vector{Float64}} is not defined (and understandably so); instead, this works fine at the moment:

W = reduce(+, Ws)

I'm not sure if the former should actually work or not, but if the documentation says that the two should be equivalent, it should.

BTW the error for the sum(Ws) form is rather cryptic:

julia> sum(Ws)
ERROR: no method convert(Type{Array{Float64,1}}, Int64)
 in sum_seq at abstractarray.jl:1445
 in sum_pairwise at abstractarray.jl:1505
 in sum at abstractarray.jl:1512

carlobaldassi · 2014-01-07T10:22:51Z

Note: the issue in the previous comment about sum(x) != reduce(+,x) was fixed by @lindahua in #5318.

oscardssmith · 2016-10-30T06:59:27Z

Given the existence of #18423, should this be closed?

kmsquire · 2016-10-31T02:48:16Z

@oscardssmith typically issues are closed only after relevant pull requests are merged.

yuyichao · 2016-10-31T02:52:37Z

Or when the two issues are duplicate of each other (or when one covers the other one). These two issues seems to be closely related but are not quite the same. (A fix should probably take both into account though.)

martinholters · 2016-10-31T08:56:38Z

The original issue seems to be solved, sum(OutputType, x) does work:

julia> a = rand(Float32, 10^6);

julia> sum(Float64.(a)) - sum(a)
-0.020957350730895996

julia> sum(Float64.(a)) - sum(Float64,a)
0.0

Close or change title?

PS: I haven't verified there is a test for this; we might want to do so before closing.

KristofferC · 2020-06-23T09:58:37Z

Works now.

simonster mentioned this issue Sep 9, 2016

Provide method for sum that accepts a neutral element #18423

Closed

felixrehren mentioned this issue Feb 10, 2017

give sum(T::Type, xs) = zero(T) when xs is empy #20561

Closed

KristofferC closed this as completed Jun 23, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Allow to specify the result type of a sum #5311

Allow to specify the result type of a sum #5311

lindahua commented Jan 5, 2014

lindahua commented Jan 5, 2014

nalimilan commented Jan 5, 2014

lindahua commented Jan 5, 2014

ivarne commented Jan 5, 2014

lindahua commented Jan 5, 2014

lindahua commented Jan 5, 2014

ivarne commented Jan 5, 2014

lindahua commented Jan 5, 2014

ivarne commented Jan 5, 2014

lindahua commented Jan 5, 2014

carlobaldassi commented Jan 5, 2014

carlobaldassi commented Jan 7, 2014

oscardssmith commented Oct 30, 2016 •

edited by vchuravy

kmsquire commented Oct 31, 2016

yuyichao commented Oct 31, 2016

martinholters commented Oct 31, 2016

KristofferC commented Jun 23, 2020

Allow to specify the result type of a sum #5311

Allow to specify the result type of a sum #5311

Comments

lindahua commented Jan 5, 2014

lindahua commented Jan 5, 2014

nalimilan commented Jan 5, 2014

lindahua commented Jan 5, 2014

ivarne commented Jan 5, 2014

lindahua commented Jan 5, 2014

lindahua commented Jan 5, 2014

ivarne commented Jan 5, 2014

lindahua commented Jan 5, 2014

ivarne commented Jan 5, 2014

lindahua commented Jan 5, 2014

carlobaldassi commented Jan 5, 2014

carlobaldassi commented Jan 7, 2014

oscardssmith commented Oct 30, 2016 • edited by vchuravy

kmsquire commented Oct 31, 2016

yuyichao commented Oct 31, 2016

martinholters commented Oct 31, 2016

KristofferC commented Jun 23, 2020

oscardssmith commented Oct 30, 2016 •

edited by vchuravy