sum(f, itr) when itr is an empty collection should return zero #36262

CameronBieganek · 2020-06-12T22:18:53Z

The sum of an empty, typed array is zero:

julia> sum(Int[])
0

julia> sum(Bool[])
0

Broadcasting over an empty array returns an empty array:

julia> (x -> 2x).(Int[])
0-element Array{Int64,1}

julia> Int[] .== 1
0-element BitArray{1}

So, as expected, summing over the above broadcasted operations returns zero:

julia> sum((x -> 2x).(Int[]))
0

julia> sum(Int[] .== 1)
0

However, if we perform the same operations using the sum(f, itr) method, we get an ArgumentError:

julia> sum(x -> 2x, Int[])
ERROR: ArgumentError: reducing over an empty collection is not allowed
Stacktrace:
 [1] _empty_reduce_error() at ./reduce.jl:295
 [2] mapreduce_empty(::Function, ::Function, ::Type{T} where T) at ./reduce.jl:334
 [3] _mapreduce(::var"#7#8", ::typeof(Base.add_sum), ::IndexLinear, ::Array{Int64,1}) at ./reduce.jl:392
 [4] _mapreduce_dim(::Function, ::Function, ::NamedTuple{(),Tuple{}}, ::Array{Int64,1}, ::Colon) at ./reducedim.jl:312
 [5] #mapreduce#580 at ./reducedim.jl:307 [inlined]
 [6] mapreduce at ./reducedim.jl:307 [inlined]
 [7] _sum at ./reducedim.jl:657 [inlined]
 [8] #sum#584 at ./reducedim.jl:653 [inlined]
 [9] sum(::Function, ::Array{Int64,1}) at ./reducedim.jl:653
 [10] top-level scope at REPL[8]:1

julia> sum(==(1), Int[])
ERROR: ArgumentError: reducing over an empty collection is not allowed
Stacktrace:
 [1] _empty_reduce_error() at ./reduce.jl:295
 [2] mapreduce_empty(::Function, ::Function, ::Type{T} where T) at ./reduce.jl:334
 [3] _mapreduce(::Base.Fix2{typeof(==),Int64}, ::typeof(Base.add_sum), ::IndexLinear, ::Array{Int64,1}) at ./reduce.jl:392
 [4] _mapreduce_dim(::Function, ::Function, ::NamedTuple{(),Tuple{}}, ::Array{Int64,1}, ::Colon) at ./reducedim.jl:312
 [5] #mapreduce#580 at ./reducedim.jl:307 [inlined]
 [6] mapreduce at ./reducedim.jl:307 [inlined]
 [7] _sum at ./reducedim.jl:657 [inlined]
 [8] #sum#584 at ./reducedim.jl:653 [inlined]
 [9] sum(::Function, ::Array{Int64,1}) at ./reducedim.jl:653
 [10] top-level scope at REPL[9]:1

The text was updated successfully, but these errors were encountered:

DilumAluthge · 2020-06-12T22:32:33Z

What if, for example, f(x) always returns some specific type, regardless of the type of x?

For example, suppose we define f(x) = 1.0. So f(x) always returns a Float64, regardless of the type of x.

In this example, sum(f, Int[1]) will return a Float64, and sum(f, Int[1, 2]) will return a Float64, but sum(f, Int[]) will return an Int?

Won't that mean that this method of sum is not type-stable? Because you can get two different output types for the same input types.

CameronBieganek · 2020-06-12T22:54:38Z

Well, ok. I guess I shouldn't have put zero(eltype(itr)) in the title. Ideally sum(f, itr) would return the zero of the output type of f (or more precisely, the zero of the type resulting from summing an array of the type of the output of f). Because this is inconsistent:

julia> sum((x -> 1.0).(Int[]))
0.0

julia> sum(x -> 1.0, Int[])
ERROR: ArgumentError: reducing over an empty collection is not allowed
Stacktrace:
 [1] _empty_reduce_error() at ./reduce.jl:295
 [2] mapreduce_empty(::Function, ::Function, ::Type{T} where T) at ./reduce.jl:334
 [3] _mapreduce(::var"#13#14", ::typeof(Base.add_sum), ::IndexLinear, ::Array{Int64,1}) at ./reduce.jl:392
 [4] _mapreduce_dim(::Function, ::Function, ::NamedTuple{(),Tuple{}}, ::Array{Int64,1}, ::Colon) at ./reducedim.jl:312
 [5] #mapreduce#580 at ./reducedim.jl:307 [inlined]
 [6] mapreduce at ./reducedim.jl:307 [inlined]
 [7] _sum at ./reducedim.jl:657 [inlined]
 [8] #sum#584 at ./reducedim.jl:653 [inlined]
 [9] sum(::Function, ::Array{Int64,1}) at ./reducedim.jl:653
 [10] top-level scope at REPL[28]:1

However, off the top of my head, I don't know how to implement that. Is there an internal function somewhere in Base or Core that returns the inferred output type of a function?

CameronBieganek · 2020-06-12T23:25:08Z

I can't speak to the efficiency of this, but here's an implementation that has the desired behavior:

julia> mysum(f, itr) = sum(f.(itr))
mysum (generic function with 1 method)

julia> mysum(x -> 1.0, Int[])
0.0

julia> mysum(==(1), Int[])
0

goretkin · 2020-06-12T23:42:03Z

What zero should it return, when the type is not inferrable?

On the other hand:

julia> sum(Array{Union{Float64, Int64},1}())
0

So if sum(x -> x%2==0 ? 1 : 2.0, Int64[]) === 0, then that's consistent.

tkf · 2020-06-12T23:44:37Z

Defining sum(f, itr) for empty generic itr and arbitrary f is not possible in the current Julia infrastructure.

But we now have init for sum #36188. So, as of Julia 1.6, the best practice would be to use init if you want sum to work for generic possibly empty collections.

(Note: The confusion here is that broadcasting is "cheating" by using the internal inference API return_type in the empty case. The broadcasting implementation weighs more on convenience and performance than the correctness. So, I don't think we should be using the behavior of broadcasting as a reference.)

CameronBieganek · 2020-06-13T00:04:22Z

Defining sum(f, itr) for empty generic itr and arbitrary f is not possible in the current Julia infrastructure.

Well, technically it's possible if you "cheat" and use broadcasting: mysum(f, itr) = sum(f.(itr)).

But we now have init for sum #36188. So, as of Julia 1.6, the best practice would be to use init if you want sum to work for generic possibly empty collections.

Ok. And I see that you've updated the docstrings for sum, so that helps to clarify what happens with empty collections. However, I feel that there's still a semantic difference between sum(f, itr) and sum(f.(itr)) that is not explained in the docstring.

@tkf Question:
After #36188, does sum(Int[]) still return zero, or do you have to call sum(Int[]; init=0) to not get an error?

tkf · 2020-06-13T00:21:18Z

Defining sum(f, itr) for empty generic itr and arbitrary f is not possible in the current Julia infrastructure.

Well, technically it's possible if you "cheat" and use broadcasting: mysum(f, itr) = sum(f.(itr)).

Actually, no, it's not defining any concrete API. It's exposing undefined behavior of return_type rather than throwing an exception.

However, I feel that there's still a semantic difference between sum(f, itr) and sum(f.(itr)) that is not explained in the docstring.

Personally, I think it's better to make f.(itr) "more correct" by returning Array{Union{}} when the input is empty. But I know that this is not very popular opinion.

After #36188, does sum(Int[]) still return zero, or do you have to call sum(Int[]; init=0) to not get an error?

sum(Int[]) should be fine. More generally, if you have sum(xs) with IteratorEltype(xs) isa HasEltype and zero(eltype(xs)) is defined, sum(xs) should return the zero. For example, sum(Iterators.filter(>(0), Int[])) works. If not, please open an issue.

goretkin · 2020-06-13T00:45:32Z

Note: The confusion here is that broadcasting is "cheating" by using the internal inference API return_type in the empty case.

Is that the whole story?
Regarding #36262 (comment)
Broadcasting is using that internal inference API to produce the result Array{Union{Float64, Int64},1}(), but that is separate from sum(Array{Union{Float64, Int64},1}()) === 0

tkf · 2020-06-13T01:00:11Z

On the other hand:
julia> sum(Array{Union{Float64, Int64},1}())
0
So if sum(x -> x%2==0 ? 1 : 2.0, Int64[]) === 0, then that's consistent.

Actually I miss why these behaviors are inconsistent. I'd expect these behaviors if using return_type is OK.

NightMachinery · 2020-08-27T10:33:07Z

@tkf commented on Jun 13, 2020, 4:51 AM GMT+4:30:

sum(Int[]) should be fine. More generally, if you have sum(xs) with IteratorEltype(xs) isa HasEltype and zero(eltype(xs)) is defined, sum(xs) should return the zero. For example, sum(Iterators.filter(>(0), Int[])) works. If not, please open an issue.

sum(1 for i in []) doesn't work though.

I am trying to count something using a for comprehension. I tried length(nothing for i in [1,2] if i == 1), but that throws ERROR: MethodError: no method matching length(::Base.Iterators.Filter{var"#246#248",Array{Int64,1}}).

martinholters · 2020-08-28T06:35:52Z

count(==(1), [1,2]) for that case. More generally, the recently introduced init keyword for sum will help out:

julia> sum(1 for i in []; init = 0)
0

Given the difficulty in determining which zero (i.e. of which type) to return in general for an empty input collection, I wonder whether init is sufficient to close this?

vtjnash · 2021-11-23T19:10:08Z

closing since we now have init, both as an argument, and in the error message, as mentioned above:

But we now have init for sum #36188. So, as of Julia 1.6, the best practice would be to use init if you want sum to work for generic possibly empty collections

CameronBieganek changed the title ~~sum(f, itr) when itr is an empty collection should return zero(eltype(itr))~~ sum(f, itr) when itr is an empty collection should return zero Jun 12, 2020

tkf added the fold sum, maximum, reduce, foldl, etc. label Jun 13, 2020

davidavdav mentioned this issue Mar 23, 2021

prod() over zero-length tuple generator should just return 1? #40160

Closed

vtjnash closed this as completed Nov 23, 2021

stevengj mentioned this issue Jun 29, 2023

fix conversion of empty Dates.CompoundPeriod() to zero units #50259

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

sum(f, itr) when itr is an empty collection should return zero #36262

sum(f, itr) when itr is an empty collection should return zero #36262

CameronBieganek commented Jun 12, 2020

DilumAluthge commented Jun 12, 2020 •

edited

CameronBieganek commented Jun 12, 2020 •

edited

CameronBieganek commented Jun 12, 2020 •

edited

goretkin commented Jun 12, 2020

tkf commented Jun 12, 2020 •

edited

CameronBieganek commented Jun 13, 2020

tkf commented Jun 13, 2020

goretkin commented Jun 13, 2020

tkf commented Jun 13, 2020

NightMachinery commented Aug 27, 2020 •

edited

martinholters commented Aug 28, 2020

vtjnash commented Nov 23, 2021

sum(f, itr) when itr is an empty collection should return zero #36262

sum(f, itr) when itr is an empty collection should return zero #36262

Comments

CameronBieganek commented Jun 12, 2020

DilumAluthge commented Jun 12, 2020 • edited

CameronBieganek commented Jun 12, 2020 • edited

CameronBieganek commented Jun 12, 2020 • edited

goretkin commented Jun 12, 2020

tkf commented Jun 12, 2020 • edited

CameronBieganek commented Jun 13, 2020

tkf commented Jun 13, 2020

goretkin commented Jun 13, 2020

tkf commented Jun 13, 2020

NightMachinery commented Aug 27, 2020 • edited

martinholters commented Aug 28, 2020

vtjnash commented Nov 23, 2021

DilumAluthge commented Jun 12, 2020 •

edited

CameronBieganek commented Jun 12, 2020 •

edited

CameronBieganek commented Jun 12, 2020 •

edited

tkf commented Jun 12, 2020 •

edited

NightMachinery commented Aug 27, 2020 •

edited