Add support for zero(::Array{Array}) #8759

jiahao · 2014-10-22T00:33:58Z

The problem with the existing definition is that it tries to call zero(::Type{Array}), which fails because it doesn't know what size the eltype should be.

The solution here is simple - if the outer Array is not empty, call zero on its first element.

A similar problem occurs with one() but that is harder to fix.
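A minimal sketch of the idea, written for current Julia (1.x) rather than the 0.3-era code in this pull request. `nested_zero` is a hypothetical helper name; it calls `zero` on every inner array instead of only the first, so each slot gets its own freshly allocated zero and an empty outer array needs no special case.

```julia
# Hypothetical helper (not the implementation from this pull request).
# Assumes every inner array defines `zero`, which holds for numeric arrays.
function nested_zero(A::AbstractArray{<:AbstractArray})
    # `map` calls `zero` once per element, so no two slots share storage.
    return map(zero, A)
end

A = [ones(2, 2) for i in 1:2, j in 1:2]
Z = nested_zero(A)
Z[1, 1][1, 1] = 1.0          # mutate one inner matrix
println(Z[1, 2][1, 1])       # prints 0.0: the neighbour is unaffected
```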

ivarne · 2014-10-22T12:38:13Z

-1

It's not obvious to me that zero(a::Array{Array{Float64,2},2}) should return an array with 4 references to the same array.

julia> A = [zeros(2,2) for i=1:2, j=1:2];
julia> a = zero(A);
julia> a[1][1] = 1;
julia> a
2x2 Array{Array{Float64,2},2}:
 2x2 Array{Float64,2}:
 1.0  0.0
 0.0  0.0  2x2 Array{Float64,2}:
 1.0  0.0
 0.0  0.0
 2x2 Array{Float64,2}:
 1.0  0.0
 0.0  0.0  2x2 Array{Float64,2}:
 1.0  0.0
 0.0  0.0

jiahao · 2014-10-22T13:14:58Z

Thanks for catching that.

jiahao · 2014-10-22T13:18:47Z

Could this be considered a bug in similar or fill!?

julia> A = [zeros(2,2) for i=1:2, j=1:2]
2x2 Array{Array{Float64,2},2}:
 2x2 Array{Float64,2}:
 0.0  0.0
 0.0  0.0  2x2 Array{Float64,2}:
 0.0  0.0
 0.0  0.0
 2x2 Array{Float64,2}:
 0.0  0.0
 0.0  0.0  2x2 Array{Float64,2}:
 0.0  0.0
 0.0  0.0

julia> A[1,1][1,2]=1
1

julia> A
2x2 Array{Array{Float64,2},2}:
 2x2 Array{Float64,2}:
 0.0  1.0
 0.0  0.0  2x2 Array{Float64,2}:
 0.0  0.0
 0.0  0.0
 2x2 Array{Float64,2}:
 0.0  0.0
 0.0  0.0  2x2 Array{Float64,2}:
 0.0  0.0
 0.0  0.0

julia> B=similar(A)
2x2 Array{Array{Float64,2},2}:
 #undef  #undef
 #undef  #undef

julia> fill!(B, zeros(2,2))
2x2 Array{Array{Float64,2},2}:
 2x2 Array{Float64,2}:
 0.0  0.0
 0.0  0.0  2x2 Array{Float64,2}:
 0.0  0.0
 0.0  0.0
 2x2 Array{Float64,2}:
 0.0  0.0
 0.0  0.0  2x2 Array{Float64,2}:
 0.0  0.0
 0.0  0.0

julia> [pointer(a) for a in A]
4-element Array{Any,1}:
 Ptr{Float64} @0x00007fb3b2f440f0
 Ptr{Float64} @0x00007fb3b2f44140
 Ptr{Float64} @0x00007fb3b2f441e0
 Ptr{Float64} @0x00007fb3b2f44280

julia> [pointer(a) for a in B]
4-element Array{Any,1}:
 Ptr{Float64} @0x00007fb3b6b91480
 Ptr{Float64} @0x00007fb3b6b91480
 Ptr{Float64} @0x00007fb3b6b91480
 Ptr{Float64} @0x00007fb3b6b91480
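The pointer comparison above can also be done with `===`, which tests object identity. A small sketch for current Julia (1.x); the variable names are illustrative only.

```julia
A = [zeros(2, 2) for i in 1:2, j in 1:2]   # comprehension: body runs once per slot
B = similar(A)
fill!(B, zeros(2, 2))                      # one matrix, stored in every slot

# `===` is true only when both sides are the very same object.
distinct_in_A = all(A[i] !== A[j] for i in 1:4 for j in (i + 1):4)
shared_in_B   = all(B[i] === B[1] for i in 1:4)

println(distinct_in_A)   # true
println(shared_in_B)     # true
```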

jiahao · 2014-10-22T13:21:40Z

@timholy, would this be a horrible change?

#multidimensional.jl:174-9
@ngenerate N typeof(A) function fill!{T,N}(A::AbstractArray{T,N}, x)
    @nloops N i A begin
        @inbounds (@nref N A i) = x #should this be deepcopy(x)?
    end
    A
end

ivarne · 2014-10-22T13:24:18Z

To quote @StefanKarpinski, it would be "bizarre" (if I understand correctly what he is talking about).

I'm not sure I agree with that statement though.

jiahao · 2014-10-22T13:27:45Z

I'm puzzled. I don't really see a use case for the current behavior of fill! to create an entire array with multiple references to the same object. It's like creating an entire suburban subdivision but each mailbox goes to the same house.

andreasnoack · 2014-10-22T13:31:58Z

Or maybe even deepcopy in case x is a Matrix{Matrix{Float64}}

jiahao · 2014-10-22T13:33:18Z

To continue the second-guessing of @StefanKarpinski,

"When you write fill!(arr, ChannVals()) you are asking to fill arr with the one value that is the result of evaluating ChannVals() once."

sounds to me like fill! should make multiple copies of the value, not the reference (which it currently does). But if you had (say) an iterator x, you don't necessarily want fill!(A, next(x)) to generate multiple values from different invocations of next(x), you just want to evaluate it once, then make multiple copies of the value you get.

jiahao · 2014-10-22T13:38:38Z

@andreasnoack that is a good point, the proposed change should be deepcopy(x) not copy(x). I'll edit my comment above.

timholy · 2014-10-22T13:46:13Z

I think this is a question for others to answer; I just "Cartesian-ified" an existing implementation of fill!.

This seems to be one of those "what do you mean by x?" questions.

kmsquire · 2014-10-22T13:56:23Z

The same issue exists when providing a default value for a DefaultDict (in DataStructures.jl). The solution there was to allow a function to be passed in which would create the new object (mimicking Python). So here, for example, one way to do this would be to allow

fill!(B, ()->zeros(2,2))

The downside (here and in DefaultDict) is that it becomes challenging to fill something with an actual function. But it is doable:

fill!(B, ()->(()->zeros(2,2)))
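One way to sidestep the ambiguity with filling an array of functions is a separately named helper. The sketch below assumes current Julia (1.x); `fill_with!` is a hypothetical name, not a function in Base or DataStructures.jl.

```julia
# Hypothetical helper: call `f()` once per slot and store each result.
function fill_with!(f, A::AbstractArray)
    for i in eachindex(A)
        A[i] = f()
    end
    return A
end

B = Matrix{Matrix{Float64}}(undef, 2, 2)
fill_with!(() -> zeros(2, 2), B)   # four independent 2x2 matrices
B[1, 1][1, 2] = 1.0
println(B[2, 2][1, 2])             # 0.0, the other slots are untouched

# Putting the function first also allows do-block syntax:
fill_with!(B) do
    zeros(2, 2)
end
```

Because the generator is a separate argument position with its own function name, plain `fill!` keeps its current meaning for arrays whose elements are functions.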

andreasnoack · 2014-10-22T14:01:07Z

Is it ever useful to fill an Array with the same reference in all elements?

timholy · 2014-10-22T14:07:54Z

Sure:

black = fill(RGB(0,0,0), 5, 5)
red = fill(RGB(1,0,0), 5, 5)
checkerboard = Array(Matrix{RGB}, 9, 9)
fill!(checkerboard, black)
for i = 1:2:length(checkerboard)
    checkerboard[i] = red
end

jiahao · 2014-10-22T14:10:36Z

It doesn't look like the checkerboard example would break if we changed fill!(A, x) to do an implicit deepcopy(x) rather than direct assignment of x to each element. What would be interesting is if there were somehow a use case that took advantage of the assignment-of-reference semantics and would break if we changed it to assignment-of-value.

timholy · 2014-10-22T14:25:15Z

Throwing down the gauntlet, eh? (I'm not saying changing it would be wrong on average, but I do feel that whatever we choose will be wrong sometimes.)

component1 = [0 0 0; 1 1 1; 0 0 0]
component2 = [0 1 0; 0 1 0; 0 1 0]
img = rand(27, 27)
tiles = fill(component1, 9, 9)
for i = 1:2:27 tiles[i] = component2; end
for i = 1:iter
    # improve component1, component2
    nmf(...)
    # choose the best component to describe the ith tile
    tiles[i] = err1 < err2 ? component1 : component2
end

StefanKarpinski · 2014-10-22T14:58:56Z

Adding implicit deepcopy anywhere seems like a bad idea – I'd prefer to know when I'm about to make a complete copy of everything in the entire world. Since deepcopy is a no-op on immutables that don't reference mutables, I'm not super concerned about that issue (ints, floats, RGB, etc.). The thing is that the behavior people really want here is probably for the expression that generates the value to be evaluated many times, not for the value to be copied many times. For example, if someone writes

fill!(Array(10), rand())

they probably wanted to call rand ten times. Of course, now it doesn't do what they wanted either, but it seems like the current behavior is simpler, even if it's sometimes unintuitive.
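The difference between evaluating once and evaluating per element can be seen with a call counter. A sketch for current Julia (1.x); `next_value` and `calls` are made-up names for illustration.

```julia
calls = Ref(0)                       # counts how often next_value() runs
next_value() = (calls[] += 1; rand())

a = fill(next_value(), 10)           # argument is evaluated before fill runs
calls_after_fill = calls[]           # 1

b = [next_value() for _ in 1:10]     # comprehension body runs per element
calls_after_comprehension = calls[]  # 11

println(all(==(a[1]), a))            # true: ten copies of one number
```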

andreasnoack · 2014-10-22T15:11:02Z

"Since deepcopy is a no-op on immutable"

It doesn't seem to be the case

julia> @time for i = 1:10^7;copy(1);end
elapsed time: 5.74e-7 seconds (0 bytes allocated)

julia> @time for i = 1:10^7;deepcopy(1);end
elapsed time: 2.670992386 seconds (3520000000 bytes allocated, 53.65% gc time)

StefanKarpinski · 2014-10-22T15:14:11Z

Ah, yes. Because deepcopy creates a dict to keep track of circularities. It's still a no-op, just a very expensive one. Good ol' deepcopy.
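On the `copy` versus `deepcopy` distinction raised earlier in the thread: `copy` on an array of arrays duplicates only the outer container, while `deepcopy` also duplicates the inner arrays. A sketch for current Julia (1.x):

```julia
x = [zeros(2, 2), zeros(2, 2)]   # a Vector{Matrix{Float64}}

shallow = copy(x)       # new outer vector, same inner matrices
deep    = deepcopy(x)   # new outer vector and new inner matrices

println(shallow[1] === x[1])   # true
println(deep[1] === x[1])      # false
println(deepcopy(1) === 1)     # true: nothing to copy for a plain Int
```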

jiahao · 2014-10-22T18:01:14Z

@timholy just to clarify, the very issue I am wrestling with is in loops of the form

fill!(tiles, startingtile)
for i = 1:iter
    dostuff()
    tiles[i][j] = newval #I want to update the jth entry of the ith element but this doesn't do it
end

If the inner loop were to do tiles[i] = newval then the code would assign as intended. However if tiles were an Array of Arrays this code simply would not work as intended, because fill! creates a tiles array such that every single element is a reference to the same memory location, which is one initialized instance of startingtile. The assignment tiles[i][j]=newval then has the same semantics as tiles[:][j]=newval. I find it very hard to believe that users would ever want this behavior.

"The thing is that the behavior people really want here is probably for the expression that generates the value to be evaluated many times, not for the value to be copied many times."

I see the issue; it happens to make no difference for my particular use case. However, I still see no justification for the current semantics of fill!: why would anyone ever want to create a Pleasantville-array with multiple copies of pointers to the same exact memory location?

JeffBezanson · 2014-10-22T18:01:18Z

The system should not try to guess what needs to be copied, or how you're going to mutate things. Generic library functions can do what they want, and mutation has to be managed manually.

JeffBezanson · 2014-10-22T18:03:04Z

The current fill! is basically optimized for functional-style programming. If you don't mutate things, it's a more efficient representation of the correct answer.

timholy · 2014-10-22T18:08:58Z

@jiahao, the point with my example was intended to be that you want to maintain just 2 components to describe all the tiles, so each element of the array must maintain a pointer to one or the other. Sorry if that wasn't clear.

I'm not saying there aren't good reasons to want the other thing sometimes, too. Just that there are use cases for the current behavior.
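A small sketch of the sharing use case, for current Julia (1.x). The component values and the in-place update are invented for illustration; they stand in for the `nmf(...)` step in the earlier example.

```julia
component1 = [0.0 0.0 0.0; 1.0 1.0 1.0; 0.0 0.0 0.0]
component2 = [0.0 1.0 0.0; 0.0 1.0 0.0; 0.0 1.0 0.0]

tiles = fill(component1, 9, 9)     # 81 references to component1
for i in 1:2:length(tiles)
    tiles[i] = component2          # rebind odd slots to component2
end

component1 .*= 0.5                 # in-place update of the shared matrix

# Every tile that refers to component1 sees the update immediately.
println(tiles[2][2, 1])                       # 0.5
println(count(t -> t === component1, tiles))  # 40
```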

jiahao · 2014-10-22T18:13:11Z

"If you don't mutate things, it's a more efficient representation of the correct answer."

So my question is, is this a premature optimization? Should fill!(A, mutablething) support the semantics of a mutable container of mutable objects? Or is this a "don't do that" gotcha?

It would be very sad if we go for the latter. It will mean that practically all the generic linear algebra code will get uglier because we will have to special case whether or not the element is mutable.

jiahao · 2014-10-22T18:22:52Z

@timholy

"the point with my example was intended to be that you want to maintain just 2 components to describe all the tiles, so each element of the array must maintain a pointer to one or the other. Sorry if that wasn't clear."

IIUC then I should see that there are only two unique pointer addresses in the checkerboard code and they should be the address of black and red respectively, but there are 81:

black = fill(RGB(0,0,0), 5, 5)
bptr = pointer(black)
red = fill(RGB(1,0,0), 5, 5)
rptr = pointer(red)
checkerboard = Array(Matrix{RGB}, 9, 9)
fill!(checkerboard, black)
map(pointer, checkerboard[:]) |> unique |> length #81
sum(map(pointer, checkerboard[:]) .== bptr) #0
sum(map(pointer, checkerboard[:]) .== rptr) #0
for i = 1:2:length(checkerboard)
    checkerboard[i] = red
end
map(pointer, checkerboard[:]) |> unique |> length #81
sum(map(pointer, checkerboard[:]) .== bptr) #0
sum(map(pointer, checkerboard[:]) .== rptr) #0

or am I misunderstanding what pointer does?

JeffBezanson · 2014-10-22T18:25:14Z

Why would linear algebra code mutate the elements of a matrix? Seems odd to me.

The problem is that once you get into stuffing implicit copys here and there, you're never done debating where they should and shouldn't go. For example when copy! moves elements from one array to another, should it copy the elements? I doubt there is an intuitive rule for knowing when things get copied behind your back and when they don't. "Never" is the simplest and most general option. It's easier to do extra copies yourself than to un-copy things that were copied for you.

jiahao · 2014-10-22T18:34:03Z

"Why would linear algebra code mutate the elements of a matrix? Seems odd to me."

(I assume you meant elements of elements of a matrix.) This comes up every time you want to write in-place updating operations on tiled matrices (i.e. a matrix of matrices) which would overwrite the elements of the outer matrix with new elements (inner matrices). Usually we don't want out-of-place updating operations where you allocate new inner matrices, populate them and then reassign them as elements of the outer matrices, which would be the only thing you can do with immutable semantics.

timholy · 2014-10-22T18:39:16Z

@jiahao, bug in my example (non-concrete element declaration). Change to checkerboard = Array(Matrix{RGB{Float64}}, 9, 9) and the World Will Make Sense Again.

jiahao · 2014-10-22T18:42:14Z

@timholy Ah yes much better. I was beginning to go crazy for Not Getting The Point.

JeffBezanson · 2014-10-22T18:42:21Z

Another way to say it: you can't expect all issues surrounding mutation to be solved for you by adding a few copys to the standard library. One such change might work today, but then later somebody will want it to be a deepcopy instead.

You're describing a function whose preconditions include "all inner matrices must be distinct objects". I'd argue that this isn't a reasonable precondition, because it's very hard for a caller to ensure. That use of mutation is an implementation detail of the function, so the function needs to bear full responsibility, not force it on callers.

jiahao · 2014-10-22T18:48:28Z

@JeffBezanson

"The current fill! is basically optimized for functional-style programming."

Isn't functional programming all about purity of ~~output~~ function mapping? And isn't the current behavior antithetical to that desire because its output is riddled with side effects associated with conflated memory addresses? Or am I misunderstanding something very fundamental about what "functional programming" is?

cc: @jakebolewski

JeffBezanson · 2014-10-22T18:49:46Z

Yes, you're misunderstanding. Shared references are not "impure". Mutation is what's impure.

StefanKarpinski · 2014-10-22T18:51:08Z

Predictably, I'm 100% with Jeff on this one (or is he 100% with me – after all, I called it bizarre first ;-) – it's easier to ask for a copy when you want one than to prevent a copy when you don't need one.

jiahao · 2014-10-22T19:08:31Z

I will agree that copying behind your back is generally not a good thing.

I think I've finally zeroed in (as it were) on what is so confusing for me. It's the fact that the term 'value' in the description of the function

Base.fill!(A, x)

Fill the array "A" with the value "x"

means "compiler value", which if boxed is a pointer value, but if unboxed is a value much closer to what I think about as "numeric value".

jiahao · 2014-10-22T19:13:41Z

Thanks everyone for humoring me as I stumble through yet another obvious-in-retrospect trajectory.

StefanKarpinski · 2014-10-22T19:21:10Z

These conversations are well worth having since a non-negligible amount of the time they lead to a positive change in the language and usually lead to more clarity for the people having the discussion.

timholy · 2014-10-22T20:44:38Z

Real non-allocating linear algebra routines have two exclamation points. Much more impressive that way.

But more seriously, if one thinks about writing A_mul_B! in pure Julia code, and you want to handle the case you're considering now, you'd literally have to replace

for j = 1:J, i = 1:I
    s = 0.0
    for k = 1:K
        s += A[i,k]*B[k,j]
    end
    C[i,j] = s
end

with

tmp = similar(C[1,1])
for j = 1:J, i = 1:I
    c = C[i,j]  # C is pre-filled with zeros-matrices
    for k = 1:K
        A_mul_B!(tmp, A[i,k], B[k,j])
        add!(c, tmp)
    end
end

If you tried to make this generic so it works for either immutable c or mutable c, I'd worry about losing the ability to use CPU registers or incurring extra write cache-misses. It might make the generic algorithm unattractive from a performance perspective.
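For reference, a runnable sketch of the in-place tiled product in current Julia (1.3 or later), where the five-argument `mul!` from `LinearAlgebra` replaces the `A_mul_B!` plus `add!` pair and removes the need for `tmp`. `tiled_mul!` is a hypothetical name, and the sketch handles only mutable tiles; it assumes every tile of `C` is a distinct, preallocated matrix.

```julia
using LinearAlgebra

# Hypothetical helper: C[i,j] = sum over k of A[i,k] * B[k,j], tile by tile.
# Requires the tiles of C to be distinct objects; aliased tiles would
# overwrite each other.
function tiled_mul!(C, A, B)
    m, p = size(A)
    n = size(B, 2)
    for j in 1:n, i in 1:m
        c = C[i, j]
        fill!(c, 0)                                # reset the accumulator tile
        for k in 1:p
            mul!(c, A[i, k], B[k, j], true, true)  # c = A[i,k]*B[k,j] + c
        end
    end
    return C
end

A = [rand(2, 2) for i in 1:2, k in 1:3]
B = [rand(2, 2) for k in 1:3, j in 1:2]
C = [zeros(2, 2) for i in 1:2, j in 1:2]   # comprehension, not fill
tiled_mul!(C, A, B)
```

Building `C` with `fill(zeros(2, 2), 2, 2)` instead would make all four tiles the same matrix, which is exactly the situation discussed in this thread.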

timholy · 2014-10-22T20:45:45Z

(Didn't hit refresh on the browser and so didn't realize the conversation had taken a different turn.)

ivarne · 2014-11-14T08:31:56Z

Just saw @amitmurthy publish code to julia-users with this exact mistake. Do we really expect our average user to have such a different intuition of what fill means than me, Jiahao and Amit?

amitmurthy · 2014-11-14T08:44:21Z

Thanks @ivarne, corrected the mistake on julia-users.

JeffBezanson · 2014-11-15T03:15:19Z

But that is not a mistake. Any objection to this is an objection to mutation. It would be better to get rid of mutation than to have to randomly copy things everywhere for no clear reason.

ivarne · 2014-11-15T07:35:20Z

How can that use of fill not be a mistake? The question was about a Design object that is way too big to create arbitrary copies, and contains mutable arrays. Do you really think Marcel wasn't planning on mutating the arrays, and wouldn't be up for a big surprise when his simulation gives wrong results?

amitmurthy · 2014-11-15T07:40:45Z

That use of fill was a mistake. But I am also in agreement with the current behavior, and just better documentation would have helped me avoid it.

matthieugomez · 2015-07-31T14:11:48Z

I stumbled on the same behavior when creating an array of sets:

a = fill(Set{Int}(), 2)
push!(a[1], 1)
a
#2-element Array{Set{Int64},1}:
# Set{Int64}({1})
# Set{Int64}({1})

I think the current behavior will lead to unpleasant experiences and silent errors in user code.
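A comprehension gives one fresh `Set` per slot, which avoids the sharing. A sketch for current Julia (1.x):

```julia
a = [Set{Int}() for _ in 1:2]   # the constructor runs once per element
push!(a[1], 1)

println(length(a[1]))   # 1
println(length(a[2]))   # 0, the second set is a separate object
```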

matthieugomez · 2015-07-31T17:07:42Z

I used fill in my code only because Array returns undefined refs rather than empty sets. A solution may be for Array to return the outcome of the empty constructor if available, so that one can do

a = Array(Set{Int}, 2)
push!(a[1], 1)

This may be linked to #9147

jiahao closed this Oct 22, 2014

jiahao deleted the cjh/zero branch October 25, 2014 17:59

amitmurthy mentioned this pull request Nov 15, 2014

doc: further explain fill/fill! #9015

Merged

jiahao mentioned this pull request Dec 12, 2014

Improve zero checks in sparse and reduce #9325

Closed

JeffBezanson mentioned this pull request Jan 8, 2015

Should sum() of 1-element array return a new object? #9669

Closed

jiahao mentioned this pull request Jun 3, 2015

Functions that return arrays with eltype as input should use container type instead? #11557

Closed

nalimilan mentioned this pull request Jul 31, 2015

fill(Foo(), dim) #12411

Closed

JeffBezanson mentioned this pull request Aug 3, 2015

convert can lead to unexpected aliasing #12441

Closed

Sacha0 mentioned this pull request Jul 20, 2016

Augment convert's documentation with a note on aliasing #17519

Merged

TotalVerb mentioned this pull request Sep 15, 2016

Deprecate vectorized real methods in favor of compact broadcast syntax #18513

Closed

andreasnoack mentioned this pull request Oct 15, 2016

Pushing to array of arrays #18958

Closed
