Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

zero should be type preserving #29839

Closed
GiggleLiu opened this issue Oct 29, 2018 · 9 comments
Closed

zero should be type preserving #29839

GiggleLiu opened this issue Oct 29, 2018 · 9 comments

Comments

@GiggleLiu
Copy link
Contributor

The change of type causes trouble in https://github.com/FluxML/Flux.jl/issues/472

julia> A = randn(2,2)
2×2 Array{Float64,2}:
 -2.75325    0.338825
 -0.611465  -1.18981 

julia> A'
2×2 LinearAlgebra.Adjoint{Float64,Array{Float64,2}}:
 -2.75325   -0.611465
  0.338825  -1.18981 

julia> zero(A')
2×2 Array{Float64,2}:
 0.0  0.0
 0.0  0.0

julia> transpose(A)
2×2 LinearAlgebra.Transpose{Float64,Array{Float64,2}}:
 -2.75325   -0.611465
  0.338825  -1.18981 

julia> zero(transpose(A))
2×2 Array{Float64,2}:
 0.0  0.0
 0.0  0.0

Adjoint, Transpose wrappers should be keeped.

I think this PR #29777 is a proper place to make this change?

@StefanKarpinski
Copy link
Sponsor Member

Adjoint, Transpose wrappers should be keeped.

Why?

@GiggleLiu
Copy link
Contributor Author

GiggleLiu commented Oct 30, 2018

zero should be type preserving, to make program more predictable

a minmal example

mutable struct MyArr{AT}
    arr::AT
end
m = randn(2,2)
a = MyArr(m')
a.arr = zero(a.arr) # break

what's happening in Flux (simplified version)

mutable struct MyTracker{AT}
    arr::AT
    grad::AT

    MyTracker(x) = new{typeof(x)}(x)
end
m = randn(2,2)
a = MyTracker(m')
a.grad = zero(a.arr)  # break, no one expect that, right?

@GiggleLiu
Copy link
Contributor Author

Also, if we believe 0' == 0 always hold, we can define

Base.zero(ad::Adjoint{T, AT}) where {T, AT} = Adjoint{T, AT}(zero(parent(ad)))

to make Adjoint propagate.

Then the following example will work

using LinearAlgebra
Base.zero(a::MyArr) = MyArr(zero(a.arr))
adarr = Adjoint(MyArr);
zero(adarr);

@StefanKarpinski
Copy link
Sponsor Member

This could also be handled by having an appropriate convert method.

@StefanKarpinski
Copy link
Sponsor Member

There's also an argument to be made that the type of the grad field is wrong.

@GiggleLiu
Copy link
Contributor Author

BTW, similar is also not type preserving, I didn't even notice that

julia> similar(randn(3,3)') |> typeof
Array{Float64,2}

julia> randn(3,3)'
3×3 LinearAlgebra.Adjoint{Float64,Array{Float64,2}}:
 -1.97386    0.537503  -0.306281
 -0.187487  -0.669094  -0.154925
 -0.407494   0.818063   2.02156 

I am not a fan of Flux.jl, but here, the behavior of zero is less intuitive.

In the doc, zero is described as unit adder and similar is described as best-suited array that keep shape and eltype, these are all correct but vague. Could you please examplify when we must change type for zero and similar?

@StefanKarpinski
Copy link
Sponsor Member

BTW, similar is also not type preserving

Yes, intentionally. Because when you want something similar to an adjoint, you generally don't actually want it to be an adjoint. You just want something that's sufficiently compatible to work.

@GiggleLiu
Copy link
Contributor Author

Ok, it is really a hard decision between consistency and easy to use. Let's keep this issue open and see if threre are more feedbacks.

@GiggleLiu
Copy link
Contributor Author

I think we can close this issue now, because recently someone fixed it.

julia> zero([1,2,3]')
1×3 adjoint(::Vector{Int64}) with eltype Int64:
 0  0  0

julia> similar([1,2,3]')
1×3 adjoint(::Vector{Int64}) with eltype Int64:
 139677310498048  139677310498496  139677310498832

Cheers!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants