-
Notifications
You must be signed in to change notification settings - Fork 89
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Correct rrule for reshape #266
Conversation
function reshape_pullback(Ȳ) | ||
∂A = reshape(Ȳ, dims) | ||
∂A = reshape(Ȳ, A_dims) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
rather than close over all of A
we just close over its size.
lets A be GCed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As far as I understand the issue is that the reshape(Y, dims)
would not do anything because the output array Y
is already shaped as dims
, rather than the original shape of array A
. (Essentially needs to reshape back to the original shape, but doesn't)
I don't understand your comment about closing and what GC means. Could you please explain or link to relevant wiki page?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As far as I understand the issue is that the reshape(Y, dims) would not do anything because the output array Y is already shaped as dims, rather than the original shape of array A. (Essentially needs to reshape back to the original shape, but doesn't)
Correct.
I think by mistake the frule
for rshape was written.
Closures construct callable structs (functors),
with a field for each thing in partent scope they reference.
We call the things in parent scope that it references the variables that it closes over.
we say it is a closures over those variables.
So rather than closing over A
we close over A_dims
.
The fields of the closure are not copies, just references, so its not too bad.
But the garbage collector can't run until there are no more references to the thing.
Compare:
julia> function mkclosure()
X = ones(5,5)
pb() = (size(X) .+ 1)
return pb
end
mkclosure (generic function with 1 method)
julia> _pb = mkclosure()
(::var"#pb#13"{Array{Float64,2}}) (generic function with 1 method)
julia> _pb()
(6, 6)
julia> _pb.X
5×5 Array{Float64,2}:
1.0 1.0 1.0 1.0 1.0
1.0 1.0 1.0 1.0 1.0
1.0 1.0 1.0 1.0 1.0
1.0 1.0 1.0 1.0 1.0
1.0 1.0 1.0 1.0 1.0
julia> fieldnames(typeof(_pb))
(:X,)
julia> Base.summarysize(_pb)
248
Vs:
julia> function mkclosure2()
X = ones(5,5)
X_dims = size(X)
pb() = (X_dims .+ 1)
return pb
end
mkclosure2 (generic function with 1 method)
julia> _pb = mkclosure2()
(::var"#pb#15"{Tuple{Int64,Int64}}) (generic function with 1 method)
julia> _pb()
(6, 6)
julia> _pb.X_dims
(5, 5)
julia> fieldnames(typeof(_pb))
(:X_dims,)
julia> Base.summarysize(_pb)
16
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Aaahh I see the comment is referring to why it isn't just size(A)
inside the pb
, rather than why the change is made at all.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
correct
Any time you test your rules without using finite differencing, you might as well not be testing your rules