`fvec` #31

mcabbott · 2022-01-23T01:35:28Z

This is an attempt to write a function which does what `Flux.destructure` does, but more carefully. It allows for missing branches of the structural gradient, and for shared weights which receive separate gradients; these ought to be accumulated when making the vector. When converting back to a structural gradient, only the first node gets the gradient, to avoid mistakenly doubling it.
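As a rough illustration of the accumulation rule described above, here is a minimal sketch in plain Julia. The names `flatten!` and `accumulate_grad!` are hypothetical and are not the functions defined in this PR; the sketch assumes the model is a nested `NamedTuple`/`Tuple` of `Float64` arrays and that shared weights are detected by object identity.

```julia
# Hypothetical helpers for illustration only, not this PR's implementation.

# Append each distinct array (by object identity) to `flat`, recording its offset.
function flatten!(flat::Vector{Float64}, offsets::IdDict, x)
    if x isa AbstractArray{<:Number}
        if !haskey(offsets, x)          # a shared array is stored only once
            offsets[x] = length(flat)
            append!(flat, vec(x))
        end
    elseif x isa Union{Tuple, NamedTuple}
        foreach(y -> flatten!(flat, offsets, y), x)
    end
    return flat
end

# Add a structural gradient `dx` into the flat gradient vector `gflat`.
function accumulate_grad!(gflat, offsets::IdDict, x, dx)
    dx === nothing && return gflat      # missing branch contributes nothing
    if x isa AbstractArray{<:Number}
        o = offsets[x]
        gflat[o .+ (1:length(x))] .+= vec(dx)   # tied weights accumulate here
    elseif x isa Union{Tuple, NamedTuple}
        foreach((y, dy) -> accumulate_grad!(gflat, offsets, y, dy), x, dx)
    end
    return gflat
end

W = [1.0 2.0; 3.0 4.0]
model = (enc = (weight = W,), dec = (weight = W,), bias = [5.0, 6.0])  # W is shared

offsets = IdDict{Any,Int}()
flat = flatten!(Float64[], offsets, model)

# Separate gradients for the two uses of W, and no gradient for the bias.
grad = (enc = (weight = ones(2, 2),), dec = (weight = 2 .* ones(2, 2),), bias = nothing)
gflat = accumulate_grad!(zeros(length(flat)), offsets, model, grad)
```

Here `flat` holds `W` once followed by the bias, and the first four entries of `gflat` are the sum of the two gradients for `W`, while the bias entries stay zero.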

It also declares some Base types like `Transpose` to be functors, since a decoder with transposed weights seems like the canonical example, and sets up a way to allow their gradients not to be `Transpose`.
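To show why unwrapping `Transpose` matters for the decoder example, here is a small sketch. `children_and_rebuild` is a hypothetical stand-in for a `functor` method and is not part of Functors.jl; it only assumes the `LinearAlgebra` standard library.

```julia
using LinearAlgebra

# Hypothetical stand-in for a functor method: return the children and a rebuilder.
children_and_rebuild(x::Transpose) = (parent = parent(x),), y -> transpose(only(y))

W = [1.0 2.0; 3.0 4.0]
model = (enc = W, dec = transpose(W))    # decoder reuses the encoder weights

kids, re = children_and_rebuild(model.dec)
tied = kids.parent === model.enc         # the wrapper no longer hides the shared array

# Rebuilding from modified children gives back a Transpose.
rebuilt = re(map(x -> 2 .* x, kids))
```

Without looking inside the wrapper, `model.enc` and `model.dec` are different objects, so a walk keyed on identity would not notice that they share storage.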

And, thirdly, it changes `fmap` to not use the cache on `isbits` objects. This is to avoid falsely tying two parameters which are both initially, say, `SA[0,0,0]`. We thought elsewhere that the way to indicate that you do want such things tied is to wrap them in some `TiedArray` type like https://gist.github.com/mcabbott/35f660ff9fb8e7b0b23a2abb94618cf4, but that is not in this PR.
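The false tying can be reproduced without StaticArrays. The sketch below uses a tuple of floats as a stand-in for an `isbits` parameter such as `SA[0,0,0]`, and an `IdDict` as a stand-in for the cache; `usecache` is a hypothetical name for the guard, not necessarily what this PR defines.

```julia
a = (0.0, 0.0, 0.0)          # isbits value, standing in for SA[0,0,0]
b = (0.0, 0.0, 0.0)          # a second parameter, constructed separately

u = [0.0, 0.0, 0.0]          # ordinary mutable arrays with equal contents
v = [0.0, 0.0, 0.0]

same_bits  = a === b         # isbits values with equal bits are indistinguishable
same_array = u === v         # distinct arrays have distinct identities

cache = IdDict{Any,Any}()
cache[a] = :seen
cache[u] = :seen
b_looks_tied = haskey(cache, b)   # b is wrongly treated as already visited
v_looks_tied = haskey(cache, v)   # v is correctly treated as new

# Hypothetical guard: consult the cache only for objects with a real identity.
usecache(x) = !isbits(x)
```

With such a guard, `a` and `b` would each be processed independently, while genuinely shared mutable arrays are still recognised.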

Most of this can simply run with a different walk, to hit only trainable arrays, which I think ought to be done in Optimisers.jl, which owns `trainable`. The exception is the gradient of `fcopy`, which doesn't seem to fit into just using a different walk (maybe there's a way?). It takes a different children function instead.

Not sure what should be exported here, or what anything should be called. I think the name `destructure` should belong to Optimisers.jl too. Perhaps this package should export a function which similarly returns a vector and a reconstructor. That's just one line given the pieces defined here. (But it is easier to write & test the pieces separately.)

Needs more tests, still.

mcabbott · 2022-01-31T14:37:28Z

src/base.jl

+functor(::Type{<:Transpose}, x) = (parent = parent(x),), y -> transpose(only(y))
+lazywrap(x::Transpose) = parent(x), transpose


This is #28 and should perhaps be pulled out.

From FluxML/Optimisers.jl#42 I think the right pattern is actually more like this -- calling parent(x) is not safe, and we may want to transpose things which aren't matrix-like.

    Functors.functor(::Type{<:Transpose}, x) = (parent = _transpose(x),), y -> _transpose(only(y))

    _transpose(x) = transpose(x)
    _transpose(x::NamedTuple{(:parent,)}) = x.parent  # "structural" gradient
    # _transpose(bc::Broadcast.Broadcasted{S}) where S = Broadcast.Broadcasted{S}(map(_transpose, bc.args)) do args...
    #   transpose(bc.f(map(transpose, args)...))
    # end
    _transpose(bc::Broadcast.Broadcasted) = transpose(Broadcast.materialize(bc))

mcabbott added 3 commits January 22, 2022 20:25

functor some base types

184832b

add fvec and friends

11cde33

fcopy should prune

02a05a3

mcabbott closed this Jan 23, 2022

mcabbott added 2 commits January 24, 2022 23:31

only cache isbits types

30b8216

improve

b7f7f5f

mcabbott reopened this Jan 26, 2022

mcabbott mentioned this pull request Jan 27, 2022

Optimise only at isnumeric leaves FluxML/Optimisers.jl#29

Merged

mcabbott force-pushed the flatten branch from 23a81d0 to cb4a3d5 Compare January 29, 2022 16:25

mcabbott marked this pull request as ready for review January 29, 2022 16:25

mcabbott force-pushed the flatten branch from cb4a3d5 to 2a5b70e Compare January 29, 2022 17:12

mcabbott mentioned this pull request Jan 29, 2022

Add destructure FluxML/Optimisers.jl#40

Closed

fixup

09cdfd5

mcabbott force-pushed the flatten branch from 2a5b70e to 09cdfd5 Compare January 29, 2022 17:31

ToucheSir mentioned this pull request Jan 30, 2022

Extract common functionality into fold #32

Open

mcabbott commented Jan 31, 2022

This was referenced Jan 31, 2022

Functor Transpose et. al. #33

Merged

Should Flux.destructure be moved to Functors.jl #5

Closed

mcabbott marked this pull request as draft February 2, 2022 03:30

This was referenced Feb 5, 2022

Have destructure return only trainable params FluxML/Flux.jl#1742

Closed

Add destructure, take II FluxML/Optimisers.jl#54

Merged

mcabbott closed this in FluxML/Optimisers.jl#54 Feb 14, 2022

mcabbott mentioned this pull request Feb 15, 2022

Add a structural loadparams! FluxML/Flux.jl#1875

Merged

3 tasks
