Add testmode! back for normalization layers #1044
Conversation
I'm dubious about defaulting to `:auto` in

```julia
testmode!(m, mode=true)  # set to test mode when called with 1 arg
testmode!(m, mode)       # force to input the mode
```
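For readers following along, here is a toy sketch (my own illustration, not code from this PR) of how a layer could record the requested mode; `ToyNorm` and its `active` field are hypothetical stand-ins for the real normalization layers:

```julia
# Toy layer (not Flux's BatchNorm) used only to show the mode bookkeeping.
mutable struct ToyNorm
    active::Union{Bool,Nothing}   # nothing => decide automatically under AD
end

# Two-argument form: `true` forces test mode, `false` forces train mode,
# and `:auto`/`nothing` restores the automatic behaviour.
testmode!(m::ToyNorm, mode = true) =
    (m.active = (isnothing(mode) || mode == :auto) ? nothing : !mode; m)
```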
btw, rebase has gone wrong here
Yeah I agree. I'll change that.
Any suggestions on how to resolve? I see two options: wait until the other PR I have outstanding has merged into master, or roll back the commits above and then try to rebase from upstream. I haven't done the latter before, but I can figure it out. The weird commit history is why I set this PR as a draft.
your other PR is orthogonal to this, so you could have just branched from current master without any fear about the merging order. I don't know what would be the best way to fix this. What I would do, since the PR is small, would be something a little bit dirty:
I ended up rebasing from before the other PR. The commit history should be clean now.
@dhairyagandhi96 @MikeInnes @xukai92 any comments on this? This is non-breaking, but once it gets in, reverting it would be breaking.
@darsnack should we also add `trainmode!`?
Are you thinking of something like
I did update
yeah, I was thinking

```julia
trainmode!(m, mode=true) = mode isa Bool ? testmode!(m, !mode) : testmode!(m, mode)
```

I know it's redundant, but it seems more natural not to arbitrarily break the symmetry toward `testmode!` in the interface. This way, one would do
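A usage sketch of that symmetric interface (illustrative only, assuming both functions recurse through containers such as `Chain`):

```julia
using Flux

m = Chain(Dense(10, 10), BatchNorm(10))

testmode!(m)         # force test mode for every layer in the chain
trainmode!(m)        # force train mode
testmode!(m, :auto)  # back to the automatic, AD-trace-based behaviour
```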
Also, we should warn in the docstring to set the model back to train mode once the test phase is over, or something along these lines.
ok, looks good, we can merge whenever you are ready. If you also bump Flux's minor version in the Project.toml, we can also proceed with tagging a new release.
I think this is ready to merge |
wait, it looks like the version was already bumped, sorry, could you change it back?
Oops, I should probably have double-checked that after merging. Should be good to go again.
nice, thanks! bors r+ |
Build succeeded |
This is awesome! Could this be tagged? I would love to use this as soon as possible! :)
This turns out to be breaking for cases where people instantiate a layer by manually specifying the fields instead of using the outer constructor. (e.g. ObjectDetector.jl) |
I would've thought that simply defining the
ouch |
This is the problematic line in ObjectDetector.jl
Is that sufficient for the functionality added in this PR? I don't follow how that allows users to force train/test mode on a per-layer basis.
That would prevent the breakages too, I think
If the usage in ObjectDetector is unusual, I'd happily take a PR to correct it
for freezing layers, the recommended way would be via the parameters anyway, so I think it's an orthogonal concern, unless that's specifically desired
Even if it is unusual, adding fields to a type is a breaking change unless the field defaults to a value. @ianshmean your PR should have been part of this one — that was my bad.
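To make that concrete, a toy example (not Flux's actual struct definition) of why the extra field is breaking for call sites that list every field positionally:

```julia
# Toy type standing in for a normalization layer.
mutable struct MyNorm
    β::Vector{Float32}
    γ::Vector{Float32}
    active::Union{Bool,Nothing}   # the newly added field
end

# A call site written before `active` existed now throws a MethodError:
# MyNorm(zeros(Float32, 4), ones(Float32, 4))

# It keeps working only if a convenience constructor supplies a default:
MyNorm(β, γ) = MyNorm(β, γ, nothing)
MyNorm(zeros(Float32, 4), ones(Float32, 4))
```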
The normalization layers update some fields on the forward pass when part of an AD trace. I don't think freezing by excluding the parameters will stop this update. This is standard, but v0.10 made the change of automatically deciding whether to update these fields or not. There are use-cases that are not standard training where the fields should not be updated even though a gradient is being computed.
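As a sketch of the kind of use case meant here (the model and data are made up): taking a gradient with respect to the input while the normalization statistics stay frozen.

```julia
using Flux

m = Chain(Dense(4, 2), BatchNorm(2))
x = rand(Float32, 4, 3)

testmode!(m)                       # use the stored μ/σ²; don't update them
g, = gradient(x -> sum(m(x)), x)   # input gradient, stats left untouched
```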
The parameters that we collect are the ones we update, I believe |
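For reference, the implicit-parameter training step being referred to looks roughly like this in Flux of that era (the model, data, and optimiser here are illustrative):

```julia
using Flux

m = Chain(Dense(4, 4), BatchNorm(4))
x, y = rand(Float32, 4, 8), rand(Float32, 4, 8)
opt = Descent(0.1)

ps = Flux.params(m)                         # collects what `trainable` exposes
gs = gradient(() -> Flux.mse(m(x), y), ps)  # implicit-parameter gradient
Flux.Optimise.update!(opt, ps, gs)          # updates exactly those parameters
```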
I think I am missing something. If you could explain how to address #909 with the
I suggest reverting this and working on it some more. I have a bunch of concerns about the API, and the code is pretty strange (e.g. why are there a bunch of additions to
I added the generic functions that work on any layer to
By this do you mean freezing via the following?

```julia
trainable(bn::BatchNorm) = (bn.β, bn.γ)
```

This affects what gets updated during the gradient step. I don't see how that affects L189-190:

```julia
μ = mean(x, dims = axes)
σ² = sum((x .- μ) .^ 2, dims = axes) ./ m
```

These are recomputed on every pass that is contained within an AD trace, regardless of what `trainable` returns; the stored statistics are only used on the other branch:

```julia
μ = reshape(BN.μ, affine_shape...)
σ² = reshape(BN.σ², affine_shape...)
```

In my mind, latching onto functor/trainable means rewriting the normalization layers so that all updates are part of the gradient computation (i.e. custom adjoints).
The
1432: Generalize train/testmode! to all Functors r=CarloLucibello a=ToucheSir

Addresses #1044 (comment). See also https://discourse.julialang.org/t/do-i-have-to-implement-flux-testmode-for-my-own-models/52038.

### PR Checklist

- [x] Tests are added
- [ ] Entry in NEWS.md
- [ ] Final review from `@dhairyagandhi96` (for API changes).

Co-authored-by: Brian Chen <ToucheSir@users.noreply.github.com>
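The generic fallbacks being generalized there look approximately like the following; this is a sketch from memory of the Flux definitions (redefined here purely for illustration), so the exact form in the source may differ:

```julia
import Flux: testmode!, trainmode!, trainable

# Generic fallbacks: recurse into a layer's children so that calling
# testmode!/trainmode! on a container such as Chain reaches any
# normalization layers nested inside it.
testmode!(m, mode = true)  = (foreach(x -> testmode!(x, mode), trainable(m)); m)
trainmode!(m, mode = true) = mode isa Bool ? testmode!(m, !mode) : testmode!(m, mode)
```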
Fixed #909

I added `testmode!(m, mode)` back to Flux as per v0.9. Now the `mode` can be `false`, `true`, or `:auto`/`nothing`, with the default being `:auto` for newly constructed layers. In `:auto` mode, the `istraining()` functions added in v0.10 are used to determine whether we are evaluating within an AD trace or not.

Also plan on adding a doc section in an additional commit.
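For readers wondering how the `:auto` detection works mechanically: roughly, `istraining()` returns `false` in ordinary code but carries an AD rule that makes it evaluate to `true` inside a Zygote gradient call, and each layer checks its explicit override first. A sketch along those lines (the names `istraining` and `_isactive` follow my reading of the Flux source of that era, so treat the details as approximate):

```julia
using Zygote: @adjoint

# False in ordinary code; the custom adjoint makes the call evaluate to
# true whenever it happens inside a Zygote gradient computation.
istraining() = false
@adjoint istraining() = true, _ -> nothing

# A layer then honours an explicit testmode!/trainmode! override first and
# only falls back to automatic detection when `active === nothing`.
_isactive(m) = isnothing(m.active) ? istraining() : m.active
```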