
Unify lower level API for EfficientNet and MobileNet model families #200

Merged (8 commits) on Sep 4, 2022

Conversation

@theabhirath (Member) commented Aug 26, 2022:

This PR unifies the lower level API for the EfficientNet and MobileNet model families into a single irmodelbuilder function. This followed naturally from #198: a considerable amount of the calculations, model structure and configuration has now been abstracted into pieces that compose, so most of the models in these two families can be identified almost completely by their configuration dict and a couple of other arguments (a toy sketch of the idea follows the list below). Among other things, this meant:

  1. a simplification of the builders for the inverted residual block stages,
  2. the ability to expose a friendlier mid-level API for all models that doesn't require the user to specify unexported, constant configuration dicts (this step is still in progress but, once completed, should result in a smoother experience for advanced users), and
  3. an even greater reduction in code duplication.
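
A toy sketch of that config-driven idea (StageSpec and build_stages are hypothetical names for illustration, not the actual Metalhead internals):

    struct StageSpec
        kernel::Int
        outchannels::Int
        expansion::Int
        stride::Int
        nrepeats::Int
    end

    # Expand a list of stage specs into individual blocks; by convention only
    # the first block of a stage applies the stride.
    function build_stages(make_block, configs::Vector{StageSpec})
        blocks = []
        for cfg in configs, i in 1:cfg.nrepeats
            push!(blocks, make_block(cfg; stride = i == 1 ? cfg.stride : 1))
        end
        return blocks
    end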

Documentation TODO

  • Builders
  • Mid level API

Miscellaneous Notes

As a step closer to a uniform API, the higher level model APIs now all accept the pretrain keyword argument (usage sketch below).
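
For illustration, a hedged usage sketch (the exact exported constructor signatures may differ from this):

    using Metalhead

    # The same `pretrain` keyword is now accepted across model families;
    # `pretrain = true` loads pretrained weights where a checkpoint exists.
    model = EfficientNet(:b0; pretrain = true)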

1. Expose `pretrain` option for all models
2. Make it easier to initialise models with config options at the mid level by providing an additional dispatch
3. Some cleanup + documentation
1. Unify MobileNet and EfficientNet APIs into a single lower level builder function
2. Further unification and consolidation of the mid level API
3. Some cleanup
            conv_norm((1, 1), inplanes, squeeze_planes, activation;
                      norm_layer)...,
            conv_norm((1, 1), squeeze_planes, inplanes,
                      gate_activation; norm_layer)...), .*)
Member:

TIL that (.*) returns a Base.Broadcast.BroadcastFunction. Maybe we should create an alias for it in NNlib...
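
For reference, a quick REPL demonstration (Julia 1.6+ parses a parenthesised dotted operator as a BroadcastFunction, which is why .* works as the connection of a SkipConnection, gating the input elementwise):

    julia> bmul = (.*)
    Base.Broadcast.BroadcastFunction(*)

    julia> bmul([1, 2], [3, 4])   # equivalent to [1, 2] .* [3, 4]
    2-element Vector{Int64}:
     3
     8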

@ToucheSir (Member) left a comment:

Fewer comments than I thought for a first pass :)

Resolved review threads (outdated): src/convnets/builders/resnet.jl, src/convnets/builders/mbconv.jl
1. Rename `DropPath` to `StochasticDepth` to better reflect the paper
2. `_prob` is more accurate than `_rate`.
3. Add stochastic depth to the `mbconv` builders
@theabhirath (Member, Author):

This PR now also adds a batch mode to StochasticDepth (renamed from DropPath, both because that name is not very descriptive and to mimic torchvision) for completeness, fixes some minor issues with the linear scheduling of rates, renames all the *_rate keywords to *_prob to make it clear that these are probability values, and adds StochasticDepth with a default probability of 0.2 to the EfficientNets, finally achieving feature parity with the papers and torchvision/timm. A sketch of the linear schedule is below.
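
A minimal sketch of one common variant of the linear schedule from the stochastic depth paper (Huang et al., 2016); linear_scheduler here is an illustrative helper, not necessarily the exact Metalhead internal:

    # Drop probability ramps from 0 at the first block towards `base_prob`
    # at the last block.
    linear_scheduler(base_prob, nblocks) = [base_prob * (i - 1) / nblocks
                                            for i in 1:nblocks]

    linear_scheduler(0.2, 5)   # approximately [0.0, 0.04, 0.08, 0.12, 0.16]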

"""
DropPath(p; [rng = rng_from_array(x)])
Member:

To make this non/less breaking, adding a deprecation for DropPath would work.
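
A minimal sketch of what that could look like, assuming StochasticDepth is already defined:

    # Keep the old name callable with a deprecation warning that points
    # users at the replacement.
    Base.@deprecate DropPath(p) StochasticDepth(p)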

@theabhirath (Member, Author):

But the Layers API is unexported, so does this still count as a breaking change?

@theabhirath (Member, Author):

It really should be exported in 0.8, but for now it isn't so I suppose it might work out in our favour 😄

@ToucheSir (Member) commented Aug 29, 2022:

I assumed that was why you added the label, but that is true. The built docs list it as "public", but I think that's a false positive: JuliaHub and GH search turn up no results.

@theabhirath (Member, Author):

Oh no, the breaking change is that EfficientNet literally returns a different model 😅

@theabhirath (Member, Author):

This PR is complete, I think - any further docs improvements would be best handled in a separate PR, after landing this one and then #199.

@darsnack (Member) left a comment:

Just a few final naming things. Also, we can tackle more thorough docs including docstrings separately.

Resolved review threads (outdated): src/convnets/builders/irmodel.jl, src/convnets/builders/mbconv.jl (two threads)
Comment on lines 14 to 15
stochastic_depth_prob = nothing, norm_layer = BatchNorm,
divisor::Integer = 8, kwargs...)
Member:

Do we need to have unused keywords? As far as I can tell, they are currently silently ignored, whereas if extraneous keywords weren't declared, passing them would throw a MethodError (as it should). A quick illustration is below.
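
A quick illustration in plain Julia (f and g are generic stand-ins, not the Metalhead functions):

    f(x; a = 1) = x + a             # fixed keyword list
    g(x; a = 1, kwargs...) = x + a  # catch-all swallows unknown keywords

    f(1; b = 2)   # MethodError: got unsupported keyword argument "b"
    g(1; b = 2)   # returns 2; `b` is silently ignored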

@theabhirath (Member, Author):

These are being used, though - to pass things like se_round_fn in for MobileNetv3.

@darsnack (Member) commented Sep 4, 2022:

I specifically mean stuff like stochastic_depth_prob

@theabhirath (Member, Author) commented Sep 4, 2022:

> I specifically mean stuff like stochastic_depth_prob

This particular one needs to be there because it would cause a MethodError in dwsep_conv_norm if passed through (and the default model builder passes it through); a sketch is below. See also #200 (comment)
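
A hypothetical sketch of that forwarding problem (inner and outer stand in for dwsep_conv_norm and the builder):

    inner(x; scale = 1) = scale * x

    # Naming `stochastic_depth_prob` captures it here and keeps it out of the
    # forwarded `kwargs...`; otherwise `inner` would throw a MethodError.
    outer(x; stochastic_depth_prob = nothing, kwargs...) = inner(x; kwargs...)

    outer(2; stochastic_depth_prob = 0.2, scale = 3)   # 6, no MethodError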

@theabhirath (Member, Author):

Bump?

@darsnack darsnack merged commit 33c5257 into FluxML:master Sep 4, 2022
                 stochastic_depth_prob = nothing, norm_layer = BatchNorm,
                 divisor::Integer = 8, kwargs...)
    width_mult, depth_mult = scalings
    block_repeats = [ceil(Int, block_configs[idx][end - 1] * depth_mult)
                     for idx in eachindex(block_configs)]
Contributor:

For MobileNet the config list is

const MOBILENETV2_CONFIGS = [
    (mbconv, 3, 16, 1, 1, 1, nothing, relu6),
    (mbconv, 3, 24, 6, 2, 2, nothing, relu6),
    (mbconv, 3, 32, 6, 2, 3, nothing, relu6),
    (mbconv, 3, 64, 6, 2, 4, nothing, relu6),
    (mbconv, 3, 96, 6, 1, 3, nothing, relu6),
    (mbconv, 3, 160, 6, 2, 3, nothing, relu6),
    (mbconv, 3, 320, 6, 1, 1, nothing, relu6),
]

so won't block_repeats just be a list of nothing?
And, as the name suggests, isn't it meant to tell how many times a particular block of layers is repeated?
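
The observation, reproduced on a plain tuple standing in for one MOBILENETV2_CONFIGS entry (symbols replace the block function and activation):

    cfg = (:mbconv, 3, 16, 1, 1, 1, nothing, :relu6)

    cfg[end - 1]   # nothing (the second-to-last slot)
    cfg[end - 2]   # 1 (the slot that actually holds the repeat count)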
