Unify lower level API for EfficientNet and MobileNet model families #200
Conversation
1. Expose `pretrain` option for all models 2. Make it easier to initialise models with config options at the mid level by providing an additional dispatch 3. Some cleanup + documentation
1. Unify MobileNet and EfficientNet APIs into a single lower level builder function 2. Further unification and consolidation of the mid level API 3. Some cleanup
```julia
conv_norm((1, 1), inplanes, squeeze_planes, activation;
          norm_layer)...,
conv_norm((1, 1), squeeze_planes, inplanes,
          gate_activation; norm_layer)...), .*)
```
TIL that `(.*)` returns a `Base.Broadcast.BroadcastFunction`. Maybe we should create an alias for it in NNlib...
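A quick illustration of what `(.*)` gives you (plain Julia, no Metalhead needed; the alias idea above would simply name this object):

```julia
bf = (.*)   # since Julia 1.6, dotted operators are first-class objects
bf isa Base.Broadcast.BroadcastFunction   # true; it wraps `*`
bf([1, 2], [3, 4])                        # same as [1, 2] .* [3, 4], i.e. [3, 8]
```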
Fewer comments than I thought for a first pass :)
1. Rename `DropPath` to `StochasticDepth` to better reflect the paper 2. `_prob` is more accurate than `_rate`. 3. Add stochastic depth to the `mbconv` builders
This PR now also adds an additional …
More cleanup
""" | ||
DropPath(p; [rng = rng_from_array(x)]) |
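For context, here is a minimal conceptual sketch of what this layer does; the `droppath` helper, the `active` flag, and the 1/(1 - p) rescaling follow common stochastic-depth implementations and are not Metalhead's actual code:

```julia
using Random

# Drop the whole residual branch with probability p during training;
# rescale surviving branches so the expected output is unchanged.
function droppath(x, p; rng = Random.default_rng(), active = true)
    active || return x                          # identity at inference time
    return rand(rng) < p ? zero(x) : x ./ (1 - p)
end
```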
To make this non/less breaking, adding a deprecation for `DropPath` would work.
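One low-cost way to do that, sketched under the assumption that `StochasticDepth` keeps the old call signature:

```julia
# Forwards DropPath calls to StochasticDepth and warns on first use.
@deprecate DropPath StochasticDepth
```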
But the Layers API is unexported, so does this still count as a breaking change?
It really should be exported in 0.8, but for now it isn't so I suppose it might work out in our favour 😄
I assumed it was why you added the label, but that is true. Built docs list it as "public", but I think that's a false positive and JuliaHub + GH search turn up no results.
Oh no, the breaking change is that `EfficientNet` literally returns a different model 😅
This PR is complete, I think - any further docs improvements would be best handled in a separate PR after landing this one and then #199
Just a few final naming things. Also, we can tackle more thorough docs including docstrings separately.
src/convnets/builders/mbconv.jl (outdated diff)
```julia
stochastic_depth_prob = nothing, norm_layer = BatchNorm,
divisor::Integer = 8, kwargs...)
```
Do we need to have unused keywords? As far as I can tell, right now it is silently ignored, but not including extraneous keywords would throw a `MethodError` (as it should).
These are being used though, to pass stuff like the `se_round_fn` in for MobileNetv3.
I specifically mean stuff like `stochastic_depth_prob`
> I specifically mean stuff like `stochastic_depth_prob`

This particular one needs to be there because it will cause a `MethodError` for `dwsep_conv_norm` if passed through (and the default model builder passes it through). See also #200 (comment)
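A toy sketch of the tradeoff being discussed (the function and keyword names here are stand-ins, not Metalhead's actual signatures):

```julia
strict(x; norm_layer = identity) = norm_layer(x)
lenient(x; norm_layer = identity, kwargs...) = norm_layer(x)  # extras swallowed

strict(1.0; stochastic_depth_prob = 0.2)   # MethodError: unsupported keyword argument
lenient(1.0; stochastic_depth_prob = 0.2)  # runs, silently dropping the keyword
```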
Bump?
```julia
stochastic_depth_prob = nothing, norm_layer = BatchNorm,
divisor::Integer = 8, kwargs...)
width_mult, depth_mult = scalings
block_repeats = [ceil(Int, block_configs[idx][end - 1] * depth_mult)
```
For MobileNet the config list is

```julia
const MOBILENETV2_CONFIGS = [
    (mbconv, 3, 16, 1, 1, 1, nothing, relu6),
    (mbconv, 3, 24, 6, 2, 2, nothing, relu6),
    (mbconv, 3, 32, 6, 2, 3, nothing, relu6),
    (mbconv, 3, 64, 6, 2, 4, nothing, relu6),
    (mbconv, 3, 96, 6, 1, 3, nothing, relu6),
    (mbconv, 3, 160, 6, 2, 3, nothing, relu6),
    (mbconv, 3, 320, 6, 1, 1, nothing, relu6),
]
```

so won't `block_repeats` just be a list of `nothing`? And, as the name suggests, doesn't it tell how many times a particular block of layers is repeated?
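To make the indexing question concrete, here is a stand-in tuple (symbols replace the real functions so this runs without Metalhead):

```julia
cfg = (:mbconv, 3, 16, 1, 1, 1, nothing, :relu6)   # mirrors one config entry above
cfg[end - 1]   # => nothing, so ceil(Int, cfg[end - 1] * depth_mult) would error
```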
This PR unifies the lower-level API for the EfficientNet and MobileNet model families into a single `irmodelbuilder` function. This followed as a natural consequence of #198: a considerable amount of the calculations, model structure and configurations have now been abstracted out into pieces that compose, meaning that most of the models in these two families can be almost completely identified by their configuration dict and a couple of other arguments. Amongst other things, what this meant was:

Documentation TODO
Miscellaneous Notes
In a step closer to a uniform API, the higher-level model APIs now all accept the `pretrain` keyword argument.
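A hedged usage sketch of that uniform keyword; the constructor names match Metalhead's documented high-level API, but whether pretrained weights exist for a given configuration is not guaranteed here:

```julia
using Metalhead

m1 = MobileNetv3(:small; pretrain = false)   # set pretrain = true to load weights
m2 = EfficientNet(:b0; pretrain = false)
```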