-
Notifications
You must be signed in to change notification settings - Fork 6.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add EfficientNet Architecture in TorchVision #4293
Conversation
4e5957f
to
411ce25
Compare
95dedaf
to
edbe693
Compare
edbe693
to
bb1bb17
Compare
Hi, Do you know when this will be released ? |
@rvandeghen No concrete date yet but as you can see I'm working on it. It's still early days and the architecture needs to be verified for all variants, add pre-trained weights etc. Subscribe to this PR to get notified when it's merged. |
75c132a
to
8ff7604
Compare
12fa4d2
to
14ce91f
Compare
The failing tests are not related. |
def1c52
to
86f812a
Compare
86f812a
to
7f8dae3
Compare
@datumbox thanks for putting the mention of weight sources in doc file and weight links comment, the B0-B4 took some time on my part to train. The B0, B3, and B4 that I trained are all better than the original TF baseline, RandAugment, and AdvProp + AA weights on ImageNet-1k, the B1 and B2 are earlier attempts that are decent, but not to the same level, I didn't follow up on since I rarely use those model sizes. Re those B5-B7 weights, I believe those are just the weights from the original tensorflow models (https://github.com/tensorflow/tpu/tree/master/models/official/efficientnet, equivalent to my I'm always open to training B5-B7/B8 but would need some sort of business arrangement + someone elses compute resources as it'd would be a significant commitment in time and compute to iterate and produce good weights. |
@rwightman No worries, thanks for your work on it. :) Indeed B5-B7 require the "same" padding. We were debating whether to add code to handle this in our implementation but we decided not to because a) the degradation is very small (0.1-0.2 points), b) it's slower and c) as you said it can be resolved by retraining the models on the future. For now, users who want to faithfully reproduce the models can use your implementation. Concerning B1 and B2, yes I noticed that they can be further optimized but I think the models are good enough for now. We are happy to review them once the #3911 epic is completed as it will add lots of standard utils in TorchVision to achieve SOTA results. Some of them are added directly in PyTorch (LabelSmoothing, Warmup schedulers etc), so it might be feasible to reduce the amount of code you maintain on timm and help you focus on bringing more models. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great, thanks!
resize_size, crop_size = 256, 224 | ||
interpolation = InterpolationMode.BILINEAR | ||
if args.model == 'inception_v3': | ||
resize_size, crop_size = 342, 299 | ||
elif args.model.startswith('efficientnet_'): | ||
sizes = { | ||
'b0': (256, 224), 'b1': (256, 240), 'b2': (288, 288), 'b3': (320, 300), | ||
'b4': (384, 380), 'b5': (456, 456), 'b6': (528, 528), 'b7': (600, 600), | ||
} | ||
e_type = args.model.replace('efficientnet_', '') | ||
resize_size, crop_size = sizes[e_type] | ||
interpolation = InterpolationMode.BICUBIC |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Discussed with @datumbox on chat, I think it would be good in the future to factor this out somewhere else, maybe as a set of custom preset transforms
from .._internally_replaced_utils import load_state_dict_from_url | ||
from torchvision.ops import StochasticDepth | ||
|
||
from torchvision.models.mobilenetv2 import ConvBNActivation, _make_divisible |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: we might want to put some of those helper functions elsewhere in the future. ConvBNActivation
could even be in torchvision.ops
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed. I want to defer this until Batteries Included is completed to see how many ops/layers we need and then refactor.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Possibly, A right place for helpers could be torchvision.layers
or torchvision.nn
.
One small reason why I think ops might not be good place is to distinguish post and preprocessing operations such as nms, IoU, box operations from generic layers that build models.
Torchtext does something similar https://github.com/pytorch/text/tree/main/torchtext/nn
There are a few candidates. such as ConvBNActivation, ConvActBN, squeezeExcite, MLP , to name a few.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah we need to talk about this. I might also need to move the StochasticDepth layer from ops for exactly the same reason. Do you want to open an issue with the potential things we want to share across models along with their location? I remember there was an old issue asking about having nn
but I would probably open a new one with increased scope (sharing blocks across models).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I will open a new issue and also list down potential things we would like to share 😃
Summary: * Adding code skeleton * Adding MBConvConfig. * Extend SqueezeExcitation to support custom min_value and activation. * Implement MBConv. * Replace stochastic_depth with operator. * Adding the rest of the EfficientNet implementation * Update torchvision/models/efficientnet.py * Replacing 1st activation of SE with SiLU. * Adding efficientnet_b3. * Replace mobilenetv3 assets with custom. * Switch to standard sigmoid and reconfiguring BN. * Reconfiguration of efficientnet. * Add repr * Add weights. * Update weights. * Adding B5-B7 weights. * Update docs and hubconf. * Fix doc link. * Fix typo on comment. Reviewed By: fmassa Differential Revision: D30793344 fbshipit-source-id: 74b5fed89fd251372d17234d33984b71abd1a860
Fixes #980
The implementation follows a similar approach as MobileNetV3 and reuses components from existing models.
The weights are ported from the repos of @rwightman and @lukemelas.