Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add EfficientNet Architecture in TorchVision #4293

Merged
merged 23 commits into from
Aug 26, 2021

Conversation

datumbox
Copy link
Contributor

@datumbox datumbox commented Aug 19, 2021

Fixes #980

The implementation follows a similar approach as MobileNetV3 and reuses components from existing models.

The weights are ported from the repos of @rwightman and @lukemelas.

@datumbox datumbox force-pushed the models/efficientnet branch 2 times, most recently from 95dedaf to edbe693 Compare August 20, 2021 07:59
@rvandeghen
Copy link
Contributor

Hi,

Do you know when this will be released ?

@datumbox
Copy link
Contributor Author

@rvandeghen No concrete date yet but as you can see I'm working on it. It's still early days and the architecture needs to be verified for all variants, add pre-trained weights etc.

Subscribe to this PR to get notified when it's merged.

@datumbox datumbox marked this pull request as ready for review August 25, 2021 13:27
@datumbox datumbox requested a review from fmassa August 25, 2021 13:27
@datumbox
Copy link
Contributor Author

The failing tests are not related.

@rwightman
Copy link
Contributor

rwightman commented Aug 25, 2021

@datumbox thanks for putting the mention of weight sources in doc file and weight links comment, the B0-B4 took some time on my part to train. The B0, B3, and B4 that I trained are all better than the original TF baseline, RandAugment, and AdvProp + AA weights on ImageNet-1k, the B1 and B2 are earlier attempts that are decent, but not to the same level, I didn't follow up on since I rarely use those model sizes.

Re those B5-B7 weights, I believe those are just the weights from the original tensorflow models (https://github.com/tensorflow/tpu/tree/master/models/official/efficientnet, equivalent to my tf_ weights) as ported by Luke. They need assymetric TF 'SAME' padding for full performance and will suffer reduced accuracy without. That drop will vary as you change the input size and have different padding offsets through the model relative to what SAME would produce. That can be more problematic using them as backbones for say obj detection / segmentation, esp if you freeze some of the layers.

I'm always open to training B5-B7/B8 but would need some sort of business arrangement + someone elses compute resources as it'd would be a significant commitment in time and compute to iterate and produce good weights.

@datumbox
Copy link
Contributor Author

@rwightman No worries, thanks for your work on it. :)

Indeed B5-B7 require the "same" padding. We were debating whether to add code to handle this in our implementation but we decided not to because a) the degradation is very small (0.1-0.2 points), b) it's slower and c) as you said it can be resolved by retraining the models on the future. For now, users who want to faithfully reproduce the models can use your implementation.

Concerning B1 and B2, yes I noticed that they can be further optimized but I think the models are good enough for now. We are happy to review them once the #3911 epic is completed as it will add lots of standard utils in TorchVision to achieve SOTA results. Some of them are added directly in PyTorch (LabelSmoothing, Warmup schedulers etc), so it might be feasible to reduce the amount of code you maintain on timm and help you focus on bringing more models.

Copy link
Member

@fmassa fmassa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great, thanks!

Comment on lines +86 to +97
resize_size, crop_size = 256, 224
interpolation = InterpolationMode.BILINEAR
if args.model == 'inception_v3':
resize_size, crop_size = 342, 299
elif args.model.startswith('efficientnet_'):
sizes = {
'b0': (256, 224), 'b1': (256, 240), 'b2': (288, 288), 'b3': (320, 300),
'b4': (384, 380), 'b5': (456, 456), 'b6': (528, 528), 'b7': (600, 600),
}
e_type = args.model.replace('efficientnet_', '')
resize_size, crop_size = sizes[e_type]
interpolation = InterpolationMode.BICUBIC
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Discussed with @datumbox on chat, I think it would be good in the future to factor this out somewhere else, maybe as a set of custom preset transforms

from .._internally_replaced_utils import load_state_dict_from_url
from torchvision.ops import StochasticDepth

from torchvision.models.mobilenetv2 import ConvBNActivation, _make_divisible
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: we might want to put some of those helper functions elsewhere in the future. ConvBNActivation could even be in torchvision.ops

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed. I want to defer this until Batteries Included is completed to see how many ops/layers we need and then refactor.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Possibly, A right place for helpers could be torchvision.layers or torchvision.nn.
One small reason why I think ops might not be good place is to distinguish post and preprocessing operations such as nms, IoU, box operations from generic layers that build models.

Torchtext does something similar https://github.com/pytorch/text/tree/main/torchtext/nn

There are a few candidates. such as ConvBNActivation, ConvActBN, squeezeExcite, MLP , to name a few.

Copy link
Contributor Author

@datumbox datumbox Aug 26, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah we need to talk about this. I might also need to move the StochasticDepth layer from ops for exactly the same reason. Do you want to open an issue with the potential things we want to share across models along with their location? I remember there was an old issue asking about having nn but I would probably open a new one with increased scope (sharing blocks across models).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I will open a new issue and also list down potential things we would like to share 😃

@datumbox datumbox merged commit 37a9ee5 into pytorch:main Aug 26, 2021
@datumbox datumbox deleted the models/efficientnet branch August 26, 2021 10:03
facebook-github-bot pushed a commit that referenced this pull request Sep 9, 2021
Summary:
* Adding code skeleton

* Adding MBConvConfig.

* Extend SqueezeExcitation to support custom min_value and activation.

* Implement MBConv.

* Replace stochastic_depth with operator.

* Adding the rest of the EfficientNet implementation

* Update torchvision/models/efficientnet.py

* Replacing 1st activation of SE with SiLU.

* Adding efficientnet_b3.

* Replace mobilenetv3 assets with custom.

* Switch to standard sigmoid and reconfiguring BN.

* Reconfiguration of efficientnet.

* Add repr

* Add weights.

* Update weights.

* Adding B5-B7 weights.

* Update docs and hubconf.

* Fix doc link.

* Fix typo on comment.

Reviewed By: fmassa

Differential Revision: D30793344

fbshipit-source-id: 74b5fed89fd251372d17234d33984b71abd1a860
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Feature Request] Add EfficientNet
6 participants