
Efficientformer #20459

Merged: 34 commits into huggingface:main on Jan 20, 2023
Conversation

@Bearnardd (Contributor) commented Nov 25, 2022

This PR adds EfficientFormer, a model with latency comparable to MobileNets that achieves better accuracy on ImageNet. It is based on the closed PR #18296.

Paper: https://arxiv.org/abs/2206.01191
Code and weights: https://github.com/snap-research/EfficientFormer

Fixes #18041
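
For context, a minimal inference sketch of the model this PR adds; the checkpoint name is assumed from the snap-research release and is not part of this PR's diff:

    # Hedged usage sketch; checkpoint name assumed, not taken from this PR.
    import requests
    import torch
    from PIL import Image
    from transformers import AutoImageProcessor, EfficientFormerForImageClassification

    image = Image.open(
        requests.get("http://images.cocodataset.org/val2017/000000039769.jpg", stream=True).raw
    )

    checkpoint = "snap-research/efficientformer-l1-300"  # assumed checkpoint name
    processor = AutoImageProcessor.from_pretrained(checkpoint)
    model = EfficientFormerForImageClassification.from_pretrained(checkpoint)

    inputs = processor(images=image, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits
    # Highest-scoring ImageNet label
    print(model.config.id2label[logits.argmax(-1).item()])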

Who can review?

@alaradirik @NielsRogge

@alaradirik (Contributor)
Hey @Bearnardd, thank you for working on this! Could you run make fixup to fix the failed style and code quality tests?

Also, type annotations on function arguments (e.g. def something(arg1: torch.Tensor):) cause errors when the type comes from a conditionally imported library (torch); you can see the failed test logs under the CI test details. Could you remove those annotations from test_modeling_efficientformer.py?
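
To illustrate the failure mode: annotations are evaluated when the function is defined, so a bare torch.Tensor annotation raises a NameError at import time if torch was never imported. A hedged sketch (the function name is illustrative); the fix requested here is simply to drop the annotation, though quoting it also defers evaluation:

    # Sketch of the failure mode; `something` is an illustrative name.
    from transformers.utils import is_torch_available

    if is_torch_available():
        import torch

    # Evaluated at definition time, so this raises NameError on import
    # when torch is unavailable:
    # def something(arg1: torch.Tensor): ...

    # Dropping the annotation fixes it; quoting it would also work,
    # since string annotations are not evaluated at import time:
    def something(arg1: "torch.Tensor"):
        return arg1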

@alaradirik mentioned this pull request on Nov 29, 2022.
@alaradirik (Contributor) left a comment

Thanks for working on this @Bearnardd! I left a couple of comments to fix inconsistencies and follow best practices. Most of them are easy to address; the main issues are:

  • EfficientFormerLastStage could use some restructuring to get rid of nn.Sequential (see the sketch after this list).
  • test_feature_extraction_efficientformer.py is missing (it can be copied from ViT).
  • EfficientFormerForImageClassificationWithTeacher uses a frozen external model (RegNetY-16GF) for distillation; I'm not sure the implementation is accurate, as it consists of a single dense layer that is trained alongside the model.
  • Some common modeling tests (defined in tests/test_modeling_common.py) are failing; you can run all of them with:
    pytest tests/models/efficientformer/test_modeling_efficientformer.py
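
On the first bullet, a hedged sketch of the kind of restructuring meant here (module and block names are illustrative, not the PR's actual code): named submodules plus an explicit forward pass make it straightforward to collect intermediate hidden states, which nn.Sequential hides.

    import torch
    from torch import nn

    class EfficientFormerLastStageSketch(nn.Module):
        # Illustrative restructuring only: explicit blocks instead of
        # nn.Sequential, so intermediate outputs can be surfaced.

        def __init__(self, dim: int, num_blocks: int):
            super().__init__()
            # Placeholder blocks; the real stage uses meta/attention blocks.
            self.blocks = nn.ModuleList([nn.Linear(dim, dim) for _ in range(num_blocks)])

        def forward(self, hidden_states: torch.Tensor):
            all_hidden_states = ()
            for block in self.blocks:
                hidden_states = block(hidden_states)
                # Every intermediate output is now returnable, which the
                # common output_hidden_states tests expect.
                all_hidden_states = all_hidden_states + (hidden_states,)
            return hidden_states, all_hidden_states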

Hope this helps :)

(Resolved review threads on src/transformers/__init__.py, README_ko.md, docs/source/en/index.mdx, and docs/source/en/model_doc/efficientformer.mdx ×2.)
@Bearnardd (Contributor, Author)
Hi @alaradirik - thank you very much for the detailed review! I will address the changes shortly :) I am aware of the failing tests, but I am not entirely sure how to count the expected number of attentions and hidden layers for this particular model, since it does not have a "standard" transformer-based architecture. I will address the current comments first, and as a next step ask you some questions about the expected attention and hidden-state outputs.

@alaradirik (Contributor)
Hey @Bearnardd, no problem at all! We define our own small model architecture within the test_modeling_efficientformer.py file, as it is faster to test with a small dummy model. You just need to check the num_hidden_layers and num_attention_heads attributes of the test class to see the expected numbers. It seems the model has the correct number of attention heads and hidden layers but doesn't return all of the outputs (attentions and hidden states from all layers).

If you are sure the implementation is correct and this is expected (in this case or other cases), you can always override the common tests within test_modeling_efficientformer.py by adding a method with the same name to the test class (EfficientFormerModelTester).
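
A hedged sketch of such an override; CommonTests stands in for ModelTesterMixin (defined in tests/test_modeling_common.py), and the method name is illustrative:

    import unittest

    class CommonTests:
        # Stand-in for ModelTesterMixin from tests/test_modeling_common.py.
        def test_attention_outputs(self):
            raise AssertionError("generic expectation does not fit this model")

    class EfficientFormerModelTestSketch(CommonTests, unittest.TestCase):
        # Redefining the same method name shadows the common test for this
        # model only.
        def test_attention_outputs(self):
            # A model-specific expectation would go here, e.g. attention maps
            # only from the final transformer stage rather than every layer.
            self.assertTrue(True)

    if __name__ == "__main__":
        unittest.main()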

@HuggingFaceDocBuilderDev commented Dec 19, 2022

The documentation is not available anymore as the PR was closed or merged.

@alaradirik (Contributor) left a comment

The changes look good to me!

I left a few comments, including a code snippet to fix the model initialization test failure. My main comment concerns the feature extractor: could you rename EfficientFormerFeatureExtractor to EfficientFormerImageProcessor?

We are changing the naming to avoid confusion, as users sometimes think the feature extractor is a model itself rather than the preprocessor. The renaming requires changing the filename (to image_processing_efficientformer.py) and the test filename, replacing all instances of feature extraction imports with the image processor, and adding it to models/auto/image_processing_auto.py (a sketch follows below).

You can see an example of this in this PR.
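
A rough sketch of the auto-mapping step; the mapping variable and the neighboring entry are assumed from image_processing_auto.py at the time, not copied from this PR's diff:

    from collections import OrderedDict

    # Sketch of the entry added to
    # src/transformers/models/auto/image_processing_auto.py; the mapping
    # name and the "beit" entry are assumptions, and the full table is
    # abbreviated here.
    IMAGE_PROCESSOR_MAPPING_NAMES = OrderedDict(
        [
            ("beit", "BeitImageProcessor"),
            ("efficientformer", "EfficientFormerImageProcessor"),
        ]
    )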

(Review threads on src/transformers/models/auto/modeling_auto.py, docs/source/en/model_doc/efficientformer.mdx, README_hd.md, and README_zh-hans.md.)
@sgugger (Collaborator) commented Jan 8, 2023

Thanks @Bearnardd !
@NielsRogge I'll let you have one last look and merge if you're happy :-)

@alaradirik (Contributor) left a comment

The PR is almost ready to be merged and looks great overall!

I left a few comments to update all repo names and fix the 3 failing slow tests. You can run all tests, including the slow ones, with:
RUN_SLOW=True pytest tests/models/efficientformer/test_modeling_efficientformer.py

@alaradirik merged commit 1b37fb5 into huggingface:main on Jan 20, 2023.
ts2095 pushed a commit to ts2095/transformers referencing this pull request on Jan 20, 2023:

- Adds EfficientFormer V1 to transformers
- PR co-authored by @novice03 and @Bearnardd

Co-authored-by: novice <pranavpulijala@gmail.com>
Co-authored-by: novice <44259234+novice03@users.noreply.github.com>
@novice03 (Contributor)
Thank you so much for working on this @Bearnardd!

venkat-natchi (Jan 22, 2023) and miyu386 (Feb 9, 2023) pushed commits to their forks of transformers referencing this pull request with the same message.