Add FAN Model #20417

kiansierra · 2022-11-23T15:19:47Z

What does this PR do?

Implements the FAN models described in this paper and available in the following GitHub repo. Additionally, this repo has some of the weights available, as described in its README file.

Before submitting

This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
Did you read the contributor guideline, Pull Request section?
Was this discussed/approved via a GitHub issue or the forum? Please add a link to it if that's the case.
Did you make sure to update the documentation with your changes? Here are the documentation guidelines, and here are tips on formatting docstrings.
Did you write any new necessary tests?

This is a cleanup of the previous PR #20288 in order to maintain branch integrity. The recommendations by @NielsRogge were implemented.

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

@NielsRogge, @sgugger, @patrickvonplaten

Additional Request

If this PR gets merged, would it be possible to migrate the model files from my HF space to the nvidia space?

sgugger

Thanks for your PR. There is a lot to do to clean up the modeling code, and I have started to give pointers. In general, we require modeling code to be as explicit as possible for readability, which means:

don't use short intermediate functions but directly do the thing when you use that intermediate function, so the reader does not have to go back and forth to understand what is happening.
don't use one-letter variable names
all building blocks of the model should be prefixed by Fan and initialized from the model config as much as possible
avoid flags like linear=False that trigger different code-paths.

sgugger · 2022-11-28T16:31:49Z

README.md

 1. **[EncoderDecoder](https://huggingface.co/docs/transformers/model_doc/encoder-decoder)** (from Google Research) released with the paper [Leveraging Pre-trained Checkpoints for Sequence Generation Tasks](https://arxiv.org/abs/1907.12461) by Sascha Rothe, Shashi Narayan, Aliaksei Severyn.
 1. **[ERNIE](https://huggingface.co/docs/transformers/model_doc/ernie)** (from Baidu) released with the paper [ERNIE: Enhanced Representation through Knowledge Integration](https://arxiv.org/abs/1904.09223) by Yu Sun, Shuohuan Wang, Yukun Li, Shikun Feng, Xuyi Chen, Han Zhang, Xin Tian, Danxiang Zhu, Hao Tian, Hua Wu.
 1. **[ESM](https://huggingface.co/docs/transformers/model_doc/esm)** (from Meta AI) are transformer protein language models.  **ESM-1b** was released with the paper [Biological structure and function emerge from scaling unsupervised learning to 250 million protein sequences](https://www.pnas.org/content/118/15/e2016239118) by Alexander Rives, Joshua Meier, Tom Sercu, Siddharth Goyal, Zeming Lin, Jason Liu, Demi Guo, Myle Ott, C. Lawrence Zitnick, Jerry Ma, and Rob Fergus. **ESM-1v** was released with the paper [Language models enable zero-shot prediction of the effects of mutations on protein function](https://doi.org/10.1101/2021.07.09.450648) by Joshua Meier, Roshan Rao, Robert Verkuil, Jason Liu, Tom Sercu and Alexander Rives. **ESM-2 and ESMFold** were released with the paper [Language models of protein sequences at the scale of evolution enable accurate structure prediction](https://doi.org/10.1101/2022.07.20.500902) by Zeming Lin, Halil Akin, Roshan Rao, Brian Hie, Zhongkai Zhu, Wenting Lu, Allan dos Santos Costa, Maryam Fazel-Zarandi, Tom Sercu, Sal Candido, Alexander Rives.
+1. **[FAN](https://huggingface.co/docs/transformers/model_doc/fan)** (from NVIDIA) was released with the paper [Understanding The Robustness in Vision Transformers](https://arxiv.org/abs/2204.12451) by Daquan Zhou, Zhiding Yu, Enze Xie, Chaowei Xiao, Anima Anandkumar, Jiashi Feng and Jose M. Alvarez. Original code can be found in the repository [NVlabs/FAN](https://github.com/NVlabs/FAN).


Suggested change

1. **[FAN](https://huggingface.co/docs/transformers/model_doc/fan)** (from NVIDIA) was released with the paper [Understanding The Robustness in Vision Transformers](https://arxiv.org/abs/2204.12451) by Daquan Zhou, Zhiding Yu, Enze Xie, Chaowei Xiao, Anima Anandkumar, Jiashi Feng and Jose M. Alvarez. Original code can be found in the repository [NVlabs/FAN](https://github.com/NVlabs/FAN).

1. **[FAN](https://huggingface.co/docs/transformers/main/model_doc/fan)** (from NVIDIA) was released with the paper [Understanding The Robustness in Vision Transformers](https://arxiv.org/abs/2204.12451) by Daquan Zhou, Zhiding Yu, Enze Xie, Chaowei Xiao, Anima Anandkumar, Jiashi Feng and Jose M. Alvarez. Original code can be found in the repository [NVlabs/FAN](https://github.com/NVlabs/FAN).

Then you will need to run make fix-copies again to fix the other READMEs :-)

docs/source/en/model_doc/fan.mdx

src/transformers/__init__.py

src/transformers/models/auto/configuration_auto.py

sgugger · 2022-11-28T16:46:47Z

src/transformers/models/fan/modeling_fan.py

+
+    def __init__(self, img_size=224, patch_size=16, in_chans=3, hidden_size=768, act_layer=nn.GELU):
+        super().__init__()
+        img_size = to_2tuple(img_size)


Here make an explicit test:

img_size if isinstance(img_size, collections.abc.Iterable) else (img_size, img_size)
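To illustrate the suggestion, here is a minimal sketch of the explicit check inlined into a constructor. The class name `FanPatchEmbeddingsSketch` is hypothetical and is a plain Python stand-in rather than the PR's actual `nn.Module`. Applying the same check to `patch_size` and normalizing with `tuple()` are assumptions added for the example.

```python
import collections.abc


class FanPatchEmbeddingsSketch:
    """Hypothetical stand-in for the patch-embedding module; only the size handling is shown."""

    def __init__(self, img_size=224, patch_size=16):
        # Explicit check written inline, in place of the to_2tuple helper.
        img_size = img_size if isinstance(img_size, collections.abc.Iterable) else (img_size, img_size)
        # Assumption: patch_size gets the same treatment.
        patch_size = patch_size if isinstance(patch_size, collections.abc.Iterable) else (patch_size, patch_size)
        # tuple() is an extra normalization step so that lists and tuples are stored alike.
        self.img_size = tuple(img_size)
        self.patch_size = tuple(patch_size)


embeddings = FanPatchEmbeddingsSketch(img_size=224, patch_size=(16, 8))
print(embeddings.img_size, embeddings.patch_size)
```

Note that strings are also instances of `collections.abc.Iterable`, so this check assumes the argument is either an integer or a sequence of integers.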

sgugger · 2022-11-28T16:47:21Z

src/transformers/models/fan/modeling_fan.py

+
+    def forward(self, x, return_feat=False):
+        x = self.proj(x)
+        Hp, Wp = x.shape[2], x.shape[3]


Please use explicit variable names that respect Python conventions (capitals are for class names)
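As a rough sketch of what this could look like, the snippet below unpacks a shape into lowercase, descriptive names. The shape values and the names `height`, `width`, and `num_patches` are assumptions chosen for illustration, not the names adopted in the PR, and a plain tuple stands in for the tensor shape.

```python
# Hypothetical (batch_size, num_channels, height, width) shape of the projected feature map.
projected_shape = (2, 768, 14, 14)

# Before: Hp, Wp = x.shape[2], x.shape[3]
# After: lowercase, descriptive names; capitalized names are reserved for classes.
batch_size, num_channels, height, width = projected_shape
num_patches = height * width

print(height, width, num_patches)
```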

src/transformers/models/fan/modeling_fan.py

sgugger · 2022-11-28T16:48:03Z

src/transformers/models/fan/modeling_fan.py

+
+
+# Copied from timm.models.layers.mlp
+class MlpOri(nn.Module):


Should have an explicit name and be prefixed by Fan.

kiansierra · 2022-12-03T15:36:07Z

Hi @sgugger, thanks for your feedback. I'll try to implement the changes soon.

kiansierra · 2022-12-09T12:38:00Z

Implemented suggestions by @sgugger.

Pending the change on the README.md path, since I'm uncertain if I need to change only the README.md path or the actual doc path.

Also pending rebase

sgugger · 2022-12-09T15:39:18Z

Thanks for working on this! You need to change the link to the doc in the READMEs as suggested, but not the path to the file. You will also need to rebase/resolve the conflicts.

@NielsRogge could you have a review before I do a final pass?

kiansierra · 2022-12-14T09:04:50Z

I've applied the README.md update and rebased the branch.

kiansierra · 2023-01-03T09:38:36Z

Hi @NielsRogge, @sgugger.

First of all, happy new year. I hope 2023 is an even greater success than 2022 was for the Hugging Face team.
I've resolved the merge conflicts, and was hoping to know whether any additional steps are required for this PR.

HuggingFaceDocBuilderDev · 2023-01-03T09:52:30Z

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint.

github-actions · 2023-01-27T15:02:29Z

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

kiansierra changed the title ~~Add fan~~ Add FAN Model Nov 24, 2022

sgugger reviewed Nov 28, 2022


kiansierra added 25 commits December 14, 2022 09:52

created FAN Model

1b1c98d

config and models weight similar to fan

544c8cd

modelling has FAN Encoder

454fab0

AutModel for Image Classification available

172d613

model takes care of different rounding for classification and segment…

9bc1f86

…ation

forward pass matches, prior to changing gamma for weight

a7ae620

renamed gamma for weight and from_pretrained load correctly

f4da8df

current state pre cleanup

4dbe3fd

classifiers mapping and working correctly

e0e327e

renamed batch size and added return_feat forward kwarg for ConvPatchE…

98d466f

…mbed

embeddings working

339c61d

Adding encoder block

de813b8

encoder layer appears to be functional

3075cc2

fixed use checkpoint

31688bf

commit pre upload checkpoints

d965b6d

fixed init conflicts

5436d03

added output attention and timm layers

cd2377a

Changed version for stable instalation

d75ffe2

added FANForSemanticSegmentation to automodels

1008b27

first doc draft

8aba253

updated config

8b40764

remapped one feature num_heads to num_attention_heads

2af75fd

renamed configuration parameters dropout_ration and depth

cefec76

removed sr_ratio from config file, since it is unused

5a3e793

removed unused sr_ratio from modeling

ee4666a

kiansierra added 18 commits December 14, 2022 09:58

added NVIDIA to modeling copyright

de5ec33

updated arxiv link in comments

41cad1e

removed FANLayer and load_tf_weights_in_fan

f1f3edb

updated to include ImageProcessor

7a90ca1

updated README.md to include FAN

59eed1d

applied styling formats

a13637e

updated test to require vision

07c9e19

applied make style

cad76fd

removed FANFeatureExtractor from examples and docs

bafd48b

applied make style

e4d4d2d

adding FANFeatureExtractor to docs

f1e6e0c

make all hidden states 3 dimensional

4f98398

removed competed todo tasks

c6fb2ab

set defaults to match config off fan_base_18_p16_224

1702637

fixing modeling superclass

39e5c8f

changed copied from line to pass make

3b1cf30

make repo-consistency changes

7d7cbb2

applied README fix

61c8e2b

kiansierra force-pushed the add-fan branch from 2d57df6 to 61c8e2b Compare December 14, 2022 09:00

kiansierra and others added 4 commits December 14, 2022 10:17

applied sorting to import structure and make styling

7658055

squashed all commits

0a4a56e

Merge branch 'main' into add-fan

b27357f

Merge branch 'main' into add-fan

183f2db

kiansierra added 2 commits January 3, 2023 10:52

applied make style

51483db

applied fix-copies

60a6b4c

github-actions bot closed this Feb 5, 2023
