Add FAN Model #20417
Conversation
sgugger
left a comment
Thanks for your PR. There is a lot to do to clean up the modeling code; I have started to give pointers. In general, we require modeling code to be as explicit as possible for readability, which means:
- don't use short intermediate functions; instead, inline the logic where it is used, so the reader does not have to go back and forth to understand what is happening.
- don't use one-letter variable names
- all building blocks of the model should be prefixed by `Fan` and initialized from the model config as much as possible
- avoid flags like `linear=False` that trigger different code paths.
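To illustrate the guidance above, here is a minimal sketch of a config-driven, `Fan`-prefixed building block. All names and fields here are hypothetical (the real `FanConfig` and its attributes would be defined in the PR's configuration file), and the class is deliberately torch-free to keep the sketch self-contained:

```python
from dataclasses import dataclass

# Hypothetical subset of config fields; field names are assumptions,
# not the actual FanConfig API.
@dataclass
class FanConfig:
    image_size: int = 224
    patch_size: int = 16
    hidden_size: int = 768

class FanPatchEmbeddings:
    """Every building block carries the Fan prefix and reads its sizes
    from the model config rather than from loose keyword arguments."""

    def __init__(self, config: FanConfig):
        # explicit, descriptive names; no one-letter variables
        self.patch_size = config.patch_size
        self.num_patches = (config.image_size // config.patch_size) ** 2
```

In the real PR, such a block would subclass `nn.Module` and be instantiated as `FanPatchEmbeddings(config)` by the parent `FanModel`.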
README.md
Outdated
1. **[EncoderDecoder](https://huggingface.co/docs/transformers/model_doc/encoder-decoder)** (from Google Research) released with the paper [Leveraging Pre-trained Checkpoints for Sequence Generation Tasks](https://arxiv.org/abs/1907.12461) by Sascha Rothe, Shashi Narayan, Aliaksei Severyn.
1. **[ERNIE](https://huggingface.co/docs/transformers/model_doc/ernie)** (from Baidu) released with the paper [ERNIE: Enhanced Representation through Knowledge Integration](https://arxiv.org/abs/1904.09223) by Yu Sun, Shuohuan Wang, Yukun Li, Shikun Feng, Xuyi Chen, Han Zhang, Xin Tian, Danxiang Zhu, Hao Tian, Hua Wu.
1. **[ESM](https://huggingface.co/docs/transformers/model_doc/esm)** (from Meta AI) are transformer protein language models. **ESM-1b** was released with the paper [Biological structure and function emerge from scaling unsupervised learning to 250 million protein sequences](https://www.pnas.org/content/118/15/e2016239118) by Alexander Rives, Joshua Meier, Tom Sercu, Siddharth Goyal, Zeming Lin, Jason Liu, Demi Guo, Myle Ott, C. Lawrence Zitnick, Jerry Ma, and Rob Fergus. **ESM-1v** was released with the paper [Language models enable zero-shot prediction of the effects of mutations on protein function](https://doi.org/10.1101/2021.07.09.450648) by Joshua Meier, Roshan Rao, Robert Verkuil, Jason Liu, Tom Sercu and Alexander Rives. **ESM-2 and ESMFold** were released with the paper [Language models of protein sequences at the scale of evolution enable accurate structure prediction](https://doi.org/10.1101/2022.07.20.500902) by Zeming Lin, Halil Akin, Roshan Rao, Brian Hie, Zhongkai Zhu, Wenting Lu, Allan dos Santos Costa, Maryam Fazel-Zarandi, Tom Sercu, Sal Candido, Alexander Rives.
1. **[FAN](https://huggingface.co/docs/transformers/model_doc/fan)** (from NVIDIA) was released with the paper [Understanding The Robustness in Vision Transformers](https://arxiv.org/abs/2204.12451) by Daquan Zhou, Zhiding Yu, Enze Xie, Chaowei Xiao, Anima Anandkumar, Jiashi Feng and Jose M. Alvarez. Original code can be found in the repository [NVlabs/FAN](https://github.com/NVlabs/FAN).
Suggested change:
1. **[FAN](https://huggingface.co/docs/transformers/model_doc/fan)** (from NVIDIA) was released with the paper [Understanding The Robustness in Vision Transformers](https://arxiv.org/abs/2204.12451) by Daquan Zhou, Zhiding Yu, Enze Xie, Chaowei Xiao, Anima Anandkumar, Jiashi Feng and Jose M. Alvarez. Original code can be found in the repository [NVlabs/FAN](https://github.com/NVlabs/FAN).
1. **[FAN](https://huggingface.co/docs/transformers/main/model_doc/fan)** (from NVIDIA) was released with the paper [Understanding The Robustness in Vision Transformers](https://arxiv.org/abs/2204.12451) by Daquan Zhou, Zhiding Yu, Enze Xie, Chaowei Xiao, Anima Anandkumar, Jiashi Feng and Jose M. Alvarez. Original code can be found in the repository [NVlabs/FAN](https://github.com/NVlabs/FAN).
Then you will need to run `make fix-copies` again to fix the other READMEs :-)
    def __init__(self, img_size=224, patch_size=16, in_chans=3, hidden_size=768, act_layer=nn.GELU):
        super().__init__()
        img_size = to_2tuple(img_size)
Here make an explicit test:
`img_size if isinstance(img_size, collections.abc.Iterable) else (img_size, img_size)`
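The suggested inline check can be exercised on its own. The function name below is illustrative only (in the model the expression would be written inline in `__init__`, not wrapped in a helper); the sketch assumes `img_size` is either an int or an iterable pair:

```python
import collections.abc

# Hypothetical standalone version of the suggested explicit test:
# a bare int becomes a square (height, width) pair, while an iterable
# is passed through unchanged.
def normalize_image_size(img_size):
    return img_size if isinstance(img_size, collections.abc.Iterable) else (img_size, img_size)
```

Writing the check explicitly, instead of calling an opaque helper like `to_2tuple`, lets the reader see the int-to-pair behavior without leaving the file.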
    def forward(self, x, return_feat=False):
        x = self.proj(x)
        Hp, Wp = x.shape[2], x.shape[3]
Please use explicit variable names that respect Python conventions (capitals are for class names)
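As a sketch of the requested rename (the names below are suggestions, not the final API), `Hp`/`Wp` could become descriptive snake_case names; the concrete shape values are illustrative:

```python
# A (batch, channels, height, width) shape as produced by the patch
# projection; the numbers here are placeholders for illustration.
embeddings_shape = (1, 768, 14, 14)

# descriptive snake_case names instead of Hp / Wp (capitals are
# reserved for class names in Python conventions)
patch_height, patch_width = embeddings_shape[2], embeddings_shape[3]
```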
    # Copied from timm.models.layers.mlp
    class MlpOri(nn.Module):
Should have an explicit name and be prefixed by Fan.
Hi @sgugger, thanks for your feedback. I'll try to implement the changes soon.
Implemented the suggestions by @sgugger. Pending the change on the README.md path, since I'm uncertain whether I need to change only the README.md link or the actual doc path. Also pending rebase.
Thanks for working on this! You need to change the link to the doc in the READMEs as suggested, but not the path to the file. You will also need to rebase/resolve the conflicts. @NielsRogge could you have a review before I do a final pass?
I've applied the README.md update and rebased the branch.
Hi @NielsRogge, @sgugger. First of all, happy new year; I hope 2023 is an even greater success than 2022 was for the Hugging Face team.
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint.
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread. Please note that issues that do not follow the contributing guidelines are likely to be ignored.
What does this PR do?
Fixes #17234
Implements the FAN models described in this paper and available in the following GitHub repo. Additionally, that repo has some of the weights available, as described in its README file.
Before submitting
- Did you read the contributor guideline, Pull Request section?
- Was this discussed/approved via a GitHub issue? Please add a link to it if that's the case.
- Did you make sure to update the documentation with your changes? Here are the documentation guidelines, and here are tips on formatting docstrings.
This is a cleanup of the previous PR #20288 in order to maintain branch integrity; the recommendations by @NielsRogge were implemented.
Who can review?
Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.
@NielsRogge, @sgugger, @patrickvonplaten
Additional Request
If this PR gets merged, would it be possible to migrate the model files from my HF space to the nvidia space?