Skip to content

Conversation

@kiansierra
Copy link
Contributor

What does this PR do?

Fixes #17234

Implements the FAN Models described in this paper and available in the following github repo, Additionally this repo has some of the weights available as described in their README file.

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline,
    Pull Request section?
  • Was this discussed/approved via a Github issue or the forum? Please add a link
    to it if that's the case.
  • Did you make sure to update the documentation with your changes? Here are the
    documentation guidelines, and
    here are tips on formatting docstrings.
  • Did you write any new necessary tests?

This is a cleanup from previous PR #20288 in order to mantain branch integrity, recommendations by @NielsRogge were implemented

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

@NielsRogge, @sgugger, @patrickvonplaten

Additional Request

If this PR gets merged, would it be possible to migrate the model files from my HF space to the nvidia space

@kiansierra kiansierra changed the title Add fan Add FAN Model Nov 24, 2022
Copy link
Collaborator

@sgugger sgugger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your PR. There is a lot to do to clean up the modeling code, I have started to give pointers. In general we require modeling code to be as explicit as possible for readability which means:

  • don't use short intermediate functions but directly do the thing when you use that intermediate function, so the reader does not have to go back and forth to understand what is happening.
  • don't use one-letter variable names
  • all building blocks of the model should be prefixed by Fan and initialized from the model config as much as possible
  • avoid flags like linear=False that trigger different code-paths.

README.md Outdated
1. **[EncoderDecoder](https://huggingface.co/docs/transformers/model_doc/encoder-decoder)** (from Google Research) released with the paper [Leveraging Pre-trained Checkpoints for Sequence Generation Tasks](https://arxiv.org/abs/1907.12461) by Sascha Rothe, Shashi Narayan, Aliaksei Severyn.
1. **[ERNIE](https://huggingface.co/docs/transformers/model_doc/ernie)** (from Baidu) released with the paper [ERNIE: Enhanced Representation through Knowledge Integration](https://arxiv.org/abs/1904.09223) by Yu Sun, Shuohuan Wang, Yukun Li, Shikun Feng, Xuyi Chen, Han Zhang, Xin Tian, Danxiang Zhu, Hao Tian, Hua Wu.
1. **[ESM](https://huggingface.co/docs/transformers/model_doc/esm)** (from Meta AI) are transformer protein language models. **ESM-1b** was released with the paper [Biological structure and function emerge from scaling unsupervised learning to 250 million protein sequences](https://www.pnas.org/content/118/15/e2016239118) by Alexander Rives, Joshua Meier, Tom Sercu, Siddharth Goyal, Zeming Lin, Jason Liu, Demi Guo, Myle Ott, C. Lawrence Zitnick, Jerry Ma, and Rob Fergus. **ESM-1v** was released with the paper [Language models enable zero-shot prediction of the effects of mutations on protein function](https://doi.org/10.1101/2021.07.09.450648) by Joshua Meier, Roshan Rao, Robert Verkuil, Jason Liu, Tom Sercu and Alexander Rives. **ESM-2 and ESMFold** were released with the paper [Language models of protein sequences at the scale of evolution enable accurate structure prediction](https://doi.org/10.1101/2022.07.20.500902) by Zeming Lin, Halil Akin, Roshan Rao, Brian Hie, Zhongkai Zhu, Wenting Lu, Allan dos Santos Costa, Maryam Fazel-Zarandi, Tom Sercu, Sal Candido, Alexander Rives.
1. **[FAN](https://huggingface.co/docs/transformers/model_doc/fan)** (from NVIDIA) was released with the paper [Understanding The Robustness in Vision Transformers](https://arxiv.org/abs/2204.12451) by Daquan Zhou, Zhiding Yu, Enze Xie, Chaowei Xiao, Anima Anandkumar, Jiashi Feng and Jose M. Alvarez. Original code can be found in the repository [NVlabs/FAN](https://github.com/NVlabs/FAN).
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
1. **[FAN](https://huggingface.co/docs/transformers/model_doc/fan)** (from NVIDIA) was released with the paper [Understanding The Robustness in Vision Transformers](https://arxiv.org/abs/2204.12451) by Daquan Zhou, Zhiding Yu, Enze Xie, Chaowei Xiao, Anima Anandkumar, Jiashi Feng and Jose M. Alvarez. Original code can be found in the repository [NVlabs/FAN](https://github.com/NVlabs/FAN).
1. **[FAN](https://huggingface.co/docs/transformers/main/model_doc/fan)** (from NVIDIA) was released with the paper [Understanding The Robustness in Vision Transformers](https://arxiv.org/abs/2204.12451) by Daquan Zhou, Zhiding Yu, Enze Xie, Chaowei Xiao, Anima Anandkumar, Jiashi Feng and Jose M. Alvarez. Original code can be found in the repository [NVlabs/FAN](https://github.com/NVlabs/FAN).

Then you will need to run make fix-copies again to fix the other READMEs :-)


def __init__(self, img_size=224, patch_size=16, in_chans=3, hidden_size=768, act_layer=nn.GELU):
super().__init__()
img_size = to_2tuple(img_size)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here make an explicit test:

img_size if isinstance(img_size, collections.abc.Iterable) else (img_size, img_size)


def forward(self, x, return_feat=False):
x = self.proj(x)
Hp, Wp = x.shape[2], x.shape[3]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please use explicit variable names that respect Python conventions (capitals are for class names)



# Copied from timm.models.layers.mlp
class MlpOri(nn.Module):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should have an explicit name and be prefixed by Fan.

@kiansierra
Copy link
Contributor Author

Hi @sgugger thanks for you're feedback. I'll try to implement the changes soon

@kiansierra
Copy link
Contributor Author

Implemented suggestions by @sgugger.

Pending the change on the README.md path, since I'm uncertain if I need to change only the README.md path or the actual doc path.

Also pending rebase

@sgugger
Copy link
Collaborator

sgugger commented Dec 9, 2022

Thanks for working on this! You need to change the link to the doc in the READMEs as suggested, but not the path to the file. You will also need to rebase/resolve the conflicts.

@NielsRogge could you have a review before I do a final pass?

@kiansierra
Copy link
Contributor Author

kiansierra commented Dec 14, 2022

I've applied the README.md update and rebased the branch.

@kiansierra
Copy link
Contributor Author

kiansierra commented Jan 3, 2023

Hi @NielsRogge, @sgugger.

First of all happy new year, I hope 2023 is greater success than 2022 was for the huggingface team.
I've resolved the merge conflicts, and was hoping to know if any additional steps were required for this PR?

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint.

@github-actions
Copy link
Contributor

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

@github-actions github-actions bot closed this Feb 5, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

add FAN model (vision)

3 participants