🌟 New model addition: FNet #12411

Open
3 tasks done
cccntu opened this issue Jun 29, 2021 · 5 comments

@cccntu
Contributor

cccntu commented Jun 29, 2021

🌟 New model addition: FNet

FNet is a highly efficient Transformer-like encoder architecture, wherein the self-attention sublayers have been wholly replaced by standard, unparameterized Fourier Transforms.
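
To make the description above concrete, here is a minimal sketch of the Fourier mixing step, assuming (as in the FNet paper) that the sublayer applies a discrete Fourier transform along the hidden dimension, then along the sequence dimension, and keeps only the real part. The helper names `dft` and `fourier_mixing` are hypothetical and use a naive pure-Python DFT for illustration only; a real implementation would most likely call a library FFT such as `jnp.fft.fft2` instead.

```python
import cmath


def dft(values):
    """Naive O(n^2) discrete Fourier transform of a 1-D sequence (illustrative only)."""
    n = len(values)
    return [
        sum(v * cmath.exp(-2j * cmath.pi * k * t / n) for t, v in enumerate(values))
        for k in range(n)
    ]


def fourier_mixing(x):
    """Parameter-free token mixing: 2-D DFT over (sequence, hidden), real part only.

    x is a list of seq_len rows, each a list of hidden_size floats.
    The output has the same shape as the input.
    """
    # Transform each token's hidden vector.
    hidden_mixed = [dft(row) for row in x]
    # Transform each hidden channel along the sequence axis.
    columns = [dft(list(col)) for col in zip(*hidden_mixed)]
    # Transpose back to (seq_len, hidden_size) and drop the imaginary part.
    return [[value.real for value in row] for row in zip(*columns)]


if __name__ == "__main__":
    tokens = [[1.0, 2.0], [3.0, 4.0]]  # seq_len = 2, hidden_size = 2
    print(fourier_mixing(tokens))
```

Because the transform has no learnable weights, this sublayer adds no parameters to the encoder; the feed-forward sublayers that follow it are where the learned parameters remain.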

I would like to help add this!

Open source status

@NielsRogge
Contributor

Somebody is already working on this, see #12335

@cccntu
Contributor Author

cccntu commented Jun 29, 2021

Thanks @NielsRogge, weird that I didn't see it when I searched.

@gchhablani
Contributor

gchhablani commented Jun 29, 2021

@cccntu I believe what you want for the JAX/Flax community week is a Flax model. It seems unlikely that I will finish the PR in the next week. Maybe you can start working on the Flax model in parallel?

Or we can discuss over Slack and then try to finish both.

@patil-suraj @patrickvonplaten wdyt? Is it easier to go from PyTorch to Flax, or does it not matter at all? In case the PyTorch model is needed first, I am willing to spend my time next week on this and try to finish it.

@cccntu
Contributor Author

cccntu commented Jun 29, 2021

@gchhablani Yes! I would love to add the Flax part.
@patil-suraj @patrickvonplaten I have a few questions before I proceed:

  • There is no license in the original repo. Should I email the authors for permission to use the code and weights?
  • How much of the original model code should I modify, other than wrapping it in huggingface/transformers classes?
    Should we refactor it for better weight alignment with the PyTorch code, etc.?

Thanks!

@gchhablani
Contributor

Great @cccntu! Let's discuss over Slack.
