Add gelu and gelu_fast as possible activation functions #653

Closed
wants to merge 1 commit into from

liezl200
Contributor

Summary:
After this diff, you can train a transformer model with --activation-fn set to 'relu', 'gelu', or 'gelu_fast'.

gelu_fast is the default implementation in https://github.com/hendrycks/GELUs/blob/master/mnist_fcn.py#L72-L77.
gelu is the alternate implementation in https://github.com/hendrycks/GELUs/blob/master/mnist_fcn.py#L72-L77 and the default implementation in https://github.com/facebookresearch/XLM.
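
For readers who want to compare the two variants numerically, here is a minimal sketch in plain Python. It assumes that gelu is the exact erf-based form and gelu_fast is the tanh approximation from the GELU paper. The `ACTIVATIONS` table and the `get_activation_fn` helper are hypothetical names used only for illustration and are not taken from this diff.

```python
import math


def relu(x: float) -> float:
    """Rectified linear unit on a scalar."""
    return max(0.0, x)


def gelu(x: float) -> float:
    """Erf-based GELU: x * Phi(x), with Phi the standard normal CDF.

    Assumption: this is the form the 'gelu' option refers to.
    """
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def gelu_fast(x: float) -> float:
    """Tanh approximation of GELU.

    Assumption: this is the form the 'gelu_fast' option refers to.
    """
    inner = math.sqrt(2.0 / math.pi) * (x + 0.044715 * x ** 3)
    return 0.5 * x * (1.0 + math.tanh(inner))


# Hypothetical lookup table mirroring the three --activation-fn choices.
ACTIVATIONS = {
    "relu": relu,
    "gelu": gelu,
    "gelu_fast": gelu_fast,
}


def get_activation_fn(name: str):
    """Return the activation registered under `name` (illustrative helper)."""
    try:
        return ACTIVATIONS[name]
    except KeyError:
        raise ValueError(f"unsupported activation: {name!r}") from None


if __name__ == "__main__":
    # Print both variants side by side for a few sample inputs.
    for x in (-2.0, -0.5, 0.0, 0.5, 2.0):
        print(x, gelu(x), gelu_fast(x))
```

Under these assumptions the two variants agree closely over typical input ranges, so choosing between them is mostly a question of speed versus exactness.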

Differential Revision: D14966006

…arch#653)

Summary:
Pull Request resolved: facebookresearch#653


Reviewed By: pipibjc

Differential Revision: D14966006

fbshipit-source-id: e2f64230a28ba03473a3135994b7bce8bd277811
@facebook-github-bot
Contributor

This pull request has been merged in 8500bdd.

yfyeung pushed a commit to yfyeung/fairseq that referenced this pull request Dec 6, 2023
* fix torchaudio version in dockerfile

* remove kaldiio