MISH as ActivationFunction #63

Closed
CalogeroZarbo opened this issue Mar 12, 2020 · 3 comments


@CalogeroZarbo

Hi @lucidrains
I was wondering whether you would be open to a pull request that lets the user choose between GELU and Mish as the activation function.
The explanation of MISH can be found here:

The GitHub is here:

And the discussion can be found here:

Here is a small benchmark:
[image]

If I'm not mistaken, there is only one place in the reformer_pytorch library where you use GELU_, the FeedForward layer, so I could add a flag parameter to its constructor.

Let me know what you think.

Thank you,
Cal

@lucidrains
Owner

@CalogeroZarbo I really like Mish too! I used GELU mainly to be consistent with BERT, but I added the feature where you can pass in the Module class you'd like to be instantiated as the activation function. https://github.com/lucidrains/reformer-pytorch#customizing-feedforward
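For readers following along, here is a minimal sketch of the idea in that reply: the activation is passed as a Module class and instantiated inside the feedforward block. The `Mish` module follows the standard definition x * tanh(softplus(x)). The `FeedForward` class and its `activation` parameter are hypothetical stand-ins written for illustration, not the library's actual code; the linked README section documents the real keyword to use.

```python
import torch
from torch import nn
import torch.nn.functional as F


class Mish(nn.Module):
    """Mish activation: x * tanh(softplus(x))."""

    def forward(self, x):
        return x * torch.tanh(F.softplus(x))


class FeedForward(nn.Module):
    """Hypothetical feedforward block (not reformer_pytorch's own class).

    `activation` is a Module *class*, instantiated here, which mirrors the
    "pass in the Module class" approach described in the reply above.
    """

    def __init__(self, dim, mult=4, activation=nn.GELU):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dim, dim * mult),
            activation(),  # instantiate whichever class was passed in
            nn.Linear(dim * mult, dim),
        )

    def forward(self, x):
        return self.net(x)


# Swap GELU for Mish by passing the class, not an instance.
ff = FeedForward(dim=16, activation=Mish)
x = torch.randn(2, 8, 16)  # (batch, sequence, features)
out = ff(x)
```

Passing the class rather than an instance lets each layer build its own activation module, which keeps the layers independent if an activation ever carries parameters or state.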

@CalogeroZarbo
Author

@lucidrains I just missed that part! Thank you.
By the way, I'm training with RangerLars, DeepSpeed, and Apex FP16 optimization on an encoder-decoder Reformer architecture, and it's working very well.

FYI: https://github.com/mgrankin/over9000

I'll close the issue since it's not needed :)

Cheers!
Cal

@lucidrains
Owner

@CalogeroZarbo very cool, I'm a big fan of Ranger and Lookahead too :)
