
Adding Mish Activation Function #25584

Closed
digantamisra98 opened this issue Sep 3, 2019 · 4 comments
Labels
feature – A request for a proper, new feature.
module: nn – Related to torch.nn
triaged – This issue has been looked at by a team member, and triaged and prioritized into an appropriate module

Comments

@digantamisra98

digantamisra98 commented Sep 3, 2019

I released the Mish activation function a couple of weeks ago, and it shows a significant improvement in performance over ReLU, Swish, and other commonly used activation functions. Full details, along with the paper link, are provided in my repository here - https://github.com/digantamisra98/Mish
Mish was also used along with the Ranger optimizer to beat 12 records on the Fast.ai ImageNette and ImageWoof benchmarks.
The link to the relevant fast.ai forum - https://forums.fast.ai/t/meet-mish-new-activation-function-possible-successor-to-relu/53299/245

I have the PyTorch implementation of Mish here - https://github.com/digantamisra98/Mish/tree/master/Mish/Torch
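
For reference, Mish is defined as x * tanh(softplus(x)). Below is a minimal PyTorch sketch of a drop-in module (illustrative only - not the exact code from the repository above):

```python
import torch
import torch.nn.functional as F
from torch import nn

class Mish(nn.Module):
    """Minimal sketch of Mish: x * tanh(softplus(x))."""
    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # softplus(x) = ln(1 + exp(x)); composing it with tanh gives the
        # smooth, non-monotonic curve described in the paper.
        return x * torch.tanh(F.softplus(x))
```

It can be used like any other activation module, e.g. nn.Sequential(nn.Linear(128, 64), Mish()).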

Hopefully, if validated, Mish could become part of PyTorch; I would be happy to submit a PR for ease of use.

Thank You!

@pbelevich pbelevich added the feature, module: nn, and triaged labels Sep 3, 2019
@soumith
Member

soumith commented Sep 3, 2019

We don't think this method should be in PyTorch core (as opposed to your own personal repository or something like https://github.com/pytorch/contrib), at least not yet.

Our reservation is that we only want to include methods the community uses as a standard; otherwise the code-maintenance burden balloons for us.
We do exercise discretion based on the evidence a paper shows; for example, BatchNorm was included (in Torch) within weeks of its publication date.
In terms of rejected methods, we've turned down (then) newly minted papers such as Swish (#3260, #3182), YellowFin (#1960), and many others, and rightly so: these haven't become standard in the community (unlike LSTM / Transformer / BatchNorm).

If you have a differing opinion, let us know why, and we can re-think.

tl;dr: The paper doesn't show evidence that the method will have obvious long-term success. If the method does see long-term success in the field, we will include it.

@hiyyg

hiyyg commented Nov 24, 2020

Mish has shown better performance compared to SiLU and GELU; is there still no official plan to include it?

@bsugerman

Bump. Mish should be added, as should HardMish := x/2 * min(2, max(0, x + 2)). These are quite commonly used and have been shown to be very effective, even compared with Swish and HardSwish, which are already part of the torch library.
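
For concreteness, a minimal sketch of HardMish under that definition (a hypothetical helper, not an existing torch function):

```python
import torch

def hard_mish(x: torch.Tensor) -> torch.Tensor:
    # HardMish: x/2 * min(2, max(0, x + 2)) - a piecewise-linear
    # approximation of Mish, analogous to how HardSwish approximates Swish.
    return 0.5 * x * torch.clamp(x + 2, min=0.0, max=2.0)
```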

@LukeAI

LukeAI commented Jan 5, 2021

The SOTA vision object detector, ScaledYOLOv4, now depends on Mish and on a random third-party extension to work - and that extension only works with PyTorch 1.6. So you cannot do SOTA object detection with PyTorch 1.7 because this is missing, which is maybe a good argument for its inclusion? @soumith
https://github.com/WongKinYiu/ScaledYOLOv4/tree/yolov4-csp
