Feature Request: Add Mish activation #484

Open
digantamisra98 opened this issue Nov 25, 2019 · 10 comments

@digantamisra98

Mish is a novel activation function proposed in this paper.
It has shown promising results so far and has been adopted in several packages.

All benchmarks, analyses, and links to the official package implementations can be found in this repository.

It would be nice to have Mish as an option within the activation function group.
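
For reference, the paper defines Mish as mish(x) = x * tanh(softplus(x)) = x * tanh(ln(1 + e^x)).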

Here is a comparison of Mish with other conventional activation functions in a SEResNet-50 on CIFAR-10:
[figure: se50_1]

@extremety1989

extremety1989 commented Jan 8, 2020

function mish(x) {
  // Mish: x * tanh(softplus(x)), where softplus(x) = ln(1 + e^x).
  // log1p keeps the softplus term accurate for small exp(x).
  return x * Math.tanh(Math.log1p(Math.exp(x)));
}

function derivativeOfMish(x) {
  // Closed-form derivative of Mish with respect to the pre-activation input x.
  let omega = Math.exp(3 * x) + 4 * Math.exp(2 * x) + (6 + 4 * x) * Math.exp(x) + 4 * (1 + x);
  let delta = 1 + Math.pow(Math.exp(x) + 1, 2);
  return Math.exp(x) * omega / Math.pow(delta, 2);
}
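
A quick sanity check (my own addition, not part of the snippet above): Mish approaches the identity for large positive inputs, approaches 0 for large negative inputs, and its derivative at 0 is 0.6.

console.log(mish(0));              // 0
console.log(mish(5).toFixed(4));   // "4.9996"  (close to x for large positive x)
console.log(mish(-5).toFixed(4));  // "-0.0336" (close to 0 for large negative x)
console.log(derivativeOfMish(0));  // 0.6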

@extremety1989

extremety1989 commented Jan 10, 2020

Can you please add the Mish function that I provided? I already tested it on my custom neural network and it works great, better than sigmoid and tanh on XOR!

@mubaidr
Contributor

mubaidr commented Jan 10, 2020

That sounds great. Would love to see you contribute it, though.

@robertleeplummerjr
Contributor

Keep in mind we have the GPU implementations as well.

@extremety1989

@mubaidr how can I contribute this function?

@digantamisra98
Author

@extremety1989 are you planning to submit a PR?

@extremety1989

extremety1989 commented Jan 22, 2020

@digantamisra98 no, sometimes it returns NaN when the learning rate is 0.1. I do not know what the problem is, maybe JavaScript.

@digantamisra98
Author

@extremety1989 Mish has a Softplus operator which needs a proper threshold to fix that NaN issue you might be facing.

@extremety1989

@digantamisra98 my threshold is 0.5, what should I change it to?

@digantamisra98
Author

@extremety1989 the Softplus threshold that TensorFlow uses is in the range of [0, 20].
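
For illustration only (a minimal sketch, not an existing brain.js API, with hypothetical helper names): a thresholded softplus in the spirit of the TensorFlow-style cutoff mentioned above could look like this, which should avoid the overflow that produces NaN:

// Hypothetical helpers; the cutoff of 20 mirrors the TensorFlow-style threshold above.
function softplusStable(x) {
  if (x > 20) return x;            // for large x, ln(1 + e^x) is approximately x
  if (x < -20) return Math.exp(x); // for very negative x, ln(1 + e^x) is approximately e^x
  return Math.log1p(Math.exp(x));
}

function mishStable(x) {
  return x * Math.tanh(softplusStable(x));
}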
