Why does yolov4-tiny use leaky instead of mish? #6178

Open
1027663760 opened this issue Jul 8, 2020 · 4 comments
Comments

1027663760 commented Jul 8, 2020

@AlexeyAB

@WongKinYiu
Collaborator

yolov4-tiny is developed for both CPU and GPU. The exponential and log functions in Mish are not friendly to CPU inference.
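
To make the cost difference concrete, here is a minimal sketch of the two activations in plain Python. It assumes the standard definition Mish(x) = x * tanh(ln(1 + e^x)) and a leaky slope of 0.1, which is the value Darknet's `leaky` activation is understood to use. The function names are illustrative and are not taken from the Darknet source.

```python
import math


def leaky_relu(x, slope=0.1):
    # One comparison and at most one multiply per element.
    # slope=0.1 is an assumption matching Darknet's "leaky" activation.
    return x if x > 0 else slope * x


def softplus(x):
    # Numerically stable ln(1 + e^x): avoids overflow for large |x|.
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def mish(x):
    # Needs an exp, a log, and a tanh per element,
    # which is what makes it expensive on CPU.
    return x * math.tanh(softplus(x))


if __name__ == "__main__":
    for v in (-2.0, -0.5, 0.0, 0.5, 2.0):
        print(f"x={v:+.1f}  leaky={leaky_relu(v):+.4f}  mish={mish(v):+.4f}")
```

Leaky ReLU is a branch and a multiply, while Mish calls three transcendental functions per element, which is the overhead the comment above refers to.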

sealedtx commented Jul 9, 2020

Can someone share pre-trained weights for yolov4-tiny with mish?

PallHaraldsson commented Jul 9, 2020

Can someone clarify this for me, as I'm new to this? I understand Mish is one of the best activation functions now, but if it is slow for inference, it would also be slow (even more so) for training, on GPU or CPU.

Weights pre-trained with Mish would not work for inference with Leaky (ReLU, I assume). My understanding is that you could substitute one activation function for another (to some degree), but the training and inference activations would need to match. Or maybe not:

I'm looking into better activation functions and making my own. Would you be interested in a better Mish, either an approximation or a similar function? Could you train on a more accurate version and do inference with an approximate one?

I'm not familiar enough with YOLO, v4 or v5 or any other version. Do the non-tiny variants use Mish or something else, and only the tiny variant Leaky ReLU?

My (unoptimized) Mish implementation here: sylvaticus/BetaML.jl#6 (comment) was 14.8x slower than (regular) ReLU. My PLU, however, was just as fast as ReLU (I only timed on CPU), so maybe it is a candidate?
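
A rough way to reproduce this kind of CPU comparison is sketched below in pure Python with `timeit`. This is not the Julia benchmark referenced above, and the ratio it reports will differ from 14.8x depending on the language, hardware, and how the functions are vectorized. The helper name `time_ratio` is hypothetical.

```python
import math
import timeit


def relu(x):
    return x if x > 0 else 0.0


def mish(x):
    # Mish(x) = x * tanh(ln(1 + e^x)), with a stable softplus.
    sp = max(x, 0.0) + math.log1p(math.exp(-abs(x)))
    return x * math.tanh(sp)


def time_ratio(n=20000, number=5, repeat=3):
    # Evenly spaced inputs in [-5, 5).
    xs = [(i / n) * 10.0 - 5.0 for i in range(n)]
    # Take the best of several repeats to reduce timing noise.
    t_relu = min(timeit.repeat(lambda: [relu(x) for x in xs],
                               number=number, repeat=repeat))
    t_mish = min(timeit.repeat(lambda: [mish(x) for x in xs],
                               number=number, repeat=repeat))
    return t_mish / t_relu


if __name__ == "__main__":
    print(f"mish / relu time ratio: {time_ratio():.1f}x")
```

Interpreter overhead dominates in pure Python, so the ratio here understates the gap a compiled, vectorized implementation would show.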

@hfassold

The "hard-mish" function (https://forums.fast.ai/t/hard-mish-activation-function/59238) is a runtime-efficient approximation of Mish.
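
As a sketch of what such an approximation can look like, the snippet below uses the form commonly given for hard-mish, 0.5 * x * clamp(x + 2, 0, 2). Whether this matches the linked thread exactly is an assumption, so check it against the source before relying on it. The helper `max_abs_error` is hypothetical and only measures how far the approximation drifts from exact Mish on a grid.

```python
import math


def mish(x):
    # Exact Mish: x * tanh(ln(1 + e^x)), with a stable softplus.
    sp = max(x, 0.0) + math.log1p(math.exp(-abs(x)))
    return x * math.tanh(sp)


def hard_mish(x):
    # Assumed form: 0.5 * x * clamp(x + 2, 0, 2).
    # Only comparisons, one add, and two multiplies: no exp, log, or tanh.
    return 0.5 * x * min(2.0, max(0.0, x + 2.0))


def max_abs_error(lo=-6.0, hi=6.0, steps=1201):
    # Largest |mish(x) - hard_mish(x)| over an evenly spaced grid.
    step = (hi - lo) / (steps - 1)
    return max(abs(mish(lo + i * step) - hard_mish(lo + i * step))
               for i in range(steps))


if __name__ == "__main__":
    print(f"max |mish - hard_mish| on [-6, 6]: {max_abs_error():.3f}")
```

The two functions agree closely for large positive inputs and both approach zero for large negative inputs, but they differ around the origin. Swapping one for the other at inference time without retraining is therefore likely to change the outputs of a network trained with the exact function.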
