Added Swish and Mish activations #15808
Conversation
Hi! Thanks for the contribution. Please check that all the backends are valid (OpenCL, CUDA, Intel's Inference Engine). Otherwise keep only the default C++ implementation.

cc @VadimLevin
@thebhatman, please add the tests for both layers.

Yeah, I will add the tests. Where exactly are the tests for activations? Should I define sample neural networks with these activations? Or is there a test file with all activation functions such as sigmoid and tanh?
@thebhatman, since the activations are simple, you can test them without test data. Just create a test which applies all the backends and compares with reference data (computed inside the test). Take a look at test_layers.cpp and test_halide_layers.cpp. The only parameters you need to vary are target and backend.
Force-pushed from 56b8404 to afd5392.
I have added the tests in
```diff
@@ -583,7 +583,7 @@ TEST_P(NoParamActivation, Accuracy)
     testInPlaceActivation(lp, backendId, targetId);
 }
 INSTANTIATE_TEST_CASE_P(Layer_Test_Halide, NoParamActivation, Combine(
-/*type*/ Values("TanH", "Sigmoid", "AbsVal", "BNLL"),
+/*type*/ Values("TanH", "Sigmoid", "AbsVal", "BNLL", "Swish", "Mish"),
```
Good choice! 😄
The CUDA tests are passing for the Swish activation. For the Mish activation, the test for `DNN_TARGET_CUDA` is passing but the test for `DNN_TARGET_CUDA_FP16` is failing:
```
[ RUN ] Layer_Test_Halide/NoParamActivation.Accuracy/21, where GetParam() = ("Mish", CUDA/CUDA_FP16)
.../opencv/modules/dnn/test/test_common.impl.hpp:68: Failure
Expected: (normL1) <= (l1), actual: 0.0288897 vs 0.004
.../opencv/modules/dnn/test/test_common.impl.hpp:71: Failure
Expected: (normInf) <= (lInf), actual: 0.0909604 vs 0.02
```
This means that the errors in the outputs are more than the accepted threshold. You will have to increase the error tolerance for the Mish activation when the target is `DNN_TARGET_CUDA_FP16`. I am not sure what the correct way to do this would be. In the worst case, the test can be skipped for the FP16 target.

The test calls `void testInPlaceActivation(LayerParams& lp, Backend backendId, Target targetId)`, which in turn calls `void test(Mat& input, Net& net, Backend backendId, Target targetId, bool skipCheck = false)`. I think there is currently no way to pass custom thresholds to `test`. A solution could be to pass thresholds to `testInPlaceActivation` and `test` as arguments.
The default thresholds come from here: https://github.com/thebhatman/opencv/blob/f6221044661dedbf4ad729c8f24c2b72ff37aca0/modules/dnn/test/test_common.hpp#L113. I don't see a way to pass the thresholds as arguments without changing the prototype of `testInPlaceActivation`.
I think you should add new parameters (`l1` and `lInf`) to `test` and `testInPlaceActivation`. Their default values would be `0.0`.

Inside the `test` function:

- if the `l1` and `lInf` arguments are zero, set them to the values given by `getDefaultThresholds`
- if the `l1` and `lInf` arguments are not zero, use them
@dkurt is this ok?
@YashasSamaga, yes, that would be fine. Thanks!
Is this PR ready to be merged now?
To enable the CUDA backend, you have to tick the following CMake options:
Are you able to build the CUDA backend on your PC? The CUDA build is failing on CI. Please have a look at the CI build: there are errors in the compile log (open the log and search for "error:"). The CI does not run tests on CUDA devices, so you (or someone) will have to run them on a PC to verify that the tests pass. Let me know when you are done with the PR and I'll build and test it once.
The error is because of the log function.
Thanks @YashasSamaga. The CUDA builds are now passing. They were failing due to type mismatch errors, which I have resolved.
@thebhatman, there is still no OpenCL kernel implementation. Please add it or remove the unused code.
👍
@alalek, Can we merge it now to master? I'll backport it to 3.4 with https://github.com/dkurt/opencv/tree/thebhatman/Mish_swish
Has this already been released? In 4.3.0, I still get the error that it is unsupported.
From what I found, mish is available in 3.4.10 but not in 4.3.0.
@BlueNotesRobot It is already there on master. |
Thanks @YashasSamaga. I just compiled the newest master and can confirm it works.
* Added Swish and Mish activations
* Fixed whitespace errors
* Kernel implementation done
* Added function for launching kernel
* Changed type of 1.0
* Attempt to add test for Swish and Mish
* Resolving type mismatch for log
* exp from device
* Use log1pexp instead of adding 1
* Added openCL kernels
I have added the Swish and Mish activation functions. This resolves #15693
### This pull request changes
[Feature addition] New activation functions Swish and Mish.