Add leaky ReLUs #412
Comments
Instead of writing an independent activation function, we can just write a new …
Hi, I'm willing to take this task. I'm new to the community and I'm trying for GSoC '16, so I believe implementing this issue would be a good starting point. The way I see it, it involves creating a new rectifier function class in ann/activation_functions and then adding it to the base_layer.hpp module (with corresponding test cases in test/activation_functions_test.cpp).
You are right, this is a good starting point to get familiar with the code. The BaseLayer only works with activation functions that can be called without any additional parameters, like the sigmoid or the tanh function. Since the leaky rectified linear function uses the leakyness factor as an additional parameter, you can't use the BaseLayer to call the function. But there is an easy solution: you can directly implement the … Please leave a comment if something doesn't make sense.
Thanks for the contribution. Before I merge the code in (I guess you will open a pull request), could you take a look at the design guidelines, especially the comments section: https://github.com/mlpack/mlpack/wiki/DesignGuidelines. It's minor, but I tend to be picky about code, though I am not mean. :) It would also be great if you could combine the two constructors into one:
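Something along these lines, just as a sketch; the class and member names here are placeholders, not the actual ones from the pull request:

```cpp
// Sketch only: a single constructor with a default argument replaces a
// separate default constructor and a parameterized one.
class LeakyReLULayer
{
 public:
  // The default value 0.03 is purely illustrative.
  LeakyReLULayer(const double alpha = 0.03) : alpha(alpha) { }

 private:
  //! The leakyness factor.
  double alpha;
};
```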
And last but not least, can you add a function that returns alpha and enables the modification of alpha?
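For example, following the usual mlpack accessor pattern (again only a sketch, assuming the member is called alpha):

```cpp
// Inside the layer class from the sketch above:
//! Get the leakyness factor.
double Alpha() const { return alpha; }
//! Modify the leakyness factor.
double& Alpha() { return alpha; }
```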
About the test, take a look at the …
Hi. Thanks for the suggestions. I have made the required changes here. Regarding testing I have a doubt: since leaky ReLU is a layer, as opposed to a single neuron, it should only have Forward and Backward as public methods, and the activation function and its derivative should not be exposed. But in tests/activation_functions_test.cpp only activation functions and their derivatives are tested.
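Would something along these lines be the right approach for testing the layer? (The constructor and Forward signature below are only assumptions for the sketch, not the actual code.)

```cpp
// Sketch of a layer-level test: run a fixed input through Forward() and
// compare against values computed by hand from max(alpha * x, x), i.e.
// max(x / a, x) with alpha = 1 / a.
BOOST_AUTO_TEST_CASE(LeakyReLULayerForwardTest)
{
  const double alpha = 0.03;
  LeakyReLULayer layer(alpha);

  arma::colvec input("-2.0 -1.0 0.5 1.0 2.0");
  arma::colvec scaled = alpha * input;
  arma::colvec expected = arma::max(scaled, input);

  arma::colvec output;
  layer.Forward(input, output);

  for (size_t i = 0; i < input.n_elem; ++i)
    BOOST_REQUIRE_CLOSE(output(i), expected(i), 1e-5);
}
```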
@abhinavchanda your code is well written. It helped me to understand the codebase better. |
@zoq Is this task still open? |
@sharathts No, the code was merged in e6f7ffe. |
@zoq Thank you for the information. |
Unlike the standard ReL function, the leaky rectified linear function has a non-zero gradient over its entire domain. So instead of having `y = max(0, x)`, you have `y = max(x / a, x)`, where `a` is some constant. This means you still get some sort of non-linearity, but the gradient can flow through in both directions. For more information see:
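Just to illustrate the function and its derivative (a sketch, not the final implementation):

```cpp
#include <algorithm>

// Leaky rectified linear function: f(x) = max(x / a, x), with a > 1.
inline double LeakyReLU(const double x, const double a)
{
  return std::max(x / a, x);
}

// Its derivative: 1 for x > 0, otherwise 1 / a, so the gradient never
// vanishes completely on the negative side.
inline double LeakyReLUDeriv(const double x, const double a)
{
  return (x > 0) ? 1.0 : (1.0 / a);
}
```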
Since the parameter is fixed, we could add the leakyness factor as a template parameter. The problem with that idea is that C++ doesn't allow double as a non-type template parameter, so we need to figure out a way around this issue.
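One possible workaround (only a sketch of the idea) is to encode the factor as a ratio of two integer template parameters, which are allowed:

```cpp
#include <algorithm>

// Sketch: since C++ doesn't allow floating-point non-type template
// parameters, express the leakyness 1 / a as Numerator / Denominator.
template<int Numerator = 3, int Denominator = 100>
class LeakyRectifierFunction
{
 public:
  //! Computes max(x / a, x), where 1 / a = Numerator / Denominator.
  static double Fn(const double x)
  {
    const double leakyness = static_cast<double>(Numerator) / Denominator;
    return std::max(leakyness * x, x);
  }
};
```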