improve speed of SparseAutoencoder and make it more flexible #451
Conversation
…repeated calculation or not
So you mentioned in IRC that this implementation is faster than the implementation in
//sparse autoencoder function greedy
using SAEFG = ann::SparseAutoencoderFunction<FSigmoidLayer, FSigmoidLayer, std::true_type>;

BOOST_AUTO_TEST_SUITE(SparseAutoencoderTest2);
Just a note -- when this is merged, we can remove the old sparse_autoencoder_test.cpp and revert this to SparseAutoencoderTest instead of SparseAutoencoderTest2.
1 : Can I get a little more information on how you calculated that?
2 : This file looks like it's heavily based on the code in mlpack/methods/sparse_autoencoder/sparse_autoencoder_function.hpp.
3 : Maybe it would be a better idea to make this class work with the Trainer class in src/mlpack/methods/ann/.
a : add function Train
b : add new constructor
c : add Serialize
2 : move it from ann to methods/sparse_autoencoder
(because submat of arma is a proxy object)
2 : make the code meet the requirements of the style guide
template<typename InputVecType, typename OutputVecType>
static void fn(const InputVecType& x, OutputVecType& y)
{
  y = (1.0 / (1 + arma::exp(-x)));
}
Originally my intention for the overflow checks in the LogisticFunction class was to avoid strange issues during the training process. But I intend to remove the checks in the LogisticFunction class; if you agree, we can remove the LazyLogisticFunction class and just use the LogisticFunction class.
I am totally fine with that
2 : add static functions to initialize weights
3 : fix bug -- did not initialize parameters correctly
After some experiments, not all of the ann layers are suitable for SparseAutoencoder; some need modification (e.g. you cannot just call the dropout function, you need to call an activation function like sigmoid or ReLU after calling Forward). I would like to open two new folders:
1 : SparseAutoencoder/layers
Besides, I think the function "GetNewFeatures" of SparseAutoencoder should call the activation function of the hidden layer rather than the sigmoid. I will remove LazyLogisticFunction and remove the check from LogisticFunction as zoq mentioned. Any suggestions?
Edit :
No worries, thanks for the review; I know it takes time to do these things.
If you think so, I will write a simple test case later on to test the output of Forward and Backward. After that is done, we can deal with the other problem.
No need to write another test for the Forward and Backward functions; if the existing sparse autoencoder test works with the new code, it's absolutely fine. I guess
I would like to do that too, but the problem is that the old SparseAutoencoder and the new one do not have the same API, so the tests need some changes.
I uploaded the test file; the name is "sparse_autoencoder_test_3.cpp". The API of the FFN is different from the original SparseAutoencoder, therefore
…e because travis build cannot pass
…putLayer from public api of FFN
Finally, all the work has been done. I made some changes to FFN (it does not take a reference to outputLayer but adds a public API for users; this makes serialization easier). I use FeedForward to get the Evaluate value (the Evaluate API does not give the value). Please check the test cases; they test with the same data (except for the gradient part) but with a different API. I hope my implementation meets the semantic requirements. If you think everything is going well, you can merge it now. Thanks for taking the time to review.
Edit :
When the merge is done, I'll go ahead and add a reverse-compatibility layer for the 2.x.x releases and then remove the old SparseAutoencoder code.
…ncoder Improve speed of SparseAutoencoder and make it more flexible.
Thanks for the contribution. I made a couple of changes:
Let me know if I messed anything up. Since we have this nice SparseAutoencoder class, it should be easy to provide a reverse-compatibility layer for the 2.x.x releases. I'll go and write the necessary code if nobody really wants to do it. We should also think about a test case that tests the code in combination with an optimizer; I ran into a couple of problems once I tested the code with the existing trainer class, and I solved the issues in f34ae33. Another test could also check the ability to work with additional layers. We only test the standard sparse autoencoder model structure (input layer, hidden layer, output layer). Which is fine, since the former code uses this static model structure, but since we build the sparse autoencoder using the ann modules, we have the ability to add a bunch of interesting layers, e.g. a Dropout layer.
Also, I think a command line program would be a neat feature.
Thanks for the fix. I will provide a command line program with different activations (ReLU, tanh, sigmoid) and dropout (if it works) after serialization of FFN is done.
This commit intends to complete two things:
1 : improve the speed of SparseAutoencoder
I cache the computation result in a data member, so the algorithm does not need to compute it twice. Since the member function is const, I declared the data member as mutable. Does anyone think making the member function non-const would be better?
2 : make the SparseAutoencoder more versatile
I added two template parameters as follows:
template<typename HiddenLayer, typename OutputLayer>
class SparseAutoencoderFunction;
This allows users to use different layers from ann to compute the features.