FFN serialization #516
Conversation
@@ -2,7 +2,7 @@
  * @file sparse_bias_layer.hpp
  * @author Tham Ngap Wei
  *
- * Definition of the SparseBiasLayer class.
+ * Definition of the BiasLayer class.
Maybe this is a documentation mistake?
Looks good to me so far, but I only took a quick look. You might find some of the stuff in
Thanks for the reviews. I will fix the documentation mistakes in the next commit. I will write some test cases too; after FFN is done, CNN and RNN should be quite easy to serialize as well. Does mlpack 2.x still rely on libxml? Could we just use Boost to replace it?
The modified optimizer looks good, thanks for the effort. Once we merge the altered optimizer interface, we have to modify every layer that uses the optimizer interface. If you need some help with this step, don't hesitate to ask.
You are right: since mlpack 2.0.0 the libxml2 dependency has been replaced with boost::serialization.
Does it really make sense to save the input, output, delta, or any other variable that is used only during training? The aim of archiving, I think, should be to save the weights and the other variables that are needed to deploy the "learned machine". Even if we are going to restart training, variables used exclusively during training are re-initialised anyway.
My thoughts here don't apply to this situation in particular (I haven't really looked over this), but I think that in general, serializing user preferences for training is worth doing. For instance, a user may want to load a model and then train it some more. Any intermediate training variables probably aren't worth saving, but something like a learning rate is worth holding onto, in my opinion. This is what I've tried to do with other serialization code. I hope these thoughts are helpful. :)
Joseph is right; we don't have to save the input, output, and gradient parameters. We need them neither for optimizing an already trained model nor for testing.
Thanks for pointing out the defects; I will remove those intermediate training variables in the next commit. I would keep the parameters needed for retraining, though, since sometimes I need to reload the weights and use them to fine-tune the results.
Hi @stereomatchingkiss. Also, I did not find code to archive a tuple in the mlpack code base, so I wrote
Joseph Chakravarti Mariadassou
Hi @theSundayProgrammer, I think your copy constructor implementation is quite cool; it reduces the coupling between the layers and the optimizer and makes the code easier to read and serialize. If you have already implemented the serialization of the ANN based on it, please open a PR; I will close my pull request #516 after that. Thanks for your help. :) PS: if you think there is something worth reusing, I will keep this PR alive until you finish the serialization of the ANN.
I implemented the serialization in 69f97e5, f6c27ed and 43f0f13 for the FFN, RNN and CNN classes. Since we removed the optimizer from each layer, we don't have to struggle with optimizer serialization anymore. I'll merge the serialization code once we have removed parameters like delta, the input parameter, the output parameter, etc., as Joseph pointed out. So maybe you can open another pull request? If not, I'll extract the code from this pull request and make the necessary changes. As always, please comment if something doesn't make sense and I'll reopen it.
Ok, I will open another pull request.
Make FFN serializable via boost::serialization.
It isn't finished yet; there are other parts (initialization rules, among others) that need to be serialized before FFN serialization can be done.
Besides, I cannot serialize arma::cube yet; I am still looking for a solution.