ANN Saving the network and reloading #531
Comments
Hi, the serialization of the ann module hasn't been finished yet. I will open a pull request for it in a few days; after that you should be able to save and load the network.
I am afraid this cannot be done, at least not easily. The ann module of mlpack relies on templates, which means the type of your network must be specified at compile time.
Thank you. If the various parameters of the FFN (such as the input layers, hidden layers, output layer type, etc.) are stored in the XML file, then we can definitely specify the type of network to load by just looking at the file. If not, I'm still not 100% sure how to specify the type of network to load when the type is decided just before saving it. Thanks.
It stores the parameters of the ann, but I think it still lacks the weights of the other layers.
It should store the parameters of the cnn, ffn, rnn and the other layer weights (I may be wrong on this one since I haven't tested it yet).
You need to remember what kind of network you built; the following part is the "type" of your network.
After you train the net, you can save the parameters and load them with the same type (the FFN built by your code); it has to be the original network type you used for training. Some of the parameters could be different (it depends), but the parameters describing the features of the input data have to be the same. ps: please correct me if anything is wrong
For what it's worth, this problem is encountered with other mlpack algorithms: sometimes the user wants to do, e.g., nearest neighbor search and save the model, but they can use different types of trees, and the trees are template parameters. What I've done in those situations is to create a "wrapper" class that serializes an identifier saying what type is being used, then serializes the appropriate type. Take a look here: For the situation of an arbitrary neural network architecture, the problem is a lot harder: you need some way to serialize the type information, and then some way to assemble an arbitrary type and return it when deserializing. I think it will be a tricky (but fun) problem to solve... :) The approach I've taken, in general, towards serializing arbitrary types is this: "if using an mlpack command-line program, the type should be handled for the user; if using the mlpack C++ code, then the user is responsible for serializing and deserializing the correct type". So if the user uses the
Thank you both for your replies.
Moving forward, I see myself using the mlpack C++ code a lot more than the command-line version, so I guess I need to take care of serializing and deserializing this myself. The main problem with the current code is that for every prediction I need to call "BuildFNN", which ends up training the network again and again. That is why I thought I could just return the trained network to where it's being called from and do the prediction later, but I think the data gets corrupted on return and the prediction call segfaults.
As you can see in that code, decltype(modules) and decltype(classOutputLayer) determine the type of the network at compile time. I understand that I pass the types in my call, and I can get the classOutputLayer type pretty easily, which is the BinaryClassification layer. However, the first type, layertypes, is a std::tie (which I'm fairly new to), which returns a tuple of the input-hidden layers. I'm not sure what this returns, so I don't know how to create a network of the appropriate type. If I can get some help on that, it'll be great! Thanks.
I think it should be something like
If you call std::make_tuple, it will become
You can take a look at boost typeindex too; this library can help you print the name of a type (I read about it in Meyers' book, but haven't tried it yet)
As
If you are using C++14, though, you can declare the return type as
I am using C++14 (C++1y), and what I did was train the model and then just return net (which is the FFN) and save it to another auto'd variable in main. However, when I called predict on that, I got a segfault. My guess is that somewhere, somehow, while being returned from BuildFFN, the variable net got corrupted. Personally, I would prefer if this worked (my machine has around 32gb of memory, so holding the data is not a problem). Perhaps we could look into why this segfault occurs?
Like theSundayProgrammer said, if you create your net with std::tie, the type of your net will hold references to the layers rather than copies/moves of them (remember to define ARMA_USE_CXX11, else the armadillo matrices cannot be moved). Returning references to the stack is not a good idea, because the stack is destroyed when you exit the scope. bad idea
ok, since it takes the copy/move
Is this the case you are talking about? By the way, I will try to complete the serialization of the ann module today; I will tell you if I can load and save your FFN without any trouble.
I tested the serialization of the current implementation (without any change), and it works like magic. Looks like I was wrong: the serialization of the ffn, rnn and cnn already works, and the parameters of the ffn already store the weights of the net (modules). I am sorry for my misunderstanding. Your problem should be fixable if you create the tuple with std::make_tuple
Thank you for your replies. After reading a bunch about tie and tuple, I think I understand what they do. I changed my code to create the models using std::make_tuple instead of std::tie, as shown:
I gave the function the return type auto, and I return the variable net after training the network. In the main function I save it to another auto'd variable and call predict on it:
I also made sure that ARMA_USE_CXX11 was enabled (after checking how to do it here):
However, the compiler doesn't like this; it spits out a huge number of errors:
Based on what I can infer from this, it seems that the layers are references and making a tuple out of them is not working (I could be wrong).
It works perfectly fine on my laptop (win8 64-bit, vc2015 64-bit); maybe updating your mlpack can solve this problem? These errors look like the compiler complaining that it cannot generate a copy constructor/move constructor (if you want to know why, google the rule of zero).
Once I modified the CMake file to build against C++14, it worked on my machine as well. I guess another simple solution would be to write a wrapper class that creates the network and forwards the necessary functions, e.g.
Btw. I used the following code to test:
@zoq, in your code where are you training the network? Thank you. I am building against C++14 as well. Here is my CMake file:
After stero's suggestion, I did a pull today and fast-forwarded my branch by 4 commits, then built and ran mlpack_test, which found no errors. After this, when I compile I still get errors, but they're different:
I was able to isolate the error to the net.Train(trainData, trainLabels) call. Any suggestions? Thanks.
The compiler complains because of const correctness; it is quite easy to fix. Change
to
Pull request #536 already fixes this problem. Any compiler with good C++11 support should be able to fulfill the "rule of zero"; you need to check how good your compiler's C++11 support is (you can treat C++14 as a small patch on top of C++11). You could improve performance a little by using move rather than copy.
The original solution copies the modules and classOutputLayer into the FFN, but with std::move they are moved into the FFN, which can save some memory (remember to define ARMA_USE_CXX11, else the compiler will copy the data rather than move it).
Thank you. I checked out pr/536 and linked the include directories and lib from there. That got rid of all the compiler errors. However, I get a 100% classification error when I use std::make_tuple, but when I use std::tie I get 4.95916%. This is my function:
Pull request #542 should fix this issue; the problem is that the parameter name "network" hides the name of the data member "network". If you want to save memory when making the tuple, you can move the layers too
|
I had the same result when I used make_tuple rather than tie, and I fixed it. Although my fix is for CNN, not FFN, I guess a similar change for FFN should work.
Joseph Chakravarti Mariadassou
Thank you all for your help. Working with pr/542 fixed everything; I am able to return net as auto using C++1y and predict later in the code, and it works. Closing the issue now.
Hey,
I'm working with the ANN module and have started with the feed_forward test. I'm trying to return the built network back to the caller so that I can use it there (I was running into a lot of problems because of the template programming, more details here, which I was able to solve by compiling with -std=c++1y and returning auto). However, when I tried to train the returned network, it threw a matrix multiplication exception, which I assume means something got corrupted during the return. So I thought I could just train the network and save it, and reload it later for prediction. The code below does that and saves the network parameters in the "test" file. The problem is that when loading, I now have to specify the type of the network (which was determined by decltype(modules) and decltype(classOutputLayer)). Furthermore, the saved XML file contains just parameters, which I'm guessing are the values of the weight matrices, and does not indicate the type of the FFN. Is there a way to fix this problem?
Thanks.