MultiLayer Perceptron #143
One way I look at dl4j is as a repository of NN components with which I can experiment with different NN configurations. I believe a traditional multilayer perceptron implementation could help with that. It would allow developers to build hybrid NN architectures, for instance. Thanks!

Comments
I'll add that you can already do this with the MultiLayerNetwork; assembling layers is the crux of the 0.0.3.3-and-up releases. When I was thinking of a backprop network, I was thinking more of implementing the layer interface. Fitting an output layer with respect to logistic regression, plus pretraining, is equivalent to a backpropagation network. If you mean being able to compose different neural networks together, say for a convolutional neural network, those pieces already exist.
Did you look at the new layer API? https://github.com/SkymindIO/dl4j-0.0.3.3-examples/blob/master/src/main/java/org/deeplearning4j/mnist/full/DBNExample.java The MultiLayerNetwork there is the key. I've reiterated this enough times, but I'll do it again: I plan on having a good set of documentation for 0.0.3.4 now that dl4j is near feature-complete with YARN and Spark integration. I'm not expecting the API to change much after this. This composable network/hybrid architecture is more or less what people want in a neural network framework, though.
Please leave the attitude somewhere else. We do not need it here.
Text isn't the best way to convey emotions ;) I'm asking for clarification. I was merely pointing at examples of more or less what he was looking for. I admitted the docs aren't there yet, which is why I was linking to the relevant examples. If you could kindly point out where I had an attitude, I will gladly fix it. Thanks!

> Please leave the attitude somewhere else. We do not need it here.

Edit: I see what looks like snark. Fixed.
Before I posted, I had actually modified the class:

```java
LayerFactory layerFactory = LayerFactories.getFactory(BaseLayer.class);
```
We need backprop in the hidden layers, as in http://deeplearning.net/tutorial/DBN.html#dbn. Though I suggest that instead of making all hidden layers both RBMs and perceptrons, we should allow a sub-sequence of each, e.g. layers 0-4 are RBMs and layers 4-6 are perceptrons (layer 4 is both).
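For concreteness, here is a rough sketch of the mixed architecture proposed above, written against the builder API of later dl4j releases (the 0.0.3.3-era API differs, so treat the class and method names as illustrative rather than exact): RBM layers are trained only during pretraining, and the plain feed-forward (perceptron) layers only during backprop fine-tuning.

```java
import org.deeplearning4j.nn.conf.MultiLayerConfiguration;
import org.deeplearning4j.nn.conf.NeuralNetConfiguration;
import org.deeplearning4j.nn.conf.layers.DenseLayer;
import org.deeplearning4j.nn.conf.layers.OutputLayer;
import org.deeplearning4j.nn.conf.layers.RBM;
import org.deeplearning4j.nn.multilayer.MultiLayerNetwork;
import org.nd4j.linalg.lossfunctions.LossFunctions;

public class MixedArchitectureSketch {
    public static void main(String[] args) {
        MultiLayerConfiguration conf = new NeuralNetConfiguration.Builder()
                .seed(123)
                .list()
                // layers 0-1: RBMs, trained layer-wise during pretraining
                .layer(0, new RBM.Builder().nIn(784).nOut(500).build())
                .layer(1, new RBM.Builder().nIn(500).nOut(250).build())
                // layers 2-3: perceptron layers, trained by backprop only
                .layer(2, new DenseLayer.Builder().nIn(250).nOut(100).build())
                .layer(3, new OutputLayer.Builder(LossFunctions.LossFunction.MCXENT)
                        .nIn(100).nOut(10).build())
                .pretrain(true)  // run unsupervised pretraining on the RBMs
                .backprop(true)  // then fine-tune the whole stack with backprop
                .build();

        MultiLayerNetwork net = new MultiLayerNetwork(conf);
        net.init();
    }
}
```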
If you notice how they implement it, they only use logistic regression with ...
I think you meant they use only the logistic regression layer's error to calculate the gradient for all layers? That seems to be the case:

```python
self.params.extend(sigmoid_layer.params)

# compute the gradients with respect to the model parameters
gparams = T.grad(self.finetune_cost, self.params)

# compute list of fine-tuning updates
updates = []
for param, gparam in zip(self.params, gparams):
    updates.append((param, param - gparam * learning_rate))
```

It looks different than the backprop algorithm I read in the book, but I'm not too sure what the right way is. I would say we should do backprop in the hidden layers, but go with a correct algorithm. Also, that's why I feel we should allow configuring a sub-sequence of the hidden layers to be perceptrons rather than forcing all layers to be both. Basically I feel that the network architecture would look like a few RBMs (trained only by pre-training) followed by 2-3 perceptron layers (trained only by backprop), but then, if possible, we can make the config more flexible to allow overlapping for the sake of experimentation.
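As a side note, the update loop above is plain gradient descent on a single fine-tuning cost: every parameter in self.params, including the hidden sigmoid layers' weights, is moved along the gradient of the output layer's cost, and Theano's T.grad computes those gradients through the whole graph by the chain rule. In symbols, each parameter is updated as:

```latex
\theta \leftarrow \theta - \eta \, \frac{\partial C_{\text{finetune}}}{\partial \theta}
```

where eta is the learning rate. So the chain-rule bookkeeping of classic backprop is still happening; it is just hidden inside the symbolic differentiation rather than written out layer by layer.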
This is what you're looking for. You'll notice they call it an MLP at the bottom.
I can see there is a backprop method in the MultiLayerNetwork class, but it is still not clear how it is used in the fit method. In the current version, fit only optimizes the parameters of the output layer.
The reason I switched it to this version of the MLP is that backprop isn't going to be generalizable across different neural networks. The new mix-and-match layer setup allows for more than just the normal two-layer feed-forward architecture. One of the things that may not be obvious from the outside is that conv nets and some of the other layers have different backward derivatives to calculate. I left the code in there as a stub.
My point is that MultiLayerNetwork should be more general and not have a predefined pretrain/output-layer architecture baked into training.
If you notice, I'm not disagreeing with you. I left that code in there for a reason, and my commits clearly demonstrate that that kind of general-purpose architecture is what I'm moving towards. I read on average ~10 papers a week (including re-reading some things), so I'm not ignorant of current results by any means; I physically meet the people who write these papers :P. Despite us trying to be the framework for industry, I'd be daft not to be into the research myself. That's why I made this framework in the first place.

Like anything in software engineering, there are design trade-offs involved, testing, and everything else. In the process of the reorganization, there's one training method that works and that will satisfy a good portion of people. The other one was going to be a little harder to support right out of the gate; however, the way I designed it, I laid the groundwork. The only thing I can do right now, while trying to balance everything else, is give you the hooks to allow for a backward architecture.

Here's more or less what I wanted to do when I get the bandwidth (remember I have examples to get out the door, Spark and YARN versions, and two other libraries to maintain for vectorization and scientific computing; Deeplearning4j isn't just core): I wanted to add a knob on the multi-layer configuration for backprop and a backward method on the layer interface (see the sketch below). That will give people the means to do backpropagation with arbitrary layers.

If you yourself would like the responsibility of adding the necessary code in the hooks, I gladly accept pull requests. My only compromise for right now will be giving you the necessary method stubs. Will that work? Otherwise this will have to wait a bit.
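To make the proposed hook concrete, here is a hypothetical sketch of what a backward method on the layer interface could look like. The names (Layer, activate, backward, epsilon) are illustrative assumptions, not the actual dl4j API of this release:

```java
import org.nd4j.linalg.api.ndarray.INDArray;

// Hypothetical layer interface with a backward hook. Each layer type
// (dense, conv, RBM, ...) would supply its own derivative calculation.
public interface Layer {
    /** Forward pass: compute this layer's activations from its input. */
    INDArray activate(INDArray input);

    /**
     * Backward pass: given the error signal (epsilon) from the layer
     * above, accumulate this layer's weight gradients internally and
     * return the error signal to propagate to the layer below.
     */
    INDArray backward(INDArray epsilon);
}
```

A MultiLayerNetwork could then walk the layers in reverse, threading each layer's returned epsilon into the layer below it, regardless of the layer types involved.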
@agibsonccc so the BaseMultiLayerNetwork had the back-prop before, but it was taken out because of the poor F1 score, right? But why wouldn't other frameworks with back-prop encounter such an issue? Could the poor score be due to some other bug in the back-prop? (For my problem, the F1 is only 0.2, which beats having the uniform-outputs problem but is not great; it looked "better" earlier due to the bug in the Evaluation class.) Also, the old BaseMultiLayerNetwork still back-props through all the hidden layers like the deeplearning.net tutorial. I'm also interested in trying the separated pure-RBM and pure-perceptron layers, as I read in some papers.
The old BaseMultiLayerNetwork did both. Have you tried it since the pull request was put in? @winstonquock Same message for you: I wanted to add a knob on the multi-layer configuration for backprop and a backward method on the layer interface. That will give people the means to do backpropagation with arbitrary layers. If you yourself would like the responsibility of adding the necessary code in the hooks, I gladly accept pull requests. My only compromise for right now will be giving you the necessary method stubs. Will that work? Otherwise this will have to wait a bit.
OK, let me try that.
So here are the hooks now. The MultiLayerNetwork hook is here, currently untested:
Is it possible to provide the complete implementation instead of a hook?
Like I said, no bandwidth right now; I added the hooks for later. Still working on some other stuff. If you can't tell from the mailing list, among other things, I'm buried right now.
Hmm... the change appears to break the regular non-back-prop training. Please see my discovery when testing #123.
This gets back to what we talked about before. One or the other. That's why it's a flag.
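A minimal sketch of that "one or the other" flag, again assuming the pretrain/backprop switches in the style of later dl4j configuration builders (imports as in the earlier sketch; the exact names in this release may differ):

```java
// Sketch only: pretraining and backprop as mutually exclusive passes.
MultiLayerConfiguration conf = new NeuralNetConfiguration.Builder()
        .list()
        .layer(0, new RBM.Builder().nIn(784).nOut(250).build())
        .layer(1, new OutputLayer.Builder(LossFunctions.LossFunction.MCXENT)
                .nIn(250).nOut(10).build())
        .pretrain(true)   // run the unsupervised pretraining pass
        .backprop(false)  // leave backprop fine-tuning off: one or the other
        .build();
```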
But I did NOT turn on back-prop at all for the training. I actually stopped in the debugger to make sure of that.
@agibsonccc I further narrowed down the problem. It looks like the change to the ...
@agibsonccc how should this one be fixed?
So I'm coming back around to the backprop stuff now. As of right now there is only the feed-forward implementation. I'll look at the other architectures and try to consolidate. With the stuff I mentioned earlier, it shouldn't be difficult to get it in place.
Fixed for normal feed-forward.
This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.