Train with this model for Mnist? #88
Is this what you mean?

```csharp
this._net = new Net<double>();
this._net.AddLayer(new InputLayer(28, 28, 1));
this._net.AddLayer(new ConvLayer(5, 5, 32) { Stride = 1, Pad = 2 });
this._net.AddLayer(new ReluLayer());
this._net.AddLayer(new ConvLayer(5, 5, 32) { Stride = 1, Pad = 2 });
this._net.AddLayer(new ReluLayer());
this._net.AddLayer(new PoolLayer(2, 2));
this._net.AddLayer(new DropoutLayer<double>() { DropProbability = 0.25 });
this._net.AddLayer(new ConvLayer(3, 3, 64) { Stride = 1, Pad = 2 });
this._net.AddLayer(new ReluLayer());
this._net.AddLayer(new ConvLayer(3, 3, 64) { Stride = 1, Pad = 2 });
this._net.AddLayer(new ReluLayer());
this._net.AddLayer(new PoolLayer(2, 2) { Stride = 2 });
this._net.AddLayer(new DropoutLayer<double>() { DropProbability = 0.25 });
this._net.AddLayer(new ReluLayer());
this._net.AddLayer(new FullyConnLayer(256));
this._net.AddLayer(new DropoutLayer<double>() { DropProbability = 0.5 });
this._net.AddLayer(new FullyConnLayer(10));
this._net.AddLayer(new SoftmaxLayer(10));
```
I would put `FullyConnLayer(256)` before the `ReluLayer`.
Ok, thanks. I wasn't sure which `DropoutLayer` and `ReluLayer` you were referencing. I'll give this a whirl. It'll take a while to train, I'm sure.
`DropoutStorage` is shared between the forward and backward passes. It is supposed to be allocated in the forward pass. I will reproduce this and try to understand it tonight.
Ok, good. Hopefully you can find something, because my estimation code says that training this model on the CPU will finish on 6/12/2018 at 3:45pm. LOL. So I won't be training this without GPU training. I forgot to mention that I let this training run all night. It didn't get very far, but I had it save out the model when I stopped the training. The model was a 35MB, one-line JSON file. Wow! I'm not sure what makes it so big, but I can't imagine how big it will be when I finish training this way.
No more crash after PR #91 |
The JSON file contains all parameters and their gradients. Over time their values will change, but there won't be more parameters, so the size of the file should remain close to 35MB in your case.
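As a rough sanity check on that 35MB figure: counting parameters for the model above with the standard output-size formula `floor((W - F + 2P)/S) + 1`, and assuming (convnetjs-style convention, not confirmed here) that `PoolLayer(2, 2)` defaults to stride 2, the network has about 1.4M parameters. Storing one gradient per parameter doubles that to ~2.8M doubles, which at roughly 12 bytes per serialized decimal lands in the 35MB range. A sketch:

```python
def out_size(w, f, pad, stride=1):
    """Spatial output size of a conv/pool layer: floor((W - F + 2P)/S) + 1."""
    return (w - f + 2 * pad) // stride + 1

def conv_params(f, depth_in, n_filters):
    """Filter weights plus one bias per filter."""
    return f * f * depth_in * n_filters + n_filters

w = 28                                   # InputLayer(28, 28, 1)
total = conv_params(5, 1, 32)            # ConvLayer(5, 5, 32), pad 2
w = out_size(w, 5, 2)                    # -> 28
total += conv_params(5, 32, 32)          # ConvLayer(5, 5, 32), pad 2
w = out_size(w, 5, 2)                    # -> 28
w = out_size(w, 2, 0, 2)                 # PoolLayer(2, 2), assumed stride 2 -> 14
total += conv_params(3, 32, 64)          # ConvLayer(3, 3, 64), pad 2
w = out_size(w, 3, 2)                    # -> 16 (pad 2 with a 3x3 filter grows the map)
total += conv_params(3, 64, 64)          # ConvLayer(3, 3, 64), pad 2
w = out_size(w, 3, 2)                    # -> 18
w = out_size(w, 2, 0, 2)                 # PoolLayer(2, 2) { Stride = 2 } -> 9
total += w * w * 64 * 256 + 256          # FullyConnLayer(256) over a 9x9x64 volume
total += 256 * 10 + 10                   # FullyConnLayer(10)

print(total)                             # 1411818 parameters
print(2 * total)                         # 2823636 values once gradients are included
```

Note the `Pad = 2` on the 3x3 conv layers actually enlarges the feature maps, which is why the fully connected layer dominates the count.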
Are you saying I can remove the `FilterGradient` and `BiasGradient` sections of the JSON entirely, in order to make the file smaller, and the model will still work for me?
With this model I am getting this exception:

```csharp
this._net = new Net<double>();
this._net.AddLayer(new InputLayer(28, 28, 1));
this._net.AddLayer(new ConvLayer(5, 5, 32) { Stride = 1, Pad = 2 });
this._net.AddLayer(new ReluLayer());
this._net.AddLayer(new ConvLayer(5, 5, 32) { Stride = 1, Pad = 2 });
this._net.AddLayer(new ReluLayer());
this._net.AddLayer(new PoolLayer(2, 2));
this._net.AddLayer(new DropoutLayer<double>(0.25));
this._net.AddLayer(new ConvLayer(3, 3, 64) { Stride = 1, Pad = 2 });
this._net.AddLayer(new ReluLayer());
this._net.AddLayer(new ConvLayer(3, 3, 64) { Stride = 1, Pad = 2 });
this._net.AddLayer(new ReluLayer());
this._net.AddLayer(new PoolLayer(2, 2) { Stride = 2 });
this._net.AddLayer(new DropoutLayer<double>(0.25));
this._net.AddLayer(new FullyConnLayer(256));
this._net.AddLayer(new ReluLayer());
this._net.AddLayer(new DropoutLayer<double>(0.5));
this._net.AddLayer(new FullyConnLayer(10));
this._net.AddLayer(new SoftmaxLayer(10));
```
I'm trying to use your software to mimic these results from this Keras model, using the MnistDemo program.
In particular, I'm looking at this section:
This is what I have so far:
This seems close, but I'm not exactly sure how to do the last section, since I see no `Flatten()` or `Dense()` methods, so I'm not sure it will work at all.
Do you have any ideas how to duplicate this model section with your codebase?
Thanks,
Darren
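For reference, Keras's `Flatten()` only reshapes a 3-D activation volume into a vector, and `Dense(n)` is the same affine map that `FullyConnLayer(n)` computes; ConvNetSharp's `FullyConnLayer` treats its input volume as one flat vector, so no separate flatten step is needed. A minimal pure-Python sketch of the two operations (toy sizes, purely illustrative):

```python
# What Keras's Flatten() does: reshape an HxWxC volume into one vector.
def flatten(volume):
    # volume: H rows, each a list of W pixels, each a list of C channel values
    return [v for row in volume for pixel in row for v in pixel]

# What Dense(n) / FullyConnLayer(n) does: one affine map, out = W.x + b.
def dense(x, weights, biases):
    return [sum(w_i * x_i for w_i, x_i in zip(w_row, x)) + b
            for w_row, b in zip(weights, biases)]

# Toy example: a 2x2x3 volume -> 12-vector -> 2 outputs.
vol = [[[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]],
       [[7.0, 8.0, 9.0], [10.0, 11.0, 12.0]]]
x = flatten(vol)              # 12 values
weights = [[0.1] * 12,        # 2x12 weight matrix
           [0.0] * 12]
biases = [0.5, 1.0]
out = dense(x, weights, biases)
print(out)                    # ~[8.3, 1.0]: 0.1 * (1 + 2 + ... + 12) + 0.5 = 8.3
```

So a `FullyConnLayer(256)` followed by `ReluLayer` should play the role of `Flatten()` plus `Dense(256, activation='relu')`.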