[WIP][SPARK-4251][SPARK-2352][MLLIB] Add RBM, ANN, DBN algorithms to MLlib #3222
Conversation
Test build #23257 has started for PR 3222.
Test build #23257 has finished for PR 3222. Test FAILed.
Test build #23258 has started for PR 3222.
Test build #23258 has finished for PR 3222. Test PASSed.
Force-pushed from c78b421 to 1e4fa3b.
Jenkins, retest this please.
Test build #23363 has started for PR 3222.
Test build #23364 has started for PR 3222.
Test build #23363 has finished for PR 3222. Test PASSed.
Test build #23364 has finished for PR 3222. Test PASSed.
Why are there no annotations or a readme in your code?
Sorry, this patch is still a work in progress. I will add the annotations and documentation later.
Force-pushed from 9855fe1 to 5f1c8a0.
Test build #23440 has started for PR 3222.
@witgo, is your neural net model an RDD or an array?
Currently the neural net model is stored in a matrix. The model can support a 10000 * 500 * 100 three-layer neural network and a 100000 * 1000 two-layer neural network.
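As a rough illustration of the sizes quoted above, the sketch below (plain Scala, not from the PR) counts the parameters of fully connected networks with those layer sizes, assuming dense Double weight matrices; this is why a single driver-side matrix, rather than an RDD, is workable for these models.

object ModelSizeSketch {
  // Weights plus biases of a fully connected feed-forward network.
  def parameterCount(layerSizes: Seq[Int]): Long =
    layerSizes.sliding(2).map { case Seq(in, out) => in.toLong * out + out }.sum

  def main(args: Array[String]): Unit = {
    // Layer sizes taken from the comment above.
    for (net <- Seq(Seq(10000, 500, 100), Seq(100000, 1000))) {
      val params = parameterCount(net)
      // 8 bytes per Double.
      println(s"${net.mkString(" * ")}: $params parameters, ~${params * 8 / (1L << 20)} MB")
    }
  }
}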
Test build #23440 has finished for PR 3222. Test FAILed.
      weightCost: Double,
      learningRate: Double): DBN = {
    StackedRBM.train(data.map(_._1), batchSize, numIteration, dbn.stackedRBM,
      fraction, momentum, weightCost, learningRate, dbn.stackedRBM.numLayer - 1)
Should the last layer also be trained?
I think it depends on the problem. The last layer usually changes for classification or regression problems, etc. It does not necessarily have to be trained during pretraining, as long as it is trained during fine-tuning.
I see, thanks.
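To make the `numLayer - 1` argument in the snippet above concrete: greedy layer-wise pretraining typically stops before the top layer, which is then learned during supervised fine-tuning. The following is a structural sketch only, with a placeholder trainer rather than the PR's StackedRBM implementation.

object PretrainSketch {
  type Layer = Array[Array[Double]]  // one layer's weight matrix (hypothetical representation)

  // Placeholder for one layer's RBM training pass; a real implementation would
  // run contrastive divergence and return updated weights.
  def trainRbmLayer(layer: Layer, data: Seq[Array[Double]]): Layer = layer

  // Train only layers 0 until maxLayer; with maxLayer = numLayer - 1 the last
  // layer is skipped here and handled by fine-tuning instead.
  def pretrain(layers: Array[Layer], data: Seq[Array[Double]], maxLayer: Int): Array[Layer] =
    layers.zipWithIndex.map { case (layer, i) =>
      if (i < maxLayer) trainRbmLayer(layer, data) else layer
    }
}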
Test build #23674 has started for PR 3222.
Test build #25361 has started for PR 3222.
Test build #25361 has finished for PR 3222. Test PASSed.
Hi @witgo, where can I find your email? I'd like to communicate in Chinese.
witgo#qq.com
Test build #26145 has started for PR 3222.
Test build #26145 has finished for PR 3222. Test PASSed.
Test build #26207 has started for PR 3222.
Test build #26207 has finished for PR 3222. Test PASSed.
So, this hasn't been touched in a couple of months and no longer merges. It overlaps with existing functionality in MLlib and with other work in progress. It's really a big-bang change that dumps in a lot of new code, and I'm not sure it has been argued that all of this belongs in MLlib. Some of this functionality might belong in the new API as a transformation. I think this should be closed at this point, in favor of collaborating on the other ANN implementation or reintroducing some of this in much smaller changes.
Well, I have to close it.
Activation function (a brief sketch follows after this list)
Gradient descent
Regularization method
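To illustrate the first item in the list above, here are the usual activation-function candidates with their derivatives (standard definitions in plain Scala, not the PR's code); the gradient-descent and regularization items are sketched under the experimental results below.

object Activations {
  // Each derivative is written in terms of the activation value y = f(x),
  // the form typically used during backpropagation.
  def sigmoid(x: Double): Double = 1.0 / (1.0 + math.exp(-x))
  def sigmoidPrime(y: Double): Double = y * (1.0 - y)

  def tanh(x: Double): Double = math.tanh(x)
  def tanhPrime(y: Double): Double = 1.0 - y * y

  def relu(x: Double): Double = math.max(0.0, x)
  def reluPrime(y: Double): Double = if (y > 0.0) 1.0 else 0.0
}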
Experimental Results
MNIST dataset
MLP
Network structure: 784 * 500 * 10; gradient descent: AdaGrad (rho 0.99, epsilon 0.01, gamma 0.1, momentum 0.9); learning rate: 0.1; weight cost: 0.0; dropout rate: 0.2; fraction: 0.05; training data: 5000; test data: 5000
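Since the AdaGrad hyperparameters above are only listed, the following plain-Scala sketch shows one way such an update could look; reading rho as a decay on the squared-gradient accumulator and gamma as the base step size is my assumption, not something stated in the PR.

class AdaGradSketch(size: Int, rho: Double, epsilon: Double,
                    gamma: Double, momentum: Double) {
  private val sumSq = Array.fill(size)(0.0)    // decayed sum of squared gradients (assumed meaning of rho)
  private val velocity = Array.fill(size)(0.0) // momentum term

  def step(weights: Array[Double], gradient: Array[Double]): Unit = {
    var i = 0
    while (i < size) {
      sumSq(i) = rho * sumSq(i) + (1.0 - rho) * gradient(i) * gradient(i)
      val scaled = gamma * gradient(i) / (math.sqrt(sumSq(i)) + epsilon)
      velocity(i) = momentum * velocity(i) + scaled
      weights(i) -= velocity(i)
      i += 1
    }
  }
}

With the values quoted above this would be instantiated as new AdaGradSketch(n, 0.99, 0.01, 0.1, 0.9) for a model with n parameters.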
DBN
Network structure: 784 * 300 * 300 * 500 * 10; gradient descent: AdaGrad; learning rate: 0.05; weight cost: 0.0; dropout rate: 0.5; mini-batch size: 300; training data: 60000; test data: 10000
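Both configurations rely on dropout as the regularization method; a minimal sketch of dropout applied to one layer's activations (the standard technique, not the PR's implementation) is:

import scala.util.Random

object DropoutSketch {
  // During training each unit is zeroed with probability p and the survivors are
  // scaled by 1 / (1 - p) ("inverted" dropout), so no rescaling is needed at test time.
  def dropout(activations: Array[Double], p: Double, rng: Random): Array[Double] =
    activations.map(a => if (rng.nextDouble() < p) 0.0 else a / (1.0 - p))
}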
REFERENCES