How to train a model on a fixed dataset using SGD? #21

Closed
cuihenggang opened this issue Apr 14, 2015 · 1 comment
@cuihenggang

Hi,

I assume the "async_sgd" app is the one to use for training a model with SGD (correct me if I'm wrong).

And if I understand correctly, the "async_sgd" app goes through the data multiple times (as specified by the "max_pass_of_data" config), but the printed "loss" is actually the average loss over all of these data passes (the loss of the first pass also gets averaged in). Is that correct?

Since we are using a fixed dataset, we would like to train the model in batch mode, but we find that the "darlin" app uses BCD instead of SGD. Is it possible to train on batched data using SGD? It would be useful if the async_sgd app could print the loss of the current model on a single copy of the input dataset during training.

Thanks,
Cui

@mli
Member

mli commented Apr 17, 2015

Hi Henggang,

The printed loss is the average minibatch loss since the last print. For example:
mb1: 1.0, mb2: 0.9, then print 0.95;
mb3: 0.8, mb4: 0.7, then print 0.75.

If you set max_pass_of_data=1, each example contributes to the printed loss before the model has trained on it, so the loss can be viewed as a loss on unseen (test-like) data.

I usually set max_pass_of_data=1 when the dataset is big.
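The reset-on-print averaging described above can be sketched in a few lines. This is a hypothetical Python illustration for clarity only; the actual implementation in the parameter_server codebase is C++ and may differ:

```python
# Hypothetical sketch: each printed value averages only the minibatch
# losses seen since the previous print, then the accumulator resets.
def make_loss_printer(print_every=2):
    total, count = 0.0, 0

    def record(minibatch_loss):
        nonlocal total, count
        total += minibatch_loss
        count += 1
        if count == print_every:
            avg = total / count
            total, count = 0.0, 0  # reset after printing
            return avg
        return None

    return record

record = make_loss_printer()
printed = [v for v in map(record, [1.0, 0.9, 0.8, 0.7]) if v is not None]
# yields the two averages from the example above: 0.95, then 0.75
```

With a cumulative (never-reset) accumulator the second value would instead be the average over all four minibatches, which matches the behavior the question assumed.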

@mli closed this as completed Apr 17, 2015