I assume the "async_sgd" app is the one to use if we want to train a model with SGD (correct me if I'm wrong).
If I understand correctly, the "async_sgd" app goes through the data multiple times (as specified by the "max_pass_of_data" config), but the printed "loss" is the average loss over all of these passes (the loss of the first pass is also folded into the average). Is that correct?
Since we are using a fixed dataset, we would like to train the model in a batched way, but we find that the "darlin" app uses BCD instead of SGD. Is it possible to train on batched data using SGD? It would be useful if the async_sgd app could print the loss of the current model on just one copy of the input dataset during training.
Thanks,
Cui
The printed loss is the averaged minibatch loss since the last printing. For example:
mb1: 1.0, mb2: 0.9, then print 0.95;
mb3: 0.8, mb4: 0.7, then print 0.75.
If you set max_pass_of_data=1, every minibatch is seen for the first time when its loss is computed, so the printed loss can be viewed as a loss on unseen (test-like) data.
I usually set max_pass_of_data=1 when the dataset is big.
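The averaging behavior described above can be sketched in a few lines of Python. This is an illustrative mock, not the actual async_sgd implementation: the function name, the fixed print interval, and the list-of-losses input are all made up for the example.

```python
def averaged_losses(minibatch_losses, print_every=2):
    """Mimic a trainer that prints the mean minibatch loss
    accumulated since the last print, then resets the accumulator."""
    printed = []
    loss_sum, count = 0.0, 0
    for loss in minibatch_losses:
        loss_sum += loss
        count += 1
        if count == print_every:
            printed.append(loss_sum / count)  # average since last print
            loss_sum, count = 0.0, 0          # reset for the next window
    return printed

# Reproducing the example from the reply: mb1=1.0, mb2=0.9, mb3=0.8, mb4=0.7
for avg in averaged_losses([1.0, 0.9, 0.8, 0.7]):
    print(round(avg, 2))  # 0.95, then 0.75
```

The key point is that the accumulator is reset after each print, so each printed value reflects only the minibatches in the most recent window, not a running average over all passes.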