I'm trying to run experiments following the instructions given in the README.
In Step 1 (warmup training), 5% of the samples are randomly selected to train $M_S$. But in Step 2 (building the gradient datastore), the samples used to calculate gradients seem to be fixed as the first 200 samples of each dataset.
I'm confused about whether the samples used for warmup training and for gradient calculation should be the same. Could you kindly explain?
In the first step, we use 5% of the full dataset for warmup training to obtain the Adam optimizer states. When calculating the gradients in the second step, you should use the full dataset, including the data used for warmup training. Let me know if you have more questions!
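To make the relationship between the two sample sets concrete, here is a minimal sketch in plain Python. It is not the repository's code: the function names `warmup_indices` and `gradient_indices`, the seed, and the dataset size of 1000 are all hypothetical and chosen only for illustration. It shows a random 5% subset for warmup and the full index range for the gradient datastore, so the warmup samples end up being a subset of the gradient samples.

```python
import random


def warmup_indices(num_samples, fraction=0.05, seed=0):
    """Randomly pick a fraction of sample indices for warmup training.

    Hypothetical helper: the name, seed, and signature are illustrative only.
    """
    rng = random.Random(seed)
    k = max(1, round(num_samples * fraction))
    return sorted(rng.sample(range(num_samples), k))


def gradient_indices(num_samples):
    """Indices for the gradient datastore: every sample, warmup ones included."""
    return list(range(num_samples))


num_samples = 1000  # assumed dataset size, for illustration
warmup = warmup_indices(num_samples)
datastore = gradient_indices(num_samples)

# The warmup subset is contained in the data used for gradient computation.
print(len(warmup), len(datastore))
print(set(warmup) <= set(datastore))
```

Under this reading, a fixed cap such as "the first 200 samples" would only be a shortcut for a quick test run; covering the full dataset means removing or raising any such limit.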