
Are samples used for warmup training and gradient calculation the same? #5

Closed
ZigeW opened this issue Mar 1, 2024 · 1 comment

Comments

ZigeW commented Mar 1, 2024

Hi,

I'm trying to run experiments following the instructions given in README.

I found that in Step 1 (warmup training), 5% of the samples are randomly selected to train $M_S$. But in Step 2 (building the gradient datastore), the samples used to calculate gradients seem to be fixed as the first 200 samples of each dataset.

This leaves me confused about whether the samples used for warmup training and gradient calculation should be the same. Could you kindly explain?

xiamengzhou (Collaborator) commented
Hi, sorry for the late reply!

In the first step, we use 5% of the full dataset for warmup training to obtain the Adam optimizer states. When calculating gradients in the second step, you should use the full dataset, including the data used for warmup training. Let me know if you have more questions!
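
To summarize the intended data flow, here is a minimal sketch. The helper names and the toy `full_dataset` below are hypothetical placeholders for illustration, not the repository's actual API; only the split logic (random 5% for warmup, full dataset for gradients) reflects the answer above.

```python
import random

def select_warmup_subset(full_dataset, fraction=0.05, seed=42):
    """Randomly sample a fraction (here 5%) of the full dataset for warmup training."""
    rng = random.Random(seed)
    k = max(1, int(len(full_dataset) * fraction))
    return rng.sample(full_dataset, k)

# Toy stand-in for the real training data (the actual pipeline loads instruction-tuning datasets).
full_dataset = [f"example_{i}" for i in range(1000)]

# Step 1: warmup-train M_S on a random 5% subset to obtain the Adam optimizer states.
warmup_subset = select_warmup_subset(full_dataset)
print(f"warmup subset size: {len(warmup_subset)}")  # 50 out of 1000

# Step 2: build the gradient datastore over the *full* dataset,
# including the examples already used in warmup (not a fixed slice such as the first 200).
gradient_datastore_inputs = full_dataset
print(f"gradient datastore inputs: {len(gradient_datastore_inputs)}")  # 1000
```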
