In the generating phase, is it necessary to increase the batch size to the length of the dataset? #7
Comments
Hi @jiangtanzju , TL;DR: it won't influence the performance. We once ran a set of experiments that generated embeddings with BatchNorm in train mode, which computes the batch mean and variance on the fly and mitigates the discrepancy between the downstream graphs and the pretraining graphs. In that setting, if the instances in the test dataset are not shuffled, a small batch size causes data leakage. This is why we set the batch size to the dataset length in the first place. Unless you want to try the same thing, you can safely ignore this line. As for the inference time, this line shouldn't have much effect. I ran
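To illustrate the leakage concern: a minimal NumPy sketch (not GCC's actual code — `batchnorm_train_mode` and `generate_embeddings` are hypothetical stand-ins) of BatchNorm in train mode, where each batch is normalized with its own statistics. With `batch_size = len(dataset)` every sample is normalized with dataset-wide statistics; with small, unshuffled batches, each sample's output depends on its batch neighbours.

```python
import numpy as np

def batchnorm_train_mode(x, eps=1e-5):
    # BatchNorm in train mode: normalize with the *batch's own* mean/variance,
    # computed on the fly (no running statistics).
    mean = x.mean(axis=0)
    var = x.var(axis=0)
    return (x - mean) / np.sqrt(var + eps)

def generate_embeddings(features, batch_size):
    # Hypothetical stand-in for an embedding-generation loop: each batch is
    # normalized independently, so a sample's embedding depends on which
    # other samples happen to share its batch.
    out = []
    for start in range(0, len(features), batch_size):
        out.append(batchnorm_train_mode(features[start:start + batch_size]))
    return np.concatenate(out)

rng = np.random.default_rng(0)
feats = rng.normal(size=(64, 8))

full = generate_embeddings(feats, batch_size=len(feats))  # bs = dataset length
small = generate_embeddings(feats, batch_size=8)          # small batches

# With bs = len(dataset), every sample sees the same global statistics; with
# small unshuffled batches, embeddings leak information about batch neighbours.
print(np.allclose(full, small))  # -> False
```

If the test set is ordered (e.g. grouped by class), those batch-local statistics correlate with the labels, which is the data leak the reply describes.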
Hi @jiangtanzju , Most of each dataloader's computation is in one function. It seems that in your setup this function is not parallelized, so a single dataloader only utilizes 100% of one CPU. With MKL LAPACK installed, this function runs in parallel; in that case, even if the other dataloaders are idle, the one active dataloader utilizes all of the CPUs, so the total time doesn't change much in my setup. Anyway, this shouldn't matter, since you can simply increase the number of loaders by decreasing batch_size.
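The last point — more loaders via a smaller batch size — can be sketched with a thread pool (a simplified analogue, not GCC's actual loader; `process_batch` is a hypothetical stand-in for the per-batch work such as an eigendecomposition). With `batch_size = len(dataset)` there is exactly one unit of work, so extra workers idle; a smaller batch size yields `ceil(N / batch_size)` units the pool can overlap.

```python
from concurrent.futures import ThreadPoolExecutor

def process_batch(batch):
    # Hypothetical stand-in for the per-batch computation.
    return [x * x for x in batch]

def run(dataset, batch_size, workers=4):
    # Split the dataset into ceil(N / batch_size) batches; each batch is one
    # unit of work that a worker can pick up independently.
    batches = [dataset[i:i + batch_size] for i in range(0, len(dataset), batch_size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = pool.map(process_batch, batches)
    return [y for batch in results for y in batch]

data = list(range(100))
# The result is identical either way; only the degree of parallelism changes.
print(run(data, batch_size=32) == run(data, batch_size=len(data)))  # -> True
```

This is why, absent a parallel LAPACK, shrinking `batch_size` recovers CPU utilization: it trades per-batch parallelism for across-batch parallelism.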
https://github.com/THUDM/GCC/blob/master/generate.py#L90
I find that if I comment out this line, i.e. keep the batch size at 32, inference is much faster.
Why do you set the batch size to the length of the dataset? Will it influence the performance?