With the great improvements made over the last few months, ZeRO can now be easily used for inference, as in this example:
huggingface/transformers#15399 (comment)
But for a user who doesn't care about training, it's not intuitive to have to set these train-specific config variables:
```json
"train_batch_size": train_batch_size,
"train_micro_batch_size_per_gpu": 1,
```
It's time to come up with train/inference-agnostic config names and deprecate the train-specific ones.
There might be others, but these two just stood out.
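For illustration, an inference-only ZeRO config could use neutral key names in place of the train-prefixed ones. The names below (`batch_size`, `micro_batch_size_per_gpu`) are hypothetical suggestions to sketch the idea, not existing DeepSpeed config keys:

```json
{
    "zero_optimization": {
        "stage": 3
    },
    "batch_size": 1,
    "micro_batch_size_per_gpu": 1
}
```

The train-prefixed keys could then be kept as deprecated aliases for backward compatibility while users migrate.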
Thank you.