Hello, could you help me with a question, please? The script espnet2/samplers/num_elements_batch_sampler.py divides batches based on bins, so each batch loses the randomness of its samples. Does that significantly hurt training (even though there is still randomness between batches)? For the TTS task, the data is first sorted globally, and batches are then formed from the sorted data based on bins. This happens once, before training starts (espnet2/tasks/abs_task.py: build_batch_sampler); during training, only the order of the batches is shuffled.
I sincerely await your reply.
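For readers unfamiliar with this scheme, here is a minimal sketch of the idea being described (not the actual ESPnet implementation; the function name, the `bin_size` parameter, and the data layout are illustrative assumptions): sort utterances by length once, greedily fill each batch until its padded element count reaches a bin size, and shuffle only the batch order at train time.

```python
# Sketch of bin-based batching: NOT the real NumElementsBatchSampler,
# just an illustration of "sort globally, cut into bins, shuffle batches".
import random

def make_bin_batches(lengths, bin_size):
    """lengths: dict mapping utterance id -> element count (e.g. frames).
    Returns a list of batches, each a list of utterance ids."""
    # Global sort by length, done once before training.
    order = sorted(lengths, key=lambda k: lengths[k])
    batches, current = [], []
    for uid in order:
        current.append(uid)
        # Padded size of this batch = num utterances * longest utterance.
        max_len = lengths[current[-1]]  # ascending order, so last is longest
        if len(current) * max_len >= bin_size:
            batches.append(current)
            current = []
    if current:  # leftover short batch
        batches.append(current)
    return batches

# Toy data: 10 utterances of lengths 10, 20, ..., 100.
lengths = {f"utt{i}": 10 * (i + 1) for i in range(10)}
batches = make_bin_batches(lengths, bin_size=150)

# Within a batch, neighbours have similar lengths (little padding waste);
# each epoch, only the order of whole batches is shuffled.
random.shuffle(batches)
```

Note how the sample-to-batch assignment is fixed after the one-time sort; `random.shuffle` touches only the batch order, which is exactly the loss of within-batch randomness the question asks about.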
dbkest changed the title from "[QUESTION] [TTS] num_elements_batch_sampler loses the randomness of the samples" to "[QUESTION] [TTS] 'num_elements_batch_sampler' loses the randomness of the samples" on Mar 1, 2024.
Good question.
This implementation is a trade-off between random shuffling and GPU memory usage.
We actually ran an experiment (about seven years ago) for ASR comparing utterance-level shuffling with batch-level shuffling, and the difference was marginal (though the setup produced different effective batch sizes, so the comparison could have been better).
Also, some people even sort all utterances from short to long and report that it works better (a curriculum-learning effect).
So, entirely random shuffling may not be needed.
However, this is old experience.
Many technologies have changed since then, and we might reach different conclusions today. It's worth revisiting.
Also, we have started using fixed-length utterances (with padding) in some projects, where we can randomly shuffle all utterances.
It would be great if you could do some investigations.