Hi, thanks for kindly releasing the code for the paper. (Also congratulations on the acceptance in INTERSPEECH!)
While running the code, I encountered a significant issue: `pYAAPT.yaapt` slows training down dramatically.
Here's how I found this speed bottleneck:
I tried to run `train_f0_vq.py` as specified in the README.
However, training was far too slow: it looks like the F0 VQ model needs to train for 400,000 steps, but a single epoch (about 700 steps) took 2657 seconds. GPU utilization was really low while the CPUs were maxed out. (My server has a 3080 Ti with 64 CPU cores.)
I suspected `pYAAPT.yaapt` to be the cause. To test that, I forked the repository and added caching: https://github.com/seungwonpark/speech-resynthesis. With caching, every epoch after the first one (which builds the initial cache) took only 36 seconds.
So my question is: how did you manage to run yaapt on the fly without caching? Though I succeeded in making training fast enough, I will need to disable caching again, since it requires the `_sample_interval` method to sample the same interval for each audio file (i.e. it disables the data augmentation of randomly choosing the interval).
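For what it's worth, here is a minimal sketch of how caching could coexist with the random-interval augmentation: cache the F0 track of the *full* utterance once, then slice the cached track for whatever interval `_sample_interval` picks. (The `extract_f0` helper below is a hypothetical stand-in for the actual `pYAAPT.yaapt` call; the hop size and names are assumptions, not the repo's real API.)

```python
import random

# Hypothetical stand-in for pYAAPT.yaapt: in the real code this would call
# pYAAPT.yaapt(basic.SignalObj(...)) and return the per-frame F0 track.
def extract_f0(audio, hop=80):
    # one F0 value per `hop` audio samples (placeholder computation)
    return [float(i) for i in range(len(audio) // hop)]

class F0Cache:
    """Cache the F0 track of the full utterance, then slice it per interval.

    This keeps random-interval augmentation: the dataset can still pick a
    random window, and we map sample offsets to frame offsets instead of
    re-running yaapt on every crop.
    """
    def __init__(self, hop=80):
        self.hop = hop
        self._cache = {}

    def f0_for_interval(self, key, audio, start, end):
        if key not in self._cache:
            self._cache[key] = extract_f0(audio, self.hop)  # computed once
        f0 = self._cache[key]
        return f0[start // self.hop : end // self.hop]

# Usage: a different random interval each epoch, but F0 is extracted
# only once per file.
cache = F0Cache(hop=80)
audio = [0.0] * 16000
start = random.randrange(0, 8000, 80)
seg = cache.f0_for_interval("p225_001.wav", audio, start, start + 8000)
```

The trade-off is that the cached F0 is framed relative to the full utterance, so the interval boundaries must be aligned to frame hops for the slice to match what yaapt would produce on the crop.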
In our experiments we were able to finish one epoch in ~760 seconds on the VCTK dataset. It is possible that our naive implementation simply ran faster on our hardware.
Going forward, adding caching does seem to speed up training! Another option is to add a preprocessing step that extracts pitch values from all wav samples, and to update the dataset to load the preprocessed values instead of calculating them on the fly.
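That preprocessing option could be sketched roughly as follows — extract F0 once per file, dump it to disk, and have the dataset read the dumps. (The `extract_f0` helper, file layout, and pickle format here are all assumptions for illustration, not the repo's actual code.)

```python
import os
import pickle
import tempfile

# Hypothetical F0 extractor standing in for pYAAPT.yaapt (assumed interface).
def extract_f0(audio):
    return [float(x) for x in audio]  # placeholder computation

def preprocess(wav_items, out_dir):
    """One-off preprocessing: extract F0 for every wav and save it to disk,
    so the Dataset can load precomputed values instead of running yaapt
    inside __getitem__."""
    os.makedirs(out_dir, exist_ok=True)
    for name, audio in wav_items:
        with open(os.path.join(out_dir, name + ".f0.pkl"), "wb") as f:
            pickle.dump(extract_f0(audio), f)

def load_f0(out_dir, name):
    """What the Dataset would call instead of pYAAPT.yaapt."""
    with open(os.path.join(out_dir, name + ".f0.pkl"), "rb") as f:
        return pickle.load(f)

# Usage with dummy in-memory "wav files":
out_dir = tempfile.mkdtemp()
items = [("p225_001", [0.1, 0.2]), ("p225_002", [0.3])]
preprocess(items, out_dir)
f0 = load_f0(out_dir, "p225_001")
```

Like the caching approach, this only works cleanly if `_sample_interval` can map its randomly chosen sample offsets onto the precomputed frame grid.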
Hi, I think you might be training on a shared platform with a weak CPU, e.g. a Google Colab container. When training on Colab, I got the same times as you did. However, when I trained on my 1080 Ti, I got the same training speed as the author.