
Questions about chunks #16

Open
wangshiyao-1119 opened this issue Apr 15, 2021 · 6 comments

@wangshiyao-1119

Hello, thank you very much for your code.
I have a question.
Why do I need to split the .wav file into chunks? Is the chunk file the training set?
Which training set is more appropriate? Or do you have a trained model for reference?
Thank you very much!

@relativeflux
Member

relativeflux commented Apr 15, 2021

Hi @wangshiyao-1119, thanks for your interest in the project. The folder containing the chunks is indeed the training set, but it also contains the validation set, which by default is 10% of the total (the entire dataset is randomly shuffled first, then split into the two sub-datasets; the shuffle is guaranteed to produce the same distribution if you resume the same training run after terminating).
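The shuffle-then-split scheme described above can be sketched roughly as follows (this is an illustrative sketch, not the project's actual code; the function name and seed handling are assumptions). The key point is that a fixed seed and a fixed starting order make the split identical when training is resumed:

```python
import random

def split_dataset(chunk_paths, val_fraction=0.1, seed=0):
    """Deterministically shuffle chunk files, then split into train/validation.

    Sketch only: sorting gives a fixed starting order, and seeding the RNG
    makes the shuffle (and therefore the split) identical on every run.
    """
    paths = sorted(chunk_paths)   # fixed starting order, independent of input order
    rng = random.Random(seed)     # fixed seed -> same shuffle every run
    rng.shuffle(paths)
    n_val = int(len(paths) * val_fraction)
    return paths[n_val:], paths[:n_val]  # (train, validation)
```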

We will be releasing some generated output from the Beethoven 32 Piano Sonatas dataset in the next few days - I will add a link to the README when ready. You can create your own dataset from archive.org. I concatenated all 32 sonatas together, then split the result into 8-second chunks (no overlap). I used the same hparams as in the original 2017 paper: if I recall correctly, 1 RNN layer, frame sizes of 2 and 8, dimensionality of 1024, and a sequence length of 512. No skip connections, because those only apply with more than 1 RNN layer. Batch size 128. Trained on a single NVIDIA GeForce RTX 3080, at roughly 20 minutes per epoch, I believe.
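The chunking step (one long .wav split into fixed-length, non-overlapping pieces) can be sketched with Python's standard-library `wave` module. This is a minimal illustration of the preprocessing described above, not the project's own script; the function name is an assumption, and trailing audio shorter than one chunk is simply dropped:

```python
import wave

def chunk_wav(path, out_prefix, chunk_seconds=8):
    """Split a PCM .wav file into fixed-length chunks with no overlap.

    Sketch only: writes out_prefix_00000.wav, out_prefix_00001.wav, ...
    and discards any trailing audio shorter than chunk_seconds.
    """
    with wave.open(path, "rb") as src:
        params = src.getparams()
        frames_per_chunk = params.framerate * chunk_seconds
        n_chunks = params.nframes // frames_per_chunk
        out_paths = []
        for i in range(n_chunks):
            data = src.readframes(frames_per_chunk)
            out_path = f"{out_prefix}_{i:05d}.wav"
            with wave.open(out_path, "wb") as dst:
                dst.setparams(params)  # same rate/width/channels as the source
                dst.writeframes(data)  # header frame count is patched on close
            out_paths.append(out_path)
    return out_paths
```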

The generated output from the Beethoven is quite impressive, but any large-enough dataset should be able to produce something interesting. The model favours audio that is sonically fairly uniform, however - which is why the Beethoven works well. For the Beethoven the dataset was about 5,000 chunks in size (no overlap).

@wangshiyao-1119
Author


Thank you very much for your answer! I have started training with my own data set and look forward to the results!

@wangshiyao-1119
Author


Hello, I have one more question. Can this project only run on the CPU? I am currently running very slowly on the CPU. Can the program run on the GPU? (Except for hyperparameter optimization)

@relativeflux
Member


Oh yes, of course! I wouldn't bother running it on the CPU at all. The Beethoven was trained on one of the latest NVIDIA GPUs, from the new Ampere generation. Have you tried running it in the provided Colab notebook? See the README for the link.

@wangshiyao-1119
Author


Thanks again for the reply! Can the program run on my own GPU? I have installed CUDA 10.0, but when I run the training and generation scripts, the program automatically uses the CPU.

@relativeflux
Member

relativeflux commented Apr 19, 2021

TensorFlow should automatically use your GPU, if you have one. If you run Python (from within a conda environment, highly recommended) and import TensorFlow, what does the following give you:

tf.config.list_physical_devices('GPU')

You might see something like:

Could not load dynamic library 'libcublas.so.10'; dlerror: libcublas.so.10: cannot open shared object file: No such file or directory

That means TensorFlow can't find the relevant dynamic libraries.
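A quick way to check this outside TensorFlow is to ask the system's dynamic loader directly whether the CUDA libraries are findable. This is a rough diagnostic sketch (not part of the project); `ctypes.util.find_library` searches roughly the same paths the loader uses, so a `None` here usually means TensorFlow will also fail to load that library:

```python
from ctypes.util import find_library

def cuda_libs_status(names=("cudart", "cublas", "cudnn")):
    """Report which CUDA-related shared libraries the dynamic loader can find.

    Returns a dict mapping each library name to the resolved library name
    (e.g. 'libcublas.so.10') or None if the loader cannot locate it.
    """
    return {name: find_library(name) for name in names}

# Any None values point at libraries missing from the loader's search path.
missing = [name for name, found in cuda_libs_status().items() if found is None]
```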

Unfortunately the training script silences TensorFlow warnings, which is perhaps not very helpful. To see what TF is telling you, comment out the following lines at the top of the script, as shown:

#os.environ["KMP_AFFINITY"] = "noverbose"
#os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3'

import tensorflow as tf
#tf.get_logger().setLevel('ERROR')
#tf.autograph.set_verbosity(3)
