My trained model is 4GB, how is that? #22
Comments
Because the default config is huge: 12*8 (512 channels).
We haven't done a lot of ablative analysis yet to see how few channels or how few layers we could get away with. A lot of architecture decisions were made based on the early parts of the training curves, which seem to favor bigger models. But if smaller models were trained for 500k iterations, they might sound essentially as good.
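For a rough sense of why the config size translates into gigabytes on disk, here is a minimal sketch. The 4-bytes-per-fp32-parameter figure and the two extra Adam buffers per parameter are general PyTorch facts, not repo-specific numbers; the function name is illustrative:

```python
import torch

def estimate_checkpoint_sizes(model: torch.nn.Module) -> None:
    # Count the parameters in the model.
    n_params = sum(p.numel() for p in model.parameters())
    # Each fp32 parameter takes 4 bytes.
    weights_gb = n_params * 4 / 1024**3
    # Adam keeps two extra fp32 buffers (exp_avg, exp_avg_sq) per
    # parameter, so a training checkpoint that also stores optimizer
    # state is roughly 3x the size of the weights alone.
    with_adam_gb = weights_gb * 3
    print(f"{n_params / 1e6:.1f}M params, "
          f"~{weights_gb:.2f} GB weights, "
          f"~{with_adam_gb:.2f} GB with Adam state")
```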
@hcwu1993 the trained model, or the checkpoint file that is saved during training and includes the optimizer states?
It should save the model parameters and structure information, according to the PyTorch docs.
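To make that distinction concrete: `torch.save(model)` pickles the whole module (structure plus weights), while `torch.save(model.state_dict())` stores only the tensors. Both calls are standard PyTorch, shown here on a hypothetical stand-in model:

```python
import torch

model = torch.nn.Linear(10, 10)  # hypothetical stand-in model

# Pickles the full module object: class, structure, and parameters.
torch.save(model, 'model_full.pt')

# Stores only the parameter/buffer tensors; you must rebuild the
# module yourself before calling load_state_dict().
torch.save(model.state_dict(), 'model_state.pt')
```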
By the way, the given model is 2GB, so is the config different from the paper? Also, I got an unusual result using this model: the F0 of the generated wav is lower than the natural one, and it sounds like a male voice.
@hcwu1993 Unlike the checkpoints saved during training, which include optimizer states, the checkpoint we shared with the pretrained model only contains the model. Hence the difference in size.
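If you want a smaller inference-only file from one of your own training checkpoints, you can drop the optimizer state and re-save. A sketch assuming the checkpoint is a dict with `'model'` and `'optimizer'` keys; the filenames are hypothetical, and you should inspect your own checkpoint's keys since they may differ:

```python
import torch

# Load on CPU so this works regardless of where training ran.
ckpt = torch.load('checkpoint_50000.pt', map_location='cpu')
print(ckpt.keys())  # inspect what the checkpoint actually contains

# Keep only the model, dropping the optimizer states
# (which account for the bulk of the file size).
slim = {'model': ckpt['model']}
torch.save(slim, 'waveglow_inference_only.pt')
```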
Your mel-spectrograms must have the same parameters (sampling_rate, filter_length, hop_length, win_length, mel_fmin, mel_fmax) as your model. The pretrained model we share was trained with "mel_fmax": 8000.0. If you trained your model before this update, it is possible that your model was trained with librosa's default: "mel_fmax": sampling_rate/2.
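A quick way to catch such a mismatch is to compare the audio/STFT parameters of the training config against the ones used at inference. A sketch assuming a repo-style `config.json` with a `"data_config"` section; the key names follow the parameters listed above, but adjust them to your config's actual layout:

```python
import json

# Parameters that must match between training and inference.
MEL_KEYS = ["sampling_rate", "filter_length", "hop_length",
            "win_length", "mel_fmin", "mel_fmax"]

def check_mel_params(train_config_path: str, infer_config_path: str) -> None:
    with open(train_config_path) as f:
        train = json.load(f)["data_config"]
    with open(infer_config_path) as f:
        infer = json.load(f)["data_config"]
    for key in MEL_KEYS:
        if train.get(key) != infer.get(key):
            print(f"MISMATCH {key}: "
                  f"train={train.get(key)} infer={infer.get(key)}")
```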
Did you train with batch_size=24 and fp16?
No, we trained with FP32. |
Closing issue. Please re-open if needed. |