Is the pretrained weight of the discriminator of the base model available? #5

Open · seastar105 opened this issue Nov 20, 2022 · 3 comments

@seastar105

Thanks for the nice work, @bshall.

I'm trying to train HiFi-GAN now, but training it from scratch on another dataset takes very long.

If the discriminator of the base model were also available, I could start fine-tuning from that vocoder. It seems you released only the generator. Could you also release the discriminator weights?

@MuruganR96

MuruganR96 commented Nov 20, 2022

@bshall Thank you for this great work.

@bshall @seastar105 Is loading only the generator weights and then starting fine-tuning a good idea?

def load_checkpoint(
    load_path,
    generator,
    discriminator,
    optimizer_generator,
    optimizer_discriminator,
    scheduler_generator,
    scheduler_discriminator,
    rank,
    logger,
    finetune=False,
):
    logger.info(f"Loading checkpoint from {load_path}")
    checkpoint = torch.load(load_path, map_location={"cuda:0": f"cuda:{rank}"})

    # The generator state lives under checkpoint["generator"]["model"]
    # (not at the top level), matching the keys used below.
    generator.load_state_dict(checkpoint["generator"]["model"])

    # When fine-tuning, skip the discriminator and the optimizer/scheduler
    # states so they start fresh from the pretrained generator.
    if not finetune:
        discriminator.load_state_dict(checkpoint["discriminator"]["model"])
        optimizer_generator.load_state_dict(checkpoint["generator"]["optimizer"])
        scheduler_generator.load_state_dict(checkpoint["generator"]["scheduler"])
        optimizer_discriminator.load_state_dict(
            checkpoint["discriminator"]["optimizer"]
        )
        scheduler_discriminator.load_state_dict(
            checkpoint["discriminator"]["scheduler"]
        )
    step = checkpoint.get("step", 1)
    loss = checkpoint.get("loss", float("inf"))
    return step, loss
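
For reference, here is a minimal sketch of how that function could be called for fine-tuning. The model classes, optimizer settings, and checkpoint path below are placeholders I made up, not the repo's exact API:

import logging
import torch

logger = logging.getLogger(__name__)
rank = 0  # single-GPU case

# Stand-in modules so the snippet runs; use the repo's real models instead.
generator = torch.nn.Linear(8, 8)
discriminator = torch.nn.Linear(8, 1)
optimizer_generator = torch.optim.AdamW(generator.parameters(), lr=2e-4)
optimizer_discriminator = torch.optim.AdamW(discriminator.parameters(), lr=2e-4)
scheduler_generator = torch.optim.lr_scheduler.ExponentialLR(optimizer_generator, gamma=0.999)
scheduler_discriminator = torch.optim.lr_scheduler.ExponentialLR(optimizer_discriminator, gamma=0.999)

# finetune=True loads only the generator weights; the discriminator,
# optimizers, and schedulers start fresh.
step, loss = load_checkpoint(
    "checkpoints/base-model.pt",  # placeholder path
    generator,
    discriminator,
    optimizer_generator,
    optimizer_discriminator,
    scheduler_generator,
    scheduler_discriminator,
    rank,
    logger,
    finetune=True,
)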

Thanks

@seastar105 (Author)

@mraj96 Maybe the discriminator will be too weak, but I'm going to try it. Training from scratch takes about one week in my environment, so this sounds much more doable to me.
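
If the fresh discriminator turns out to be too weak, one generic GAN trick (not something from this repo) is to freeze the pretrained generator for the first few thousand steps so the discriminator can catch up before adversarial updates resume. A minimal sketch, with stand-in modules and an assumed warm-up length:

import torch
from torch import nn

def set_requires_grad(module: nn.Module, flag: bool) -> None:
    # Freeze or unfreeze every parameter of a module.
    for p in module.parameters():
        p.requires_grad = flag

# Stand-in modules so the snippet runs; use the repo's real models instead.
generator = nn.Linear(8, 8)
discriminator = nn.Linear(8, 1)

WARMUP_STEPS = 5000  # assumed value, tune for your dataset

for step in range(WARMUP_STEPS + 1):
    # Generator stays frozen until the warm-up phase ends.
    set_requires_grad(generator, step >= WARMUP_STEPS)
    # ... usual HiFi-GAN loss computation and optimizer steps go here ...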

@seastar105 (Author)

@mraj96 Fine-tuning or training HiFi-GAN takes a really long time, so I gave up.

I fine-tuned the vocoder with only the generator weights, but after ~300 epochs of training there was no difference compared to just using the base vocoder. The base vocoder works pretty well if you train on a clean dataset such as KSS or JSUT. My own dataset is quite noisy, though, and while the output is audible, the quality is really poor, which makes fine-tuning necessary. So I gave up on it and am now trying to use hubert-soft together with an end-to-end model (e.g. VITS).
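
For anyone taking the same route, here is a minimal sketch of extracting soft units with hubert-soft. The torch.hub entry point and the units() call are my assumptions based on the companion bshall/hubert repo; check its README for the exact API:

import torch
import torchaudio

# Assumed entry point from the companion bshall/hubert repo.
hubert = torch.hub.load("bshall/hubert:main", "hubert_soft", trust_repo=True)

wav, sr = torchaudio.load("example.wav")              # placeholder path
wav = torchaudio.functional.resample(wav, sr, 16000)  # model expects 16 kHz audio

with torch.inference_mode():
    # Shape (1, 1, T) in, one soft-unit vector per frame out.
    units = hubert.units(wav.unsqueeze(0))

The idea is that these soft units then condition the end-to-end model (e.g. VITS) directly, instead of going through a separately fine-tuned vocoder.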
