Mismatched keys in pretrained JETS model #5237

Closed
iamanigeeit opened this issue Jun 16, 2023 · 11 comments
Labels: Bug (bug should be fixed), TTS (Text-to-speech)

@iamanigeeit
Contributor

iamanigeeit commented Jun 16, 2023

Hello @imdanboy,

When I try to load the pretrained JETS model from HuggingFace, loading fails: the current code expects the layers under tts.discriminator.msd.discriminators.0.layers to have weight_v and weight_g, but the checkpoint only has weight.

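import os
from espnet2.bin.tts_inference import Text2Speech

# LJSPEECH_DIR (a pathlib.Path), PWD, and device are assumed to be defined elsewhere.
os.chdir(LJSPEECH_DIR)
pretrained_dir = LJSPEECH_DIR / "exp/tts_train_jets_raw_phn_tacotron_g2p_en_no_space"
pretrained_model_file = pretrained_dir / "train.total_count.ave_5best_new.pth"
# Loading the checkpoint here is what raises the RuntimeError below.
pretrained_tts = Text2Speech.from_pretrained(
    train_config=pretrained_dir / "config.yaml",
    model_file=pretrained_model_file,
    device=device
)
pretrained_model = pretrained_tts.model
os.chdir(PWD)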

Error:

RuntimeError: Error(s) in loading state_dict for ESPnetTTSModel:
	Missing key(s) in state_dict: "tts.discriminator.msd.discriminators.0.layers.0.0.weight_g", "tts.discriminator.msd.discriminators.0.layers.0.0.weight_v", "tts.discriminator.msd.discriminators.0.layers.1.0.weight_g", "tts.discriminator.msd.discriminators.0.layers.1.0.weight_v", "tts.discriminator.msd.discriminators.0.layers.2.0.weight_g", "tts.discriminator.msd.discriminators.0.layers.2.0.weight_v", "tts.discriminator.msd.discriminators.0.layers.3.0.weight_g", "tts.discriminator.msd.discriminators.0.layers.3.0.weight_v", "tts.discriminator.msd.discriminators.0.layers.4.0.weight_g", "tts.discriminator.msd.discriminators.0.layers.4.0.weight_v", "tts.discriminator.msd.discriminators.0.layers.5.0.weight_g", "tts.discriminator.msd.discriminators.0.layers.5.0.weight_v", "tts.discriminator.msd.discriminators.0.layers.6.0.weight_g", "tts.discriminator.msd.discriminators.0.layers.6.0.weight_v", "tts.discriminator.msd.discriminators.0.layers.7.weight_g", "tts.discriminator.msd.discriminators.0.layers.7.weight_v". 
	Unexpected key(s) in state_dict: "tts.discriminator.msd.discriminators.0.layers.0.0.weight", "tts.discriminator.msd.discriminators.0.layers.1.0.weight", "tts.discriminator.msd.discriminators.0.layers.2.0.weight", "tts.discriminator.msd.discriminators.0.layers.3.0.weight", "tts.discriminator.msd.discriminators.0.layers.4.0.weight", "tts.discriminator.msd.discriminators.0.layers.5.0.weight", "tts.discriminator.msd.discriminators.0.layers.6.0.weight", "tts.discriminator.msd.discriminators.0.layers.7.weight".

I think I can fix this by computing weight_v and weight_g from weight, but this error only started happening recently. All other parameters in the pretrained model already use weight_v and weight_g.
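For readers hitting a similar error, the direction of the mismatch can be read off the two key lists in the message. The following is a minimal sketch using only the standard library; the function name and its return labels are hypothetical and are not part of ESPnet or PyTorch.

```python
def classify_weight_norm_mismatch(missing, unexpected):
    """Guess which side uses weight norm from the missing/unexpected key lists."""

    def bases(keys, suffixes):
        # Strip the given suffixes and collect the remaining layer names.
        out = set()
        for key in keys:
            for suffix in suffixes:
                if key.endswith(suffix):
                    out.add(key[: -len(suffix)])
        return out

    wn = (".weight_g", ".weight_v")
    plain = (".weight",)

    # The model wants weight_g/weight_v, the checkpoint offers plain weight.
    if bases(missing, wn) and bases(missing, wn) == bases(unexpected, plain):
        return "model_uses_weight_norm"
    # The model wants plain weight, the checkpoint offers weight_g/weight_v.
    if bases(missing, plain) and bases(missing, plain) == bases(unexpected, wn):
        return "checkpoint_uses_weight_norm"
    return "unknown"


# Key names shortened from the error message above.
missing = ["msd.discriminators.0.layers.7.weight_g", "msd.discriminators.0.layers.7.weight_v"]
unexpected = ["msd.discriminators.0.layers.7.weight"]
result = classify_weight_norm_mismatch(missing, unexpected)
```

For the error in this report the sketch would return "model_uses_weight_norm", i.e. the checkpoint was saved without weight norm on that discriminator.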

@iamanigeeit iamanigeeit added the Bug bug should be fixed label Jun 16, 2023
@sw005320 sw005320 added the TTS Text-to-speech label Jun 17, 2023
@iamanigeeit
Contributor Author

This seems to be caused by the weight norm bug fix in commit 1b45bdc.

I propose closing #4595.

@iamanigeeit
Contributor Author

iamanigeeit commented Jun 17, 2023

@kan-bayashi The mismatched-keys problem should apply to all HiFiGAN pretrained models. The code below takes the old .pth, computes the weight norm parameters, and writes them to a new .pth (keeping the module order).

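import torch
import yaml
from collections import OrderedDict
from espnet2.gan_tts.hifigan.hifigan import HiFiGANMultiScaleDiscriminator as Msd

config_file = 'config.yaml'
pth_file = 'train.total_count.ave_5best.pth'
new_pth_file = 'train.total_count.ave_5best_new.pth'

# Load on CPU so the conversion does not need a GPU.
state_dict = torch.load(pth_file, map_location='cpu')
msd_prefix = 'tts.discriminator.msd.'
# Old MSD parameters (plain `weight`), with the prefix stripped.
msd_state_dict = {k[len(msd_prefix):]: v for (k, v) in state_dict.items() if k.startswith(msd_prefix)}

with open(config_file, 'r') as f:
    config = yaml.safe_load(f)
discrim_config = config['tts_conf']['discriminator_params']
# Build the MSD kwargs: skip the period-discriminator options and strip the
# 'scale_' prefix from the scale-discriminator options.
scale_prefix = 'scale_'
msd_config = {}
for k, v in discrim_config.items():
    if not k.startswith('period'):
        if k.startswith(scale_prefix):
            key = k[len(scale_prefix):]
        else:
            key = k
        msd_config[key] = v

# Build the MSD without weight norm so that its keys match the old checkpoint.
msd_config['discriminator_params']['use_weight_norm'] = False
msd = Msd(**msd_config)
# Load the old weights before applying weight norm; otherwise weight_g and
# weight_v would be derived from the random initialization. This assumes every
# MSD discriminator in the old checkpoint stores a plain `weight`.
msd.load_state_dict(msd_state_dict)

# Applying weight norm replaces each `weight` with `weight_g` and `weight_v`.
for discrim in msd.discriminators:
    discrim.apply_weight_norm()

new_msd_state = msd.state_dict(prefix=msd_prefix)

# Split the remaining keys into those before and after the MSD block,
# so the converted block can be put back in the same position.
before = True
before_keys, after_keys = [], []
for k in state_dict:
    if k.startswith(msd_prefix):
        before = False
        continue
    else:
        if before:
            before_keys.append(k)
        else:
            after_keys.append(k)

# Reassemble: keys before the MSD, the converted MSD, keys after the MSD.
new_state_dict = OrderedDict()
for k in before_keys:
    new_state_dict[k] = state_dict[k]
for k in new_msd_state:
    new_state_dict[k] = new_msd_state[k]
for k in after_keys:
    new_state_dict[k] = state_dict[k]

# Print the final key order for a visual check.
for x in new_state_dict:
    print(x)

torch.save(new_state_dict, new_pth_file)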
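To illustrate what "compute the weight norm" means per layer, here is a minimal pure-Python sketch. It assumes apply_weight_norm() wraps PyTorch's torch.nn.utils.weight_norm with its default dim=0, in which case weight_v starts as a copy of weight and weight_g holds the norm of each output channel. The helper names are hypothetical, and nested lists stand in for tensors (the real weight_g has shape [out, 1, 1] rather than a flat list).

```python
import math


def weight_norm_decompose(weight):
    """Split a weight of shape [out][in][kernel] into (weight_g, weight_v)."""
    # weight_v is a copy of the original weight (the "direction").
    weight_v = [[list(row) for row in channel] for channel in weight]
    # weight_g is the L2 norm of each output channel (the "magnitude").
    weight_g = [
        math.sqrt(sum(x * x for row in channel for x in row))
        for channel in weight
    ]
    return weight_g, weight_v


def weight_norm_recompose(weight_g, weight_v):
    """Rebuild weight = g * v / ||v||, with the norm taken per output channel."""
    weight = []
    for g, channel in zip(weight_g, weight_v):
        norm = math.sqrt(sum(x * x for row in channel for x in row))
        weight.append([[g * x / norm for x in row] for row in channel])
    return weight


# Toy weight with 2 output channels, 1 input channel, and kernel size 2.
weight = [[[3.0, 4.0]], [[0.0, 2.0]]]
g, v = weight_norm_decompose(weight)
restored = weight_norm_recompose(g, v)
```

Because the decomposition is exact, recomposing right after conversion gives back the original weight, which is why a converted checkpoint should behave the same as the old one.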

@imdanboy
Contributor

I faced the same problem and had to check out the old commit (the one specified on HuggingFace) 😅

@sw005320
Contributor

Thanks for the report.
@kan-bayashi, can you deal with this?

@kan-bayashi
Member

I will try to fix this without re-uploading all of the models. There are too many of them :(

@kan-bayashi
Member

Fixed in #5240

@nellorebhanuteja

@kan-bayashi It looks like this error still occurs when you pip install espnet.

@kan-bayashi
Member

The fix has not been released yet. Please use the master branch.

@nellorebhanuteja

I did

pip install git+https://github.com/espnet/espnet

I am getting the following error while loading a VITS model:

RuntimeError: Error(s) in loading state_dict for ESPnetTTSModel:
        Missing key(s) in state_dict: "tts.discriminator.msd.discriminators.0.layers.0.0.weight", "tts.discriminator.msd.discriminators.0.layers.1.0.weight", "tts.discriminator.msd.discriminators.0.layers.2.0.weight", "tts.discriminator.msd.discriminators.0.layers.3.0.weight", "tts.discriminator.msd.discriminators.0.layers.4.0.weight", "tts.discriminator.msd.discriminators.0.layers.5.0.weight", "tts.discriminator.msd.discriminators.0.layers.6.0.weight", "tts.discriminator.msd.discriminators.0.layers.7.weight". 
        Unexpected key(s) in state_dict: "tts.discriminator.msd.discriminators.0.layers.0.0.weight_g", "tts.discriminator.msd.discriminators.0.layers.0.0.weight_v", "tts.discriminator.msd.discriminators.0.layers.1.0.weight_g", "tts.discriminator.msd.discriminators.0.layers.1.0.weight_v", "tts.discriminator.msd.discriminators.0.layers.2.0.weight_g", "tts.discriminator.msd.discriminators.0.layers.2.0.weight_v", "tts.discriminator.msd.discriminators.0.layers.3.0.weight_g", "tts.discriminator.msd.discriminators.0.layers.3.0.weight_v", "tts.discriminator.msd.discriminators.0.layers.4.0.weight_g", "tts.discriminator.msd.discriminators.0.layers.4.0.weight_v", "tts.discriminator.msd.discriminators.0.layers.5.0.weight_g", "tts.discriminator.msd.discriminators.0.layers.5.0.weight_v", "tts.discriminator.msd.discriminators.0.layers.6.0.weight_g", "tts.discriminator.msd.discriminators.0.layers.6.0.weight_v", "tts.discriminator.msd.discriminators.0.layers.7.weight_g", "tts.discriminator.msd.discriminators.0.layers.7.weight_v". 

@kan-bayashi
Member

It seems your case is slightly different from the one above.
You are trying to load weights saved with weight norm, but the model does not use it.
Maybe you have already changed use_weight_norm or use_spectral_norm in the config.
If not, please provide code that reproduces the problem.

@nellorebhanuteja

Right.
I changed the value of use_weight_norm. Issue resolved.
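When checking whether a config was edited this way, it can help to list every norm flag it contains. The following is a rough sketch that walks a parsed config (the dict that yaml.safe_load would return for config.yaml). The function name and the example fragment are hypothetical, and the values shown are made up rather than taken from a real model.

```python
def find_norm_flags(config, path=()):
    """Yield (dotted_path, value) for each use_weight_norm / use_spectral_norm key."""
    if isinstance(config, dict):
        for key, value in config.items():
            here = path + (str(key),)
            if key in ("use_weight_norm", "use_spectral_norm"):
                yield ".".join(here), value
            else:
                # Recurse into nested sections of the config.
                yield from find_norm_flags(value, here)
    elif isinstance(config, list):
        for index, item in enumerate(config):
            yield from find_norm_flags(item, path + (str(index),))


# Hypothetical fragment; a real config.yaml has many more entries.
example_config = {
    "tts_conf": {
        "discriminator_params": {
            "scale_discriminator_params": {
                "use_weight_norm": True,
                "use_spectral_norm": False,
            },
            "period_discriminator_params": {
                "use_weight_norm": True,
                "use_spectral_norm": False,
            },
        }
    }
}
flags = dict(find_norm_flags(example_config))
for name, value in flags.items():
    print(name, "=", value)
```

Comparing this listing against the config shipped with the pretrained model shows which flag differs from the one the checkpoint was trained with.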
